The spread of aggressive memes on social media is a major problem, demanding solutions for the automatic detection of such undesired content. However, the line between freedom of expression and the dissemination of aggressive messages is blurred and difficult to draw. Moreover, detecting aggressive elements in memes is particularly challenging because memes are mostly multimodal, typically combining an image with concise text, and the two often lack a direct connection or correlation. The advent of Generative Models offers a promising alternative for unraveling the inherent meaning of aggressive memes by enabling fully multimodal analysis. This paper examines the use of Generative Models to detect aggressive memes. The proposed methodology groups aggressive memes from existing datasets according to levels of multimodal analysis difficulty and, using zero-shot prompts, tests how well these models perform at detecting this type of content and where failures may occur. To validate the methodology, a series of experiments was conducted using three models and five datasets. Based on manual annotation, the data were divided into three datasets representing three levels of multimodal reasoning, which were submitted to the models to identify the presence of aggressive content. The results show the viability of generative models for this task and highlight the better performance of some models over others. Moreover, a comparison with a non-generative model specifically trained on the meme datasets shows that generative models can be competitive at detecting memes with aggressive content as long as the required level of multimodal reasoning is not very high. These findings can help academics, managers, and decision-makers develop more effective tools for detecting aggressive content while balancing freedom of expression.
The research also serves as a reference for improving moderation systems and guiding future policies in online communication.
