Preprint Article, Version 1. This version is not peer-reviewed.

A Hybrid Deep-Learning Approach for Multi-class Classification of Cyberbullying Using Multi-modal Social Media Data

Version 1 : Received: 5 November 2024 / Approved: 6 November 2024 / Online: 7 November 2024 (07:20:38 CET)

How to cite: Tabassum, I.; Nunavath, V. A Hybrid Deep-Learning Approach for Multi-class Classification of Cyberbullying Using Multi-modal Social Media Data. Preprints 2024, 2024110392. https://doi.org/10.20944/preprints202411.0392.v1

Abstract

Cyberbullying is the use of social media platforms to hurt or humiliate people online. The anonymity these platforms offer makes it easy to spread hurtful content, sometimes leading victims to self-harm. This highlights the urgent need for efficient methods to identify and prevent cyberbullying. Numerous studies have addressed this issue through cyberbullying classification, primarily either binary classification using multi-modal data or multi-class classification targeting text or image data alone. While deep-learning advancements have improved cyberbullying identification, a gap remains in the multi-class classification of cyberbullying using multi-modal data such as memes. This research aims to bridge that gap by classifying cyberbullying across multiple data modalities with a hybrid deep-learning model, hybrid (RoBERTa+ViT), which combines RoBERTa for text feature extraction and Vision Transformer (ViT) for image feature extraction through a late fusion module. Two datasets were used: a Private-dataset collected from comments on social media videos and a Public-dataset downloaded from existing research. The hybrid model was trained on both datasets and achieved an accuracy of 99.24% on the Public-dataset and 96.1% on the Private-dataset, with F1-scores of 0.9924 and 0.9599, respectively.
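
The abstract does not say how the late fusion module combines the two branches, so the following is only a minimal sketch of one common form of late fusion: each modality produces its own class scores, and the resulting probabilities are averaged with a weight. The function names, the `text_weight` parameter, and the example scores are hypothetical and not taken from the paper; in the actual model the scores would come from the RoBERTa and ViT branches, and the fusion could equally operate on concatenated features instead of probabilities.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(text_logits, image_logits, text_weight=0.5):
    """Weighted average of per-modality class probabilities.

    text_weight is a hypothetical knob: 0.5 treats both branches equally.
    """
    if len(text_logits) != len(image_logits):
        raise ValueError("both branches must score the same set of classes")
    text_probs = softmax(text_logits)
    image_probs = softmax(image_logits)
    return [
        text_weight * t + (1.0 - text_weight) * i
        for t, i in zip(text_probs, image_probs)
    ]


def predict(text_logits, image_logits, text_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = late_fusion(text_logits, image_logits, text_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative scores for three classes (made-up numbers, not model output).
text_scores = [2.0, 0.1, 0.1]   # text branch favours class 0
image_scores = [0.1, 0.1, 3.0]  # image branch favours class 2
fused_probs = late_fusion(text_scores, image_scores)
predicted_class = predict(text_scores, image_scores)
```

With equal weights the more confident branch decides the outcome, while raising `text_weight` shifts the decision toward the text branch. Averaging probabilities is the simplest decision-level variant; a learned fusion layer is another option.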

Keywords

Cyberbullying; Multi-modal Data; Multi-class Classification; Deep-Learning; Hybrid (RoBERTa+ViT) Model; Late Fusion; Social Media

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
