
AI-Powered Cyberbullying Detection in Indian Languages: An Ensemble Approach With XAI

Manav Shah

Vellore Institute of Technology, Vellore

Dr. Ranjithkumar S

Professor, Vellore Institute of Technology, Vellore

Pages: 57-75

Vol: 16, Issue: 1, 2026

Receiving Date: 2025-12-21

Acceptance Date: 2026-01-20

Publication Date: 2026-02-08


http://doi.org/10.37648/ijrst.v16i01.007

Abstract

Social media has become a pervasive part of daily life, and with its widespread use has come a major downside: cyberbullying. Detecting and preventing cyberbullying is therefore of vital importance. Most prior work in this field covers only English and Arabic. This study proposes a deep learning (DL) ensemble model that combines ERNIE and an RNN to classify whether a given text is cyberbullying content. The model is evaluated on a dataset of comments and posts sourced from Kaggle and social media in three languages: English, and two regional languages widely spoken in multilingual India, Hindi and Bengali. It achieves an accuracy of 98.4% for English, 86.6% for Hindi, and 83.75% for Bengali.

Keywords: Artificial Neural Networks (ANN); Explainable Artificial Intelligence (XAI); Bidirectional Long-Short Term Memory (Bi-LSTM); Convolutional Neural Networks (CNN)
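
The abstract does not say how the ERNIE and RNN outputs are combined or how the per-language accuracies are computed. As a rough illustration only, the sketch below assumes a soft-voting ensemble (a weighted average of each model's predicted probability of bullying) followed by an accuracy breakdown per language. The function names, the equal weighting, the 0.5 threshold, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def soft_vote(prob_a: Sequence[float], prob_b: Sequence[float],
              weight_a: float = 0.5, threshold: float = 0.5) -> List[int]:
    """Combine two models' bullying probabilities by weighted averaging.

    prob_a / prob_b stand in for the per-sample scores of two classifiers
    (for example an ERNIE-based one and an RNN-based one). Soft voting is
    an assumption here, not the paper's documented method.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same samples")
    combined = [weight_a * a + (1.0 - weight_a) * b
                for a, b in zip(prob_a, prob_b)]
    # Label 1 = cyberbullying, 0 = not cyberbullying.
    return [1 if p >= threshold else 0 for p in combined]


def accuracy_by_language(
        records: Sequence[Tuple[str, int, int]]) -> Dict[str, float]:
    """Compute accuracy separately for each language.

    Each record is (language, true_label, predicted_label).
    """
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for lang, y_true, y_pred in records:
        total[lang] = total.get(lang, 0) + 1
        correct[lang] = correct.get(lang, 0) + int(y_true == y_pred)
    return {lang: correct[lang] / total[lang] for lang in total}


# Toy, made-up scores purely to show the call pattern.
predictions = soft_vote([0.9, 0.2, 0.6], [0.7, 0.4, 0.3])
per_language = accuracy_by_language([
    ("English", 1, 1), ("English", 0, 0),
    ("Hindi", 1, 0), ("Hindi", 0, 0),
])
```

Reporting accuracy per language, rather than one pooled figure, is what makes a gap such as the one between the English and Bengali results visible.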


Disclaimer: Indexing of published papers is subject to the evaluation and acceptance criteria of the respective indexing agencies. While we strive to maintain high academic and editorial standards, International Journal of Research in Science and Technology does not guarantee the indexing of any published paper. Acceptance and inclusion in indexing databases are determined by the quality, originality, and relevance of the paper, and are at the sole discretion of the indexing bodies.