Early Detection of Multilingual Mental Health Depression Using Pretrained Transformers and Machine Learning
Ali Sami Azeez, Department of Information Technology Management, Technical College of Management, Middle Technical University, Baghdad, Iraq. ali.sami@mtu.edu.iq. ORCID: 0000-0003-3433-7
Osama Abduljaleel Ali, Computer Center, Al-Muthanna University, Al-Muthanna, Iraq. osama@mu.edu.iq. ORCID: 0000-0002-0711-5025
Nawar Abbood Fadhil, Department of Information Technology Management, Technical College of Management, Middle Technical University, Baghdad, Iraq. nawar@mtu.edu.iq. ORCID: 0000-0002-7741-2965
Dr. Ali Mohammed Sahan, Department of Information Technology Management, Technical College of Management, Middle Technical University, Baghdad, Iraq. dralimohammed2@gmail.com. ORCID: 0000-0001-5161-4756
Keywords: Multilingual Depression Detection, Mental Health Analytics, Social Media Mining, Transformer Models, XLM-RoBERTa, Machine Learning, Natural Language Processing, Digital Mental Health.
Abstract
Social media platforms produce vast amounts of user-generated text, which can serve as a valuable signal for early mental health screening. This paper develops a scalable, multilingual depression classifier based on classical machine learning (ML) methods and state-of-the-art pretrained transformer models, addressing the limitations of the language-specific and binary-only approaches used in previous studies. In contrast to most prior work, this study systematically explores bilingual and multilingual depression recognition across Arabic, English, Russian, and Spanish data within a single pipeline. TF-IDF is used to represent textual features for conventional ML classifiers, such as SVM, Random Forest, Naive Bayes, and AdaBoost, while transformers such as XLM-RoBERTa and XLNet learn contextual semantic representations. Extensive experiments demonstrate that transformer-based models consistently outperform traditional ML models. XLM-RoBERTa achieved 94.33% accuracy, a 0.94 F1-score, and a 0.99 AUC, surpassing SVM (93% accuracy) and substantially outperforming XLNet (72.36% accuracy). In single-language tests, XLM-RoBERTa achieved 99.5% accuracy on Russian, 98% on English, 96% on Arabic, and 85.9% on Spanish, demonstrating robustness across languages. These findings highlight the effectiveness of pretrained multilingual transformers in identifying subtle cases of depression, offering a reliable, language-independent approach to early depression screening in real-world digital mental-health monitoring systems.
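To make the classical baseline concrete, the sketch below shows a TF-IDF representation feeding a linear SVM, as in the pipeline the abstract describes. This is a minimal illustration using scikit-learn with toy placeholder texts and labels, not the paper's actual dataset, preprocessing, or hyperparameters.

```python
# Hedged sketch: TF-IDF features + linear SVM, one of the classical ML
# baselines named in the abstract. Texts/labels are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "I feel hopeless and tired every day",
    "Had a great time with friends today",
    "Nothing matters anymore and I cannot sleep",
    "Excited about my new project at work",
]
labels = [1, 0, 1, 0]  # 1 = depression-indicative, 0 = control (toy labels)

# Unigram+bigram TF-IDF vectors feed a linear-kernel SVM classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)

preds = clf.predict(["I am so exhausted and empty"])
print(preds)
```

The same pipeline object can be swapped to Random Forest, Naive Bayes, or AdaBoost by replacing the final estimator; the transformer models (XLM-RoBERTa, XLNet) instead learn contextual embeddings end to end rather than relying on sparse TF-IDF counts.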