Alharbi, Abdullah Ibrahim M. ORCID: https://orcid.org/0000-0002-2620-0049 (2023). Enhancing word representations for emotional intensity and offensive language detection in Arabic microblog text. University of Birmingham. D.Sc.
Alharbi2023PhD.pdf
Text - Accepted Version. Available under License: All rights reserved. (3MB)
Abstract
Social media motivates people to express their emotions and share them publicly. However, at the same time, there are those who use it to spread racism and offensive language. Detecting emotional intensity and offensive language can be challenging in the context of social media microblogs, such as Twitter. This task becomes even more complicated when morphology-rich languages, such as Arabic, are involved. Social media communications typically consist of a range of dialects and sub-dialects that are not ruled by consistent standards. Therefore, there is a need to adopt effective methods and resources to better comprehend and treat a variety of linguistic forms when seeking to understand the emotional intensity and offensive language in Arabic short texts.
In this dissertation, we study two main problems: detection of emotional intensity and of offensive language in Arabic microblogs. First, we propose a novel combination of static character- and word-level embeddings (ACWE) to improve the detection of emotional intensity. For this purpose, we create word- and character-level embeddings using a large number of tweets enriched by the diversity of affective vocabulary words and Arabic dialects. ACWE significantly outperforms state-of-the-art pre-trained Arabic word embeddings in emotional intensity tasks. Second, we enhance contextualised language models by incorporating ACWE to identify emotional intensity. We show that our proposed method obtains state-of-the-art results in seven affect tasks, including our main task, emotional intensity detection. Lastly, we exploit emotional intensity and other affect-related tasks in the offensive language task using transfer learning approaches. We find that incorporating the best-performing contextual language models with anger intensity and emotion-related tasks enhances the performance of offensive language detection.
Type of Work: | Thesis (Doctorates > D.Sc.) |
---|---|
Award Type: | Doctorates > D.Sc. |
Supervisor(s): | |
Licence: | All rights reserved |
College/Faculty: | Colleges (2008 onwards) > College of Engineering & Physical Sciences |
School or Department: | School of Computer Science |
Funders: | Other |
Other Funders: | King Abdulaziz University |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
URI: | http://etheses.bham.ac.uk/id/eprint/13283 |