Please use this identifier to cite or link to this item: https://hdl.handle.net/10216/106028
Full metadata record
DC Field | Value | Language
dc.creator | Paula Cristina Teixeira Fortuna
dc.date.accessioned | 2025-11-06T16:26:52Z | -
dc.date.available | 2025-11-06T16:26:52Z | -
dc.date.issued | 2017-07-07
dc.date.submitted | 2017-08-01
dc.identifier.other | sigarra:202853
dc.identifier.uri | https://hdl.handle.net/10216/106028 | -
dc.description.abstract | People increasingly use social networks to communicate their opinions and to share information and experiences. On social networks, people feel deindividualized and may engage more frequently in aggressive communication. In this context, it is important that governments and social network platforms have tools to detect hate speech, because it is harmful to its targets. In our work we investigate the problem of detecting hate speech online. Our first goal is to provide a complete overview of the topic. However, describing the state of the art in hate speech research is not simple, because the topic is addressed by different areas, such as text mining, social sciences, and law. Our literature review focuses on the perspective of computer science and engineering, which distinguishes it from the other works we found. We adopted an exhaustive and methodical approach, a Systematic Literature Review. As a result, we concluded that the majority of studies tackle this problem as a machine learning classification task, using either general text mining features (e.g., n-grams, word2vec) or hate speech specific features (e.g., othering discourse). In the majority of studies new datasets are collected, but these remain private, which makes it more difficult to compare results across studies. We also concluded that this field is still at an early stage, with several open research opportunities. As we found no research on the topic in Portuguese, the second goal of this work was to annotate a dataset for this language. For the dataset annotation, we built a classification scheme with a hierarchical structure. This is an innovative way of approaching the problem of automatic hate speech classification. Its main advantage is that it better captures nuances in the concept of hate speech. We collected a dataset of 5,668 messages from 1,156 distinct users, annotated not only for hate speech but also for 83 further subtypes of hate. Finally, we also try to show that the hierarchical structure of classes improves the performance of the classification models, since it is better suited to considering the different subtypes of hate speech and the intersections between those classes.
dc.language.iso | por
dc.rights | openAccess
dc.subject | Engenharia electrotécnica, electrónica e informática
dc.subject | Electrical engineering, Electronic engineering, Information engineering
dc.title | Automatic detection of hate speech in text: an overview of the topic and dataset annotation with hierarchical classes
dc.type | Dissertação
dc.contributor.uporto | Faculdade de Engenharia
dc.identifier.doi | 10.34626/hy8t-f260
dc.identifier.tid | 201801990
dc.subject.fos | Ciências da engenharia e tecnologias::Engenharia electrotécnica, electrónica e informática
dc.subject.fos | Engineering and technology::Electrical engineering, Electronic engineering, Information engineering
thesis.degree.discipline | Mestrado Integrado em Engenharia Informática e Computação
thesis.degree.grantor | Faculdade de Engenharia
thesis.degree.grantor | Universidade do Porto
thesis.degree.level | 1
Appears in Collections: FEUP - Dissertação

Files in This Item:
File | Description | Size | Format
202853.pdf | Automatic detection of hate speech in text: an overview of the topic and dataset annotation with hierarchical classes | 1.61 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.