Enhancing DistilBERT algorithm using CNN for image captioning and defending against adversarial attacks in online hate speech.
By: Jerry Luck S. Balut, Micah Therese T. Tabon
Language: English
Manila: PLM, 2024
Description: Undergraduate Thesis (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2024
Content type: text
Media type: unmediated
Carrier type: volume
Genre/Form: academic writing
DDC classification: .
LOC classification: QA76.9 A43 B35 2024

| Item type | Current location | Home library | Collection | Call number | Status | Date due | Barcode | Item holds |
|---|---|---|---|---|---|---|---|---|
| Thesis/Dissertation | PLM | PLM Filipiniana Section | Filipiniana-Thesis | QA76.9 A43 B35 2024 | Available | | FT7865 | |
ABSTRACT: The task of hate speech detection has garnered significant attention in the research community, employing machine learning algorithms and natural language processing (NLP) for classification. As hate speech proliferates, its profound influence on society can lead to negative consequences. As such, it is crucial to establish mechanisms to detect hate speech accurately to minimize its presence on social media. In this research, the researchers identified three main points of interest for improving the hate speech detection of the DistilBERT algorithm. DistilBERT is limited to textual data, which poses a challenge for hate speech detection that extends to diverse data formats like images and videos. The model's limitations also encompass its inability to detect hate speech written in leetspeak and its susceptibility to false predictions caused by benign word insertions, presenting a gap in comprehensive hate speech identification. The researchers' enhancement, hate speech detection in images, is made possible using convolutional neural networks (CNN). In addition, hate speech containing leetspeak and benign word insertions can now be detected correctly. An optical character recognition (OCR) engine was used to decode the text to address the limitation of hate speech detection in leetspeak. Contextual understanding and polarity scores were established to deal with false predictions due to benign word insertions. Based on the results, the proposed DistilBERT algorithm presents notable advancements over the existing algorithm in terms of precision, F1 score, and accuracy while maintaining a competitive level of recall.
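
The leetspeak problem mentioned in the abstract can be illustrated with a minimal sketch. The abstract states that the thesis decodes leetspeak with an OCR engine but gives no further detail, so the example below swaps in a much simpler technique: a fixed character substitution table applied before classification. The table `LEET_MAP` and the function `normalize_leetspeak` are hypothetical names introduced for illustration and are not taken from the thesis.

```python
# Illustrative only: a plain character-substitution normalizer.
# This is NOT the OCR-based method described in the abstract.

# Hypothetical substitution table (assumption, not from the thesis).
LEET_MAP = str.maketrans({
    "0": "o",
    "1": "i",
    "3": "e",
    "4": "a",
    "5": "s",
    "7": "t",
    "@": "a",
    "$": "s",
})


def normalize_leetspeak(text: str) -> str:
    """Lowercase the text and map common leetspeak characters to letters."""
    return text.lower().translate(LEET_MAP)


if __name__ == "__main__":
    # The normalized string is what a text classifier would then receive.
    print(normalize_leetspeak("1 h4t3 y0u"))
```

A fixed table like this also rewrites genuine digits (a year such as "2024" would be altered), which is one reason a context-aware decoding step may be preferable in practice.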