Cabalsi, Glaidelyn M.; Latina, Jeffrey V.; Sanchez, Joervy R.; Vallejo, Elnard Don M.
UNIQ: Utilization of NLP techniques in plagiarism detection system through semantic analysis using WORD2VEC and BERT
ix, 118 pp.
Undergraduate Thesis: (Bachelor of Science in Information Technology) - Pamantasan ng Lungsod ng Maynila, 2024.
ABSTRACT: Detecting heavily paraphrased and translated text is a challenge for existing plagiarism detection tools, which rely mainly on traditional word searching and matching. Although automated systems are crucial for spotting plagiarism, most of them focus on finding identical strings shared by the suspicious and source documents. As a result, despite advances in plagiarism detection technology, obfuscated plagiarism remains difficult to detect because current tools are largely limited to straightforward copy-and-paste cases. This creates a need for a plagiarism detection system tailored to identifying paraphrased passages. This study presents a system that detects plagiarism in paraphrased texts using NLP word embedding techniques, specifically Word2Vec and Bidirectional Encoder Representations from Transformers (BERT). In addition, an integrated recommendation system suggests scholarly materials based on the documents that users submit. This feature uses content-based filtering to offer multiple sources and references, based on the computed similarity between the features of the input document and those of each candidate source in the corpus. The system combines these approaches to similarity checking into a hybrid model. Evaluated against ISO 25010, the system obtained results the study rates as excellent, with mean scores of 4.29 for Functional Suitability, 4.29 for Performance Efficiency, 4.59 for Usability, and 4.34 for Reliability. These results indicate that the system is of high quality and performs well in detecting plagiarism in paraphrased texts.
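The content-based filtering step described in the abstract, scoring each candidate source by its similarity to the input document, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give the feature representation or the similarity measure, so the sketch assumes plain term-count vectors (a stand-in for Word2Vec or BERT embeddings) and cosine similarity. The names `term_vector`, `cosine_similarity`, and `rank_sources` are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    # Assumption: lowercased whitespace tokens counted as features.
    # A real system would use learned embeddings instead.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Dot product over shared terms divided by the product of vector norms.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def rank_sources(document, corpus, top_k=3):
    # corpus maps a source title to its text (hypothetical structure).
    doc_vec = term_vector(document)
    scored = [
        (title, cosine_similarity(doc_vec, term_vector(text)))
        for title, text in corpus.items()
    ]
    # Highest similarity first; keep the top_k candidates.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Calling `rank_sources(submitted_text, corpus)` returns the highest-scoring `(title, score)` pairs. Replacing `term_vector` with an embedding model would let the same ranking capture paraphrases that share meaning but few exact words, which is the case the abstract targets.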