Enhancement of generalized mean distance k-nearest neighbors algorithm applied in detecting Filipino phishing short messaging system

By: Labajo, Angelika Louise R.; Villuga, Emmanuelle N
Publisher: c2025
Description: Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
Content type: text
Media type: unmediated
Carrier type: volume
LOC classification: QA76.9 A43 L33 2025
Contents:
ABSTRACT: This study enhances the Generalized Mean Distance K-Nearest Neighbors (GMD-KNN) algorithm for detecting Filipino phishing SMS attacks. The existing implementation uses the Euclidean distance metric, which handles outliers poorly and therefore reduces classification performance. To overcome this, cosine similarity is introduced as an alternative distance metric, improving classification accuracy by better capturing semantic relationships in text data and reducing sensitivity to outliers. To assess performance, the proponents evaluated the proposed and existing algorithms using both the confusion matrix and the accuracy score, with accuracy based on the best number of PCA components in the enhanced algorithm. The enhanced GMD-KNN algorithm showed notable improvements over the original Euclidean-based version: accuracy reached 95.59%, precision 95.39%, sensitivity 95.59%, and specificity 95.47%, and the Matthews Correlation Coefficient (MCC) increased to 90.95%, a total improvement of 5% over the original algorithm. These findings emphasize the effectiveness of cosine similarity in improving text classification within the GMD-KNN framework. By addressing these challenges, this study enhances phishing detection mechanisms, with potential applications in mitigating SMS-based threats on mobile platforms.
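The contrast between the two distance metrics, and the evaluation measures named in the abstract, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the toy vectors, and the confusion-matrix counts are all hypothetical, and the metrics are the standard binary definitions, which may differ from the averaging the authors used.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance; grows with differences in vector magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; depends only on the angle between the vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics computed from confusion-matrix counts."""
    total = tp + fp + tn + fn
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }


# Hypothetical term-count vectors: same word proportions, but the second
# message is ten times longer (an "outlier" in magnitude).
short_msg = [1, 2, 0]
long_msg = [10, 20, 0]

print("euclidean:", euclidean_distance(short_msg, long_msg))
print("cosine:   ", cosine_distance(short_msg, long_msg))

# Hypothetical confusion-matrix counts, not taken from the thesis.
print(binary_metrics(tp=45, fp=5, tn=40, fn=10))
```

In this toy case the two vectors point in the same direction, so the cosine distance is essentially zero while the Euclidean distance is large. That is the sense in which cosine similarity is less sensitive to magnitude outliers in text features.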


