Enhancement of random forest algorithm applied to SMS fraud detection

By: Liwag, Justin E.; Balaoro, Clarisse Anne D
Publisher: c2025
Description: Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
Content type: text
Media type: unmediated
Carrier type: volume
LOC classification: QA76.9 A43 L59 2025
Contents:
ABSTRACT: Random Forest is a powerful machine learning algorithm that builds multiple decision trees from randomly selected subsets of features and data. However, its performance declines when dealing with imbalanced datasets, which reduces its accuracy. Selecting features randomly slows down model training and contributes to a lack of interpretability. This study, entitled Enhancement of Random Forest Algorithm Applied to SMS Fraud Detection, aims to enhance the algorithm’s ability to manage imbalanced datasets and minimize false negatives in classifying fraudulent messages. Spectral Co-Clustering, Reduced Error Pruning, and Contextual Feature Contribution Network (CFCN) were incorporated to improve the algorithm’s accuracy, training time, and transparency. The enhanced algorithm was evaluated using the SMS Spam Collection Dataset, with performance metrics compared against the existing Random Forest. The results show that accuracy increased by 1 percentage point (from 97% to 98%), recall for spam detection by 5 percentage points (from 78% to 83%), and F1-score by 2 percentage points (from 93% to 95%). Reduced Error Pruning reduced time by 65.8%, enhancing computational efficiency. The CFCN provided transparent insights into feature contributions, addressing the “black-box” nature of traditional models. These enhancements strengthen the model’s ability to detect SMS fraud while maintaining robustness against imbalanced data. The study contributes to fraud detection systems by offering a more accurate, efficient, and interpretable machine learning framework for safeguarding digital communication.
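
The abstract reports accuracy, spam recall, and F1-score side by side. The short Python sketch below illustrates why these metrics can diverge on an imbalanced dataset: a classifier can score high accuracy while still missing a sizable share of spam (false negatives). The confusion-matrix counts and the names `Confusion`, `accuracy`, `precision`, `recall`, and `f1` are hypothetical, chosen only for illustration; they are not taken from the thesis and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts, with spam as the positive class."""
    tp: int  # spam correctly flagged
    fn: int  # spam missed (false negatives)
    fp: int  # ham wrongly flagged as spam
    tn: int  # ham correctly passed


def accuracy(c: Confusion) -> float:
    total = c.tp + c.fn + c.fp + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: Confusion) -> float:
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    actual_spam = c.tp + c.fn
    return c.tp / actual_spam if actual_spam else 0.0


def f1(c: Confusion) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts (NOT from the thesis): 1000 messages, 130 of them spam.
counts = Confusion(tp=100, fn=30, fp=5, tn=865)

print(f"accuracy : {accuracy(counts):.3f}")
print(f"precision: {precision(counts):.3f}")
print(f"recall   : {recall(counts):.3f}")
print(f"f1       : {f1(counts):.3f}")
```

With these illustrative counts, accuracy is dominated by the large ham class, while recall exposes the missed spam. This is one reason a study aiming to minimize false negatives would report recall and F1-score in addition to accuracy.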

