Enhancement of random forest algorithm applied to SMS fraud detection
By: Liwag, Justin E.; Balaoro, Clarisse Anne D
Publisher: c2025
Description: Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
Content type: text
Media type: unmediated
Carrier type: volume
LOC classification: QA76.9 A43 L59 2025

| Item type | Current location | Home library | Collection | Call number | Status | Date due | Barcode | Item holds |
|---|---|---|---|---|---|---|---|---|
| Thesis/Dissertation | PLM | PLM Filipiniana Section | Filipiniana-Thesis | QA76.9 A43 L59 2025 | Available | | FT8928 | |
ABSTRACT: Random Forest is a powerful machine learning algorithm that builds multiple decision trees from randomly selected subsets of features and data. However, its performance declines on imbalanced datasets, which reduces accuracy. Selecting features randomly slows down model training and contributes to a lack of interpretability. This study, entitled Enhancement of Random Forest Algorithm Applied to SMS Fraud Detection, aims to enhance the algorithm’s ability to manage imbalanced datasets and minimize false negatives in classifying fraudulent messages. Spectral Co-Clustering, Reduced Error Pruning, and a Contextual Feature Contribution Network (CFCN) were incorporated to improve the algorithm’s accuracy, training time, and transparency. The enhanced algorithm was evaluated using the SMS Spam Collection Dataset, with performance metrics compared against the existing Random Forest. The results show that accuracy increased by 1 percentage point (from 97% to 98%), recall for spam detection by 5 percentage points (from 78% to 83%), and F1-score by 2 percentage points (from 93% to 95%). Reduced Error Pruning reduced time by 65.8%, enhancing computational efficiency. The CFCN provided transparent insights into feature contributions, addressing the “black-box” nature of traditional models. These enhancements strengthen the model’s ability to detect SMS fraud while maintaining robustness against imbalanced data. The study contributes to fraud detection systems by offering a more accurate, efficient, and interpretable machine learning framework for safeguarding digital communication.
