Enhancement of naïve Bayes classifier algorithm applied to email spam filtering
By: Masan, Jhon Patrick D.; Molon, Miriam Juliene F.
Publisher: c2025
Description: Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
Content type: text
Media type: unmediated
Carrier type: volume
LOC classification: QA76.9 A43 M37 2025

| Item type | Current location | Home library | Collection | Call number | Status | Date due | Barcode | Item holds |
|---|---|---|---|---|---|---|---|---|
| Thesis/Dissertation | PLM | PLM Filipiniana Section | Filipiniana-Thesis | QA76.9 A43 M37 2025 | Available | | FT8930 | |
ABSTRACT: This study focuses on enhancing the Naïve Bayes classifier for email spam detection by addressing its core limitations: high dimensionality, zero-probability issues, and class imbalance. Specifically, the enhancement aims to reduce the impact of high dimensionality through Term Frequency-Inverse Document Frequency (TF-IDF), resolve the zero-probability problem using Laplace Smoothing for more reliable probability estimation, and address class imbalance by applying the Synthetic Minority Over-sampling Technique (SMOTE) to improve spam recognition. A labeled dataset of email messages was used for training and evaluation. Results showed that these enhancements significantly improved classification performance. Accuracy increased from 0.96 to 0.99, demonstrating overall improvement in correct predictions. The macro average F1 score rose from 0.90 to 0.98, indicating better balance across both classes. Recall for the spam class improved from 0.71 to 0.99, while its F1 score increased from 0.83 to 0.96, reflecting greater success in detecting and classifying spam. Cross-validation F1 scores also improved from approximately 0.91 to 0.99, confirming enhanced model generalization and stability. These findings demonstrate that the proposed enhancements make the Naïve Bayes classifier significantly more robust, accurate, and effective for spam filtering applications.
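
ILLUSTRATION: The zero-probability problem named in the abstract, and the way Laplace smoothing avoids it, can be sketched in a few lines of plain Python. This is only a minimal illustration and not the authors' implementation, which this record does not show. The toy messages, the function names (`train`, `log_likelihood`, `predict`), and the `alpha` parameter are all assumptions made for the example. TF-IDF weighting and SMOTE oversampling are left out to keep the sketch short.

```python
import math
from collections import Counter

# Hypothetical toy data for illustration only (not the thesis dataset).
TRAIN = [
    ("win free prize now", "spam"),
    ("free cash offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
    ("lunch on monday", "ham"),
]


def train(docs, alpha=1.0):
    """Collect the counts needed by a multinomial Naive Bayes model.

    alpha is the additive smoothing constant: alpha=1.0 is Laplace
    smoothing, alpha=0.0 disables smoothing entirely.
    """
    class_docs = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in class_docs}
    for text, label in docs:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return {
        "class_docs": class_docs,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def log_likelihood(model, word, label):
    """Return log P(word | label) using additive smoothing."""
    counts = model["word_counts"][label]
    alpha = model["alpha"]
    numerator = counts[word] + alpha
    denominator = sum(counts.values()) + alpha * len(model["vocab"])
    # Without smoothing, a word never seen in this class has probability 0.
    return math.log(numerator / denominator) if numerator > 0 else float("-inf")


def predict(model, text):
    """Return (best_label, per-class log scores) for a message."""
    scores = {}
    for label, n in model["class_docs"].items():
        score = math.log(n / model["n_docs"])  # log prior
        for word in text.split():
            if word in model["vocab"]:  # skip words unseen in training
                score += log_likelihood(model, word, label)
        scores[label] = score
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    message = "free prize monday"
    for alpha in (0.0, 1.0):
        label, scores = predict(train(TRAIN, alpha=alpha), message)
        print(f"alpha={alpha}: predicted={label}, scores={scores}")
```

With `alpha=0.0`, the message "free prize monday" scores negative infinity for both classes, because each class is missing at least one of its words, so the model cannot rank them. With `alpha=1.0`, both scores are finite and the message is assigned to the spam class.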
