000 01998nam a22001817a 4500
003 FT8930
005 20260112150231.0
050 _aQA76.9 A43 M37 2025
100 1 _aMasan, Jhon Patrick D.; Molon, Miriam Juliene F.
245 _aEnhancement of naïve Bayes classifier algorithm applied to email spam filtering
264 1 _cc2025
300 _bUndergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
336 _2text
_atext
_btext
337 _2unmediated
_aunmediated
_bunmediated
338 _2volume
_avolume
_bvolume
505 _aABSTRACT: This study focuses on enhancing the Naïve Bayes classifier for email spam detection by addressing its core limitations: high dimensionality, zero-probability issues, and class imbalance. Specifically, the enhancement aims to reduce the impact of high dimensionality through Term Frequency-Inverse Document Frequency (TF-IDF), resolve the zero-probability problem using Laplace Smoothing for more reliable probability estimation, and address class imbalance by applying the Synthetic Minority Over-sampling Technique (SMOTE) to improve spam recognition. A labeled dataset of email messages was used for training and evaluation. Results showed that these enhancements significantly improved classification performance. Accuracy increased from 0.96 to 0.99, demonstrating overall improvement in correct predictions. The macro average F1 score rose from 0.90 to 0.98, indicating better balance across both classes. Recall for the spam class improved from 0.71 to 0.99, while its F1 score increased from 0.83 to 0.96, reflecting greater success in detecting and classifying spam. Cross-validation F1 scores also improved from approximately 0.91 to 0.99, confirming enhanced model generalization and stability. These findings demonstrate that the proposed enhancements make the Naïve Bayes classifier significantly more robust, accurate, and effective for spam filtering applications.
942 _2lcc
_cMS
999 _c37427
_d37427