Enhancement of logistic regression algorithm applied in email spam detection (Record no. 37392)

000 -LEADER
fixed length control field 02193nam a22001457a 4500
003 - CONTROL NUMBER IDENTIFIER
control field FT8905
050 ## - LIBRARY OF CONGRESS CALL NUMBER
Classification number QA76.9 A43 C37 2025
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Carlos, Vince Anthony S.; Pancho, John Cedric C.
245 ## - TITLE STATEMENT
Title Enhancement of logistic regression algorithm applied in email spam detection
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Date of production, publication, distribution, manufacture, or copyright notice c2025
300 ## - PHYSICAL DESCRIPTION
Other physical details Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
338 ## - CARRIER TYPE
Source volume
Carrier type term volume
Carrier type code volume
505 ## - FORMATTED CONTENTS NOTE
Formatted contents note ABSTRACT: Logistic regression is a popular binary classification approach, but like any machine learning algorithm, it has limitations, such as class imbalance, large datasets, and overfitting, which reduce its accuracy and efficiency. This study enhanced the Logistic Regression algorithm’s performance for email spam detection by addressing these problems using Term Frequency-Inverse Document Frequency (TF-IDF) for class imbalance, Recursive Feature Elimination (RFE) for large datasets, and Principal Component Analysis (PCA) for overfitting. TF-IDF improves feature representation, highlighting key terms that differentiate spam from non-spam. RFE systematically eliminates irrelevant features, reducing computational complexity and enhancing efficiency, particularly for large datasets. PCA mitigates overfitting by reducing the dimensionality of the feature space, helping the model generalize to unseen data. Experimental results showed that the enhanced Logistic Regression model significantly improved spam detection accuracy, achieving up to 98% accuracy with TF-IDF compared to the baseline model’s 91%. RFE reduced training time by 35% while maintaining robust performance on large datasets, and PCA improved model generalization by reducing variance in predictions. The proposed enhancements successfully address the key limitations of traditional Logistic Regression models in spam detection. This refined approach improves predictive accuracy, computational efficiency, and robustness, making it highly applicable to real-world email security systems and enhancing spam filtering effectiveness.
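Illustrative note (not part of the thesis): the abstract's claim that TF-IDF highlights terms that differentiate spam from non-spam can be sketched in a few lines. The function name `tfidf`, the toy emails, and the unsmoothed weighting (term frequency times log(N / document frequency)) are assumptions made for illustration; the record does not say which TF-IDF variant or library the authors used.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (toy, unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Weight = relative term frequency * log(N / document frequency).
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, invented for this sketch.
emails = [
    "win free prize now",
    "meeting agenda for monday",
    "free prize claim now",
    "project meeting notes",
]
weights = tfidf(emails)
```

In this toy corpus, a word that appears in only one email ("win") receives a higher weight than one shared by two emails ("free"), and a word present in every document would receive a weight of zero.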
942 ## - ADDED ENTRY ELEMENTS
Source of classification or shelving scheme
Item type Thesis/Dissertation
Holdings
Withdrawn status
Lost status
Source of classification or shelving scheme
Damaged status
Not for loan
Collection code Filipiniana-Thesis
Permanent Location PLM
Current Location PLM
Shelving location Filipiniana Section
Date acquired 2025-10-24
Fund Source donation
Total Checkouts
Full call number QA76.9 A43 C37 2025
Barcode FT8905
Date last seen 2026-01-05
Price effective from 2026-01-05
Item type Thesis/Dissertation
