000 02496nam a22002417a 4500
003 ft8875
005 20251215170130.0
008 251215b ||||| |||| 00| 0 eng d
041 _aengtag
050 _aQA76.9 A43 B39 2025
082 _a.
100 1 _aArpon, Jasmia C.; Japson, Denise H.
245 _aEnhancement of K-means algorithm applied to movie recommendation system
264 1 _a.
_b.
_cc2025
300 _bUndergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
336 _2 text
_a text
_b text
337 _2unmediated
_aunmediated
_bunmediated
338 _2volume
_avolume
_bvolume
505 _aABSTRACT: This study aims to enhance the traditional K-Means clustering algorithm, which is known for its sensitivity to outliers, its reliance on a manually selected number of clusters, and its difficulty in clustering data of varying sizes and densities. To address these issues, the enhanced algorithm integrates three modifications: optimal cluster selection using the Calinski-Harabasz Index (CHI), outlier detection through the Local Outlier Factor (LOF), and the use of Cosine Similarity as the distance metric. The CHI indicated that 2 clusters were optimal, compared with the 5 clusters used in the original method, which simplifies interpretation and automates the selection of k. To address the difficulty with data of varying size and density, the enhanced method uses Cosine Similarity, which allowed it to handle clusters with irregular shapes and varying densities more effectively than Euclidean distance. This resulted in clearer boundaries and reduced overlap between user groups. Lastly, to address the algorithm’s sensitivity to outliers, LOF was applied, which identified and removed 51 outliers from the original 610-user dataset. This resulted in tighter, less noisy clusters. Together, these enhancements improved the silhouette score from 0.01012 to 0.1359, indicating greater intra-cluster cohesion and inter-cluster separation. The results, visualized through comparative plots, highlight the performance advantage of the enhanced algorithm in generating cleaner and more meaningful clusters. Overall, the enhanced K-Means method is more effective at capturing user preferences because it generates more accurate and robust clusters, making it a valuable tool for recommendation systems and user behavior analysis.
526 _aF
655 _aacademic writing
942 _2lcc
_cMS
999 _c37352
_d37352
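
Illustrative note on the abstract: the three modifications it names (LOF outlier removal, CHI-based selection of k, and Cosine Similarity as the distance metric) could be combined roughly as in the sketch below. This is not the authors' implementation. The function names, the order of the steps, the LOF neighbourhood size (lof_k), the LOF cut-off of 1.5, the candidate values of k, and the use of unit-length vectors with a spherical K-Means update are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def lof_scores(points, k):
    """Local Outlier Factor of each point; values well above 1 suggest an outlier."""
    n = len(points)
    dist = [[cosine_distance(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding the point itself.
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    k_dist = [dist[i][knn[i][-1]] for i in range(n)]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in knn[i]) / k
        lrd.append(1.0 / (reach + 1e-12))  # small constant avoids division by zero
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]


def spherical_kmeans(points, k, iters=50, seed=0):
    """K-Means on unit vectors: assign by cosine distance, re-normalize the means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: cosine_distance(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = normalize([sum(col) / len(members)
                                          for col in zip(*members)])
    return labels


def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion (squared Euclidean distance)."""
    clusters = sorted(set(labels))
    n, k = len(points), len(clusters)
    if k < 2 or n <= k:
        return 0.0
    mean = lambda rows: [sum(col) / len(rows) for col in zip(*rows)]
    sq = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    overall = mean(points)
    between = within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        mu = mean(members)
        between += len(members) * sq(mu, overall)
        within += sum(sq(p, mu) for p in members)
    return float("inf") if within == 0 else (between / (k - 1)) / (within / (n - k))


def enhanced_kmeans(rows, k_values=(2, 3, 4, 5), lof_k=3, lof_threshold=1.5):
    """Hypothetical pipeline: LOF filtering, then CHI-selected k with cosine K-Means."""
    unit = [normalize(r) for r in rows]
    scores = lof_scores(unit, lof_k)
    kept = [i for i, s in enumerate(scores) if s <= lof_threshold]  # assumed cut-off
    data = [unit[i] for i in kept]
    best = None
    for k in k_values:
        labels = spherical_kmeans(data, k)
        chi = calinski_harabasz(data, labels)
        if best is None or chi > best[0]:
            best = (chi, k, labels)
    return {"k": best[1], "labels": best[2], "kept": kept, "chi": best[0]}
```

Calling enhanced_kmeans(rows) on a list of per-user rating vectors returns the selected k, the cluster label of each retained user, and the indices of the users that passed the LOF filter. On unit-length vectors the squared Euclidean distance equals twice the cosine distance, which is why the CHI step can reuse the standard Euclidean formula here.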