Enhancement of BIRCH algorithm by Celiz and Mayo applied in customer segmentation

By: Ponce, Lemuel A.; Satuito, Janine Beatriz M.; Tesoro, Russelliza B.
Language: English
Publisher: c2025
Description: Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
Content type: text
Media type: unmediated
Carrier type: volume
Genre/Form: academic writing
LOC classification: QA76.9 A43 P66 2025
Contents:
ABSTRACT: The BIRCH algorithm is known for its ability to cluster large datasets. However, like some algorithms, it also faces challenges. First, the algorithm struggles to cluster irregularly shaped data effectively, resulting in imprecise clustering of non-spherical data. Second, the existing BIRCH does not remove noise, which reduces clustering accuracy, distorts cluster boundaries, and thus affects its performance. Lastly, the existing BIRCH relies heavily on traditional distance metrics, which handle categorical components inadequately and lead to suboptimal clustering outcomes. To address the difficulty with irregularly shaped data, the proposed method applies the whole iteration phase of the clustering algorithm to capture irregular data within a dataset. To address noise, the proposed technique implements the Local Outlier Factor in the algorithm's process to identify and reduce noise and thereby improve clustering accuracy. Finally, for better handling of categorical data, the proposed technique applies the Gower Distance Metric to the clustering of mixed-type data, optimizing clustering performance for both numerical and categorical data. Applying the whole iteration phase gave the enhanced BIRCH a higher Silhouette Score than the existing BIRCH algorithm. Implementing the Local Outlier Factor gave most of the datasets a higher Adjusted Rand Index than the existing algorithm. Lastly, incorporating the Gower Distance Metric produced good Mutual Information (MI) results, with most datasets obtaining higher MI than with the existing algorithm. Overall, the challenges in the existing algorithm were addressed, and the enhanced BIRCH showed better performance.
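As an illustration of the Gower distance named in the abstract, the following is a minimal Python sketch of the standard definition: each numeric feature contributes its absolute difference divided by that feature's range, each categorical feature contributes 0 on a match and 1 on a mismatch, and the contributions are averaged. It is not taken from the thesis. The function names, the equal feature weighting, and the sample customer records are assumptions made for this example only.

```python
from typing import List, Optional, Sequence, Union

Value = Union[int, float, str]


def feature_ranges(rows: Sequence[Sequence[Value]],
                   categorical: Sequence[bool]) -> List[Optional[float]]:
    """Return max - min for each numeric column and None for categorical ones.

    Hypothetical helper: `categorical[k]` marks column k as categorical.
    """
    ranges: List[Optional[float]] = []
    for k, is_cat in enumerate(categorical):
        if is_cat:
            ranges.append(None)
        else:
            column = [float(row[k]) for row in rows]
            ranges.append(max(column) - min(column))
    return ranges


def gower_distance(a: Sequence[Value], b: Sequence[Value],
                   ranges: Sequence[Optional[float]]) -> float:
    """Gower distance between two mixed-type records, equal feature weights."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical feature: simple matching (0 if equal, 1 otherwise).
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numeric feature: range-normalised absolute difference.
            total += abs(float(x) - float(y)) / r
        # A numeric feature with zero range contributes nothing.
    return total / len(ranges)


# Made-up customer records: (age, annual spend, preferred channel).
customers = [
    (25, 1000.0, "online"),
    (45, 3000.0, "store"),
    (65, 5000.0, "online"),
]
ranges = feature_ranges(customers, categorical=[False, False, True])
d01 = gower_distance(customers[0], customers[1], ranges)
d02 = gower_distance(customers[0], customers[2], ranges)
print(d01, d02)
```

Because every per-feature contribution lies between 0 and 1, the distance is also bounded by 0 and 1 regardless of the units of the numeric columns, which is what makes it usable on mixed numerical and categorical customer data.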
Item type: Thesis/Dissertation
Current location: PLM
Home library: PLM, Filipiniana Section
Collection: Filipiniana-Thesis
Call number: QA76.9 A43 P66 2025
Status: Available
Barcode: FT8894


