Enhancement of BIRCH algorithm by Celiz and Mayo applied in customer segmentation
By: Ponce, Lemuel A.; Satuito, Janine Beatriz M.; Tesoro, Russelliza B
Language: English
Publisher: c2025
Description: Undergraduate Thesis: (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
Content type: text
Media type: unmediated
Carrier type: volume
Genre/Form: academic writing
LOC classification: QA76.9 A43 P66 2025

| Item type | Current location | Home library | Collection | Call number | Status | Date due | Barcode | Item holds |
|---|---|---|---|---|---|---|---|---|
| Thesis/Dissertation | PLM | PLM Filipiniana Section | Filipiniana-Thesis | QA76.9 A43 P66 2025 | Available | | FT8894 | |
ABSTRACT: The BIRCH algorithm is known for being able to cluster large datasets. However, like some algorithms, it also faces challenges. First, the algorithm still struggles to cluster irregularly shaped data effectively, resulting in imprecise clustering results for non-spherical data. Second, noise is still present in the existing BIRCH, leading to reduced clustering accuracy and distorted cluster boundaries, which affects its performance. Lastly, the existing BIRCH relies heavily on traditional distance metrics, resulting in inadequate handling of categorical components, which yields suboptimal clustering outcomes. To address the difficulty with irregularly shaped data, the proposed method applies the whole iteration phase of the clustering algorithm to capture irregular data within a dataset. To address noise, the proposed technique implements the Local Outlier Factor in the algorithm’s process to identify and reduce noise and thereby improve clustering accuracy. Finally, for better handling of categorical data, the proposed technique applies the Gower Distance Metric to the clustering of mixed-type data, to optimize clustering performance for both numerical and categorical data. Applying the whole iteration phase gave the enhanced BIRCH a higher Silhouette Score than the existing BIRCH algorithm. With the Local Outlier Factor implemented, most of the datasets obtained a higher Adjusted Rand Index than with the existing algorithm. Lastly, incorporating the Gower Distance Metric into the algorithm showed good Mutual Information (MI) results: most datasets obtained higher MI results than with the existing algorithm. Overall, the challenges in the existing algorithm have been successfully addressed, and the enhanced BIRCH showed better performance and yielded good results.
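Two of the techniques named in the abstract can be illustrated with a minimal Python sketch, assuming scikit-learn and NumPy are available. This is not the authors' implementation: the function names `lof_then_birch` and `gower_matrix` and all parameter values are hypothetical, the Gower distance shown is the common textbook form (range-normalized absolute difference for numeric columns, simple mismatch for categorical columns, equally weighted), and the abstract does not say how the thesis integrates either step into BIRCH itself.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.neighbors import LocalOutlierFactor


def lof_then_birch(X, n_clusters, n_neighbors=20, contamination=0.05):
    """Drop points flagged as outliers by LOF, then cluster the rest with BIRCH.

    Returns (mask, labels): mask is True for rows kept as inliers, and
    labels holds the BIRCH cluster label of each kept row.
    Assumption: filtering before clustering; the thesis may place LOF elsewhere.
    """
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, contamination=contamination)
    mask = lof.fit_predict(X) == 1  # fit_predict gives 1 for inliers, -1 for outliers
    labels = Birch(n_clusters=n_clusters).fit_predict(X[mask])
    return mask, labels


def gower_matrix(num, cat):
    """Pairwise Gower distance for numeric columns `num` and categorical columns `cat`.

    Both inputs are 2-D arrays with one row per record. Every feature
    contributes a value in [0, 1], and features are averaged with equal weight.
    """
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    ranges = num.max(axis=0) - num.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant columns contribute zero distance
    d_num = np.abs(num[:, None, :] - num[None, :, :]) / ranges
    d_cat = (cat[:, None, :] != cat[None, :, :]).astype(float)
    return np.concatenate([d_num, d_cat], axis=2).mean(axis=2)
```

The labels returned for the kept rows can then be scored with `sklearn.metrics.silhouette_score`, `adjusted_rand_score`, or `mutual_info_score`, the three measures the abstract reports.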
Filipiniana
