000 04455nam a2201225Ia 4500
000 02279ntm a2200193 i 4500
001 90699
003 0
005 20250920173710.0
008 240717n 000 0 eng d
010 _z
_z
_o
_a
_b
015 _22
_a
016 _2
_2
_a
_z
020 _e
_e
_a
_b
_z
_c
_q
_x
022 _y
_y
_l
_a2
024 _2
_2
_d
_c
_a
_q
028 _a
_a
_b
029 _a
_a
_b
032 _a
_a
_b
035 _a
_a
_b
_z
_c
_q
037 _n
_n
_c
_a
_b
040 _e
_erda
_a
_d
_b
_c
041 _e
_e
_a
_b
_g
_h
_r
043 _a
_a
_b
045 _b
_b
_a
050 _a
_a
_d
_b2
_c0
051 _c
_c
_a
_b
055 _a
_a
_b
060 _a
_a
_b
070 _a
_a
_b
072 _2
_2
_d
_a
_x
082 _a
_a
_d
_b2
_c
084 _2
_2
_a
086 _2
_2
_a
090 _a
_a
_m
_b
_q
092 _f
_f
_a
_b
096 _a
_a
_b
097 _a
_a
_b
100 _e
_e
_aPatricia Nicole C. Trajano, Ayra Shane C. Villacarlos.
_d
_b4
_u
_c0
_q16
110 _e
_e
_a
_d
_b
_n
_c
_k
111 _a
_a
_d
_b
_n
_c
130 _s
_s
_a
_p
_f
_l
_k
210 _a
_a
_b
222 _a
_a
_b
240 _s
_s
_a
_m
_g
_n
_f
_l
_o
_p
_k
245 0 _a
_aAn Enhanced Hartigan-Wong algorithm applied for determining the crime rates per area in the Philippines.
_d
_b
_n
_c
_h6
_p
246 _a
_a
_b
_n
_i
_f6
_p
249 _i
_i
_a
250 _6
_6
_a
_b
260 _e
_e
_a
_b
_f
_c
_g
264 _3
_3
_a
_d
_b
_c4538346
300 _e
_e
_c
_a
_b
310 _a
_a
_b
321 _a
_a
_b
336 _b
_atext
_2rdacontent
337 _3
_30
_b
_aunmediated
_2rdamedia
338 _3
_30
_b
_avolume
_2rdacarrier
340 _2
_20
_g
_n
344 _2
_2
_a0
_b
347 _2
_2
_a0
362 _a
_a
_b
385 _m
_m
_a2
410 _t
_t
_b
_a
_v
440 _p
_p
_a
_x
_v
490 _a
_a
_x
_v
500 _a
_aUndergraduate Thesis : (Bachelor of Science in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2024.
_d
_b
_c56
504 _a
_a
_x
505 _a
_a
_b
_t
_g
_r
506 _a
_a5
510 _a
_a
_x
520 _b
_b
_c
_aABSTRACT: Clustering algorithms are crucial in data analysis, where they are used to analyse and find patterns within datasets. The Hartigan-Wong algorithm, a version of K-means, is widely used in many different applications, but it has limitations in the accuracy of the final clusters for high-dimensional data, in identifying outliers in the data, and in its initialization of centroids. To address these issues, the authors first integrate the Isolation Forest algorithm to identify and remove outliers present in the dataset, thereby improving the quality of the input data. Second, Principal Component Analysis (PCA) is employed as a dimensionality reduction method to address the issue of high-dimensional data. Lastly, the Quintile Method is used to determine the optimal initial centroids, especially for high-dimensional datasets. The researchers evaluate these enhancements using real-world crime data from the Philippines, comparing against existing Hartigan-Wong implementations. The results demonstrate improved accuracy and efficiency over the existing implementations of the Hartigan-Wong algorithm. Overall, the study contributes to the advancement of data clustering by enhancing the Hartigan-Wong algorithm to better suit the complexities of high-dimensional data. The enhancements offer valuable insights for improving the applicability and performance of the algorithm in real-world scenarios involving high-dimensional data.
_u
521 _a
_a
_b
533 _e
_e
_a
_d
_b
_n
_c
540 _c
_c
_a5
542 _g
_g
_f
546 _a
_a
_b
583 _5
_5
_k
_c
_a
_b
590 _a
_a
_b
600 _b
_b
_v
_t
_c2
_q
_a
_x0
_z
_d
_y
610 _b
_b
_v
_t2
_x
_a
_k0
_p
_z
_d6
_y
611 _a
_a
_d
_n2
_c0
_v
630 _x
_x
_a
_d
_p20
_v
648 _2
_2
_a
650 _x
_x
_a
_d
_b
_z
_y20
_v
651 _x
_x
_a
_y20
_v
_z
655 _0
_0
_a
_y2
_z
700 _i
_i
_t
_c
_b
_s1
_q
_f
_k40
_p
_d
_e
_a
_l
_n6
710 _b
_b
_t
_c
_e
_f
_k40
_p
_d5
_l
_n6
_a
711 _a
_a
_d
_b
_n
_t
_c
730 _s
_s
_a
_d
_n
_p
_f
_l
_k
740 _e
_e
_a
_d
_b
_n
_c6
753 _c
_c
_a
767 _t
_t
_w
770 _t
_t
_w
_x
773 _a
_a
_d
_g
_m
_t
_b
_v
_i
_p
775 _t
_t
_w
_x
776 _s
_s
_a
_d
_b
_z
_i
_t
_x
_h
_c
_w
780 _x
_x
_a
_g
_t
_w
785 _t
_t
_w
_a
_x
787 _x
_x
_d
_g
_i
_t
_w
800 _a
_a
_d
_l
_f
_t0
_q
_v
810 _a
_a
_b
_f
_t
_q
_v
830 _x
_x
_a
_p
_n
_l0
_v
942 _a
_alcc
_cBK
999 _c25324
_d25324
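
The centroid-initialization step named in the abstract (field 520) can be illustrated with a small sketch. The record does not describe how the Quintile Method works, so the seeding rule below (order points by feature sum, cut the ordering into k equal-sized groups, average each group) is an assumption made for illustration only, and the names `quantile_init` and `assign` are hypothetical. It is not the authors' method and it does not implement Hartigan-Wong itself; it shows only a single nearest-centroid assignment pass starting from quantile-style seeds.

```python
from statistics import fmean


def quantile_init(points, k):
    """Hypothetical quantile-style seeding (an assumption, not the thesis method).

    Points are ordered by the sum of their features, the ordering is cut
    into k equal-sized groups, and each group's mean becomes a centroid.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    ordered = sorted(points, key=sum)
    n = len(ordered)
    centroids = []
    for i in range(k):
        # Integer arithmetic keeps every group non-empty when k <= n.
        group = ordered[i * n // k:(i + 1) * n // k]
        centroids.append(tuple(fmean(column) for column in zip(*group)))
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    def squared_distance(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


# Toy two-feature data standing in for per-area crime indicators.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8), (9.0, 9.0), (9.2, 8.8)]
centroids = quantile_init(points, 3)
labels = assign(points, centroids)
```

Deterministic seeds of this kind remove the run-to-run variation that random initialization introduces, which is one plausible reason to prefer a quantile-based rule; whether the thesis relies on that property is not stated in this record.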