000 02383nam a22002417a 4500
003 FT8885
005 20251217152416.0
008 251217b ||||| |||| 00| 0 eng d
041 _aengtag
050 _aQA76.9 A73 S47 2025
082 _2.
100 1 _a Servo, Samantha Vivien I.; Inso, Kelly Denise A.
245 _aModified support vector machine algorithm for text classification applied psychiatric tele-triage
264 1 _a.
_b.
_cc2025
300 _bUndergraduate Thesis: (Bachelor of Science Computer Science) - Pamantasan ng Lungsod ng Maynila, 2025
336 _2text
_atext
_btext
337 _2unmediated
_aunmediated
_bunmediated
338 _2volume
_avolume
_bvolume
505 _aABSTRACT: This study investigates the use of Support Vector Machine (SVM) models to enhance text classification for tele-triage in psychiatry. The issue addressed is SVM’s tendency to ignore significant textual features, which results in low precision and recall, particularly in multi-class classification tasks with imbalanced classes. To address this, the researchers propose generating embeddings using the Large Language Model (LLM) RoBERTa, then reducing the dimensionality using PCA before training the SVM model. The dataset includes 500 Reddit posts in five categories of suicide risk: Attempt, Behavior, Ideation, Indicator, and Supportive. Experts used the Columbia Suicide Severity Rating Scale (C-SSRS) to sort these posts. Results show significant improvement over the baseline SVM model. The baseline model had trouble with recall and precision, especially for the Attempt class, which had zero precision. Significant improvements were observed in the Supportive class (precision: 0.55 to 0.59, recall: 0.43 to 0.57) and the Behavior class (precision: 0.25 to 0.31, recall: 0.13 to 0.27) following the implementation of the RoBERTa-based strategy. Even though the Attempt class demonstrated some improvement (precision: 0.00 to 0.33), more optimization is required. These results suggest that incorporating RoBERTa embeddings and PCA for dimensionality reduction can enhance SVM’s performance by preventing the loss of important features. The model still has issues with minority classes, suggesting that more research is needed to enhance recall for underrepresented categories and handle class imbalances.
526 _aF
655 _aacademic writing
942 _2lcc
_cMS
999 _c37365
_d37365