A further enhancement of Tagalog stemming algorithm applied in Tagalog Dictionary searching
By: Jonathan Aaron F. Baltazar and John Romel C. Japson
Language: English
Publisher: c2018
Description: Undergraduate Thesis (BSCS major in Computer Science), Pamantasan ng Lungsod ng Maynila, 2018
Content type: text
Media type: unmediated
Carrier type: volume
Genre/Form: academic writing
LOC classification: QA76.9 B35 2018

| Item type | Current location | Home library | Collection | Call number | Status | Date due | Barcode | Item holds |
|---|---|---|---|---|---|---|---|---|
| Archival materials | PLM | PLM Archives | Filipiniana-Thesis | QA76.9 B35 2018 | Available | | FT6459 | |
ABSTRACT: The Tagalog Stemming Algorithm (TagSA) is an algorithm developed to accept all forms of Tagalog words as input. It is mainly used in information retrieval systems to improve performance. In this study, it serves as a morphological analyzer that extracts root words from Filipino words conjugated in different forms and reports the affixes used and the tense of the original input word. By examining the existing enhanced algorithm, the proponents found three problems. First, the algorithm cannot handle morphophonemic changes, so it produces invalid words as output. Second, it removes infixes, which yields unexpected words as output. Third, it cannot stem consecutive prefixes, which causes words to be under-stemmed. To solve these problems, the proponents set three specific objectives: first, to apply morphophonemic changes in the algorithm; second, to recognize words whose infixes are part of the expected root word; and third, to extend the Prefix Removal Routine so that it can stem consecutive prefixes in a word. The proponents conducted the study to produce an enhanced algorithm that accomplishes these objectives. For the first problem and objective, the researchers studied morphophonemic changes in Tagalog words and applied them in the enhanced algorithm. For the second, the researchers added a case-by-case treatment of infixes that follows the Tagalog syllable structure. For the third, the enhanced algorithm removes prefixes character by character, from shortest to longest, to prevent over-stemming. After applying these solutions, the proponents were able to produce a valid root word for each word given as input to the enhanced algorithm.
The proponents concluded that every language is unique and has its own structure, so each can have its own algorithm and routines, as the Tagalog Stemming Algorithm does. The proponents also developed an application that incorporates the enhanced algorithm and looks up the stemmed word in a Tagalog dictionary.
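The third objective in the abstract, stemming consecutive prefixes from shortest to longest without over-stemming, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the prefix list, the tiny root lexicon, the minimum-length guard, and the syllable-onset check are all assumptions made for illustration, and morphophonemic changes and infix handling are not covered.

```python
# Illustrative sketch only; every name and value below is hypothetical.
VOWELS = set("aeiou")
PREFIXES = ["i", "ka", "ma", "na", "pa", "mag", "nag", "pag"]  # assumed sample
ROOTS = {"kain", "luto", "sulat"}                              # assumed lexicon


def has_valid_onset(s):
    """Assumed syllable check: the remainder must start with a vowel,
    or with a single consonant (or the digraph 'ng') followed by a vowel."""
    if not s:
        return False
    if s[0] in VOWELS:
        return True
    if s.startswith("ng"):
        return len(s) > 2 and s[2] in VOWELS
    return len(s) > 1 and s[1] in VOWELS


def strip_prefixes(word, roots=ROOTS, prefixes=PREFIXES, min_len=3):
    """Strip consecutive prefixes, trying shorter prefixes first.

    Stops when the word is a known root or when no prefix can be removed
    without leaving a remainder that is too short or badly formed.
    """
    removed = []
    current = word
    while current not in roots:
        for prefix in sorted(prefixes, key=len):
            rest = current[len(prefix):]
            if (current.startswith(prefix)
                    and len(rest) >= min_len
                    and has_valid_onset(rest)):
                removed.append(prefix)
                current = rest
                break
        else:
            # No prefix applied in this pass, so stop stemming.
            break
    return current, removed


print(strip_prefixes("nagpakain"))  # ('kain', ['nag', 'pa'])
print(strip_prefixes("mata"))       # ('mata', [])
```

In this sketch the onset check is what stops the shorter prefix "na" from being removed from "nagpakain" (it would leave "gpakain"), and the minimum-length guard is what leaves "mata" unstemmed.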
Filipiniana
