A further enhancement of Tagalog stemming algorithm applied in Tagalog Dictionary searching

By: Jonathan Aaron F. Baltazar and John Romel C. Japson

Language: English
Publisher: c2018
Description: Undergraduate Thesis (BSCS major in Computer Science), Pamantasan ng Lungsod ng Maynila, 2018
Content type: text
Media type: unmediated
Carrier type: volume
Genre/Form: academic writing
LOC classification: QA76.9 B35 2018

Contents:

ABSTRACT: The Tagalog Stemming Algorithm (TagSA) is an algorithm designed to accept all forms of Tagalog words as input. It is mainly used in information retrieval systems to improve performance. In this study, it serves as a morphological analyzer that extracts the root word from Filipino words conjugated in different forms and reports the affixes used and the tense of the original input word. By analyzing and comparing the existing and enhanced algorithms, the proponents found three problems in the existing algorithm. First, the algorithm cannot handle morphophonemic changes, so it produces invalid words as output. Second, the algorithm removes infixes, which results in unexpected words as output. Third, the algorithm cannot stem consecutive prefixes, which leaves words under-stemmed. To solve these problems, the proponents set three specific objectives: first, to apply morphophonemic changes in the algorithm; second, to recognize words with infixes as part of the expected root word; and third, to extend the Prefix Removal Routine so that it can stem consecutive prefixes in a word. For the first problem and objective, the researchers studied morphophonemic changes in Tagalog words and applied them in the enhanced algorithm. For the second, they added a case-by-case solution for infixes that follows the Tagalog syllable structure. For the third, the enhanced algorithm removes prefixes character by character, from shortest to longest, to prevent over-stemming. With these solutions in place, the proponents were able to produce a valid root word for each word given as input to the enhanced algorithm.
The proponents concluded that every language is unique and has its own structure, so each can have its own algorithm and routines, as the Tagalog Stemming Algorithm does. The proponents also developed an application that uses the enhanced algorithm to look up the stemmed word in a Tagalog dictionary.
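
The third objective described in the abstract, stemming consecutive prefixes while trying shorter prefixes before longer ones, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `strip_prefixes`, the short `PREFIXES` list, the tiny `ROOTS` dictionary, and the use of a dictionary lookup as the stopping condition are all assumptions made for illustration. The sketch also does not model morphophonemic changes, infixes, or the character-by-character removal mentioned in the abstract.

```python
# Illustrative sketch only; the prefix list and root dictionary are
# hypothetical stand-ins, not data taken from the thesis.
PREFIXES = ("ma", "pa", "mag", "nag", "pag", "naki", "paka", "pinaka")
ROOTS = {"bait", "ganda", "laro"}


def strip_prefixes(word, roots=ROOTS, prefixes=PREFIXES):
    """Return (root, removed_prefixes), or None if no known root is reached.

    Prefixes are tried from shortest to longest. A removal is kept only
    if the remainder can itself be reduced to a word in `roots`, which
    allows several consecutive prefixes to be removed.
    """
    if word in roots:
        return word, []
    for prefix in sorted(prefixes, key=len):
        # Require a non-empty remainder after removing the prefix.
        if word.startswith(prefix) and len(word) > len(prefix):
            result = strip_prefixes(word[len(prefix):], roots, prefixes)
            if result is not None:
                root, removed = result
                return root, [prefix] + removed
    return None


print(strip_prefixes("pinakamaganda"))  # ('ganda', ['pinaka', 'ma'])
print(strip_prefixes("nakipaglaro"))    # ('laro', ['naki', 'pag'])
```

In this sketch, trying "ma" before "mag" on "maganda" yields the root "ganda" rather than the over-stemmed "anda", which is the kind of outcome the shortest-to-longest ordering is meant to encourage. A real stemmer would need a far larger affix inventory and dictionary.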

Holdings ( 1 )

Item type	Current location	Home library	Collection	Call number	Status	Date due	Barcode	Item holds
Archival materials	PLM	PLM Archives	Filipiniana-Thesis	QA76.9 B35 2018 (Browse shelf)	Available		FT6459

Filipiniana
