Enhancement of the Porter 2 stemming algorithm applied in Android mobile dictionary
By: Jonelle R. Atienza and Hector Allan Norbert A. Erni
Language: English
Published: c2013
Description: Undergraduate Thesis (Bachelor of Science in Computer Studies, major in Computer Science), Pamantasan ng Lungsod ng Maynila, 2013
Content type: text
Media type: unmediated
Carrier type: volume
LOC classification: QA76.15 A85 2013

| Item type | Current location | Home library | Collection | Call number | Status | Date due | Barcode | Item holds |
|---|---|---|---|---|---|---|---|---|
| Thesis/Dissertation | PLM | PLM Archives | Filipiniana-Thesis | QA76.15 A85 2013 | Available | | FT6134 | |
ABSTRACT: Stemming algorithms convert a word to its root form regardless of its context. Many stemming algorithms are in use today, such as Lovins and Krovetz. Researchers also use different methods to stem words, such as a lookup table or suffix stripping, in which a word is processed under a set of rules. The Porter 2 stemming algorithm, developed by Martin Porter, is considered the best of the most commonly used stemming algorithms because it produces the fewest stemming errors (overstemming or understemming). For that reason, the researchers decided to use the Porter 2 stemming algorithm in this study. The main goal of this study is to enhance the output generated by the existing Porter 2 stemming algorithm. The study went through testing and simulations to improve the existing algorithm and arrive at a better one. By modifying the set of rules defined in the existing algorithm, the researchers were able to accomplish the proposed objectives. Because the algorithm is applied in an Android mobile dictionary, the existing algorithm had to be weakened slightly so that it strips fewer suffixes from a word while still stemming the input word correctly, which it does by being made aware of the word's part of speech and context. After several tests and simulations, the enhanced algorithm performed better than the existing algorithm. For example, suppose the user enters the word "pasted". Under the existing algorithm, the word is stemmed to "past", which is a valid root word. In a mobile dictionary, however, the context of a word should be taken into consideration, so "pasted" should be stemmed to "paste", not "past". The enhanced algorithm fixes this problem.
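The "pasted" example in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's enhanced algorithm, which the record does not reproduce, and `naive_strip_ed` is only a stand-in for one suffix-stripping rule, not the full Porter 2 rule set. The lexicon, its part-of-speech tags, and both function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy lexicon: word -> set of part-of-speech tags.
# A real dictionary application would load this from its word list.
LEXICON = {
    "past": {"noun", "adjective", "preposition"},
    "paste": {"noun", "verb"},
    "walk": {"noun", "verb"},
    "hope": {"noun", "verb"},
}


def naive_strip_ed(word):
    """Plain suffix stripping: drop a trailing 'ed' with no lexicon check."""
    if word.endswith("ed") and len(word) > 4:
        return word[:-2]
    return word


def lexicon_aware_strip_ed(word, lexicon=LEXICON):
    """Prefer a candidate stem that the lexicon lists as a verb.

    An '-ed' form is normally a verb form, so a stem tagged as a verb
    is assumed to be the better dictionary headword.
    """
    if not (word.endswith("ed") and len(word) > 4):
        return word
    # For 'pasted': try 'paste' (drop 'd') before 'past' (drop 'ed').
    candidates = [word[:-1], word[:-2]]
    for stem in candidates:
        if "verb" in lexicon.get(stem, set()):
            return stem
    # Fall back to the plain rule when no verb entry is found.
    return naive_strip_ed(word)


if __name__ == "__main__":
    for w in ["pasted", "walked", "hoped"]:
        print(w, naive_strip_ed(w), lexicon_aware_strip_ed(w))
```

With this toy lexicon, the plain rule reduces "pasted" to "past", while the lexicon-aware variant returns "paste" because only "paste" is tagged as a verb. Words missing from the lexicon fall back to the plain rule.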