
Enhancement of the Porter 2 stemming algorithm applied in Android mobile dictionary

By: Jonelle R. Atienza and Hector Allan Norbert A. Erni

Language: English
Publication details: c2013
Description: Undergraduate Thesis (Bachelor of Science in Computer Studies major in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2013
Content type: text
Media type: unmediated
Carrier type: volume
LOC classification: QA76.15 A85 2013

Contents:

ABSTRACT: Stemming algorithms convert a word to its root form regardless of its context. Many stemming algorithms are in use today, such as Lovins and Krovetz. Researchers also use different methods to stem words, such as a lookup table or suffix stripping, in which a word is processed under a set of rules. The Porter 2 Stemming Algorithm was developed by Martin Porter and is considered the best of the most commonly used stemming algorithms, since it produces the fewest occurrences of incorrect suffix stemming (overstemming or understemming). For this reason, the researchers decided to use the Porter 2 Stemming Algorithm in this study. The main goal of this study is to enhance the output generated by the existing Porter 2 Stemming Algorithm. The study went through testing and simulations to improve the existing algorithm and arrive at a better one. By modifying the set of rules defined in the existing algorithm, the researchers were able to accomplish the proposed objectives. Since the algorithm is applied in an Android mobile dictionary, the existing algorithm had to be weakened slightly so that it strips fewer suffixes from a word, while still stemming the input word correctly by making it aware of the word’s part of speech and context. After several tests and simulations, the enhanced algorithm performed better than the existing algorithm. For example, suppose the user enters the word pasted. Under the existing algorithm, the word is stemmed to past, which is a valid root word. But because the algorithm is applied in a mobile dictionary, the context of the word should be taken into consideration: pasted should be stemmed to paste, not past. The enhanced algorithm fixes this sample problem.
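The pasted example in the abstract can be illustrated with a minimal Python sketch. This is not the authors' enhanced algorithm and not the Porter 2 rule set; it only contrasts a bare suffix-stripping rule with one that consults a word list, which is one plausible way a dictionary application could prefer paste over past. The names `LEXICON`, `naive_strip_ed`, and `lexicon_aware_strip_ed` are hypothetical, and the word list is toy data.

```python
# Toy headword list standing in for the dictionary's entries (hypothetical data).
LEXICON = frozenset({"paste", "past", "walk", "hope"})


def naive_strip_ed(word):
    """Strip a trailing 'ed' with no further checks (illustrative rule only)."""
    if word.endswith("ed") and len(word) > 4:
        return word[:-2]
    return word


def lexicon_aware_strip_ed(word, lexicon=LEXICON):
    """Return the least aggressive candidate stem that is a known headword.

    Assumption: the application has a headword list to check against.
    Falls back to the naive rule when no candidate is found.
    """
    if not word.endswith("ed"):
        return word
    # Try dropping only 'd' first, then the full 'ed'.
    for candidate in (word[:-1], word[:-2]):
        if candidate in lexicon:
            return candidate
    return naive_strip_ed(word)


if __name__ == "__main__":
    for w in ("pasted", "walked", "hoped"):
        print(w, naive_strip_ed(w), lexicon_aware_strip_ed(w))
```

Trying the shorter deletion first is the design choice that makes the rule "weaker": it removes less of the word whenever the result is still a valid headword.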


Holdings (1)

Item type	Current location	Home library	Collection	Call number	Status	Date due	Barcode	Item holds
Thesis/Dissertation	PLM	PLM Archives	Filipiniana-Thesis	QA76.15 A85 2013	Available		FT6134


