MARC View

000		05941nam a2201225Ia 4500
000		03772ntm a2200205 i 4500
001		87877
003		0
005		20250920173719.0
008		230216n 000 0 eng d
100		_e _e _aJasmine P. Laurente, Carla Johnica D. Quilop. _d _b4 _u _c0 _q16
245	0	_a _aAn enhancement of the gibberish classification algorithm for detecting gibberish content in text document / _d _b _n _cJasmine P. Laurente, Carla Johnica D. Quilop. _h6 _p
264		_3 _3 _a _d _b _cMarch 2016.46
300		_e _e _c28 cm. _a92 pp. _b
336		_b _atext _2rdacontent
337		_3 _30 _b _aunmediated _2rdamedia
338		_3 _30 _b _avolume _2rdacarrier
500		_a _aThesis: (BSCS major in Computer Science) - Pamantasan ng Lungsod ng Maynila, 2016. _d _b _c56
520		_b _b _c _aABSTRACT: The Gibberish Classification Algorithm aims to detect whether a text is valid or was randomly typed on a keyboard. It returns a percentage, where a low value means valid text and a high value means gibberish text. If the result is lower than 50%, the text is likely valid. If the result is higher than 50%, the text is likely gibberish. The algorithm is optimized for the English language and for longer text. It still works for shorter text, for example a single sentence, but the result is then less accurate. The algorithm does not give a percentage lower than 1%, except when the input string is null or empty, in which case it returns 0%. The algorithm checks three things. First, it checks whether the number of unique characters in a chunk of 35 characters is in the usual range. Second, it checks whether the number of vowels among the letters is in the usual range. Third, it checks whether the word/char ratio is in the usual range. The final percentage is computed from these three checks. The researchers' purpose is to improve the Gibberish Classification Algorithm so that it gives a more accurate final percentage. It will now be based on the correct spelling or structure of words in the English language and not on the range of unique characters, vowels and word/char ratio. Because the Gibberish Classification Algorithm is still in its early stage, it still returns some incorrect values. There are cases where it produces a high percentage for a clearly valid sentence and, conversely, a low percentage for gibberish input. While studying the existing Gibberish Classification Algorithm, the researchers encountered the following problems. First, words with correct spelling are considered gibberish: 15 out of 25 of the sample valid inputs were evaluated as gibberish (60% incorrect results). Second, the algorithm returns a valid percentage for sentences that use numerous punctuation marks: 17 out of 25 of the sample invalid inputs were evaluated as valid (68% incorrect results). Third, words that use mixed uppercase and lowercase letters are considered valid. To improve the existing algorithm, the researchers propose the following solutions. First, reduce the 60% incorrect results regarding correctly spelled words to 10% by adding computation for words and for sentences. Second, reduce the 68% incorrect results regarding punctuation to 18% by adding computation for punctuation marks. Third, check the case of the letters in each word and consider a word gibberish if different cases are detected. Improving the Gibberish Classification Algorithm will be a great help to people, especially to English proficiency teachers and people who want to detect whether there is gibberish content in their documents. _u
942		_a _alcc _cBK
999		_c25386 _d25386
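
Note on field 520: the three checks that the abstract attributes to the existing algorithm (unique characters per 35-character chunk, vowel share, word/char ratio) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm evaluated in the thesis: the record does not give the "usual ranges" or the way the three checks are combined, so the range constants, the deviation() helper and the averaging step are all hypothetical. Only the 35-character chunk size, the 1% floor and the 0% result for null or empty input come from the abstract.

```python
CHUNK_SIZE = 35  # chunk length stated in the abstract

# Hypothetical "usual ranges" (low, high); the record does not give the real ones.
UNIQUE_RANGE = (0.45, 0.75)     # unique characters / chunk length
VOWEL_RANGE = (0.30, 0.50)      # vowels / letters
WORD_CHAR_RANGE = (0.12, 0.25)  # words / characters


def deviation(value, low, high):
    """Return 0.0 inside [low, high], else the relative distance outside it (max 1.0)."""
    if value < low:
        return min(1.0, (low - value) / low)
    if value > high:
        return min(1.0, (value - high) / (1.0 - high))
    return 0.0


def unique_char_ratio(text):
    """Average share of distinct characters per 35-character chunk."""
    chunks = [text[i:i + CHUNK_SIZE] for i in range(0, len(text), CHUNK_SIZE)]
    return sum(len(set(c)) / len(c) for c in chunks) / len(chunks)


def vowel_ratio(text):
    """Share of vowels among the alphabetic characters."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in "aeiou" for ch in letters) / len(letters)


def word_char_ratio(text):
    """Number of whitespace-separated words divided by number of characters."""
    return len(text.split()) / len(text)


def gibberish_score(text):
    """Return a percentage: low suggests valid text, high suggests gibberish."""
    if not text:  # null or empty input returns 0%, as the abstract states
        return 0.0
    deviations = [
        deviation(unique_char_ratio(text), *UNIQUE_RANGE),
        deviation(vowel_ratio(text), *VOWEL_RANGE),
        deviation(word_char_ratio(text), *WORD_CHAR_RANGE),
    ]
    # Assumed combination rule: plain average of the three deviations.
    score = 100.0 * sum(deviations) / len(deviations)
    return max(1.0, score)  # never below 1% for non-empty input
```

Under these assumed ranges, a vowel-free run with no spaces such as "zxcvbnm" repeated ten times scores above 50%, while an empty string returns 0%. Because every check looks only at character statistics, a sketch like this cannot tell a correctly spelled word from a plausible-looking non-word, which is the kind of weakness the abstract says the enhancement addresses.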