
A further enhancement of Paul Graham's Bayesian algorithm applied in spam filtering

By: Jerhica Kim T. Canaya and Jolina P. Escolano

Language: English
Publication date: c2017
Description: Undergraduate Thesis (BSCS major in Computer Science), Pamantasan ng Lungsod ng Maynila, 2017
Content type: text
Media type: unmediated
Carrier type: volume
Genre/Form: academic writing
LOC classification: QA76.9 C36 2017

Contents:

ABSTRACT: Spammers send random and irrelevant messages to our e-mail accounts, and we still have to check our received mail every day. Spam may contain malicious words or attachments and links that redirect to unwanted websites, and some links carry viruses that can harm a computer without the user knowing it. These are threats to users: spammers can obtain information simply because a user opened the mail they sent. This research paper presents a variation in the tokens considered for filtering and in the number of tokens tested. This will benefit everyone who uses e-mail to send messages, as it may keep users from receiving unnecessary files and viruses attached to e-mail. We performed manual and computerized simulations to determine the possible results of stricter mail filtering. Paul Graham's Bayesian algorithm is a machine learning algorithm that trains on data and classifies each token with a different score. We conclude that treating HTML tags and multi-word sequences as tokens is more effective for determining whether a message is spam or non-spam. Applying this algorithm to stricter spam filtering lets us distinguish whether a received mail is spam or not.
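As an illustration of the kind of token scoring the abstract refers to, the following is a minimal sketch based on the formulas in Paul Graham's published description of his filter (ham counts doubled, per-token probabilities clamped to 0.01 to 0.99, unseen tokens scored 0.4, the 15 most extreme tokens combined). It is not the thesis authors' implementation. The tokenizer, which emits HTML tags and adjacent word pairs as extra tokens, is an assumption made only to mirror the abstract's conclusion, and all function names (`tokenize`, `train`, `token_prob`, `spam_score`) are hypothetical.

```python
import re
from collections import Counter

# Assumed tokenizer pattern: a whole HTML tag, or a run of word characters.
TOKEN_RE = re.compile(r"</?[a-zA-Z][^>]*>|[A-Za-z0-9$'-]+")

def tokenize(text):
    """Return HTML tags, single words, and adjacent word pairs as tokens."""
    raw = [t.lower() for t in TOKEN_RE.findall(text)]
    words = [t for t in raw if not t.startswith("<")]
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return raw + pairs

def train(spam_mails, ham_mails):
    """Count token occurrences in each corpus."""
    bad = Counter(t for m in spam_mails for t in tokenize(m))
    good = Counter(t for m in ham_mails for t in tokenize(m))
    return bad, good, len(spam_mails), len(ham_mails)

def token_prob(token, bad, good, nbad, ngood):
    """Per-token spam probability, following Graham's formula."""
    g = 2 * good[token]          # ham counts are doubled to bias against false positives
    b = bad[token]
    if g + b < 5:                # too rare to trust: treat as unseen
        return 0.4
    pb = min(1.0, b / max(1, nbad))
    pg = min(1.0, g / max(1, ngood))
    return max(0.01, min(0.99, pb / (pg + pb)))

def spam_score(text, model, n=15):
    """Combine the n tokens whose probabilities are farthest from 0.5."""
    bad, good, nbad, ngood = model
    probs = [token_prob(t, bad, good, nbad, ngood) for t in set(tokenize(text))]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    prod_p = prod_q = 1.0
    for p in probs[:n]:
        prod_p *= p
        prod_q *= 1.0 - p
    return prod_p / (prod_p + prod_q)

# Toy corpora, invented for illustration only.
model = train(
    ['<a href="http://x">click</a> free money now'] * 5,
    ["meeting agenda for monday"] * 5,
)
print(spam_score("free money now", model))
print(spam_score("meeting agenda for monday", model))
```

Graham's description treats a combined score above 0.9 as spam. Whether the thesis uses the same threshold, token set, or number of tokens is not stated in this record.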



Holdings:

Item type	Current location	Home library	Collection	Call number	Status	Date due	Barcode	Item holds
Archival materials	PLM	PLM Archives	Filipiniana-Thesis	QA76.9 C36 2017 (Browse shelf)	Available		FT6057

