Applying artificial immune system for spam filtering - Luong Van Lam - 1


ACKNOWLEDGEMENTS


To complete this graduation thesis, I would like to express my deep gratitude to my supervisor, MSc. Nguyen Van Truong, Lecturer in Informatics, Faculty of Mathematics, University of Education, Thai Nguyen University, who guided my ideas and wholeheartedly helped and instructed me throughout the thesis process.

I would like to sincerely thank the school's Board of Directors, the Head of the Mathematics Department and all the teachers in the department for their dedicated guidance and help in completing my thesis.


Besides, I would like to thank my family, friends and relatives who have encouraged and helped me throughout the thesis writing process.

In the process of writing the thesis, due to lack of experience, it is inevitable that there will be shortcomings and limitations. Therefore, I really hope to receive comments from teachers and students to make the thesis more complete.


Thank you very much!

Thai Nguyen, April 2015

Student


Luong Van Lam



LIST OF ABBREVIATIONS AND SYMBOLS

Abbreviation or symbol, followed by its full form and meaning:

HMD Immune system (Vietnamese: hệ miễn dịch).

NSA Negative Selection Algorithm.

SMTP Simple Mail Transfer Protocol.

WEKA Waikato Environment for Knowledge Analysis.

HTML HyperText Markup Language.

IBM International Business Machines.

TP Number of spam emails correctly classified as spam.

TN Number of normal emails correctly classified as normal.

FP Number of normal emails incorrectly classified as spam.

FN Number of spam emails incorrectly classified as normal.

Acc Overall Accuracy.

DR Detection Rate.

FPR False Positive Rate.

LIST OF DRAWINGS


LIST OF TABLES



INTRODUCTION

Electronic mail (email) has been and remains one of the most widely used means of sending and receiving information in the world. The development of email is closely linked to the development of information technology.

Spam is email sent automatically to a user's account (mailbox) with unwanted, inappropriate or irrelevant content. The appearance of spam causes inconvenience and wastes users' time. In addition, it slows down Internet connections because of the very large number of spam messages sent at a time. Spam is also one of the tools for spreading computer viruses, with many unpredictable consequences in many respects.

To prevent and stop spam, many methods have been used to build spam filtering software. One of the new methods that has been and is being researched and developed is the application of artificial immune systems (AIS), a method based on the principles, functions, and operating models of the biological immune system (HMD) in humans, combined with "machine learning" techniques that bring relatively high efficiency.

With this technique, regular or spam emails will be “learned” or “trained” to form a database to detect spam. The problem is to improve the efficiency of the machine learning process, as well as the process of identifying and eliminating spam.

Therefore, I decided to choose the research content in my thesis as: "Application of artificial immune system for spam filtering".

I. Research objectives

An initial study of artificial immune systems and their application to the spam filtering problem.

II. Research tasks

Research the history of the development of email, the benefits and limitations that email brings.

Research on spam: its development process, structure, harms... Learn about spam prevention methods, advantages and disadvantages of the methods.

Learn about artificial immune systems and some of the algorithms used in them.

Build a program that applies an artificial immune system algorithm to spam filtering.

III. Research methods

Research documents: books, theses, some research topics in the same field, articles, forums specializing in email and artificial immune systems.

Consult the supervisor and fellow students.

Implement and test the program, and compare its performance with some other methods (on WEKA) in terms of correct detection rate and error rate.
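The comparison by correct detection and error rate can be made concrete with the measures named in the list of abbreviations (Acc, DR, FPR). The Python sketch below is illustrative only: it assumes the standard confusion-matrix definitions Acc = (TP + TN) / (TP + TN + FP + FN), DR = TP / (TP + FN) and FPR = FP / (FP + TN), and the function name `evaluate` and the sample counts are hypothetical, not taken from the thesis.

```python
def evaluate(tp, tn, fp, fn):
    """Compute Acc, DR and FPR from confusion-matrix counts.

    Assumed (standard) definitions, with spam as the positive class:
      Acc = (TP + TN) / (TP + TN + FP + FN)
      DR  = TP / (TP + FN)
      FPR = FP / (FP + TN)
    A ratio with a zero denominator is reported as 0.0.
    """
    total = tp + tn + fp + fn
    spam_total = tp + fn      # all emails that really are spam
    normal_total = fp + tn    # all emails that really are normal
    return {
        "Acc": (tp + tn) / total if total else 0.0,
        "DR": tp / spam_total if spam_total else 0.0,
        "FPR": fp / normal_total if normal_total else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical counts for a test set of 100 emails.
    for name, value in evaluate(tp=40, tn=50, fp=5, fn=5).items():
        print(f"{name}: {value:.4f}")
```

A filter is preferable when DR is high and FPR is low at the same time, since a normal email wrongly moved to the spam folder is usually more costly to the user than a spam email that slips through.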

IV. Structure of the topic

In addition to the introduction and conclusion, the thesis has three chapters:

Chapter 1. Overview of email and spam.

Chapter 2. Overview of biological immune systems and artificial immune systems.

Chapter 3. Building a spam filtering program using an artificial immune system.


CHAPTER 1

OVERVIEW OF EMAIL AND SPAM


This chapter presents an overview of the history, concepts, benefits of email, general structure and protocols for sending and receiving email.


1.1. Overview of email


1.1.1. Development history

Nowadays, email is a familiar concept and almost indispensable for most Internet users. Billions of email accounts are being used, showing that email is the leading tool for sending, receiving and exchanging information in the world today.

The history of the development of email is associated with the following milestones:


• Pre-email era

1961: Tom Van Vleck (American computer software engineer) developed a multi-user message exchange system on a single computer.

1965: The first email was launched at the Massachusetts Institute of Technology, USA.

1971: Ray Tomlinson (American programmer) developed a system for sending messages to multiple people on multiple computers and sent the first email on ARPANET (Advanced Research Projects Agency Network); that email was a test message.

1977: A standard format (RFC 733) was proposed by Dave Crocker to transform electronic mail communication over the Internet.

• The birth and popularization of email

1978: V. A. Shiva Ayyadurai created an electronic system for sending mail between departments within the University of Medicine and Dentistry of New Jersey.

1979: The components To, From, Cc, Bcc, Subject, Inbox, Outbox and others were incorporated into an email system.

1980: The above email system was put into practical use at the University of Medicine and Dentistry of New Jersey.

August 30, 1982: The term “email” and the electronic mail system were officially granted copyright.

1982: SMTP (Simple Mail Transfer Protocol) was introduced. SMTP is a protocol for transferring electronic mail over a network; it allows mail messages to be transferred from the sender's mail server to the recipient's mail server.

1985: An offline email format was developed that allows recipients to store messages on their own computers.

1988: Microsoft Mail, the first commercial email client, was released for the Mac (Apple Macintosh).

1989: IBM launches Lotus 1.0 – the first email server model.


• The 1990s

In the early 1990s, the spam problem began to spread widely.

1992: A version of Microsoft Outlook for the MS-DOS operating system was released.

1993: America Online and Delphi connect their proprietary email systems to the Internet. Meanwhile, IBM, in partnership with BellSouth, produces the first smartphone, the Simon Personal Communicator, which includes email functionality.

1996: Sabeer Bhatia and Jack Smith launched “HotMail”, the world's first free web-based email service, which quickly became the world's most used email service.

1997: Yahoo! launched Yahoo Mail to compete with Hotmail.

1999: BlackBerry allowed access to email from a mobile phone. The ability to send email by phone made using email more convenient and faster than ever.

In the late 1990s, HTML-based email emerged, allowing for richer text formatting than plain text.

• The early years of the 21st century

2000: Microsoft releases the Microsoft Entourage email client for the Mac OS.

2003: Microsoft Outlook 2003 develops spam and phishing filters.

2004: The US Federal Trade Commission enacted the anti-spam law.

2006: Microsoft Outlook 2007 was released, supporting RSS feeds and receiving messages. At the same time, Facebook launched globally, linking Facebook accounts to email accounts.

April 2007: Gmail goes live after four years in beta.

2010:

+ Microsoft Outlook 2010 integrates Outlook Social Connector (which supports sending and receiving emails with social networks) and adds features for ignoring and cleaning up conversations.

+ Outlook Mobile for Windows Phone 7 and Outlook for Mac 2011 are released.

+ Social network Facebook publicly announced plans to integrate Microsoft web applications into its new messaging system.

2011: The US AP Stylebook officially adopted the spelling “email” in place of “e-mail”.

Through the stages of development, email is now being improved to be more convenient and user-friendly, demonstrated through improvements in the user interface along with increasingly effective email protection functions.
