Vietnamese base noun phrase analysis using CRFs model - 1


SCHOOL …………………. FACULTY……………………….





Graduation report



Topic:


ANALYZING BASIC NOUN PHRASES IN VIETNAMESE USING CRFs MODEL


COMMITMENT

I hereby declare that the results presented in this thesis are entirely the product of my own research and study. All references are fully cited and annotated.


Student


Nguyen Thanh Huyen


ACKNOWLEDGEMENTS

While studying and completing this thesis, I was taught valuable knowledge and scientific research methods by my teachers, and I received a great deal of care and encouragement from my family, workplace, colleagues and friends.

First of all, I would like to thank the teachers of the Faculty of Information Technology - University of Technology - Vietnam National University, Hanoi, for imparting valuable knowledge to me during my studies. In particular, I would like to express my deep gratitude to my supervisor, Associate Professor Dr. Doan Van Ban, who wholeheartedly instructed and guided me throughout the writing of this thesis.

I would also like to thank the Board of Directors of Hanoi College of Economics, where I work, for creating favorable conditions for me during my studies and while I was writing my graduation thesis.

Finally, I would like to thank my parents, brothers, sisters, husband, children, friends and colleagues, whose constant support and encouragement gave me the confidence to research and complete this thesis. While writing it, I did my best to study, research and consult many related documents. However, because of limited time and my limited experience in scientific research, the thesis certainly still has shortcomings. I look forward to receiving guidance from my teachers and comments from friends and colleagues so that I can improve it.

Hanoi, June 12, 2011

Nguyen Thanh Huyen


TABLE OF CONTENTS

COMMITMENT

ACKNOWLEDGEMENTS

TABLE OF CONTENTS

LIST OF SYMBOLS AND ABBREVIATIONS

LIST OF TABLES

LIST OF FIGURES

INTRODUCTION

Chapter 1 - OVERVIEW OF DATA MINING AND ROUGH SET THEORY

1.1. Introduction to data mining

1.1.1. Knowledge Discovery

1.1.2. Data Mining

1.2. Applications of data mining

1.3. Some popular data mining methods

1.3.1. Classification

1.3.2. Clustering

1.3.3. Association Rules

1.4. Rough set theory

1.4.1. Information system

1.4.2. Decision table

1.4.3. Indiscernibility relation

1.4.4. Set approximation

1.5. Conclusion of chapter 1

Chapter 2 - DECISION TREES AND DECISION TREE BUILDING ALGORITHMS

2.1. Overview of decision trees

2.1.1. Definition

2.1.2. Decision tree design

2.1.3. General method of building decision trees

2.1.4. Application of decision trees in data mining

2.2. Algorithm for building a decision tree based on Entropy

2.2.1. Criteria for selecting classification attributes

2.2.2. ID3 Algorithm

2.2.3. Example of ID3 algorithm

2.3. Algorithm for building a decision tree based on attribute dependency

2.3.1. Attribute dependency according to rough set theory

2.3.2. Exact dependency according to rough set theory

2.3.3. Criteria for selecting attributes for classification

2.3.4. ADTDA decision tree construction algorithm

2.3.5. Example

2.4. Algorithm for building a decision tree based on Entropy and attribute dependency

2.4.1. Criteria for selecting attributes for classification

2.4.2. FID3 algorithm (Fixed Iterative Dichotomiser 3 [5])

2.4.3. Example

2.5. Conclusion of chapter 2

Chapter 3 - APPLICATION OF VERIFICATION AND EVALUATION

3.1. Introduction to the problem

3.2. Introduction to the database

3.3. Application installation

3.4. Results and evaluation of the algorithms

3.4.1. Decision tree model corresponding to the Bank_data dataset

3.4.2. Decision rules corresponding to the Bank_data dataset

3.4.3. Algorithm evaluation

3.4.4. Application of decision trees in data mining

3.5. Conclusion of chapter 3

CONCLUSION

REFERENCES


LIST OF SYMBOLS AND ABBREVIATIONS

SYMBOLS:

$S = (U, A)$: Information system

$V_a$: The set of values of attribute $a$

$IND(B)$: Equivalence relation of attribute set $B$

$[u_i]_p$: Equivalence class containing object $u_i$

$U/B$: Partition of $U$ generated by the relation $IND(B)$

$DT = (U, C \cup D)$: Decision table

$\underline{B}(X)$: $B$-lower approximation of $X$

$\overline{B}(X)$: $B$-upper approximation of $X$

$POS_C(d)$: The $C$-positive region of $d$

$|DT|$: Total number of objects in $DT$

$|U|$: Cardinality of set $U$

$[U]_d$: Partition of $U$ generated by the relation $IND(d)$


ABBREVIATIONS:

ADTDA Algorithm for Building Decision Tree Based on Dependency of Attributes

FID3 Fixed Iterative Dichotomiser 3

ID3 Iterative Dichotomiser 3

IG Information Gain


LIST OF TABLES

Table 1. Simple information system

Table 2. A decision table with C={Age, LEMS} and D={Walk}

Table 3. Training data

Table 4. Table of attributes of the Bank_data dataset

Table 5. Accuracy of algorithms

LIST OF FIGURES


Figure 1. Data classification process – Model building step

Figure 2. Data classification process – Estimating model accuracy

Figure 3. Data classification process – New data classification

Figure 4. Approximation of the object set in Table 2 by the conditional attributes Age and LEMS

Figure 5. General description of decision tree

Figure 6. Example of decision tree

Figure 7. Classification model of new samples

Figure 8. Tree after selecting the Humidity attribute (ID3)

Figure 9. Tree after selecting the Outlook attribute (ID3)

Figure 10. Result tree (ID3)

Figure 11. Tree after selecting the Humidity attribute (ADTDA)

Figure 12. Tree after selecting the Outlook attribute (ADTDA)

Figure 13. Result tree (ADTDA)

Figure 14. Decision tree after selecting the Humidity attribute (FID3)

Figure 15. Decision tree after selecting the Windy attribute (FID3)

Figure 16. Result tree (FID3)

Figure 17. ID3 decision tree form

Figure 18. ADTDA decision tree form

Figure 19. FID3 decision tree form

Figure 20. Some rules of the ID3 decision tree

Figure 21. Some rules of the ADTDA decision tree

Figure 22. Some rules of the FID3 decision tree

Figure 23. Application interface
