KARPAGAM Journal of Computer Science (ISSN : 0973-2926)

Authors: P. Sumathi, M.Sc., M.Phil., (Ph.D.), & Dr. R. Manicka Chezian

Text mining is a new and exciting research area that attempts to solve the information overload problem. This paper describes a technique for automatically extracting association rules from collections of textual documents. The technique relies on keyword features to discover association rules among the keywords labeling the documents. Its main contributions are that it integrates XML technology with an Information Retrieval scheme for keyword/feature selection, which automatically selects the most discriminative keywords for use in association rules, and that it uses a Data Mining technique for association rule discovery. The technique consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were conducted on web page news documents related to the outbreak of the swine flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the system is compared with that of another system that uses the Apriori algorithm throughout the execution.
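
To make the first two phases concrete, the following is a minimal Python sketch and not the GARW algorithm itself, whose details are not given in this abstract. It assumes TF-IDF (with a smoothed IDF variant) as the weighting scheme, uses a small hypothetical stopword list for filtration, omits XML transformation and stemming, and restricts rules to single-keyword pairs. The function names `preprocess`, `tfidf_keywords` and `pair_rules` are illustrative only.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "of", "in", "and", "is", "to", "for", "on", "are"}


def preprocess(text):
    """Filtration step: lowercase, tokenize on letters, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf_keywords(docs_tokens, top_k):
    """Keep the top_k highest-weighted terms of each document as its keywords.

    Assumed weighting: tf * log(1 + N / df), one common TF-IDF variant.
    """
    n_docs = len(docs_tokens)
    doc_freq = Counter(t for tokens in docs_tokens for t in set(tokens))
    keyword_sets = []
    for tokens in docs_tokens:
        term_freq = Counter(tokens)
        weights = {t: term_freq[t] * math.log(1 + n_docs / doc_freq[t])
                   for t in term_freq}
        # Sort by descending weight, breaking ties alphabetically.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        keyword_sets.append(set(ranked[:top_k]))
    return keyword_sets


def pair_rules(keyword_sets, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) between keyword pairs."""
    n_docs = len(keyword_sets)
    single = Counter(k for s in keyword_sets for k in s)
    pairs = Counter(p for s in keyword_sets
                    for p in combinations(sorted(s), 2))
    rules = []
    for (a, b), count in pairs.items():
        support = count / n_docs
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

A typical use would chain the steps: apply `preprocess` to each news document, pass the token lists to `tfidf_keywords`, and feed the resulting keyword sets to `pair_rules` with chosen support and confidence thresholds. Mining rules over the selected keywords only, rather than over every term, is what keeps the candidate space small in this sketch.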