Abstract

A GUIDED CLUSTERING TECHNIQUE FOR KNOWLEDGE DISCOVERY

Diksha Sharma, Dr. Kalyankar N.V

Pages: 095-100

Vol: 2, Issue: 4, 2012

Scalability: As visualization techniques are used more and more in data mining tools, it becomes important to address the issue of scalability. Fundamentally, the reason for seeking scalable solutions is to be able to build a good data mining model as quickly as possible. This offers two benefits: (a) there is value in being able to deploy and use the model sooner rather than later, and (b) faster turnaround times yield better models, because time previously spent waiting for results can instead be devoted to finding the model that results in the best and most reliable solution. Making data mining tools scalable requires both hardware scalability and parallel algorithms. The goal of hardware scalability is to provide high performance by adding modestly priced processor building blocks (or nodes) in such a way that performance scales linearly with the number of nodes.
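As a rough illustration of what "performance scales linearly" means, the minimal Python sketch below computes speedup and parallel efficiency from run times. The function names and the timing figures are hypothetical and chosen only for illustration; they are not measurements or code from the paper. Under linear scaling, the speedup on n nodes is close to n and the efficiency stays close to 1.

```python
def speedup(t_one_node: float, t_n_nodes: float) -> float:
    """Ratio of the single-node run time to the n-node run time."""
    if t_one_node <= 0 or t_n_nodes <= 0:
        raise ValueError("run times must be positive")
    return t_one_node / t_n_nodes


def efficiency(t_one_node: float, t_n_nodes: float, n_nodes: int) -> float:
    """Speedup divided by node count; 1.0 corresponds to linear scaling."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return speedup(t_one_node, t_n_nodes) / n_nodes


# Hypothetical model-building times in seconds, keyed by node count.
# These numbers are assumptions for the example, not results from the paper.
timings = {1: 120.0, 2: 60.0, 4: 30.0, 8: 20.0}

for n, t in timings.items():
    s = speedup(timings[1], t)
    e = efficiency(timings[1], t, n)
    print(f"{n} node(s): speedup {s:.2f}, efficiency {e:.2f}")
```

In this made-up series, scaling is linear up to 4 nodes and falls off at 8 nodes, which is the kind of departure from linear scaling that a scalable design tries to avoid.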



Disclaimer: Indexing of published papers is subject to the evaluation and acceptance criteria of the respective indexing agencies. While we strive to maintain high academic and editorial standards, International Journal of Research in Science and Technology does not guarantee the indexing of any published paper. Acceptance and inclusion in indexing databases are determined by the quality, originality, and relevance of the paper, and are at the sole discretion of the indexing bodies.