EE 380L : Data Mining

Fall 2001
Prof. Joydeep Ghosh


Be aware of the copyright notice when you use these materials.

Check the presentation schedule to see which papers have been taken. Please send your selections to the TA.

Notice: Use Netscape if you have problems downloading .ps.gz files with Internet Explorer (IE sometimes uncompresses .ps.gz files automatically).


Reading List - II

    Exploratory Data Analysis

  1. "Robust Space Transformations for Distance-based Operations"
    Edwin M. Knorr, Raymond T. Ng and Ruben H. Zamar
    KDD-2001, pp. 126-135
  2. "Mining Frequent Patterns by Pattern-Growth: Methodology and Implications"
    J. Han and J. Pei
    SIGKDD Explorations, vol. 2(2), Dec. 2000
  3. "Automating Exploratory Data Analysis for Efficient Data Mining"
    J.D. Becher, P. Berkhin and E. Freeman
    KDD-2000, pp. 424-429

    Clustering/Segmentation

  4. "The Online Median Problem"
    Ramgopal R. Mettu and C. Greg Plaxton
    Proc. 41st Annual Symposium on Foundations of Computer Science, 2000, pp. 339-348
  5. "A general probabilistic framework for clustering individuals"
    I. Cadez, S. Gaffney and P. Smyth
    Technical Report UCI-ICS 00-09, March 2000
    Revised version in ACM SIGKDD 2000 Proceedings. Outlines a general EM-based framework for clustering sets of sequences, curves, and other non-vector objects, with applications to gene expression data, Web page requests, and red blood cell histograms.
  6. "CACTUS-Clustering Categorical Data Using Summaries"
    Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke
    DEMON project
    Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 73-83

    Association Rules and Market Basket Analysis

  7. "Mining changes for real-life applications"
    Bing Liu, Wynne Hsu, Heng-Siaw Han and Yiyuan Xia
    2nd International Conference on Data Warehousing and Knowledge Discovery (DaWaK-2000)
  8. "Empirical Bayes Screening for Multi-item Associations"
    William DuMouchel and Daryl Pregibon
    KDD-2001, pp. 67-76

    Classification and Prediction/Regression

  9. "Learning and Making Decisions When Costs and Probabilities are Both Unknown"
    Bianca Zadrozny and Charles Elkan
    KDD-2001, pp. 204-213
  10. "Mining Time-Changing Data Streams"
    Geoff Hulten, Laurie Spencer and Pedro Domingos
    KDD-2001, pp. 97-106
  11. "Probabilistic Classification and Clustering in Relational Data"
    B. Taskar, E. Segal and D. Koller
    IJCAI-01
  12. "Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach"
    A. Doan, P. Domingos and Alon Halevy
    SIGMOD-01

    Miscellaneous

  13. "A Streaming Ensemble Algorithm (SEA) for Large-Scale Classification"
    W. Nick Street and Yongseog Kim
    KDD-2001, pp. 377-382
  14. "Detecting Graph-Based Spatial Outliers: Algorithms and Applications (A summary of results)"
    Shashi Shekhar, Chang-Tien Lu and Pusheng Zhang
    KDD-2001, pp. 371-376
  15. "Ensemble-Index: A New Approach to Indexing Large Databases"
    Eamonn Keogh, Selina Chu and Michael Pazzani
    KDD-2001, pp. 117-125
  16. "Scalable Data Mining with Model Constraints"
    M. Garofalakis and R. Rastogi
    SIGKDD Explorations, vol. 2(2), Dec. 2000
  17. "Signature-Based Methods for Data Streams"
    Corinna Cortes and Daryl Pregibon
    Data Mining and Knowledge Discovery 5 (3):167-182, July 2001
  18. "Learning in the Presence of Concept Drift and Hidden Contexts"
    G. Widmer and M. Kubat
    Machine Learning, 23 (1996), pp. 69-101
  19. "The Impact of Changing Populations on Classifier Performance"
    Mark G. Kelly, David J. Hand and Niall M. Adams
    Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 367-371


Last updated 09/2001