Reading List
EE 380L - A Practicum in Data Mining
CS 395T/CAM 395T - Large-Scale Data Mining
Spring 2001

Notices

Paper Selection Policy


General Reading (but not for class presentations)

  1. Statistical Pattern Recognition: A Review
    Anil K. Jain, Robert P.W. Duin, and Jianchang Mao
    IEEE Trans PAMI, Vol. 22, No. 1, January 2000
  2. Web Mining Research:  A Survey
    R. Kosala and H. Blockeel
    SIGKDD Explorations, June 2000. Volume 2, Issue 1
  3. Data mining for hypertext: A tutorial survey
    S. Chakrabarti
    ACM SIGKDD Explorations, 1(2), pages 1--11, 2000
  4. Text-Learning and Related Intelligent Agents: A Survey
    Dunja Mladenic, IEEE Intelligent Systems, July/August 1999
  5. An Internet-enabled Knowledge Discovery Process
    Alex Buchner et al., MINEit Software Ltd., 1999
  6. Impact of Similarity Measures on Web-page Clustering
    A. Strehl, J. Ghosh and R. Mooney
    Proc. AAAI workshop on AI for Web Search, K. Bollacker (Ed)
    TR WS-00-01, AAAI Press, July 2000, pp. 58-64
  7. Data Preparation for Mining World Wide Web Browsing Patterns
    Robert Cooley, Bamshad Mobasher, and Jaideep Srivastava
    Knowledge and Information Systems V1(1), 1999

Hyperlinks

  1. Finding Related Pages in the World Wide Web (Selected)
    Jeffrey Dean and Monika R. Henzinger
  2. The Anatomy of a Large-Scale Hypertextual Web Search Engine (Selected)
    Sergey Brin and Lawrence Page

  3. The PageRank Citation Ranking: Bringing Order to the Web (Selected)
    Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd

  4. Improved Algorithms for Topic Distillation in a Hyperlinked Environment
    Krishna Bharat and Monika Henzinger

  5. Clustering Hypertext with Applications to Web Searching
    Dharmendra S. Modha and W. Scott Spangler
    IBM Almaden Research Center

  6. Graph Structure in the Web (Selected)
    Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener

  7. Mining the Web's Link Structure (Selected)
    Soumen Chakrabarti, Byron E. Dom, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins, David Gibson, and Jon Kleinberg
    IEEE Computer, 32(8), Aug, 1999, pp. 60-67


Information Retrieval

  1. Distributional Clustering of Words for Text Classification (Selected)
    L. Douglas Baker and Andrew Kachites McCallum
    SIGIR 1998, Melbourne, Australia

  2. Beyond Document Similarity: Understanding Value-Based Search and Browsing Technologies (Selected)
    A. Paepcke, H. Garcia-Molina, G. Rodriguez-Mula, J. Cho
    SIGMOD Record, 29(1), March 2000

  3. Improving Short-Text Classification using Unlabeled Background Knowledge to Assess Document Similarity
    Sarah Zelikovitz and Haym Hirsh
    Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000).

  4. ProbMap: A Probabilistic Approach for Mapping Large Document Collections
    Thomas Hofmann
    IDAJ 2000
  5. Learning to Construct Knowledge Bases from the World Wide Web (Selected)
    M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam & S. Slattery
    Artificial Intelligence, 118(1-2): 69-113

  6. Concept-Based Knowledge Discovery in Texts Extracted from the Web (Selected)
    S. Loh, L. K. Wives, and J. P. de Oliveira
    SIGKDD Explorations, June 2000. Volume 2, Issue 1

  7. Similarity Search in High Dimensions via Hashing (Selected)
    A. Gionis, P. Indyk and R. Motwani
    25th International Conference on Very Large Databases (VLDB), 1999

  8. Using Metadata to Enhance a Web Information Gathering System (Selected)
    Neel Sundaresan, Jeonghee Yi, and Anita Huang
    WebDB 2000

  9. Document Categorization and Query Generation on the World Wide Web Using WebACE (Selected)
    E. Han, D. Boley, M. Gini, R. Gross, K. Hastings, J. Moore, G. Karypis, B. Mobasher, and V. Kumar
    Journal of Artificial Intelligence Review, Vol. 13, No. 5-6, pp. 365-391, 1999


Click-Stream Analysis

  1. Analysing navigation behaviour in web sites integrating multiple information systems
    Bettina Berendt and Myra Spiliopoulou
    VLDB Journal, Special Issue on Databases and the Web, 2000

  2. Discovery of Interesting Usage Patterns from Web Data (Selected)
    Robert Cooley, Pang-Ning Tan, Jaideep Srivastava
    Springer-Verlag LNCS/LNAI series, 2000

  3. Recommending Web Documents Based on User Preferences (Selected)
    Eric Glover, Steve Lawrence, Michael Gordon, William Birmingham, C. Lee Giles (NEC Research)

  4. Adaptive Web Sites: Concept and Case Study (Selected)
    Mike Perkowitz, Oren Etzioni
    Artificial Intelligence, 118(1-2), 2000

  5. Integrating Web Usage and Content Mining for More Effective Personalization (Selected)
    Bamshad Mobasher et al.
    EC-Web 2000
  6. Web usage mining, site semantics, and the support of navigation
    Bettina Berendt
    Proc. Web Mining Workshop, KDD2000

  7. Navigation Analysis Tool based on the Correlation between Contents Distribution and Access Patterns
    Hiroki Kato, Takehiro Nakayama, Yohei Yamane
    Proc. Web Mining Workshop, KDD2000


Personalization

  1. An Algorithmic Framework for Performing Collaborative Filtering (Selected)
    Herlocker, J., Konstan, J., Borchers, A., Riedl, J.
    SIGIR '99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August 15-19, 1999, Berkeley, CA, USA. ACM 1999 230-237

  2. Combining content and collaboration in text filtering (Selected)
    Ian M. Soboroff and Charles K. Nicholas
    In Thorsten Joachims, editor, Proceedings of the IJCAI'99 Workshop on Machine Learning in Information Filtering, pages 86--91, Stockholm, Sweden, August 1999

  3. Latent Class Models for Collaborative Filtering
    Thomas Hofmann and Jan Puzicha
    IJCAI 1999

  4. Autonomous Interface Agents (Selected)
    H. Lieberman
    Proceedings of the ACM Conference on Computers and Human Interface, CHI-97, Atlanta, Georgia, March 1997
  5. Integrating E-Commerce and Data Mining: Architecture and Challenges (Selected)
    Suhail Ansari et al.
    Proc. Web Mining Workshop, KDD2000, pp. 49-60

Last updated 01/2001