Books
Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers
R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2001.
U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy, Advances in Knowledge Discovery and Data Mining, The MIT Press, 1996
U. Fayyad, G. Grinstein, and A. Wierse, Information Visualization in Data Mining and Knowledge Discovery, Morgan Kaufmann, 2001
D. J. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press, 2001.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001
T. M. Mitchell, Machine Learning, McGraw Hill, 1997.
S. M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998
I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2001
Papers
Introduction(INR)
V. Ganti, J. Gehrke, R. Ramakrishnan. Mining very large databases. COMPUTER, 32(8):38-45, 1999.
Michael Goebel and Le Gruenwald, “A Survey of Data Mining Software Tools”, ACM SIGKDD Explorations, June 1999. Volume 1, Issue 1
David Hand, “Statistics and Data Mining: Intersecting Disciplines”, ACM SIGKDD Explorations, June 1999. Volume 1, Issue 1
S. Chaudhuri, U. Dayal, and V. Ganti, Database Technology for Decision Support Systems. Computer, 34(12):48-55, Dec. 2001.
Data Preprocessing
D. Barbará et al. The New Jersey Data Reduction Report. Bulletin of the Technical Committee on Data Engineering, 20, Dec. 1997, pp. 3-45.
Liu H.; Hussain F.; Tan C.L.; Dash M. Discretization: An enabling technique. Data Mining and Knowledge Discovery, 6(4): 393-423, 2002.
V. Raman and J. M. Hellerstein. Potter's Wheel: An Interactive Data Cleaning System, Proc. 2001 Int. Conf. on Very Large Data Bases (VLDB'01), Rome, Italy, pp. 381-390, Sept. 2001.
H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C.-A. Saita. Declarative Data Cleaning: Language, Model, and Algorithms. Proc. 2001 Int. Conf. on Very Large Data Bases (VLDB'01), Rome, Italy, pp. 371-380, Sept. 2001.
D. Pyle. Data Preparation for Data Mining. Morgan Kaufmann, 1999.
T. Dasu, T. Johnson, S. Muthukrishnan, V. Shkapenyuk. Mining Database Structure; Or, How to Build a Data Quality Browser. Proc. 2002 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'02), Madison, WI, pp. 240-251, June 2002.
Data Warehouse, OLAP, and Data Generalization
S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26(1):65-74, 1997.
J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab and sub-totals. Data Mining and Knowledge Discovery, 1(1):29-54, 1997.
V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In SIGMOD'96, pp. 205-216, Montreal, Canada, June 1996.
S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. In Proc. 1996 Int. Conf. Very Large Data Bases (VLDB'96), pp. 506-521, Bombay, India, Sept. 1996.
Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In SIGMOD'97, pp. 159-170, Tucson, Arizona, May 1997.
R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. In Proc. 1997 Int. Conf. Data Engineering (ICDE'97), Birmingham, England, April 1997.
S. Sarawagi, R. Agrawal, and N. Megiddo. Discovery-driven exploration of OLAP data cubes. In Proc. Int. Conf. of Extending Database Technology (EDBT'98), Valencia, Spain, pp. 168-182, March 1998.
S. Sarawagi. Explaining Differences in Multidimensional Aggregates. In Proc. Int. Conf. of Very Large Data Bases (VLDB'99), pp. 42-53
K. A. Ross, D. Srivastava, and D. Chatziantoniou. Complex aggregation at multiple granularities. In EDBT'98, pp. 263-277, Valencia, Spain, March 1998.
K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. In SIGMOD'99, pp. 359-370, Philadelphia, PA, June 1999.
J. Han, J. Pei, G. Dong, and K. Wang. Efficient computation of iceberg cubes with complex measures. In SIGMOD'01, pp. 1-12, Santa Barbara, CA, May 2001.
G. Dong, J. Han, J. Lam, J. Pei, and K. Wang. Mining Multi-Dimensional Constrained Gradients in Data Cubes. In VLDB'01, Rome, Italy, Sept. 2001.
W. Wang, H. Lu, J. Feng, and J. X. Yu. Condensed Cube: An Effective Approach to Reducing Data Cube Size. In Proc. 2002 Int. Conf. Data Engineering (ICDE'02), San Francisco, CA, April 2002.
L. V. S. Lakshmanan, J. Pei, and J. Han, Quotient Cube: How to Summarize the Semantics of a Data Cube, Proc. 2002 Int. Conf. on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug. 2002.
D. Xin, J. Han, X. Li, B. W. Wah, “Star-Cubing: Computing Iceberg Cubes by Top-Down and Bottom-Up Integration”, Proc. 2003 Int. Conf. on Very Large Data Bases (VLDB'03), Berlin, Germany, Sept. 2003.
J. Han. Towards on-line analytical mining in large databases. ACM SIGMOD Record, 27:97-107, 1998.
J. Han, Y. Cai and N. Cercone, Knowledge Discovery in Databases: An Attribute-Oriented Approach, in VLDB'92, Vancouver, Canada, August 1992, pp. 547-559.
G. Sathe and S. Sarawagi. Intelligent Rollups in Multidimensional OLAP Data. In Proc. Int. Conf. of Very Large Data Bases (VLDB'01), Rome, Italy, pp. 531-540
Mining Frequent Patterns and Association Rules in Large Databases
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In VLDB'94, pp. 487-499, Santiago, Chile, Sept. 1994.
J. Han and Y. Fu. Discovery of multiple-level association rules from large databases. In VLDB'95, pp. 420-431, Zürich, Switzerland, Sept. 1995.
R. Srikant and R. Agrawal. Mining generalized association rules. In VLDB'95, pp. 407-419, Zürich, Switzerland, Sept. 1995.
R. Srikant and R. Agrawal. Mining quantitative association rules in large relational tables. In SIGMOD'96, pp. 1-12, Montreal, Canada, June 1996.
B. Lent, A. Swami, and J. Widom. Clustering association rules. In ICDE'97, pp. 220-231, Birmingham, England, April 1997.
S. Brin, R. Motwani, and C. Silverstein. Beyond market basket: Generalizing association rules to correlations. In SIGMOD'97, pp. 265-276, Tucson, Arizona, May 1997.
R. Ng, L. V. S. Lakshmanan, J. Han, and A. Pang. Exploratory mining and pruning optimizations of constrained associations rules. In SIGMOD'98, pp. 13-24, Seattle, Washington, June 1998.
Y. Aumann and Y. Lindell. A Statistical Theory for Quantitative Association Rules. Proc. 1999 Int. Conf. Knowledge Discovery and Data Mining (KDD'99), San Diego, CA, pp. 261-270, Aug. 1999.
J. Han, L. V. S. Lakshmanan, and R. T. Ng. Constraint-based, multidimensional data mining. COMPUTER, 32(8): 46-50, 1999.
J. Han, J. Pei, and Y. Yin. Mining Frequent Patterns without Candidate Generation, Proc. 2000 ACM-SIGMOD Int. Conf. on Management of Data (SIGMOD'00), Dallas, TX, May 2000.
J. Pei, J. Han, and L. V. S. Lakshmanan. Mining Frequent Itemsets with Convertible Constraints, Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), Heidelberg, Germany, April 2001.
J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang. H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases, Proc. 2001 Int. Conf. on Data Mining (ICDM'01), San Jose, CA, Nov. 2001.
M. J. Zaki and C.-J. Hsiao. CHARM: An Efficient Algorithm for Closed Itemset Mining, Proc. 2002 SIAM Int. Conf. Data Mining (SDM'02), Arlington, VA, pp. 457-473, April 2002.
J. Wang, J. Han, and J. Pei, “CLOSET+: Searching for the Best Strategies for Mining Frequent Closed Itemsets”, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
Y. Xu, J. X. Yu, G. Liu, H. Lu, From Path Tree To Frequent Patterns: A Framework for Mining Frequent Patterns, Proc. 2002 Int. Conf. on Data Mining (ICDM'02), Japan, Dec. 2002
F. Pan, G. Cong, A. K. H. Tung, J. Yang, and M. Zaki, CARPENTER: Finding Closed Patterns in Long Biological Datasets, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
G. Liu, H. Lu, Y. Xu, J. X. Yu, Ascending Frequency Ordered Prefix-tree: Efficient Mining of Frequent Patterns, Proc. 2003 Int. Conf. on Database Systems for Advanced Applications (DASFAA’03), Kyoto, Japan, March 2003.
G. Liu, H. Lu, W. Lou, J. X. Yu, On Computing, Storing and Querying Frequent Patterns, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
J. Han, J. Wang, Y. Lu, and P. Tzvetkov, “Mining Top-K Frequent Closed Patterns without Minimum Support”, Proc. 2002 Int. Conf. on Data Mining (ICDM'02), Maebashi, Japan, Dec. 2002.
Mohammad El-Hajj and Osmar R. Zaïane, Inverted Matrix: Efficient Discovery of Frequent Items in Large Datasets in the Context of Interactive Mining, in Proc. 2003 Int'l Conf. on Data Mining and Knowledge Discovery (ACM SIGKDD), Washington, DC, USA, August 24-27, 2003
Mohammad El-Hajj and Osmar R. Zaïane, Non Recursive Generation of Frequent K-itemsets from Frequent Pattern Tree Representations, in Proc. of 5th International Conference on Data Warehousing and Knowledge Discovery (DaWak'2003), Prague, Czech Republic, September 3-5, 2003
R. Meo, G. Psaila, and S. Ceri. A new SQL-like operator for mining association rules. In VLDB'96, pp. 122-133, Bombay, India, Sept. 1996.
T. Imielinski and A. Virmani. MSQL: a query language for database mining. Data Mining and Knowledge Discovery, 3(4): 373-408, 1999.
A. Savasere, E. Omiecinski, S. B. Navathe, Mining for Strong Negative Associations in a Large Database of Customer Transactions, In ICDE'98, Orlando, Florida, Feb. 1998.
E. Omiecinski. Alternative Interest Measures for Mining Associations, IEEE Trans. Knowledge and Data Engineering, 15(1):57-69, 2003.
Cristian Bucila, Johannes Gehrke, Daniel Kifer, Walker White: DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints. Data Mining and Knowledge Discovery, Vol. 7, Issue 4, July 2003, pages 241-272.
B. Goethals, M. Zaki: FIMI: Workshop on Frequent Itemset Mining Implementations (An Introduction). ICDM-FIMI Workshop, Melbourne, Florida, Nov. 2003.
Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava, Selecting the Right Interestingness Measure for Association Patterns. In Proc. of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2002), Edmonton, Alberta, pp. 32-41, 2002.
Classification and Prediction
J. Shafer, R. Agrawal, and M. Mehta. SPRINT: A scalable parallel classifier for data mining. In VLDB'96, pp. 544-555, Bombay, India, Sept. 1996.
J. Gehrke, R. Ramakrishnan, V. Ganti. RainForest: A framework for fast decision tree construction of large datasets. In VLDB'98, pp. 416-427, New York, NY, August 1998.
J. Gehrke, V. Ganti, R. Ramakrishnan, and W.-Y. Loh, BOAT -- Optimistic Decision Tree Construction. In SIGMOD'99, Philadelphia, Pennsylvania, 1999
S. K. Murthy. Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery, 2(4): 345-389, 1998.
C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2): 121-168, 1998.
B. Liu, W. Hsu, and Y. Ma. Integrating Classification and Association Rule Mining. Proc. 1998 Int. Conf. Knowledge Discovery and Data Mining (KDD'98) New York, NY, Aug. 1998.
W. Li, J. Han, and J. Pei, CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules, Proc. 2001 Int. Conf. on Data Mining (ICDM'01), San Jose, CA, Nov. 2001.
X. Yin and J. Han, “CPAR: Classification based on Predictive Association Rules”, Proc. 2003 SIAM Int. Conf. on Data Mining (SDM'03), San Francisco, CA, May 2003.
H. Yu, J. Yang, and J. Han, “Classifying Large Data Sets Using SVM with Hierarchical Clusters”, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
Cluster Analysis
R. Ng and J. Han. Efficient and effective clustering method for spatial data mining. In VLDB'94, pp. 144-155, Santiago, Chile, Sept. 1994.
T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: An efficient data clustering method for very large databases. In SIGMOD'96, pp. 103-114, Montreal, Canada, June 1996.
M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases. In KDD'96, pp. 226-231, Portland, Oregon, August 1996.
S. Guha, R. Rastogi, and K. Shim. CURE: An efficient clustering algorithm for large databases. In SIGMOD'98, pp. 73-84, Seattle, Washington, June 1998.
S. Guha, R. Rastogi, and K. Shim. ROCK: A robust clustering algorithm for categorical attributes. In ICDE'99, pp. 512-521, Sydney, Australia, March 1999.
R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. In SIGMOD'98, pp. 94-105, Seattle, Washington, June 1998.
M. Ankerst, M. Breunig, H.-P. Kriegel, and J. Sander. OPTICS: Ordering points to identify the clustering structure. In SIGMOD'99, pp. 49-60, Philadelphia, PA, June 1999.
G. Sheikholeslami, S. Chatterjee, and A. Zhang. WaveCluster: A multi-resolution clustering approach for very large spatial databases. In VLDB'98, pp. 428-439, New York, NY, August 1998.
G. Karypis, E.-H. Han, and V. Kumar. CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling. COMPUTER, 32(8): 68-75, 1999.
A. K. H. Tung, J. Han, L. V. S. Lakshmanan, and R. T. Ng. Constraint-Based Clustering in Large Databases, Proc. 2001 Int. Conf. on Database Theory (ICDT'01), London, U.K., Jan. 2001.
A. K. H. Tung, J. Hou, and J. Han. Spatial Clustering in the Presence of Obstacles, Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), Heidelberg, Germany, April 2001.
H. Wang, W. Wang, J. Yang, and P.S. Yu. Clustering by pattern similarity in large data sets, Proc. the ACM SIGMOD International Conference on Management of Data (SIGMOD), Madison, Wisconsin, 2002.
Beil F., Ester M., Xu X.: "Frequent Term-Based Text Clustering", Proc. 8th Int. Conf. on Knowledge Discovery and Data Mining (KDD'02), Edmonton, Alberta, Canada, 2002.
Stream Data Mining(STR)
S. Guha, N. Mishra, R. Motwani, and L. O'Callaghan. Clustering Data Streams, Proc. IEEE Symposium on Foundations of Computer Science (FOCS'00), Redondo Beach, CA, pp. 359-366, 2000
S. Babu and J. Widom. Continuous Queries over Data Streams. SIGMOD Record, pp. 109-120, Sept. 2001.
B. Babcock, S. Babu, M. Datar, R. Motwani and J. Widom, “Models and Issues in Data Stream Systems”, Proc. 2002 ACM-SIGACT/SIGART/SIGMOD Int. Conf. on Principles of Database Systems (PODS'02), Madison, WI, June 2002. (Conference tutorial)
M. Garofalakis, J. Gehrke, R. Rastogi, “Querying and Mining Data Streams: You Only Get One Look”, Tutorial at 2002 ACM-SIGMOD Int. Conf. on Management of Data (SIGMOD'02), Madison, WI, June 2002.
Y. Chen, G. Dong, J. Han, B. W. Wah, and J. Wang, “Multi-Dimensional Regression Analysis of Time-Series Data Streams”, Proc. 2002 Int. Conf. on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug. 2002.
Stratis Viglas, Jeffrey Naughton, Rate-Based Query Optimization for Streaming Information Sources, SIGMOD’02
Samuel Madden, Mehul Shah, Joseph Hellerstein, Vijayshankar Raman, Continuously Adaptive Continuous Queries over Streams, SIGMOD’02.
Alin Dobra, Minos N. Garofalakis, Johannes Gehrke, Rajeev Rastogi, Processing Complex Aggregate Queries over Data Streams, SIGMOD’02
Gurmeet Singh Manku, Rajeev Motwani. Approximate Frequency Counts over Data Streams, VLDB’02
Yunyue Zhu, Dennis Shasha. StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time, VLDB’02
J. Gehrke, F. Korn, D. Srivastava. On computing correlated aggregates over continuous data streams. Proc. 2001 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'01), Santa Barbara, CA, pp. 13-24, May 2001.
Geoff Hulten, Laurie Spencer, Pedro Domingos: Mining time-changing data streams. KDD 2001: 97-106
J. Han, “Mining Dynamics of Data Streams in Multidimensional Space” (in PowerPoint), ICDM'02 Keynote Speech, Maebashi City, Japan, Dec. 2002.
C. Aggarwal, J. Han, J. Wang, P. S. Yu, “A Framework for Clustering Evolving Data Streams”, Proc. 2003 Int. Conf. on Very Large Data Bases (VLDB'03), Berlin, Germany, Sept. 2003.
H. Wang, W. Fan, P. S. Yu, and J. Han, “Mining Concept-Drifting Data Streams using Ensemble Classifiers”, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
C. Giannella, J. Han, J. Pei, X. Yan and P.S. Yu, “Mining Frequent Patterns in Data Streams at Multiple Time Granularities”, H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha (eds.), Next Generation Data Mining, 2003.
Spatio-temporal and Time-series Data Mining(STT)
K. Koperski and J. Han. Discovery of spatial association rules in geographic information databases. In Proc. 4th Int'l Symp. on Large Spatial Databases (SSD'95), pp. 47-66, Portland, Maine, Aug. 1995.
S. Shekhar, P. Zhang, Y. Huang, R. Vatsavai, Trends in Spatial Data Mining, as a chapter to appear in Data Mining: Next Generation Challenges and Future Directions, Hillol Kargupta and Anupam Joshi (eds.), AAAI/MIT Press, 2003.
J. Han, R. B. Altman, V. Kumar, H. Mannila and D. Pregibon, “Emerging Scientific Applications in Data Mining”, Communications of ACM, 45(8):54-58, 2002.
Shashi Shekhar and Yan Huang, “Discovering Spatial Co-location Patterns: a Summary of Results”, In Proc. of 7th Intl. Symp. on Spatial and Temporal Databases (SSTD), Redondo Beach, CA, July 2001
R. Agrawal and R. Srikant. Mining sequential patterns. In ICDE'95, pp. 3-14, Taipei, Taiwan, March 1995.
Mannila H.; Toivonen H.; Inkeri Verkamo A., Discovery of Frequent Episodes in Event Sequences. Data Mining and Knowledge Discovery, 1997, vol. 1, no. 3, pp. 259-289.
Ester M., Kriegel H.-P., Sander J., Algorithms and Applications for Spatial Data Mining, in: Geographic Data Mining and Knowledge Discovery, Research Monographs in GIS, Taylor and Francis, 2001, pp. 160-187.
M. Garofalakis, R. Rastogi, and K. Shim. SPIRIT: Sequential pattern mining with regular expression constraints. In Proc. 1999 Int. Conf. Very Large Data Bases (VLDB'99), pp. 223-234, Edinburgh, UK, Sept. 1999.
J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth, Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), Heidelberg, Germany, April 2001.
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB'95, pp. 490-501, Zürich, Switzerland, Sept. 1995.
Y.-S. Moon, K.-Y. Whang, W.-K. Loh. Duality-Based Subsequence Matching in Time-Series Databases, Proc. 2001 Int. Conf. Data Engineering (ICDE'01), Heidelberg, Germany, pp. 263-272, April 2001
R. Agrawal, G. Psaila, E. L. Wimmers, and M. Zait. Querying shapes of histories. In VLDB'95, pp. 502-514, Zürich, Switzerland, Sept. 1995.
J. Han, G. Dong, and Y. Yin. Efficient mining of partial periodic patterns in time series database. In ICDE'99, pp. 106-115, Sydney, Australia, April 1999.
J. Pei, J. Han, and W. Wang, “Mining Sequential Patterns with Constraints in Large Databases”, Proc. 2002 Int. Conf. on Information and Knowledge Management (CIKM'02), Washington, D.C., Nov. 2002.
X. Yan and J. Han, “gSpan: Graph-Based Substructure Pattern Mining”, Proc. 2002 Int. Conf. on Data Mining (ICDM'02), Maebashi, Japan, Dec. 2002.
X. Yan and J. Han, “CloseGraph: Mining Closed Frequent Graph Patterns”, Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., Aug. 2003.
X. Yan, J. Han, and R. Afshar, “CloSpan: Mining Closed Sequential Patterns in Large Datasets”, Proc. 2003 SIAM Int. Conf. on Data Mining (SDM'03), San Francisco, CA, May 2003.
S. Chawla, S. Shekhar, W. Wu and U. Ozesmi, Extending Data Mining for Spatial Applications: A Case Study in Predicting Nest Locations, Proc. 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2000), Dallas, TX, May 14, 2000.
Michael Steinbach, Pang-Ning Tan, Vipin Kumar, Steve Klooster, Christopher Potter, Discovery of Climate Indices using Clustering, Proc of the Ninth ACM SIGKDD Int'l Conf on Knowledge Discovery and Data Mining (KDD-2003), Washington, DC, Aug 24-27 (2003).
Information Retrieval and Web Mining(IRW)
S. Chakrabarti, B. E. Dom, S. R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J. Kleinberg. Mining the Web's link structure. COMPUTER, 32(8):60-67, 1999.
J. M. Kleinberg. “Authoritative Sources in a Hyperlinked Environment”. Journal of ACM, 46(5):604-632, 1999.
H. Yu, J. Han, and K. C.-C. Chang, “PEBL: Positive Example Based Learning for Web Page Classification Using SVM”, Proc. 2002 Int. Conf. on Knowledge Discovery in Databases (KDD'02), Edmonton, Canada, July 2002.
K. Wang, S. Zhou and S. C. Liew. “Building hierarchical classifiers using class proximity”. In VLDB'99, Edinburgh, UK, Sept. 1999.
Mukund Deshpande and George Karypis, Selective Markov Models for Predicting Web-Page Accesses, 1st SIAM Data Mining Conference, 2001
J. Han and K. C.-C. Chang, “Data Mining for Web Intelligence”, Computer, Nov. 2002
Chris Ridings and Mike Shishigin, “PageRank Uncovered”, Google Tech, September, 2002
Pang-Ning Tan, Vipin Kumar, “Discovery of Web Robot Sessions based on their Navigational Patterns”, Data Mining and Knowledge Discovery, 6(1): 9-35 (2002)
Bio-mining(BIO)
J. Yang, P. Yu, W. Wang, and J. Han, “Mining Long Sequential Patterns in a Noisy Environment”, Proc. 2002 ACM-SIGMOD Int. Conf. on Management of Data (SIGMOD'02), Madison, WI, June 2002.
Ying Zhao and George Karypis, “Prediction of Contact Maps Using Support Vector Machines”, IEEE Symposium on Bioinformatics and Bioengineering, 2003
Mukund Deshpande, Michihiro Kuramochi, and George Karypis, Frequent Sub-structure Based Approaches for Classifying Chemical Compounds, IEEE International Conference on Data Mining, 2003
H. Wang, W. Wang, J. Yang, and P.S. Yu. Clustering by pattern similarity in large data sets, Proc. the ACM SIGMOD International Conference on Management of Data (SIGMOD), Madison, Wisconsin, 2002.
Visual Data Mining(VIS)
Tutorial KDD-2002 on "Visual Data Mining: Background, Techniques, and Drug Discovery Applications" by M. Ankerst, G. Grinstein, and D. Keim, Edmonton, Canada.
Tutorial IEEE Visualization 2000 on "An Introduction to Information Visualization Techniques for Exploring Large Databases"
Data Mining Applications and Trends in Data Mining(TRD)
H. Mannila, Theoretical Frameworks of Data Mining. SIGKDD Explorations, 1(2): 30-32, 2000
C. Clifton and D. Marks. Security and Privacy Implications of Data Mining. In Proc. 1996 SIGMOD'96 Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD'96), Montreal, Canada, pp. 15-20, June 1996.
R. Agrawal and R. Srikant. Privacy-preserving data mining. In Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), pages 439-450, Dallas, TX, May 2000.