10 Challenging Problems in Data Mining Research


In October 2005, we set out to identify 10 challenging problems in data mining research by asking some of the most active researchers in data mining and machine learning which topics they consider important and worthy of future research. We hope their insights will inspire new research efforts and give young researchers (including PhD students) a high-level guide to where the hot problems in data mining lie.

The results were presented at the Fifth IEEE International Conference on Data Mining (ICDM '05).

The 10 challenging problems are listed below (the order of the list does not reflect their importance):
  1. Developing a Unifying Theory of Data Mining
  2. Scaling Up for High Dimensional Data and High Speed Data Streams
  3. Mining Sequence Data and Time Series Data
  4. Mining Complex Knowledge from Complex Data
  5. Data Mining in a Network Setting
  6. Distributed Data Mining and Mining Multi-agent Data
  7. Data Mining for Biological and Environmental Problems
  8. Data-Mining-Process Related Problems
  9. Security, Privacy and Data Integrity
  10. Dealing with Non-static, Unbalanced and Cost-sensitive Data

Qiang Yang and Xindong Wu


Last updated: January 18, 2007.