In recent years, there has been much interest in learning Bayesian networks from data. Learning such models is desirable simply because there is a wide array of off-the-shelf tools that can apply the learned models as expert systems, diagnosis engines, and decision support systems. Practitioners also claim that adaptive Bayesian networks have advantages in their own right as a non-parametric method for density estimation, data analysis, pattern classification, and modeling. Among the reasons cited we find: their semantic clarity and understandability by humans, the ease of acquisition and incorporation of prior knowledge, the ease of integration with optimal decision-making methods, the possibility of causal interpretation of learned models, and the automatic handling of noisy and missing data.
In spite of these claims, and the initial success reported recently, methods that learn Bayesian networks have yet to make the impact that other techniques such as neural networks and hidden Markov models have made in applications such as pattern and speech recognition. In this paper, we challenge the research community to identify and characterize domains where induction of Bayesian networks makes the critical difference, and to quantify the factors that are responsible for that difference. In addition to formalizing the challenge, we identify research problems whose solution is, in our view, crucial for meeting this challenge.