Predictive Data Mining
with
Multiple Additive Regression Trees
by Jerome H. Friedman, Stanford University
Predicting future outcomes based on past observational data is a
common application in data mining. The primary goal is usually
predictive accuracy, with secondary goals being speed, ease of use,
and interpretability of the resulting predictive model. New automated
procedures for predictive data mining, based on "boosting" CART
regression trees, are described. The goal is a class of fast
"off-the-shelf" procedures for classification and regression that are
competitive in accuracy with more customized approaches, while being
fairly automatic to use (little tuning) and highly robust, especially
when applied to less-than-clean data. Tools are presented for
interpreting and visualizing these multiple additive regression tree
(MART) models.
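
To make the idea of "boosting" regression trees concrete, here is a minimal sketch, not the MART software itself. It assumes squared-error loss, a single numeric input with at least two distinct values, and depth-1 trees ("stumps"). The names fit_stump, boost, and predict, and the n_trees and learning_rate parameters, are hypothetical choices made for illustration only.

```python
# Illustrative sketch only: least-squares boosting with depth-1 regression
# trees (stumps) on a single numeric input. Function and parameter names
# are assumptions for this example, not part of any MART implementation.

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    # Candidate thresholds: every distinct x except the largest, so that
    # both sides of the split are always non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Fit an additive model: a constant plus a sum of shrunken stumps."""
    base = sum(ys) / len(ys)          # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss, each new tree is fit to the residuals.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, preds)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Evaluate the additive model at a single input value."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )
```

Calling boost([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5]) and then predict(model, 2) gives a value close to 1. Each tree corrects a fraction of the remaining error, so the fit improves gradually as trees are added.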