Sequential Supervised Learning: General Methods for Sequence
Labeling and Segmentation
by
Thomas G. Dietterich, Oregon State University, USA
Many existing and emerging applications of machine learning
and data mining involve the problem of labeling the elements of a
sequence. Examples include information extraction from web pages,
part-of-speech tagging in computational linguistics, protein and DNA
sequence analysis, and computer intrusion detection. In all of these
tasks, the training examples consist of pairs (X,Y), where X is a
sequence of objects or events (x1, ..., xT) each described by a vector
of features, and Y is a matching sequence of class labels (y1, ...,
yT). Given a new sequence of objects X, the goal is to predict the
corresponding sequence of labels Y. This is an example of "collective
classification", in which each object xt is classified jointly with
all of the other objects in the sequence. This talk
will discuss practical, off-the-shelf machine learning methods for
sequential supervised learning and describe our experience with
applications in computational linguistics, information extraction, and
bioinformatics.
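
To make the (X,Y) formulation above concrete, here is a minimal sketch of one common off-the-shelf reduction, the sliding-window method, which turns each sequence into ordinary per-position training examples. The abstract does not say which methods the talk covers, so treat this as an illustrative assumption. The function name sliding_window_examples, the half_width parameter, the zero padding, and the toy data are all hypothetical. Note that a plain sliding window predicts each label yt independently from nearby inputs, so it only approximates the collective classification described above.

```python
from typing import List, Sequence, Tuple

Features = Tuple[float, ...]


def sliding_window_examples(
    X: Sequence[Features],
    Y: Sequence[str],
    half_width: int = 1,
) -> List[Tuple[Features, str]]:
    """Turn one (X, Y) sequence pair into per-position training examples.

    Each x_t is concatenated with its neighbours x_{t-w} .. x_{t+w};
    positions beyond either end of the sequence are padded with zeros.
    """
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same length T")
    if not X:
        return []
    zero = (0.0,) * len(X[0])  # padding vector (an assumption of this sketch)
    examples = []
    for t in range(len(X)):
        window: List[float] = []
        for k in range(t - half_width, t + half_width + 1):
            window.extend(X[k] if 0 <= k < len(X) else zero)
        examples.append((tuple(window), Y[t]))
    return examples


# Toy sequence with T = 3 and one feature per element (values are made up).
X = [(1.0,), (2.0,), (3.0,)]
Y = ["A", "B", "A"]
examples = sliding_window_examples(X, Y, half_width=1)
```

The resulting (window, label) pairs can be passed to any standard classifier. Methods that model dependencies between adjacent labels would need a different setup.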