### Research Project Name

*Learning Speed Curves: Prediction of Average Case Learning Using VC-Dimension Analysis and Regression*

### Objective

The objective of this project is to predict the learning speed curve for
an inductive learning algorithm when given only a small number of examples drawn
from the target distribution. This objective differs from most existing research
in that the goal is to predict average case performance, not worst case
performance, and to produce results for both noisy and noise-free input.

### Results

Existing general regression techniques are analyzed with respect to their
ability to create accurate predictive learning-speed curves.

A new method of general regression is presented, and implemented in a system
called SEER. The new method’s
model, called the Effective Dimension Model, is based on the Vapnik-Chervonenkis
dimension.

The described experimental results show that SEER accurately predicts, from a
small sample of cases, with and without noise, the number of cases required to
achieve a desired level of classification accuracy.

The resulting average learning speed curves can be used in several ways. If the
goal is to achieve a particular inductive learning accuracy, the prediction
algorithm predicts how many examples are needed. If a given number of additional
training examples is to be collected, the prediction algorithm predicts the
resulting accuracy. If noise-free and noisy examples take different amounts of
time to collect, SEER predicts which approach yields the higher classification
accuracy.
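
The first two uses can be illustrated with a minimal sketch. It does not reproduce SEER, its maximum likelihood regression, or the Effective Dimension Model; as a stand-in it assumes a simple power-law curve, error(m) = a * m^(-b), fitted by ordinary least squares in log-log space. All function names and the pilot data are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error(m) = a * m**(-b) by least squares on (log m, log error).

    Assumption: a power law is used here only for illustration; it is NOT
    the Effective Dimension Model implemented in SEER.
    """
    xs = [math.log(m) for m in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope  # (a, b)


def predicted_error(a, b, m):
    """Predicted average error rate after m training examples."""
    return a * m ** (-b)


def examples_needed(a, b, target_error):
    """Smallest m whose predicted error is at or below target_error."""
    return math.ceil((a / target_error) ** (1.0 / b))


# Hypothetical pilot measurements: average error at a few small sample sizes.
pilot_sizes = [10, 20, 40, 80]
pilot_errors = [0.40, 0.30, 0.22, 0.16]

a, b = fit_power_law(pilot_sizes, pilot_errors)

# Use 1: how many examples are needed to reach 90% accuracy (10% error)?
print("examples needed:", examples_needed(a, b, 0.10))

# Use 2: what accuracy is predicted after collecting 500 examples in total?
print("predicted accuracy:", 1.0 - predicted_error(a, b, 500))
```

Under the same assumption, the third use could be approximated by fitting one curve to noise-free pilot data and one to noisy pilot data, then comparing the predicted accuracy at the number of examples each approach can collect in the available time.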

### Publications

- Carl Myers Kadie (1995). "*Seer: Maximum Likelihood Regression
for Learning-Speed Curves*," Ph.D. Dissertation, Department of
Computer Science, University of Illinois, Urbana-Champaign, July
1995, 106 pages.

- Kadie, C. M. and Wilkins, D. C., "*Speed Curves: Prediction
of Average Case Performance Using VC-Dimension Analysis and
Regression*," draft manuscript, 42 pages.