N. Brummer: Optimization of Accuracy and Calibration of Binary and Multiclass Pattern Recognizers for Wide Ranges of Applications
Lecture room A112, FIT VUT Bozetechova, 10:00-12:00, 25.3.2009

It is common practice in many fields of basic pattern recognition
research to evaluate performance as the misclassification error rate on
a given evaluation database. A limitation of this approach is that it
implicitly assumes that all types of misclassification have equal cost
and that the prior class distribution equals the relative proportions
of classes in the evaluation database.
In this talk, we generalize the traditional error-rate evaluation to
create an evaluation criterion that allows optimization of pattern
recognizers for wide ranges of applications with different class
priors and misclassification costs. We further show that this same
strategy optimizes the amount of relevant information that recognizers
deliver to the user.
In particular, we consider a class of evaluation objectives known as
"proper scoring rules", which effectively optimize the ability of
pattern recognizers to make minimum-expected-cost Bayes decisions. In
this framework, we design our pattern recognizers to:
- extract from the input as much relevant information as possible about
the unknown classes, and
- output this information in the form of well-calibrated class
likelihoods.
We refer to this form of output as "application-independent". Then, when
application-specific priors and costs are added, the likelihoods can be
used in a straightforward and standard way to make
minimum-expected-cost Bayes decisions.
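As a minimal sketch of this standard recipe (in Python with NumPy; the class counts, priors, and cost matrix below are illustrative, not from the talk), application-independent likelihoods combine with application-specific priors and costs as follows:

```python
import numpy as np

def bayes_decision(log_likelihoods, priors, costs):
    """Pick the class that minimizes expected cost.

    log_likelihoods: shape (K,), log p(x | class k) from the recognizer
    priors:          shape (K,), application-specific class priors
    costs:           shape (K, K), costs[i, j] = cost of deciding i when truth is j
    """
    # Posterior via Bayes' rule (normalizer cancels in the argmin, but
    # we normalize anyway for clarity).
    joint = np.exp(log_likelihoods) * priors
    posterior = joint / joint.sum()
    # Expected cost of each candidate decision, then pick the cheapest.
    expected_cost = costs @ posterior
    return int(np.argmin(expected_cost))

# Toy example: 2 classes, asymmetric costs. Class 0 is more likely
# (posterior 2/3), but wrongly deciding 0 when the truth is 1 is so
# expensive that the minimum-expected-cost decision is class 1.
ll = np.log([0.6, 0.3])
priors = np.array([0.5, 0.5])
costs = np.array([[0.0, 10.0],
                  [1.0, 0.0]])
print(bayes_decision(ll, priors, costs))  # → 1
```

With symmetric 0/1 costs the same likelihoods give the maximum-posterior decision (class 0), which shows how one application-independent output serves many cost configurations.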
A given proper scoring rule can be interpreted as a weighted
combination of misclassification costs, with a weight distribution over
different costs and/or priors. On the other hand, proper scoring rules
can also be interpreted as generalized measures of uncertainty and
therefore as generalized measures of information. We show that there is
a particular weighting distribution which yields the logarithmic proper
scoring rule, and for which the associated
uncertainty measure is Shannon's entropy, the canonical
information measure. We conclude that optimizing the logarithmic
scoring rule not only minimizes error rates and misclassification
costs, but it also maximizes the effective amount of relevant
information delivered to the user by the recognizer.
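The information interpretation can be made concrete: with uniform priors, a recognizer that delivers no information scores exactly log2(K) bits under the logarithmic scoring rule, i.e. the Shannon entropy of the prior, while a perfect recognizer approaches 0 bits. A minimal sketch (Python/NumPy; the function name and toy data are ours, not from the talk):

```python
import numpy as np

def log_score_bits(log_likelihoods, labels, priors):
    """Average multiclass logarithmic scoring rule, in bits (lower is better).

    log_likelihoods: shape (N, K), recognizer outputs log p(x_n | class k)
    labels:          shape (N,), true class indices
    priors:          shape (K,), evaluation priors
    """
    log_joint = log_likelihoods + np.log(priors)
    # Normalize to log posteriors with a numerically stable log-sum-exp.
    m = log_joint.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))
    log_post = log_joint - log_norm
    # Negative log posterior of the true class, averaged, converted to bits.
    return -log_post[np.arange(len(labels)), labels].mean() / np.log(2)

# Flat likelihoods carry no information: the score is the prior entropy.
K = 3
flat = np.zeros((4, K))
print(log_score_bits(flat, np.array([0, 1, 2, 0]), np.ones(K) / K))  # → log2(3) ≈ 1.585
```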
We discuss separately our strategies for binary and multiclass pattern
recognition:
- We illustrate the binary case with the example of speaker
recognition, where the calibration of detection scores in
likelihood-ratio form is of particular importance for forensic
applications.
- We illustrate the multiclass case with examples from the recent 2007
NIST Language Recognition Evaluation, where we experiment with the
language recognizers of 7 different research teams, all of which had
been designed with one particular language detection application in
mind. We show that by recalibrating these recognizers through
optimization of a multiclass logarithmic scoring rule, they can be
successfully applied to thousands of other applications.
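For the binary case above, the logarithmic scoring rule applied to log-likelihood-ratio outputs at a prior of 0.5 gives the Cllr metric used in speaker recognition evaluation. A minimal sketch (Python/NumPy; the toy scores are illustrative):

```python
import numpy as np

def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost (Cllr) in bits: the binary logarithmic
    scoring rule averaged over target and non-target trials at prior 0.5."""
    c_tar = np.mean(np.log2(1.0 + np.exp(-np.asarray(target_llrs, dtype=float))))
    c_non = np.mean(np.log2(1.0 + np.exp(np.asarray(nontarget_llrs, dtype=float))))
    return 0.5 * (c_tar + c_non)

# Uninformative log-LRs (all zero) cost exactly 1 bit; well-calibrated,
# well-separated scores drive Cllr toward 0.
print(cllr([0.0, 0.0], [0.0, 0.0]))    # → 1.0
print(cllr([4.0, 5.0], [-4.0, -6.0]))  # small, close to 0
```

Because Cllr penalizes badly calibrated log-likelihood-ratios as well as poorly separated ones, minimizing it serves exactly the forensic use case mentioned above, where the reported likelihood ratio itself must be trustworthy.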
Slides available from http://www.fit.vutbr.cz/research/groups/speech/servite/2009/20070226NBrummer.pdf
Niko's pages: http://niko.brummer.googlepages.com/
Speaker: Niko Brümmer, Agnitio
