Sunday, January 18, 2015

generative and discriminative classifiers: NB and LR

Source: Tom Mitchell's draft chapter on Naive Bayes and Logistic Regression ("Tom-NB&LR").

• We can use Bayes rule as the basis for designing learning algorithms (function approximators), as follows: given that we wish to learn some target function f : X → Y, or equivalently, P(Y|X), we use the training data to learn estimates of P(X|Y) and P(Y). New X examples can then be classified using these estimated probability distributions, plus Bayes rule. This type of classifier is called a generative classifier, because we can view the distribution P(X|Y) as describing how to generate random instances X conditioned on the target attribute Y.
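
Concretely, Bayes rule converts the learned generative quantities into the posterior needed for classification:

P(Y|X) = P(X|Y) P(Y) / P(X)

A new instance X is then labeled with the most probable class, Y ← argmax_y P(X|Y = y) P(Y = y); the denominator P(X) is the same for every y, so it can be dropped.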

• Learning Bayes classifiers typically requires an unrealistic number of training examples (i.e., more than |X| training examples, where X is the instance space) unless some form of prior assumption is made about the form of P(X|Y). The Naive Bayes classifier assumes all attributes describing X are conditionally independent given Y. This assumption dramatically reduces the number of parameters that must be estimated to learn the classifier. Naive Bayes is a widely used learning algorithm, for both discrete and continuous X. (See the sketch below.)
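
To make the parameter savings concrete: with n boolean attributes and boolean Y, modeling P(X|Y) directly takes 2(2^n − 1) parameters, while conditional independence cuts this to 2n (plus one for P(Y)). Below is a minimal sketch of discrete Naive Bayes with Laplace smoothing; the function and variable names are my own illustration, not from the chapter.

```python
from collections import defaultdict
import math

def train_nb(examples, labels, alpha=1.0):
    """Estimate P(Y) and P(X_i = 1 | Y) from binary feature vectors,
    with add-alpha (Laplace) smoothing."""
    n = len(examples[0])
    class_count = defaultdict(int)
    ones = defaultdict(lambda: [0] * n)   # per-class counts of X_i = 1
    for x, y in zip(examples, labels):
        class_count[y] += 1
        for i, xi in enumerate(x):
            ones[y][i] += xi
    priors = {y: c / len(labels) for y, c in class_count.items()}
    cond = {y: [(ones[y][i] + alpha) / (class_count[y] + 2 * alpha)
                for i in range(n)] for y in class_count}
    return priors, cond

def predict_nb(x, priors, cond):
    """Return argmax_y of log P(Y=y) + sum_i log P(X_i = x_i | Y=y)."""
    best, best_score = None, float("-inf")
    for y, prior in priors.items():
        score = math.log(prior) + sum(
            math.log(p if xi else 1.0 - p) for xi, p in zip(x, cond[y]))
        if score > best_score:
            best, best_score = y, score
    return best

if __name__ == "__main__":
    priors, cond = train_nb([(1, 0, 1), (0, 0, 1), (1, 1, 0)],
                            ["spam", "ham", "spam"])
    print(predict_nb((1, 0, 0), priors, cond))
```
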
• When X is a vector of discrete-valued attributes, Naive Bayes learning algorithms can be viewed as linear classifiers; that is, every such Naive Bayes classifier corresponds to a hyperplane decision surface in X. The same statement holds for Gaussian Naive Bayes classifiers if the variance of each feature is assumed to be independent of the class (i.e., if σ_ik = σ_i).
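
One way to see the linearity (a standard derivation, not verbatim from the notes): for boolean Y, the Naive Bayes assumption makes the log odds a sum of per-attribute terms,

log [ P(Y = 1|X) / P(Y = 0|X) ] = log [ P(Y = 1) / P(Y = 0) ] + Σ_i log [ P(X_i|Y = 1) / P(X_i|Y = 0) ]

With discrete X_i each term is linear in the indicator encoding of X_i, and with Gaussian X_i satisfying σ_ik = σ_i the quadratic terms cancel; either way the log odds reduce to the form w_0 + Σ_i w_i X_i, so the decision surface (log odds = 0) is a hyperplane.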

• Logistic Regression is a function approximation algorithm that uses training data to directly estimate P(Y|X), in contrast to Naive Bayes. In this sense, Logistic Regression is often referred to as a discriminative classifier, because we can view the distribution P(Y|X) as directly discriminating the value of the target Y for any given instance X.
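
For reference, the parametric form Logistic Regression assumes for boolean Y (a standard parameterization; sign conventions differ across texts) is

P(Y = 1|X) = 1 / (1 + exp(−(w_0 + Σ_i w_i X_i)))

with P(Y = 0|X) = 1 − P(Y = 1|X). The weights are chosen to maximize the conditional likelihood of the training labels, typically by gradient ascent. Note that the decision boundary P(Y = 1|X) = 1/2 is exactly the hyperplane w_0 + Σ_i w_i X_i = 0, which anticipates the next point.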

• Logistic Regression is a linear classifier over X. The linear classifiers produced by Logistic Regression and Gaussian Naive Bayes are identical in the limit as the number of training examples approaches infinity, provided the Naive Bayes assumptions hold. However, if these assumptions do not hold, the Naive Bayes bias will cause it to perform less accurately than Logistic Regression in the limit. Put another way, Naive Bayes is a learning algorithm with greater bias, but lower variance, than Logistic Regression. If this bias is appropriate given the actual data, Naive Bayes will be preferred; otherwise, Logistic Regression will be preferred.
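
One way to see this trade-off empirically is to train both classifiers on growing samples from one source and compare held-out accuracy: with few examples Gaussian Naive Bayes' stronger assumptions often win (lower variance), while Logistic Regression tends to catch up or overtake it as data grows, the pattern documented in Ng and Jordan's well-known comparison of the two. Here is a minimal sketch using scikit-learn; the synthetic dataset and parameter choices are illustrative assumptions, not from the notes.

```python
# Compare Gaussian Naive Bayes and Logistic Regression as training size grows.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (illustrative choice).
X, y = make_classification(n_samples=20000, n_features=20,
                           n_informative=10, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=5000, random_state=0)

for m in [50, 200, 1000, 5000]:
    X_tr, y_tr = X_pool[:m], y_pool[:m]
    nb = GaussianNB().fit(X_tr, y_tr)
    lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"m={m:5d}  GNB acc={nb.score(X_test, y_test):.3f}  "
          f"LR acc={lr.score(X_test, y_test):.3f}")
```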

• We can view function approximation learning algorithms as statistical estimators of functions, or of conditional distributions P(Y|X). They estimate P(Y|X) from a sample of training data. As with other statistical estimators, it can be useful to characterize learning algorithms by their bias and expected variance, taken over different samples of training data.
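
To make "bias" and "variance" concrete: for a squared-error estimator θ̂ of a quantity θ, the standard decomposition (a textbook identity, stated here for reference) is

E[(θ̂ − θ)²] = (E[θ̂] − θ)² + Var(θ̂)

that is, expected squared error = bias² + variance, where the expectations are taken over different training samples.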




