This is part of our deep dive into generative models, which will eventually loop us back to BN but will also shed light on GAN approaches. Here is some background, along with relevant resources:
Generative models
Under the generative model approach we attempt to model the joint
distribution p(x, y). Given x, we apply Bayes' rule to our model and
classify x as the y for which p(y | x) is largest.
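To make the classification rule concrete, here is a minimal sketch on a toy problem where x and y are both discrete, so the joint distribution fits in a small table. The numbers in the table and the names `joint`, `posterior` and `classify` are made up for illustration; they do not come from the lecture.

```python
import numpy as np

# Hypothetical joint distribution p(x, y): 3 possible values of x (rows)
# and 2 classes y (columns). The entries sum to 1.
joint = np.array([
    [0.30, 0.05],
    [0.10, 0.15],
    [0.05, 0.35],
])

def posterior(joint, x):
    """Return p(y | x) = p(x, y) / p(x) for the row index x."""
    p_x = joint[x].sum()  # marginal p(x), summing p(x, y) over y
    return joint[x] / p_x

def classify(joint, x):
    """Pick the class y for which p(y | x) is largest."""
    return int(np.argmax(posterior(joint, x)))

predictions = [classify(joint, x) for x in range(joint.shape[0])]
print(predictions)
```

With a table this small the joint distribution is easy to write down. The difficulty starts when x is a high-dimensional vector and the table can no longer be enumerated.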
A straightforward application of Bayes' rule is to attempt to
estimate the probabilities in the rule p(y | x) p(x) = p(x | y) p(y).
With the typically large number of dimensions of the vector x, density
estimation of the required quantities is really hard. See the first 30 mins
of https://m.youtube.com/watch?v=_m7TMkzZzus#fauxfullscreen for
details.
As modeling the joint distribution p(x, y) is hard, simplifying assumptions
are introduced, leading to different, more concrete classification
techniques.
LDA
LDA models each p(x | y) as a Gaussian distribution. This StatQuest
video describes how the mean and standard deviation of the distribution are
chosen to maximize the separation between the classes over the training set: https://m.youtube.com/watch?v=azXCzI57Yfc
The second 30 mins of this lecture derive LDA and explain what happens if
the covariance matrices of all classes are I (the identity): https://m.youtube.com/watch?v=_m7TMkzZzus#
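As a rough sketch of what Gaussian class conditionals give us, the code below fits one mean per class plus a single pooled covariance matrix, and classifies with the usual linear discriminant score. This is the textbook shared-covariance form of LDA, not necessarily the exact derivation in the lecture. The helper names `fit_lda` and `predict_lda` and the synthetic two-class data are assumptions made for illustration.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate class means, class priors and a pooled covariance matrix."""
    classes = np.unique(y)
    n, d = X.shape
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])
    cov = np.zeros((d, d))
    for c, mu in zip(classes, means):
        diff = X[y == c] - mu
        cov += diff.T @ diff  # within-class scatter
    cov /= n - len(classes)
    return classes, means, priors, cov

def predict_lda(X, classes, means, priors, cov):
    """Score each class k by x^T S^-1 mu_k - 0.5 mu_k^T S^-1 mu_k + log p(y=k)."""
    cov_inv = np.linalg.inv(cov)
    linear = X @ cov_inv @ means.T
    const = -0.5 * np.sum(means @ cov_inv * means, axis=1) + np.log(priors)
    return classes[np.argmax(linear + const, axis=1)]

# Synthetic data (an assumption): two Gaussian blobs with identity covariance.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(100, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

params = fit_lda(X, y)
accuracy = (predict_lda(X, *params) == y).mean()
print(accuracy)
```

If the covariance passed to `predict_lda` is the identity and the priors are equal, the score reduces to picking the class whose mean is closest to x in Euclidean distance.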
Here the estimation of the covariance matrix of a random vector is
explained in detail: https://en.m.wikipedia.org/wiki/Estimation_of_covariance_matrices
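A small sketch of the two standard estimators, the unbiased sample covariance (dividing by n - 1) and the Gaussian maximum likelihood estimate (dividing by n), may help here. The "true" covariance matrix and the sample size are arbitrary values chosen for illustration.

```python
import numpy as np

# Hypothetical ground truth used only to generate samples.
rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 0.6],
                     [0.6, 1.0]])
samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=5000)

n = samples.shape[0]
mean_hat = samples.mean(axis=0)   # sample mean
centered = samples - mean_hat     # subtract the mean from every row

cov_unbiased = centered.T @ centered / (n - 1)  # unbiased estimator
cov_mle = centered.T @ centered / n             # maximum likelihood estimator

print(cov_unbiased)
print(cov_mle)
```

The two estimates differ only by the factor (n - 1) / n, so with thousands of samples they are nearly identical. NumPy's `np.cov(samples, rowvar=False)` computes the unbiased version directly.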
See chapter 24 of the Understanding Machine Learning book for broader coverage of
generative models: https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf
The background required for the Gaussian distribution and the covariance
matrix is covered here: http://cs229.stanford.edu/section/gaussians.pdf