ML crash directory
Are you familiar with regression (https://m.youtube.com/watch?v=aq8VU5KLmkY, 0:00-5:59)? One way to view ML
is regression on steroids, which means a harder optimization problem
(one that has no closed-form analytic solution and/or is not convex)
with many parameters.
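To make this concrete, here is a minimal sketch (with made-up synthetic data): ordinary least squares has a closed-form solution because squared error is convex and quadratic in the weights, while the gradient descent loop next to it is the kind of iterative procedure ML falls back on when no such solution exists.

```python
# Closed-form least squares vs. iterative gradient descent on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])  # intercept + one feature
true_w = np.array([2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

# Closed form: w = (X^T X)^{-1} X^T y, available because the squared-error
# objective is convex and quadratic in w.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the same objective; this style of optimization still
# works when no analytic solution exists (the usual situation in ML).
w = np.zeros(2)
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
    w -= lr * grad

print(w_closed, w)  # both should be close to [2.0, -3.0]
```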
Let's consider supervised learning first. You are given n labeled data points,
(x1, y1), ..., (xn, yn). Your objective is to find a function f(x) = y that
best predicts y on a new batch of x's. When y is continuous the problem is
called regression, and when it is discrete it is called classification.
There are two things to notice right away:
1. To solve this, an optimization problem is defined, e.g., minimization of squared error in our original regression problem.
2. Trying to explain the given data completely, which is sometimes called interpolation, is actually a pitfall: you may capture random trends and your prediction power may be hindered. This is called overfitting (see the sketch after this list).
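Here is the overfitting sketch (a minimal illustration with made-up data and polynomial degrees): a degree-9 polynomial drives the training error on 10 noisy points to nearly zero, yet predicts new points worse than a simpler degree-3 fit.

```python
# A minimal overfitting sketch: fit polynomials of two degrees to noisy
# samples of a sine curve and compare training vs. test squared error.
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=10)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # degree 9 interpolates the 10 training points (near-zero train error)
    # but typically does worse on the held-out grid
    print(degree, train_err, test_err)
```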
The basic intuition underlying many approaches to the classification
problem is this: had we known p(x, y), then given a new x we would
calculate p(x, y) for each y and choose the y with the greatest
probability. The difficulty is that it is not easy to estimate p(x, y).
View reference two below up to 8:15.
To estimate p(x, y) we could proceed as follows. Recall that
p(x, y) = p(x|y)p(y) = p(y|x)p(x). Thus, estimating p(x), p(y), and p(x|y)
from the training data will let us estimate p(x, y) and p(y|x), and thus
decide, given a new x, its class y. A simplifying independence assumption
leads to the naive Bayes approach, which is intuitively covered in the first
part of Ariel Kleiner's crash course on ML (up to slide 25). This is an
instance of what is referred to as a generative model.
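As a minimal sketch of the idea (a toy Bernoulli naive Bayes on hypothetical binary data, not code from the course above): estimate p(y) and p(x_j|y) by counting, assume the features are independent given y, and predict the y that maximizes p(x, y).

```python
# Toy naive Bayes for binary features: p(x, y) = p(y) * prod_j p(x_j | y).
import numpy as np

X = np.array([[1, 0], [1, 1], [0, 1], [0, 0], [1, 1], [0, 1]])  # binary features
y = np.array([0, 0, 1, 1, 0, 1])                                # class labels

classes = np.unique(y)
priors = np.array([np.mean(y == c) for c in classes])  # p(y) from class counts
# Laplace-smoothed estimates of p(x_j = 1 | y)
cond = np.array([(X[y == c].sum(axis=0) + 1) / ((y == c).sum() + 2)
                 for c in classes])

def predict(x):
    # For each class, multiply p(x_j | y) over the features (the independence
    # assumption), then pick the class maximizing p(y) * p(x | y) = p(x, y).
    likelihood = np.prod(np.where(x == 1, cond, 1 - cond), axis=1)
    return classes[np.argmax(priors * likelihood)]

print(predict(np.array([1, 0])))  # -> 0 on this toy data
```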
Yet another approach is to define an optimization that attempts to maximize
performance on the training data while keeping f(x) simple. This is done in
a variety of ways.
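One common instance, sketched below with made-up data, is ridge regression: add a penalty lam * ||w||^2 to the squared error, so that increasing lam shrinks the weights toward zero and keeps f simple at the cost of some fit to the training set.

```python
# Ridge regression: trade training-set fit against weight size (simplicity).
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))
w_true = np.array([1.0, 0.0, 0.0, 2.0, 0.0])
y = X @ w_true + rng.normal(scale=0.5, size=30)

for lam in (0.0, 1.0, 100.0):
    # Closed form for the penalized objective: w = (X^T X + lam*I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
    print(lam, np.round(w, 2))  # larger lam -> smaller (simpler) weights
```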
For a deep dive on ML concepts, see reference three below. Iterate between
reference three and a simple ML tutorial in Python or R to master the
subject.
References
1. An introduction for programmers on why ML is useful to master. Note that this introduction ignores the challenges of applying it where it excels and of dealing with drift.
2. A nice overview that starts with classification. The only thing to be careful of is the claim that neural networks are not statistical models: estimating a neural network's performance should be done using the same standard statistical tools, e.g., cross-validation.
3. An intuitive deep dive on the concepts of machine learning is given by Hal Daumé III.