A Machine Learning Primer (English)

Tuesday, March 01, 2016

Everyone is talking about Machine Learning. People are confusing it with big data, with artificial intelligence, and every other concept associated with computers doing something impressive.

This short piece will explain what Machine Learning actually is, and why people are so excited about it.

What is Machine Learning?

Machine Learning is a tool for making predictions.

It’s not magic. It’s just computers doing statistical work for us, on a massive scale, so they can give us reliable predictions of what might happen in the future.

A real-world example

Imagine that you’re an online store owner, and you’re trying to figure out who buys things from your store. Is it young people? Is it older people? People from the east coast? People in college?

It’d be nice to know these things because it would help you market to them, and it would help you improve their purchasing experience. But you don’t really know them, because it’s hard to spot trends like that in so much data.

That’s what Machine Learning does. It consumes massive amounts of data about your customers so that it can tell you the chances that the next person who comes to your website will buy something.

You can think of it working in three main steps:

1. Understand the prediction you’re trying to make about the future. E.g., “Is this person likely to buy something from my store?”

2. Collect and classify tons of data about the problem. Collect a trove of data that can be broken into a set of attributes. If you’re trying to sell something, the data might be things about the customers, such as “married”, “single”, “already owns a bicycle”, “likes Game of Thrones”, etc. The more data the better for this step, since you’re not sure which variable(s) might make someone likely to buy.

3. Feed the system a set of answers. We know what we’re asking, and we know the structure of the data, so now we tell the system who did buy something in the past. And once again, the more data the better.
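As a rough illustration of steps 2 and 3, here is one way the data could be laid out in Python. This is only a sketch: the Visitor class, the attribute names, the four sample records, and the purchase_rate helper are all hypothetical and are not taken from any real store or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visitor:
    """One past visitor: what we know about them, and the known answer."""
    attributes: frozenset  # step 2: attributes such as "married"
    bought: bool           # step 3: did this visitor buy something?


# Made-up history, for illustration only.
history = [
    Visitor(frozenset({"single", "owns_bicycle"}), True),
    Visitor(frozenset({"married"}), False),
    Visitor(frozenset({"married", "owns_bicycle"}), True),
    Visitor(frozenset({"single", "likes_game_of_thrones"}), False),
]


def purchase_rate(history, attribute):
    """Share of past visitors with `attribute` who bought something.

    Returns None when no past visitor had the attribute.
    """
    matching = [v for v in history if attribute in v.attributes]
    if not matching:
        return None
    return sum(v.bought for v in matching) / len(matching)
```

Looking at one attribute at a time like this is only a starting point. With four records the numbers mean very little, which is why the steps above keep asking for more data.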

Now the Machine Learning algorithm statistically compares the attributes of the person who just showed up to the attributes of those who are known to have purchased. Based on how those attributes match up, it gives you a likelihood that the new visitor will make a purchase.

Example:

Because the current visitor to your website recently entered a relationship, and is under 30, living in San Francisco, female, in college, and a bicycle owner, there is an 89% chance she will purchase product Y if she visits that page.
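To make the comparison concrete, here is a minimal sketch of one simple technique that works this way, a naive Bayes estimate with add-one smoothing. It is an assumption chosen for illustration, not necessarily the algorithm a real system would use, and the function names, attribute names, and sample history are all made up.

```python
from collections import Counter


def train(history):
    """Count how often each attribute appears among buyers and non-buyers.

    `history` is a list of (set_of_attributes, bought) pairs.
    """
    totals = Counter()                            # visitors per outcome
    counts = {True: Counter(), False: Counter()}  # attribute counts per outcome
    for attributes, bought in history:
        totals[bought] += 1
        counts[bought].update(attributes)
    return totals, counts


def purchase_probability(model, attributes, vocabulary):
    """Estimate the chance of a purchase, treating attributes as independent."""
    totals, counts = model
    n = sum(totals.values())
    scores = {}
    for label in (True, False):
        # Prior for this outcome, with add-one smoothing.
        score = (totals[label] + 1) / (n + 2)
        for attr in vocabulary:
            # Smoothed chance that a visitor with this outcome has `attr`.
            p = (counts[label][attr] + 1) / (totals[label] + 2)
            score *= p if attr in attributes else (1 - p)
        scores[label] = score
    # Normalize so the two outcomes sum to 1.
    return scores[True] / (scores[True] + scores[False])


# Made-up history: three buyers and three non-buyers.
history = [
    ({"under_30", "owns_bicycle"}, True),
    ({"under_30", "in_college"}, True),
    ({"owns_bicycle"}, True),
    ({"married"}, False),
    ({"married", "owns_bicycle"}, False),
    ({"in_college"}, False),
]
vocabulary = {"under_30", "owns_bicycle", "in_college", "married"}
model = train(history)

likely = purchase_probability(model, {"under_30", "owns_bicycle"}, vocabulary)
unlikely = purchase_probability(model, {"married"}, vocabulary)
```

Here the visitor who looks like past buyers gets a much higher estimate than the one who looks like past non-buyers. With only six records the exact figures are not trustworthy; a percentage like the one in the example above would need far more history behind it.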

Why it Matters

Machine Learning matters because predicting the future is lucrative for advertisers (and many others). It’s not the type of thing where you can ask it who will win the World Series, or when a regime will fall. Remember, you have to have an extremely specific prediction in mind for the system to calibrate on, and you need TONS of data to feed it to learn from.

This is why it’s so valuable to advertisers and other groups that benefit from predicting very specific scenarios using significant amounts of data about the decision makers.

Summary

Machine Learning is a technique for programming computers to predict whether someone is going to do a very specific thing. It makes these predictions by studying massive amounts of data about the decision maker, combined with massive numbers of examples of when people performed that same action in the past.

Notes

1. There are two primary constraints to magic-like machine learning prediction: 1) the processing power of our computers, and 2) the amount of data we can feed them. As these improve, so does our ability to predict outcomes using Machine Learning.