The Hundred-Page Machine Learning Book – Andriy Burkov

Let’s start by telling the truth: machines don’t learn. What a typical “learning machine” does is find a mathematical formula that, when applied to a collection of inputs (called “training data”), produces the desired outputs. This formula also generates the correct outputs for most other inputs (distinct from the training data), on the condition that those inputs come from the same statistical distribution as the training data, or a similar one.

Why isn’t that learning? Because if you slightly distort the inputs, the output is very likely to become completely wrong. That is not how learning in animals works. If you learned to play a video game by looking straight at the screen, you would still be a good player if someone rotated the screen slightly. A machine learning algorithm trained by “looking” straight at the screen will fail to play the game on a rotated screen, unless it was also trained to recognize rotation.

So why the name “machine learning”? The reason, as is often the case, is marketing: Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term in 1959 while at IBM. Just as IBM tried in the 2010s to market the term “cognitive computing” to stand out from the competition, in the 1960s it used the cool new term “machine learning” to attract both clients and talented employees.

As you can see, just as artificial intelligence is not intelligence, machine learning is not learning. However, “machine learning” is a universally recognized term that usually refers to the science and engineering of building machines capable of doing various useful things without being explicitly programmed to do so. So the word “learning” in the term is used by analogy with learning in animals rather than literally.
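
The claim that a fitted formula works only on inputs resembling the training data can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not something from the text: the hidden relationship `target` (here x squared), the input ranges, and the choice of a straight line as the "formula". The line is fitted by ordinary least squares on inputs drawn from [0, 1], then evaluated on fresh inputs from the same range and on inputs shifted to [3, 4].

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for the formula y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


def mean_abs_error(w, b, xs, ys):
    """Average absolute gap between the formula's outputs and the desired ones."""
    return sum(abs((w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)


def target(x):
    # Hypothetical "true" relationship; the fitted line only ever sees samples of it.
    return x * x


rng = random.Random(0)  # fixed seed so the run is repeatable

# Training data: inputs drawn uniformly from [0, 1].
train_x = [rng.uniform(0.0, 1.0) for _ in range(200)]
train_y = [target(x) for x in train_x]
w, b = fit_line(train_x, train_y)

# New inputs from the same distribution as the training data.
same_x = [rng.uniform(0.0, 1.0) for _ in range(200)]
# New inputs from a different distribution (shifted range).
shifted_x = [rng.uniform(3.0, 4.0) for _ in range(200)]

err_same = mean_abs_error(w, b, same_x, [target(x) for x in same_x])
err_shifted = mean_abs_error(w, b, shifted_x, [target(x) for x in shifted_x])

print(f"error on inputs like the training data: {err_same:.3f}")
print(f"error on shifted inputs:                {err_shifted:.3f}")
```

Under these assumptions the line is a reasonable approximation inside the training range and a poor one outside it, which mirrors the rotated-screen example: the formula was never fitted to inputs of that kind.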