What is reinforcement learning and how can it be applied to business
From time to time we hear about a new direction in data science that is supposed to bring some kind of breakthrough. I have been working with control systems since the 2000s, and I want to share some thoughts on a topic that has been popular for the past few years and is also connected to control systems: reinforcement learning (RL). RL is not always applicable and is not widespread, but in some cases it is the right tool. Below I cover what RL is, possible use cases, and where it works best.
Types of Machine Learning
There are several types of learning in ML: supervised, unsupervised and reinforcement.
Each of them builds a model of a function from some data.
Supervised: you provide both X and Y. As an expert, you say what the answer should be, and the model is trained on that basis.
Unsupervised: you provide only X, and the model assigns Y itself. For example, you give it a lot of personal data and the model sorts the records into groups.
Reinforcement: there is no expert who provides X and Y. Instead, the model learns from many experiments in an environment. Each experiment ends with feedback on whether the outcome was good or bad, and this feedback drives the learning. Where this type of ML is actually required is a good question, because it is needed quite rarely.
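To make the first two types concrete, here is a minimal sketch in plain Python. The data and the function names fit_line and two_means are made up for illustration. The supervised part fits a straight line to labeled (X, Y) pairs with least squares; the unsupervised part splits unlabeled ages into two groups with a tiny one-dimensional k-means. It assumes the input has at least two distinct values.

```python
def fit_line(xs, ys):
    """Supervised: learn y = slope * x + intercept from labeled pairs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def two_means(xs, iterations=10):
    """Unsupervised: split unlabeled numbers into two groups (1-D k-means)."""
    low, high = min(xs), max(xs)  # initial group centers
    labels = []
    for _ in range(iterations):
        # Assign every point to the nearer center.
        labels = [0 if abs(x - low) <= abs(x - high) else 1 for x in xs]
        group0 = [x for x, label in zip(xs, labels) if label == 0]
        group1 = [x for x, label in zip(xs, labels) if label == 1]
        # Move each center to the mean of its group.
        low, high = sum(group0) / len(group0), sum(group1) / len(group1)
    return labels


# Supervised: the expert supplies the answers Y for every X.
slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])

# Unsupervised: only X is given, the model invents the groups itself.
age_groups = two_means([21, 23, 25, 60, 62, 65])

print(slope, intercept, age_groups)
```

In the first call the expert's answers are part of the input. In the second call nobody tells the model which group is which; it only discovers that there are two clusters of ages.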
What is reinforcement learning
The main idea is that the model is not trained on data prepared beforehand but in an environment that tells it what is right and what is wrong through symbolic carrots and sticks.
Dog training is a good example of reinforcement learning.
By the way, do you remember this moment from The Big Bang Theory :)? It looks a lot like reinforcement learning.
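The carrot-and-stick idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the environment is a made-up corridor of six cells with a stick at one end and a carrot at the other, and the learning method is tabular Q-learning, one common RL algorithm among many. The constants and function names are hypothetical.

```python
import random

# Hypothetical toy environment: a corridor of 6 cells.
# Cell 0 is the "stick" (reward -1), cell 5 is the "carrot" (reward +1).
# Reaching either end finishes the experiment (episode).
N_STATES = 6
ACTIONS = ("left", "right")
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration


def step(state, action):
    """Move one cell and return (next_state, reward, done)."""
    next_state = state - 1 if action == 0 else state + 1
    if next_state == 0:
        return next_state, -1.0, True
    if next_state == N_STATES - 1:
        return next_state, 1.0, True
    return next_state, 0.0, False


def choose_action(q_row, rng):
    """Epsilon-greedy: mostly take the best known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def train(episodes=500, seed=0):
    """Run many experiments and update the value table after every step."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(1, N_STATES - 1)  # random non-terminal start
        done = False
        while not done:
            action = choose_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward now plus discounted best future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Best learned action for each non-terminal cell (cells 1 to 4).
policy = [ACTIONS[row.index(max(row))] for row in q_table[1:-1]]
print(policy)
```

Nobody tells the agent that the carrot is on the right. It only sees the reward at the end of each experiment, and after enough experiments the learned policy should point toward the carrot from every cell.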
When definitely not to use reinforcement learning
It is important to understand that you need an environment that simulates how your system behaves in reality. If you cannot run experiments, RL cannot be used.
You also need to be able to run many experiments in that environment, which can be quite expensive.
When reinforcement learning can be used
In general, there are many methods for solving control problems: STRIPS, decision trees, HTN, utility systems, MCTS, etc. All of them are still in use today. When to use them and when to use RL deserves a separate article.
To make a long story short, RL is worth considering:
- When the number of options is too large even for algorithms that search through them in a directed way.
- When there is not enough expert knowledge to develop rules.
In 1997, Deep Blue defeated world champion Garry Kasparov. Deep Blue did not use ML; search algorithms were enough. But there was still one game left where people could win: Go. This did not change until 2016, when AlphaGo defeated Lee Sedol. AlphaGo already used ML.
Go is a good example for the two points above.
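A quick back-of-the-envelope calculation shows why enumeration stops working. The sketch below uses commonly cited rough estimates, not exact values: about 35 legal moves per position and about 80 moves per game for chess, and about 250 moves per position and about 150 moves per game for Go. The function name is my own.

```python
def game_tree_digits(branching_factor, game_length):
    """Number of decimal digits in branching_factor ** game_length."""
    return len(str(branching_factor ** game_length))


# Rough, commonly cited estimates (assumptions, not exact values).
chess_digits = game_tree_digits(35, 80)
go_digits = game_tree_digits(250, 150)

print(f"chess: a number with about {chess_digits} digits")
print(f"go:    a number with about {go_digits} digits")
```

Under these estimates, the number of possible Go games has roughly three times as many digits as the number of possible chess games, so even a well-directed search covers only a vanishing fraction of the options.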
- When a decision needs to be made in real time.
Take an autopilot as an example: there is no time to go through the options, and the number of possible situations is too large to write rules for every case.
One example of where this might be needed is an autopilot for an aircraft that flies just above the surface of the water. Such an aircraft uses energy very efficiently but has to keep a certain distance from the water. Developing such an autopilot costs millions of dollars. With reinforcement learning, it could be done more cheaply using a flight simulator.
To conclude, here is a robot that is learning how to walk.