https://en.wikipedia.org/wiki/AlphaGo_Zero
Training artificial intelligence without datasets derived from human experts is… valuable in practice because expert data is “often expensive, unreliable or simply unavailable.”
AlphaGo Zero’s neural network was trained using TensorFlow. The program engaged in reinforcement learning, playing against itself until it could anticipate its own moves and how those moves would affect the game’s outcome.
So the program is trained by playing against itself, not by studying past games between other players.
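To make the idea of learning by self-play concrete, here is a minimal sketch on a toy game. It is not AlphaGo Zero's actual method, which combines a deep neural network with Monte Carlo tree search. It assumes a much simpler setup: a single-pile Nim game (take 1 to 3 stones, whoever takes the last stone wins) and a plain lookup table of move values updated with a Q-learning-style rule. All names (`train_self_play`, `best_move`, `legal_moves`) and all parameter values are made up for illustration.

```python
import random


def legal_moves(stones):
    """Moves allowed in this toy Nim game: take 1, 2 or 3 stones."""
    return [a for a in (1, 2, 3) if a <= stones]


def train_self_play(start=10, episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn move values purely from games the learner plays against itself.

    q[(stones, action)] estimates the outcome for the player about to move:
    close to +1 means a winning move, close to -1 a losing one.
    No human games or expert data are used (hypothetical toy setup).
    """
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        stones = start
        while stones > 0:
            moves = legal_moves(stones)
            # Both sides share the same table; explore occasionally.
            if rng.random() < epsilon:
                action = rng.choice(moves)
            else:
                action = max(moves, key=lambda a: q.get((stones, a), 0.0))
            remaining = stones - action
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so their best value is our loss.
                target = -max(
                    q.get((remaining, a), 0.0) for a in legal_moves(remaining)
                )
            old = q.get((stones, action), 0.0)
            q[(stones, action)] = old + alpha * (target - old)
            stones = remaining
    return q


def best_move(q, stones):
    """Greedy move according to the learned table."""
    return max(legal_moves(stones), key=lambda a: q.get((stones, a), 0.0))


if __name__ == "__main__":
    table = train_self_play()
    for n in (10, 7, 6, 5):
        print(n, "stones -> take", best_move(table, n))
```

With these settings the learned policy should end up leaving the opponent a multiple of four stones, which is the known winning strategy for this game. The learner finds it from the rules and its own games alone, which is the same principle, at a vastly smaller scale, as the self-play training described above.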
The program discovered many playing strategies that human players had never thought of. In its first three days, AlphaGo Zero played almost 5 million games against itself, gaining far more playing experience than any human could in a lifetime.
In the game of Go, the world’s strongest players are no longer humans. The strongest players are all AI programs. The strongest strategies humans have developed are easily beaten by these programs. Human players can watch these top (AI) players compete against each other and try to understand why their strategies work.