Google's AI team develops powerful self-learning program

By Lim Chang-won Posted : October 19, 2017, 10:56 Updated : October 19, 2017, 10:56

[Courtesy of DeepMind]


Google's artificial intelligence team DeepMind vowed to develop new algorithms that can learn on their own after its program AlphaGo stunned the world last year with an overwhelming victory against South Korean master Lee Sedol in the ancient Chinese game of Go.

About 19 months later, DeepMind came up with a more powerful program called "AlphaGo Zero", which it says is "no longer constrained by the limits of human knowledge" and can achieve superhuman performance in the most challenging domains with no human input.

Based on "reinforcement learning" without human data, guidance or domain knowledge beyond game rules, AlphaGo Zero has become its own teacher, DeepMind said in a blog post.

"This technique is more powerful than previous versions of AlphaGo because it is no longer constrained by the limits of human knowledge. Instead, it is able to learn tabula rasa from the strongest player in the world: AlphaGo itself," said DeepMind CEO Demis Hassabis.

He suggested that if applied to other structured problems such as protein folding, DeepMind's AI technology could help solve "some of the most important challenges humanity is facing". Misfolded proteins are linked to diseases such as Alzheimer's and Parkinson's.

"While it is still early days, AlphaGo Zero constitutes a critical step towards this goal," Hassabis said.

After AlphaGo scored a 4-1 victory against Lee in a landmark match in Seoul in March last year, DeepMind vowed to create general-purpose AI that can learn on its own and, eventually, be used as a tool to solve pressing problems from climate change to disease diagnosis.

DeepMind said AlphaGo Zero made a critical breakthrough: in a few days it accumulated the Go knowledge that humans built up over thousands of years, then went on to discover new knowledge, developing unconventional strategies and creative new moves.

DeepMind said that while previous versions of AlphaGo first trained on thousands of human amateur and professional games to learn how to play Go, AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play.

After just three days of self-play training, AlphaGo Zero emphatically defeated the previously published version of AlphaGo. After 40 days, it became even stronger.

 