Play To Transform

Will Playing Make AI More Human?

On March 15, 2016, Google’s AI program, AlphaGo, beat world champion Lee Sedol four-to-one in one of the most complex strategy games ever devised – the ancient Chinese game of Go.

Since that historic match, Google has released a new version of its AI agent, called AlphaGo Zero, which defeated its predecessor by 100 games to 0. Unlike AlphaGo, which was trained in part on large datasets of human games, AlphaGo Zero learned Go on its own, from scratch. Starting with only a very primitive understanding of the game, AlphaGo Zero created a duplicate of itself, played against it repeatedly, and used what it learned in each match to update its algorithms. Beginning with random play, it took AlphaGo Zero only 40 days to master the game and become the world’s best player.
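
To make the idea of self-play concrete, here is a minimal sketch on a toy game rather than Go. It is not AlphaGo Zero's actual method; a simple lookup table of move values stands in for its far more sophisticated machinery. The game (take 1 to 3 stones from a pile, whoever takes the last stone wins), the function names, and every parameter value are assumptions chosen for illustration.

```python
import random

def legal_moves(stones):
    """Moves allowed in the toy game: take 1, 2, or 3 stones."""
    return [m for m in (1, 2, 3) if m <= stones]

def best_move(q, stones):
    """Greedy move under the current value table (unseen moves count as 0)."""
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))

def train_by_self_play(episodes=20000, max_stones=10,
                       alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values by having one agent play both sides of every game."""
    rng = random.Random(seed)
    q = {}  # (stones, move) -> estimated value for the player making the move
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)
        while stones > 0:
            # Mostly play the best known move, sometimes try something else.
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(stones))
            else:
                move = best_move(q, stones)
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent is the same agent, so its best outcome
                # is the mover's worst outcome.
                target = -max(q.get((remaining, m), 0.0)
                              for m in legal_moves(remaining))
            old = q.get((stones, move), 0.0)
            q[(stones, move)] = old + alpha * (target - old)
            stones = remaining  # the "duplicate" now moves
    return q

q_table = train_by_self_play()
for pile in (5, 6, 7, 10):
    print(pile, "stones -> take", best_move(q_table, pile))
```

The table starts empty, so early games are effectively random. Nothing about good strategy is supplied from outside; the agent's only teacher is its own copy.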

AlphaGo Zero is a perfect example of artificial general intelligence (AGI): an intelligent machine that can successfully perform any intellectual task that a human being can. It reflects a new type of AI that can learn, adapt, and think creatively in order to solve complex challenges that previous generations of AI struggled with. In other words, it knows how to play.

Technologists have already built machines that are faster, stronger, and more precise than humans. Millions of such machines are working across industries today. Where these machines stumble, however, is in adaptation. Faced with new and unforeseen circumstances, they find it difficult to adjust or improvise. AI researchers have therefore been looking to biology for clues on adaptation, and they have struck something powerful with play.

Although play is not the most efficient, predictable, or streamlined form of learning, all living creatures play. Play is fun and serves as a natural user interface for learning. Play-led learning also provides a distinct advantage: it develops combinatorial flexibility. That is, those who learn through play cultivate the ability to take apart sets of inputs and relationships, and to reconstruct them in new and interesting ways. This second-order, generalized property of learning is what distinguishes AGI that learns through play from other forms of AI.

Hence, through self-play, AlphaGo Zero was able to develop a foundational understanding of the game of Go from the ground up. It would, for instance, intentionally complicate situations it was curious about in order to close gaps in its knowledge. Previous generations of AlphaGo, which depended on supervised learning and external datasets, did not have this capability, which was critical to AlphaGo Zero’s development and mastery of the game.

We are fast entering a new era with AGI – and play-led learning is driving it. Play-led learning will be integral to the development of this technology, as play fosters AGI that is intrinsically motivated, adaptive, and able to take on more creative challenges.

The timeline of AI development comprises six major phases:

  • The Symbolic Age
    Rule-based programming.
  • Data Mining and Analytics
    Finding patterns and predictions in well-defined data.
  • Cognitive Computing
    Understanding unstructured data, such as voice, text, video, and images.
  • Curious and Creative Machines
    Combining the above so that AI will not only reach logical conclusions, but will also formulate new questions and experiments to generate understanding.
  • Self-aware AGI
    Developing AI that understands itself and is aware of its interactions in the world, allowing it to explore new ways of being and doing.
  • Next Generation AGI
    Embracing AGI that is fully integrated and advanced enough to create the next generation of AI beyond itself.

AGI research is making significant strides in pushing us forward on this horizon. Consider the following three studies.

Taking their inspiration from how children play, discover, and learn new skills on the playground, researchers at the University of California, Berkeley have developed AI agents that are driven only by curiosity to learn effective behaviors in two separate game environments: Super Mario Bros. and ViZDoom. In their application of curiosity, they discovered that play can lead to intrinsically motivated learning and survival skills, and that in the absence of the pursuit of explicit goals, play lends itself well to generalizable skills. They also discovered that AI which develops through play can learn even faster when supported by external guidance later in the process.
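
A rough sketch of what learning "driven only by curiosity" can look like follows. It is not the Berkeley team's method. The corridor world, the visit-count novelty bonus standing in for a curiosity signal, and all names below are assumptions made for illustration. The agent receives no external reward and simply prefers actions it expects to lead somewhere unfamiliar.

```python
def step(state, action, size):
    """Hypothetical corridor world: move left (-1) or right (+1), walls at both ends."""
    return min(max(state + action, 0), size - 1)

def novelty(visits, state):
    """Intrinsic reward: the less often a state has been seen, the higher the reward."""
    return 1.0 / (1 + visits.get(state, 0))

def explore_with_curiosity(size=10, steps=200):
    """Wander the corridor guided only by expected novelty; no goal is ever given."""
    visits = {0: 1}   # state -> number of visits
    model = {}        # (state, action) -> next state the agent has observed
    state = 0
    for _ in range(steps):
        def expected_reward(action):
            if (state, action) not in model:
                return 1.0  # outcome never observed: treat as maximally novel
            return novelty(visits, model[(state, action)])
        action = max((-1, +1), key=expected_reward)
        next_state = step(state, action, size)
        model[(state, action)] = next_state  # remember what this action did
        visits[next_state] = visits.get(next_state, 0) + 1
        state = next_state
    return visits

visits = explore_with_curiosity()
print(sorted(visits))
```

Although nothing tells the agent to reach the far end of the corridor, the pull toward unfamiliar states carries it through every cell.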

In another paper titled “Resource-Bounded Machines Are Motivated to Be Effective, Efficient, and Curious” (2013), Steunebrink, Koutník, Thórisson, Nivel, and Schmidhuber use a Work, Play, and Dream framework to design AI that more effectively balances exploration and exploitation.

In this framework, Play – curiosity-driven exploration and the seeking of novelty – drives an AI agent to explore gaps in its knowledge. What the agent learns from such activities fosters discovery and new sequences of useful actions, which can then be capitalized upon to use fewer resources (energy, time, or inputs) in solving current and future work problems. Underscoring the importance of play in the overall development of AGI, the authors write, “We emphasize that Work, Play, and Dream should not be considered as different states, but as different processes which must be scheduled.”
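
The following sketch illustrates only the general idea of scheduling play alongside work. It is not the authors' framework, and it leaves out Dream entirely. The fixed schedule, the payoff numbers, and all names are assumptions for illustration.

```python
def run_schedule(payoffs, steps=12, play_every=3):
    """Interleave 'play' (try the least-tried action) with 'work' (use the best known one)."""
    estimates = {}                                # action -> payoff observed so far
    tries = {a: 0 for a in range(len(payoffs))}   # action -> times used
    log = []
    total = 0.0
    for t in range(steps):
        # Assumed schedule: one play slot, then two work slots, repeating.
        playing = (t % play_every == 0) or not estimates
        if playing:
            action = min(tries, key=tries.get)          # explore what is least known
        else:
            action = max(estimates, key=estimates.get)  # exploit what is known to pay
        reward = payoffs[action]
        estimates[action] = reward
        tries[action] += 1
        total += reward
        log.append(("play" if playing else "work", action))
    return estimates, log, total

# Hypothetical payoffs for three possible actions; the agent does not know them in advance.
estimates, log, total = run_schedule([1.0, 5.0, 3.0])
print(log)
```

In this toy run, the first play slot finds only the weakest action, so early work is inefficient. A later play slot discovers the most rewarding action, and every work slot after that uses it.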

Finally, studies led by Professor Hod Lipson and the Creative Machines Lab at Columbia University are using play to push AI research beyond cognitive computing and creative machines and toward self-aware AI. In a series of experiments, Lipson’s team designed robots to use play principles to first learn about themselves, much as most living creatures do. Starting with no image of themselves or their capabilities, these robots danced around in silly, random ways, and used the feedback they got from these activities to generate increasingly accurate images of themselves. This enabled a robot to then learn how to build new capabilities, such as walking forward, without ever having been taught or programmed to do so. After the robot learned this new capability, however, Lipson’s team did something surprising: they chopped off one of the legs that the robot used to walk. Remarkably, within one day, the robot had reimagined itself and generated a new way to walk forward, optimizing the capabilities it had left – a feat that had not been accomplished before. Recent related work in this area, “Reset-free Trial-and-Error Learning for Robot Damage Recovery,” by Chatzilygeroudis, Vassiliades, and Mouret, was published in the February 2018 issue of Robotics and Autonomous Systems.

These examples highlight how play is enabling researchers to develop the next generation of AGI. As mentioned, play in itself is not the most efficient or predictable form of learning. But play-led learning enables a deeper understanding that helps AGI agents become more curious, creative, and self-aware.

For now, the ability to think creatively and strategically may be hailed as the last refuge of human working life – see Joseph Pistrui’s January 2018 Harvard Business Review article, “The Future of Human Work Is Imagination, Creativity, and Strategy.” But looking ahead, it is conceivable that AGI will have the ability to do everything that humans currently do in their day-to-day work, whether blue-collar, professional, or creative. And if AGI will soon be on the same playing field as us, the question we may want to consider is this: How might such technology enable us to reinvent the future of work?

The Author

Farzad Fazeliani

Farzad Fazeliani is an economist specializing in strategic foresight and innovation.