Paper Review | Emergent Tool Use From Multi-Agent Autocurricula
What could go wrong after 13 billion years of learning?
I really enjoyed reading this research paper and decided to cover it even though it is a bit older. In ‘Emergent Tool Use From Multi-Agent Autocurricula’, OpenAI’s research team studies multi-agent systems and their self-learning behavior.
What makes this paper interesting to me is not necessarily that it’s a research project by OpenAI, but its implications, as it asks a rather profound question:
Can a group of agents develop new skills on their own?
If this worked, it would be a major step forward for generalization and adaptive artificial intelligence.
Sentience is a maze of self-discovery.
Project Goal: Is it possible for an agent to learn skills by itself?
Problem: Agents must be trained on massive amounts of task-specific data and generalize poorly beyond their training tasks.
Proposed Solution: The project team defined a closed environment game setting (“hide-and-seek”) within MuJoCo with (1) defined simple game rules, (2) multi-agent competitive self-play with sparse rewards, and (3) standard reinforcement learning algorithms.
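To make the “sparse rewards” point concrete, here is a minimal sketch of the zero-sum, per-timestep team reward the hide-and-seek setting uses. The function name and the boolean inputs are my own illustrative assumptions, not OpenAI’s implementation; the key idea from the paper is that hiders are rewarded only for staying hidden, seekers get the opposite, and there is no hand-designed reward for any tool-use behavior.

```python
# Sketch of the competitive, sparse "hide-and-seek" reward structure.
# Names and signature are illustrative assumptions, not the paper's code.

def team_rewards(any_hider_seen: bool, prep_phase: bool):
    """Per-timestep team rewards (hiders, seekers).

    Hiders get +1 if no hider is visible to a seeker, -1 otherwise;
    seekers receive the exact opposite (zero-sum). During the
    preparation phase, while seekers are immobilized, no rewards
    are given at all.
    """
    if prep_phase:
        return 0.0, 0.0  # no learning signal during preparation
    hider_r = -1.0 if any_hider_seen else 1.0
    return hider_r, -hider_r


# Example: after preparation ends, the pure competition is what creates
# the autocurriculum -- each side's improvement raises the difficulty
# for the other, without any task-specific shaping.
print(team_rewards(any_hider_seen=False, prep_phase=False))  # (1.0, -1.0)
```

Note how little is specified here: the agents are never told to build forts or use ramps; those strategies emerge purely because they change the value of this one sparse signal.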
Opinion: OpenAI’s team showed that through self-play autocurricula, the agent teams learned to use tools, a capability previously considered a human-only trait. This paper matters because agent technology has made rapid progress since it was published, and new research could elevate the findings to another level. If you consider ReAct and the Attention paper in combination with tools like AutoGPT, self-directed skill learning and skill ideation could be used to improve policy setting.
Yet whether we really want agents to be able to develop new skills is a completely different question.
Links: Paper, Website, Video (9/10)
Please subscribe or leave a like for more content like this.
So let’s dive in.