From meta-learning towards lifelong learning; efficient and fast reinforcement learning for complex environments. 01/01/2019 - 31/12/2022

Abstract

Reinforcement learning agents have achieved remarkable results over the past few years, crowned by AlphaGo's resounding victory over one of the world's top Go players. A severe limitation of such agents is that they can only function in one very specific environment: AlphaGo cannot play Go with a tweaked ruleset, let alone play a different board game competitively. Meta-learning aims to overcome this limitation. An agent trained not on a single task but on many tasks drawn from a distribution can quickly learn how to behave in a novel task from that distribution. In this project, we propose several contributions to the field of meta-reinforcement learning. First, we propose a meta-learner based on Hierarchical Temporal Memory, which mimics the human brain according to our current understanding of it. This system adapts quickly to changing patterns in the environment, a desirable property for a meta-learner. Second, we investigate several ways to generate these task distributions automatically, and evaluate how new abilities can be introduced efficiently into an already trained meta-learner. Finally, we will extend a meta-learner to work with not just one but many task distributions. Ideally, such a system would be able to quickly learn to perform any conceivable task at least as well as a human.
