Reinforcement Learning (RL) offers the ability to learn an agent's best action for a given state. However, because tabular RL must store information describing every state, its use is limited to simple problems. When the state space grows, Deep Reinforcement Learning (DRL) can handle more complex problems by learning from high-dimensional sensory streams. However, when these methods are trained in a fixed environment, they tend to overfit and fail when applied to different environments. This issue has been studied and partially addressed by using procedural content generation to add variety to the training environment, but DRL performance drops when the generated content is not sufficiently similar to the target environment. In this paper we explore the use of Generative Adversarial Networks (GANs) to generate an ideal training environment for DRL: the GANs generate content by extracting features from human-designed levels, and the trained agent is evaluated on the original human-designed levels. The levels produced with GANs generalized well, yielding humanlike levels with a wide variety of layouts and combinations. Our DRL results show that overfitting to a problem can be beneficial when the test environment is known, as training on varied and more difficult levels did not improve results on easy levels. We also demonstrated the efficacy of progressive learning, in which the difficulty is increased only once a certain performance threshold has been reached; this was the only method capable of learning in humanlike levels with complex mazes and patterns.
This project was developed over twelve weeks as a special project at the IT University of Copenhagen by a team of two, and was implemented in Python.
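The progressive-learning scheme described above can be sketched as a simple training loop: the agent trains at one difficulty tier and is promoted to the next tier only once its recent success rate clears a threshold. This is an illustrative sketch only; all names (`train_episode`, `SUCCESS_THRESHOLD`, the success model) are assumptions for demonstration, not the paper's actual implementation.

```python
import random
from collections import deque

SUCCESS_THRESHOLD = 0.8   # promote when 80% of recent episodes succeed
WINDOW = 50               # rolling window of episodes to evaluate over
MAX_DIFFICULTY = 5        # hypothetical number of difficulty tiers

def train_episode(difficulty: int) -> bool:
    """Stand-in for one DRL training episode; returns True on success.
    Success probability falls with difficulty, purely for illustration."""
    return random.random() < max(0.1, 1.0 - 0.1 * difficulty)

def progressive_training(episodes: int = 2000) -> int:
    """Run the curriculum loop and return the final difficulty reached."""
    difficulty = 1
    recent = deque(maxlen=WINDOW)
    for _ in range(episodes):
        recent.append(train_episode(difficulty))
        # Promote only when the rolling success rate clears the bar.
        if (len(recent) == WINDOW
                and sum(recent) / WINDOW >= SUCCESS_THRESHOLD
                and difficulty < MAX_DIFFICULTY):
            difficulty += 1
            recent.clear()  # re-evaluate from scratch at the new tier
    return difficulty

if __name__ == "__main__":
    random.seed(0)
    print("final difficulty:", progressive_training())
```

The key design choice is gating promotion on a rolling success rate rather than a fixed episode count, so the agent only faces harder levels once it has mastered the current tier.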