We present the concept of Guided Learning, which outlines a framework in which a reinforcement learning agent can effectively 'ask for help' when it encounters stagnation. Either a human or an expert agent supervisor can then 'guide' the agent on how to progress beyond the point of stagnation. This guidance is encoded in a novel way using a separately trained neural network, referred to as a 'Taught Response Memory', that can be recalled when a 'similar' situation arises in the future. This paper applies Guided Learning on top of an evolutionary algorithm, but also shows how Guided Learning is algorithm independent and can be applied in any reinforcement learning context. The results show that the initial implementation of Guided Learning provided in this paper gives superior performance and yields, on average, an increase of 136% in the rate of progression of the fittest genome, with best and worst case results of 137% and 110% respectively, and an average increase of 112% in the rate of progression of the average genome, with best and worst case results of 558% and 47% respectively. All results were achieved with minimal guidance. These results occur because the agent can exploit more information, which reduces the need to explore the solution space. The results show good promise for Guided Learning's potential, as they were obtained with only a partial implementation and much future work remains.
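The loop described above (detect stagnation, ask for guidance, store it, recall it in similar situations) can be illustrated with a minimal sketch. This is not the repository's implementation. The class names, the `patience`, `min_delta` and `radius` parameters, and the stagnation criterion are all hypothetical, and a nearest-neighbour lookup is swapped in for the separately trained neural network so that the example needs no dependencies.

```python
class StagnationMonitor:
    """Flags stagnation when the best fitness stops improving.

    ASSUMPTION: the criterion used here (no improvement greater than
    `min_delta` for `patience` consecutive generations) is hypothetical.
    """

    def __init__(self, patience=5, min_delta=1e-6):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale = 0

    def update(self, fitness):
        """Record one generation's best fitness; return True if help is needed."""
        if fitness > self.best + self.min_delta:
            self.best = fitness
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def _distance(a, b):
    """Euclidean distance between two equal-length state vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class NearestNeighbourMemory:
    """Stand-in for a Taught Response Memory.

    ASSUMPTION: the source uses a separately trained neural network;
    this sketch uses a nearest-neighbour lookup instead.
    """

    def __init__(self, radius=1.0):
        self.radius = radius  # hypothetical threshold for a 'similar' state
        self.entries = []     # list of (state, taught_action) pairs

    def teach(self, state, action):
        """Store the supervisor's action for the state where help was requested."""
        self.entries.append((tuple(state), action))

    def recall(self, state):
        """Return the taught action for the closest stored state, or None."""
        if not self.entries:
            return None
        stored, action = min(self.entries, key=lambda e: _distance(e[0], state))
        return action if _distance(stored, state) <= self.radius else None
```

In use, the training loop would call `update()` once per generation, call `teach()` with the supervisor's response when it returns `True`, and consult `recall()` before falling back to the agent's own policy.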

Setup

Dependencies:

Python 3.6 + various libraries (matplotlib, sklearn, skimage, etc.)
OpenAI Retro: https://github.com/openai/retro

python -m pip install -r requirements.txt

python -m retro.import /path/to/your/ROMs/directory/

Copy data.json to \path_to_python\Lib\site-packages\retro\data\stable\SuperMarioBros-Nes
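Because the site-packages location differs between machines, the copy step can also be scripted. The following is a sketch only: the helper names are hypothetical, and it assumes the installed `retro` package keeps the `data/stable/SuperMarioBros-Nes` layout named in the step above. It uses only the standard library to locate the package.

```python
import importlib.util
import shutil
from pathlib import Path


def retro_game_dir(package_root, game="SuperMarioBros-Nes"):
    """Build <package_root>/data/stable/<game>, the layout named in the setup step."""
    return Path(package_root) / "data" / "stable" / game


def find_retro_root():
    """Return the installed retro package directory, or None if it is not found."""
    spec = importlib.util.find_spec("retro")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def install_data_json(source="data.json", package_root=None):
    """Copy `source` into the game directory and return the new file's path."""
    root = Path(package_root) if package_root else find_retro_root()
    if root is None:
        raise RuntimeError("retro is not installed in this environment")
    target_dir = retro_game_dir(root)
    if not target_dir.is_dir():
        # Fail loudly rather than create a directory retro would not read.
        raise FileNotFoundError("expected game directory missing: %s" % target_dir)
    return Path(shutil.copy(str(source), str(target_dir)))
```

Running `install_data_json()` from the repository root would copy `data.json` into the environment that is currently active.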

Patch neat-python by adding:

    def next_generation(self):
        # Gather and report statistics.
        best = None
        for g in itervalues(self.population):
            if best is None or g.fitness > best.fitness:
                best = g
        self.reporters.post_evaluate(self.config, self.population, self.species, best)

        # Track the best genome ever seen.
        if self.best_genome is None or best.fitness > self.best_genome.fitness:
            self.best_genome = best

        # Create the next generation from the current generation.
        self.population = self.reproduction.reproduce(
            self.config, self.species, self.config.pop_size, self.generation)
        # Check for complete extinction.
        if not self.species.species:
            self.reporters.complete_extinction()

            # If requested by the user, create a completely new population,
            # otherwise raise an exception.
            if self.config.reset_on_extinction:
                self.population = self.reproduction.create_new(
                    self.config.genome_type,
                    self.config.genome_config,
                    self.config.pop_size)
            else:
                raise CompleteExtinctionException()

        # Divide the new population into species.
        self.species.speciate(self.config, self.population, self.generation)

        self.reporters.end_generation(self.config, self.population, self.species)

        self.generation += 1

        return

to the end of \path_to_python\Lib\site-packages\neat\population.py (keep the indentation so that the function becomes a method of the Population class).
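A quick way to confirm that the patch took effect is to check whether the class exposes the new method. The helper below is a hypothetical sketch, not part of the repository; it only inspects attributes and returns `False` if the module cannot be imported.

```python
import importlib


def has_method(module_name, class_name, method_name):
    """Return True if module_name.class_name has a callable attribute method_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    cls = getattr(module, class_name, None)
    return callable(getattr(cls, method_name, None))
```

After patching, `has_method("neat.population", "Population", "next_generation")` should return `True` in the environment where the file was edited.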

BeelGroup/Guided-Learning
