Source: https://arxiv.org/abs/1704.01665 Adapted from: https://github.com/higgsfield/np-hard-deep-reinforcement-learning and https://github.com/louisv123/COLGE
This project was done for 6.890 in Spring 2019 at MIT by Sadhika Malladi.
The goal of this project was to extend existing machine learning algorithms for combinatorial optimization to augment the greedy coloring algorithm on graphs. The benefits include (1) the ability to improve the coloring of a single graph by deliberately overfitting to it and (2) insight into which vertex orderings yield good greedy colorings.
The code in the main folder implements the S2V-DQN described by Dai et al., and the `baseline/` folder contains the Pointer Network Actor-Critic that I used as a baseline.
Since this project was for a class, there is a writeup of my findings on the efficiency of the coloring algorithm. Please find it in `paper.pdf`.
`create_datasets.py`: creates graphs of a specified type and size and saves them for reference during training
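As a rough illustration of what such a generation script might do (the actual script's graph types, parameters, and storage format may differ; the Erdős–Rényi model and the function names here are just examples):

```python
import pickle
import random


def make_erdos_renyi(n, p, seed=None):
    """Generate an Erdos-Renyi G(n, p) graph as an adjacency-set dict."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # include each possible edge independently
                adj[u].add(v)
                adj[v].add(u)
    return adj


def save_dataset(path, n_graphs, n_nodes, p, seed=0):
    """Create a batch of graphs and pickle them for reuse during training."""
    graphs = [make_erdos_renyi(n_nodes, p, seed + i) for i in range(n_graphs)]
    with open(path, "wb") as f:
        pickle.dump(graphs, f)
    return graphs
```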
`dsatur.py`: baseline code to run DSatur (another greedy coloring algorithm) on the created datasets
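For reference, the DSatur heuristic itself is short: repeatedly pick the uncolored vertex with the most distinct colors among its neighbors (its saturation degree), breaking ties by degree, and give it the smallest available color. A self-contained sketch, independent of the repo's implementation:

```python
def dsatur(adj):
    """Color a graph with the DSatur heuristic.

    adj: dict mapping each vertex to a set of neighbours.
    Returns a dict vertex -> color, with colors 0, 1, 2, ...
    """
    colors = {}
    # Saturation = the set of distinct colors among a vertex's neighbours.
    saturation = {v: set() for v in adj}
    uncolored = set(adj)
    while uncolored:
        # Most saturated uncolored vertex; break ties by degree.
        v = max(uncolored, key=lambda u: (len(saturation[u]), len(adj[u])))
        # Smallest color not used by any neighbour of v.
        c = 0
        while c in saturation[v]:
            c += 1
        colors[v] = c
        uncolored.remove(v)
        for u in adj[v]:
            saturation[u].add(c)
    return colors
```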
`environment.py`: environment class for the problem. To extend to other combinatorial optimization problems, add new `elif` branches to the methods here.
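The dispatch pattern might look like the sketch below. This is not the repo's class: the interface, the reward scheme, and the "mvc" (minimum vertex cover) branch are all illustrative, showing only where a new problem's `elif` branches would go.

```python
class Environment:
    """Illustrative per-problem environment; each method dispatches on
    a problem string, so a new problem means new elif branches."""

    def __init__(self, graph, problem="coloring"):
        self.graph = graph        # dict: vertex -> set of neighbours
        self.problem = problem
        self.reset()

    def reset(self):
        if self.problem == "coloring":
            self.colors = {}      # partial coloring, grown one vertex per step
        elif self.problem == "mvc":
            self.cover = set()    # hypothetical extension branch
        else:
            raise ValueError(self.problem)

    def step(self, vertex):
        """Act on one vertex and return an immediate reward."""
        if self.problem == "coloring":
            used = {self.colors[u] for u in self.graph[vertex] if u in self.colors}
            c = 0
            while c in used:      # smallest color unused by colored neighbours
                c += 1
            is_new = c not in set(self.colors.values())
            self.colors[vertex] = c
            # -1 for opening a new color class, so the episode return
            # is minus the number of colors used in total.
            return -1.0 if is_new else 0.0
        elif self.problem == "mvc":
            self.cover.add(vertex)
            return -1.0           # minimize cover size
        else:
            raise ValueError(self.problem)
```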
`agent.py`: combination of fitted Q-iteration and n-step Q-learning
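The n-step Q-learning part refers to targets that accumulate n immediate rewards before bootstrapping from the value estimate, rather than bootstrapping after a single step. A minimal sketch of that target computation (the function name and interface are mine, not the repo's):

```python
def n_step_target(rewards, gamma, bootstrap_q):
    """n-step Q-learning target:
    y = r_t + gamma*r_{t+1} + ... + gamma^(n-1)*r_{t+n-1}
        + gamma^n * max_a Q(s_{t+n}, a)

    rewards: the n rewards r_t, ..., r_{t+n-1}
    bootstrap_q: max_a Q(s_{t+n}, a) from the frozen network (0 if terminal)
    """
    y = 0.0
    for k, r in enumerate(rewards):
        y += (gamma ** k) * r            # discounted observed rewards
    y += (gamma ** len(rewards)) * bootstrap_q  # bootstrapped tail
    return y
```

In fitted Q-iteration, these targets are computed with a frozen copy of the network over a batch of stored transitions, and the live network is regressed onto them.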
`runner.py`: contains the loops to train over the data, get predictions for the samples, and store analytics
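Schematically, such a runner loop might look like the following self-contained sketch (all interfaces here are illustrative; the repo's runner also drives the agent's learning updates, whereas `choose_vertex` below simply stands in for action selection):

```python
def run_training_loop(graphs, choose_vertex, n_epochs=1):
    """Iterate over the dataset, roll out one greedy-coloring episode per
    graph, and record per-episode statistics."""
    stats = []
    for epoch in range(n_epochs):
        for g in graphs:              # g: dict vertex -> set of neighbours
            colors = {}
            remaining = set(g)
            while remaining:
                v = choose_vertex(g, colors, remaining)
                # Greedy step: smallest color unused by colored neighbours.
                used = {colors[u] for u in g[v] if u in colors}
                c = 0
                while c in used:
                    c += 1
                colors[v] = c
                remaining.remove(v)
            stats.append({"epoch": epoch,
                          "n_colors": len(set(colors.values()))})
    return stats
```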
`models.py`: contains different models for picking the next node to color; `S2V_QN_1` corresponds to Dai et al. Edit the YAML file to modify the model's hyperparameters.
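S2V refers to a structure2vec-style network: each node's embedding is refined over several rounds by aggregating its neighbors' embeddings, and Q-values are then read off the embeddings. A toy sketch of the embedding recursion only (shapes, weight names, and the choice of features are illustrative; the real model is a learned network with more parameters):

```python
import numpy as np


def s2v_embed(adj_matrix, node_feats, theta1, theta2, n_rounds=4):
    """Structure2vec-style embedding: each round, a node's embedding is a
    nonlinearity applied to its own features plus an aggregate of its
    neighbours' current embeddings.

    adj_matrix: (n, n) 0/1 adjacency matrix
    node_feats: (n, f) per-node features (e.g. already-colored flags)
    theta1: (f, d) feature weights; theta2: (d, d) aggregation weights
    """
    n = adj_matrix.shape[0]
    mu = np.zeros((n, theta1.shape[1]))   # embeddings start at zero
    for _ in range(n_rounds):
        agg = adj_matrix @ mu             # sum of neighbour embeddings
        mu = np.maximum(0.0, node_feats @ theta1 + agg @ theta2)  # ReLU
    return mu
```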
The structure2vec code follows the same basic structure as above; `train.py` is the main runnable file and operates on the datasets generated by `create_datasets.py`.