
HD-MILP-Plan

Hybrid Deep MILP Planner (HD-MILP-Plan) is a two-stage planner based on the learning and planning framework [1,2] that (i) learns the state transition function T(s_t, a_t) = s_{t+1} of a factored planning problem from data using Densely Connected Neural Networks [3], and (ii) compiles multiple copies of the learned transition function, T'(...T'(T'(T'(I, a_0), a_1), a_2)...) = G (as visualized in Figure 1), into a MILP and solves it using an off-the-shelf MILP solver [4]. HD-MILP-Plan can handle discrete and continuous action and state spaces, arbitrarily complex learned state transition functions, linear constraints on actions and states, and linear reward functions.

Figure 1: Visualization of the learning and planning framework presented in [1,2], where red circles represent action variables, blue circles represent state variables, gray circles represent the activation units and w's represent the weights of the neural network.
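
At the core of the compilation, each ReLU activation unit of the learned DNN is encoded as a set of linear constraints over a binary indicator variable, so that the unrolled transition function becomes a MILP. The following is a minimal sketch of one standard big-M encoding of a single ReLU unit z = max(0, w*x + b), written with the PuLP modeling library purely for illustration (the repository itself targets CPLEX [4]); the weights, variable bounds, and big-M value are assumptions of the example, not the repository's exact encoding.

```python
# Minimal sketch: big-M MILP encoding of z = ReLU(w*x + b) for one unit.
# PuLP is used only as an illustrative modeling layer; HD-MILP-Plan uses CPLEX [4].
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, value

M = 100.0          # assumed valid upper bound on |w*x + b|
w, b = 2.0, -1.0   # example learned weight and bias

prob = LpProblem("relu_unit", LpMaximize)
x = LpVariable("x", lowBound=-10, upBound=10)   # pre-activation input
z = LpVariable("z", lowBound=0)                 # ReLU output
y = LpVariable("y", cat=LpBinary)               # 1 if the unit is active

prob += z                         # dummy objective for the sketch

pre = w * x + b
prob += z >= pre                  # z is at least the pre-activation
prob += z <= pre + M * (1 - y)    # tight when the unit is active
prob += z <= M * y                # forced to 0 when the unit is inactive

prob.solve()
print(value(x), value(z))
```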

Improvements

HD-MILP-Plan

i) includes parsers for domain files that read in linear expressions of the form sum_{i=1..n} a_i * x_i ? k, where ? can be <=, >= or == and each a_i is a constant real number. See the translation folder for more details (a hedged parsing sketch appears after this list).

ii) handles goal (state) constraints.

iii) can make use of known transition functions. See the example transition files under the translation folder for more details. HD-MILP-Plan will encode known transition functions only when the DNNs predict a strict subset of the state variables.

iv) can handle discrete state variables with a) boolean domains, which are learned with a step function, and b) integer domains, which are learned with a linear function. For b), the predicted value is rounded to the nearest integer during optimization.
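
As a concrete illustration of item i), the sketch below parses one linear constraint of the form a_1*x_1 + ... + a_n*x_n ? k. The textual syntax assumed here (terms joined by '+', coefficient and variable separated by '*') is hypothetical; the actual grammar accepted by the parsers is defined by the example domain files under the translation folder.

```python
import re

# Hypothetical constraint syntax for illustration, e.g. "2.0*x1 + -1.5*x2 <= 3.0";
# the real grammar is defined by the example domain files under translation/.
def parse_linear_constraint(line):
    lhs, op, rhs = re.split(r"\s*(<=|>=|==)\s*", line.strip())
    coeffs = {}
    for term in lhs.split("+"):
        coeff, var = term.strip().split("*")
        coeffs[var.strip()] = float(coeff)
    return coeffs, op, float(rhs)

print(parse_linear_constraint("2.0*x1 + -1.5*x2 <= 3.0"))
# ({'x1': 2.0, 'x2': -1.5}, '<=', 3.0)
```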

Dependencies

i) Data collection (input for DNN training): data is collected using the RDDL-based domain simulator [5].

ii) DNN training: the toolkit [6] is used to train the DNNs. The final training parameters are recorded in the dnn.txt file.

iii) Solver: any off-the-shelf MILP solver works. In our papers [1,2], we used the CPLEX solver [4].

For i), any domain simulator works, and for ii), any DNN training toolkit works. Example dnn.txt and domain files (under the translation folder) are provided for the navigation, hvac and reservoir domains. Therefore, to run the planner you only need iii).
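
Because any simulator and any DNN training toolkit can stand in for [5] and [6], the only real requirement on the data is that it contain state-action-successor tuples from which T(s_t, a_t) = s_{t+1} can be learned. Below is a minimal, hedged sketch of collecting such tuples; random_action and simulate_step are hypothetical stand-ins for whatever simulator interface you use.

```python
import numpy as np

# Hypothetical stand-ins for a domain simulator interface; any simulator
# (e.g., the RDDL-based simulator [5]) can play this role.
def random_action(rng):
    return rng.uniform(-1.0, 1.0, size=2)

def simulate_step(state, action, rng):
    return state + 0.1 * action + rng.normal(0.0, 0.01, size=state.shape)

rng = np.random.default_rng(0)
state = np.zeros(2)
data = []                      # (s_t, a_t, s_{t+1}) tuples for DNN training
for _ in range(1000):
    action = random_action(rng)
    next_state = simulate_step(state, action, rng)
    data.append((state, action, next_state))
    state = next_state
```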

Running HD-MILP-Plan

python hd_milp_plan.py -d domain -i instance -h horizon -b bound -s sparsify

where -s sparsify is optional and set to 0.0 by default.

Example: python hd_milp_plan.py -d hvac -i 3 -h 10 -b False

Verification Task

HD-MILP-Plan can also be used to verify different properties of DNNs by setting the horizon -h to 1.
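
For example, a one-step property of the DNN learned for the hvac domain could be checked by reusing the command line above with a horizon of 1:

Example: python hd_milp_plan.py -d hvac -i 3 -h 1 -b False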

Implementation Notes

HD-MILP-Plan has been re-written with some minor differences from the original implementation, including:

i) the current implementation is in Python as opposed to the original implementation that was in C++.

ii) the ability to handle discrete state variables (as mentioned above).

iii) the ability to handle goal constraints (as mentioned above).

iv) the ability to handle known transition functions (as mentioned above).

v) it is not connected to the RDDL simulator [5] (to connect to the RDDL simulator, feed the first executed actions at time 0 to rddlclient.c via the console, and reduce the horizon by 1 at each iteration of the planning loop; a hedged sketch of this loop appears after this list).

vi) the number of parallel threads is set to 1 as opposed to 32.

vii) no total time limit is set for planning.

viii) the total time limit for preprocessing is set to a deterministic constant value (i.e., 30,000 ticks). Allocating more time can significantly improve planning performance, especially for DNNs with more than one hidden layer.
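
The replanning loop described in item v) could look like the following minimal sketch; plan, send_actions_to_rddlclient and observe_state are hypothetical placeholders (for invoking hd_milp_plan.py and for talking to rddlclient.c over the console), not functions provided by this repository.

```python
# Hedged sketch of the receding-horizon loop from item v); the three helper
# functions are hypothetical placeholders, not part of this repository.
def plan(state, horizon):
    # Hypothetical stand-in: invoke hd_milp_plan.py and return one action vector per step.
    return [[0.0, 0.0] for _ in range(horizon)]

def send_actions_to_rddlclient(actions):
    # Hypothetical stand-in: write the time-0 actions to rddlclient.c via the console.
    print("executing", actions)

def observe_state():
    # Hypothetical stand-in: read the resulting state back from the simulator.
    return [0.0, 0.0]

def plan_and_execute(initial_state, horizon):
    state = initial_state
    while horizon > 0:
        actions = plan(state, horizon)            # plan for the remaining horizon
        send_actions_to_rddlclient(actions[0])    # execute only the time-0 actions
        state = observe_state()
        horizon -= 1                              # shrink the horizon and replan

plan_and_execute([0.0, 0.0], 10)
```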

Summary

Action Space: Discrete and Continuous
State Space: Discrete and Continuous
DNN Type: Densely-connected DNNs with ReLU activation units
Global Constraints: Yes, Piecewise Linear
Reward Optimization: Yes, Piecewise Linear
Known Transition Functions: Yes, Piecewise Linear
Optimality Guarantee w.r.t. DNN: Yes

Citation

If you are using HD-MILP-Plan, please cite the papers [1,2].

References

[1] Buser Say, Ga Wu, Yu Qing Zhou, and Scott Sanner. Nonlinear hybrid planning with deep net learned transition models and mixed-integer linear programming. In 26th IJCAI, pages 750–756, 2017.

[2] Ga Wu, Buser Say, and Scott Sanner. Scalable Planning with Deep Neural Network Learned Transition Models. JAIR, Volume 68, pages 571–606, 2020.

[3] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, 2017.

[4] IBM ILOG CPLEX Optimization Studio CPLEX User's Manual, 2017.

[5] Scott Sanner. Relational Dynamic Influence Diagram Language (RDDL): Language description. 2010.

[6] Ga Wu. GitHub repository.