README.md
Lines changed: 8 additions & 34 deletions
@@ -1,8 +1,6 @@
-# ChroMOO (or chromoo)
+# chromoo
 
-Chromatography optimization with cadet and pymoo. I built this because CADET-Match didn't offer optimization with certain parameters and objectives. For example, optimizing SOLUTION_BULK is not possible currently. Input of vector parameters is also currently not possible.
-
-While I haven't yet fully looked through the source code of CADET-Match, I would like to take inspiration from it while building chromoo. Eventually, when I get a better understanding of pymoo, CADET-Match and the problem, I believe it should be possible to merge the code into CADET-Match.
+`chromoo` is a chromatography multi-objective optimization tool built on `Cadet-Core` and `pymoo==0.5`.
 
 # Installation
 
@@ -17,14 +15,12 @@ conda install cadet
 pip install -r requirements.txt
 
 # Install this package. Use -e for an editable install.
-pip install .
+pip install [-e] .
 ```
 
 # Usage
 
-Chromoo requires a YAML config file. I use ruamel.yaml, which allows using YAML v1.2, meaning comments are allowed, and exponential notation is better parsed.
-
-A template of the config follows:
+Chromoo requires a YAML config file of the following form.
 
 ```yaml
 filename: 10k-mono.mono1d.h5
@@ -65,36 +61,14 @@ termination:
 - parameters and objectives are **lists**
 - Objective targets can be provided as a (times, values) CSV file in `objectives.filename` or with the times separately specified in `objectives.times`
 - chromatograms already contain times, so it's easier to just provide the filename
-- solution_bulk data obtained from 3D sims are multidimensional, and we would like to try fitting the whole thing as a flat vector first
 - The `solution_times` section of the provided cadet simulation will be changed to match those of `objectives[0]` exactly.
 - The recommended population size for an n-dimensional problem is 100*n
-- [CRIT] Don't fit porosity and velocity together. You can fit porosity and flowrate instead
-- Provided examples, while valid, are *NOT* guaranteed to be correct. Though I will try to keep them correct.
-- Checkpoints are saved at every generation by default. Checkpoints help avoid the pain of libpthread (or other) crashes from having to completely restart the fitting.
-- Use `force_checkpoint_continue` to force the algorithm to continue from a terminated checkpoint. Helpful if you made the termination criteria stricter.
+- Don't fit porosity and velocity together. You can fit porosity and flowrate instead.
+- Provided examples, while valid, are *NOT* guaranteed to be correct as the software is not guaranteed to be stable in terms of development and backwards compatibility.
+- Checkpoints are saved at every generation by default.
+- Use `force_checkpoint_continue` to force the algorithm to continue from a _terminated_ checkpoint. Helpful if you made the termination criteria stricter than required.
 - Be careful when resuming from a checkpoint. Any changes to problem parameters might not be reflected because the algorithm/problem is fully restored from the checkpoint.
-- `transforms.parameters`(choices = [`lognorm`, `norm`, `none`]) applies to all inputs individually
-- `transforms.objectives`(choices = [`none`, `mean`, `geometric`]) will be made to convert multiple objectives into a single objective.
 
 # Known Issues
-- [CRIT] Got simulation failure due to `error 4 in libpthread` (see dmesg when it happens) on IBT012. Simulation runs manually.
 - Reading inputs from YAML loads strings as `str` and from h5 files we get `numpy.bytes_`. CADET-Python run_load() uses load_results() instead of full load(). So if we check for input string values after simulation, the type of it depends on whether we use full load() or load_results() since we deal with YAML files as well. So we have to consider whether we deal with strings or bytestrings. Simple solution: Don't use run_load in scripts.
 - Loading checkpoints also loads the previous values for all/most parameters. So if nproc is updated before loading, the new value isn't used.
-
-# TODO
-- [ONGO] Move to numpy arrays
-- [TASK] Hypervolume indicator as a convergence criterion
-- [DONE] Replace readArray, readChromatogram with np.genfromtxt or loadtxt
-- [DONE] Implement Objective vs Objective 2D plots
-- [DONE] Pickle pop xs and fs for use with chromoo-post
-- [TASK] Write out opts per generation
-- [TASK] Write a configHandler method to output a dummy config.
-- [TASK] Unified interface/method for deep getting and setting from/to a Dict or dict
-- [TASK] Consider weighting objectives: Look at weighted least squares.
-- [NOTE] Normalizing equalizes weights, treating relative error on each curve.
-- [TASK] Consider multi-started/restarted systems
-- [TASK] Check out numpickle: https://gwang-jin-kim.medium.com/faster-loading-and-saving-of-pandas-data-frames-using-numpickle-numpy-and-pickle-d15870519529
-- [TASK] Performance of np.take() with numbajit vs boolean indexing: https://stackoverflow.com/questions/46041811/performance-of-various-numpy-fancy-indexing-methods-also-with-numba
-- [TASK] Improved plotting for large number of subplots in post: https://stackoverflow.com/questions/13046127/matplotlib-very-slow-is-it-normal/13060980#13060980
-- [CRIT] Check for behavior when take=[2,[0,1,2...]]. i.e., when indices is a list. Does verify fail? What happens?
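
The usage notes in this diff say that objective targets can be given as a (times, values) CSV file. A minimal sketch of reading such a file with NumPy is shown here; the helper name `load_target` and the sample data are hypothetical, and this is not necessarily how chromoo itself parses the file.

```python
import io

import numpy as np


def load_target(source, delimiter=","):
    """Read a two-column (times, values) CSV and return each column as a 1D array.

    `source` may be a path or a file-like object. Hypothetical helper, not chromoo API.
    """
    # ndmin=2 keeps a 2D shape even if the file has a single row.
    data = np.loadtxt(source, delimiter=delimiter, ndmin=2)
    if data.shape[1] != 2:
        raise ValueError(f"expected 2 columns (times, values), got {data.shape[1]}")
    return data[:, 0], data[:, 1]


# Made-up sample standing in for a chromatogram file on disk.
sample = io.StringIO("0.0,0.0\n1.0,0.5\n2.0,1.0\n")
times, values = load_target(sample)
```

Because a chromatogram file already carries its own time column, a reader like this needs nothing beyond the filename, which matches the note that providing the filename is the easier option.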
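
The Known Issues entry about `str` versus `numpy.bytes_` suggests avoiding `run_load` in scripts. Where that is not practical, one possible workaround is to normalize values before comparing them. The sketch below assumes a hypothetical helper `as_text` and an illustrative value; it is not part of chromoo or CADET-Python.

```python
import numpy as np


def as_text(value, encoding="utf-8"):
    """Return a plain str whether the value came from YAML (str) or HDF5 (bytes)."""
    # numpy.bytes_ is a subclass of bytes, so one isinstance check covers both.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# Illustrative values: the same setting as read from YAML and from an h5 file.
from_yaml = "GENERAL_RATE_MODEL"
from_h5 = np.bytes_(b"GENERAL_RATE_MODEL")

same = as_text(from_yaml) == as_text(from_h5)
```

Comparing the normalized forms avoids the silent mismatch where `"abc" == b"abc"` evaluates to `False` in Python 3.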