Data set failing using OR-Tools #1

Closed
torressa opened this issue Mar 7, 2024 · 2 comments

torressa commented Mar 7, 2024

Running data_extraction_from_rf_experiments.py with method = "cp-sat" fails on the adult dataset as follows:

Local Config:
 dataset =  adult
 n_estimators =  1
 max_depth_t =  None
 seed =  0
Using dataset adult, training set size is 10 with 19 attributes.
accuracy_train= 1.0 accuracy_test= 0.7657718120805369
RF parsing done!
Model creation done!

Starting CP-SAT solver v9.8.3296
Parameters: random_seed: 0 max_time_in_seconds: 18000 log_search_progress: true num_workers: 8

Initial satisfaction model '': (model_fingerprint: 0x388a71457e5c97d2)
#Variables: 190
  - 190 Booleans in [0,1]
#kLinear0: 10
#kLinearN: 10 (#terms: 60)

Starting presolve at 0.00s
INFEASIBLE: 'proven during initial copy of constraint #10:
linear {
  domain: [1, 1]
}
With current variable domains:
'

Presolve summary:
  - 0 affine relations were detected.
Problem closed by presolve.
CpSolverResponse summary:
status: INFEASIBLE
objective: NA
best_bound: NA
integers: 0
booleans: 0
conflicts: 0
branches: 0
propagations: 0
integer_propagations: 0
restarts: 0
lp_iterations: 0
walltime: 0.000337
usertime: 0.000337
deterministic_time: 0
gap_integral: 0

Traceback (most recent call last):
  File "DRAFT/data_extraction_from_rf_experiments.py", line 265, in <module>
    dict_res = extractor.fit(
               ^^^^^^^^^^^^^^
  File "DRAFT/DRAFT.py", line 299, in fit
    self.perform_reconstruction_v1_CP_SAT(
  File "DRAFT/DRAFT.py", line 474, in perform_reconstruction_v1_CP_SAT
    raise RuntimeError(
RuntimeError: Infeasible model: the reconstruction problem has no solution. Please make sure the provided one-hot encoding constraints are correct. Else, report this issue to the developers.

I have not touched the one-hot encoding and it works for the other 2 datasets.

Additionally, as a general comment: model-building time in your experiments is included in solve_duration_time. It might be worth tracking it with a separate timer, since it reflects the speed of the API (which can of course vary depending on the implementation and the language used) rather than the solver's ability to solve the problem.

@ferryjul ferryjul self-assigned this Mar 7, 2024
ferryjul added a commit that referenced this issue Mar 7, 2024

ferryjul commented Mar 7, 2024

Thank you for your feedback, @torressa! I was able to reproduce this error. It was due to an edge case where all the training data belongs to the same class, so the resulting trees consist of a single leaf node (and no internal nodes). I fixed that in our code, so the error should no longer appear. Note, however, that in such a case the forest brings no valuable information for reconstruction.
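
For illustration: the solver log in the report shows constraint #10 as a linear constraint with no terms and domain [1, 1]. The sketch below, in plain Python and independent of both the DRAFT code and OR-Tools, shows why such a constraint can never hold, alongside a guard that detects single-class training labels before a model is built. Both function names (`linear_constraint_satisfiable`, `has_single_class`) are hypothetical and are not taken from the repository; the actual fix is the one in the referenced commit.

```python
from typing import Hashable, Iterable, Sequence, Tuple


def linear_constraint_satisfiable(coeffs: Sequence[int], domain: Tuple[int, int]) -> bool:
    """Return True if sum(c_i * x_i), with each x_i in {0, 1}, can fall in [lo, hi]."""
    lo, hi = domain
    reachable = {0}  # with no terms, the only reachable sum is 0
    for c in coeffs:
        # each Boolean either contributes c (x_i = 1) or nothing (x_i = 0)
        reachable |= {s + c for s in reachable}
    return any(lo <= s <= hi for s in reachable)


def has_single_class(labels: Iterable[Hashable]) -> bool:
    """Hypothetical guard: True if the training labels contain at most one class."""
    return len(set(labels)) <= 1


# An empty linear expression always evaluates to 0, so it cannot equal 1.
print(linear_constraint_satisfiable([], (1, 1)))         # False
print(linear_constraint_satisfiable([1, 1, 1], (1, 1)))  # True
print(has_single_class([0, 0, 0]))                       # True
```

Checking the labels up front would let a caller report the degenerate case directly instead of surfacing it as an infeasible model.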

Regarding the inclusion of the model-building time in the returned solve_duration_time: we aim to measure the runtime of our entire method, not the performance of the solvers themselves. Indeed, model creation would be much faster using the C++ APIs rather than the Python ones!
We could have done it the other way, to actually quantify the solver's ability to solve the problem, and we may change that later.
In our experiments, we limit run times using the timeout parameter, which applies to the solvers' runtimes only, as we cannot bound the models' creation time (so the comparisons presented in the Appendices of our paper only compare the solvers' runtimes once the model is built).
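
As a rough sketch of the separate-timer idea discussed in this thread, the two phases could be measured independently while still reporting the total. The function `timed_build_and_solve` and its `build_model` / `solve_model` callables are hypothetical placeholders, not part of DRAFT or of any solver API; only `time.perf_counter` from the standard library is assumed.

```python
import time
from typing import Any, Callable, Dict


def timed_build_and_solve(
    build_model: Callable[[], Any],
    solve_model: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Run both phases and report their durations separately (in seconds)."""
    t0 = time.perf_counter()
    model = build_model()        # API-dependent phase (Python vs. C++ bindings, etc.)
    t1 = time.perf_counter()
    result = solve_model(model)  # solver-dependent phase (the one a timeout would bound)
    t2 = time.perf_counter()
    return {
        "build_time": t1 - t0,
        "solve_time": t2 - t1,
        "total_time": t2 - t0,   # what a single end-to-end timer would report
        "result": result,
    }


# Usage with stand-in callables; a real caller would pass its own model builder and solver call.
timings = timed_build_and_solve(lambda: list(range(1000)), lambda m: sum(m))
print(timings["result"])  # 499500
```

Reporting both numbers keeps the end-to-end measurement of the method while still making solver-only comparisons possible.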

I'm closing this issue now as the problem is fixed, but don't hesitate to reach out for any additional comment/feedback!

@ferryjul ferryjul closed this as completed Mar 7, 2024

torressa commented Mar 7, 2024

Cool! Thanks for the clarification!
Very cool piece of work BTW!
