Randomness of PopulationSim outputs related to API calls #182

Open
xiex0055 opened this issue Nov 9, 2023 · 2 comments

xiex0055 commented Nov 9, 2023

Recently, while running the MWCOG Population Synthesizer, which is based on PopulationSim v0.4.3, MWCOG staff discovered that the software consistently produced varying outputs at the disaggregate level (specifically, within the synthesized household and person files), despite identical inputs. This outcome was unexpected, since the entropy-maximizing and integerization methods used in PopulationSim should produce deterministic results. In response, MWCOG staff sought assistance from their consultant, RSG. Initially, RSG staff suspected that the issue might be linked to multiprocessing (e.g., see #150). However, they subsequently realized that the multiprocessing feature was introduced in the latest version (v0.5.1) of PopulationSim and was unavailable in the version (v0.4.3) used in the MWCOG Population Synthesizer. Aditya Gore (RSG) made the following observation:

"PopulationSim uses linear programming tools from the ortools package for integerization. Time limits can be specified for these tools, and in PopulationSim the time limit is currently set to 60 seconds. I have noticed that, depending on available computer resources, the tool sometimes (randomness here) hits this time limit without returning a solution, in which case PopulationSim turns to a different method of integerization. It is possible that you are running into this issue and each run is producing slightly different results."
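The mechanism described in the quote can be illustrated with a toy model. The sketch below is not PopulationSim or ortools code: `integerize`, `step_seconds`, and both rounding methods are hypothetical stand-ins, and the clock is simulated so the example is repeatable. It only shows how a wall-clock limit can send identical inputs down different code paths depending on machine speed.

```python
def integerize(weights, time_limit, step_seconds):
    """Toy stand-in for a time-limited integerizer with a fallback.

    step_seconds is a hypothetical knob simulating how long each solver
    step takes on a given machine (i.e. available computer resources).
    """
    floors = [int(w) for w in weights]
    remaining = round(sum(weights)) - sum(floors)

    # Primary method (largest remainder): hand out the leftover units
    # one step at a time, advancing the simulated clock at each step.
    order = sorted(range(len(weights)),
                   key=lambda i: weights[i] - floors[i], reverse=True)
    result = floors[:]
    elapsed = 0.0
    for i in order[:remaining]:
        elapsed += step_seconds
        if elapsed > time_limit:
            # Timed out: switch to a different (cruder) method that puts
            # every leftover unit on the largest weight.
            fallback = floors[:]
            biggest = max(range(len(weights)), key=lambda i: weights[i])
            fallback[biggest] += remaining
            return fallback, "fallback"
        result[i] += 1
    return result, "primary"


weights = [1.4, 2.4, 0.6, 1.6]              # identical input for both runs
fast_run = integerize(weights, 60.0, 10.0)  # simulated lightly loaded machine
slow_run = integerize(weights, 60.0, 40.0)  # simulated heavily loaded machine
print(fast_run)
print(slow_run)
```

In this toy case both runs return values with the same total, yet the individual values differ, which mirrors variation that shows up only at the disaggregate level.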

MWCOG staff pose the following questions:

  1. Does the randomness of PopulationSim outputs significantly impact the PopulationSim results at the aggregate level including the validation results?
  2. Is there any evidence suggesting that this variability in PopulationSim outputs will NOT considerably affect regional travel demand modeling results?
  3. Can this source of randomness be eliminated in a future version of PopulationSim?
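One way to approach question 1 is to compare two runs at both levels. The sketch below assumes each run is available as a list of household records; the field names `zone`, `hh_size`, and `sample_id` are hypothetical, and the records are made-up toy data, not MWCOG output.

```python
from collections import Counter


def cell_counts(households, fields):
    """Count synthesized households per combination of the given fields."""
    return Counter(tuple(hh[f] for f in fields) for hh in households)


def differing_cells(run_a, run_b, fields):
    """Return {cell: (count_a, count_b)} for cells whose counts differ."""
    a, b = cell_counts(run_a, fields), cell_counts(run_b, fields)
    return {c: (a[c], b[c]) for c in set(a) | set(b) if a[c] != b[c]}


# Toy data: the two runs pick a different sample household in zone 1,
# but both picks have the same household size.
run_a = [
    {"zone": 1, "hh_size": 2, "sample_id": 101},
    {"zone": 1, "hh_size": 3, "sample_id": 205},
    {"zone": 2, "hh_size": 2, "sample_id": 101},
]
run_b = [
    {"zone": 1, "hh_size": 2, "sample_id": 117},
    {"zone": 1, "hh_size": 3, "sample_id": 205},
    {"zone": 2, "hh_size": 2, "sample_id": 101},
]

aggregate_diff = differing_cells(run_a, run_b, ["zone", "hh_size"])
disaggregate_diff = differing_cells(run_a, run_b, ["zone", "sample_id"])
print(aggregate_diff)
print(disaggregate_diff)
```

An empty `aggregate_diff` alongside a non-empty `disaggregate_diff` would indicate that the runs agree on the controlled totals and differ only in which sample households were drawn.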

bettinardi added this to the Phase 9 Priorities milestone Jan 5, 2024

bettinardi (Collaborator) commented

Quickly adding that Oregon's testing and work with PopulationSim is running into this issue as well.

jeffreyhood commented

Although I'm not yet fully apprised of the issue, my initial reaction is to propose that we change the limit from a time limit to an iteration limit, or that we change solvers if an iteration-based limit is not possible in ortools. It will be very important for statewide collaboration that the population synthesizer generate reproducible results.
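A minimal sketch of the idea, using a toy model rather than any real solver API: the budget is counted in solver steps instead of seconds, so whether the fallback is used depends only on the inputs and `max_steps`. All names here are hypothetical, and whether ortools exposes an equivalent limit for the solver PopulationSim uses is not established here.

```python
def integerize_step_limited(weights, max_steps):
    """Toy integerizer whose budget is a step count, not wall-clock time."""
    floors = [int(w) for w in weights]
    remaining = round(sum(weights)) - sum(floors)

    # Primary method (largest remainder), one leftover unit per step.
    order = sorted(range(len(weights)),
                   key=lambda i: weights[i] - floors[i], reverse=True)
    result = floors[:]
    steps = 0
    for i in order[:remaining]:
        steps += 1
        if steps > max_steps:
            # Budget exhausted: use the cruder fallback method instead.
            fallback = floors[:]
            biggest = max(range(len(weights)), key=lambda i: weights[i])
            fallback[biggest] += remaining
            return fallback, "fallback"
        result[i] += 1
    return result, "primary"


weights = [1.4, 2.4, 0.6, 1.6]
runs = [integerize_step_limited(weights, max_steps=1) for _ in range(5)]
reproducible = all(run == runs[0] for run in runs)
print(runs[0], reproducible)
```

With a step budget the fallback may still trigger, but it triggers on every run for the same inputs, regardless of machine load.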