Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module Dependency Errors #1325

Open
mnjowe opened this issue Apr 25, 2024 · 6 comments · May be fixed by #1379
Open

Module Dependency Errors #1325

mnjowe opened this issue Apr 25, 2024 · 6 comments · May be fixed by #1379
Labels
enhancement New feature or request framework

Comments

@mnjowe
Copy link
Collaborator

mnjowe commented Apr 25, 2024

****** For such moments where you don't want to run a full model analyses **************
@tbhallett , @tamuri and @matt-graham
With modules declaring dependencies and analyses wanting the modules where these dependencies are declared registered in sim.register(), I think it is somewhat difficult for someone new to TLM to setup a small analyses and run it as it will keep crashing with module dependency errors. see below for an example

tlo.dependencies.ModuleDependencyError: Module Tb depends on Epi which is missing from modules to register as are any alternatives to Epi.

Wouldn't it be nice if we can automatically register modules that are missing in simulation? i.e. when disease modules are declaring module dependencies they also declare/include the module name where this dependency is found. something like INIT_DEPENDENCIES = {'enhanced_lifestyle': 'Lifestyle'}. That way we could use the module name to automatically register the module if not already registered in sim.register()?

I know there might be some better ways to doing it BUT I just feel it would simplify the process of setting up a smaller analyses if all missing modules could be automatically registered instead of crashing the simulation.

@mnjowe mnjowe changed the title MODULE DEPENDENCIES Module Dependency Errors Apr 25, 2024
@tbhallett tbhallett added this to Issues in Issue management via automation Apr 25, 2024
@tbhallett
Copy link
Collaborator

I think this is a good idea @mnjowe

@matt-graham
Copy link
Collaborator

Agreed - this would be a nice feature to add. I think how the module dependencies are already declared could already allow doing this on the presumption that there is only one subclass of Module with a particular name (for example that there aren't two Lifestyle classes defined in different Python module files) which at the moment is the case, though having a more explicit specification of where the class is defined would definitely be more robust and easier to work with. Specificlly the get_module_class_map helper function in src/tlo/dependencies.py

def get_module_class_map(excluded_modules: Set[str]) -> Mapping[str, Type[Module]]:
"""Constructs a map from ``Module`` subclass names to class objects.
:param excluded_modules: Set of ``Module`` subclass names to exclude from map.
:return: A mapping from unqualified ``Module`` subclass to names to the correponding
class objects. This adds an implicit requirement that the names of all the
``Module`` subclasses are unique.
:raises RuntimError: Raised if multiple ``Module`` subclasses with the same name are
defined (and not included in the ``exclude_modules`` set).
"""
methods_package_path = os.path.dirname(inspect.getfile(tlo.methods))
module_classes = {}
for _, methods_module_name, _ in pkgutil.iter_modules([methods_package_path]):
methods_module = importlib.import_module(f'tlo.methods.{methods_module_name}')
for _, obj in inspect.getmembers(methods_module):
if is_valid_tlo_module_subclass(obj, excluded_modules):
if module_classes.get(obj.__name__) not in {None, obj}:
raise RuntimeError(
f'Multiple modules with name {obj.__name__} are defined'
)
else:
module_classes[obj.__name__] = obj
return module_classes

allows constructing a map from module name strings to module classes.

I think we would probably want to make this behaviour opt-in, at least initially, rather than the default so we could so somehting like add a keyword argument auto_register_modules defaulting to False to the Simulation.register method and if it is True find any missing dependencies and add them to the list of modules to be registered. This would only allow registering modules with the default values for their __init__ method arguments, but for a lot of cases this would probably suffice. One slight snag is that at the moment all modules have a required positional argument resourcefilepath to their __init__ method so we would either need to add this as an additional optional argument to Simulation.register to be passed through, or, refactor so that resourcefilepath is instead defined at the Simulation level as an argument to its initialiser (with modules then either given access by setting an attribute when registering or exposing a property which uses the existing reference to the top-level simulation object).

@matt-graham matt-graham added enhancement New feature or request framework labels Apr 25, 2024
@mnjowe
Copy link
Collaborator Author

mnjowe commented Apr 25, 2024

Thanks so much @matt-graham for the detailed explanation on how we can go about this. Should I then in a few days open a new PR on this?

@matt-graham
Copy link
Collaborator

Yeah opening a PR on this would be great! I would potentially try to deal with the issue of setting resourcefilepath for each module in one PR and have a separate PR for adding a new keyword argument to Simulation.register that allows automatically loading dependencies. The former is likely to require a bit more figuring out of what interface we want to go with and how to do this in a backwards compatible manner, and will also potentially involve edits to all existing modules, so might be a bit more complex to orchestrate.

To avoiding delaying a PR on adding the automatic module registering, for now we could potentially assume something like that automatically loaded dependencies are initialised with the resourcefilepath attribute of the already loaded module which they are a dependency for? This is a bit brittle as there is nothing which currently enforces that modules define a resourcefilepath attribute but would mean if we later refactor this so it is for example defined at a simulation level, we don't need to make multiple changes to the Simulation.register interface.

@mnjowe
Copy link
Collaborator Author

mnjowe commented Apr 25, 2024

Great, thanks @matt-graham . The two issues you have noted above are indeed so important in as far resolving this automatic module registering issue is concerned. Especially the resource file path which is needed when registering disease modules. I think I will have to delay my PR on this until you have found a workaround on the resource file issue. Thanks

@tbhallett
Copy link
Collaborator

tbhallett commented Apr 25, 2024

Thanks @matt-graham and @mnjowe

You're both quite right about the 'resourcefilepath' thing.

I do think that we can assume that the resourcefilepath is the same for all modules, and in fact, is really a Simulation-level parameter.

Indeed, I think there is already a structure for this, that we should have been using all along (I think I am to blame for not using it!): The place where the data is read-in is always Module::read_parameters, which is passed an argument from the Simulation object data_folder. I assume data_folder is intended to be a path to the resources that was provided to the Simulation on initialisation.

@mnjowe mnjowe linked a pull request May 13, 2024 that will close this issue
@mnjowe mnjowe linked a pull request May 29, 2024 that will close this issue
@mnjowe mnjowe linked a pull request May 29, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request framework
Projects
3 participants