`--default-resources` combined with partial resource specification causes TypeError #1550
Comments
Here's the complete trace:
An immediate fix that DOES work is to modify … Happy to make a PR, but this may be too much of a "band-aid" fix for a deeper problem. Let me know what you'd like.
The new auto `--default-resources` was causing test failures, so we manually set the default mem_mb to 0. Ran formatting with black
Thanks a lot, good catch! I have fixed this in the above PR. Basically the call to update the resources before scheduling came too late.
* Add `.python-version` to `.gitignore`. `.python-version` is used by pyenv to manage local Python versions.
* Track what specific pipe group each job belongs to:
  * Adds a `pipe_group` property to jobs that tracks which pipe group the job belongs to, if any. It is `None` if the job is not involved with pipes; otherwise, it equals the `CandidateGroup` generated when grouping the pipe.
  * If the job was not part of a user-defined group, the ordinary `group` property will still be set to the `CandidateGroup`, as before.
* Toposort puts pipe jobs at the same level. Jobs connected by a pipe now do not depend on each other, but inherit each other's dependencies. This properly ensures `toposort()` puts them at the same level.
* Ensure item is the correct type for `CandidateGroup` eq. `__eq__` in `CandidateGroup` failed when comparing with a different type of object. This is corrected by inserting a type check.
* Group resource constraint calculations. Computes group-resource requirements on the cluster by taking user-provided constraints into account. Jobs that could be run in parallel will now be run in series if not enough resources (e.g. cores or memory) are available. Runtime is now treated as a special resource, and will be accurately calculated for parallel jobs via `max()` or for series jobs via `sum()`. The logic is implemented in a series of private methods in `GroupJob`. Pipe groups are treated as a single unit when calculating resources: runtime is summed over pipe groups instead of "maxed", and cores are merged via `max()`.
* Fix sorting for pipe job scheduling. The toposorting within `GroupJob` was updated to put all pipe jobs at the same toposort level so that accurate resource requirements could be calculated. This broke the scheduler, however, which relied on pipe jobs within the same group being in a hierarchical order. This adds an additional toposort to `GroupJob.__iter__` specifically for pipe jobs, and also refactors the toposorting code into a less complex form.
* Stop constraining resources for submitted jobs. Adds a new property `local_resources` to `Job` and `GroupJob` and changes all references to `job.resources` in the scheduler to `job.local_resources`. `local_resources` is normally identical to `resources`, unless `job.is_local == False` (i.e. the job is submitted to a cluster), in which case `local_resources` will only include `_cores` and `_nodes`. `GroupJob.resources` now outputs a `Resources` object instead of a dict.
* Add resource constraints to cluster job submission. Includes the `--resources` flag on the snakemake call made in the child job. Does not send any runtime information, as this falsely constrains jobs within the cluster. Cores is still set to "all" for regular cluster submission, leaving the cluster to manage how many cores the job is entitled to.
* Add workaround for #1550 in tests. The new auto `--default-resources` was causing test failures, so we manually set the default `mem_mb` to 0. Ran formatting with black.
* Implement resource scopes for remote jobs. Resources can now be set as local or global, either using the Snakefile directive `resource_scope` or, with higher priority, the CLI argument `--set-resource-scopes`. The scheduler shall only take local resources into account on a local job, i.e. a job scheduled on a cluster. Global resources shall always be taken into account. Group jobs that would exceed global resource allocations shall not be scheduled until other jobs have finished. Note that the global/local distinction only has an effect in remote jobs (cluster, compute nodes, etc.); in local executions, all jobs are local.
* Update documentation for resource scopes. Adds resource scope info and cleans up/organizes the resource section.
* Unify toposort functions. Toposort was separately implemented both in `dag` and in `jobs`; this moves the jobs version into `dag`. Pipe group jobs are immediately sorted in dependency order when toposort is calculated, instead of each time `__iter__()` is called on a group job.
* Ensure resources passed to subcommands are digits. Previously, the check was merely whether the resource was an int, but some int subclasses are not represented as digits, especially `bool`, which is represented by the strings `True` and `False`. This is corrected by checking whether the stringified int looks like an int (as opposed to `True` or `False`).
* Remove special merging rules for pipe jobs. Also refactors `_simple_merge` into a more generic format.
* Refactor group job calculation into a new class. Moves all relevant code into the `GroupResources` class in the resources module. Removes the hardcoding of "runtime" from the logic and isolates it into two variables: `additive_resources`, which determines which resources are summed across layers, and `sortby`, which determines sorting priority when layering. `GroupResources` is changed to a `UserDict` containing a class method `basic_layered`. This opens the possibility for new group resource merging methods to be added later.
* Refactor resource scopes into a self-contained class. Moves the logic for resource scope management into a single class. Adds an "excluded" scope for resources that are not to be sent to child snakemake processes. This undocumented feature will currently be used to prevent runtime from being sent to cluster jobs.
* Pass on resource limits to run-directive jobs.
* Make the too-much-group-resource error more generic. Removes the hardcoded mention of runtime, instead referring to the `additive_resources`, and moves some of the error logic into its own function.
* Fix `group_job_pipe` test failures on low-core environments. Some environments (like GitHub Actions) don't have enough cores to run the subprocess in `test_group_job_resources_with_pipe`. We fix this by patching `RealExecutor.get_job_args`, which instructs the executor how many cores to request.
* Refactor `basic_layered` and add a docstring. Common operations have been refactored into separate methods to simplify the logic.
* Clarify the snakemake version in which the scope feature started.
* Add a link to the scopes explanation in CLI help.
* Test appropriate inclusion of resources in submission. Ensures that excluded resources like runtime are not included in the job submit script.
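The pipe-group toposort change described above can be sketched in a few lines. This is an assumption-based illustration, not Snakemake's actual implementation: jobs connected by a pipe stop depending on each other and instead inherit the union of each other's dependencies, so a layered toposort lands them on the same level.

```python
# Illustrative sketch (not Snakemake's real code) of the pipe-group idea:
# pipe-connected jobs inherit each other's dependencies, so a layered
# topological sort places them on one level.
from graphlib import TopologicalSorter


def merge_pipe_deps(deps, pipe_groups):
    """Return a dependency map where jobs in a pipe group share the union
    of the group's dependencies, minus the group members themselves."""
    merged = {job: set(d) for job, d in deps.items()}
    for group in pipe_groups:
        union = set().union(*(deps[j] for j in group)) - set(group)
        for job in group:
            merged[job] = set(union)
    return merged


def toposort_layers(deps):
    """Yield sorted lists of jobs whose dependencies are all in earlier layers."""
    ts = TopologicalSorter(deps)  # maps node -> set of predecessors
    ts.prepare()
    while ts.is_active():
        ready = sorted(ts.get_ready())
        yield ready
        ts.done(*ready)


# "b" pipes into "c": after merging, both depend only on "a" and share a layer.
deps = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}}
layers = list(toposort_layers(merge_pipe_deps(deps, [("b", "c")])))
# layers == [["a"], ["b", "c"], ["d"]]
```

Without the merge, `c` would sit one layer below `b`, which is what made per-layer group resource sums undercount pipe groups.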
Snakemake version
7.3.5+0.g4f2258ff.dirty
Describe the bug
When running snakemake using `--default-resources`, but without specifying `mem_mb`, so that the Snakemake-defined default of `mem_mb=max(2*input.size_mb, 1000)` kicks in, along with the `--resources` flag with `mem_mb`: if a root rule specifies a `mem_mb` requirement under its resources directive and a downstream rule doesn't, the downstream rule will error out with a TypeError.
Logs
Minimal example
```sh
# CLI call
snakemake --default-resources --resources mem_mb=4000 -c1 all
```
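The Snakefile for the example is not included above; a minimal sketch consistent with the description might look like the following (rule and file names other than `all`, and the shell command, are assumptions):

```python
# Hypothetical Snakefile: "root" sets mem_mb explicitly; the downstream
# target "all" does not, so the auto default
# mem_mb=max(2*input.size_mb, 1000) applies to "all".
rule all:
    input:
        "out.txt"

rule root:
    output:
        "out.txt"
    resources:
        mem_mb=1000
    shell:
        "touch {output}"
```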
Additional context
The last version in which the above code works is v6.4.1. The first version the error appears in is v6.8.0. Weirdly, in v6.7, 6.6, and 6.5 I get completely different errors, but perhaps those issues have been resolved?
I'm currently looking into the source of the error, although it obviously stems from the fact that, in the example above, `all` does not know the size of its input, so its `mem_mb` gets automatically set to `"<TBD>"`. Obviously in previous versions that would get reset when `all` actually ran, but this isn't happening any more.
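The failure mode above can be illustrated with a small sketch. This is an assumption-based simplification: `TBD` here is a stand-in for Snakemake's internal `"<TBD>"` placeholder, not the real class. The point is that eagerly evaluating the default formula against an unresolved input size raises a TypeError instead of being deferred until the input exists.

```python
# Simplified illustration of the bug described above. TBD is a hypothetical
# stand-in for Snakemake's "<TBD>" placeholder: evaluating the default
# mem_mb=max(2*input.size_mb, 1000) against it fails with a TypeError.
class TBD:
    """Placeholder for an input size that has not been determined yet."""

    def __repr__(self):
        return "<TBD>"


def default_mem_mb(size_mb):
    # Snakemake's documented default: mem_mb=max(2*input.size_mb, 1000)
    return max(2 * size_mb, 1000)


try:
    default_mem_mb(TBD())
except TypeError as err:
    print(f"TypeError: {err}")
```

In earlier versions the placeholder was replaced with the real size before the formula ran; here the evaluation happens too early, matching the maintainer's note that "the call to update the resources before scheduling came too late".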