Need additional clarification/examples around using set_dependencies+map #246

Open
varunm22 opened this issue Nov 29, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@varunm22

Summary

I'm confused about how to use dependencies properly. Let's say I have a workflow with 4 groups of steps (A, B, C, D), each with multiple subtasks that can run in parallel (A1, A2, ..., B1, B2, ...). Currently, I add all the A steps using couler.map, then all the B steps with couler.map, and so on. This correctly parallelizes across A1, A2, ..., but none of the B steps start until all the A steps have completed, even though I never explicitly set dependencies.

In this case, I want A and B to run in parallel, then C, then D. Running sequentially as A, B, C, D is technically correct but slower than it needs to be. However, since the steps already run sequentially without my setting any dependencies, I suspect the set_dependencies function wouldn't help. Also, when I tried set_dependencies, couler failed to parse its own generated YAML because of duplicate anchor definitions. I'd like to see a more in-depth example than those currently in the README, showing how to use set_dependencies together with functions like map.
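To make the intended ordering concrete, here is a minimal plain-Python sketch of the dependency graph I have in mind. It does not use Couler at all; the step names, the two-subtask fan-out, and the `waves` helper are illustrative assumptions, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def waves(deps):
    """Group steps into waves; all steps in one wave can run in parallel."""
    sorter = TopologicalSorter(deps)  # maps each step to its prerequisites
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose prerequisites are done
        result.append(ready)
        sorter.done(*ready)
    return result


# Hypothetical fan-out: two subtasks per group.
a_steps = ["A1", "A2"]
b_steps = ["B1", "B2"]

# Desired shape: A and B steps have no prerequisites,
# C waits for every A and B step, and D waits for C.
desired = {step: set() for step in a_steps + b_steps}
desired["C"] = set(a_steps + b_steps)
desired["D"] = {"C"}

print(waves(desired))
```

With this graph, all A and B subtasks land in the first wave, followed by C and then D, which is the ordering I would like the generated workflow to have.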

Use Cases

Mostly explained above.


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@varunm22 varunm22 added the enhancement New feature or request label Nov 29, 2021
@dmerrick
Contributor

dmerrick commented Dec 21, 2021

I think I can phrase this question more concisely.

We want a job that looks like this:



       start
      /     \
     /       \
   /|\       /|\
  / | \     / | \
 A1...AN   B1...BN
  \ | /     \ | /
     \       /
      \     /
       \   /
         C
         |
         D

Here A and B are separate sets of commands, each wrapped in couler.map().

What we get is this:

   start
     |
    /|\
   / | \
  A1...AN
   \ | /
     |
   / | \
  B1...BN
   \ | /
     |
     C
     |
     D
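As a rough illustration of what the second shape costs, the sketch below computes the earliest finish time of each graph, assuming unlimited parallelism. This is plain Python, not Couler code; the durations, the two-subtask fan-out, and the `makespan` helper are all hypothetical.

```python
from functools import lru_cache


def makespan(deps, duration):
    """Earliest finish time of the whole workflow with unlimited parallelism."""

    @lru_cache(maxsize=None)
    def finish(step):
        # A step starts once its slowest prerequisite has finished.
        start = max((finish(d) for d in deps[step]), default=0)
        return start + duration[step]

    return max(finish(step) for step in deps)


# Hypothetical durations in minutes.
duration = {"A1": 5, "A2": 5, "B1": 7, "B2": 7, "C": 2, "D": 1}
a_steps, b_steps = ["A1", "A2"], ["B1", "B2"]

# First diagram: A and B fan out side by side, then C, then D.
wanted = {**{s: [] for s in a_steps + b_steps}, "C": a_steps + b_steps, "D": ["C"]}

# Second diagram: every B step waits for every A step.
actual = {
    **{s: [] for s in a_steps},
    **{s: a_steps for s in b_steps},
    "C": b_steps,
    "D": ["C"],
}

print(makespan(wanted, duration))  # 10
print(makespan(actual, duration))  # 15
```

Under these assumed durations, the serialized shape pays the full length of the A group before any B step can begin.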

@dmerrick
Contributor

Is there a way to get this to work in Couler? Do we have to use a DAG?
