[Docs] improve docs on eager workflow remote execution #5288

vttrifonov opened this issue Apr 25, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation documentation-backlogged For internal use. Reserved for community team workflow.

Comments

vttrifonov commented Apr 25, 2024

Description

I am very excited to learn about eager workflows! Unfortunately, so far I have not managed to make them work for me outside of sandbox and local.

The documentation focuses mostly on sandbox/local execution, and for remote execution there is just a blurb. Throughout the doc it seems like @eager() is enough to decorate eager workflows, but it seems that for remote workflows @eager(remote=...) needs to be included everywhere (?). I understand (vaguely) why this is, but it does not make for good-looking code... This is not needed for @task and @workflow, so the natural expectation is that it should not be needed for @eager either. In any case, a more complete example of eager sub-workflows written with remote execution in mind would be nice.
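To illustrate the repetition problem, here is a minimal sketch of one way the remote arguments could be bound once and reused. It is only a sketch under stated assumptions: the `eager` function below is a stand-in written for this example, not flytekit's decorator, and it only mimics the call shape `@eager(remote=..., client_secret_group=..., client_secret_key=...)`. The names `remote_eager`, `inner`, and `outer` and the string values are placeholders. Whether the real decorator can be wrapped with `functools.partial` in the same way is an assumption that would need to be checked against flytekit.

```python
import asyncio
import functools


def eager(_fn=None, *, remote=None, client_secret_group=None, client_secret_key=None):
    """Stand-in for the real decorator (hypothetical; only mimics its call shape)."""
    if _fn is None:
        # Called as @eager(...): return a decorator with the options bound.
        return functools.partial(
            eager,
            remote=remote,
            client_secret_group=client_secret_group,
            client_secret_key=client_secret_key,
        )

    @functools.wraps(_fn)
    async def wrapper(*args, **kwargs):
        return await _fn(*args, **kwargs)

    # Record the options so the example can show that they were applied.
    wrapper.eager_options = {
        "remote": remote,
        "client_secret_group": client_secret_group,
        "client_secret_key": client_secret_key,
    }
    return wrapper


# Bind the remote settings once; every value here is a placeholder.
remote_eager = functools.partial(
    eager,
    remote="my-remote",  # placeholder; in real code this would be a remote client object
    client_secret_group="my_client_secret_group",
    client_secret_key="my_client_secret_key",
)


@remote_eager
async def inner(x: int) -> int:
    return x + 1


@remote_eager
async def outer(x: int) -> int:
    # An eager workflow awaiting another eager workflow.
    return (await inner(x)) * 2


result = asyncio.run(outer(1))
```

With this pattern the remote settings live in one place, and the workflow definitions read almost the same as their local counterparts.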

The second issue I ran into is that even when remote is set, it turns out I also need 'secret_group' and 'secret_key' (otherwise an "AssertionError: secret_group must be defined when using a remote cluster" is raised). This is where my attempts to use this feature have stalled. The documentation makes it sound like the person who writes the workflows (me) is the same as the person who sets up the cluster (a very busy team of devops in our case). Other than in toy examples, this is not true in general. For example, the doc says

Where config.yaml contains a flytectl-compatible config file and my_client_secret_group and my_client_secret_key are the secret group and key that you’ve configured for your Flyte cluster to authenticate via a client key.

When I read this initially, I thought that I needed to set up the cluster somehow (it sounds like my_client..., I've configured, my cluster, etc.), but upon reading the secrets and authentication page (this should be linked in the eager workflow doc) it became clearer that this is something that perhaps devops needs to do, and then I get it from them. Not sure yet... I do not know much about setting up Kubernetes clusters, but the overall impression is that to run @eager remotely I also need devops to intervene somehow.

Overall, @eager is a great feature that needs to crystallize further. I am hoping my comments here will help a bit. Ideally, the remotely executing code should look (mostly) like the local one. In the meantime, somewhat better documentation (for non-devops people) on what it takes to set up and run remotely would be helpful.

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@vttrifonov vttrifonov added documentation Improvements or additions to documentation untriaged This issues has not yet been looked at by the Maintainers labels Apr 25, 2024

welcome bot commented Apr 25, 2024

Thank you for opening your first issue here! 🛠


runllm bot commented Apr 25, 2024

Thank you for your detailed feedback on the documentation for eager workflows in Flyte, especially regarding remote execution. Your points highlight important areas for improvement in clarity and usability for users who are not directly involved in cluster management.

  1. Remote Execution Decorator: You're correct that for remote execution of eager workflows, the @eager(remote=...) decorator is necessary. This is different from @task and @workflow, which might indeed lead to confusion. A clearer explanation and justification for this requirement in the documentation could help align user expectations.

  2. Secrets Requirement: The need for secret_group and secret_key in remote configurations is a crucial point that should be better addressed in the documentation. It's important to clarify the roles and responsibilities between developers and DevOps, ensuring that developers understand what is required from them and what needs to be set up by the DevOps team. This separation of concerns isn't currently well-defined in the documentation.

  3. Documentation Links and Clarity: Including direct links to relevant sections like secrets management and authentication within the eager workflow documentation can provide a smoother learning curve and reduce confusion.

  4. Examples and Templates: Providing complete examples that include eager sub-workflows configured for remote execution could serve as valuable templates for users to adapt to their needs.

Your suggestions for making the remote execution code resemble local execution as closely as possible are insightful. Enhancing the documentation to better cater to non-DevOps personnel will indeed make the feature more accessible and easier to adopt.

Would you like more detailed guidance on how to configure and use eager workflows remotely, including code examples and configuration details? This could potentially help you move forward with your implementation while the documentation is being improved.

This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @RunLLM to ask follow-up questions. See runllm.com for more!

@davidmirror-ops davidmirror-ops removed the untriaged This issues has not yet been looked at by the Maintainers label Apr 25, 2024
@neverett neverett added the documentation-backlogged For internal use. Reserved for community team workflow. label Apr 25, 2024
Contributor

kumare3 commented May 3, 2024

@vttrifonov thank you for the feedback. We are aware of this; we have not yet dedicated enough resources to eager workflows. We will get to them later this year, Q3ish.

Until then, we would love your help.
There are a few major issues:

  1. UI support: eager workflows cannot be visualized well in the UI.
  2. Auth delegation: currently it is not possible to delegate auth to the running container.
  3. Lightweight failure-recovery / state-saving support: currently it will indeed recover from failures, but that involves consulting with remote. We have ideas for how to make it fast with a local state checkpoint.
