[Discussion] Use Couler or not #2996

Open
typhoonzero opened this issue Oct 9, 2020 · 3 comments

@typhoonzero
Collaborator

In the refactored code, we use Couler to generate an Argo YAML file to submit to the Kubernetes cluster to run. A generated Couler program should look like:

def step_entry_0():
    import runtime
    runtime.local.tensorflow.train(....)

couler.run_container(step_entry_0, ...)

def step_entry_1():
    import runtime
    runtime.db.exec(....)

couler.run_container(step_entry_1, ...)

Executing the above Couler program should generate a YAML file with the step Python code embedded in it.

  1. SQLFlow is a compiler that compiles a SQL program into a workflow YAML file. Currently a code generator produces the above Couler program from the SQLFlow IR, and we then execute that Couler program to get the YAML. We could instead rewrite the code generator to emit the YAML directly, which would make the procedure simpler.


  2. Using Couler to generate YAML is hard to maintain. We use Couler to generate and submit the workflow, yet we still use Go to fetch the workflow status periodically. The SQLFlow compiler needs to maintain a Go-side workflow struct in order to do dependency analysis and other optimizations (e.g. Katib?), so there is no need to translate it to the Python side and use Python Couler to implement YAML generation again.

  3. We need a local mode to simplify the development and debugging. As a compiler, SQLFlow local mode can directly generate a Python program with several step functions and call them one by one, and with the "workflow mode", SQLFlow can generate the YAML with the step functions directly.
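To make the two modes concrete, here is a minimal sketch of what a generator without Couler could emit. It is only an illustration under assumptions: the step bodies, the image name `example/step-image:latest`, and the helper names `local_program`, `argo_workflow` and `to_yaml` are hypothetical, and the real generator would live on the Go side. Only the Argo Workflow fields (`entrypoint`, `templates`, `steps`, `script`) are taken from Argo itself.

```python
import json
import textwrap

# Hypothetical step bodies, as a code generator might emit them from the IR.
STEPS = [
    ("step-entry-0", "print('train')"),
    ("step-entry-1", "print('exec sql')"),
]


def local_program(steps):
    """Local mode: one Python program that defines each step and calls them in order."""
    lines = []
    for i, (_, body) in enumerate(steps):
        lines.append(f"def step_entry_{i}():")
        lines.append(textwrap.indent(body, "    "))
        lines.append("")
    lines += [f"step_entry_{i}()" for i in range(len(steps))]
    return "\n".join(lines) + "\n"


def argo_workflow(steps, image="example/step-image:latest"):
    """Workflow mode: an Argo Workflow manifest with each step body as a script template."""
    templates = [{
        "name": "main",
        # Each inner list is one sequential stage in Argo's `steps` syntax.
        "steps": [[{"name": name, "template": name}] for name, _ in steps],
    }]
    for name, body in steps:
        templates.append({
            "name": name,
            "script": {"image": image, "command": ["python"], "source": body},
        })
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "sqlflow-"},
        "spec": {"entrypoint": "main", "templates": templates},
    }


def to_yaml(manifest):
    # JSON output is also accepted by YAML parsers, which avoids a third-party dependency here.
    return json.dumps(manifest, indent=2)
```

Both outputs come from the same step list, so the local program and the submitted workflow cannot drift apart. Conditions and loops are deliberately left out of this sketch.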

@lhw362950217
Collaborator

lhw362950217 commented Oct 9, 2020

I mostly agree with this idea. For now, we only use very basic Couler operations, and the Couler generation process mostly just copies the code into the output YAML. However, we need to make sure we will not need the more advanced features, such as conditions and loops. We don't want to reimplement the whole suite of Couler logic in Go.

@brightcoder01
Collaborator

brightcoder01 commented Oct 9, 2020

> We need a local mode to simplify the development and debugging. As a compiler, SQLFlow local mode can directly generate a Python program with several step functions and call them one by one, and with the "workflow mode", SQLFlow can generate the YAML with the step functions directly.

I think that for a local step, we will call the Docker API to launch a container on our laptop instead of directly calling the step functions built upon the runtime library.

This is because models and runnables are released in customized Docker images, so we need to launch a container from these images to execute them.
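A rough sketch of what launching one step in a container could look like. Note the assumptions: it shells out to the `docker run` CLI rather than calling the Docker API, it assumes the image ships a `python` binary, and the helper names `docker_run_command` and `run_step_locally` are made up for illustration.

```python
import subprocess


def docker_run_command(image, source):
    """Build a `docker run` command that executes `source` with the image's Python."""
    # --rm removes the container once the step exits.
    return ["docker", "run", "--rm", image, "python", "-c", source]


def run_step_locally(image, source):
    # Requires a local Docker daemon; check=True raises if the step exits non-zero.
    return subprocess.run(docker_run_command(image, source), check=True)
```

Building the command separately from running it keeps the local step easy to inspect and test without a Docker daemon.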

@typhoonzero
Collaborator Author

We decided to rename Couler to flow.

SQLFlow compiles a SQL program into a workflow program. This workflow program should be able to run in different runtime environments, such as locally with Docker, or on Kubernetes with Argo or Tekton.
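One possible shape for this, as a sketch only: a single step list that pluggable backends render for their own runtime. Everything named here (`Step`, `BACKENDS`, `backend`, `compile_workflow`, the task name `sqlflow-task`) is hypothetical and not the actual flow API. The Tekton fields used (`Task`, `spec.steps`, `script`) are Tekton's own, and an Argo backend would register in the same way.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    image: str
    source: str  # Python source of the step body


# Registry of backends: each one renders the same step list for its own runtime.
BACKENDS: Dict[str, Callable[[List[Step]], object]] = {}


def backend(name):
    def register(fn):
        BACKENDS[name] = fn
        return fn
    return register


@backend("local")
def render_local(steps):
    # One `docker run` command per step, to be executed in order.
    return [["docker", "run", "--rm", s.image, "python", "-c", s.source] for s in steps]


@backend("tekton")
def render_tekton(steps):
    # Steps inside a Tekton Task run sequentially; the shebang selects the interpreter.
    return {
        "apiVersion": "tekton.dev/v1beta1",
        "kind": "Task",
        "metadata": {"name": "sqlflow-task"},
        "spec": {"steps": [
            {"name": s.name, "image": s.image,
             "script": "#!/usr/bin/env python\n" + s.source}
            for s in steps
        ]},
    }


def compile_workflow(steps, target):
    if target not in BACKENDS:
        raise ValueError(f"unknown runtime: {target}")
    return BACKENDS[target](steps)
```

With this layout the compiler only produces the step list, and adding a runtime means registering one more render function.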
