Asynchronous web tasks #1572

Open
daquinteroflex opened this issue Mar 27, 2024 · 1 comment
Open

Assignees
Labels
3.0 will eventually go into version 3.0

Comments

Collaborator

daquinteroflex commented Mar 27, 2024

When doing parameter scans, especially with large simulation tasks such as those that use CustomMediums, it takes a long time to create and upload each task, and then to download the results sequentially. Could we parallelize this process?

Note: this is related to #1242, but focuses specifically on asynchronous tasks.

The scope here is, however, a bit broader:

For example, in https://docs.flexcompute.com/projects/tidy3d/en/latest/notebooks/MetalHeaterPhaseShifter.html

for _, hs_data in batch_data.items():
    temp_interpolated = hs_data["temperature"].temperature.interp(x=target_grid.x, y=0, z=target_grid.z, fill_value=300)
    psim = optic_sim.perturbed_mediums_copy(temperature=temp_interpolated)
    perturb_sims.append(psim)

It'd be nice if hs_data could be just a collection of references to task results that can be downloaded on demand, so that the data does not have to be uploaded to the cloud again.
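To make the idea of a "task result reference" concrete, here is a minimal sketch of a lazy handle. Everything in it is hypothetical: `TaskResultRef`, its `fetch` callable and the `load` method are illustration names and not part of the tidy3d API. The assumption is only that a result can be identified by a task id and fetched later by some download function.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TaskResultRef:
    """Hypothetical lightweight handle to a result that still lives on the server."""

    task_id: str
    # Hypothetical download function: takes a task id, returns the result data.
    fetch: Callable[[str], Any]
    _cached: Optional[Any] = field(default=None, init=False, repr=False)
    _loaded: bool = field(default=False, init=False, repr=False)

    def load(self) -> Any:
        """Download the data on first access and reuse the cached copy afterwards."""
        if not self._loaded:
            self._cached = self.fetch(self.task_id)
            self._loaded = True
        return self._cached


if __name__ == "__main__":
    def fake_fetch(task_id: str) -> dict:
        # Stand-in for a real download; returns dummy data.
        print(f"downloading {task_id}")
        return {"task_id": task_id}

    refs = [TaskResultRef(f"task-{i}", fake_fetch) for i in range(3)]
    # Nothing has been downloaded yet; only the first reference is resolved here.
    print(refs[0].load())
```

With handles like this, a batch result could be passed around cheaply, and only the entries that are actually used locally would trigger a download.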

Useful reading:

daquinteroflex changed the title from "Speed up task generation, upload and download" to "Asynchronous web task running" Apr 5, 2024
daquinteroflex changed the title from "Asynchronous web task running" to "Asynchronous web tasks" Apr 5, 2024
daquinteroflex self-assigned this Apr 9, 2024
Collaborator Author

daquinteroflex commented Apr 9, 2024

Theory behind the implementation options

We want users to access asynchronous commands without much complexity. In my opinion, we want:

  • To support both concurrent and parallel task frameworks
  • To make it easy for the user to interact with these asynchronous commands without necessarily having to understand the asynchronous packages underneath.

From the iron.io blog:
image

Candidate implementation packages suggested:

Important concepts to understand:

CPU-bound jobs will spend most of their execution time on actual computation ("number crunching") as opposed to e.g. communicating with and waiting for peripherals such as network or storage devices (which would make them I/O bound instead).

In our case, the blocking steps are clear: the user has to wait for our server to receive the uploaded simulation, for the pipeline to run, and for the results to download. The web API therefore spends most of its time waiting for these operations to complete, so our workload is mainly I/O-bound rather than CPU-bound.

Now, let's evaluate each package according to this requirement:

"asyncio is often a perfect fit for IO-bound and high-level structured network code"
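As a rough illustration of why asyncio suits this I/O-bound pattern, the sketch below runs several upload/run/download sequences concurrently in one thread. The function names (`fake_web_step`, `run_task`, `run_all`, `timed_run`) are hypothetical, and `asyncio.sleep` stands in for the real network waits; no actual web API is called.

```python
import asyncio
import time


async def fake_web_step(label: str, seconds: float) -> str:
    # Stand-in for a network wait (upload, server run, or download).
    await asyncio.sleep(seconds)
    return label


async def run_task(task_name: str, delay: float) -> str:
    # The three steps of one task must happen in order.
    for step in ("upload", "run", "download"):
        await fake_web_step(f"{task_name}:{step}", delay)
    return task_name


async def run_all(task_names, delay: float):
    # Different tasks are independent, so their waits can overlap.
    return await asyncio.gather(*(run_task(name, delay) for name in task_names))


def timed_run(task_names, delay: float = 0.05):
    start = time.perf_counter()
    results = asyncio.run(run_all(task_names, delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run(["sim_a", "sim_b", "sim_c", "sim_d"])
    print(results, f"{elapsed:.2f} s")
```

Because the waits overlap, the total time should be close to that of a single task rather than the sum over all tasks.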

I have also looked into multiprocess. In my opinion, its documentation and version management are not great: https://multiprocess.readthedocs.io/en/latest/multiprocess.html

Requirements

One of the main things to define is what we want to parallelise and what we don't.

My personal requirements for the 3.0 architecture, based on my current understanding, are:

  • Parallelise uploading and running tasks, up to the limit of the user's bandwidth
    • Be able to manage and keep track of these tasks asynchronously
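One possible way to sketch these two requirements is an `asyncio.Semaphore` that caps the number of simultaneous uploads (a crude proxy for a bandwidth limit) together with a status dictionary for tracking. All names here (`upload_then_run`, `run_bounded`, `max_uploads`) are hypothetical, and `asyncio.sleep` replaces the real upload and server run.

```python
import asyncio


async def upload_then_run(name, upload_slots, status, counters, delay):
    # Only a limited number of uploads may be in flight at once.
    async with upload_slots:
        counters["active"] += 1
        counters["peak"] = max(counters["peak"], counters["active"])
        status[name] = "uploading"
        await asyncio.sleep(delay)  # stand-in for the upload
        counters["active"] -= 1
    # The server-side run does not use local bandwidth, so it is not limited.
    status[name] = "running"
    await asyncio.sleep(delay)  # stand-in for the server run
    status[name] = "done"
    return name


async def run_bounded(names, max_uploads=2, delay=0.01):
    upload_slots = asyncio.Semaphore(max_uploads)
    status = {name: "queued" for name in names}
    counters = {"active": 0, "peak": 0}
    tasks = [
        asyncio.create_task(
            upload_then_run(name, upload_slots, status, counters, delay)
        )
        for name in names
    ]
    finished = []
    # Handle tasks in completion order, as a progress tracker would.
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)
    return finished, status, counters["peak"]


if __name__ == "__main__":
    finished, status, peak = asyncio.run(run_bounded(["a", "b", "c", "d", "e"]))
    print(finished, status, peak)
```

The `status` dictionary can be inspected at any time while the tasks are in flight, which is the kind of asynchronous bookkeeping the second bullet asks for.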

Implementation caveats

  • By default, asyncio runs its event loop in a single thread. Ideally, we want to leverage multiple CPU threads and all of the available internet bandwidth to parallelize our operations.
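A common way around the single-thread caveat, when the underlying calls are blocking, is to hand them to a thread pool from inside the event loop with `loop.run_in_executor`. The sketch below assumes a hypothetical blocking function `blocking_download`, with `time.sleep` standing in for a blocking network call.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_download(name: str, seconds: float):
    # Stand-in for a blocking network call.
    time.sleep(seconds)
    return name, threading.current_thread().name


async def download_all(names, max_workers: int = 4, seconds: float = 0.1):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each blocking call runs in a worker thread; the event loop stays free.
        futures = [
            loop.run_in_executor(pool, blocking_download, name, seconds)
            for name in names
        ]
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    for name, thread_name in asyncio.run(download_all(["a", "b", "c", "d"])):
        print(name, "ran on", thread_name)
```

Threads help here because the work is I/O-bound; truly CPU-bound steps would need separate processes in standard CPython, since the global interpreter lock prevents threads from executing Python bytecode in parallel.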

daquinteroflex added the "3.0" label (will eventually go into version 3.0) Apr 16, 2024
tylerflex removed the "feature" label May 24, 2024