Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resize broadway pipeline #228

Open
fcevado opened this issue Apr 1, 2021 · 8 comments
Open

Resize broadway pipeline #228

fcevado opened this issue Apr 1, 2021 · 8 comments

Comments

@fcevado
Copy link

fcevado commented Apr 1, 2021

Discussing on broadway_rabbitmq @josevalim commented about this possibility.

I was thinking that restarting everything could be avoided by changing BatchProcessorSupervisor and ProcessorSupervisor to a :simple_one_for_one supervisor(or a DynamicSupervisor), but I guess it would require broadway callbacks to be stateful. At least it's what I'm getting from broadway architecture docs

The idea is that if any process fails, we want to restart the rest of the tree. Since Broadway callbacks are stateless, we can handle errors and provide reports without crashing processes.

@josevalim
Copy link
Member

DynamicSupervisor is not needed. You can add children to any supervisor while they are running. It won't be as fast as a dynamic one - but this is not like an operation you would do every second.

The biggest issue with adding this feature is actually making sure the new entries supervise to their producers and their children supervise to them - and how to handle failures if anything goes wrong in this process.

There is another approach which is to restart the rest of the tree except the producer. This will be necessary whenever there is a PartitionDispatcher, since the number of partitions are specified upfront.

@fcevado
Copy link
Author

fcevado commented Apr 1, 2021

You can add children to any supervisor while they are running.

I was thinking of a scenario where you want to scale down. A :one_for_all supervisor would always require to restart the entire tree when doing that, right?

@josevalim
Copy link
Member

@fcevado no, you can still stop children in a regular supervisor without restarting. The restart is always about unexpected failures.

@guigaoliveira
Copy link
Contributor

I'm not sure I understand correctly, but changing broadway configuration dynamically can be interesting to optimize pipeline according to metrics and then have best possible typology while it's running. Or change values according to specific scenarios (usually based on some metric), not necessarily optimizing, but increasing / decreasing parameters. If it is possible not only change concurrency of processors, but also other parameters such as batch_size, batch_timeout, etc.

@PranavRam
Copy link

I'm not sure I understand correctly, but changing broadway configuration dynamically can be interesting to optimize pipeline according to metrics and then have best possible typology while it's running. Or change values according to specific scenarios (usually based on some metric), not necessarily optimizing, but increasing / decreasing parameters. If it is possible not only change concurrency of processors, but also other parameters such as batch_size, batch_timeout, etc.

This would be really useful. Are you thinking something as simple as updating the config (producer + processor concurrency and min_demand, max_demand, etc) either sync or async?

@josevalim
Copy link
Member

The only way to do that at the moment would be by restarting the pipeline. This is mostly OK except for Kafka which doesn't cope well with constant reconections.

@fcevado
Copy link
Author

fcevado commented Sep 17, 2021

@josevalim I was thinking, just increasing the amount of childs of the ProcessorSupervisor and restarting the BatchProcessorSupervisor wouldn't do it? Given the genstage subscription structure, just the BatchProcessor needs to know about all the Processors, is that right? I guess it's not needed to mess up with the the producer section of the supervision tree.

@josevalim
Copy link
Member

Not quite. You still need to go through the rest of the pipeline and tell them to subscribe to the new processors. Sizing down has its own challenges as you need to subscribe, drain, and then terminate. To be clear, it is all doable, but it is a sizeable amount of work and testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants