add conditions to perform #50

Open
silva96 opened this issue Jan 26, 2017 · 0 comments

Hi, we have a website that fetches bus_travels from multiple bus operators. For each route there may be several bus operators that have to be queried.

Right now, I have the condition inside each BusOperatorWorker.

Superworker.define(:BusGatewaySuperWorker, :bus_travel_id, :route_bus_operators_internal_names) do
  # mark work as started.
  StartWorker :bus_travel_id do
    # concurrent
    parallel do
      BusOperatorNameNumberOneWorker(:bus_travel_id, :route_bus_operators_internal_names)
      BusOperatorNameNumberTwoWorker(:bus_travel_id, :route_bus_operators_internal_names)
      BusOperatorNameNumberThreeWorker(:bus_travel_id, :route_bus_operators_internal_names)
      # . . .
      BusOperatorNameNumberThirtyWorker(:bus_travel_id, :route_bus_operators_internal_names)
    end

    # mark work as done.
    EndWorker :bus_travel_id
  end
end

And this is the worker for one bus operator:

require 'sidekiq'

class BusOperatorNameNumberOneWorker
  include Sidekiq::Worker
  sidekiq_options :queue => :crawler, :retry => false

  def perform(bus_travel_id, route_bus_operators_internal_names)
    # Do nothing unless this worker's bus operator was requested for the route.
    return unless route_bus_operators_internal_names.include?('name_number_one')
    # . . . here fetch from bus operator servers if we called the worker for the correct bus operator.
  end
end

This way I can call perform_async for several bus operators, and it works.

But right now I have 30 workers inside the parallel block. Each time a search is performed, all 30 workers are launched and evaluated (at least to check whether the bus operator matches; if it doesn't, they return).

So imagine we have 400 people making searches: that's 30 * 400 = 12,000 workers, and my Sidekiq is blowing up. We cache the search results, but whenever we clear the cache, Sidekiq goes crazy.

This is the use case:

In route A, only BusOp1, BusOp2 and BusOp3 out of the 30 bus operators can do the route. We call the superworker with route_bus_operators_internal_names = ['bus_one', 'bus_two', 'bus_three'] and all 30 workers are launched in parallel. The first 3 workers actually perform the search, and the other 27 just return because they don't match any operator internal name.

The problem is that even though workers 4 to 30 are not related to any entry in the route_bus_operators_internal_names array, they are still launched just to evaluate this.
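To put rough numbers on it, here is a small sketch of the job counts. It assumes every one of the 400 searches matches exactly 3 operators, as in the route A example; that is only an illustration, since real routes will vary. The method name operator_job_counts is made up for this example, and the StartWorker/EndWorker jobs are not counted.

```ruby
# Hypothetical helper: counts operator jobs only (StartWorker/EndWorker excluded).
def operator_job_counts(total_workers:, searches:, matching_per_search:)
  enqueued = total_workers * searches        # every operator worker is launched
  useful   = matching_per_search * searches  # only these actually fetch anything
  { enqueued: enqueued, useful: useful, wasted: enqueued - useful }
end

# Assumption: 3 matching operators per search, as in route A.
counts = operator_job_counts(total_workers: 30, searches: 400, matching_per_search: 3)
puts counts
```

Under that assumption, 10,800 of the 12,000 jobs start only to return immediately.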

I want to evaluate this condition inside the superworker, so that it decides which workers to run in parallel without having to launch them all.
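Until something like this exists in the superworker DSL, one possible workaround is to pick the workers before enqueuing anything. The sketch below is plain Ruby and does not use any Superworker feature. FakeOperatorWorker is only a stand-in so the example runs without Sidekiq; in the real app the classes would be the existing workers that include Sidekiq::Worker. The OPERATOR_WORKERS registry, the enqueue_matching_operators method, and the 'name_number_two' / 'name_number_three' keys are hypothetical names introduced here.

```ruby
# Stand-in for Sidekiq workers (assumption: real workers include Sidekiq::Worker
# and already respond to perform_async). It just records the arguments.
class FakeOperatorWorker
  def self.enqueued
    @enqueued ||= []
  end

  def self.perform_async(*args)
    enqueued << args
  end
end

class BusOperatorNameNumberOneWorker < FakeOperatorWorker; end
class BusOperatorNameNumberTwoWorker < FakeOperatorWorker; end
class BusOperatorNameNumberThreeWorker < FakeOperatorWorker; end

# Hypothetical registry: operator internal name => worker class.
OPERATOR_WORKERS = {
  'name_number_one'   => BusOperatorNameNumberOneWorker,
  'name_number_two'   => BusOperatorNameNumberTwoWorker,
  'name_number_three' => BusOperatorNameNumberThreeWorker
}.freeze

# Enqueue only the workers whose operator is in the route's list.
# Unknown names are ignored. Returns the worker classes that were enqueued.
def enqueue_matching_operators(bus_travel_id, route_bus_operators_internal_names)
  workers = OPERATOR_WORKERS.values_at(*route_bus_operators_internal_names).compact
  workers.each do |worker|
    worker.perform_async(bus_travel_id, route_bus_operators_internal_names)
  end
  workers
end

picked = enqueue_matching_operators(1, ['name_number_one', 'name_number_three'])
puts picked.map(&:name).inspect
```

This sidesteps the superworker instead of adding conditions to it, so the StartWorker / EndWorker sequencing would have to be handled some other way. It only shows that the matching can happen before any job is launched.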
