
Fallback schedules should get a distinct source #841

Closed
Flix6x opened this issue Sep 11, 2023 · 6 comments

Flix6x commented Sep 11, 2023

In case of an infeasible problem, the StorageScheduler falls back to using the fallback_charging_policy, while the resulting schedule data is still attributed to the StorageScheduler. It would help debugging if we could visually distinguish schedule data originating from one or the other, by saving fallback schedules with their own distinct source.

I propose we refactor fallback_charging_policy into its own FallbackStorageScheduler, and think about how to trigger it. For example:

  1. We could let the original scheduling job fail, and upon failure set up the fallback scheduling job. This would then also require something clever for retrieving schedules (getting a schedule using the ID of the failed job should result in obtaining the fallback job).
  2. Alternatively, we compute the fallback schedule within the same job (thereby dispensing with the issue of scheduling job IDs). We let the StorageScheduler.compute method fail, and trigger the fallback within the services.scheduling.make_schedule method (see the sketch below).
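
A minimal sketch of option (2), assuming a FallbackStorageScheduler with the same constructor signature as StorageScheduler, that each scheduler exposes its own data_source, and a hypothetical exception raised on infeasibility (the actual names inside make_schedule may differ):

def make_schedule(sensor_id, start, end, resolution, **kwargs):
    scheduler = StorageScheduler(sensor_id, start, end, resolution, **kwargs)
    try:
        # Attempt the regular schedule first
        schedule = scheduler.compute()
        data_source = scheduler.data_source
    except InfeasibleProblemException:  # hypothetical exception type
        # Compute the fallback within the same job and attribute the data
        # to the fallback's own (distinct) data source
        fallback = FallbackStorageScheduler(sensor_id, start, end, resolution, **kwargs)
        schedule = fallback.compute()
        data_source = fallback.data_source
    save_schedule(schedule, data_source)  # hypothetical persistence helper
    return schedule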

victorgarcia98 commented Sep 12, 2023

Good idea!

Just wanted to add that another benefit of option (1) is that we'll be able to track whether a certain schedule failed via the RQ job dashboard. This option also allows for recursive fallback mechanisms (in case the fallback itself fails).

Option (2) is simpler, which is nicer IMO.

As an aside, I think we should link the fallback mechanism to the StorageScheduler, for example as a class attribute. That way, we can have a specific fallback for each scheduler. I was thinking of something like:

class StorageScheduler:
    fallback_scheduler: Scheduler
    # ...

    def __init__(self, *args, **kwargs):
        self.fallback_scheduler = StorageSchedulerFallback(*args, **kwargs)
        # ...

    def compute_fallback(self, *args, **kwargs):
        return self.fallback_scheduler.compute(*args, **kwargs)


Flix6x commented Sep 12, 2023

we should link the fallback mechanism to the StorageScheduler, for example, as a class attribute

Good idea.

Also, I realised that fetching schedules is done on the basis of the data source ID of the Scheduler, so the issue of pointing users who are getting a schedule to the right place exists for both options. Luckily, I found that we already store the data source ID on the job after we call the compute function (in make_schedule), so it should be relatively easy to point users who are getting a schedule to the right data, simply by saving the data source ID of the FallbackScheduler on the job.
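
A minimal sketch of that last step, assuming the fallback scheduler exposes a data_source attribute; storing the ID on the job uses RQ's standard meta/save_meta API:

from rq import get_current_job

# After the fallback computes, point the job at the fallback's data source,
# so that fetching the schedule later finds the right data.
job = get_current_job()
job.meta["data_source_id"] = fallback_scheduler.data_source.id
job.save_meta()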

@victorgarcia98

Also, we should add the source ID (maybe the name, too) to the response. Perhaps it's also a good idea to add an open field to share unstructured info regarding the state of the schedule.

An alternative that keeps the job ID but relies on RQ is to set retry = 1 and, if the job fails, check the retry count and call the fallback scheduler. We can even set retry > 1 to allow for multiple retries.

The benefit is that the setup part of the main scheduler can stay the same; instead of calling compute we call compute_fallback (see the sketch below).
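
A rough sketch of this retry-based alternative, using RQ's Retry and the job's retries_left attribute to detect the retry attempt; the scheduler and function names follow the discussion above and are otherwise assumptions:

from rq import Retry, get_current_job

def make_schedule(sensor_id, **kwargs):
    job = get_current_job()
    scheduler = StorageScheduler(sensor_id, **kwargs)
    if job.retries_left == 0:
        # This is the retry attempt: the main compute already failed once,
        # so run the fallback under the same job ID.
        return scheduler.compute_fallback()
    return scheduler.compute()

# Enqueue with a single retry; on failure, RQ re-enqueues the same job.
job = queue.enqueue(make_schedule, sensor_id, retry=Retry(max=1))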


Flix6x commented Sep 12, 2023

My one fear is that retrying the job will make us lose the error trace. But even if it's not lost, I think creating a new job would be cleaner, so a failed compute call doesn't end up showing under a successful job.

victorgarcia98 added this to the 0.16.0 milestone Sep 13, 2023

victorgarcia98 commented Sep 13, 2023

Good point!

Let's create a new job in case of failure and link them via meta attributes.
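
A minimal sketch of linking the two jobs via meta attributes (the meta keys and the fallback job function are made up for illustration):

# In the failure handler of the main scheduling job: enqueue a fallback job
# and cross-link both jobs through their meta attributes.
fallback_job = queue.enqueue(make_fallback_schedule, sensor_id)  # hypothetical job function
failed_job.meta["fallback_job_id"] = fallback_job.id
failed_job.save_meta()
fallback_job.meta["original_job_id"] = failed_job.id
fallback_job.save_meta()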

Regarding the fetching of schedules (the GET /<id>/schedules/<uuid> endpoint), I'm considering two different ways:

  1. Return, in the same response as for the StorageScheduler, the contents of the fallback, adding details in the status message.
  2. Use a redirect to funnel the request to the /<id>/schedules/<uuid-fallback> endpoint with status code 303 (See Other).

I favor the second, so as not to mix both jobs' info (sketched below the RFC quote).

For reference, the status code 303 states:

A 303 response to a GET request indicates that the origin server does
not have a representation of the target resource that can be
transferred by the server over HTTP. However, the Location field
value refers to a resource that is descriptive of the target
resource, such that making a retrieval request on that other resource
might result in a representation that is useful to recipients without
implying that it represents the original target resource.

Source: https://datatracker.ietf.org/doc/html/rfc7231#section-6.4.4
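
A hypothetical Flask-style sketch of option (2), assuming the failed job stores the fallback job ID in its meta (as discussed above); the route and meta key are illustrative rather than the actual FlexMeasures endpoint:

from flask import redirect, url_for
from rq.job import Job

@app.route("/<int:sensor_id>/schedules/<uuid:job_id>", methods=["GET"])
def get_schedule(sensor_id, job_id):
    job = Job.fetch(str(job_id), connection=redis_conn)
    if job.is_failed and "fallback_job_id" in job.meta:
        # Funnel the client to the fallback job's schedule with 303 (See Other)
        return redirect(
            url_for("get_schedule", sensor_id=sensor_id, job_id=str(job.meta["fallback_job_id"])),
            code=303,
        )
    # ... otherwise, serialize and return the schedule computed by this job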


victorgarcia98 commented Oct 23, 2023

Closed as completed by PR #846.

This is covered by the following test:

fallback_schedule = client.get(
    get_schedule_response.headers["location"],
    json={"duration": "PT24H"},
).json

# check that the fallback schedule has the right status and start dates
assert fallback_schedule["status"] == "PROCESSED"
assert parse_datetime(fallback_schedule["start"]) == parse_datetime(start)

models = [
    source.model
    for source in charging_station.search_beliefs().sources.unique()
]
assert "StorageFallbackScheduler" in models
