-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(fo): add historical production data for FO parser #6759
feat(fo): add historical production data for FO parser #6759
Conversation
Very nice! I'll review this later today (and try and get through some of the other parser upgrades) 😅 |
parsers/FO.py
Outdated
|
||
# historical API has a ~2:25h delay, so supplement with realtime API data if applicable | ||
# note that this will still leave a gap, but at least it will provide a realtime signal | ||
if target_datetime > production_breakdown_list.events[-1].datetime: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might not be obvious in the diff, but the contents of this block are essentially the old parser code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd try and split these into two functions, that both return a ProductionBreakdownList, one for the historical data and one for the realtime data.
Then you can return it with return ProductionBreakdownList.update(realtime, historical, logger).to_list()
which will ensure the historical data is always the one that is used even if there for some reason was a delay in the realtime data or it stopped updating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points! I've updated it but I feel like it's still not quite what you were thinking of so it might need another pass 😅 (I've still kept the conditional because I wanted to make sure we are still making a call to the realtime API only if it's needed e.g. no point if requesting data from the previous week)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take another look as soon as possible.
Some early feedback before the full review: Generally we want to avoid this as it makes overwriting the data more or less impossible but it was not really an option before. Now that it is an option I think we should just remove the old "live" parser code and just use the historical parser. Once we have historical data it will allow us to enable estimation for the zone which should cover any potential gap that might occur. EDIT: Flooring the realtime data to the hour is also an option as it would later be corrected. Might be preferable depending on how much historical data is available. |
I see! Thanks for the background on how things work 🙌🏽 From my little experimentation with this new API I think that in general the data for the hour is published at hour + 2:25h, so I guess that technically it actually does not satisfy the parsers' data 'freshness' requirements (e.g. in the worst case scenario at e.g. 8 UTC the latest available historical data is 5 UTC) but I thought it could be useful to at least have this shared somewhere in case it can be useful / the max 3h delay is still good enough. If I understood correctly if a parser doesn't return any data within the most recent two hours the zone is not coloured in the app? Flooring could be a nice trick to have it coloured while also having historical data! (and then with the example above there would just be the 6 UTC and 7 UTC datapoints missing, with the 8 UTC being the instantaneous one) |
There should be no data missing as the live data would populate the database continuously and then the historical data would override them when they match. The freshness requirement is less of a requirement and more of a guideline now that we support historical data in the app. But it seems we can get both here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good! And is working as expected!
The earliest data seem to be from datetime.datetime(2021, 12, 31, 23, 0, tzinfo=datetime.timezone.utc)
so let's keep the realtime parser as I don't think that is enough historical data to enable TSA estimations.
I have two small comments though to help with the readability and performance of the parser.
parsers/FO.py
Outdated
|
||
# historical API has a ~2:25h delay, so supplement with realtime API data if applicable | ||
# note that this will still leave a gap, but at least it will provide a realtime signal | ||
if target_datetime > production_breakdown_list.events[-1].datetime: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd try and split these into two functions, that both return a ProductionBreakdownList, one for the historical data and one for the realtime data.
Then you can return it with return ProductionBreakdownList.update(realtime, historical, logger).to_list()
which will ensure the historical data is always the one that is used even if there for some reason was a delay in the realtime data or it stopped updating.
… merge favouring historical
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🎉
Thanks for doing this and sorry about the review delay, had some final exams in Uni this past week I needed to focus on, I hope you understand. 🙂
Description
If it can be helpful, this PR enhances the
FO
(Faroe Islands) parser to be able to fetch production data for past dates.Currently, the parser can only fetch live "instantaneous" production data, but new APIs are available on https://www.sev.fo/framleidsla/hagtoel that make available historical data too 🙌🏽
In particular, this includes an API that offers production data with hourly granularity over windows of 48h!
https://www.sev.fo/api/elproduction/search?type=item&group=powerplant&from=20230215170000&to=20230216170000
The data returned is itemised by power plant name, production mode, hour, and area ('main' islands vs. 'south' islands) making it really nice to work with:
e.g.
There is also an API that returns directly the aggregated production by production mode for each hour of the day:
but unfortunately the data cannot be split up into the subareas so might as well just use the unaggregated API to have access to all of the raw information directly and to the aggregations ourselves.
Now the
FO
parser will return hourly data for the target date of interest + the previous day!Note that the historical API seems to make available hourly data with a delay of ~2:25h, so if requesting data for a target_date that is "live", I have set up the parser so that the historical data for the current day is supplemented with an instantaneous datapoint from the "live" API (the one that was being used exclusively so far). This will still leave a gap of missing data for 1 or 2 hours, but it should already be much better than just having access to live data.
I have also noticed on the app that for some reason it seems like Faroe Islands have not been generating any solar power (e.g. looking at the last year) but they definitely seem to be having solar production in the south of the island, so I wonder if there is a problem in how the "live" API data has been parsed so far 👀 This new API we definitely returns solar data so it might help on that front too!
Double check
poetry run test_parser "zone_key"
pnpx prettier@2 --write .
andpoetry run format
in the top level directory to format my changes.