Eval: missing real generation_power values #110

Open
JasonFengGit opened this issue Apr 21, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@JasonFengGit

Describe the bug

In evaluation, some of the real/expected values of generation_power are missing.

To Reproduce

Steps to reproduce the behavior:

  1. Run python scripts/run_evaluation.py with the following testset.csv (a small test to illustrate the bug):
pv_id,timestamp
9531,2021-05-08 10:00:00
  2. Some values are missing from the generation_power column of results.csv (horizons 0 to 6 in the example below).
    Example results.csv:
,forecast_power,horizon_hour,pv_id,timestamp,generation_power
0,0.5382338261787198,0,9531,2021-05-08 10:00:00,
1,0.6805504837540712,1,9531,2021-05-08 11:00:00,
2,0.6950511506600507,2,9531,2021-05-08 12:00:00,
3,0.7507192765284325,3,9531,2021-05-08 13:00:00,
4,0.6222327619232007,4,9531,2021-05-08 14:00:00,
5,0.46010747864610435,5,9531,2021-05-08 15:00:00,
6,0.2792985706278065,6,9531,2021-05-08 16:00:00,
7,0.11883538094408863,7,9531,2021-05-08 17:00:00,0.19273080444335938
8,0.03377143967258781,8,9531,2021-05-08 18:00:00,0.05239992141723633
9,0.004003063439732276,9,9531,2021-05-08 19:00:00,0.0
10,0.0,10,9531,2021-05-08 20:00:00,0.0
11,0.0,11,9531,2021-05-08 21:00:00,0.0
12,0.0,12,9531,2021-05-08 22:00:00,0.0
13,0.0,13,9531,2021-05-08 23:00:00,0.0
14,0.0,14,9531,2021-05-09 00:00:00,0.0
15,0.0,15,9531,2021-05-09 01:00:00,0.0
16,0.0,16,9531,2021-05-09 02:00:00,0.0
17,0.0,17,9531,2021-05-09 03:00:00,0.0
18,0.0006960749166189652,18,9531,2021-05-09 04:00:00,0.0
19,0.021830932182701164,19,9531,2021-05-09 05:00:00,0.002466707944869995
20,0.04920016630787139,20,9531,2021-05-09 06:00:00,0.12896760559082032
21,0.16425460389406232,21,9531,2021-05-09 07:00:00,0.22877279663085937
22,0.2536578989915163,22,9531,2021-05-09 08:00:00,0.8414171752929688
23,0.3202140667660062,23,9531,2021-05-09 09:00:00,0.6911544189453125
24,0.6471341332970747,24,9531,2021-05-09 10:00:00,0.8355504150390625
25,0.7728203006501675,25,9531,2021-05-09 11:00:00,1.15409765625
26,0.6856276972650501,26,9531,2021-05-09 12:00:00,0.6737999877929688
27,0.7735971877911895,27,9531,2021-05-09 13:00:00,1.11731640625
28,0.6681219518935074,28,9531,2021-05-09 14:00:00,0.20179200744628906
29,0.49810158614186933,29,9531,2021-05-09 15:00:00,0.45828359985351563
30,0.3536980181332593,30,9531,2021-05-09 16:00:00,0.35039999389648435
31,0.19379396872601617,31,9531,2021-05-09 17:00:00,0.2593247985839844
32,0.05294271353381089,32,9531,2021-05-09 18:00:00,0.17835600280761718
33,0.00577927292344424,33,9531,2021-05-09 19:00:00,0.07551947784423828
34,0.0,34,9531,2021-05-09 20:00:00,3.235164058423834e-09
35,0.0,35,9531,2021-05-09 21:00:00,0.0
36,0.0,36,9531,2021-05-09 22:00:00,0.0
37,0.0,37,9531,2021-05-09 23:00:00,0.0
38,0.0,38,9531,2021-05-10 00:00:00,0.0
39,0.0,39,9531,2021-05-10 01:00:00,0.0
40,0.0,40,9531,2021-05-10 02:00:00,0.0
41,0.0,41,9531,2021-05-10 03:00:00,0.0
42,0.0016835594981394644,42,9531,2021-05-10 04:00:00,0.0
43,0.04807132423975142,43,9531,2021-05-10 05:00:00,0.01917263984680176
44,0.2019059924841576,44,9531,2021-05-10 06:00:00,0.20261639404296874
45,0.4591377241020738,45,9531,2021-05-10 07:00:00,0.33280679321289064
46,0.7547477658079034,46,9531,2021-05-10 08:00:00,0.34174200439453123
47,1.068172900817906,47,9531,2021-05-10 09:00:00,0.9841751708984375
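
A minimal sketch for listing the affected rows, assuming results.csv has the columns shown above and that a missing value is written as an empty field. `find_missing_rows` is a hypothetical helper, not part of the repository:

```python
import csv
import io


def find_missing_rows(csv_file, column="generation_power"):
    """Return the rows whose `column` field is empty (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if (row.get(column) or "").strip() == ""]


# Three rows copied from the example results.csv above.
sample = io.StringIO(
    ",forecast_power,horizon_hour,pv_id,timestamp,generation_power\n"
    "6,0.2792985706278065,6,9531,2021-05-08 16:00:00,\n"
    "7,0.11883538094408863,7,9531,2021-05-08 17:00:00,0.19273080444335938\n"
    "8,0.03377143967258781,8,9531,2021-05-08 18:00:00,0.05239992141723633\n"
)
missing = find_missing_rows(sample)
for row in missing:
    print(row["pv_id"], row["timestamp"], "horizon", row["horizon_hour"])
```

To check a real file, pass `open("results.csv", newline="")` instead of the in-memory sample.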

Expected behavior

No missing values (or maybe some fallbacks to handle missing values).
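
One possible fallback, sketched under the assumption that the metric of interest is mean absolute error: skip the rows with no real value and report how many were skipped, so the gap stays visible. `mae_ignoring_missing` is a hypothetical name, not a function from the repository:

```python
import math


def mae_ignoring_missing(forecasts, actuals):
    """Mean absolute error over pairs whose actual value is present.

    Returns (mae, n_used, n_skipped); mae is NaN when no pair is usable.
    """
    errors = []
    skipped = 0
    for forecast, actual in zip(forecasts, actuals):
        if actual is None or (isinstance(actual, float) and math.isnan(actual)):
            skipped += 1
            continue
        errors.append(abs(forecast - actual))
    mae = sum(errors) / len(errors) if errors else float("nan")
    return mae, len(errors), skipped


# Values rounded from rows 6 to 8 of the example above; row 6 has no real value.
forecasts = [0.279, 0.119, 0.034]
actuals = [None, 0.193, 0.052]
mae, n_used, n_skipped = mae_ignoring_missing(forecasts, actuals)
print(f"MAE={mae:.3f} over {n_used} rows, {n_skipped} skipped")
```

Returning the skipped count alongside the metric makes it harder to overlook that the score covers only part of the test set.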

@JasonFengGit added the bug label Apr 21, 2024
@peterdudfield
Contributor

Thanks @JasonFengGit for this

We'll have to think about how to create a new test dataset that doesn't have any missing generation values.

@JasonFengGit
Author

JasonFengGit commented Apr 22, 2024

We could filter out timestamps with missing values, but that would introduce some biases that are hard to analyze.

@peterdudfield
Contributor

> We could filter out timestamps with missing values, but that would introduce some hard to explain bias.

I think we could filter out the missing ones and introduce new ones.
As long as we then do some analysis on the new test set and check it's not biased, it should be OK.
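
A rough sketch of that filter-and-replace step, assuming test rows are (pv_id, timestamp) pairs and that some check exists for whether real generation values cover the whole forecast window. Both `rebuild_testset` and the `has_full_generation` callable are hypothetical, and the pv ids below are placeholders rather than real systems:

```python
import random


def rebuild_testset(testset, has_full_generation, candidates, seed=0):
    """Drop rows without complete generation data, then top up from candidates.

    testset, candidates: lists of (pv_id, timestamp) tuples.
    has_full_generation: callable returning True when real values exist for
    the whole forecast window starting at that row (hypothetical check).
    """
    kept = [row for row in testset if has_full_generation(row)]
    n_needed = len(testset) - len(kept)
    kept_set = set(kept)
    pool = [
        row for row in candidates
        if row not in kept_set and has_full_generation(row)
    ]
    # Fixed seed so the rebuilt test set is reproducible.
    rng = random.Random(seed)
    replacements = rng.sample(pool, min(n_needed, len(pool)))
    return kept + replacements


# Toy data: only the rows listed in `complete` have full generation values.
complete = {("pv_b", "2021-05-09 10:00:00"), ("pv_c", "2021-05-10 05:00:00")}
testset = [("pv_a", "2021-05-08 10:00:00"), ("pv_b", "2021-05-09 10:00:00")]
candidates = [("pv_c", "2021-05-10 05:00:00")]
new_testset = rebuild_testset(testset, lambda row: row in complete, candidates)
print(new_testset)
```

Sampling replacements at random keeps the new rows from being hand-picked, but the rebuilt set would still need the bias analysis mentioned above.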

What biases were you thinking about?

@JasonFengGit
Author

For example, the missing values might be due to similar reasons and could share some patterns that are either easier or harder to predict, thereby making the evaluation biased.

@peterdudfield
Contributor

> For example, the missing values might be due to similar reasons and could share some patterns that are either easier or harder to predict, thereby making the evaluation biased.

Ah, I see. From what I've seen, they are normally quite random, as they are all random PV panels throughout the UK. But we can check this.
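
A first look at whether the gaps are random could compare the missing rate across groups, for example by hour of day or by pv_id. The sketch below assumes rows shaped like results.csv records with an empty string for a missing value; `missing_rate_by` is a hypothetical helper, and this is a descriptive check rather than a statistical test:

```python
from collections import defaultdict


def missing_rate_by(rows, key):
    """Fraction of rows with no generation_power, grouped by key(row)."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        group = key(row)
        totals[group] += 1
        if row["generation_power"] in ("", None):
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Four rows taken from the example results.csv in this issue.
rows = [
    {"pv_id": "9531", "timestamp": "2021-05-08 16:00:00", "generation_power": ""},
    {"pv_id": "9531", "timestamp": "2021-05-08 17:00:00", "generation_power": "0.19273080444335938"},
    {"pv_id": "9531", "timestamp": "2021-05-09 16:00:00", "generation_power": "0.35039999389648435"},
    {"pv_id": "9531", "timestamp": "2021-05-09 17:00:00", "generation_power": "0.2593247985839844"},
]
# Group by hour of day, read from the "YYYY-MM-DD HH:MM:SS" timestamp string.
rates = missing_rate_by(rows, key=lambda row: row["timestamp"][11:13])
print(rates)
```

Roughly equal rates across groups on the full dataset would be consistent with random gaps; a few groups with much higher rates would suggest a shared cause.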

@JasonFengGit
Author

Oh OK! That would make it easier.

@zakwatts
Contributor

@JasonFengGit Nice spot! Thanks for this.
