Adding a cumulative energy check function #165

Open
kperrynrel opened this issue Oct 17, 2022 · 3 comments

@kperrynrel
Collaborator

Many of the data streams I deal with on fleets are AC energy streams, and frequently they're cumulative and always increasing. I need to correct these streams by differencing them so they look like ordinary interval energy data. I wrote a simple little function that checks whether the data is almost always increasing (the threshold is a parameter, defaulting to increasing 95% of the time) and differences the data if so. Here it is:

import warnings


def cumulativeEnergyCheck(energy_series, pct_increase_threshold=95):
    """
    Check if an energy time series represents cumulative energy or not.

    Returns the series (differenced if it was judged cumulative) and a
    boolean flag saying whether it was judged cumulative.
    """
    differenced_series = energy_series.diff().dropna()
    # Default to False so the flag is always defined when we return.
    cumulative_energy = False
    if len(differenced_series) == 0:
        warnings.warn("The energy time series has no valid differences and "
                      "cannot be checked.")
    else:
        # If over X percent of the data is increasing (set via the
        # pct_increase_threshold), then assume that the column is cumulative.
        # Steps down to -0.5 are tolerated as "not decreasing".
        differenced_series_positive_mask = (differenced_series >= -.5)
        # mean() of a boolean mask avoids a KeyError when no value is True.
        pct_over_zero = differenced_series_positive_mask.mean() * 100
        if pct_over_zero >= pct_increase_threshold:
            energy_series = energy_series.diff()
            cumulative_energy = True
    return energy_series, cumulative_energy

I'd like to adapt this and add it into PVAnalytics. @cwhanse and @kanderso-nrel what do you think?

@kperrynrel kperrynrel self-assigned this Oct 17, 2022
@kandersolar
Member

I think inferring whether an energy series is cumulative is distinct enough from estimating the corresponding interval energy series that the two should be implemented in separate functions. I also think properly calculating the interval energy corresponding to a cumulative energy series can sometimes be complicated, or at least more complicated than just energy.diff() (happy to discuss more if you want).

I'm also a little unhappy about that magic -.5 value. Cumulative inverter time series typically have 0.0 at night, but system meter data may or may not include the effect of nighttime system self-consumption and can tick downwards at a unit- and timescale-dependent rate that may or may not be allowed by a hardcoded -.5. I suggest an optional parameter for that value rather than hardcoding it.
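To illustrate the unit dependence described above, here is a minimal sketch. The helper name `pct_non_decreasing`, its `tolerance` parameter, and the meter readings are all made up for illustration; none of this is an existing PVAnalytics function. It shows the same hypothetical meter stream passing a 0.5 tolerance when reported in kWh and failing it when reported in Wh.

```python
import pandas as pd


def pct_non_decreasing(energy_series, tolerance=0.0):
    """Percent of steps whose change is at least -tolerance.

    Hypothetical helper for illustration only.
    """
    diffs = energy_series.diff().dropna()
    if diffs.empty:
        return float("nan")
    return float((diffs >= -tolerance).mean() * 100)


# Made-up cumulative meter readings in kWh: three daytime production steps,
# then three nighttime steps where self-consumption ticks the register down
# by about 0.02 kWh each.
meter_kwh = pd.Series([100.0, 101.0, 102.0, 103.0, 102.98, 102.96, 102.94])
# The same readings expressed in Wh.
meter_wh = meter_kwh * 1000

print(pct_non_decreasing(meter_kwh, tolerance=0.5))  # 100.0
print(pct_non_decreasing(meter_wh, tolerance=0.5))   # 50.0
```

With a 95% threshold, the kWh stream would be flagged as cumulative and the Wh stream would not, even though they describe the same meter. A caller who knows the units could pass a larger tolerance for the Wh stream.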

What module would this function be added to?

@kperrynrel
Collaborator Author

@kanderso-nrel I agree that we should turn this into two functions: one for determining whether a stream is cumulative, and one for correcting a stream that is cumulative when we don't want it to be. I'm totally open to suggestions other than the .diff() option, which was my quick-and-dirty implementation in fleets. I do like the idea of making that -.5 a parameter. It was empirically derived after I checked a bunch of data sets, but it shouldn't be set in stone.

I'm thinking this doesn't actually fit in an existing module right now, so we'd have to make one for it. Say energy.py in the quality folder?
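The two-function split discussed above could look roughly like the sketch below. The names `is_cumulative_energy`, `cumulative_to_interval_energy`, and `decrease_tolerance` are placeholders invented here, not an agreed PVAnalytics API, and the conversion step is still the plain `.diff()` that the thread notes may be too simple in some cases.

```python
import pandas as pd


def is_cumulative_energy(energy_series, pct_increase_threshold=95.0,
                         decrease_tolerance=0.5):
    """Return True if the series looks like cumulative energy.

    Hypothetical signature. decrease_tolerance replaces the hardcoded -.5
    and is in the same units as energy_series.
    """
    diffs = energy_series.diff().dropna()
    if diffs.empty:
        # Assumption: too little data means "not cumulative" rather than
        # an error; raising or warning would be equally defensible.
        return False
    pct_non_decreasing = (diffs >= -decrease_tolerance).mean() * 100
    return bool(pct_non_decreasing >= pct_increase_threshold)


def cumulative_to_interval_energy(energy_series):
    """Convert cumulative energy to interval energy.

    Placeholder implementation using a plain difference.
    """
    return energy_series.diff()


# Made-up example data.
cumulative = pd.Series([0.0, 1.5, 3.0, 3.0, 4.2])
interval = pd.Series([1.5, 1.5, 0.0, 1.2, 0.0, 2.0])

for series in (cumulative, interval):
    if is_cumulative_energy(series):
        series = cumulative_to_interval_energy(series)
    print(series.tolist())
```

Keeping the check free of side effects lets a caller inspect the flag and choose a different conversion routine without touching the detection logic.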

@kandersolar
Member

Here's a short example of appropriate treatment of two ways cumulative energy is often reported: https://gist.github.com/kanderso-nrel/672763bb0dc8c23432a2072ba011ce9f
