Calculating number of years for PR metrics using irregular time index #427

Open
RenanBRibeiro opened this issue Mar 15, 2024 · 3 comments

@RenanBRibeiro

When working with data that has highly irregular time steps, such as altimetry data, the calculation of the number of years in metrics.py can give an inaccurate result. That number feeds into how many peaks are identified and ranked, so the Peak Ratio (PR) can end up being computed from a single event.

Adding print statements and a small modification to the code shows the difference:

    # Calculate number of years (existing code: modal time step times number of samples)
    dt_int = time[1:].values - time[0:-1].values
    dt_int_mode = float(stats.mode(dt_int, keepdims=False)[0]) / 1e9  # in seconds
    N_years = dt_int_mode / 24 / 3600 / 365.25 * len(time)
    # changed code starts here
    print(N_years)
    dt = time[-1] - time[0]  # total span of the record
    N_years = dt.days / 365.25
    print(N_years)
    # changed code ends here

Output

0.18286146601769462
10.992470910335387

My data runs from 2013 to 2023, so about 11 years.

I understand that this simple calculation dt = time[-1] - time[0] can also cause problems when we have many gaps in the data, but in this case it works well.
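To make the difference between the two estimates concrete, here is a minimal, dependency-free sketch on synthetic data. The function names (`years_from_mode`, `years_from_span`) and the sampling pattern (a short burst of 1 s samples every 10 days) are assumptions for illustration only; they are not taken from metrics.py or from real altimetry data. The same arithmetic applies to a pandas DatetimeIndex.

```python
from collections import Counter
from datetime import datetime, timedelta

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_from_mode(times):
    """Most common time step multiplied by the number of samples."""
    steps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    mode_step = Counter(steps).most_common(1)[0][0]
    return mode_step * len(times) / SECONDS_PER_YEAR


def years_from_span(times):
    """Elapsed time between the first and the last sample."""
    return (times[-1] - times[0]).total_seconds() / SECONDS_PER_YEAR


# Hypothetical sampling: 400 passes, 10 days apart, 5 samples per pass at 1 s spacing.
start = datetime(2013, 1, 1)
times = [
    start + timedelta(days=10 * i, seconds=j)
    for i in range(400)
    for j in range(5)
]

print(years_from_mode(times))  # tiny: the modal step is the 1 s spacing inside a pass
print(years_from_span(times))  # close to the real record length
```

The mode-based estimate assumes the record is essentially regular, so it collapses when the most common step is much shorter than the typical spacing. The span-based estimate ignores gaps entirely, which is the caveat mentioned above.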

@jsmariegaard
Member

@daniel-caichac-DHI - peak ratio sounds like something for you? Any comments on this?

@daniel-caichac-DHI
Collaborator

daniel-caichac-DHI commented Apr 4, 2024

Agree with Renan: if you have many gaps in your data, the peak ratio could be wrong because the number of years is not correct.
This could actually be an improvement.
Now, on the altimetry topic: the PR on altimetry data is not good anyway, though for a different reason. The peak-over-threshold algorithm will mark things as peaks that are not necessarily peaks. Let me show an example picture.

Black line: Continuous measurement from a non-existent buoy at a point
Red line: Altimetry data at the ~same location (lots of gaps)
Blue points: Detected points.

As you can see, P1 could be falsely detected as a peak, because it is more than 36 h away from the other peaks and fulfills all the criteria.
Peak Ratio on discontinuous altimetry data is dangerous.
image
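The effect in the picture can be reproduced with a small synthetic example. This is only a sketch: `pot_peaks` is a simplified declustering written for illustration (exceedances grouped when they are no more than 36 h apart, keeping the maximum of each group), not the library's implementation, and the storm shape, threshold and overpass times are made-up values.

```python
import math
from datetime import datetime, timedelta


def pot_peaks(times, values, threshold, min_separation=timedelta(hours=36)):
    """Simplified peak-over-threshold: cluster exceedances that are at most
    min_separation apart and keep the largest value of each cluster."""
    peaks, cluster = [], []
    for t, v in zip(times, values):
        if v <= threshold:
            continue
        if cluster and t - cluster[-1][0] > min_separation:
            peaks.append(max(cluster, key=lambda p: p[1]))
            cluster = []
        cluster.append((t, v))
    if cluster:
        peaks.append(max(cluster, key=lambda p: p[1]))
    return peaks


def storm(hour):
    """Hypothetical storm: background of 1.0 with a peak of 5.0 at hour 48."""
    return 1.0 + 4.0 * math.exp(-(((hour - 48) / 12) ** 2))


t0 = datetime(2020, 1, 1)
threshold = 2.0  # assumed value, for illustration only

# "Buoy": hourly samples over four days.
dense_hours = range(97)
dense_peaks = pot_peaks(
    [t0 + timedelta(hours=h) for h in dense_hours],
    [storm(h) for h in dense_hours],
    threshold,
)

# "Altimeter": three overpasses, one of them on the rising flank of the storm.
sparse_hours = [12, 36, 84]
sparse_peaks = pot_peaks(
    [t0 + timedelta(hours=h) for h in sparse_hours],
    [storm(h) for h in sparse_hours],
    threshold,
)

print(dense_peaks)   # one peak, at the true maximum of the storm
print(sparse_peaks)  # one "peak", which is really a point on the flank
```

Both series yield exactly one detected peak, but in the sparse series the detected value sits well below the real storm maximum, because nothing was sampled near hour 48.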

Palle is a little bit more blunt about it and is saying it does not make sense to make a peak-ratio on satellite data.
Maybe Renan could ask him for more details?

@RenanBRibeiro
Author

Thank you Daniel, for the feedback.
Regarding PR on altimetry: it's true! It makes no sense to calculate the PR with this type of data. Thanks again!
