Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passed max_shift_seconds in CrossCorrelator not coverted to milliseconds #38

Open
earthgecko opened this issue Jan 7, 2018 · 2 comments

Comments

@earthgecko
Copy link
Contributor

In src/luminol/algorithms/correlator_algorithms/cross_correlator.py if max_shift_seconds is passed, it is not converted to milliseconds like the DEFAULT_ALLOWED_SHIFT_SECONDS

earthgecko added a commit to earthgecko/luminol that referenced this issue Jan 7, 2018
@earthgecko
Copy link
Contributor Author

Are milliseconds needed? Are milliseconds breaking the Correlator?

On further evaluation it appears as if the specific use of milliseconds in the CrossCorrelator has an undesired effect in the subsequent _find_allowed_shift and _find_first_bigger which is using residual_timestamps (which are not necessarily in milliseconds).

Perhaps I am missing something. The documentation refers to epoch seconds, there is no mention of milliseconds another that in these functions so far as I can tell. Although there is the ```def timestamps_ms`` it does not appear to be called anywhere. Therefore as it stands, it appears that the the correlator is always going to shift epoch timestamps far too many positions.

Example:

timestamps = [
    1515337587,
    1515337647,
    1515337707,
    1515337767,
    1515337827,
    1515337887,
    1515337947,
    1515338007,
    1515338067,
    1515338127,
    1515338187]

# def _find_allowed_shift(self, timestamps):
init_ts = timestamps[0]
residual_timestamps = [ts - init_ts for ts in timestamps]
n = len(residual_timestamps)

residual_timestamps
>> [0, 60, 120, 180, 240, 300, 360, 420, 480, 540, 600]

#def _find_first_bigger(self, timestamps, target, lower_bound, upper_bound):
target = 60000  # default max_shift_milliseconds
lower_bound = 0
upper_bound = n
while lower_bound < upper_bound:
    pos = lower_bound + (upper_bound - lower_bound) / 2
    pos = int(pos)
    if residual_timestamps[pos] > target:
        upper_bound = pos
    else:
        lower_bound = pos + 1
pos
>> 10
# if you use target = 60 pos is 1

This is not the expected result or I am missing something?

If it is not the desired result, standardise everything to milliseconds or just have milliseconds as an argument if desired? I can imagine that milliseconds may be useful in some cases however the majority of current time series data is commonly going to be to second resolution, so as an argument and default being seconds would probably be convenient to most new users.

@earthgecko
Copy link
Contributor Author

Updated the PR to remove milliseconds - #39 - changed the use of milliseconds to seconds and removed reference to milliseconds as it is not really used and when it is used, it is used somewhat erroneously.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant