Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] matrix profile distance index out of bounds test failure #1498

Closed
MatthewMiddlehurst opened this issue May 7, 2024 · 3 comments · Fixed by #1513
Closed

[BUG] matrix profile distance index out of bounds test failure #1498

MatthewMiddlehurst opened this issue May 7, 2024 · 3 comments · Fixed by #1513
Labels
bug Something isn't working distances Distances package

Comments

@MatthewMiddlehurst
Copy link
Member

Describe the bug

The matrix profile distance is currently failing on the code coverage tests which disable numba. Most likely from changes in #1415.

Why does the test_mpdist test fail and the other distances ones don't, is the distance function only being used here?

Steps/Code to reproduce the bug

Run the test_mpdist test without numba

https://github.com/aeon-toolkit/aeon/actions/runs/8978393264/job/24658747510

Expected results

Test passes with and without numba

Actual results

Test fails with no numba

Versions

See CI failure

@MatthewMiddlehurst MatthewMiddlehurst added bug Something isn't working distances Distances package labels May 7, 2024
@TonyBagnall
Copy link
Contributor

TonyBagnall commented May 10, 2024

so this is to do with the default for the subsequence length being 0.

def mpdist(x: np.ndarray, y: np.ndarray, m: int = 0) -> float:

this causes a call to sliding dot product with an empty array

first_dot_prod_ab = _sliding_dot_products(y[0:m], x, m, len_x)

which then get passed down to here

   for i in range(n_t_subs):
        temp = (dot_prod[i] - q_len * q_mean * t_mean[i]) / (q_len * q_std * t_std[i])
        d[i] = 2 * q_len * (1 - temp)

and although ignored by numba because hey, in C you can do anything, its not valid, e.g.

x = np.array([1,2,3,4,5,6,7,8,9,10])
y = x[0:0]
print(y[0])

@TonyBagnall
Copy link
Contributor

I dont think a zero length subsequence makes much sense, and m should be constrained to be >0

@TonyBagnall
Copy link
Contributor

but then in _stomp_ab we have

    len_x = len(x)
    len_y = len(y)
    if m == 0:
        if len_x > len_y:
            m = int(len_x / 4)
        else:
            m = int(len_y / 4)

so zero is indicating set to 1/4 length really. I should go back and remind myself how it works lol

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working distances Distances package
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants