-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] Fix out of bounds access in List.index()
#2745
[stdlib] Fix out of bounds access in List.index()
#2745
Conversation
Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I really appreciate the detailed commit message and explanation of the problem(s) at hand. It made this review easy. Thank you for that and fixing the bugs!
!sync |
✅🟣 This contribution has been merged 🟣✅ Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours. We use Copybara to merge external contributions, click here to learn more. |
[External] [stdlib] Fix out of bounds access in `List.index()` Related to #2687 There were multiple bugs related to clipping there. Long story short, the behavior of `list.index()` in python is this one: given a `start` and `end`, python will look for the element in `my_list[start:end]` and report the result, (`start` is added to the result to give the index with respect to the original list). You can take a look at the description of the `index` method here: https://docs.python.org/3/tutorial/datastructures.html#more-on-lists Since there is slicing semantics applied to `start` and `end`, we should do multiple things: 1) default to the start and the end of the list 2) normalize negative values by doing `+ len(my_list)` 3) clip both start and end between `0` and `len(my_list)` The last step wasn't done correctly. Especially for the `stop` argument where the clipping was applied only when negative values were found (this caused the out of bounds bug in the tests). Effectively ```mojo return end if end > 0 else min(end + size, size) ``` is equivalent to ```mojo return end if end > 0 else end + size ``` since the min is applied only when `end <= 0`. So `end + size <= size` This test can cause some flakyness in our CI: `test_list_a.index(10, start=5, stop=50)` for a list of size 6. The stop was positive, so it was never clipped, thus too many values (out of bounds) were tried, causing some to match, sometimes. Another bug was that because of this condition, `end = 0` means "check until the end of the list" while in python, with the slicing semantics, `end = 0` means "do nothing" (empty slice). ### TL;DR Multiple clipping bugs. Slicing semantics applied to `start` and `stop` now like in Python. No more flakyness in our CI. No more out of bounds access. Corresponding tests added. Co-authored-by: Gabriel de Marmiesse <gabriel.demarmiesse@datadoghq.com> Closes #2745 MODULAR_ORIG_COMMIT_REV_ID: 7bad2830dc41d96ae383fc4a8eac9ca3a69581de
Landed in 387bc03! Thank you for your contribution 🎉 |
[External] [stdlib] Fix out of bounds access in `List.index()` Related to modularml#2687 There were multiple bugs related to clipping there. Long story short, the behavior of `list.index()` in python is this one: given a `start` and `end`, python will look for the element in `my_list[start:end]` and report the result, (`start` is added to the result to give the index with respect to the original list). You can take a look at the description of the `index` method here: https://docs.python.org/3/tutorial/datastructures.html#more-on-lists Since there is slicing semantics applied to `start` and `end`, we should do multiple things: 1) default to the start and the end of the list 2) normalize negative values by doing `+ len(my_list)` 3) clip both start and end between `0` and `len(my_list)` The last step wasn't done correctly. Especially for the `stop` argument where the clipping was applied only when negative values were found (this caused the out of bounds bug in the tests). Effectively ```mojo return end if end > 0 else min(end + size, size) ``` is equivalent to ```mojo return end if end > 0 else end + size ``` since the min is applied only when `end <= 0`. So `end + size <= size` This test can cause some flakyness in our CI: `test_list_a.index(10, start=5, stop=50)` for a list of size 6. The stop was positive, so it was never clipped, thus too many values (out of bounds) were tried, causing some to match, sometimes. Another bug was that because of this condition, `end = 0` means "check until the end of the list" while in python, with the slicing semantics, `end = 0` means "do nothing" (empty slice). ### TL;DR Multiple clipping bugs. Slicing semantics applied to `start` and `stop` now like in Python. No more flakyness in our CI. No more out of bounds access. Corresponding tests added. Co-authored-by: Gabriel de Marmiesse <gabriel.demarmiesse@datadoghq.com> Closes modularml#2745 MODULAR_ORIG_COMMIT_REV_ID: 7bad2830dc41d96ae383fc4a8eac9ca3a69581de
[External] [stdlib] Fix out of bounds access in `List.index()` Related to #2687 There were multiple bugs related to clipping there. Long story short, the behavior of `list.index()` in python is this one: given a `start` and `end`, python will look for the element in `my_list[start:end]` and report the result, (`start` is added to the result to give the index with respect to the original list). You can take a look at the description of the `index` method here: https://docs.python.org/3/tutorial/datastructures.html#more-on-lists Since there is slicing semantics applied to `start` and `end`, we should do multiple things: 1) default to the start and the end of the list 2) normalize negative values by doing `+ len(my_list)` 3) clip both start and end between `0` and `len(my_list)` The last step wasn't done correctly. Especially for the `stop` argument where the clipping was applied only when negative values were found (this caused the out of bounds bug in the tests). Effectively ```mojo return end if end > 0 else min(end + size, size) ``` is equivalent to ```mojo return end if end > 0 else end + size ``` since the min is applied only when `end <= 0`. So `end + size <= size` This test can cause some flakyness in our CI: `test_list_a.index(10, start=5, stop=50)` for a list of size 6. The stop was positive, so it was never clipped, thus too many values (out of bounds) were tried, causing some to match, sometimes. Another bug was that because of this condition, `end = 0` means "check until the end of the list" while in python, with the slicing semantics, `end = 0` means "do nothing" (empty slice). ### TL;DR Multiple clipping bugs. Slicing semantics applied to `start` and `stop` now like in Python. No more flakyness in our CI. No more out of bounds access. Corresponding tests added. Co-authored-by: Gabriel de Marmiesse <gabriel.demarmiesse@datadoghq.com> Closes #2745 MODULAR_ORIG_COMMIT_REV_ID: 7bad2830dc41d96ae383fc4a8eac9ca3a69581de
Related to #2687
There were multiple bugs related to clipping there.
Long story short, the behavior of
list.index()
in python is this one: given astart
andend
, python will look for the element inmy_list[start:end]
and report the result, (start
is added to the result to give the index with respect to the original list).You can take a look at the description of the
index
method here: https://docs.python.org/3/tutorial/datastructures.html#more-on-listsSince there is slicing semantics applied to
start
andend
, we should do multiple things:+ len(my_list)
0
andlen(my_list)
The last step wasn't done correctly. Especially for the
stop
argument where the clipping was applied only when negative values were found (this caused the out of bounds bug in the tests).Effectively
is equivalent to
since the min is applied only when
end <= 0
. Soend + size <= size
This test can cause some flakyness in our CI:
test_list_a.index(10, start=5, stop=50)
for a list of size 6.The stop was positive, so it was never clipped, thus too many values (out of bounds) were tried, causing some to match, sometimes.
Another bug was that because of this condition,
end = 0
means "check until the end of the list" while in python, with the slicing semantics,end = 0
means "do nothing" (empty slice).TL;DR
Multiple clipping bugs. Slicing semantics applied to
start
andstop
now like in Python. No more flakyness in our CI. No more out of bounds access. Corresponding tests added.