Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug or feature? sliced.elementCount != arr.length #313

Open
mw66 opened this issue Aug 6, 2020 · 13 comments
Open

bug or feature? sliced.elementCount != arr.length #313

mw66 opened this issue Aug 6, 2020 · 13 comments

Comments

@mw66
Copy link

mw66 commented Aug 6, 2020

  // http://mir-algorithm.libmir.org/mir_ndslice_slice.html#sliced
  auto arr = [
      1, 2, 3,
      4, 5, 6];
  auto s = arr.sliced(2, 1);
  Assert.equal(s.elementCount, arr.length);  // failed
  s.prettyArr.writeln;

Expected:6
Actual:2

When user turn an array into a slice, I think most of the time they want turn the whole array into a N-d slice; if this is a feature, I'd think it's a mis-feature, should be better documented (so no surprise to the user), and maybe also issue a run-time warning by default, if the converted sliced.elementCount != arr.length

@jmh530
Copy link
Contributor

jmh530 commented Aug 6, 2020

Use elementCount to get the number of elements in the whole slice. length works the same way that it does for D's dynamic arrays. For instance, the following compiles without error

  auto arr = [
      [1, 2, 3],
      [4, 5, 6]
 ];
  auto s = arr.sliced(2, 1);
  assert(s.length == arr.length);

@mw66
Copy link
Author

mw66 commented Aug 6, 2020

Maybe I didn't make it clear, what I mean is:

"When user turn a 1-d array into a N-d slice, ... "

in my original example, what the intended is turn a 1-d array of 6 elements into a (2, 3) 2-d slice; and if the user mistakenly write (2, 1), the library should generate at least an warning.

Just as reshape does:

http://mir-algorithm.libmir.org/mir_ndslice_topology.html#reshape

Actually, I think function sliced should be merged into function reshape, it will be much more consistent with numpy's interface & behaviour.

As a work-around (i.e. the intended operations):

  auto r = arr.sliced(arr.length).reshape([2, 3], err);
  Assert.equal(err, 0);

In your modified example, the input is a 2-d array already.

@mw66 mw66 changed the title bug or feature? sliced.length != arr.length bug or feature? sliced.elementCount != arr.length Aug 6, 2020
@jmh530
Copy link
Contributor

jmh530 commented Aug 6, 2020

I see your point.

Your issue is really that the result of arr.sliced(2, 1) is [[1], [2]] without any error. Right now, the relevant section of the code you would be interested in in mir.ndslice.slice is

    static if (isDynamicArray!Iterator)
    {
        assert(lengthsProduct(_lengths) <= iterator.length,
            "array length should be greater or equal to the product of constructed ndslice lengths");
        auto ptr = iterator.length ? &iterator[0] : null;
        return Slice!(typeof(C.init[0])*, N)(_lengths, ptr);
    }

which will generate an assert error, for instance, if you use (7,1). It seems deliberate that it is a <= instead of ==. There is also an overload for if you leave off the (2, 1), then the result is what you would expect.

Also, you might find that fuse is a little less susceptible to bugs for this use case (or just chaining sliced to reshape, as you suggest).

  auto arr = [
      [1, 2, 3],
      [4, 5, 6]
];
  auto s = arr.fuse;

@mw66
Copy link
Author

mw66 commented Aug 6, 2020

"... It seems deliberate that it is a <= instead of ==. "

If you can ask the lib users to vote on this issue; or add network code to send actual usage stats to you, my gut feeling is that > 99% of the time, user want ==.

Silently perform <= will cause more issues as software evolve, actually that's why I found this bug / feature:

In my program, I convert an 1-d array to 2-d slice for numerical processing; after some code change, there are more elements appended into the 1-d array, but I forget to change the sliced dimension. And the program continue to run silently without complaining anything, but the computation result is wrong: worse, silently wrong, because in numerical computations, you know, if the result is (say 1%) off the correct answer, typically user won't notice it easily.

If this is in numpy, I will be alerted of the mis-match new dimension problem easily.

That's why I'm suggesting: sliced should be merged into function reshape, it will be much more consistent with numpy's interface & behavior.

@mw66
Copy link
Author

mw66 commented Aug 7, 2020

And for the current sliced behavior (i.e only take partial array as converted N-d array from the head of the input array): the user can always do it explicitly via D's built-in array slice syntax on the 1-d array, e.g. from the original example:

  auto arr = [
      1, 2, 3,
      4, 5, 6];
  auto s = arr[0..2].sliced(2, 1);   // take the head 2 elements
  auto t = arr[2..4].sliced(2, 1);   // take the 2 elements in the middle

The point is that user do it explicitly, and the library enforce the assertion that assert(lengthsProduct(_lengths) == iterator.length).

@9il
Copy link
Member

9il commented Aug 13, 2020

should be fixed in v3.10

@mw66
Copy link
Author

mw66 commented Aug 14, 2020

should be fixed in v3.10

Excellent. Thanks.

@9il
Copy link
Member

9il commented Aug 14, 2020

should be fixed in v3.10

Excellent. Thanks.

It may take a while.

@9il
Copy link
Member

9il commented Aug 14, 2020

It is a breaking change, but quite small for a release. Needs to wait another one breaking change.

@mw66
Copy link
Author

mw66 commented Aug 14, 2020

I'm patient :-)

@mw66
Copy link
Author

mw66 commented Aug 20, 2020

I saw this new user's mistake:

https://forum.dlang.org/post/boifsvfxvjhjbfdttmsg@forum.dlang.org

i.e. sliced -> fuse

Since there will be a new breaking (or deprecation) change release, I'd want to suggest we should deprecate both sliced and fuse, and move all their functionality into reshape.

  1. it will be a single function reshape, that user need to remember
  2. it's more consistent with numpy, i.e more friendly to new users.

@mw66
Copy link
Author

mw66 commented Aug 13, 2022

should be fixed in v3.10

It's v3.14.14 now :-) or it's done already?

@9il
Copy link
Member

9il commented Aug 14, 2022

Oh, it isn't. To be honest this change will require to much efforts to update all the codebase we support.

Instead, we can add slicedExactly, that will throw of the array length doesn't equal to the element count.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants