Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for operations on option[Vector[...]] #283

Open
agoose77 opened this issue Oct 26, 2022 · 3 comments
Open

Better support for operations on option[Vector[...]] #283

agoose77 opened this issue Oct 26, 2022 · 3 comments
Labels
bug The problem described is something that must be fixed

Comments

@agoose77
Copy link
Contributor

Vector Version

0.10.0

Python Version

3.10.7

OS / Environment

Linux (Ubuntu 22.04)
Conda-provisioned Python

Describe the bug

Outer options become inner-options after operations on vectors:

>>> x = vector.awk([{"x": 1, "y": 0, "z": 0}])
>>> y = vector.awk([{"x": 0, "y": 1, "z": 0}])
>>> x.cross(y).type
1 * Vector3D["x": int64, "y": int64, "z": int64]
>>> x.mask[[False]].cross(y).type
1 * Vector3D["x": ?int64, "y": ?int64, "z": ?int64]

Steps to Reproduce

Wrap a vector in an option type, and then operate on it e.g. with cross.

Expected Results

The option should survive outside the record.

Observed Results

The option moves inside the record.

Relevant log output

No response

@agoose77 agoose77 added the bug (unverified) The problem described would be a bug, but needs to be triaged label Oct 26, 2022
@agoose77
Copy link
Contributor Author

This is one of the motivations for the optiontype_outside_record in ak.zip. However, using this option alone might not be the safest way of fixing this bug; users might want options to exist inside the record. @jpivarski this is a question that goes beyond just vector. Do you have any thoughts / ideas here?

@agoose77
Copy link
Contributor Author

(self note) maybe add an option to zip that allows it to broadcast below record.

@jpivarski
Copy link
Member

We talked about this on Slack; having a general array-cast mechanism as described in scikit-hep/awkward#1807 (comment) would allow users and library designers to fix the type after the fact. Looking at use-cases like this helps, though, because it specifies how flexible the type casting would have to be.

For instance, given an array like

# * Vector3D[x: ?int64, y: ?int64, z: ?int64]

and desiring a type

# * option[Vector3D[x: int64, y: int64, z: int64]]

it would have to add the option[·] to Vector3D and remove the ? from all three fields in a single transformation, which checks that the fields have missing values in all the same entries. It can't do the more natural algorithm:

  1. add option[·] to Vector3D as an UnmaskedArray because option-type is requested and not present in the array, then
  2. recurse into each field separately, and
  3. try to remove the ? from a field, only to discover that there's a non-zero number of missing values, and so raise an error.

I might need to start with a list of transformations that the algorithm needs to have.

@Saransh-cpp Saransh-cpp added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Mar 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

No branches or pull requests

3 participants