Replacing structured types with datatype #385

Open
odashi opened this issue Aug 15, 2022 · 1 comment

Comments

@odashi
Contributor

odashi commented Aug 15, 2022

Several functions use NumPy's structured arrays. I think it would be better to replace them with dataclasses, for several reasons:

  • Structured arrays tell users little about which members they contain, because the array carries no static type information (i.e., users can't understand the behavior just by reading the function signatures). Unfortunately, we can't expect structured arrays to be supported in a mypy-friendly manner in the near future: Type hinting / annotation (PEP 484) for ndarray, dtype, and ufunc numpy/numpy#7370 (comment)
  • Manipulating structured arrays with heterogeneous data is usually much more expensive than manipulating ordinary objects. Enforcing memory alignment can mitigate this problem, but it requires constraining the length of string fields, which may not be suitable for the current implementation.
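
To make the two points above concrete, here is a minimal sketch comparing the two styles. All names in it (`SCORE_DTYPE`, `Score`, and both functions) are hypothetical and are not taken from the codebase; the field layout is an assumption chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

import numpy as np

# Hypothetical record layout. The string field needs a fixed width ("U16").
SCORE_DTYPE = np.dtype([("name", "U16"), ("value", "f8")])


def scores_as_structured_array() -> np.ndarray:
    # The annotation only says "ndarray"; the "name" and "value" fields
    # are invisible to readers of the signature and to mypy.
    return np.array([("bleu", 0.31), ("rouge", 0.42)], dtype=SCORE_DTYPE)


@dataclass(frozen=True)
class Score:
    # Hypothetical dataclass equivalent: the members are statically typed.
    name: str
    value: float


def scores_as_dataclasses() -> List[Score]:
    # The signature now documents exactly what each element contains.
    return [Score("bleu", 0.31), Score("rouge", 0.42)]


# Fixed-width string fields silently truncate longer values.
long_name = "a_metric_name_longer_than_sixteen_chars"
truncated = np.array([(long_name, 1.0)], dtype=SCORE_DTYPE)["name"][0]
```

With the dataclass version, a type checker can flag a misspelled attribute such as `score.vale`, whereas a misspelled field name on the structured array only fails at runtime. The last two lines illustrate the string-length constraint mentioned above: `truncated` keeps only the first 16 characters of `long_name`.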

RFC: @neubig @pfliu-nlp

@neubig
Contributor

neubig commented Oct 10, 2022

Sorry I realize I didn't comment on this yet, but I agree with this general direction and I think we're moving in this direction already!
