Type Coercion In Pandas #2

Open
OmegaDroid opened this issue Aug 11, 2020 · 0 comments
Type coercion in pandas is currently inefficient: the error is recorded on the row itself, which forces the data type to be object rather than a numeric type.

If we drop the specific error message, we can keep the value type numeric and look for new NaN values to report as invalid.
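
The dtype difference can be seen in a small sketch. The sample data and the `coerce_with_message` helper are made up for illustration and are not the project's actual code; `pd.to_numeric(..., errors="coerce")` is used here as one way to turn failed conversions into NaN.

```python
import pandas as pd

# Illustrative input: one value cannot be converted to a number.
raw = pd.Series(["1", "2.5", "abc", "4"], dtype=object)


def coerce_with_message(value):
    """Hypothetical coercion that stores the error in place of the value."""
    try:
        return float(value)
    except ValueError:
        return f"Failed to convert {value} to float"


# Mixing floats and error strings forces the generic object dtype.
with_errors = raw.map(coerce_with_message)

# Turning failures into NaN keeps a numeric dtype.
numeric = pd.to_numeric(raw, errors="coerce")

print(with_errors.dtype)  # object
print(numeric.dtype)      # float64
```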

We would need to check for null values in two stages.

  1. Perform all null transformations, recording all null values in the row.
  2. Perform all value transformations, recording all new null values, dropping the rows that contain them, and reporting: Failed to coerce 'x' (<value>): Failed to convert <value> to <type>: Row ...
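
A minimal sketch of the two stages for a single column, assuming float as the target type. The `coerce_column` function, its `null_markers` parameter, and the sample frame are hypothetical names chosen for illustration. The trailing row part of the message is left out because the issue does not spell it out; the row label is returned as the dictionary key instead.

```python
import pandas as pd


def coerce_column(df, column, null_markers=("", "N/A")):
    """Hypothetical helper: coerce one column to float in two stages."""
    raw = df[column]

    # Stage 1: apply null transformations and remember which rows are
    # legitimately null before any value conversion happens.
    cleaned = raw.mask(raw.isin(null_markers))
    already_null = cleaned.isna()

    # Stage 2: convert values. Failures become NaN, so the dtype stays numeric.
    converted = pd.to_numeric(cleaned, errors="coerce")

    # Only NaN values that appeared during stage 2 are coercion failures.
    newly_null = converted.isna() & ~already_null

    # Build one message per failed row, keyed by row label.
    errors = {
        idx: f"Failed to coerce '{column}' ({value}): "
             f"Failed to convert {value} to float"
        for idx, value in raw[newly_null].items()
    }

    # Drop the failed rows and store the numeric values.
    kept = ~newly_null
    result = df.loc[kept].assign(**{column: converted[kept]})
    return result, errors


# Illustrative data: "" is a known null marker, "abc" is an invalid value.
df = pd.DataFrame({"x": ["1", "", "abc", "4"]}, dtype=object)
result, errors = coerce_column(df, "x")
print(result["x"].dtype)
print(errors)
```

In this sketch the empty string survives as a NaN in the result because it was already null after stage 1, while the row holding "abc" is dropped and reported.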

This should keep the correct data types on the DataFrame, improving transformation efficiency.
