Allow column-level coerce setting to override class-level coerce. #1491

Open
benlindsay opened this issue Feb 10, 2024 · 0 comments
Labels
enhancement New feature or request

Is your feature request related to a problem? Please describe.
I have a schema where I want to coerce every column except one. That column has a dict type, for which coercion is not supported. Currently I have to add coerce=True to every other field, which is verbose and less clear than the alternative.

Describe the solution you'd like
The solution I'd prefer is to set coerce=True in the Config class and have a column-level coerce=False override that default for the one column.

Describe alternatives you've considered
The alternative I've considered is the more verbose workaround of setting coerce=True on each field individually. That works for me, so this is not a blocker, just a convenience request.

Additional context

Here's a script demonstrating the working solution (VerboseModel) and the failing solution I wish would work (ConciseModel).

import pandas as pd
import pandera as pa
from pandera.typing import Series


class VerboseModel(pa.DataFrameModel):
    no_coerce: Series[dict]
    yes_coerce_1: Series[int] = pa.Field(coerce=True)
    yes_coerce_2: Series[int] = pa.Field(coerce=True)
    # ...
    yes_coerce_n: Series[int] = pa.Field(coerce=True)


class ConciseModel(pa.DataFrameModel):
    no_coerce: Series[dict] = pa.Field(coerce=False)
    yes_coerce_1: Series[int]
    yes_coerce_2: Series[int]
    # ...
    yes_coerce_n: Series[int]

    class Config:
        coerce = True


df = pd.DataFrame(
    [
        {
            "no_coerce": {"some_key": "some_value"},
            "yes_coerce_1": 1,
            "yes_coerce_2": 2,
            # ...
            "yes_coerce_n": 42,
        }
    ]
)
validated_verbose = VerboseModel(df)
print(validated_verbose)
validated_concise = ConciseModel(df)  # Fails

I think this would be a breaking change, so I understand if we don't want to go this route, but I'm putting it out there just in case.
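
To make the requested precedence concrete, here is a minimal sketch in plain Python. It is not pandera's implementation: `resolve_coerce` and the convention that `None` means "not set on the column" are hypothetical names and assumptions introduced only for illustration. As far as I can tell, the column-level coerce flag is currently a plain boolean, so an explicit coerce=False cannot be told apart from an unset value. Some third "unset" state like the one assumed here would be needed, which is likely where the breaking change comes from.

```python
from typing import Optional


def resolve_coerce(column_coerce: Optional[bool], config_coerce: bool) -> bool:
    """Return the effective coerce flag for one column (hypothetical helper).

    column_coerce is None when the column did not set coerce at all,
    in which case the class-level Config value applies. An explicit
    True or False on the column always wins.
    """
    if column_coerce is None:
        return config_coerce
    return column_coerce


# Hypothetical column settings mirroring ConciseModel, with Config.coerce = True.
columns = {"no_coerce": False, "yes_coerce_1": None, "yes_coerce_2": None}
effective = {name: resolve_coerce(flag, True) for name, flag in columns.items()}
print(effective)
# {'no_coerce': False, 'yes_coerce_1': True, 'yes_coerce_2': True}
```

Under this rule only no_coerce would skip coercion, and every column that says nothing would inherit the class-level setting.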

@benlindsay benlindsay added the enhancement New feature or request label Feb 10, 2024