Dynamical schema #465

Answered by cosmicBboy
quancore asked this question in Q&A
Apr 15, 2021 · 5 comments · 2 replies
hey @quancore this is a fun question! I've had this use case several times and here are the two solutions I've implemented in the past:

  1. if the dynamic (only-known-at-runtime) column names follow a particular pattern, you can still pre-define a schema that matches the pattern with Column(..., regex=True):
import pandas as pd
import pandera as pa

schema = pa.DataFrameSchema({
    "target": pa.Column(int),
    "^feature_": pa.Column(int, pa.Check.gt(0), regex=True),
}, index=pa.Index(int, name="time"))

df = pd.DataFrame({
    "target": [1, 2, 3],
    "feature_1": [3, -1, 5],
    "feature_2": [3, 4, 5],
    "feature_3": [3, 4, 5],
}, index=pd.Index([0, 1, 2], name="time"))  # name the index "time" so it matches the schema's index

schema(df)  # this should fail due to -1 in "feature_1"
  2. create a …

Answer selected by cosmicBboy
4 participants
Converted from issue

This discussion was converted from issue #461 on April 20, 2021 01:44.