
Releases: unionai-oss/pandera

Release 0.13.0: Option to Report All Errors, "Try Pandera" with Jupyterlite

05 Oct 18:27
36fe267

Highlights ⭐️

  • try pandera: add jupyterlite notebooks, add support for py3.7 (#951) @cosmicBboy
  • Feature/922 add other ways to report unique errors as an argument (#914) @ng-henry

What's Changed 📈

New Contributors 🎉

Shout out to all the first-time contributors!

Full Changelog: v0.12.0...v0.13.0

Beta Release: v0.13.0b1

05 Oct 13:25
Pre-release
beta release v0.13.0b1

Beta Release v0.13.0b0

04 Oct 21:09
1bcfe01
Pre-release
beta release 0.13.0b0

Release 0.12.0: Logical Types, New Built-in Check, Bugfixes, Doc Improvements

22 Aug 19:53
a66d71f


Highlights ⭐️

This release features:

  • Support for Logical Data Types #798: These data types check the actual values of the data container at runtime to support data types like "URL", "Name", etc.
  • Check.unique_values_eq #858: Check that the values in the data container cover the entire domain of a specified finite set of values.

What's Changed 📈

New Contributors 🎉

Full Changelog: v0.11.0...v0.12.0

Beta release v0.12.0b0

12 Aug 16:31
7a9c6ca
Pre-release
beta release v0.12.0b0

0.11.0: Docs support dark mode, custom names and errors for built-in checks, bug fixes

01 May 00:31
c494ee7

Big shoutout to the contributors on this release!

Highlights

Docs Get Dark Mode 🌓

Just a little something for folks who prefer dark mode!


Enhancements

  • Make DataFrameSchema respect subclassing #830
  • Feature: Add support for Generic to SchemaModel #810
  • feat: make schema available in SchemaErrors #831
  • add support for custom name and error in builtin checks #843

Bugfixes

  • Make DataFrameSchema respect subclassing #830
  • fix pandas_engine.DateTime.coerce_value not consistent with coerce #827
  • fix mypy 9c5eaa3

Documentation Improvements

0.11.0b1: fix mypy error

30 Apr 13:39
9c5eaa3
Pre-release
release v0.11.0b1

0.11.0b0: Docs support dark mode, custom names and errors for built-in checks, bug fixes

29 Apr 21:13

0.10.1: Pyspark documentation fixes

04 Apr 03:13
release 0.10.1

0.10.0: Pyspark.pandas Support, PydanticModel datatype, Performance Improvements

01 Apr 13:56

Highlights

pandera now supports pyspark dataframe validation via pyspark.pandas.

The pandera koalas integration is now deprecated.

You can now pip install pandera[pyspark] and validate pyspark.pandas dataframes:

import pyspark.pandas as ps
import pandera as pa

from pandera.typing.pyspark import DataFrame, Series


class Schema(pa.SchemaModel):
    state: Series[str]
    city: Series[str]
    price: Series[int] = pa.Field(in_range={"min_value": 5, "max_value": 20})


# create a pyspark.pandas dataframe that's validated on object initialization
df = DataFrame[Schema](
    {
        'state': ['FL','FL','FL','CA','CA','CA'],
        'city': [
            'Orlando',
            'Miami',
            'Tampa',
            'San Francisco',
            'Los Angeles',
            'San Diego',
        ],
        'price': [8, 12, 10, 16, 20, 18],
    }
)
print(df)

PydanticModel Data Type Enables Row-wise Validation with a Pydantic Model

Pandera now supports row-wise validation by applying a pydantic model as a dataframe-level dtype:

from pydantic import BaseModel

import pandas as pd
import pandera as pa
from pandera.engines.pandas_engine import PydanticModel


class Record(BaseModel):
    name: str
    xcoord: str
    ycoord: int


class PydanticSchema(pa.SchemaModel):
    """Pandera schema using the pydantic model."""

    class Config:
        """Config with dataframe-level data type."""

        dtype = PydanticModel(Record)
        coerce = True  # this is required, otherwise a SchemaInitError is raised

⚠️ Warning: This may lead to performance issues for very large dataframes.

Improved conda installation experience

Before this release there were only two conda packages: one to install pandera-core, and another to install pandera, which would pull in all of the extras.

The conda packaging now supports finer-grained control:

conda install -c conda-forge pandera-hypotheses  # hypothesis checks
conda install -c conda-forge pandera-io          # yaml/script schema io utilities
conda install -c conda-forge pandera-strategies  # data synthesis strategies
conda install -c conda-forge pandera-mypy        # enable static type-linting of pandas
conda install -c conda-forge pandera-fastapi     # fastapi integration
conda install -c conda-forge pandera-dask        # validate dask dataframes
conda install -c conda-forge pandera-pyspark     # validate pyspark dataframes
conda install -c conda-forge pandera-modin       # validate modin dataframes
conda install -c conda-forge pandera-modin-ray   # validate modin dataframes with ray
conda install -c conda-forge pandera-modin-dask  # validate modin dataframes with dask

Enhancements

Bugfixes

Deprecations

Docs Improvements

Testing Improvements

Misc Changes

Contributors