How to load a schema from a pyspark struct or Avro format from a schema registry? #1599

Open
pthalasta opened this issue Apr 24, 2024 · 2 comments
Labels
question Further information is requested

Comments

@pthalasta

Question about pandera

How do I create a DataFrameSchema from an Avro schema? What are our options? When I try it, the resulting DataFrameSchema object has an empty columns field. Could this be added as a feature that pulls the schema from the most widely used schema registries?

@pthalasta pthalasta added the question Further information is requested label Apr 24, 2024
@cosmicBboy
Collaborator

Hi @pthalasta, looking at the Avro schema docs, it looks like we'll need to write a translation layer for avro -> pandera, similar to the frictionless integration: https://pandera.readthedocs.io/en/stable/frictionless.html?highlight=frictionless#frictionless-data-schema

Feel free to change the label of this issue to enhancement and re-write the title as a feature request.

Happy to review a PR contribution from you or someone in the community!
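
For anyone picking this up, here is a rough sketch of what an avro -> pandera translation layer might look like. It is not an existing pandera API: the function names `avro_fields_to_columns` and `to_dataframe_schema`, the `AVRO_TO_DTYPE` mapping, and the example schema are all hypothetical, and it only handles flat records with primitive types and `["null", T]` unions. Fetching the schema JSON from a registry is left out; the sketch assumes you already have it as a string.

```python
import json

# Assumed mapping from Avro primitive types to dtype strings; adjust as needed.
AVRO_TO_DTYPE = {
    "boolean": "bool",
    "int": "int32",
    "long": "int64",
    "float": "float32",
    "double": "float64",
    "string": "str",
    "bytes": "object",
}


def avro_fields_to_columns(avro_schema):
    """Return {field name: (dtype string, nullable)} for a flat Avro record."""
    if avro_schema.get("type") != "record":
        raise ValueError("only Avro record schemas are handled in this sketch")
    columns = {}
    for field in avro_schema["fields"]:
        avro_type = field["type"]
        nullable = False
        if isinstance(avro_type, list):
            # A union such as ["null", "string"] is treated as a nullable column.
            non_null = [t for t in avro_type if t != "null"]
            nullable = len(non_null) < len(avro_type)
            if len(non_null) != 1:
                raise ValueError(f"unsupported union for field {field['name']!r}")
            avro_type = non_null[0]
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_DTYPE:
            raise ValueError(f"unsupported Avro type for field {field['name']!r}")
        columns[field["name"]] = (AVRO_TO_DTYPE[avro_type], nullable)
    return columns


def to_dataframe_schema(avro_schema):
    """Build a pandera DataFrameSchema from the column mapping above."""
    import pandera as pa  # imported lazily so the mapping step needs no dependencies

    return pa.DataFrameSchema(
        {
            name: pa.Column(dtype, nullable=nullable)
            for name, (dtype, nullable) in avro_fields_to_columns(avro_schema).items()
        }
    )


# Hypothetical schema, in the JSON form a schema registry would typically return.
EXAMPLE_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": ["null", "string"]},
    {"name": "score", "type": "double"}
  ]
}
""")

columns = avro_fields_to_columns(EXAMPLE_SCHEMA)
```

Keeping the Avro parsing separate from the pandera construction makes the mapping easy to test on its own. Nested records, arrays, maps, enums, and logical types would each need an explicit decision, and nullable integer fields may need a pandas nullable dtype rather than `int64`.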

@pthalasta
Author

@cosmicBboy I'm not sure I can edit the label of the issue, but I can certainly change the description. Please let me know if that helps.
