Support for PySpark #1055

Open
gracemiguel opened this issue Oct 26, 2023 · 2 comments
Labels
New Feature A feature addition not currently in the library

Comments

@gracemiguel

Is your feature request related to a problem? Please describe.

Hello, I see that this package supports pandas, but does it support PySpark? I'd like to use it on large datasets, and pandas is insufficient for my use case.

Describe the outcome you'd like:
I'd like to be able to run this on large datasets of 10k+ rows. Do you think this would be possible?

@gracemiguel gracemiguel added the New Feature A feature addition not currently in the library label Oct 26, 2023
@taylorfturner
Contributor

taylorfturner commented Oct 26, 2023

It depends on how many columns you are dealing with, but my first thought is that you should be fine at that data size with pandas, @gracemiguel. Thanks!
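
For a rough sense of why the column count matters at this row count, here is a back-of-the-envelope sketch. It assumes every cell is a fixed-width 8-byte numeric value (an assumption for illustration; string or object columns take more), and the helper names are hypothetical, not part of this library or of pandas.

```python
def estimate_numeric_frame_bytes(n_rows: int, n_cols: int, bytes_per_cell: int = 8) -> int:
    """Rough lower bound on in-memory size for a table of fixed-width numeric cells.

    Assumption: each cell is an 8-byte value (e.g. float64/int64). Real data with
    strings or mixed types will be larger, so treat this as a floor, not a measurement.
    """
    return n_rows * n_cols * bytes_per_cell


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 ** 2)


if __name__ == "__main__":
    # 10k rows, as in the question, across a few hypothetical column counts.
    for n_cols in (10, 100, 1000):
        size = estimate_numeric_frame_bytes(10_000, n_cols)
        print(f"{n_cols:>5} columns: ~{to_mib(size):.1f} MiB")
```

Under these assumptions, even a wide table at 10k rows stays well within the memory of a typical machine, which is consistent with the suggestion that pandas should be sufficient here. For an actual DataFrame, `df.memory_usage(deep=True).sum()` in pandas gives a measured figure.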

@taylorfturner
Contributor

@gracemiguel any additional questions on this? Any luck using it? Thanks!


5 participants