Read in-process Python objects like DataFrame, NumPy or dict #211

Draft
wants to merge 20 commits into base: main

auxten (Member) commented Apr 12, 2024

This PR is at a very early stage. The implementation could change a lot before the final patch.

I'm keeping this PR open so that other projects can track the progress of "chDB on Pandas/NumPy..."

Related issues:

auxten added the Arrow (Apache Arrow support) label Apr 12, 2024
auxten self-assigned this Apr 12, 2024
auxten marked this pull request as draft April 12, 2024 09:41
auxten (Member, Author) commented Apr 29, 2024

Still working on it. The good news is that the prototype works. The Python API could look like the example below. Any suggestions?

#!python3

import chdb


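class myReader(chdb.PyReader):
    def __init__(self, data):
        self.data = data  # dict mapping column name -> list of values
        self.cursor = 0  # number of rows already returned by read()
        super().__init__(data)

    def read(self, col_names, count):
        # count is ignored for this demo: the first call returns every row
        if self.cursor >= len(self.data["a"]):
            return []  # nothing left to read
        # one list per requested column, in the order requested
        block = [self.data[col] for col in col_names]
        self.cursor += len(block[0])
        return block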


reader = myReader(
    {
        "a": [1, 2, 3, 4, 5, 6],
        "b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
    }
)

chdb.query("SELECT b, sum(a) FROM Python('reader') GROUP BY b", "debug").show()

Output:

"tom",5
"auxten",9
"jerry",7
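
The demo reader ignores `count` and returns all rows in one block. Below is a minimal sketch of how a reader might honor `count` instead. It assumes that `read` is called repeatedly until it returns an empty list, which is inferred from the demo and not confirmed behavior of the prototype. `ChunkedReader` and `drain` are hypothetical names. The class deliberately does not subclass `chdb.PyReader`, so the snippet runs without the prototype build, and a plain Python loop stands in for the query engine.

```python
from collections import defaultdict


class ChunkedReader:
    """Hypothetical stand-in for a chdb.PyReader subclass that honors `count`."""

    def __init__(self, data):
        self.data = data  # dict mapping column name -> list of values
        self.cursor = 0  # number of rows already returned
        # Assumption: every column has the same length.
        self.total = len(next(iter(data.values()))) if data else 0

    def read(self, col_names, count):
        # Return at most `count` rows per call, one list per requested column.
        if count <= 0:
            raise ValueError("count must be positive")
        if self.cursor >= self.total:
            return []  # assumed end-of-data signal, mirroring the demo
        end = min(self.cursor + count, self.total)
        block = [self.data[col][self.cursor:end] for col in col_names]
        self.cursor = end
        return block


def drain(reader, col_names, count):
    """Hypothetical driver: call read() until it returns an empty block."""
    while True:
        block = reader.read(col_names, count)
        if not block:
            return
        yield block


reader = ChunkedReader(
    {
        "a": [1, 2, 3, 4, 5, 6],
        "b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
    }
)

# Pure-Python equivalent of SELECT b, sum(a) ... GROUP BY b, fed four rows at a time.
sums = defaultdict(int)
for b_values, a_values in drain(reader, ["b", "a"], count=4):
    for name, value in zip(b_values, a_values):
        sums[name] += value

print(dict(sums))  # {'tom': 5, 'jerry': 7, 'auxten': 9}
```

The totals agree with the query output even though the rows arrive in two blocks (four rows, then two), which is the property a chunked reader would need to preserve.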

Labels: Arrow (Apache Arrow support)
Projects: Status: In Progress