Add support Clickhouse database as source and sink #342

Open
Delphin1 opened this issue Oct 2, 2023 · 4 comments

Comments

@Delphin1

Delphin1 commented Oct 2, 2023

It would be great if Arroyo could also work with ClickHouse.

@kzk2000

kzk2000 commented Oct 7, 2023

FWIW, if your data is already on Kafka, it's trivial to sync it into ClickHouse.

That said, syncing an Arroyo stream to ClickHouse without Kafka would indeed be cool for long-term storage.
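
For readers unfamiliar with the Kafka route mentioned above, here is a minimal sketch using ClickHouse's Kafka table engine plus a materialized view. The broker address, topic, consumer group, table names, and columns are all assumptions made up for illustration; nothing here is specific to Arroyo beyond it writing JSON rows to a Kafka topic.

```sql
-- Hypothetical schema: adjust columns to match the rows written to the topic.

-- 1. A Kafka engine table acts as a consumer of the topic.
CREATE TABLE events_queue
(
    ts      DateTime,
    user_id UInt64,
    message String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',   -- assumed broker address
         kafka_topic_list  = 'events',           -- assumed topic name
         kafka_group_name  = 'clickhouse_events',-- assumed consumer group
         kafka_format      = 'JSONEachRow';      -- one JSON object per message

-- 2. A MergeTree table holds the data for long-term storage.
CREATE TABLE events
(
    ts      DateTime,
    user_id UInt64,
    message String
)
ENGINE = MergeTree
ORDER BY (ts, user_id);

-- 3. A materialized view moves rows from the queue into storage as they arrive.
CREATE MATERIALIZED VIEW events_mv TO events AS
SELECT ts, user_id, message
FROM events_queue;
```

Once the view exists, querying `events` should show rows shortly after they are produced to the topic; selecting directly from `events_queue` consumes messages, so it is usually avoided.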

@MuhtasimTanmoy

What would the high-level design and testing procedure be for implementing this feature?
Looks like a cool one.

@kzk2000

kzk2000 commented Nov 2, 2023

ClickHouse has various integrations for data ingestion; Kafka, as mentioned above, is just one of them.

I'm no expert, but maybe one of these (https://clickhouse.com/docs/en/integrations, search for "Data ingestion") would work well with what Arroyo.dev already has.

@marvin-hansen

Maybe try remote select?

Something like:

SELECT * FROM remote('127.0.0.1', db.remote_engine_table) LIMIT 3;

CH Docs:
https://clickhouse.com/docs/en/sql-reference/table-functions/remote

With that in place, you can simply run a remote insert into a ClickHouse table via the native TCP protocol.

This might be easier and faster to implement than a full-blown integration.
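
A minimal sketch of what such a remote insert could look like, using the `INSERT INTO FUNCTION` form with the same `remote` table function as above. The table definition and the sample row are assumptions for illustration, and the connection relies on the default user with an empty password on the default native port.

```sql
-- Hypothetical target table on the remote ClickHouse server.
CREATE DATABASE IF NOT EXISTS db;

CREATE TABLE IF NOT EXISTS db.remote_engine_table
(
    id      UInt64,
    payload String
)
ENGINE = MergeTree
ORDER BY id;

-- Write a row through the remote() table function
-- (native TCP protocol, port 9000 unless specified otherwise).
INSERT INTO FUNCTION remote('127.0.0.1', db.remote_engine_table)
VALUES (1, 'hello from a remote insert');

-- Read it back the same way.
SELECT * FROM remote('127.0.0.1', db.remote_engine_table) LIMIT 3;
```

The ClickHouse documentation notes that `remote` re-establishes the connection on each request, so this fits occasional or batched writes better than row-by-row streaming.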
