Generate chDB and ClickHouse queries with ChatGPT/OpenAI APIs

lmangani/chdb-GPT

chdb-GPT

Generate chDB and ClickHouse queries using natural language with ChatGPT/OpenAI APIs

👉 Run & Clone on Google Colab

Status

  • Just a toy, hallucinating states 🐍
  • Needs prompt fine-tuning and hacks
  • Do not use this!

Requirements

  • OPENAI_API_KEY
export OPENAI_API_KEY={your_openai_token_here}
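
A script that depends on this variable can fail early with a readable message when it is missing. The following is a minimal sketch of such a check, not code from this repository; the helper name `get_openai_api_key` is hypothetical.

```python
import os


def get_openai_api_key(env=os.environ) -> str:
    """Return the OpenAI key from the environment, or raise a readable error.

    Hypothetical helper for illustration; the repository's scripts may read
    the variable differently.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the scripts."
        )
    return key


if __name__ == "__main__":
    # Print only the key length so the secret never lands in logs.
    print(f"Found an API key of length {len(get_openai_api_key())}")
```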

Usage

Count local files

python3 prompt.py "count rows from file data.csv"
SELECT count(*) FROM file('data.csv')

URL Engine, Parquet

python3 prompt.py "show the top 10 towns from url https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet"
SELECT town, COUNT(*) AS count
FROM url('https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet', 'Parquet')
GROUP BY town
ORDER BY count DESC
LIMIT 10;

⚠️ Pipe query to chdb

# python3 -m chdb "$(./prompt.py "count rows from file data.csv" | awk -v FS="(sql|\`\`\`)" '{print $1}')" Pretty
┏━━━━━━━━━┓
┃ count() ┃
┡━━━━━━━━━┩
│       2 │
└─────────┘
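
The awk step above strips the Markdown code fence that the model may wrap around its answer. The same idea can be sketched in Python, with a rough read-only check added before anything is executed, since the generated SQL is untrusted. This is an illustrative sketch under assumptions: the helper names `extract_sql`, `looks_read_only` and `run_with_chdb` are hypothetical and not part of this repository, and the final step assumes the `chdb` package is installed and uses its `chdb.query(sql, output_format)` call.

```python
import re

# Build the fence marker programmatically so this snippet can itself sit
# inside a fenced block without confusing Markdown renderers.
FENCE = "`" * 3
FENCE_RE = re.compile(
    FENCE + r"(?:sql)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_sql(reply: str) -> str:
    """Return the SQL inside the first fenced block, or the whole reply if unfenced."""
    match = FENCE_RE.search(reply)
    sql = match.group(1) if match else reply
    return sql.strip()


def looks_read_only(sql: str) -> bool:
    """Rough guard: accept only a single statement that starts with SELECT.

    This is a heuristic, not a SQL parser; for example, it rejects valid
    queries that contain a semicolon inside a string literal.
    """
    body = sql.strip().rstrip(";").rstrip()
    return body.upper().startswith("SELECT") and ";" not in body


def run_with_chdb(sql: str, output_format: str = "Pretty") -> str:
    """Execute the query with chDB and return the formatted result as text."""
    import chdb  # imported lazily; assumes `pip install chdb`

    return str(chdb.query(sql, output_format))


if __name__ == "__main__":
    reply = FENCE + "sql\nSELECT count(*) FROM file('data.csv')\n" + FENCE
    sql = extract_sql(reply)
    if looks_read_only(sql):
        print(sql)  # pass to run_with_chdb(sql) once the query has been reviewed
    else:
        print("Refusing to run a non-SELECT statement")
```

Unlike the awk field separator, which also splits on any `sql` substring inside the query text, the regular expression only removes the fence lines themselves.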

Interactive Mode

# python3 interactive.py 

Hi, I'm chdbGPT, an AI assistant that can execute ClickHouse SQL queries for you.

What would you like to know? => show the top 10 towns by price from url https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet

The query returned data for 10 towns. The towns are listed in descending order of price.
The town with the highest price is London, with a price of 337,000,000.
The remaining towns are also in London, with prices ranging from 315,000,000 to 160,000,000.