
Experiment with LLMs for SQL generation #348

Open
darabos opened this issue Feb 13, 2023 · 2 comments

darabos commented Feb 13, 2023

SQL queries on graphs are very flexible and powerful. With large language models like OpenAI's Codex we could make this more accessible.

The idea is to create a prompt of the following format:

We have these tables:
 - VERTICES with columns ID (long), X (number), Y (string), Z (timestamp)
 - EDGE_ATTRIBUTES with columns SRC_ID (long), DST_ID (long), FOO (number), BAR (string), ...
 - EDGES with columns SRC_ID (long), DST_ID (long), SRC_X (number), DST_X ..., EDGE_FOO (number), EDGE_BAR (string), ...

Write a query to find [WHAT THE USER WANTS]:
SELECT
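
To make the format concrete, here is a minimal sketch in Python of how such a prompt could be assembled from a schema description. The helper name and example schemas are illustrations, not part of LynxKite:

    def build_prompt(tables, question):
        """Assemble a completion prompt from (table_name, [(column, type), ...]) pairs."""
        lines = ["We have these tables:"]
        for name, columns in tables:
            cols = ", ".join(f"{col} ({typ})" for col, typ in columns)
            lines.append(f" - {name} with columns {cols}")
        lines.append("")
        lines.append(f"Write a query to find {question}:")
        lines.append("SELECT")  # the model is expected to continue from here
        return "\n".join(lines)

    prompt = build_prompt(
        [("VERTICES", [("ID", "long"), ("X", "number"), ("Y", "string")]),
         ("EDGES", [("SRC_ID", "long"), ("DST_ID", "long"), ("EDGE_FOO", "number")])],
        "the vertices with the highest X")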

The task is to experiment and find the best prompt. Please create 10 different use cases with made-up tables and user queries. (Some queries can share the same tables, but make sure there is good variety.)

Imaginary tables are very flexible and easy to work with. But let's use a real dataset as one of the examples. That way we can try the generated queries for real.
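
One possible way to organize the use cases, purely as an illustration (the names and schemas below are made up):

    USE_CASES = [
        {
            "name": "highest_out_degree",
            "tables": [("VERTICES", [("ID", "long"), ("NAME", "string")]),
                       ("EDGES", [("SRC_ID", "long"), ("DST_ID", "long"), ("WEIGHT", "number")])],
            "question": "the name of the vertex with the most outgoing edges",
        },
        # ... 9 more cases, some sharing tables, plus at least one backed by a real dataset.
    ]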

For all test cases, try the effects of the following:

  • Adding one or more exemplars. An exemplar helps the model understand the format. It can also be useful for detecting the end of the generated query.
  • Changing the set of tables. LynxKite has vertices, edges, and edge_attributes. But it never makes sense to use all three in a query. What if we only mention vertices and edges? Or if we only mention edge_attributes?
  • Try GPT-3 and Codex. You can drive these from code, so you can easily run a full experimental suite on them (see the sketch after this list). Try a few examples manually on ChatGPT for comparison.
  • Experiment with the specific phrasing. Find the best alternative to "We have these tables:" and the best format for specifying the schema. For GPT-3/ChatGPT I think an instruction ought to work well. (Like "Write a query that".) For Codex I expect a comment would work better. (Like "-- This query is going to".)
  • Check the effects of unrelated columns. Try adding 100 unrelated columns to one of your imaginary tables.
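
A rough sketch of what the experimental suite could look like, assuming the pre-1.0 openai Python package and the model names available in early 2023 (everything else here is an illustrative guess):

    import openai  # expects OPENAI_API_KEY in the environment; do not hard-code the key

    MODELS = ["text-davinci-003", "code-davinci-002"]  # GPT-3 and Codex

    def run_suite(prompts, out_path="generated_queries.txt"):
        """Run every (case_name, prompt) pair against every model and save the results."""
        with open(out_path, "w") as out:
            for model in MODELS:
                for case_name, prompt in prompts:
                    response = openai.Completion.create(
                        model=model,
                        prompt=prompt,
                        temperature=0,       # zero temperature for deterministic output
                        max_tokens=256,
                        stop=[";", "\n\n"],  # a guess; exemplars may make the end of the query easier to detect
                    )
                    query = "SELECT" + response.choices[0].text
                    out.write(f"== {model} / {case_name} ==\n{query}\n\n")

The variations listed above (exemplars, table subsets, phrasing, unrelated columns) would then just be different ways of producing the (case_name, prompt) pairs fed into this runner.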

Please submit the experimental suite in a PR. I don't want to clutter the main branch with this experimental stuff, but we can merge it to a research branch for posterity. Once we are past the prototyping phase, this code can become the basis of the real implementation.

Please write the code so that the generated queries are saved to a file and include this file in the PR too. You don't have to include the output for every variation, but do include what you want to show. The good stuff! 😊 (Make sure you don't include the API key though!)

Thank you!

darabos commented Feb 13, 2023

From today's meeting:

  • Lena has seen good results with SQL-to-text. For now we don't have an application for that, but it's good to know.
  • Lena experimented with different temperatures. Like the docs say, zero is probably best.

(Was there anything else?)

darabos commented Feb 16, 2023

Serious researchers have also looked at this. Evaluating the Text-to-SQL Capabilities of Large Language Models (2022) by Rajkumar et al. looks like the perfect overview!
