Doesn't seem to see a different table schema #5

Open
LaguePesikin opened this issue Jul 2, 2023 · 2 comments

@LaguePesikin

I've gone through your code and paper, but I'm still confused about how you pass the table structure into the prompt so that the model knows which operations it should perform.
I've seen table_schema.py, but it only contains fixed SQL statements.
If this part of the design is missing, does that mean the experiment is only valid for the fruit shop dataset scenario?
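
For readers following the question, here is a minimal sketch of what "passing the table structure to the prompt" could look like in general. It is not taken from the ChatDB code: `TableInfo`, `describeSchema`, `buildSystemPrompt`, the prompt wording, and the example `fruits` table are all hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes: none of these names come from the ChatDB repository.
interface ColumnInfo {
  name: string;
  type: string;
}

interface TableInfo {
  name: string;
  columns: ColumnInfo[];
}

// Render each table as one line: table_name(col1 TYPE, col2 TYPE, ...).
function describeSchema(tables: TableInfo[]): string {
  return tables
    .map((t) => `${t.name}(${t.columns.map((c) => `${c.name} ${c.type}`).join(", ")})`)
    .join("\n");
}

// Build a system prompt that carries the schema, so that a different
// database only requires a different `tables` argument.
function buildSystemPrompt(tables: TableInfo[]): string {
  return [
    "You can operate on a MySQL database with the following tables:",
    describeSchema(tables),
    "Respond with the SQL statements needed for the user's request.",
  ].join("\n");
}

// Assumed example table, loosely modeled on a fruit shop.
const exampleTables: TableInfo[] = [
  {
    name: "fruits",
    columns: [
      { name: "fruit_name", type: "VARCHAR(255)" },
      { name: "price", type: "DECIMAL(10,2)" },
    ],
  },
];

const systemPrompt = buildSystemPrompt(exampleTables);
console.log(systemPrompt);
```

With a design along these lines the schema is data rather than fixed prompt text, which is the property the question is asking about.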

@khill-fbmc

Maybe I can help... I was so fascinated by the paper and the concept that I really wanted to try it. I don't know Python as well as I know TypeScript, so I first made this to do the heavy lifting for me, and I have spent a day making hand edits where ChatGPT struggled.

This is where everything starts

If you're curious about a TypeScript version (with a bonus docker-compose file to run MariaDB and phpMyAdmin), I'm close to putting my work on GitHub once it actually runs. This is what I have done so far:

# Project
  - Add `debug.ts`
  - Add `types.ts`
    - `SqlStep` type added
    - `ChatContext` type added
  - Add `utils.ts`
    - Adding `sleep()` method
    - Adding `input()` method with `prompt-sync`
    - Adding `new_message_list()` to type safely start `ChatCompletionRequestMessage[]` arrays
  - Maps `List` to `Array`
  - Maps `Dict` to `Record`
  - Add Dependencies
    - tiktoken
    - mysql2
    - langchain
    - winston
    - chalk

## call_ai_function.ts
  - Adding `export` to all `function`
  - Typing `model` as `TiktokenModel`
  - Adding `async` to `call_ai_function()`
  - Adding `async` to `populate_sql_statement()`

## chat.ts
  - Adding `export` to all `function`
  - Modify `generate_context` model param from `any` to `TiktokenModel`
  - Modify return to `object` from `tuple`
  - Adding types from `openai` to `create_chat_message()` to return `ChatCompletionRequestMessage`
  - Typing `full_message_history` as `ChatCompletionRequestMessage[]`

## chatdb_prompts.ts
  - ChatDB had a lot of trouble on this one. It was easy to fix by hand.
  - Renaming `user_inp` to `user_input` for use in `chatdb.ts`

## chatdb.ts
  - Adding `export` to all `function`
  - Modify `get_steps_from_response()` to return `SqlStep[]`
  - Modify `chain_of_memory()` to be `async` and accept `SqlStep` instead of `Array<object>`
  - Fix `init_system_msg()` PromptTemplate creation and making `async`
  - Modify `generate_chat_responses()` to be `async`
    - Typing `historical_message` as `ChatCompletionRequestMessage[]`
    - Fix incorrectly transpiled named-argument calls: `{ user_input }` was `(user_inp = user_inp)`
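
To illustrate the named-argument fix, here is a small sketch. `format_prompt` and `PromptArgs` are hypothetical stand-ins, not the actual functions in the port; only the `{ user_input }` call shape comes from the list above.

```typescript
// Hypothetical prompt builder; the real signature in the port may differ.
interface PromptArgs {
  user_input: string;
  table_info?: string;
}

// Taking a single object parameter is the usual TypeScript substitute
// for Python keyword arguments.
function format_prompt({ user_input, table_info = "" }: PromptArgs): string {
  return `Tables:\n${table_info}\nUser: ${user_input}`;
}

const user_input = "How many apples are in stock?";

// Python:  format_prompt(user_inp=user_inp)
// A literal transpile, format_prompt(user_inp = user_inp), is an assignment
// expression in TypeScript, not a named argument. Object shorthand is the
// closest equivalent:
const rendered = format_prompt({ user_input });
console.log(rendered);
```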

## chatgpt.ts
  - Adding `const sleep = (seconds: number) => new Promise((resolve)=> setTimeout(resolve, seconds * 1000));` to mimic python `sleep()`
  - Adding `export async` to `create_chat_completion`
  - Edit `create_chat_completion()` messages param from `any[]` to `ChatCompletionRequestMessage[]`
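
A short sketch of how the seconds-based `sleep` might be used. The `withRetry` helper is hypothetical, and it assumes the pause is there to space out retries of a failing API call, which the list above does not state.

```typescript
// Same idea as the one-liner above: takes seconds, like Python's time.sleep().
const sleep = (seconds: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, seconds * 1000));

// Hypothetical helper: retry an async call, pausing between failed attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts: number,
  delaySeconds: number
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is coming.
      if (i < attempts - 1) await sleep(delaySeconds);
    }
  }
  throw lastError;
}
```

Note that `sleep(10)` here waits ten seconds, not ten milliseconds, because of the `* 1000` conversion.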

## config.ts
  - Adding `import "dotenv/config";` was `load_dotenv();`
  - Adding `const getEnv = (key: string) => process.env[key];`
    - Alias for what was `os.getenv`
  - Commenting out `Singleton` class...
  - Commenting out `Azure` stuff for now...
  - Adding `mysql_database: string | undefined;` prop to `Config` class
  - `export const config = new Config();` was `const cfg = new Config();`
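
Put together, the config changes might look roughly like the sketch below. The environment variable names (`MYSQL_HOST` and so on) and the default port are assumptions; only `getEnv`, the `Config` class, the `mysql_*` property names, and `config` appear in the notes above.

```typescript
// In the port this file starts with `import "dotenv/config";`, which loads
// a .env file into process.env. It is omitted here so that the sketch runs
// without extra dependencies.
const getEnv = (key: string): string | undefined => process.env[key];

class Config {
  mysql_host: string | undefined;
  mysql_user: string | undefined;
  mysql_password: string | undefined;
  mysql_port: number;
  mysql_database: string | undefined;

  constructor() {
    // Variable names below are assumed, not taken from the repository.
    this.mysql_host = getEnv("MYSQL_HOST");
    this.mysql_user = getEnv("MYSQL_USER");
    this.mysql_password = getEnv("MYSQL_PASSWORD");
    this.mysql_port = Number(getEnv("MYSQL_PORT") ?? "3306");
    this.mysql_database = getEnv("MYSQL_DATABASE");
  }
}

// The port exports this (`export const config = new Config();`).
const config = new Config();
```

A module-level instance like this gives every importer the same object, which is likely why the Python `Singleton` class could be commented out.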

## fruit_shop_schema.ts
  - Adding `export` to `const`

## mysql.ts
  - Edit `import  from "mysql2";` was `import * as pymysql from "pymysql";`
  - Add `export` to `class MySQLDB`
  - Remove `Cursor` references from `class MySQLDB`
  - Modify `insert` and `update` from `data:any` to `data: Record<string, string>`
  - Remove
    ```
    if (require.main === module) {
      import { cfg } from "./config";
      let mysql: MySQLDB = new MySQLDB(cfg.mysql_host, cfg.mysql_user, cfg.mysql_password, cfg.mysql_port, "try2");
    }
    ```
  - Remove connect/disconnect in favor of `mysql2` connection pool features
  - `./scripts/test-db-connection.ts` added to test DB

## sql_examples.ts
  - Remove `import re;`
  - Add `export` to all entries
  - Rename `ex_*` was `eg_*`
  - Rename `examples` was `egs`

## tables.ts
  - Add `export` to all `function` and `const`
  - Add `async` to `init_database()`
  - 2nd Pass at `get_table_info()` since the regex calls were not converted correctly

## token_counter.ts
  - Adding imports from `tiktoken`
  - Adding `export` to all `function`
  - Typing the `model` parameter of `count_string_tokens()` as `TiktokenModel` instead of `string`
  - Typing the `model` parameter of `count_message_tokens()` as `TiktokenModel` instead of `string`
  - Moving `let encoding: Tiktoken;` outside of the try/catch
  - Moving `let tokens_per_message: number;` to the top of the method body
  - Moving `let tokens_per_name: number;` to the top of the method body
  - The lower loop over `messages` in `count_message_tokens()` was missing the call on `encoding`
    - Fixed errors in named model args
    - 2nd pass on loop through ChatGPT produced the correct loop
    - Typing the `messages` parameter as `ChatCompletionRequestMessage[]`
  - ChatGPT incorrectly converted `len(encoding.encode(string))` to `string.length`
    - Edit `len(encoding.encode(string))` => `encoding.encode(string).length`
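
The last fix matters because characters and tokens are different units. The sketch below shows the difference with a toy whitespace encoder standing in for a real `tiktoken` encoding; `toyEncoding` is hypothetical and does not produce real token IDs or counts.

```typescript
// Stand-in for a tiktoken encoding: NOT the real tokenizer, just an object
// with the same `encode(text)` -> sequence-of-ids shape.
const toyEncoding = {
  encode(text: string): number[] {
    const words = text.split(/\s+/).filter((w) => w.length > 0);
    return words.map((_, i) => i);
  },
};

function count_string_tokens(text: string): number {
  // Python: len(encoding.encode(string))
  return toyEncoding.encode(text).length;
}

const sample = "select * from fruits";
const tokenCount = count_string_tokens(sample); // 4 with the toy encoder
const charCount = sample.length;                // 20 UTF-16 code units

console.log({ tokenCount, charCount });
```

With `string.length`, context-window budgeting would be based on character counts rather than token counts, so the two numbers can diverge widely.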

@MurrayC7

MurrayC7 commented Sep 7, 2023

> I've gone through your codes and paper but still confused about how you passed the table structure to the prompt to let the model know which operations it should take. I've seen table_schema.py but it only contains fixed SQL sentences. If this part of the design is missing, does it mean that the experiment is only valid in the fruit shop dataset scenario?

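Yes. There is an explainer here, [A symbolic memory framework for large models that improves precise memory and complex reasoning - ChatDB - Zhihu], which claims:

> Some earlier work combining large language models with databases (such as DB-GPT and ChatExcel) also uses a large language model to generate SQL or Excel commands, but ChatDB is fundamentally different from them. DB-GPT and ChatExcel focus mainly on using a large language model to translate natural language into SQL or Excel commands, and they mostly address querying only; the data source itself is given in advance. ChatDB instead uses the database as a symbolic memory module, covering not only queries but all database operations: insert, delete, update, and select. The whole database is built up from nothing, continuously recording and updating the large language model's historical information. Moreover, the database in ChatDB, i.e. the symbolic memory module, is tightly coupled and integrated with the large language model and can help it carry out complex multi-step reasoning.

My understanding is that this work aims to build more general database dialogue on top of Text-to-SQL. More specifically, the scenario implemented so far adds dynamic capabilities (insert, delete, update) on top of static querying. But it certainly still depends on Text-to-SQL accuracy (which is currently not high anywhere), so this work is still more of a conceptual design and is hard to extend.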
