
Some regexes and string manipulation to clean the chaotic JSON produced by ChatGPT (when frequency_penalty ~ 1)

stouch/chatgpt-json-cleaner

Some regexes to fix the broken JSON produced by ChatGPT

(a frequent behaviour with a high frequency_penalty)

Even with a prompt like this one: https://community.openai.com/t/getting-response-data-as-a-fixed-consistent-json-response/28471/4

const prompt = `
pretend to be an expert child behavioural researcher.
create a valid JSON array of objects for translating baby speak into English following this format:

[{"baby": "sound the baby makes",
"volumeDb": "how loud is the sound, decibels as a floating-point number",
"timeMin": "how long the sound is made, minutes with 2 decimal places",
"meaning": "what the baby might be trying to communicate",
"confidencePct": "certainty of meaning, percent as an integer",
"response": "what sound the parent should reply with"}]

The JSON object:
`.trim()

... ChatGPT sometimes gives me really ugly JSON.
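
To make "ugly" concrete, a reply can be checked against the format the prompt asks for before it is used. The following is only a minimal sketch: the helper name `checkReply` and the `EXPECTED_KEYS` list are assumptions made for illustration (the keys are copied from the prompt above), and nothing here is taken from this repository's code.

```javascript
// Hypothetical helper (not part of this repository): checks that a reply
// parses as the array of objects described in the prompt.
const EXPECTED_KEYS = ["baby", "volumeDb", "timeMin", "meaning", "confidencePct", "response"];

function checkReply(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (err) {
    return { ok: false, reason: "not valid JSON: " + err.message };
  }
  if (!Array.isArray(data)) {
    return { ok: false, reason: "top-level value is not an array" };
  }
  for (const [i, item] of data.entries()) {
    if (item === null || typeof item !== "object" || Array.isArray(item)) {
      return { ok: false, reason: `item ${i} is not an object` };
    }
    // Report every key the prompt asked for that this item lacks.
    const missing = EXPECTED_KEYS.filter((k) => !(k in item));
    if (missing.length > 0) {
      return { ok: false, reason: `item ${i} is missing: ${missing.join(", ")}` };
    }
  }
  return { ok: true, data };
}
```

A reply that fails this check is a candidate for cleaning rather than for direct use.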

I typically get results like this (among many other cases):

Screenshot from 2023-02-26 22-32-37

This behaviour is frequent when you use a high frequency_penalty in your request.

So I tried these regexes to improve it. Let me know what you use! Thanks a lot.
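
As an illustration of the general approach, here is a minimal sketch in JavaScript. It is not the set of regexes used in this repository: the function name `cleanJson` and the two repairs it applies (dropping text around the array, removing trailing commas) are assumptions chosen only to show the idea.

```javascript
// Hypothetical helper (not this repository's regexes): applies two simple
// repairs to a model reply, then tries to parse it.
function cleanJson(raw) {
  // Keep only the span from the first "[" to the last "]",
  // dropping any chatter before or after the array.
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) {
    return null;
  }
  let s = raw.slice(start, end + 1);
  // Remove trailing commas that sit right before a closing brace or bracket.
  s = s.replace(/,\s*([}\]])/g, "$1");
  try {
    return JSON.parse(s);
  } catch (err) {
    // Still broken after these repairs: give up rather than guess.
    return null;
  }
}
```

This kind of regex is naive: it does not know whether it is inside a string value, so a value that happens to contain a comma followed by "]" or "}" would be altered. It is probably best treated as a fallback after a plain JSON.parse has already failed.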

Tests

Fill test.json.file

php test.php

To debug a specific string, set it in $content at the top of test.php.
