
One-Step-Translator for Paradox Interactive's Clausewitz-style i18n locale files of games and mods with ChatGPT


wzp21142/Paradox-OST-Chatgpt


Paradox-One-Step-Translator-Chatgpt

MIT licensed 中文

Translate non-standard Paradox-style YML i18n files with AI assistance. In just one step, you can enjoy mods and games with all text localized!

Personal request: If you intend to publicly release a translation mod generated by this tool, please respect the work of human translators and proofreaders, and state clearly that the mod was translated by ChatGPT!

Usage

Step 1. Create a .env file with your OpenAI key

OPENAI_API_KEY=<your key>

Step 2. Install dependencies

pip install -r requirements.txt

Step 3. Run the script

python translate.py -i "YOUR_SOURCE_DIRECTORY" -l LANGUAGE -o "YOUR_OUTPUT_DIRECTORY"

The script finds YML files in all subfolders of the source directory, so SOURCE_DIRECTORY should be a language folder such as "...\localization\english" (or the folder of another language). DO NOT PASS THE "...\localization" FOLDER ITSELF! The output directory is optional. If it is not provided, the default directory will be YOUR_SOURCE_DIRECTORY/localization/LANGUAGE. For example, running the script with

python translate.py -i "...\localization\english" -l simp_chinese

translates the English i18n files and writes the translated files to "...\localization\simp_chinese".
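The file discovery and output-path behavior described above can be sketched roughly as follows. This is an illustration only, not the actual code of translate.py: the function names are hypothetical, and the default output location assumes, as in the example above, a sibling folder of the source directory named after the target language. The real script may compute it differently.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the -i / -l / -o flags shown in Step 3 (hypothetical re-creation)."""
    parser = argparse.ArgumentParser(description="Sketch of the translate.py CLI")
    parser.add_argument("-i", dest="input_dir", required=True, type=Path)
    parser.add_argument("-l", dest="language", required=True)
    parser.add_argument("-o", dest="output_dir", type=Path, default=None)
    return parser.parse_args(argv)


def resolve_output_dir(input_dir, language, output_dir=None):
    """Return the explicit output directory, or a default one.

    Assumption: the default is a sibling of the source language folder,
    e.g. ...\\localization\\english -> ...\\localization\\simp_chinese.
    """
    if output_dir is not None:
        return Path(output_dir)
    return Path(input_dir).parent / language


def find_yml_files(input_dir):
    """Collect every .yml file in the source directory and all its subfolders."""
    return sorted(p for p in Path(input_dir).rglob("*.yml") if p.is_file())


def plan_outputs(input_dir, language, output_dir=None):
    """Pair each source file with a target path that mirrors the subfolder layout."""
    out_root = resolve_output_dir(input_dir, language, output_dir)
    return [
        (src, out_root / src.relative_to(input_dir))
        for src in find_yml_files(input_dir)
    ]
```

Under this assumption, passing the "...\localization" folder itself would pick up the files of every language at once, which is presumably why the warning above exists.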

Step 4. If you did not specify an output path, and you are certain that the default output path does not already contain machine-translated i18n files created by the mod developer, you are done and can enjoy the game!

Notice

  1. Currently, the program uses a rather naive approach: it splits long texts and translates them segment by segment using ChatGPT. Because the GPT-3.5 model handles long texts poorly, the results may be unsatisfactory for large mods, and a lot of proofreading may be required. I believe this can be improved with better prompts or more reasonable hyperparameter settings. For example, the current script starts a new segment every 1350 input tokens, but setting this to 400 or even 300 can yield higher translation quality. However, smaller segments also mean more translation requests, and therefore more tokens spent on repeated prompts. Please weigh this trade-off carefully if you need high-quality translations.

  2. Large files may take a long time, as 99% of the program's runtime is spent waiting for replies from ChatGPT. Concurrent translation requests will not be implemented for now, out of concern for account safety amid widespread account suspensions. Unless OpenAI improves response speed at the API level, this cannot be fixed for the time being.

  3. Known bugs:

    [1] ChatGPT sometimes replies in an unexpected format, which causes mismatched key-value pairs and an "IndexError: list index out of range" error. The current workaround is to use "↑" as the separator between input values sent to ChatGPT instead of the default quotation marks and commas (the value text itself may contain those characters, and ChatGPT may not fully understand the segmentation when processing context, which leads to translation errors). This has significantly reduced the problem in testing, but it may still occur occasionally. If it does, the cause may be "↑" characters in the input file; try replacing them with other characters. To resume, compare the input with the translated files in the output path, temporarily delete the corresponding source files in the input path up to the point where the translation was interrupted, and restart the translation from that checkpoint. If the error persists across multiple attempts on the same file, try reducing the segment size (see 1). Because GPT's responses are random, there is currently no better solution.

    [2] When several lines of text have similar content (this may not be the only trigger), ChatGPT is likely to skip some of them, resulting in unmatched key-value pairs. This issue does not occur in the web interface, but it is almost certain to occur through the API. Fixing it probably requires more refined prompt design. If you are interested, try translating the Victoria 3\game\localization\english\concepts_l_english.yml file of Victoria 3, which reliably triggers the error when translating the first segment, near line 10 (the text "Healthy").

  4. The current script is only an initial version and may have other issues. If you have better suggestions or configurations, or discover any other bugs, please feel free to create an issue or submit a pull request (PR)!
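To make the segmentation trade-off in note 1 and the separator check in note 3 more concrete, here is a minimal sketch. It is not the script's actual implementation: all function names are hypothetical, and `estimate_tokens` is a crude stand-in (roughly four characters per token) for whatever tokenizer the real script uses.

```python
SEPARATOR = "↑"  # separator mentioned in known bug [1]


def estimate_tokens(text):
    """Assumption: rough token estimate, about 4 characters per token."""
    return max(1, len(text) // 4)


def split_into_segments(values, max_tokens=1350):
    """Greedily group values so each segment stays within max_tokens.

    A single value larger than the budget still gets a segment of its own.
    """
    segments, current, used = [], [], 0
    for value in values:
        cost = estimate_tokens(value)
        if current and used + cost > max_tokens:
            segments.append(current)
            current, used = [], 0
        current.append(value)
        used += cost
    if current:
        segments.append(current)
    return segments


def join_segment(values):
    """Join values with the separator, rejecting text that already contains it."""
    for value in values:
        if SEPARATOR in value:
            raise ValueError(f"value already contains {SEPARATOR!r}: {value!r}")
    return SEPARATOR.join(values)


def split_reply(reply, expected_count):
    """Split a reply on the separator and fail early if the count is off."""
    parts = [part.strip() for part in reply.split(SEPARATOR)]
    if len(parts) != expected_count:
        raise ValueError(f"expected {expected_count} values, got {len(parts)}")
    return parts
```

Lowering `max_tokens` produces more, smaller segments, which means more requests and more repeated prompt tokens. Checking the count before pairing values with keys turns a silent mismatch into an explicit error that names the failing segment, rather than an IndexError later on.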
