produce sourcemap of translation #372

Open
milahu opened this issue Oct 15, 2023 · 1 comment

Comments

@milahu
Contributor

milahu commented Oct 15, 2023

Source-to-source compilers usually produce sourcemaps,
so for each output token I can see where that token came from.

Sourcemaps would also be useful for language-to-language translators,
for example when translating rich text formats like HTML, ODT, DOCX, PDF...

To translate a rich text document, I would remove all markup,
feed the plain text of each sentence to the translator,
and then use the sourcemap to reconstruct the markup in the translated text.
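
A minimal sketch of that strip / translate / reconstruct idea, assuming the translator could return a token-level alignment (pairs of source and target token indices). The helper names `strip_markup` and `reapply_markup` are hypothetical, and the translation and alignment in the usage example are hand-written for illustration, not output from argos-translate or CTranslate2.

```python
import re

TAG_RE = re.compile(r"<(/?)(\w+)>")

def strip_markup(html):
    """Split simple inline HTML into plain tokens plus tag spans.

    Returns (tokens, spans), where each span is (tag, start, end) in
    source-token indices, end exclusive. Assumes well-nested tags
    without attributes and whitespace-separated tokens.
    """
    tokens, spans, open_tags = [], [], []
    pos = 0
    for m in TAG_RE.finditer(html):
        tokens.extend(html[pos:m.start()].split())
        closing, tag = m.group(1), m.group(2)
        if closing:
            name, start = open_tags.pop()
            spans.append((name, start, len(tokens)))
        else:
            open_tags.append((tag, len(tokens)))
        pos = m.end()
    tokens.extend(html[pos:].split())
    return tokens, spans

def reapply_markup(target_tokens, spans, alignment):
    """Wrap target tokens in the tags of the source spans they align to.

    alignment is a list of (source_index, target_index) pairs, i.e. the
    hypothetical "sourcemap". Each span is projected onto the smallest
    contiguous target range covering its aligned tokens.
    """
    opens, closes = {}, {}
    # Longest spans first, so outer tags open before and close after inner ones.
    for tag, start, end in sorted(spans, key=lambda s: s[1] - s[2]):
        targets = [t for s, t in alignment if start <= s < end]
        if not targets:
            continue  # nothing aligned: the tag is dropped in this sketch
        opens.setdefault(min(targets), []).append(tag)
        closes.setdefault(max(targets), []).insert(0, tag)
    out = []
    for i, tok in enumerate(target_tokens):
        prefix = "".join(f"<{t}>" for t in opens.get(i, []))
        suffix = "".join(f"</{t}>" for t in closes.get(i, []))
        out.append(prefix + tok + suffix)
    return " ".join(out)

# Usage with a hand-written translation and alignment (illustrative only).
tokens, spans = strip_markup("The <b>red house</b> is old")
target = ["La", "casa", "roja", "es", "vieja"]
alignment = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 4)]
print(reapply_markup(target, spans, alignment))
```

Projecting a span onto a contiguous target range is a simplification: when word order changes a lot, two projected spans can overlap and the output would need extra repair to stay well-formed.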

Would this be possible?

Google Translate shows the connection between source and translated sentences.
Such a "sourcemap of sentences" would also be useful.
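
A sentence-level map needs no support from the translation engine, as long as sentences are translated one at a time. The sketch below assumes a caller-supplied `translate_sentence` callable (hypothetical, standing in for whatever translation function is used) and a deliberately naive regex for sentence splitting. It records, for each sentence, the character range in the source and in the output.

```python
import re

def translate_with_sentence_map(text, translate_sentence):
    """Translate text sentence by sentence and record character ranges.

    Returns (translated_text, mapping), where mapping is a list of
    ((src_start, src_end), (tgt_start, tgt_end)) pairs, ends exclusive.
    translate_sentence is any callable str -> str (an assumption here).
    """
    mapping = []
    parts = []
    out_len = 0
    # Naive splitter: a run of non-terminators, optional terminators, whitespace.
    for m in re.finditer(r"[^.!?]+[.!?]*\s*", text):
        raw = m.group(0)
        sentence = raw.strip()
        if not sentence:
            continue
        src_start = m.start() + (len(raw) - len(raw.lstrip()))
        src_end = src_start + len(sentence)
        translated = translate_sentence(sentence)
        if parts:
            out_len += 1  # account for the joining space
        tgt_start = out_len
        out_len += len(translated)
        parts.append(translated)
        mapping.append(((src_start, src_end), (tgt_start, out_len)))
    return " ".join(parts), mapping

# Usage with a stand-in "translator" that only upper-cases its input.
source = "Hello world. How are you?"
output, mapping = translate_with_sentence_map(source, str.upper)
for (s0, s1), (t0, t1) in mapping:
    print(repr(source[s0:s1]), "->", repr(output[t0:t1]))
```

The regex splitter breaks on abbreviations and decimal numbers, so a real sentence boundary detector would be needed for anything beyond a demonstration.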

@PJ-Finlay
Collaborator

If CTranslate2 has support for sourcemaps then this might be possible.

argos-translate-files supports translating ODT, HTML, and DOCX files.

LibreTranslate/argos-translate-files#1
