tx client pulls malformed source files for gettext (.pot) resources #172

Open
Mi605 opened this issue Mar 24, 2023 · 2 comments


Mi605 commented Mar 24, 2023

Bug report:

scope:

  • TX Client, version=1.6.5

focus:

  • tx pull -s

issue:

  • When gettext source files (.pot) are pulled from Transifex with the tx client, the resulting files are invalid: every msgstr field is filled in, whereas in a source file they must all be empty.

Example of valid .pot source file:
(string entries section)

...

#. String position: Command line help message, usage instructions header line
#: test2:31
msgid "Usage:"
msgstr ""

#. String position: Command line help message, single word in usage
#. instructions header line and descriptions header
#. String position: Command line help message, general program description
#: test2:33 test2:35
msgid "Options"
msgstr ""

...

Example of what tx client pulls instead:
(string entries section)

  ...

  #. String position: Command line help message, usage instructions header line
  #: test2:31
  msgid "Usage:"
  msgstr "Usage:"

  #. String position: Command line help message, single word in usage
  #. instructions header line and descriptions header
  #. String position: Command line help message, general program description
  #: test2:33 test2:35
  msgid "Options"
  msgstr "Options"

  ...

The malformed .pot files that the tx client pulls cannot be used as they are; all the spurious msgstr values have to be cleared manually first.
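As a stopgap, the manual cleaning described above could be scripted. The following is a minimal sketch, not part of the tx client: `blank_msgstrs` is a hypothetical helper that resets every msgstr (including plural forms such as msgstr[0]) to an empty string. It assumes that the entry with an empty msgid is the header entry and leaves that entry's msgstr untouched, and it does not touch obsolete entries (lines starting with `#~`).

```python
import re

# Matches "msgstr" or "msgstr[N]" at the start of a line, keeping any indentation.
MSGSTR_RE = re.compile(r'^(\s*msgstr(?:\[\d+\])?)\s')


def blank_msgstrs(pot_text):
    """Return pot_text with every non-header msgstr reset to an empty string."""
    out = []
    field = None     # keyword the current continuation lines belong to
    header = False   # True while the current entry has an empty msgid
    for line in pot_text.splitlines():
        s = line.strip()
        m = MSGSTR_RE.match(line)
        if s.startswith('msgid '):
            field = 'msgid'
            header = (s == 'msgid ""')
            out.append(line)
        elif m:
            field = 'msgstr'
            # Keep the header's msgstr, blank all others.
            out.append(line if header else m.group(1) + ' ""')
        elif s.startswith('"'):
            # Continuation line of a multi-line string.
            if field == 'msgid' and s != '""':
                header = False   # a multi-line msgid is not the header
            if field != 'msgstr' or header:
                out.append(line)
        else:
            # Comments, blank lines, msgctxt, msgid_plural.
            field = 'other' if s.startswith('msg') else None
            out.append(line)
    return '\n'.join(out) + '\n'


if __name__ == "__main__":
    pulled = (
        '#: test2:31\n'
        'msgid "Usage:"\n'
        'msgstr "Usage:"\n'
    )
    print(blank_msgstrs(pulled), end="")
```

This only works around the symptom. A file cleaned this way is still not guaranteed to be byte-identical to the originally uploaded .pot, so a plain checksum comparison may still fail.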

Member

kbairak commented Mar 28, 2023

Hey @Mi605 ,

I think this is a problem with Transifex in general and not with the client, i.e. I think that if you try to download the source file from the Transifex UI, you will get the same behavior.

My question is: why would you need to download the source pot file from Transifex in the first place?

We implemented pull -s mostly for completeness's sake, but for most workflows, we expect users to push source files and pull translation files. Can you please explain your use-case in order to explore potential solutions?

Author

Mi605 commented Mar 28, 2023

Hello @kbairak,

why would you need to download the source pot file from Transifex in the first place?

There are several reasons why this is needed.

First, it is about making sure existing translations are not deleted by accident. Consider a team of volunteer localisation maintainers on Transifex, e.g. for a Linux distribution, handling many different resources in one project. Before one of them updates a resource to add some strings, he or she should be able to download the source currently in use by Transifex for a checksum comparison or a diff. This confirms that nobody else has uploaded fixes to Transifex in the meantime, which the new upload would revert; the maintainer cannot otherwise know whether the .pot source on Transifex still matches the one he or she uploaded some months earlier. There must be a way to cross-check the locally stored .pot file against what is on Transifex, to avoid accidental reverts and the resulting loss of already translated strings.

This safety check simply does not work when Transifex changes the source file it returns: the shasum comparison fails, and any diff tool reports every single entry as changed (while actually none was). That leaves only manually checking thousands of msgid entries in the .pot file, one by one and character by character, to make sure somebody else's upload has not changed anything since the last upload. Such a manual comparison is impossible for resources like zzzFM (a gettext translation file), which comprises more than 8,000 translatable words in more than 1,400 msgid entries.
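The cross-check described here can be approximated by comparing only the source strings and ignoring msgstr entirely. The following is a rough sketch under stated assumptions: `source_strings`, `fingerprint` and `compare` are hypothetical helper names, comments (`#.` and `#:` lines) are not compared, and the fingerprint depends on entry order, which Transifex may or may not preserve on download.

```python
import hashlib

SOURCE_KEYWORDS = ("msgctxt", "msgid", "msgid_plural")


def source_strings(pot_text):
    """Collect (keyword, raw text) pairs for source fields, ignoring msgstr."""
    items = []
    collecting = False   # True while continuation lines belong to a source field
    for line in pot_text.splitlines():
        s = line.strip()
        keyword, _, rest = s.partition(" ")
        rest = rest.lstrip()
        if keyword in SOURCE_KEYWORDS and rest.startswith('"'):
            items.append([keyword, rest[1:-1]])
            collecting = True
        elif s.startswith('"') and collecting:
            # Join wrapped strings so that line wrapping does not matter.
            items[-1][1] += s[1:-1]
        else:
            collecting = False
    # Drop the header entry, which has an empty msgid.
    return [(k, v) for k, v in items if not (k == "msgid" and v == "")]


def fingerprint(pot_text):
    """SHA-256 over the source strings only (order-sensitive)."""
    h = hashlib.sha256()
    for k, v in source_strings(pot_text):
        h.update(f"{k}\0{v}\n".encode("utf-8"))
    return h.hexdigest()


def compare(local_text, pulled_text):
    """Return (only in local, only in pulled), independent of entry order."""
    local = set(source_strings(local_text))
    pulled = set(source_strings(pulled_text))
    return sorted(local - pulled), sorted(pulled - local)
```

If the two fingerprints match, the msgid content is the same even though the msgstr fields differ. If they do not match, `compare` lists the entries that exist on only one side, which is far shorter than a diff that flags every entry.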

A second reason is this scenario: you work from different locations and have access to nothing but the resource file you can download from Transifex (either with the tx client or through the Transifex web UI). What you get is not a valid .pot resource file; you have to manually clear thousands of filled-in msgstr entries before you can start working with it, e.g. to add a few new entries or remove outdated ones, and then upload the resource back to Transifex as an update. For this, Transifex needs to return source files that are at least properly formatted. In a .pot file all msgstr entries must be empty strings:

msgstr ""

What you return as source is simply unusable without intricate manual cleaning.

if you try to download the source file from the Transifex web UI, you will get the same behavior.

You are right, I've just checked: the very same issue exists there. The source returned by the UI is modified by Transifex, so you cannot cross-check it against the locally stored .pot source from your previous upload before uploading a new one. The big fat warning in the UI about deletion of translations when a new source is uploaded is therefore all in vain. The returned source file cannot be checked against what you believe is present on Transifex. You will only notice after the upload, when somebody complains about a regression or a revert of changes somebody else made in the meantime.

But even if the Transifex web UI fails at this, the tx client at least should return a proper .pot source file when requested.
