This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

Cannot select OCR language other than English #735

Open
Nadrazhul opened this issue Jan 14, 2018 · 6 comments

@Nadrazhul

I had an issue similar to #107, but with a twist: I'm running Ubuntu 16.04 LTS and, as recommended, I have successfully installed Paperwork via Flatpak.
So it now lives in ~/.local/share/flatpak/app/work.openpaper.Paperwork
Also, I have installed an additional OCR language (German) via
sudo apt-get install tesseract-ocr tesseract-ocr-deu
and I have confirmed that deu.traineddata has been installed to /usr/share/tesseract-ocr/tessdata/

However, when I now open Paperwork's Settings, I can only select English as the OCR language.

What is the correct way to install additional OCR languages in this scenario? I tried copying deu.traineddata from the main share directory to .local/share/flatpak/app/work.openpaper.Paperwork/current/active/files/share/tessdata/. Now I can select German as the OCR language, and German recognition seems to work.
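
The manual copy described above can be wrapped in a small helper. This is only a sketch of the workaround, not an officially supported procedure: copy_traineddata is a hypothetical name, and the paths in the usage comment are the ones quoted in this report, so they may differ on other systems or change in later Paperwork releases.

```shell
# Hypothetical helper: copy one Tesseract language file into a tessdata directory.
# Usage: copy_traineddata LANG SRC_DIR DEST_DIR
copy_traineddata() {
    lang=$1
    src_dir=$2
    dest_dir=$3
    src_file="$src_dir/$lang.traineddata"

    # Refuse to continue if the language file is not installed on the host.
    if [ ! -f "$src_file" ]; then
        echo "missing: $src_file" >&2
        return 1
    fi
    # Refuse to continue if the target directory does not exist.
    if [ ! -d "$dest_dir" ]; then
        echo "no such directory: $dest_dir" >&2
        return 1
    fi
    cp -- "$src_file" "$dest_dir/"
}

# Example call with the paths from this report (assumed, not verified elsewhere):
# copy_traineddata deu /usr/share/tesseract-ocr/tessdata \
#     "$HOME/.local/share/flatpak/app/work.openpaper.Paperwork/current/active/files/share/tessdata"
```

The helper fails loudly instead of creating the destination, so a changed tessdata path shows up as an error rather than a silent copy to the wrong place.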

Is this all there is to it? If so, could you maybe update the OCR language entry of the FAQ accordingly to inform other flatpak users?

@jflesch
Member

jflesch commented Jan 15, 2018

sudo apt-get install tesseract-ocr tesseract-ocr-deu

If you installed Paperwork using Flatpak, apt is of no use: Paperwork runs in its own container. Flatpak should have automatically installed the Tesseract data file for your language (based on your system locale).
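
To see which language such a locale-based selection would pick, the language part of the locale name can be extracted as follows. This is a rough sketch: locale_language is a hypothetical helper, and it assumes that the selection is keyed on the language part of the locale (for example "en" for en_US.UTF-8), which this thread does not confirm.

```shell
# Hypothetical helper: print the language part of a POSIX locale name.
# With no argument it falls back to LC_ALL, then LANG, then "C".
locale_language() {
    loc=${1:-${LC_ALL:-${LANG:-C}}}
    loc=${loc%%.*}   # drop the codeset, e.g. ".UTF-8"
    loc=${loc%%@*}   # drop the modifier, e.g. "@euro"
    loc=${loc%%_*}   # drop the territory, e.g. "_US"
    printf '%s\n' "$loc"
}

# Example: locale_language en_US.UTF-8
```

Under that assumption, a system whose only locale is English would receive the English data file and nothing else.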

@jflesch
Member

jflesch commented Jan 15, 2018

What you can try:

flatpak run --command=bash work.openpaper.Paperwork
find /app/share/runtime/locale -type f

It should show a deu.traineddata.

The diagnostic output could also help.
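
For a quicker overview than a raw file listing, the language codes can be derived from the file names. This is a minimal sketch: list_ocr_languages is a hypothetical helper, and it assumes that every installed language is a file named CODE.traineddata somewhere below the directory given as argument.

```shell
# Hypothetical helper: print the language code of every *.traineddata file
# found below the given directory, one per line, sorted and without duplicates.
list_ocr_languages() {
    find "$1" -type f -name '*.traineddata' 2>/dev/null |
        sed 's|.*/||; s|\.traineddata$||' |
        sort -u
}

# Example, pasted into the shell opened by the flatpak run command above:
# list_ocr_languages /app/share
```

An empty result means that no language data was found under that directory.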

@jflesch jflesch added this to the 1.2.3 milestone Jan 15, 2018
@jflesch
Member

jflesch commented Feb 1, 2018

Ping?

@jflesch jflesch modified the milestones: 1.2.3, 1.2.4 Feb 1, 2018
@Nadrazhul
Author

Apologies, I had a mind to configure ecryptfs to protect the Paperwork documents and all the other home folder contents. This led into another rabbit hole of issues, unrelated to Paperwork, which I haven't fully resolved yet.

Of course, you are right: it makes perfect sense for Paperwork to install the default OCR language based on the system locale. My default locale is en_US.UTF-8, by the way, so defaulting to eng.traineddata makes perfect sense. I would therefore consider adding additional OCR languages to a Flatpak installation a documentation improvement/addition to the FAQ rather than a program issue.

Anyway, running the commands above does not show any languages, only:
/app/share/runtime/locale/.ref

And here is the diagnostic output:
issue735diag.log

@Nadrazhul Nadrazhul reopened this Feb 3, 2018
@jflesch
Member

jflesch commented Feb 4, 2018

OK. Just beware: because of #744, the path for the tessdata will change at some point (it will probably be placed somewhere in /home, I guess).

@jflesch jflesch modified the milestones: 1.2.4, 1.4.0 Feb 27, 2018
@kafran

kafran commented May 30, 2018

If you run $ flatpak list --all, you will see that the work.openpaper.Paperwork.Locale/x86_64/master package is only partially installed.

To fix this, reinstall the locale package. This will download the whole language support:

$ flatpak --user install --reinstall work.openpaper.Paperwork-origin work.openpaper.Paperwork.Locale//master
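
If only a few languages are wanted rather than the whole set, newer Flatpak releases document a languages configuration key that takes a semicolon-separated list of language codes. The sketch below only builds that list. join_languages is a hypothetical helper, and the commented commands assume a Flatpak version that provides flatpak config, which may not be the case on older distributions; they were not tested against Paperwork.

```shell
# Hypothetical helper: join language codes with semicolons, e.g. "en;de".
join_languages() {
    out=
    for code in "$@"; do
        # Prefix with "existing;" only when something is already collected.
        out="${out:+$out;}$code"
    done
    printf '%s\n' "$out"
}

# Assumed usage (requires a Flatpak version with the "config" command):
# flatpak config --user --set languages "$(join_languages en de)"
# flatpak update --user
```

Whether the update then fetches the matching Tesseract data for Paperwork depends on how the Locale package is split, so it is worth checking the result in Paperwork's Settings afterwards.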
