Can't read some fonts and characters; returns cid() value instead #106

Open
iodabasi opened this issue Feb 7, 2024 · 2 comments
Labels
bug Something isn't working pdfminer.six related to underlying pdfminer.six

Comments

iodabasi commented Feb 7, 2024

Some characters, such as €, are not read correctly: the text 339,45 € is extracted as 339,45 cid(128).
I can send the PDF if needed; I can't attach it here.

This seems to be caused by an underlying pdfminer.six issue; see these similar open issues in that repo:
pdfminer/pdfminer.six#635
pdfminer/pdfminer.six#796
pdfminer/pdfminer.six#927
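
Until the underlying issue is resolved, a post-processing step can serve as a stopgap. The sketch below is not part of this project or of pdfminer.six: the function name `replace_cid_placeholders` is hypothetical, and it assumes the placeholder number is a single-byte code in Windows-1252 (where 128 is €). That assumption matches the example above but may not hold for other fonts, so placeholders that cannot be decoded are left untouched.

```python
import re

# Matches "(cid:128)" (the placeholder form pdfminer.six normally emits)
# as well as "cid(128)" as quoted in this issue.
CID_PATTERN = re.compile(r"\(cid:(\d+)\)|cid\((\d+)\)")


def replace_cid_placeholders(text: str, encoding: str = "cp1252") -> str:
    """Replace cid placeholders with the character at that byte value.

    Assumption: the cid number is a single-byte code in `encoding`.
    Anything that cannot be decoded is returned unchanged.
    """

    def _decode(match: re.Match) -> str:
        code = int(match.group(1) or match.group(2))
        if code > 255:
            return match.group(0)  # not a single byte, leave as-is
        try:
            return bytes([code]).decode(encoding)
        except UnicodeDecodeError:
            return match.group(0)  # byte undefined in this encoding

    return CID_PATTERN.sub(_decode, text)


print(replace_cid_placeholders("339,45 cid(128)"))
```

This only patches the extracted text after the fact; it does not fix the font mapping that produces the placeholder in the first place.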

@iodabasi iodabasi added the bug Something isn't working label Feb 7, 2024

iodabasi commented Feb 7, 2024

@krishnasism


krishnasism commented Feb 23, 2024

This should be fixed by the pdfminer.six updates in our fork: https://github.com/weareprestatech/pdfminer.six/tree/master

The changes are currently in branch 20240222. For now, I install the package directly from that branch in our Dockerfile:

pip install --no-cache-dir git+https://github.com/weareprestatech/pdfminer.six.git@20240222#egg=pdfminer-six

@krishnasism krishnasism added the pdfminer.six related to underlying pdfminer.six label Feb 29, 2024