Due to reduced Image quality (after conversion), text is not readable #241

Open
jaiswati opened this issue Aug 25, 2022 · 1 comment

jaiswati commented Aug 25, 2022

Describe the bug
After conversion, the image quality is reduced to the point that the text is not readable. This was tried in a Colab notebook.
To Reproduce
Steps to reproduce the behavior:
from pdf2image import convert_from_path, convert_from_bytes
from IPython.display import display, Image

images = convert_from_bytes(open('/content/sample_data/test.pdf', 'rb').read(), size=800, dpi=400)
display(images[0])

Expected behavior
The text in the converted image should be clear and readable.

Desktop (please complete the following information):
colab notebook with Chrome browser

Belval (Owner) commented Sep 3, 2022

The way the code is written, both parameters (dpi and size) are passed straight to pdftoppm. This means the issue you are seeing is most likely not at the pdf2image level, but in the underlying library.

My best solution would be to resize the output of pdf2image manually instead of using the parameter. Something like:

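from PIL import Image
from pdf2image import convert_from_bytes

# Render at full resolution; size is left out so that pdftoppm does not rescale
images = convert_from_bytes(open('/content/sample_data/test.pdf', 'rb').read(), dpi=400)

# thumbnail() resizes in place, keeping the aspect ratio
images[0].thumbnail((800, 800))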

display(images[0])
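
One thing that may be worth checking alongside this: an 800-pixel bound leaves few pixels per inch of page, whichever tool does the downscaling. Below is a rough sketch of that arithmetic. It assumes a US Letter page (the issue does not state the page size) and assumes that both size=800 and thumbnail((800, 800)) fit the page inside an 800 x 800 box. effective_dpi is a hypothetical helper written for this illustration, not part of pdf2image or PIL.

```python
def effective_dpi(page_inches, target_px):
    """Pixels per inch left after the page's longest side is scaled to target_px."""
    return target_px / max(page_inches)


# Assumption: US Letter page, 8.5 x 11 inches (not stated in the issue).
letter = (8.5, 11.0)

# Pixel dimensions of the page when rendered at dpi=400, before any downscaling.
rendered_px = tuple(round(side * 400) for side in letter)
print(rendered_px)  # (3400, 4400)

# Resolution that remains once the longest side is reduced to 800 pixels.
print(round(effective_dpi(letter, 800), 1))  # 72.7
```

Under these assumptions the final image carries roughly 73 pixels per inch regardless of the dpi used for rendering, so small text may stay hard to read unless the bound is raised or the downscaling step is skipped.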
