I'm getting

    tmpFileName = pdfImage.extract_to(fileprefix = "tmp")
  File "C:\ProgramData\Anaconda3\lib\site-packages\pikepdf\models\image.py", line 668, in extract_to
    extension = self._extract_to_stream(stream=bio)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pikepdf\models\image.py", line 611, in _extract_to_stream
    im = self._extract_transcoded()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pikepdf\models\image.py", line 581, in _extract_transcoded
    im = self._extract_transcoded_1248bits()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pikepdf\models\image.py", line 528, in _extract_transcoded_1248bits
    im = _transcoding.image_from_buffer_and_palette(
  File "C:\ProgramData\Anaconda3\lib\site-packages\pikepdf\models\_transcoding.py", line 143, in image_from_buffer_and_palette
    im = image_from_byte_buffer(buffer, size, stride)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pikepdf\models\_transcoding.py", line 107, in image_from_byte_buffer
    return Image.frombuffer('L', size, buffer, "raw", 'L', stride, ystep)
  File "C:\ProgramData\Anaconda3\lib\site-packages\PIL\Image.py", line 2932, in frombuffer
    im = im._new(core.map_buffer(data, size, decoder_name, 0, args))
ValueError: buffer is not large enough
This happens when trying to extract PNGs from some PDFs. Most PNGs are extracted correctly, but some raise this exception. I tried to debug a bit, but apart from noticing that a "wrong" mode is passed to PIL.Image.frombuffer(), I was unable to find the issue. By "wrong" I mean that 'L' is always passed there, even though, at least for the problematic PNG, self.mode == 'P'. I have no idea what that means, but it is the only thing I was able to notice.
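Since the error message is about the buffer size rather than the mode, one thing that might help narrow this down is comparing the length of the decoded buffer with the size a raw image of those dimensions would need. The following is only a rough sketch under the assumption that each row of samples is padded to a whole number of bytes; `expected_buffer_size` and `buffer_is_large_enough` are hypothetical helper names, not part of pikepdf or Pillow.

```python
def expected_buffer_size(width, height, bits_per_component, components=1):
    """Minimum number of bytes for a raw image whose rows are byte-aligned.

    Assumption: every row starts on a byte boundary, so a row that does not
    fill a whole number of bytes is padded up to the next byte.
    """
    bits_per_row = width * bits_per_component * components
    stride = (bits_per_row + 7) // 8  # round up to whole bytes per row
    return stride * height


def buffer_is_large_enough(buffer, width, height, bits_per_component, components=1):
    """Return True if the buffer holds at least the expected number of bytes."""
    needed = expected_buffer_size(width, height, bits_per_component, components)
    return len(buffer) >= needed
```

For example, a 10x3 image with 1 bit per pixel needs 2 bytes per row, so 6 bytes in total; a shorter buffer would fail this check.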
The code I'm using:
import os
from pathlib import Path
from pikepdf import Name, Pdf, PdfImage
files = [f for f in os.listdir('.') if os.path.isfile(f) and str(f).endswith(".pdf")]
for fileName in files:
    pdfFile = Pdf.open(fileName, allow_overwriting_input = True)
    for page in pdfFile.pages:
        for j, (name, rawImage) in enumerate(page.images.items()):
            pdfImage = PdfImage(rawImage)
            tmpFileName = pdfImage.extract_to(fileprefix = "tmp")
            # some unrelated work is done here
    pdfFile.save()
    pdfFile.close()
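To see which image names trigger the exception without aborting the whole run, the extraction could be wrapped so that a ValueError is recorded per image instead of propagating. This is a minimal sketch, not a fix for the underlying problem; `extract_each` and its `extract` parameter are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, Mapping, Tuple


def extract_each(
    images: Mapping[str, Any],
    extract: Callable[[Any], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Call extract() on every image and collect successes and failures.

    Returns two dicts keyed by image name: the value returned by extract()
    for images that worked, and the error message for images that raised
    ValueError.
    """
    extracted: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for name, raw_image in images.items():
        try:
            extracted[name] = extract(raw_image)
        except ValueError as exc:
            # Record the failure and continue with the next image.
            failed[name] = str(exc)
    return extracted, failed
```

In the loop above this could be called as `extract_each(page.images, lambda raw: PdfImage(raw).extract_to(fileprefix = "tmp"))`, and the names in the second dict would be the images that fail.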
It crashes on element
from attached pdf.
Dyko.pdf