PDF format #611

heinrich5991 · 2022-06-13T10:10:45Z

Specification: https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf
Sample: https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf

armijnhemel · 2022-06-13T10:32:06Z

Please note: PDF is typically parsed from the end of the file, using an index of byte offsets (the cross-reference table). This is difficult with Kaitai Struct, because you first have to skip over all the data, search for the index, and only then parse the file using the information from the index.
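To illustrate the "parse from the end" step, here is a minimal Python sketch, not a full parser. It assumes a classic, non-linearized file whose last kilobyte contains the `startxref` keyword, followed by the byte offset of the cross-reference section and the `%%EOF` marker. The function name `find_startxref`, the `tail_size` default, and the sample bytes are made up for this example.

```python
import re


def find_startxref(data: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset that follows the last 'startxref' keyword.

    Assumption: the keyword sits within the last `tail_size` bytes.
    """
    tail = data[-tail_size:]
    # The file ends with: startxref <EOL> <offset> <EOL> %%EOF
    matches = list(re.finditer(rb"startxref\s+(\d+)\s+%%EOF", tail))
    if not matches:
        raise ValueError("no startxref keyword found near the end of the file")
    # With incremental updates there can be several; the last one wins.
    return int(matches[-1].group(1))


# Hypothetical, hand-written sample (not a complete, valid PDF).
sample = (
    b"%PDF-1.7\n"
    b"1 0 obj\n<<>>\nendobj\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n29\n%%EOF\n"
)

offset = find_startxref(sample)
# The offset points at the 'xref' keyword, where forward parsing can begin.
print(offset, sample[offset:offset + 4])
```

Only after this backward search does the parser know where the index starts, which is the part that is awkward to express in a purely forward, declarative format description.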

Kreijstal · 2022-07-27T15:47:49Z

If you want to understand PDF better, use qpdf.

rillig · 2024-03-31T20:55:30Z

It would definitely be interesting to see how far Kaitai Struct can model the PDF format, given these peculiarities:

Embedded streams that can be decoded into other file formats (TTF, PNG, JPEG)
Multiple references to the same PDF object
Possible gaps in the file that could be garbage-collected or used for steganography
Circular references between PDF objects
Textual PDF commands such as l, m, and Tj
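As a rough illustration of the last point: page content streams are plain text in postfix notation, so operands come first and the operator (`m`, `l`, `Tj`, ...) follows. The Python sketch below groups tokens accordingly. It is deliberately simplified: it only recognizes numbers, names, and literal strings without nested parentheses, and it ignores arrays, dictionaries, hex strings, comments, and inline images. The names `TOKEN`, `NUMBER`, and `parse_operations` are made up for this example.

```python
import re

# Simplified tokens: a literal string without nested parentheses,
# or a run of characters that are neither whitespace nor parentheses.
TOKEN = re.compile(rb"\((?:\\.|[^\\()])*\)|[^\s()]+")
NUMBER = re.compile(rb"[+-]?(?:\d+\.?\d*|\.\d+)")


def parse_operations(content: bytes):
    """Group a content stream into (operator, operands) pairs."""
    operations = []
    operands = []
    for match in TOKEN.finditer(content):
        token = match.group(0)
        is_operand = (
            token.startswith(b"(")      # literal string
            or token.startswith(b"/")   # name object
            or NUMBER.fullmatch(token) is not None
        )
        if is_operand:
            operands.append(token)
        else:
            # Anything else is treated as an operator that consumes
            # the operands collected so far.
            operations.append((token.decode("ascii"), operands))
            operands = []
    return operations


# Hypothetical content stream: draw a line, then show a string.
stream = b"10 20 m 30 40 l S BT /F1 12 Tf (Hello) Tj ET"
for operator, args in parse_operations(stream):
    print(operator, args)
```

A format description would have to express the same "collect operands until an operator appears" rule, which is closer to tokenizing a small language than to describing a fixed binary layout.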
