boazsegev edited this page Sep 4, 2014 · 3 revisions

CombinePDF - the ruby way for merging PDF files

CombinePDF is a nifty module, written in pure Ruby, that parses PDF files and combines (merges) them with other PDF files, or watermarks and stamps them (all using the PDF file format and pure Ruby code).

Merge / Combine Pages

Combining PDF files is very straightforward.

First you create the PDF object that will contain all the combined data.

Then you "inject" the data using the << operator, either page by page (which is slower) or file by file (which is faster).

Last, you render or save the data.

For example:

pdf = CombinePDF.new
# one way to combine, very fast:
pdf << CombinePDF.new("file1.pdf")
# a different way to combine, slower, but it lets you mix things up:
CombinePDF.new("file2.pdf").pages.each { |page| pdf << page }
# you can also parse PDF data from memory (read it in binary mode):
pdf_data = IO.binread("file3.pdf")
# we will add just the first page:
pdf << CombinePDF.parse(pdf_data).pages[0]
# Save to file
pdf.save "combined.pdf"
# or render to memory
pdf.to_pdf

The page-by-page approach is great if you want to mix things up, but it is slower: the "Catalog" dictionary of the PDF file must be updated for every page added (the Catalog is an internal PDF dictionary that contains references to all the pages and the order in which they are displayed).
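The cost difference described above can be illustrated with a toy model. This is only a sketch: `ToyPDF`, `catalog_rebuilds` and `rebuild_catalog` are hypothetical names invented for the illustration, and the code makes no claim about how CombinePDF actually maintains its Catalog. It assumes only that the catalog is refreshed once per `<<` call.

```ruby
# Toy model only: a hypothetical stand-in for a PDF container,
# not CombinePDF's actual implementation.
class ToyPDF
  attr_reader :pages, :catalog, :catalog_rebuilds

  def initialize(pages = [])
    @pages = pages.dup
    @catalog_rebuilds = 0
    @catalog = { kids: @pages.dup, count: @pages.size }
  end

  # Accepts either a whole ToyPDF (one catalog rebuild for all of its pages)
  # or a single page (one catalog rebuild for that page alone).
  def <<(other)
    if other.is_a?(ToyPDF)
      @pages.concat(other.pages)
    else
      @pages << other
    end
    rebuild_catalog
    self
  end

  private

  # Assumption: the catalog is refreshed after every << call.
  def rebuild_catalog
    @catalog_rebuilds += 1
    @catalog = { kids: @pages.dup, count: @pages.size }
  end
end

source = ToyPDF.new(%w[p1 p2 p3 p4])

# File by file: a single << call, so a single catalog rebuild.
by_file = ToyPDF.new
by_file << source

# Page by page: one << call (and one rebuild) per page.
by_page = ToyPDF.new
source.pages.each { |page| by_page << page }
```

Both containers end up with the same four pages, but in this model `by_file.catalog_rebuilds` is 1 while `by_page.catalog_rebuilds` is 4, which is the kind of overhead the paragraph above refers to.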

Stamp / Watermark

Stamping has issues with specific PDF files; please see the issue published here.

To stamp PDF files (or data), first create the stamp from an existing PDF file.

Once the stamp is created, inject it into the existing PDF pages.

# load the stamp
stamp_pdf_file = CombinePDF.new "stamp_pdf_file.pdf"
stamp_page = stamp_pdf_file.pages[0]
# load the file to stamp on
pdf = CombinePDF.new "file1.pdf"
# stamp each page with the << operator
pdf.pages.each {|page| page << stamp_page}

Notice that the << operator is applied to a page and not to a PDF object. The << operator acts differently on PDF objects and on Pages. Page objects are instances of the Hash class, and the << operator is added to the individual Page instances without altering the Hash class itself.
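The idea of adding << to individual Hash instances, rather than to the Hash class, can be sketched in plain Ruby with singleton methods. This is a minimal illustration of the general technique, not CombinePDF's code: the `make_page` helper and the `:Type` and `:Contents` keys are hypothetical and chosen only for the example.

```ruby
# Illustrative sketch; make_page and the key names are hypothetical.
def make_page(contents)
  page = { Type: :Page, Contents: [contents] }
  # define_singleton_method attaches << to this one Hash instance only;
  # the Hash class itself is left untouched.
  page.define_singleton_method(:<<) do |other|
    self[:Contents].concat(other[:Contents])
    self
  end
  page
end

stamp = make_page("stamp stream")
page = make_page("page stream")

# "Stamping": the stamp's contents are appended to the page's contents.
page << stamp

# An ordinary Hash is unaffected and still has no << method.
plain = {}
plain.respond_to?(:<<) # => false
```

After `page << stamp`, `page[:Contents]` holds both streams, while `page.class` is still plain `Hash`. Keeping the method on the instance means other hashes in the program never see the extra operator.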