Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dev/doc encrypt #1015

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

PhakornKiong
Copy link

@PhakornKiong PhakornKiong commented Oct 3, 2021

Continuation of PR #917 & #243

Hope to get some advanced review on the PR as this is quite a significant addition,

What?

This PR add the feature for user to be able to Encrypt an unencrypted file.

Why?

Enables the encryption feature in this library

How?

On the branch, you can try it by using the scratchpad

yarn run scratchpad:start to start the build, then yarn scratchpad:run. It should create a new pdf in the root level. Then you can verify it.

  const pdfDoc = await PDFDocument.create();
  const existingPdfBytes = fs.readFileSync('test.pdf');
  const pdfDoc = await PDFDocument.load(existingPdfBytes);
  const page = pdfDoc.addPage();
  page.drawText('Creating PDFs in JavaScript is awesome!', {
    x: 50,
    y: 700,
  });

  pdfDoc.encrypt({
    userPassword: 'abcd',
    ownerPassword: '1234',
    permissions: { modifying: true },
  });

 // useObjectStreams true and false works
 // note on the following section on bugs on adobe acrobat reader
  const pdfBytes = await pdfDoc.save({ useObjectStreams: false });
  fs.writeFileSync('sample.pdf', pdfBytes);

Testing?

Manual testing by verifying the permission in PDF Reader application.

Added Integration test test19.ts and test20.ts in node

File should only be able to open and get decrypted in the PDF Reader if everything is implemented correctly.

Suggestions on testing is welcomed.

New Dependencies?

Crypto.js
Saslprep

Both are famous public libraries.

Screenshots

N/A

Suggested Reading?

Yes

Anything Else?

There are 2 areas that I would need help with on this PR.

  1. How to verify that the build is working, as raised in Feature: Document Encryption #917

  2. Currently, there is a bug when using pdfDoc.save({ useObjectStreams: true }), the file is not able to be opened in Adobe Acrobat Reader, but works fine on browser and other PDF readers.

Checklist

  • I read CONTRIBUTING.md.
  • I read MAINTAINERSHIP.md#pull-requests.
  • I added/updated unit tests for my changes.
  • I added/updated integration tests for my changes.
  • I ran the integration tests.
  • I tested my changes in Node, Deno, and the browser.
  • I viewed documents produced with my changes in Adobe Acrobat, Foxit Reader, Firefox, and Chrome.
  • I added/updated doc comments for any new/modified public APIs.
  • My changes work for both new and existing PDF files.
  • I ran the linter on my changes.

@Hopding
Copy link
Owner

Hopding commented Oct 31, 2021

@PhakornKiong thanks for reopening this PR! It's pretty hefty, so it'll be awhile before I can look through all this.

Sidenote: If you'd like for folks to be able to use your work without waiting, you might consider making a separate library that people can pass a PDFDocument object to for encryption.

@PhakornKiong
Copy link
Author

@PhakornKiong thanks for reopening this PR! It's pretty hefty, so it'll be awhile before I can look through all this.

Sidenote: If you'd like for folks to be able to use your work without waiting, you might consider making a separate library that people can pass a PDFDocument object to for encryption.

No worries, take your time.

It would be difficult to spin off as a separate library as the change rely heavily on the pdfWriter to write the encrypted content into the file.

@taxilian
Copy link
Contributor

taxilian commented Dec 2, 2021

This looks awesome; it would be great to see this pulled in and also the ability to decrypt pdfs =] (dealing with that now, trying to decide how to handle it)

@cristianocaruso
Copy link

Hi, how can I use it in a browser environment?
It seems that npm run build creates a dist compatible only with node
In my case the inclusion of the saslprep library produced a few lines of code that point directly to the local filesystem to access code-point.mem
Can you add some instruction on how to use this mod in standard web pages?
Thanks and congratulations again for the excellent work done

@42retfa
Copy link

42retfa commented Apr 6, 2022

Hello,

I'm very interested by this one. Any chance to see this feature added in the near future ?

Thanks.

@Lex-talionis
Copy link

Hey @Hopding @PhakornKiong, this feature is a great and i see how it could re written outside of pdf-lib so i can keep using the latest release without doing any refactoring in the future, but before i do is it going to be added in the near future? Not a problem at all just curious.

@PhakornKiong
Copy link
Author

Hey @Hopding @PhakornKiong, this feature is a great and i see how it could re written outside of pdf-lib so i can keep using the latest release without doing any refactoring in the future, but before i do is it going to be added in the near future? Not a problem at all just curious.

At this rate, i doubt it will be merge anytime soon. Might be a great solution to those that need this functionality.

How would you write it outside of pdf-lib ? (Just curious, since i relied heavily on the parser in pdf-lib)

@Lex-talionis
Copy link

Hey @Hopding @PhakornKiong, this feature is a great and i see how it could re written outside of pdf-lib so i can keep using the latest release without doing any refactoring in the future, but before i do is it going to be added in the near future? Not a problem at all just curious.

At this rate, i doubt it will be merge anytime soon. Might be a great solution to those that need this functionality.

How would you write it outside of pdf-lib ? (Just curious, since i relied heavily on the parser in pdf-lib)

This idea is really more a guess. But a while back I found a solution on here to update spot colours in a pdf and has been expanded for other features. So using the research from your blog plus the pdf specification I'm thinking that might work.

@slegay
Copy link

slegay commented Jun 21, 2022

This would be incredibly useful for us. Any idea when it might get merged?

Copy link

@Lex-talionis Lex-talionis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't see anything wrong let get this added :D

@JunCaiLi
Copy link

Hi~ @PhakornKiong the PR code I use cannot be opened in the adobe reader after encrypting the pdf in my workplace, but it can be opened in other pdf readers (the browser can be opened)

@ryangriggs
Copy link

This feature would be immensely useful. I'm voting to move forward with merging. Any ETA?

@Sharcoux
Copy link

@PhakornKiong What an incredible job!

Maybe, while this processes until merge, can you release your fork on npm?

@peterpoliwoda
Copy link

+1 for adding it in. Anybody from the repo bossmen available for a review? @MatheusHCP I see it was approved, could we merge it in?
@PhakornKiong Some merge conflicts here, a small fix for man, a giant PDF encryption leap for all developer kind. Plz resolve.

@PhakornKiong
Copy link
Author

+1 for adding it in. Anybody from the repo bossmen available for a review? @MatheusHCP I see it was approved, could we merge it in? @PhakornKiong Some merge conflicts here, a small fix for man, a giant PDF encryption leap for all developer kind. Plz resolve.

Ok, let me do it this weekend.

@peterpoliwoda
Copy link

Thanks so much @PhakornKiong!

@kenhosr
Copy link

kenhosr commented May 29, 2023

This looks awesome; it would be great to see this pulled in and also the ability to decrypt pdfs =] (dealing with that now, trying to decide how to handle it)

decrypt is super helpful as well, any updates on this? thanks!

@gonyulian415
Copy link

@PhakornKiong I fork your repo and build, but error about fs founded, would u kindly tell me how to build correctly?

@CBEngineer9
Copy link

Would love to get this implemented. As of right now, there are not a lot of ways to encrypt an already existing pdf using node.

@peterpoliwoda
Copy link

@PhakornKiong
Still some merge conflicts 🥲. Or are there some new ones now? Let's get this party started please.

@gonyulian415 are you using the correct version of nodejs?

@PROxZIMA
Copy link

PROxZIMA commented Aug 7, 2023

Hey @peterpoliwoda, I faced 2 issues while building Phakorn's fork for browser.

  1. The fork uses crypto-js and saslprep. Rollup config bundles them too when building the umd or esm package via yarn build:esm or yarn build:umd.
  2. Even if you do not bundle crypto-js(as most browsers support it), saslprep has statements like const fs = require('fs') which throws errors in browser.

Can you guide me how to bundle the fork to make it work on browser?

Edit:

  1. Resolved above two by dropping crypto-js during build and including saslprep from https://github.com/foliojs/pdfkit/tree/master/lib/saslprep.
  2. Another error popped up, ReferenceError: Buffer is not defined which was quite easy to resolve.
npm install --save buffer

src/core/security/PDFSecurity.ts should be somewhat like this.

// import CryptoJS from 'crypto-js';
import { Buffer } from 'buffer';
import saslprep from './saslprep/index';

@gonyulian415 Please go through this once and let me know whether your issue is resolved or not

@peterpoliwoda
Copy link

Awesome! Well done sir!

@LLT9
Copy link

LLT9 commented Aug 21, 2023

Hell

Awesome! Well done sir!

Hello, I want to ask this feature will be merge or still is review?

@peterpoliwoda
Copy link

Closing this? So is there no hope for this to go in? What's happening?

@PhakornKiong
Copy link
Author

PhakornKiong commented Aug 22, 2023

@peterpoliwoda I forced pushed and it automatically closed it, probably have to do with history

@PhakornKiong PhakornKiong mentioned this pull request Aug 22, 2023
@PROxZIMA
Copy link

PROxZIMA commented Aug 22, 2023

Hi @PhakornKiong, with encryption enabled (ownerPassword and permissions only), document's metadata seems to be encrypted as well. Is this expected result or a bug?

Without Encryption With Encryption
image image

@PhakornKiong
Copy link
Author

Hi @PhakornKiong, with encryption enabled (ownerPassword and permissions only), document's metadata seems to be encrypted as well. Is this expected result or a bug?

Without Encryption With Encryption
image image

I'm not 100% sure, would need to check the standard to verify.

What tools do you use to display this? Maybe i can take a peek

Left is unencrypted, middle is encrypted by ilovepdf, right is from the PR
image

@PROxZIMA
Copy link

PROxZIMA commented Aug 22, 2023

What tools do you use to display this? Maybe i can take a peek

Simply viewing the pdf in MS Edge (chromium to be specific).

Left is unencrypted, middle is encrypted by ilovepdf, right is from the PR image

Both the images of the encrypted pdf are not showing the metadata.

Also from Adobe Acrobat Reader
Untitled


Created a sample pdf usingjsPDF with ownerPassword and permissions only. The metadata is intact.

The more modern PDF encryption methods allow the file’s XMP metadata stream (XML Metadata) to be left unencrypted so it may be extracted and read by programs which don’t know how to open encrypted PDF files, or if the password is not known.1

Footnotes

  1. https://www.oreilly.com/library/view/pdf-explained/9781449321581/ch08.html

@PhakornKiong
Copy link
Author

I'm of the opinion that the default is to encrypt metadata. So probably i would not change this implementation here.

image

https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf

@ccidcce32167
Copy link

The encryption is grate, however I’m facing the same issue as PROxZIMA ,metadata are also encrypted, and I can’t reset team by pdfDoc.setTitle(“”) (same as setAuthor, setSubject)

@ryangriggs
Copy link

In most PDF encryption apps and libraries there is an option to enable/disable the encryption of document metadata. I would recommend adding this option so that the developer can choose to encrypt the metadata or leave as-is.

@ccidcce32167
Copy link

The encryption is grate, however I’m facing the same issue as PROxZIMA ,metadata are also encrypted, and I can’t reset team by pdfDoc.setTitle(“”) (same as setAuthor, setSubject)

I try to clean the meta info and save as a temporary pdf then reload the clean meta pdf for encrypted. You know what, the meta info recovered, I just don’t know how it happened, is crazy 😂

@ccidcce32167
Copy link

The encryption is grate, however I’m facing the same issue as PROxZIMA ,metadata are also encrypted, and I can’t reset team by pdfDoc.setTitle(“”) (same as setAuthor, setSubject)

I try to clean the meta info and save as a temporary pdf then reload the clean meta pdf for encrypted. You know what, the meta info recovered, I just don’t know how it happened, is crazy 😂

It works If I remove the meta by adobe reader instead of pdf-lib

@PhakornKiong
Copy link
Author

The encryption is grate, however I’m facing the same issue as PROxZIMA ,metadata are also encrypted, and I can’t reset team by pdfDoc.setTitle(“”) (same as setAuthor, setSubject)

I try to clean the meta info and save as a temporary pdf then reload the clean meta pdf for encrypted. You know what, the meta info recovered, I just don’t know how it happened, is crazy 😂

Can you provide steps to reproduce this? If it works then this might be more straightforward to implement.

Else the solution would require substantial change in how the PDFwriter keep tracks of all the object, which I haven't really got into

@ccidcce32167
Copy link

`const pdfName = "./somePDFWithMeta.pdf";

//step 1 remove meta and save to disk
const pdfContent = fs.readFileSync(pdfName);
const pdfDoc = await PDFDocument.load(pdfContent);
//wipeout meta
pdfDoc.setAuthor("");
pdfDoc.setTitle("");
pdfDoc.setSubject("");
pdfDoc.setKeywords("");

const pdfBytes = await pdfDoc.save();
fs.writeFileSync(cleanMeta.${pdfName}, pdfBytes);

//step 2 encrypt PDF
const pdfContent2 = fs.readFileSync(cleanMeta.${pdfName});
const pdfDoc2 = await PDFDocument.load(pdfContent2);
pdfDoc.encrypt(encryptConfig);//password and permissions
const pdfBytes2 = await pdfDoc.save({ useObjectStreams: false });
fs.writeFileSync(encrypted.${pdfName}, pdfBytes); `

A simple step for your reference.
The meta info is clean in cleanMeta.pdf however it recovered and encrypted in encrypted.pdf

@sandeepjadhav
Copy link

Can anyone know when this can be merged?. Im desperately looking for this PR to get it merged.

Thank you.

@Sharcoux
Copy link

Sharcoux commented Nov 4, 2023

If I get a working PR on @cantoo/pdf-lib we can merge it. But it has to be working out of the box.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet