Merge pdf files. Merged files are not released and cannot be deleted #1112

Open
lako12 opened this issue Mar 22, 2024 · 4 comments

lako12 commented Mar 22, 2024

I have a Java method that reads the PDF files in a folder and merges them into a single output file. Afterwards I need to delete the source files so that only the merged result remains in the folder.
The problem is that once the merged file has been created, I cannot delete the source files; it looks as if they are not released after use. I also tried deleting them manually from the file explorer, but that fails too for as long as the application is running. I am using Java 11 with this version of OpenPDF:

        <dependency>
            <groupId>com.github.librepdf</groupId>
            <artifactId>openpdf</artifactId>
            <version>1.4.1</version>
        </dependency>

My Code:

        File[] files = directory.listFiles();
        if (files != null && files.length > 0) {
            try (Document document = new Document()) {
                PdfCopy copy = new PdfCopy(document, new FileOutputStream(fileName));
                document.open();

                for (File file : files) {
                    try (PdfReader reader = new PdfReader(file.getAbsolutePath())) {
                        for (int i = 1; i <= reader.getNumberOfPages(); i++) {
                            copy.addPage(copy.getImportedPage(reader, i));
                        }
                        copy.freeReader(reader);
                    }
                }
            }
        }

If, after this piece of code, I try to delete the files in the "files" array by calling .delete() on each of them, the deletion fails.

I also tried removing the try-with-resources blocks and closing the reader, document, copy, etc. manually, and I tried declaring the PdfCopy in the try-with-resources together with the document, but nothing helped. I can't find a solution.
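
When diagnosing a failed delete like the one described above, it can help to see why the deletion failed, since File.delete() only returns false. The following is a minimal sketch using only the JDK; the class name DeleteDiagnostics and the helper deleteAll are hypothetical and not part of OpenPDF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenPDF.
class DeleteDiagnostics {

    /**
     * Tries to delete every given file and returns one message per failure.
     * Files.delete throws an IOException describing the failure,
     * whereas File.delete() only returns false.
     */
    static List<String> deleteAll(List<Path> paths) {
        List<String> failures = new ArrayList<>();
        for (Path path : paths) {
            try {
                Files.delete(path);
            } catch (IOException e) {
                failures.add(path + ": " + e);
            }
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("merge-demo");
        Path existing = Files.write(dir.resolve("a.pdf"), new byte[] {1, 2, 3});
        Path missing = dir.resolve("missing.pdf"); // never created, so deleting it fails

        // Prints one failure line, for missing.pdf.
        deleteAll(List.of(existing, missing)).forEach(System.out::println);
        Files.delete(dir);
    }
}
```

On Windows, a file that is still open or memory-mapped by the running process would be expected to show up in the returned list with an exception explaining that it is in use.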

@lako12 lako12 added the bug label Mar 22, 2024
asturio (Member) commented Mar 27, 2024

It really looks like the files are still held open by the process. I assume you are on Windows, where a file generally cannot be deleted while a process still has it open.

What matters is that all the documents you want to delete have been closed.

You could check whether the "closeStream" flag of the objects is really "true". Maybe it is being set to "false" somewhere.

Please check whether

currentPdfReaderInstance.getReader().close();
currentPdfReaderInstance.getReaderFile().close();

are really being executed in freeReader and/or getImportedPage. I think that if they are called in freeReader, they shouldn't be called in getImportedPage.

asturio (Member) commented Mar 27, 2024

It seems that the files are only closed if PdfReader.partial = true. The PdfReader(String) constructor does not set partial.

PdfReader.close() has not changed since 2010.

I think tokens.close() should always be called. Any opinions on that?

lako12 (Author) commented Mar 28, 2024

Yes, I am using Windows.
These methods are called:

currentPdfReaderInstance.getReader().close();
currentPdfReaderInstance.getReaderFile().close();

but they don't do anything, because the conditions they check are not met.

sa-sh commented Mar 30, 2024

Hello guys, I want to share my experience with PdfReader.
I noticed that the issue only happens when MappedRandomAccessFile is used, which is the default (Document.plainRandomAccess = false).
It looks like the issue is caused by the FileChannel used in MappedRandomAccessFile, which in Java 11+ may be released/closed later (by the GC).
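
The delayed release described here can be illustrated with the JDK alone. This is a minimal sketch and not OpenPDF code: the class MappedBufferDemo and its method are hypothetical names. It relies on the documented behavior of FileChannel.map that a mapping stays valid after its channel is closed and is only released when the buffer is garbage-collected.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of OpenPDF.
class MappedBufferDemo {

    /** Maps the file read-only, closes the channel, then reads the first byte from the mapping. */
    static byte firstByteAfterChannelClose(Path file) throws IOException {
        MappedByteBuffer buffer;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
        // The channel is closed at this point, but the mapping is still valid.
        // It is released only once the buffer itself is garbage-collected.
        return buffer.get(0);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-demo", ".bin");
        Files.write(file, new byte[] {42, 1, 2});

        System.out.println(firstByteAfterChannelClose(file)); // prints 42

        try {
            Files.delete(file);
        } catch (IOException e) {
            // Possible on Windows while the mapping has not been collected yet.
            System.out.println("still in use: " + e);
        }
    }
}
```

On Windows the final delete may fail until the buffer has been collected, which would match the symptom reported in this issue; on Linux it normally succeeds.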

As a workaround, you can control how the file is opened for PdfReader.
I prefer to use RandomAccessFileOrArray directly to control how the file is loaded; small files are better loaded into memory.

for (File file : files) {
    // Limit memory usage; raise the threshold if your app can use a large heap.
    boolean forceRead = file.length() <= 10_000_000;
    // Do not use MappedRandomAccessFile. Note: this flag is ignored when forceRead = true.
    boolean plainRandomAccess = true;
    try (RandomAccessFileOrArray raf = new RandomAccessFileOrArray(file.getPath(), forceRead, plainRandomAccess);
         PdfReader reader = new PdfReader(raf, null)) {
        // do copy
    }
}
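
The idea of loading small files fully into memory can also be sketched with the JDK alone. This is only an illustration under assumptions: the class InMemoryLoad and the helper readIfSmall are hypothetical, and the size limit simply mirrors the threshold in the loop above. Once the bytes are in the heap, no file handle or mapping remains, so the source file can be deleted right away.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of OpenPDF.
class InMemoryLoad {

    // Same threshold as in the loop above; adjust to the available heap.
    static final long MAX_IN_MEMORY_BYTES = 10_000_000L;

    /** Returns the file contents if the file is small enough, otherwise null. */
    static byte[] readIfSmall(Path file) throws IOException {
        if (Files.size(file) > MAX_IN_MEMORY_BYTES) {
            return null; // caller falls back to plain random access
        }
        // readAllBytes opens and closes the file internally.
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("in-memory-demo", ".pdf");
        Files.write(file, new byte[] {1, 2, 3});

        byte[] content = readIfSmall(file);
        Files.delete(file); // nothing holds the file open any more

        System.out.println(content.length); // prints 3
    }
}
```

In the merge loop, the returned array could be handed to the PdfReader(byte[]) constructor in place of a file path, with the RandomAccessFileOrArray approach above as the fallback when readIfSmall returns null.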
