
Work through a large number of files piece by piece #143

Open
WolfgangDpunkt opened this issue Jan 23, 2023 · 2 comments

Comments

@WolfgangDpunkt

When I try to convert a large number of HTML files to EPUB by specifying a folder with, e.g., 120 files as the source, percollate tries to process the whole batch at once. If the process fails, not a single file has been converted successfully and everything has to be started over from scratch.

There could always be a corrupt HTML file that makes the entire process fail, and since no file-by-file processing is possible, it prevents the conversion of all the other files as well.

It would be more practical if percollate processed the files one after another, so that each finished EPUB is saved as soon as it is done, rather than only once the conversion of all the other source files has succeeded.


That is my suggestion; it would make the application even more useful for power users.

Thanks a lot!

@danburzo
Owner

Hi @WolfgangDpunkt, do you mean feeding the entire list of files to a single percollate command using the --individual flag? If so, I agree that can be made more resilient to errors in some of the input files. In fact, the entire flow could use a refresh to be made more robust. In the meantime, you can use xargs -L1 to invoke separate percollate commands for each file:

cat urls.txt | xargs -L1 percollate epub

@WolfgangDpunkt
Author

Hi @danburzo ,

sorry, I expressed myself a bit unclearly.
Yes, I am feeding an entire list of files to a single percollate command using the --individual flag.
But not from a list like urls.txt; instead, I use a specific folder as the source. HTML files are saved to this folder automatically by the SingleFile browser add-on, which together gives me my own home-built read-it-later solution.
Here is the command:
percollate epub --individual --output /home/myEpubs/ /home/dTmpHtmlFolder/*.html

As I said, the result is that everything ends up in one process, which can fail entirely because of a single file.
Thanks for the tip, I will try that as a workaround.
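For a folder-based workflow like the one above, a plain shell loop gives the same file-by-file resilience as the xargs tip (a minimal sketch; the paths are the ones from this thread, and the error handling is illustrative, not part of percollate itself):

```shell
#!/bin/sh
# Convert each HTML file in its own percollate invocation, so one
# corrupt file only skips that file instead of aborting the batch.
SRC="/home/dTmpHtmlFolder"   # source folder from the thread (adjust)
OUT="/home/myEpubs"          # output folder from the thread (adjust)

for f in "$SRC"/*.html; do
  [ -e "$f" ] || continue    # skip when the glob matches nothing
  percollate epub --output "$OUT" "$f" \
    || printf 'conversion failed: %s\n' "$f" >&2  # log and keep going
done
```

Each failed conversion is reported on stderr and the loop moves on to the next file, so the finished EPUBs accumulate in the output folder regardless of any bad inputs.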
