Hi all,
While getting started with organize, I want to set up my rules before running them on production files, so I started with a simple config file:
rules:
  - folders: ~/tmp_doc-test
    subfolders: false
    filters:
      - extension: pdf
      - filecontent: "Entgeltbescheinigung"
    actions:
      - echo: "Found PDF!"
      - copy: "~/tmp_doc-test/sortiert/Lohnzettel/"
Then I run it as usual:
organize run
For some files I get the following error:
File BWG.pdf:
  - (FileContent) ERROR! 'charmap' codec can't decode byte 0x9d in position 9796: character maps to <undefined>
I tried to resolve this but have no idea what causes it. At first I thought it was because the file's charset is reported as binary:
file -i BWG.pdf
BWG.pdf: application/pdf; charset=binary
But I also have other PDF files with charset binary.
So I'm completely out of ideas. Does anyone have an idea?
organize uses textract under the hood, so you might check the output of:
textract file.pdf
You can also try installing another parser which is supported by textract:
pip install pdftotext
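As for the error message itself: Python reports `'charmap' codec can't decode byte 0x9d` when bytes are decoded with a single-byte code page in which 0x9d has no assigned character, cp1252 being a common example. Here is a minimal sketch that reproduces the message with the standard library only. That cp1252 is the codec involved in your case is an assumption, and both the sample bytes and the `try_decode` helper are made up for illustration.

```python
# Hypothetical sample text: the UTF-8 encoding of a right double quotation
# mark (U+201D) is the byte sequence e2 80 9d, so it contains the byte 0x9d.
raw = "Entgeltbescheinigung \u201d".encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"ERROR! {exc}"


# cp1252 leaves 0x9d unassigned, so strict decoding fails with a 'charmap' error.
as_cp1252 = try_decode(raw, "cp1252")

# The same bytes decode cleanly with the encoding they were written in.
as_utf8 = try_decode(raw, "utf-8")

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
lenient = raw.decode("cp1252", errors="replace")

print(as_cp1252)
print(as_utf8)
```

If this is what happens on your machine, the problem is in how the extracted text is decoded rather than in the PDF being reported as binary, which would fit with your other binary PDFs working.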