Searching Issue in icepdf viewer #263

Open
Muhammad-Muddasir opened this issue Mar 26, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@Muhammad-Muddasir

Hi, I'm using ICEpdf version 7.0.2 in my project, but searching does not work properly in some reports.

Searching Issue

Searching Issue.pdf

Some warnings are shown

Mar 26, 2023 11:12:42 AM org.icepdf.core.pobjects.Document setInputStream
WARNING: Cross reference deferred loading failed, will fall back to linear reading.
Mar 26, 2023 11:12:42 AM org.icepdf.core.pobjects.Catalog <clinit>
INFO: ICEpdf Core 7.0.2
Mar 26, 2023 11:12:44 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6876, font size: 1689704
Mar 26, 2023 11:12:46 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table 'DSIG' which goes past the file size; offset: 9517144, size: 65536, font size: 9524020
Mar 26, 2023 11:12:46 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table 'DSIG' which goes past the file size; offset: 9733700, size: 65536, font size: 9740576
Mar 26, 2023 11:12:47 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '   $' which goes past the file size; offset: 668, size: 1146308935, font size: 27506260
Mar 26, 2023 11:12:48 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '   $' which goes past the file size; offset: 604, size: 1146308935, font size: 36791212
Mar 26, 2023 11:12:48 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '   $' which goes past the file size; offset: 780, size: 1146308935, font size: 9209540
Mar 26, 2023 11:12:48 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 21482296
Mar 26, 2023 11:12:48 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 14437520
Mar 26, 2023 11:12:48 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 11106464
Mar 26, 2023 11:12:49 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6772, font size: 10080360
Mar 26, 2023 11:12:49 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 21632712
Mar 26, 2023 11:12:49 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 14460368
Mar 26, 2023 11:12:49 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 12058672
Mar 26, 2023 11:12:49 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:49 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 18259888
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 941112
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 929364
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 994664
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 984412
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:50 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:51 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:51 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:51 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:51 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:52 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6876, font size: 1689704
Mar 26, 2023 11:12:53 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table 'DSIG' which goes past the file size; offset: 9517144, size: 65536, font size: 9524020
Mar 26, 2023 11:12:53 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table 'DSIG' which goes past the file size; offset: 9733700, size: 65536, font size: 9740576
Mar 26, 2023 11:12:53 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '   $' which goes past the file size; offset: 668, size: 1146308935, font size: 27506260
Mar 26, 2023 11:12:53 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '   $' which goes past the file size; offset: 604, size: 1146308935, font size: 36791212
Mar 26, 2023 11:12:53 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '   $' which goes past the file size; offset: 780, size: 1146308935, font size: 9209540
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 21482296
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 14437520
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 11106464
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6772, font size: 10080360
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 21632712
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 14460368
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 12058672
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:54 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '    ' which goes past the file size; offset: 1146308935, size: 6880, font size: 18259888
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 941112
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 929364
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 994664
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.TTFParser parse
WARNING: Skip table '  ��' which goes past the file size; offset: 1146308935, size: 6876, font size: 984412
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
Mar 26, 2023 11:12:55 AM org.apache.fontbox.ttf.CmapSubtable processSubtype14
WARNING: Format 14 cmap table is not supported and will be ignored
2
Mar 26, 2023 11:14:45 AM org.icepdf.core.pobjects.Document setInputStream
WARNING: Cross reference deferred loading failed, will fall back to linear reading.

@ctoabidmaqbool

I think partial searching works sometimes: when only the first character is entered, the search happens. But when two or more characters are typed, the search results disappear from the PDF page.

@pcorless
Owner

pcorless commented Apr 1, 2023

This is an interestingly encoded PDF. The text selection and search code use the same base layout code to determine glyph order. The system property org.icepdf.core.views.page.text.spaceFraction=1 improves the situation a bit, but the results still aren't ideal. I'll need to take a closer look to figure out what's happening here with the auto space detection code.
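
For anyone who wants to try the property mentioned above, here is a minimal sketch of setting it from code. Only the property name and the value 1 come from the comment. The class and method names are hypothetical, and it is an assumption that the value has to be in place before ICEpdf's classes are loaded.

```java
public class SpaceFractionSetup {
    // Property name quoted from the comment above.
    public static final String SPACE_FRACTION =
            "org.icepdf.core.views.page.text.spaceFraction";

    /** Sets the property and returns the previous value, or null if it was unset. */
    public static String setSpaceFraction(int value) {
        return System.setProperty(SPACE_FRACTION, Integer.toString(value));
    }

    public static void main(String[] args) {
        // Assumption: this runs before any ICEpdf class is loaded, so a value
        // that is read once during class initialization would still be seen.
        setSpaceFraction(1);
        System.out.println(SPACE_FRACTION + "=" + System.getProperty(SPACE_FRACTION));
        // ... create the ICEpdf viewer or open the document after this point ...
    }
}
```

The same value can be passed on the command line with the standard JVM flag -Dorg.icepdf.core.views.page.text.spaceFraction=1, which avoids any ordering concern.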

@Muhammad-Muddasir
Author

Hi @pcorless, I'm using ICEpdf com.github.pcorless.icepdf:icepdf-core:7.0.2 and com.github.pcorless.icepdf:icepdf-viewer:7.0.2. The PDF file was generated with iText, com.itextpdf:itextpdf:5.5.13.2.

@pcorless
Owner

I finally got back to this issue. I have a hunch that the landscape layout is throwing off the text sorting code. This is a really good test case for dealing with text that is laid out using the y coordinate instead of x.
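
To illustrate the hunch above, here is a small self-contained sketch of how sorting glyphs by x alone can scramble a line whose glyphs advance along y. This is not ICEpdf's sorting code: the Glyph class, the method names, and the "compare total movement in x and y" heuristic are all assumptions made for illustration, and it handles a single line only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class GlyphOrderSketch {
    /** Hypothetical glyph: a character plus its origin in page space. */
    public static final class Glyph {
        final char ch;
        final double x;
        final double y;

        public Glyph(char ch, double x, double y) {
            this.ch = ch;
            this.x = x;
            this.y = y;
        }
    }

    /** True when consecutive glyphs move more along y than along x. */
    public static boolean advancesAlongY(List<Glyph> glyphs) {
        double dx = 0;
        double dy = 0;
        for (int i = 1; i < glyphs.size(); i++) {
            dx += Math.abs(glyphs.get(i).x - glyphs.get(i - 1).x);
            dy += Math.abs(glyphs.get(i).y - glyphs.get(i - 1).y);
        }
        return dy > dx;
    }

    /** Orders one line of glyphs along its dominant axis and joins the characters. */
    public static String readingOrder(List<Glyph> glyphs) {
        // Assumption: text reads toward increasing coordinates on the chosen axis.
        Comparator<Glyph> byAdvance = advancesAlongY(glyphs)
                ? Comparator.comparingDouble((Glyph g) -> g.y)
                : Comparator.comparingDouble((Glyph g) -> g.x);
        List<Glyph> sorted = new ArrayList<>(glyphs);
        sorted.sort(byAdvance);
        StringBuilder text = new StringBuilder();
        for (Glyph g : sorted) {
            text.append(g.ch);
        }
        return text.toString();
    }
}
```

With glyphs a, b, c stacked along y and only a small jitter in x, an x-only sort would follow the jitter, while choosing the axis first recovers the intended order. A search for a multi-character term can only match when the glyphs come out in that order.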

@ctoabidmaqbool1

Hi! Any progress on this? Searching in very lengthy reports is a much-needed feature.

I am facing the issue in the latest ICEpdf release too, e.g. 7.2.0.

The report was generated using itextpdf 5.5.13.2. The same issue occurs in both portrait and landscape. I have hundreds of different reports in my software, and every report shows the same issue.

Should I create a sample repo to test the issue, or has it already been identified?

@pcorless
Owner

pcorless commented May 3, 2024

Sorry I haven't looked at this one in a while. I'll try and make some time for it as I do have some new ideas on how to solve this that came out of the redaction work.

pcorless added the bug label on May 3, 2024

@ctoabidmaqbool1

iText 5.x is very old, and I can't use 7.x due to licensing issues.

So I tried switching to the forked version, OpenPDF 2.x, which is still actively maintained and up to date.

With OpenPDF the issue is the same: searching does not work properly, and text selection does not work properly either.

image

Sample report saved through the ICEpdf viewer (the original report is generated in memory):

PurchaseReport-new.pdf
