Can we get all PDF data into the String variable, instead of getting data page by page? #8

Phannd7 · 2016-10-12T01:19:48Z

Hi Tho,

Currently, I'm using the "get" method to read PDF data from a specific page. Is it possible to get all of the PDF data at once instead of going page by page?
My code:

public static int rowNumberOfPDFFile(String pdfLink, int pagePDFNumber) throws IOException {
    PDFTableExtractor extractor = new PDFTableExtractor();
    List<Table> tables = extractor.setSource(pdfLink).extract();
    // Get the data of one page as an HTML string. Page numbers start from 0.
    String html = tables.get(pagePDFNumber).toHtml();
    // Drop everything up to and including the "border='1'>" marker (11 characters).
    html = html.substring(html.indexOf("border='1'>") + 11);
    // Each row ends with </tr>, so counting "/tr" gives the number of rows.
    int rowNumber = org.apache.commons.lang3.StringUtils.countMatches(html, "/tr");
    return rowNumber;
}

I would like to get all of the PDF data into the "html" variable. Could you please help?

Thanks,
Phan Nguyen


thoqbk · 2016-10-14T14:14:47Z

Hi Phan Nguyen,

I think you can do it by getting the HTML content of the tables on all pages,
then using an HTML parser such as Jsoup to parse the table content and put it
all together. Alternatively, you can loop through all of the table models
returned by PDFTableExtractor.extract().

Sorry for my late reply.

Regards,
Tho Q Luong
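
The second suggestion (looping over every table returned by extract()) could look roughly like the sketch below. It is only an illustration: the class and helper names (AllPagesHtml, joinHtml, countRows) are made up for this example, and plain strings stand in for the library's table objects so that the snippet runs on its own. With the library, the call would presumably be joinHtml(extractor.setSource(pdfLink).extract(), Table::toHtml), assuming each element exposes toHtml() as in the code from the original post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper class, not part of the library.
final class AllPagesHtml {

    // Concatenates the HTML of every table, in list order, into one string.
    static <T> String joinHtml(List<T> tables, Function<T, String> toHtml) {
        StringBuilder all = new StringBuilder();
        for (T table : tables) {
            all.append(toHtml.apply(table));
        }
        return all.toString();
    }

    // Counts closing </tr> tags, i.e. the number of table rows in the HTML.
    static int countRows(String html) {
        final String marker = "</tr>";
        int count = 0;
        int from = html.indexOf(marker);
        while (from >= 0) {
            count++;
            from = html.indexOf(marker, from + marker.length());
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in strings; with the library this would be (assumption):
        //   joinHtml(extractor.setSource(pdfLink).extract(), Table::toHtml)
        List<String> pages = Arrays.asList(
            "<table border='1'><tr><td>a</td></tr></table>",
            "<table border='1'><tr><td>b</td></tr><tr><td>c</td></tr></table>");
        String html = joinHtml(pages, s -> s);
        System.out.println(countRows(html));
    }
}
```

Note that the joined string contains one table element per input, one after another, rather than a single merged table. That is enough for counting rows across the whole PDF; merging the rows into one table would need an HTML parser such as Jsoup, as mentioned above.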
