PDF::Inspector::Text.analyze does not return strings within repeat block #25

Open
ruinunes opened this issue May 31, 2017 · 7 comments

Comments

@ruinunes

ruinunes commented May 31, 2017

When using PDF::Inspector::Text.analyze(some_pdf).strings, it does not return strings that were added via

repeat(1..page_count) do
  text_box "Hello", at: [0, bounds.top]
end

Is there a way to get a hold of these strings?

@pointlessone
Member

Hi Rui,

Could you please provide a minimal example that demonstrates the issue?

@ruinunes
Author

The minimal example is pretty much as above.
In a Prawn document, if we write text_box "Hello", at: [0, bounds.top], the Hello string shows up when we output the result of, for example, puts PDF::Inspector::Text.analyze(some_pdf).strings. However, if the text_box or text call sits inside a repeat block, the string is missing from the strings array returned by PDF::Inspector::Text.analyze(some_pdf).strings.

text_box "Hello A", at: [0, bounds.top]
repeat(1..page_count) do
  text_box "Hello B", at: [100, bounds.top]
end

Hello B will not be available in the strings collection. Only Hello A.

@ruinunes
Author

ruinunes commented Jun 1, 2017

@pointlessone what are the requirements for a minimal example?

@pointlessone
Member

A good example should be:

  • Complete: someone could copy the provided code into a file, run it, and it would demonstrate your issue. It also means that if your issue requires any other resources (e.g. fonts, images, etc.) they should be provided as well. Also make sure you provide relevant information about your environment: Prawn version (latest release is implied, but not always the case), OS (including version), Ruby engine and version, etc.
  • Minimal: the code should include the minimal number of instructions required to demonstrate the issue. Please don't post 4k lines of your production code for someone else to sift through, unless of course that is what's required to demonstrate the issue. Time is the most constrained resource on any OSS project. The more work you do, the less there is left for someone else. People are more likely to invest a small amount of time, so there is a higher chance your issue gets resolved sooner.

Don't get me wrong. I (and everyone else) appreciate that you're reporting issues. That is a valuable contribution. Sometimes that is all people can contribute. But if they can do more, it helps a lot.
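The environment details asked for above can be gathered with a few lines of Ruby. This is only a sketch: environment_report is a hypothetical helper name, not part of Prawn or pdf-inspector, and the default gem list is simply the gems mentioned in this thread. Gem.loaded_specs only knows about gems that are already activated, so call the helper after requiring the libraries under test.

```ruby
# Hypothetical helper: collects the environment details a bug report should mention.
# Gems that have not been required/activated yet are reported as "not loaded".
def environment_report(gem_names = %w[prawn pdf-inspector pdf-reader])
  lines = []
  lines << "Ruby: #{RUBY_ENGINE} #{RUBY_VERSION}"
  lines << "Platform: #{RUBY_PLATFORM}"
  gem_names.each do |name|
    spec = Gem.loaded_specs[name]
    version = spec ? spec.version.to_s : 'not loaded'
    lines << "#{name}: #{version}"
  end
  lines.join("\n")
end

puts environment_report
```

Pasting this output next to the reproduction script covers the "Complete" point without anyone having to ask for versions.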

@JoelWAnna

Failing test example:


    it 'includes footer in analysed text' do
      pdf = Prawn::Document.new
      page_1 = 'First Page'
      page_2 = 'Second Page'
      pdf.text page_1, align: :center
      pdf.start_new_page
      pdf.text page_2, align: :center

      pdf.repeat(:all) do
        pdf.text_box 'test'
      end

      rendered_pdf = pdf.render
      page_analysis = PDF::Inspector::Page.analyze(rendered_pdf)
      expect(page_analysis.pages.size).to eq 2
      strings = PDF::Inspector::Text.analyze(rendered_pdf).strings
      expect(strings).to eq [page_1, 'test', page_2, 'test']
    end

@cupofjoakim

Having the same issue here.

  it "contains client details" do
    text_analysis = PDF::Inspector::Text.analyze(@pdf.render)
    content_string = text_analysis.strings.join(" ")
    expect(content_string).to include("Client Name")
  end

And the expected string is in a header that is repeated on all pages:

  def header(header_height)
    repeat :all do
      # Header
      bounding_box([0, bounds.top], width: 180, height: header_height) do
        move_down 8
        font_size 9
        text "Client name"
      end
    end
  end

@henrik
Contributor

henrik commented Jan 4, 2018

I'm seeing the same with create_stamp/stamp. It worked fine with pdf-inspector 1.0.2 and pdf-reader 1.3.3, but fails (the text is not included) with pdf-inspector 1.3.0 and pdf-reader 2.0.0.

The underlying issue/change in behaviour seems to be with pdf-reader. Issue here: yob/pdf-reader#268
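For context on why repeat and stamp behave alike here: Prawn draws a non-dynamic repeat block through create_stamp/stamp, and a stamp is stored once as a Form XObject that each page only references. The toy model below is not pdf-reader code. It assumes, without having confirmed it against yob/pdf-reader#268, that the behaviour change is about whether such referenced objects are visited during text extraction. All names in it (PAGE, XOBJECTS, collect_strings) are made up for illustration.

```ruby
# Toy model only: a "content stream" is an array of [operator, operand] pairs.
# :Tj shows a string; :Do paints a named XObject (how a stamp reaches the page).
PAGE = [[:Tj, 'Hello A'], [:Do, :Stamp1]].freeze
XOBJECTS = { Stamp1: [[:Tj, 'Hello B']] }.freeze

# Collects shown strings; `descend` controls whether referenced XObjects are walked.
def collect_strings(stream, xobjects, descend:)
  stream.flat_map do |operator, operand|
    case operator
    when :Tj
      [operand]
    when :Do
      descend ? collect_strings(xobjects.fetch(operand, []), xobjects, descend: descend) : []
    else
      []
    end
  end
end

p collect_strings(PAGE, XOBJECTS, descend: false) # => ["Hello A"]
p collect_strings(PAGE, XOBJECTS, descend: true)  # => ["Hello A", "Hello B"]
```

If that assumption holds, passing dynamic: true to repeat, which makes Prawn run the block on each page instead of stamping it, may be worth trying as a workaround. That is untested against the versions listed above.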
