Exception thrown when calling to_html on file with internal hyperlinks #142

Open
ycp3 opened this issue Oct 6, 2023 · 2 comments

ycp3 commented Oct 6, 2023

Describe the bug

Calling to_html on a file with internal hyperlinks (hyperlinks to a bookmark or a heading within the same file) raises NoMethodError: undefined method `value' for nil:NilClass.

Backtrace:

docx (0.8.0) lib/docx/containers/text_run.rb:106:in `hyperlink_id'
docx (0.8.0) lib/docx/containers/text_run.rb:102:in `href'
docx (0.8.0) lib/docx/containers/text_run.rb:81:in `to_html'
docx (0.8.0) lib/docx/containers/paragraph.rb:48:in `block in to_html'
docx (0.8.0) lib/docx/containers/paragraph.rb:47:in `each'
docx (0.8.0) lib/docx/containers/paragraph.rb:47:in `to_html'
docx (0.8.0) lib/docx/document.rb:119:in `map'
docx (0.8.0) lib/docx/document.rb:119:in `to_html' 

According to the linked reference, internal hyperlinks use the anchor attribute instead of the id attribute, so the id lookup on line 106 of text_run.rb finds nothing and fails.
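
To illustrate the distinction, here is a minimal sketch of a nil-safe lookup. It is not the gem's actual code: `resolve_href` is a hypothetical helper, and it assumes the hyperlink's attributes have already been read into a plain Hash keyed by their qualified names (`r:id` for external links, `w:anchor` for internal ones), with `relationships` mapping relationship ids to target URLs.

```ruby
# Hypothetical helper, not part of the docx gem.
# hyperlink_attrs: Hash of attribute name => value for one w:hyperlink element.
# relationships:   Hash of relationship id => target URL.
def resolve_href(hyperlink_attrs, relationships)
  rel_id = hyperlink_attrs['r:id']
  anchor = hyperlink_attrs['w:anchor']

  if rel_id
    # External link: look the target up by relationship id.
    relationships[rel_id]
  elsif anchor
    # Internal link: point at a fragment inside the same document.
    '#' + anchor
  end
  # Falls through to nil when neither attribute is present.
end
```

With a guard like this, a hyperlink that carries only an anchor yields a fragment reference instead of raising on a missing id.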

To Reproduce

Open a docx file with a hyperlink to either a heading or a bookmark in the same file and call to_html.

example

require 'docx'

doc = Docx::Document.new('/path/to/your/docx/file_with_internal_hyperlink.docx')

doc.to_html

Sample docx file

https://docs.google.com/document/d/1H01zgmdC2LHAAwXAhmm6RyEz-lwbZm6R/edit?usp=sharing&ouid=103282161859668866778&rtpof=true&sd=true

Expected behavior

No exception thrown; html gets returned as normal.
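
Until internal links are handled, one possible stopgap is to convert paragraph by paragraph and fall back to escaped plain text when a paragraph raises. This is only a sketch: `safe_paragraph_html` is a hypothetical helper, and it assumes each paragraph object responds to `to_html` and `to_s`.

```ruby
require 'cgi'

# Hypothetical helper, not part of the docx gem.
# Returns the paragraph's HTML, or an escaped plain-text <p> if to_html raises
# NoMethodError (as it does for internal hyperlinks in this report).
def safe_paragraph_html(paragraph)
  paragraph.to_html
rescue NoMethodError
  "<p>#{CGI.escapeHTML(paragraph.to_s)}</p>"
end

# Usage, assuming doc is a Docx::Document:
#   html = doc.paragraphs.map { |p| safe_paragraph_html(p) }.join("\n")
```

Note that rescuing NoMethodError this broadly can hide unrelated bugs, so it is better treated as a temporary workaround than a fix.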

Environment

  • Ruby version: 3.2.2
  • docx gem version: 0.8.0
  • OS: Alpine 3.17 docker container
ycp3 added the bug label Oct 6, 2023

mateusg commented Oct 17, 2023

Hi @satoryu.
Any idea what could be happening here? I seem to be having a similar problem on any docx version bigger than 0.5.0.
0.5.0 and older versions just sanitize the hyperlinks and print the plain text.

I'm on Ruby 3.1.4, Ubuntu 20.04.

Backtrace:

undefined method `[]' for nil:NilClass
 @document_properties[:hyperlinks][hyperlink_id]
                                  ^^^^^^^^^^^^^^
docx-0.6.0/lib/docx/containers/text_run.rb:100:in `href'
docx-0.6.0/lib/docx/containers/text_run.rb:79:in `to_html'
docx-0.6.0/lib/docx/containers/paragraph.rb:48:in `block in to_html'
docx-0.6.0/lib/docx/containers/paragraph.rb:47:in `each'
docx-0.6.0/lib/docx/containers/paragraph.rb:47:in `to_html'

satoryu (Member) commented Oct 21, 2023

@ycp3 @mateusg Thank you for your reports.

I've just found out the root cause: this gem does not support internal links.
I would like to fix this issue but need time.

> I seem to be having a similar problem on any docx version bigger than 0.5.0.
> 0.5.0 and older versions just sanitize the hyperlinks and print the plain text.

Yes, right.
Do you think that printing external links as sanitized text makes sense?
