saveWidget doesn't specify a <title> element #289

Closed
cpsievert opened this issue Nov 1, 2017 · 11 comments

Comments

@cpsievert
Collaborator

And, as a consequence, it produces unsavory results with pandoc >= 2.0:

m <- DT::datatable(mtcars)
htmlwidgets::saveWidget(m, "index.html")
#> [WARNING] This document format requires a nonempty <title> element.
#> Please specify either 'title' or 'pagetitle' in the metadata.
#> Falling back to 'index'

browseURL("index.html")

htmlwidgets:::.pandoc$version
#> ‘2.0.0.1’
@BillDunlap

BillDunlap commented Nov 2, 2017

I noticed the same thing with jjallaire/sigma. I installed pandoc-1.19.2.1 and pandoc-2.0.1 under C:\Pandoc and switched between them by fiddling with htmlwidgets:::.pandoc. When using saveWidget(selfcontained=TRUE) with pandoc-2.0.1, I got a warning and a mangled HTML file.

> z <- sigma::sigma(system.file("examples/ediaspora.gexf.xml", package = "sigma"))
> htmlFilename <- tempfile(fileext=".html")
> w <- htmlwidgets:::saveWidget(z, file=htmlFilename, selfcontained=TRUE)
[WARNING] This document format requires a nonempty <title> element.
  Please specify either 'title' or 'pagetitle' in the metadata.
  Falling back to 'file2a643ec14d9f'
> writeLines(substring(readLines(htmlFilename, n=10), 1, 50))
&lt;!DOCTYPE html&gt;
<html>
<head>
<meta charset="utf-8" />
<script>(function() {
  // If window.HTMLWidgets is already defined, the
  // new object. This allows preceding code to set
  // initialization process (though none currently
  window.HTMLWidgets = window.HTMLWidgets || {};

Is this because htmlwidgets:::pandoc_self_contained_html claims to pandoc
that its html input file is a markdown file? Pandoc-2.0 seems to balk at this.

  # convert from markdown to html to get base64 encoding
  # (note there is no markdown in the source document but
  # we still need to do this "conversion" to get the
  # base64 encoding)
  pandoc_convert(
    input = input,
    from = "markdown",
    output = output,
    options = c(
      "--self-contained",
      "--template", template
    )
  )

@ramnathv
Owner

ramnathv commented Nov 3, 2017

Fixing the title requirement is straightforward. We can explicitly allow users to specify a title in saveWidget. What should we default it to? Any thoughts, @cpsievert?

@BillDunlap based on the output you posted, it seems like pandoc is escaping html tags. I will try to reproduce this with pandoc 2.0.1 and the latest version of htmlwidgets.

@BillDunlap

BillDunlap commented Nov 3, 2017 via email

@ramnathv
Owner

ramnathv commented Nov 3, 2017

@BillDunlap pandoc is mainly used to inline all dependencies recursively. Doing this in R will take some effort.

@kiefersmith

I am also having this issue. Any updates?

@BillDunlap

BillDunlap commented Nov 13, 2017 via email

@kiefersmith

I seem to have circumvented the issue via htmltools::save_html(). My issue seems less involved than what others are getting into. Still, it is interesting to dive deeper into some of these functions.

@jcheng5
Collaborator

jcheng5 commented Nov 14, 2017

I'm going to submit a PR for this (the original issue) today. My proposal is that saveWidget will get a new title parameter that defaults to class(widget)[[1]]. Any objections to that?
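
A minimal sketch of what that proposed default would evaluate to, using only base R. The `default_title` helper and the mock widget object are hypothetical and exist only for illustration; they are not part of htmlwidgets, and the class vector merely imitates what a real widget might carry.

```r
# Hypothetical helper mirroring the proposed default: class(widget)[[1]].
default_title <- function(widget) {
  class(widget)[[1]]
}

# Mock object standing in for a widget; the class vector is an assumption.
mock_widget <- structure(list(), class = c("datatables", "htmlwidget"))

default_title(mock_widget)
#> [1] "datatables"
```

Using the first class means the title names the widget type rather than the output file, so the fallback no longer depends on a temporary file name.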

@jjallaire
Collaborator

jjallaire commented Nov 14, 2017 via email

@jcheng5
Collaborator

jcheng5 commented Nov 15, 2017

Turns out this is significantly more annoying than we originally thought.

The htmlwidgets::pandoc_self_contained_html function tells pandoc that the input format is markdown. But the actual file is a well-formed .html file, generated by htmltools::save_html. This causes two problems.

  1. Any <title> in the input document is ignored (you still get the warning); instead, pandoc wants you to add a YAML header to the input HTML file with a non-empty title or pagetitle field, which is then not actually used in our pandoc template (because its entire content is $body$).
  2. The <!DOCTYPE html> at the top of the input HTML file (rightfully put there by htmltools::save_html) becomes escaped in the output HTML: &lt;!DOCTYPE html&gt;. This behavior is different from pandoc 1.x.

Rather than try to feed pandoc a well-formed HTML document, I think it's a better idea to fork the body of htmltools::save_html, which isn't much anyway, and more carefully generate an input file and template that both pandoc 1.x and 2.0 can be happy with.
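
One way to check whether a saved file was hit by the second problem is to look at its first line. This is a rough sketch in base R; `has_escaped_doctype` is a hypothetical helper name, not an htmlwidgets function, and it assumes the doctype sits on the first line, as in the output posted earlier in this thread.

```r
# Hypothetical diagnostic: TRUE if the first line holds an escaped doctype.
has_escaped_doctype <- function(path) {
  first_line <- readLines(path, n = 1, warn = FALSE)
  length(first_line) == 1 && grepl("&lt;!DOCTYPE", first_line, fixed = TRUE)
}

# Two throwaway files: one intact, one mangled the way pandoc 2.0 produced it.
good <- tempfile(fileext = ".html")
bad <- tempfile(fileext = ".html")
writeLines(c("<!DOCTYPE html>", "<html></html>"), good)
writeLines(c("&lt;!DOCTYPE html&gt;", "<html></html>"), bad)

has_escaped_doctype(good)  # FALSE
has_escaped_doctype(bad)   # TRUE
```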

@cpsievert
Collaborator Author

This was fixed by #292
