Text references don't work with multiline lines #1430

Open
eliocamp opened this issue May 16, 2023 · 7 comments · May be fixed by #1432

Comments

@eliocamp

eliocamp commented May 16, 2023

Text references don't seem to work if the text is split in multiple lines.

Example: Create a new bookdown project and go to 02-cross-refs.Rmd in the sample book and change the first figure to this:

(ref:caption) Here is a nice figure!

```{r nice-fig, fig.cap='(ref:caption)', out.width='80%', fig.asp=.75, fig.align='center', fig.alt='Plot with connected points showing that vapor pressure of mercury increases exponentially as temperature increases.'}
par(mar = c(4, 4, .1, .1))
plot(pressure, type = 'b', pch = 19)
```

As expected, text reference works and the caption is rendered where it should be:

image

Now add a new line:


(ref:caption) Here is a nice figure!
And this is a second sentence. 

```{r nice-fig, fig.cap='(ref:caption)', out.width='80%', fig.asp=.75, fig.align='center', fig.alt='Plot with connected points showing that vapor pressure of mercury increases exponentially as temperature increases.'}
par(mar = c(4, 4, .1, .1))
plot(pressure, type = 'b', pch = 19)
```

The text reference is still a single paragraph so, according to the documentation, it should still work. However, the output doesn't show the correct caption:

image

This seems to be an issue for HTML (gitbook and epub) and Word formats. PDF renders correctly.

@eliocamp eliocamp changed the title Text references don't work with multiline paragraphs Text references don't work with multiline lines May 16, 2023
@mschilli87

@eliocamp: Do you mind linking the documentation you mentioned? I always assumed this was by design. Either way, I can definitely confirm that the behaviour you describe has been there for as long as I can remember using text references.

@eliocamp
Author

eliocamp commented May 16, 2023

Here: https://bookdown.org/yihui/bookdown/markdown-extensions-by-bookdown.html#text-references it states that "The text can contain anything that Markdown supports, as long as it is one single paragraph."

In Markdown (at least the flavour used by bookdown), sentences belonging to the same paragraph can each be on their own line. You need a blank line (two consecutive newlines) to start a new paragraph. (You can see in the bad example that the text reference is rendered as a single paragraph.)

Also, PDF output does work. This seems to be an issue for HTML (gitbook and epub) and Word formats only.

image

@eliocamp
Author

eliocamp commented May 16, 2023

The issue is around here:

bookdown/R/html.R

Lines 807 to 821 in 52c31aa

    parse_ref_links = function(x, regexp) {
      r = sprintf(regexp, reg_ref_links)
      if (length(i <- grep(r, x)) == 0) return()
      tags = gsub(r, '\\1', x[i])
      txts = gsub(r, '\\2', x[i])
      if (any(k <- duplicated(tags))) {
        warning('Possibly duplicated text reference labels: ', paste(tags[k], collapse = ', '))
        k = !k
        tags = tags[k]
        txts = txts[k]
        i = i[k]
      }
      x[i] = ''
      list(content = x, tags = tags, txts = txts, matches = i)
    }

At this point x has one element per line instead of one element per paragraph; like this:

<p>(ref:caption) Here is a nice figure!
And this is a second sentence.</p>

So grep(r, x) doesn't find it: neither element matches r = "^<p>(\\(ref:[-/[:alnum:]]+\\)) (.+)</p>$", because the opening <p> and the closing </p> are on different lines. The fix might be to join each <p> block onto a single line, but I don't know if that could break other stuff.
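To make the mismatch concrete, here is a minimal sketch in plain R. It uses the pattern quoted above and two hand-written vectors (`x_split` and `x_joined` are illustrative names, not bookdown objects) that stand in for the HTML lines bookdown sees. It only demonstrates the regex behaviour; it is not a proposed patch.

```r
# Pattern quoted in the comment above
r <- "^<p>(\\(ref:[-/[:alnum:]]+\\)) (.+)</p>$"

# Hypothetical input: one element per line, as when the source line break
# is preserved in the HTML
x_split <- c(
  "<p>(ref:caption) Here is a nice figure!",
  "And this is a second sentence.</p>"
)

# The same paragraph joined onto a single line
x_joined <- paste(x_split, collapse = " ")

grep(r, x_split)          # integer(0): no single element has both <p> and </p>
grep(r, x_joined)         # 1
gsub(r, "\\1", x_joined)  # "(ref:caption)"
gsub(r, "\\2", x_joined)  # "Here is a nice figure! And this is a second sentence."
```

Joining the paragraph is done by hand here. Whether bookdown should join `<p>` blocks itself or ask Pandoc not to wrap is the open question in this thread.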

Using --wrap=none here instead of --wrap=preserve actually solves the issue, but it might create others.

bookdown/R/html.R

Lines 80 to 82 in 52c31aa

    pandoc_args2 = function(args) {
      if (pandoc2.0() && !length(grep('--wrap', args))) c('--wrap', 'preserve', args) else args
    }
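The snippet above only adds `--wrap preserve` when no `--wrap` option is already present, so a user-supplied value takes precedence. A small sketch of that behaviour follows. `pandoc2.0()` is internal to bookdown, so it is replaced here by a stub that always returns TRUE; that stub is an assumption for illustration, not the real version check.

```r
# Stub for bookdown's internal Pandoc version check (assumed TRUE here)
pandoc2.0 <- function() TRUE

# Copied from the snippet above
pandoc_args2 <- function(args) {
  if (pandoc2.0() && !length(grep('--wrap', args))) c('--wrap', 'preserve', args) else args
}

pandoc_args2(NULL)           # "--wrap" "preserve": the default is injected
pandoc_args2("--wrap=none")  # "--wrap=none": the user's option is left untouched
```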

@eliocamp
Author

Ok. It seems that --wrap=preserve was introduced to fix #504. But that's only related to LaTeX, which doesn't have this problem.
Removing this option only for HTML documents seems to solve the issue in HTML, and all tests pass (at least locally on my machine).

The change would be to pass pandoc_args directly in all HTML functions instead of passing it through pandoc_args2(). For instance, in these lines:

bookdown/R/html.R

Lines 59 to 62 in 52c31aa

    config = get_base_format(base_format, list(
      toc = toc, number_sections = number_sections, fig_caption = fig_caption,
      self_contained = FALSE, lib_dir = lib_dir,
      template = template, pandoc_args = pandoc_args2(pandoc_args), ...

As a workaround, one can set

  pandoc_args:
    - --wrap=none

in the output format options in the YAML.

I don't know how you feel about this. If you agree with the fix, I can send a PR.

@cderv
Collaborator

cderv commented May 16, 2023

Thanks for the investigation. I think we should have used --wrap=none for HTML output instead of --wrap=preserve.

This is part of a change in HTML output with Pandoc 2.17, where --wrap now has an effect.
I wanted to do that in #1304, but preserve was already set. You reminded us why (#504).

@eliocamp eliocamp linked a pull request May 24, 2023 that will close this issue
@king-of-poppk

I've got a similar issue when embedding \n's in fig.cap: the Figure x.y caption prefix is not resolved. Is this related or is it a separate issue?

@cderv
Collaborator

cderv commented Oct 11, 2023

@king-of-poppk Please open a new issue with a reproducible example and we can have a look.
