This repository has been archived by the owner on Mar 1, 2023. It is now read-only.

Avoid breakage of URLs in plain text representations of quarterly reports #530

Open
grahamperrin opened this issue Oct 13, 2022 · 7 comments

Comments

@grahamperrin
Contributor

grahamperrin commented Oct 13, 2022

An example:

MicrosoftAzure

image

Another example of breakage within the same report:

image


An example from the previous report, four URLs broken in the report from The FreeBSD Foundation:

image

– archived at https://markmail.org/message/wu5qp6kwmxajcxq5 (there's coincidental additional breakage, which is beyond the scope of this issue #530).


@grahamperrin
Contributor Author

@sergio-carlavilla @Paeps @bapt I guess that use of angle brackets can (or should) prevent breakage.

Although the test e-mail archived at https://lists.freebsd.org/archives/freebsd-test/2022-October/000138.html was primarily for a different purpose, it also seems to demonstrate a long (non-broken) line, with angle brackets enclosing the URL:

image

  • background (top left), my sent copy of the e-mail
  • foreground, the copy archived by FreeBSD.

RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax, Appendix C. Delimiting a URI in Context
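For illustration, RFC 3986 Appendix C recommends angle brackets as delimiters and says that whitespace added to break a long URI across lines should be ignored when the URI is extracted. A minimal sketch of that extraction follows; the function name and the regular expression are assumptions made for this example, not part of any FreeBSD tooling.

```python
import re

# Text between angle brackets that begins with a URI scheme (e.g. "https:").
# The body may contain whitespace or line breaks introduced by wrapping.
ANGLE_URI = re.compile(r"<\s*([A-Za-z][A-Za-z0-9+.-]*:[^<>]*?)\s*>")


def extract_delimited_uris(text: str) -> list[str]:
    """Return URIs found between angle brackets, with embedded whitespace removed."""
    return [re.sub(r"\s+", "", match.group(1)) for match in ANGLE_URI.finditer(text)]


if __name__ == "__main__":
    sample = (
        "See <https://www.freebsd.org/status/report-2022-04-2022-06/\n"
        "#_freebsd_on_microsoft_hyperv_and_azure> for details."
    )
    for uri in extract_delimited_uris(sample):
        print(uri)
```

A reader (human or program) that follows this convention can rejoin a wrapped URL, but only if the angle brackets survive into the plain text.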

grahamperrin added a commit to grahamperrin/freebsd-quarterly that referenced this issue Oct 13, 2022
The manual page for lsof is in section 8, not 1. 

The + for a hard line break was superfluous, where the next line was blank. 

<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266881> aims to raise awareness of macros such as package and man. 

Whilst the Link URL within this report is not part of a long sentence, the markup in this pull request might be good in the context of <freebsd#530>. 

Whilst here, a handful of suggested changes.
@grahamperrin
Contributor Author

… use of angle brackets can (or should) prevent breakage. …

From https://docs.asciidoctor.org/asciidoc/latest/macros/autolinks/#url-schemes-for-autolinks, with added emphasis:

… In plain text documents, a bare URL is often enclosed in angle brackets.

You'll often see <https://example.org> used in examples.

To accommodate this convention, the AsciiDoc processor will still recognize the URL as an autolink, but will discard the angle brackets in the output (as they are not deemed significant). …

So, for example:

image

  • to the left, raw, with angle brackets
  • to the right, previewed by GitHub.

Given the discarding of angle bracket enclosures – in that context (the window to the right) – now I can't figure out whether discards will occur (and lead to breakage) in the context of e-mail …

@grahamperrin
Contributor Author

Hmm. Until a few minutes ago, I assumed some use of a2x(1) (ref: https://asciidoc-py.github.io/chunked/ch05.html#X43) for production of plain texts.

Since experimenting, I imagine that some other routine is followed (not involving a2x) …

@pauamma
Contributor

pauamma commented Oct 13, 2022

a2x seems to be geared toward DocBook XML intermediate stages, but we no longer use that at all. (It used to be the doc source format.)

@lsalvadore
Member

Angle brackets seem nice to avoid breaking links; however, I doubt they really fix any of our problems:

  • some links are just too long to stay on a line. Would angle brackets help for the link https://www.freebsd.org/status/report-2022-04-2022-06/#_freebsd_on_microsoft_hyperv_and_azure suggested above for example? I fear they would not.

  • if I understand correctly, angle brackets help in the web rendering of adoc files. Mails are generated from the web-rendered page at a later stage, and lines are broken at 80 columns using w3m -cols 80. I think this is what breaks the links, and using angle brackets would unfortunately have no effect, since the angle brackets would not appear in the web-rendered version.
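To make the first point concrete: a URL that is itself longer than the column limit cannot fit on one line within that limit, with or without brackets. The sketch below lists such URLs in a piece of text. The function name, the URL pattern, and the default of 80 columns (taken from the w3m -cols 80 invocation mentioned above) are assumptions for illustration only.

```python
import re

# Rough URL matcher for illustration; it stops at whitespace and brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\[\]]+")


def urls_too_long_for(text: str, columns: int = 80) -> list[str]:
    """Return URLs whose own length already exceeds the column limit."""
    return [url for url in URL_PATTERN.findall(text) if len(url) > columns]


if __name__ == "__main__":
    sample = (
        "https://www.freebsd.org/status/report-2022-04-2022-06/"
        "#_freebsd_on_microsoft_hyperv_and_azure "
        "https://github.com/lsof-org/lsof"
    )
    for url in urls_too_long_for(sample):
        print(len(url), url)
```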

The fix I propose: we introduce a new rule to enforce that each link corresponds to some short text, as in link:https://github.com/lsof-org/lsof[lsof project repo]. This requires removing the URL: fields that we have for all links at the top of the reports. At the moment these fields are used to add links to the mail version, so if we remove them the generation of the mail version of the quarterly reports needs to be done differently. Such a change would take me some time, so I suggest that for the 2022q3 quarter we simply do as usual and I try to rewrite the mail version generation later (hopefully for 2022q4).
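One way a mail version could be produced from link: macros is sketched below, purely as an illustration of the idea and not as the project's actual or planned generator. It replaces each link:URL[text] with the short text plus a numbered marker and collects the URLs separately, so that each URL can later be printed on a line of its own. The function name and the numbering scheme are hypothetical.

```python
import re

# AsciiDoc link macro of the form link:URL[text].
LINK_MACRO = re.compile(r"link:(\S+?)\[([^\]]*)\]")


def links_to_references(text: str) -> tuple[str, list[str]]:
    """Replace link macros with 'text [n]' and return the body plus the URL list."""
    urls: list[str] = []

    def replace(match: re.Match) -> str:
        urls.append(match.group(1))
        return f"{match.group(2)} [{len(urls)}]"

    body = LINK_MACRO.sub(replace, text)
    return body, urls


if __name__ == "__main__":
    source = "See the link:https://github.com/lsof-org/lsof[lsof project repo] for details."
    body, urls = links_to_references(source)
    print(body)
    for number, url in enumerate(urls, start=1):
        print(f"[{number}] {url}")
```

With the URLs moved out of running text, wrapping at 80 columns only affects the short link text, and a URL longer than the limit can be left unwrapped on its own line.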

Does that sound good?

grahamperrin added a commit to grahamperrin/freebsd-quarterly that referenced this issue Oct 14, 2022
Resolve <freebsd#531 (review)>. 

Attention to <freebsd#530 (comment)>, 

> … link: … URL: … At the moment these fields are used to add links to the mail version, …
@grahamperrin
Contributor Author

Earlier, after realising that a2x was not the answer, I had taken a hint from:

w3m -dump -cols 2147483647 https://www.freebsd.org/status/report-2022-01-2022-03/ > /tmp/2022q1.txt && less /tmp/2022q1.txt

… and so on.
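The value 2147483647 is the largest signed 32-bit integer, so the command above effectively asks for an unwrapped dump. Given such a dump and a second one wrapped at 80 columns, a small check can list the URLs that no longer appear intact in the wrapped copy. This is a sketch under the assumption that both dumps come from the same page; the function name and the URL pattern are made up for the example.

```python
import re

# Rough URL matcher for illustration; it stops at whitespace and brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\[\]]+")


def broken_urls(unwrapped: str, wrapped: str) -> list[str]:
    """Return URLs from the unwrapped dump that do not appear intact in the wrapped dump."""
    unique_urls = dict.fromkeys(URL_PATTERN.findall(unwrapped))  # keeps first-seen order
    return [url for url in unique_urls if url not in wrapped]


if __name__ == "__main__":
    unwrapped = "Link: https://example.org/a/very/long/path and https://example.org/short"
    wrapped = "Link: https://example.org/a/very/\nlong/path and https://example.org/short"
    for url in broken_urls(unwrapped, wrapped):
        print("broken:", url)
```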


#530 (comment)

Does that sound good?

All good, thanks. No rush (I'm acutely aware that we're around two weeks past the deadline for submissions).

In the meantime, I'll probably continue to experiment …
