epub: fix fatal errors while parsing EPUB files #1854

milmazz · 2024-01-26T03:40:59Z

After generating the EPUB file for the Elixir docs with this version, and reviewing the result with epubcheck, I got the following summary:

$ epubcheck doc/elixir/Elixir.epub --json elixir_docs.json
Check finished with errors
Messages: 0 fatals / 140 errors / 0 warnings / 0 infos

If you compare this result with what we had in #1851:

Messages: 9 fatals / 425 errors / 0 warnings / 0 infos

you can see that there are no longer any messages with fatal severity, and the number of errors has dropped considerably =)

I manually checked the generated EPUB in Apple Books: the previously truncated sections are fixed, the banner "Below is a rendering of the page up to the first error" no longer appears, and the links to the different anchors seem to work.

Fixes: #1851

milmazz · 2024-01-26T03:43:56Z

lib/ex_doc/formatter/epub.ex

+    |> String.replace(~r{id="&+/\d+[^"]*}, &String.replace(&1, "&", "&amp;"))
+    |> String.replace(~r{href="[^#"]*#&+/\d+[^"]*}, &String.replace(&1, "&", "&amp;"))


I frowned a little at these nested String.replace calls, so please let me know if you have any advice on how to improve this function.

@wojtekmach I thought we had already escaped those when generating the links. Maybe this is something (or an option) we can pass when autolinking? We can fix the id by escaping it in the document itself.
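
For readers following the thread, here is a minimal sketch of what the two replacements in the diff do to an `id` and an `href` that contain a raw `&`. The module name `EpubAnchorSketch`, the function names, and the sample markup are hypothetical and only for illustration; they are not part of ExDoc. Pulling the inner replacement into a named private function is one possible way to avoid the visually nested `String.replace` calls, not necessarily the approach the PR ended up with.

```elixir
defmodule EpubAnchorSketch do
  # Hypothetical helper: escapes raw ampersands in anchors such as `&&/2`,
  # both where the anchor is defined (id) and where it is linked to (href).
  def escape_ampersand_anchors(html) do
    html
    |> String.replace(~r{id="&+/\d+[^"]*}, &escape/1)
    |> String.replace(~r{href="[^#"]*#&+/\d+[^"]*}, &escape/1)
  end

  # Receives the whole regex match and escapes every `&` inside it.
  defp escape(match), do: String.replace(match, "&", "&amp;")
end

# Sample markup (assumed shape, not taken from real ExDoc output).
html = ~s(<h2 id="&&/2">and</h2><a href="Kernel.xhtml#&&/2">link</a>)

IO.puts(EpubAnchorSketch.escape_ampersand_anchors(html))
# prints: <h2 id="&amp;&amp;/2">and</h2><a href="Kernel.xhtml#&amp;&amp;/2">link</a>
```

Only the matched `id="..."` and `href="..."` fragments are rewritten; text outside those matches is left untouched.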

milmazz · 2024-01-26T03:44:51Z

test/fixtures/README.md

+
+The following text includes a reference to an anchor that causes problems in EPUB documents.
+
+To remove this anti-pattern, we can replace `&&/2`, `||/2`, and `!/1` by `and/2`, `or/2`, and `not/1` respectively.


Added this line to demonstrate that we're transforming the links to problematic anchors in EPUB files.

milmazz added 2 commits January 25, 2024 21:57

Merge branch 'main' into epub/fix-fatal-errors

014d258

fix nav layout

146388d
