Epub files without an NCX table of contents throw error when opening in browser #525

Open
dunxd opened this issue Apr 19, 2023 · 5 comments

dunxd commented Apr 19, 2023

When trying to open some epub files in the browser viewer, I get a blank page and the following error in the log:

[Wed Apr 19 16:47:37 2023] PHP Fatal error:  Uncaught Error: Call to a member function attr() on null in /cops/resources/php-epub-meta/lib/EPub.php:79
Stack trace:
#0 /cops/epubreader.php(27): EPub->initSpineComponent()
#1 {main}
  thrown in /cops/resources/php-epub-meta/lib/EPub.php on line 77

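The failing call is on the third line of the initSpineComponent() function in EPub.php, which sets the $tochref variable. When no manifest item matches $tocid, item(0) returns null and the attr() call fails: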

public function initSpineComponent()
    {
        $spine = $this->xpath->query('//opf:spine')->item(0);
        $tocid = $spine->getAttribute('toc');
        $tochref = $this->xpath->query('//opf:manifest/opf:item[@id="' . $tocid . '"]')->item(0)->attr('href');
        $tocpath = $this->getFullPath($tochref);
        // read epub toc
        if (!$this->zip->FileExists($tocpath)) {
            throw new Exception('Unable to find ' . $tocpath);
        }

        $data = $this->zip->FileRead($tocpath);
        $this->toc = new DOMDocument();
        $this->toc->registerNodeClass('DOMElement', 'EPubDOMElement');
        $this->toc->loadXML($data);
        $this->toc_xpath = new EPubDOMXPath($this->toc);
        $rootNamespace = $this->toc->lookupNamespaceUri($this->toc->namespaceURI);
        $this->toc_xpath->registerNamespace('x', $rootNamespace);
    }

After some trial and error I found that this error is thrown when opening EPub files that do not have an NCX table of contents. Calibre creates an NCX when you edit the table of contents of an EPub, if one doesn't already exist. You don't need to make any actual changes to the table of contents: just open the Edit Table of Contents dialog and click Ok.

NCX seems to be an older ToC format that Calibre adds for backwards compatibility when it creates a ToC. The PHP EPub Meta library that COPS uses hasn't been maintained in a while, so perhaps NCX was the norm when it was written.

I'm investigating whether Calibre can be set up to create the NCX ToC on import. I also found a more recent fork of PHP EPub Meta that may not have this problem.
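
A minimal sketch of a guarded version of that lookup, for illustration only. `findNcxHref` is a hypothetical helper, not part of PHP EPub Meta, and it uses PHP's plain DOM classes rather than the library's `EPubDOMXPath` wrapper. It returns null instead of failing when the spine has no `toc` attribute or the manifest has no matching item:

```php
<?php
// Hypothetical helper (not part of php-epub-meta): returns the href of the
// NCX file declared in an OPF package document, or null when there is none.
function findNcxHref(string $opfXml): ?string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($opfXml)) {
        return null;
    }
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('opf', 'http://www.idpf.org/2007/opf');

    $spine = $xpath->query('//opf:spine')->item(0);
    if (!$spine instanceof DOMElement) {
        return null;
    }
    // getAttribute() returns an empty string when the attribute is absent.
    $tocid = $spine->getAttribute('toc');
    if ($tocid === '') {
        return null;
    }
    // Compare ids in PHP so the id never has to be spliced into an XPath string.
    foreach ($xpath->query('//opf:manifest/opf:item') as $item) {
        if ($item instanceof DOMElement && $item->getAttribute('id') === $tocid) {
            return $item->getAttribute('href');
        }
    }
    return null;
}
```

A caller could then skip the NCX parsing, or fall back to another ToC source, whenever the result is null.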

dunxd commented Apr 19, 2023

If Calibre is set to convert to EPub v3, the resulting files have no NCX. If it is set to convert to EPub v2, the NCX is present instead of nav.xhtml.

I have not yet found a way to make Calibre include an NCX when converting to EPub v3, even though the format allows one for compatibility. Editing the ToC in Calibre does generate the NCX in EPub v3 files.

So there are two workarounds:

  1. Use only EPub v2 in Calibre
  2. Edit the ToC of EPub v3 files in Calibre.
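
To see which files would need one of these workarounds, something like the following could inspect the OPF package document of each book. This is a sketch under assumptions: `detectTocFormats` is a hypothetical name, and the check relies on the media type `application/x-dtbncx+xml` for the NCX and on the `nav` manifest property for the EPub v3 navigation document, as defined by the EPUB specifications.

```php
<?php
// Hypothetical helper: reports which table-of-contents formats an OPF
// package document declares in its manifest.
function detectTocFormats(string $opfXml): array
{
    $found = ['ncx' => false, 'nav' => false];

    $doc = new DOMDocument();
    if (!$doc->loadXML($opfXml)) {
        return $found;
    }
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('opf', 'http://www.idpf.org/2007/opf');

    foreach ($xpath->query('//opf:manifest/opf:item') as $item) {
        if (!$item instanceof DOMElement) {
            continue;
        }
        // The NCX is identified by its media type.
        if ($item->getAttribute('media-type') === 'application/x-dtbncx+xml') {
            $found['ncx'] = true;
        }
        // The EPUB 3 navigation document carries "nav" in its
        // space-separated properties attribute.
        $properties = preg_split('/\s+/', trim($item->getAttribute('properties')));
        if (in_array('nav', $properties, true)) {
            $found['nav'] = true;
        }
    }
    return $found;
}
```

Files that report `nav` but not `ncx` are the ones that would hit the error described above.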

mikespub added a commit to mikespub-org/seblucas-cops that referenced this issue Jun 7, 2023

mikespub commented Jun 8, 2023

The fix above should allow you to view EPUB 3 files via the browser viewer in COPS, without needing to edit the TOC in Calibre.

That being said, the underlying "monocle" library hasn't been updated in 10 years and there's no filtering of content from the EPUB file, so there are security risks if you can't trust the origin of the ebook you want to view via browser.

See https://github.com/joseph/Monocle/wiki/EPUB-and-other-package-formats for details

dunxd commented Jun 8, 2023

Thanks,
I tried raising the issue on MobileRead. I wish Calibre would add the NCX ToC automatically when converting to ePub v3, instead of making it a manual-only option. Till then I'm going to stick with ePub v2.

dunxd commented Jun 8, 2023

Actually, I take that back, as I didn't understand your comment or what you linked to the first time around.

What Monocle's wiki is saying is that using ePub v3 in general is a security risk compared to v2. This has nothing to do with the NCX issue and more to do with ePub v3 allowing JavaScript, which could be malicious. I guess that could be a worry if obtaining ePub files from the dark web, although it could also be FUD from 10 years ago.

The fix you made allows COPS' web reader to open ePub v3 files in the browser, and hopefully browsers that support JavaScript today are at no more risk from JavaScript in ePubs than from any other web page they open.

In other words - thanks for fixing this!!!

mikespub commented

Included in release 1.3.4 at https://github.com/mikespub-org/seblucas-cops

mikespub added a commit to mikespub-org/php-epub-meta that referenced this issue Sep 9, 2023