Releases: danburzo/percollate

v4.2.0

06 May 18:36

New features

Entries in the EPUB file are now compressed with the DEFLATE algorithm at the maximum compression level (#169).
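As a rough illustration of what maximum-level DEFLATE means in Node.js, the sketch below compresses a stand-in chapter with the built-in zlib module. It is not Percollate's actual packaging code, which this note does not describe; the sample markup and variable names are assumptions made for the example.

```javascript
const zlib = require('node:zlib');

// Assumed stand-in for an XHTML chapter inside an EPUB; repetitive markup compresses well.
const entry = Buffer.from('<p>Lorem ipsum dolor sit amet.</p>\n'.repeat(200));

// Raw DEFLATE stream at the highest compression level (Z_BEST_COMPRESSION is 9).
const compressed = zlib.deflateRawSync(entry, {
	level: zlib.constants.Z_BEST_COMPRESSION
});

// Decompressing gives back the original bytes.
const restored = zlib.inflateRawSync(compressed);

console.log(`${entry.length} bytes -> ${compressed.length} bytes`);
```

Higher levels trade CPU time for smaller output, which tends to be a reasonable trade for a file that is written once and read many times.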

v4.1.1

04 May 13:53

New features

Adds the --toc-level=<level> option. By default, the table of contents is a flat list of article titles. With the --toc-level option the table of contents will include headings under each article title (<h2>, <h3>, etc.), up to the specified heading depth. A number between 1 and 6 is expected. Using --toc-level with a value greater than 1 implies --toc.
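The depth rule can be pictured with a small sketch. It assumes the article title counts as level 1 and that headings are available as plain objects; `tocEntries` and the sample data are hypothetical and do not reflect Percollate's internals.

```javascript
// Hypothetical helper: keeps only the headings that fall within the requested depth.
// `level` is the heading rank (1 for the article title, 2 for <h2>, and so on).
function tocEntries(headings, tocLevel = 1) {
	if (!Number.isInteger(tocLevel) || tocLevel < 1 || tocLevel > 6) {
		throw new RangeError('tocLevel must be an integer between 1 and 6');
	}
	return headings.filter(heading => heading.level <= tocLevel);
}

// Assumed sample article, in document order.
const headings = [
	{ level: 1, text: 'Article title' },
	{ level: 2, text: 'Background' },
	{ level: 3, text: 'Early work' },
	{ level: 2, text: 'Results' }
];

const flat = tocEntries(headings).map(h => h.text);
const nested = tocEntries(headings, 2).map(h => h.text);
console.log(flat, nested);
```

With the default depth only the title survives; a depth of 2 adds the two `<h2>` headings and still leaves out the `<h3>`.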

v4.0.5

18 Jan 23:11

Bug fixes

v4.0.4

25 Nov 18:23

Bug fixes

v4.0.3

26 Sep 09:16

Bug fixes

  • Fixes usage of mdast-util-gfm to allow serializing HTML <table> elements to Markdown when using percollate md (#161)

v4.0.2

07 Mar 22:29

Bug fixes

Further improvements to detecting and bundling images re: #141 (which should have really been part of v4.0.1, had the necessary insight not manifested exactly five seconds after publishing said version).

v4.0.1

07 Mar 21:37

Bug fixes

Thank you @vongrad for contributing two fixes to this release:

  • Fixes regression in --inline failing to base64-encode images (#154)
  • Fixes heuristic in imagesAtFullSize() DOM enhancement to exclude non-English Wikipedia URLs that look like they point to images but are in fact HTML pages (e.g. wiki/File: URLs in English) (#156, #141)
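For readers unfamiliar with what inlining an image involves, here is a minimal sketch of base64-encoding image bytes into a data: URL. The `toDataUrl` helper is hypothetical and is not taken from Percollate's source; the PNG signature merely stands in for real image data.

```javascript
// Hypothetical helper: builds a data: URL from raw image bytes.
function toDataUrl(bytes, mimeType) {
	return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// The 8-byte PNG signature stands in for a full image file.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const src = toDataUrl(pngSignature, 'image/png');

console.log(src);
```

The resulting string can be used as the value of an image's src attribute, so the document no longer depends on an external file.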

v4.0.0

20 Feb 14:06

Breaking changes

This release changes how Percollate interprets operands (see #150): when no operand is provided, an implicit - (stdin) is assumed. This makes it easier to pipe data into percollate from an external tool.
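The operand rule amounts to a one-line default. The sketch below only restates it in code; `resolveOperands` is a hypothetical name and not Percollate's actual function.

```javascript
// Hypothetical helper mirroring the rule "no operand means stdin".
function resolveOperands(operands) {
	return operands.length === 0 ? ['-'] : operands;
}

console.log(resolveOperands([]));
console.log(resolveOperands(['https://example.com']));
```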

Although not part of the public API, Percollate's logging has largely shifted from stdout to stderr, to allow html and md to be piped to an external tool.

New features

  • Support for Markdown output with percollate md (#93)
  • html and md commands can output to stdout with the -o - / --output=- flag (#150). When used in combination with the --individual flag, all results are concatenated to stdout.

v3.0.0

17 Feb 14:43

⚠️ Breaking changes

Node 14 required

Node.js 14.17 or later is required to run Percollate 3.0.0. Users on Node.js 12.x can continue using Percollate 2.x by installing it with:

npm install -g percollate@2

Programmatic API breaking changes

Note: The programmatic API is not currently part of the public, documented API.

fetchContent(), which used to return the page content as a string decoded as 'utf-8', now returns an object of the shape { buffer: ArrayBuffer, contentType: string? }. Consequently, the results of pdf(), epub() and html() now carry this same structure in their .originalContent property. See Programmatic API migration below for details.

New features

Experimental Firefox support for PDF rendering

Added experimental Firefox Nightly support for rendering PDFs, via the percollate pdf --browser=firefox option. To fetch Firefox Nightly, perform the following installation steps:

# fetches Chrome
npm install -g percollate

# fetches Firefox Nightly
PUPPETEER_PRODUCT=firefox npm install -g percollate

Bug fixes

Better default styles for code blocks with the tab-size: 2 CSS property.

Migration

Programmatic API migration

Note: The programmatic API is not currently part of the public, documented API.

In general, an ArrayBuffer can be converted to a String with the TextDecoder class available in Node.js. In case the content uses a different encoding than the default utf-8, you can use the whatwg-mimetype and html-encoding-sniffer packages (on which jsdom already depends) to obtain the content's encoding:

import { TextDecoder } from 'node:util';
import htmlEncodingSniffer from 'html-encoding-sniffer';
import MimeType from 'whatwg-mimetype';

const { buffer, contentType } = await fetchContent(...);

// Charset declared in the Content-Type value, if there is one
const encoding = contentType
	? new MimeType(contentType).parameters.get('charset')
	: undefined;

// Sniff the encoding from the bytes, passing the declared charset as a hint,
// then decode the buffer into a string
const str = new TextDecoder(
	htmlEncodingSniffer(buffer, {
		transportLayerEncodingLabel: encoding
	})
).decode(buffer);
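For a version that runs without Percollate or extra packages, the sketch below builds a stand-in object of the documented { buffer, contentType } shape and decodes it with the global TextDecoder. The `charsetOf` helper is a hypothetical, deliberately naive replacement for whatwg-mimetype, and it does no content sniffing.

```javascript
// Stand-in for the object fetchContent() is described as returning.
const bytes = Buffer.from('Crème brûlée', 'utf16le');
const content = {
	buffer: bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength),
	contentType: 'text/html; charset=utf-16le'
};

// Naive charset lookup (hypothetical helper); it ignores quoted values and
// other edge cases that a real MIME type parser handles.
function charsetOf(contentType) {
	const match = /;\s*charset=([^;\s]+)/i.exec(contentType ?? '');
	return match ? match[1] : 'utf-8';
}

// Decode the ArrayBuffer using the declared charset, falling back to utf-8.
const decoded = new TextDecoder(charsetOf(content.contentType)).decode(content.buffer);

console.log(decoded);
```

Relying only on the declared charset is weaker than the sniffing approach above, since pages can omit or misreport it.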

v2.2.2

27 Jan 13:13

Bug fixes

  • Duplicate file names are now given a numeric suffix to avoid one overwriting the other (#144)
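One way to picture the suffixing is sketched below. The `uniqueNames` helper and the `-1`, `-2` suffix format are assumptions for illustration; the exact naming scheme Percollate uses is not stated in this note.

```javascript
// Hypothetical helper: appends a numeric suffix to names that were already used.
function uniqueNames(names) {
	const used = new Set();
	return names.map(name => {
		// Split "article.pdf" into "article" and ".pdf" so the suffix goes before the extension.
		const dot = name.lastIndexOf('.');
		const stem = dot > 0 ? name.slice(0, dot) : name;
		const ext = dot > 0 ? name.slice(dot) : '';
		let candidate = name;
		for (let i = 1; used.has(candidate); i++) {
			candidate = `${stem}-${i}${ext}`;
		}
		used.add(candidate);
		return candidate;
	});
}

const result = uniqueNames(['article.pdf', 'article.pdf', 'notes.pdf', 'article.pdf']);
console.log(result);
```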