Releases: Conal-Tuohy/APIHarvester

Exponential back-off

08 Jul 03:06

In this release, when a request returns an error and is retried (potentially several times, depending on the value of the retries parameter), the pause before each successive retry of the same URI doubles. The first retry is made after a 5s wait, the second after a 10s wait, the third after a 20s wait, and so on.
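
The doubling schedule can be illustrated with a minimal Java sketch. This is not taken from the APIHarvester source: the class name `BackoffSketch`, the method `delayBeforeRetry`, and the retry count used in `main` are hypothetical, and only the 5s starting delay and the doubling rule come from the release note.

```java
import java.time.Duration;

public class BackoffSketch {
    // From the release note: the first retry is made after a 5s wait.
    static final long INITIAL_DELAY_SECONDS = 5;

    /**
     * Wait before the given retry of the same URI (1-based): 5s, 10s, 20s, ...
     * Hypothetical helper; very large retry numbers would overflow the shift.
     */
    static Duration delayBeforeRetry(int retryNumber) {
        if (retryNumber < 1) {
            throw new IllegalArgumentException("retryNumber must be >= 1");
        }
        return Duration.ofSeconds(INITIAL_DELAY_SECONDS << (retryNumber - 1));
    }

    public static void main(String[] args) {
        int retries = 4; // assumed value of the retries parameter
        for (int i = 1; i <= retries; i++) {
            System.out.println("retry " + i + " after "
                    + delayBeforeRetry(i).getSeconds() + "s");
        }
    }
}
```

With four retries, the total time spent waiting on one failing URI would be 5 + 10 + 20 + 40 = 75 seconds.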

APIHarvester with filtering and enhanced resumption control

06 May 15:16

This release adds the resume-when-xpath and discard-xpath parameters. The first is a boolean XPath expression that controls whether APIHarvester resumes harvesting after receiving a response from the API. The second is an XPath expression matching elements in the response that the harvester can ignore.
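
As a rough illustration of how two such expressions might be applied to a response, here is a sketch using the standard `javax.xml.xpath` API. It is not APIHarvester's implementation: the class `FilterSketch`, its methods, and the sample expressions in the usage note are all assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FilterSketch {
    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Evaluates a boolean XPath expression against the response. */
    static boolean shouldResume(Document response, String resumeWhenXPath) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(resumeWhenXPath, response, XPathConstants.BOOLEAN);
    }

    /** Removes every element matched by the expression; returns how many matched. */
    static int discard(Document response, String discardXPath) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList matches =
                (NodeList) xpath.evaluate(discardXPath, response, XPathConstants.NODESET);
        for (int i = 0; i < matches.getLength(); i++) {
            Node match = matches.item(i);
            match.getParentNode().removeChild(match);
        }
        return matches.getLength();
    }
}
```

For a hypothetical response such as `<response><record/><record/><facets/></response>`, calling `shouldResume(doc, "count(//record) > 0")` would allow harvesting to continue while records are still being returned, and `discard(doc, "//facets")` would strip the unwanted element.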

V1.2

30 Dec 08:34

This version adds the ability to throttle requests by specifying a delay, in seconds, between successive requests.

This version also allows for the resumptionXPath to return a string rather than just a nodeset. This means the resumption URL can be assembled from parts, rather than simply read out of the harvested document, which allows APIHarvester to harvest from less RESTful APIs such as OAI-PMH.
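
To show what assembling a resumption URL from parts can look like, the sketch below evaluates an XPath `concat()` expression as a string with the standard `javax.xml.xpath` API. The class `ResumptionSketch`, the `example.org` base URL, and the sample expression are assumptions for illustration, not APIHarvester's actual code or configuration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ResumptionSketch {
    /** Evaluates the expression against the harvested document and returns its string value. */
    static String nextUrl(String harvestedXml, String resumptionXPath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(harvestedXml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The two-argument evaluate() converts the result to a string.
        return xpath.evaluate(resumptionXPath, doc);
    }

    public static void main(String[] args) throws Exception {
        // Cut-down OAI-PMH response: it carries a token, not a ready-made URL.
        String response =
                "<OAI-PMH xmlns=\"http://www.openarchives.org/OAI/2.0/\">"
              + "<ListRecords><resumptionToken>abc123</resumptionToken></ListRecords>"
              + "</OAI-PMH>";
        // Hypothetical expression: a fixed base URL joined to the token.
        String expression =
                "concat('https://example.org/oai?verb=ListRecords&resumptionToken=',"
              + " //*[local-name()='resumptionToken'])";
        System.out.println(nextUrl(response, expression));
    }
}
```

The sketch uses `local-name()` so that it does not depend on any prefix binding. XPath 1.0 has no URL-encoding function, so a token containing reserved characters would need extra handling that is not shown here.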

namespace-aware version

30 Dec 07:05

This version adds the ability to bind XML Namespace prefixes to URIs, and to use those prefixes to refer to namespaces in the XPath expressions used to control APIHarvester.
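
In Java's standard XPath API, prefix bindings of this kind are supplied through a `NamespaceContext`. The following sketch shows one way that could be done. The class `NamespaceSketch`, the `oai` prefix, and the sample expression are assumptions, and nothing here is drawn from the APIHarvester source.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceSketch {
    /** Builds a NamespaceContext from a prefix-to-URI map. */
    static NamespaceContext contextFor(Map<String, String> bindings) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("prefix must not be null");
                }
                return bindings.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }

            @Override
            public String getPrefix(String namespaceUri) {
                Iterator<String> prefixes = getPrefixes(namespaceUri);
                return prefixes.hasNext() ? prefixes.next() : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceUri) {
                List<String> found = new ArrayList<>();
                for (Map.Entry<String, String> entry : bindings.entrySet()) {
                    if (entry.getValue().equals(namespaceUri)) {
                        found.add(entry.getKey());
                    }
                }
                return found.iterator();
            }
        };
    }

    /** Counts the nodes selected by a prefixed XPath expression. */
    static double count(String xml, String nodeSetXPath, Map<String, String> bindings)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed names to match
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(contextFor(bindings));
        return (Double) xpath.evaluate("count(" + nodeSetXPath + ")", doc,
                XPathConstants.NUMBER);
    }
}
```

Binding `oai` to `http://www.openarchives.org/OAI/2.0/` would then let an expression such as `//oai:record` select elements in that namespace, regardless of which prefix (if any) the harvested document itself uses.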

There is an additional option to indent harvested XML.

First release

29 Dec 14:31

The initial release of APIHarvester is a command-line application written in Java, which allows for easy harvesting of bulk XML records from web APIs such as those of the National Library of Australia's Trove service and the National Library of New Zealand's Digital NZ.