Parser for WARC (aka WebArchive) files
Updated May 22, 2024 - C#
Save web pages as Safari webarchive files from the command line
Seeder - Czech webarchive curating tool and public site
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
A Rails engine supporting the discovery of web archives.
Docker image for the Archives Unleashed Toolkit
A toolkit for developing algorithms that sample mementos from a web archive collection.
Add-On for Google Sheets to help those working with web archives.
Repository of scripts for capturing MyConvento newsroom press releases from the MyConvento PR management suite. The README analyzes the MyConvento URL architecture for users who want to develop their own solution.
A Python utility for publishing a social media story built from archived web pages to multiple services.
A library for interacting with web archive collections at Archive-It, Trove, Pandora, and more.
Rails application for the Archives Unleashed Cloud.
Links on the web break all the time, robustify them!
A Dockerized, queued, high-fidelity web archiver based on Squidwarc
Squidwarc is a high-fidelity, user-scriptable archival crawler that uses Chrome or Chromium, with or without a head