WACZ Spec: Better metadata about software used to create the WACZ #127

Open
ikreymer opened this issue Aug 29, 2022 · 0 comments

Add a way to include additional metadata about how a particular WACZ file was created, in case someone wants to try to recreate the result while the data is still available (though no general guarantees of reproducibility are possible).

We do have the software field in datapackage.json, but perhaps that should be expanded.

  • For Browsertrix Crawler, does it make sense to include the hash of the image?
  • For Browsertrix Crawler, should we also include the full crawl config + crawl params?
  • For ArchiveWeb.page, probably just the version? (Could also include the git commit)
  • For py-wacz, probably also just the version? (Could also include the git commit)
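As a rough illustration of what an expanded record could look like, here is a minimal sketch. Everything in it is an assumption for discussion, not part of the spec: the `softwareDetails` key, the `commit`, `imageDigest`, and `crawlConfig` field names, the helper functions, and all values are hypothetical placeholders. It keeps a plain `software` string alongside a structured record, so existing readers of that field would be unaffected.

```python
import json


def build_software_metadata(name, version, commit=None,
                            image_digest=None, crawl_config=None):
    """Assemble a structured record for the tool that produced a WACZ.

    All field names here are hypothetical; optional fields are only
    emitted when a value is supplied.
    """
    record = {"name": name, "version": version}
    if commit is not None:
        record["commit"] = commit
    if image_digest is not None:
        record["imageDigest"] = image_digest
    if crawl_config is not None:
        record["crawlConfig"] = crawl_config
    return record


def describe(record):
    """Render a short human-readable string from the structured record."""
    return f"{record['name']} {record['version']}"


# Placeholder values only: the version, digest, and config are made up.
crawler = build_software_metadata(
    "Browsertrix Crawler",
    "0.0.0-example",
    image_digest="sha256:<image digest placeholder>",
    crawl_config={"seeds": ["https://example.com/"]},
)

# Hypothetical datapackage.json fragment: the string field stays,
# and the structured record sits next to it under an assumed key.
datapackage = {
    "software": describe(crawler),
    "softwareDetails": crawler,
}

print(json.dumps(datapackage, indent=2))
```

For ArchiveWeb.page or py-wacz, the same helper would be called with just a name and version, plus `commit` if the git commit is included.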

Are there any existing standard approaches to 'software citation' for this that we could extend?
