Are there appropriate tools for processing and preservation of specific formats that we do not have listed? #28

Open
eengland opened this issue Jun 30, 2020 · 2 comments

Comments

@eengland
Collaborator

No description provided.

@masinter

Emulation of hardware, or virtual machines for running an OS/device/software platform, is increasingly viable for preserving ALL formats and is necessary for software preservation.

@lljohnston
Collaborator

You are right that it is increasing in viability, especially with projects like EaaSI (https://www.softwarepreservationnetwork.org/emulation-as-a-service-infrastructure/), or the work that the Internet Archive has put into in-browser game emulation using JSMESS and EM-DOSBOX (https://help.archive.org/hc/en-us/articles/360004715631-The-Internet-Arcade). Libraries, archives, and museums are absolutely taking advantage of emulation.

There are factors that organizations such as ours have to consider.

NARA started acquiring born-digital electronic records in 1970. We keep those records in their original formats to ensure that we have authentic records (and create public use copies for our catalog where we can). That means we hold over 1,000 format variants produced in 50 years' worth of operating system and software environments. We cannot always identify those environments with certainty, because agencies may have already held onto the records for 5, 10, 15, or 30+ years before they come to us under their disposition schedules. That's a lot of environments and software packages to emulate, and emulators definitely do not yet exist for all of them.

It takes resources to document environments for formats, license the necessary software for the emulations, build the individual emulation environments, and maintain and grow an environment for current and future formats. We have to operate under a long-term preservation mandate, which according to our regulations is as long as the U.S. government exists. At our scale we have over 2 billion files, which makes for a lot to manage for decades, if not centuries. And as a federal agency, we fall under federal IT regulations and practices, which means that the process of acquiring and integrating ANY technology requires an extensive review. We may not be able to integrate everything that has been developed in the community.

I'd like to see a future where we make use of emulation in some way for processing of records and public access, but it's not as practical as we want right now.
