Skip to content

internetarchive/fatcat

Repository files navigation

photo of a cat and cup of coffee, by Quinn Kampschroer [CC-0]

fatcat: Perpetual Access to the Scholarly Record

pipeline status coverage report

This repository contains source code for fatcat, an editable catalog of published research (mostly journal articles), with a focus on tracking the location and status of full-text copies on the public web, to ensure long term access. The primary public instance runs at fatcat.wiki. Both the software project and primary instance are a project of the Internet Archive.

Some resources for learning more about the aims, goals, and structure of this overall project:

Getting Started for Developers

There are three main components:

  • backend API server and database (in Rust)
  • API client libraries and bots (in Python)
  • front-end web interface (in Python; built on API and library)

The python/ and rust/ folders have their own READMEs describing how to set up development environments and requirements for those languages. Each also has Makefiles to help with builds and running tests.

The python client library, which is automatically generated from the API schema, lives under ./python_openapi_client/.

To do unified development involving both the python code (web interface, bot code) and the rust code (API server), you will likely need to install and run a PostgreSQL (11+) database locally. For more advanced development involving Kafka data pipelines or the metadata search index, there is a docker compose file in ./extra/docker/ to run these services locally.

Contributors are asked run all of the following and correct any (new) lint warnings before submitting patches:

make fmt
make lint
make test

It is very appreciated if new features and code comes with full test coverage, but maintainers can review code and help if this is difficult.

Contributing

Software, documentation, new bots, and other contributions to this repository are welcome! When considering a non-trivial contribution, it can save review time and duplicated work to jump in the chatroom or post an issue with your idea and intentions before starting work. Large changes and features usually have a design proposal drafted and merged first (see ./proposals/).

There is a public chatroom where you can discuss and ask questions at gitter.im/internetarchive/fatcat.

Contributors in this project are asked to abide by our Code of Conduct.

See the LICENSE file for detailed permissions and licensing of both python and rust code. In short, the auto-generated client libraries are permissively released, while the API server and web interface are strong copyleft (AGPLv3).

For software developers, the "help wanted" tag in Github Issues is a way to discover bugs and tasks that external folks could contribute to.

Thanks!

The "cat with coffee" photo at the top of this README is by Quinn Kampschroer, released under a CC-0 license (public domain).