@occrp-attic

Unmaintained OCCRP Projects

Popular repositories

  1. convert-document Public archive

    A Docker container for LibreOffice and unoconv, used to generate PDF files from office-type documents.

    Python 65 49

  2. datacommons Public archive

    A fleet of Memorious scrapers for crawling various open data sources

    Python 15 4

  3. loom Public

    Weaving SQL databases into graph data.

    Python 9

  4. exactitude Public archive

    Parsing and normalising for identifying text data (emails, domains, phone numbers, dates). Combines external libraries into a coherent API.

    Python 8 2

  5. aleph-ui Public archive

    Front-end application for the Aleph data search engine, based on React/Redux and the Aleph API.

    JavaScript 8 1

  6. spindle Public

    Front-end application for the loom graph pipeline

    Python 7 1

Repositories

Showing 10 of 24 repositories
  • urlnormalizer Public archive Forked from sunu/url-normalizer

    Normalize URLs, works with Python 2 and 3

    Python 5 MIT 2 2 0 Updated May 5, 2021
  • aleph-helm-charts Public archive

    Helm charts for Aleph

    Smarty 2 0 0 0 Updated Jan 14, 2021
  • convert-document Public archive

    A Docker container for LibreOffice and unoconv, used to generate PDF files from office-type documents.

    Python 65 MIT 49 1 0 Updated Jan 6, 2021
  • datacommons Public archive

    A fleet of Memorious scrapers for crawling various open data sources

    Python 15 4 7 0 Updated Sep 24, 2020
  • extract-entities Public archive

    Service for extracting named entities from text fragments

    Python 1 MIT 0 0 0 Updated Mar 4, 2019
  • recognize-text Public archive

    A Tesseract 4 gRPC service container for optical character recognition

    Python 5 MIT 0 0 0 Updated Mar 4, 2019
  • storagelayer Public archive

    Content-addressable storage for files across S3 and local file systems

    Python 7 MIT 4 0 0 Updated Dec 22, 2018
  • deduper Public archive

    A minimal Flask app to let folks deduplicate possible matches generated by the company enrichment process

    Python 1 0 0 0 Updated Nov 21, 2018
  • platform Public archive

    Docker base image for Aleph and Ingestors

    Makefile 3 2 0 0 Updated Jul 15, 2018
  • exactitude Public archive

    Parsing and normalising for identifying text data (emails, domains, phone numbers, dates). Combines external libraries into a coherent API.

    Python 8 MIT 2 0 0 Updated Apr 30, 2018
