@pdfliberation

PDF Liberation

A commons for the work of liberating data from PDF files

Popular repositories

  1. knowledge Public

    A place to collect and share knowledge about liberating data from PDFs

    Shell · 53 stars · 7 forks

  2. whatwordwhere Public

    Forked from jsfenfen/whatwordwhere

    Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the hOCR standard.

    Python · 22 stars · 5 forks

  3. pdf-hackathon Public

    Resources related to the PDF Liberation hackathon

    12 stars · 11 forks

  4. pdf_table_extraction Public

    experimenting with pdf2text and python pdf-table-extract

    JavaScript · 11 stars · 3 forks

  5. Jersey-City-Budget-PDF-Liberation Public

    This project will liberate data from PDF files found at http://www.cityofjerseycity.com/pub-info.aspx?id=2430 and will create .csv and .json files to be uploaded to https://data.openjerseycity.org/…

    Python · 6 stars · 1 fork

  6. financial_disclosure_scraping Public

    (DC team) experimenting with available options for extracting info from PFDs

    Python · 5 stars · 2 forks

Repositories

Showing 10 of 20 repositories
