Read and extract text and other content from PDFs in C# (port of PDFBox)
Updated May 15, 2024 - C#
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
This repository provides training and testing code, a dataset, detection and recognition annotations, an evaluation script, an annotation tool, and a ranking.
A curated list of resources on the topic of Document Understanding (DU)
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Powerful web application that combines Streamlit, LangChain, and Pinecone to simplify document analysis. Powered by OpenAI's GPT-3, RAG enables dynamic, interactive document conversations, making it ideal for efficient document retrieval and summarization.
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
RObust document image BINarization
Pandora is an analysis framework to discover if a file is suspicious and conveniently show the results
Local adaptive image binarization
Document Visual Question Answering
Post-process Amazon Textract results with Hugging Face transformer models for document understanding
AssemblyLine 4: File triage and malware analysis
Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.
Java Client for the expert.ai Natural Language API
Improving Document Binarization via Adversarial Noise-Texture Augmentation (ICIP 2019)
(ICFHR 2020 oral) Code for "docExtractor: An off-the-shelf historical document element extraction" paper
Trained Detectron2 object detection models for document layout analysis, based on the PubLayNet dataset
UTRNet: High-Resolution Urdu Text Recognition In Printed Documents (ICDAR'23)