gpt-4-pdf

This GitHub repository hosts a Python program for querying and visually annotating PDF documents. It has two main components: PDFContextExtractor and PDFSearchAndDisplay. PDFContextExtractor uses LangChain together with OpenAI's models to retrieve contextual information from a PDF. PDFSearchAndDisplay uses PyMuPDF and pdfplumber to search the PDF for specific contexts, highlight them, and capture screenshots of pages with significant highlights. The tool is useful for parsing and annotating PDFs based on context queries, for example in academic research, document analysis, and automated report generation.
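
The search, highlight, and screenshot step described above can be sketched roughly as follows. This is a minimal illustration and not the code in app.py: the function names, the `min_hits` threshold used to decide what counts as "significant", and the output file naming are all assumptions made for this example. The PyMuPDF import sits inside the function so that the page-selection helper can be run without PyMuPDF installed.

```python
from pathlib import Path


def pages_worth_capturing(hits_per_page, min_hits=2):
    """Return 0-based indices of pages whose highlight count reaches min_hits.

    min_hits is a hypothetical threshold; the project may define
    "significant highlights" differently.
    """
    return [i for i, n in enumerate(hits_per_page) if n >= min_hits]


def highlight_and_capture(pdf_path, contexts, out_dir="screenshots", min_hits=2):
    """Highlight each context string and save a PNG of heavily highlighted pages."""
    import fitz  # PyMuPDF, imported lazily (assumed to be installed)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    doc = fitz.open(pdf_path)
    hits_per_page = []
    for page in doc:
        hits = 0
        for text in contexts:
            # search_for returns one rectangle per matched area, so a match
            # that wraps across lines can contribute more than one hit
            for rect in page.search_for(text):
                page.add_highlight_annot(rect)
                hits += 1
        hits_per_page.append(hits)

    saved = []
    for i in pages_worth_capturing(hits_per_page, min_hits):
        target = out / f"page_{i + 1}.png"
        # Annotations are rendered into the pixmap by default
        doc[i].get_pixmap(dpi=150).save(str(target))
        saved.append(target)

    doc.close()
    return saved
```

With PyMuPDF installed, a call such as `highlight_and_capture("chapter_9_Java236.pdf", ["some passage returned by the extractor"])` would write one image per qualifying page into the `screenshots` folder. Long passages often fail to match verbatim because of line breaks in the PDF text layer, so searching for shorter fragments may work better in practice.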

Files (19 commits):
README.md
app.py
chapter_9_Java236.pdf
requirements.txt

harmindersinghnijjar/gpt-4-pdf
