An error occurred: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead. #444

Open
aashams10 opened this issue Oct 6, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@aashams10

import PyPDF2

# Function to extract and display text from a PDF
def extract_and_display_pdf(pdf_file):
    try:
        # Open the PDF file
        pdf_file = open(pdf_file, 'rb')

        pdf_reader = PyPDF2.PdfFileReader(pdf_file)

        text = ""

        for page_num in range(pdf_reader.numPages):
            page = pdf_reader.getPage(page_num)
            text += page.extractText()

        pdf_file.close()

        print(text)

    except Exception as e:
        print(f"An error occurred: {str(e)}")

pdf_file_path = 'sample.pdf'

extract_and_display_pdf(pdf_file_path)

@aashams10 aashams10 added the bug Something isn't working label Oct 6, 2023
@xiaoymin

I have the same problem. Has this been resolved?

@paluigi

paluigi commented Oct 29, 2023

It seems that PyPDF2 (which camelot depends on) has introduced a breaking change in its API.

A short-term fix is to manually install the last PyPDF2 version that works with the old API (before 3.0.0, as per the error message) after you have installed camelot:

python -m pip install "pypdf2<3"

Also, PyPDF2 is being renamed to pypdf, which should be taken into account for the future. From their PyPI page:

NOTE: The PyPDF2 project is going back to its roots. PyPDF2==3.0.X will be the last version of PyPDF2. Development will continue with pypdf==3.1.0.
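For anyone whose own script calls the removed API directly (as in the snippet at the top of this issue), here is a rough sketch of the same text extraction with the post-3.0 names: PdfReader, reader.pages, and page.extract_text(). It assumes the pypdf package is installed. The helper names pages_to_text and extract_pdf_text are made up for this example. Note that this does not fix camelot itself, which still calls the old API internally.

```python
def pages_to_text(pages):
    """Concatenate the text of each page object.

    Each page only needs an extract_text() method, which is what
    page objects in pypdf (and PyPDF2 >= 3) provide.
    """
    # Guard against pages that yield no text (e.g. scanned images).
    return "".join(page.extract_text() or "" for page in pages)


def extract_pdf_text(path):
    """Read a PDF with the post-3.0 API and return all of its text."""
    # Imported lazily so pages_to_text() is usable without pypdf.
    from pypdf import PdfReader  # assumption: pypdf is installed

    # The with-block closes the file even if parsing fails.
    with open(path, "rb") as fh:
        reader = PdfReader(fh)
        # Pages are read lazily, so consume them while the file is open.
        return pages_to_text(reader.pages)


if __name__ == "__main__":
    import sys

    # Usage: python extract_text.py sample.pdf
    if len(sys.argv) > 1:
        print(extract_pdf_text(sys.argv[1]))
```

The old names map roughly as PdfFileReader to PdfReader, numPages/getPage(i) to the pages sequence, and extractText() to extract_text().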

@juliatong

I ran python -m pip install "pypdf2<3", yet the error remained.

Successfully installed pypdf2-2.12.1
camelot-py 0.9.0

DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.

@paluigi

paluigi commented Nov 12, 2023

@juliatong, can you check with python -m pip freeze which version of pypdf2 is installed? From your error message it seems it's still version 3.
Maybe the previous version was installed in a different virtual environment? Also, if you are using Jupyter notebooks maybe you need to quit and restart the kernel for the updated library to be loaded.
Anyway, this should not be a long term solution. Not sure if any maintainer is working on this?
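To tell a stale notebook kernel apart from a wrong environment, something like the sketch below can be run in the same interpreter or notebook cell that raises the error. It compares the version of the module already loaded in memory with the version pip has installed on disk, and prints which interpreter is in use. The helper names are made up for this example, and it assumes the module exposes a __version__ attribute (PyPDF2 does).

```python
import sys
from importlib import metadata


def loaded_and_installed(module_name, dist_name):
    """Return (version loaded in this process, version installed on disk).

    Either value is None if the module has not been imported yet or the
    distribution is not installed in this environment.
    """
    mod = sys.modules.get(module_name)
    loaded = getattr(mod, "__version__", None) if mod is not None else None
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None
    return loaded, installed


def major_version(version):
    """Return the leading integer of a version string such as '2.12.1'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    # A path outside your virtual environment means pip and the running
    # interpreter are looking at different site-packages directories.
    print("interpreter:", sys.executable)
    loaded, installed = loaded_and_installed("PyPDF2", "PyPDF2")
    print("loaded in memory:", loaded)
    print("installed on disk:", installed)
    if loaded and installed and loaded != installed:
        print("Versions differ: restart the kernel or interpreter.")
```

If the two versions differ, the process imported the library before it was downgraded, and a restart should pick up the new one.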

@juliatong

juliatong commented Nov 17, 2023

> @juliatong, can you check with python -m pip freeze what version of pypdf2 is installed? From your error message it seems it's still version 3. Maybe the previous version was installed in a different virtual environment? Also, if you are using Jupyter notebooks maybe you need to quit and restart the kernel for the updated library to be loaded. Anyway, this should not be a long term solution. Not sure if any maintainer is working on this?

Hi @paluigi,
Thanks a lot for the reply.
I solved the problem!

First of all, you are right. My Jupyter notebook wasn't picking up the change, despite the message Successfully installed pypdf2-2.12.1. As you pointed out from the error message, the kernel was still using version 3. I restarted the kernel, and the error message is gone. Big shout-out for your attention to detail.

However, after that, a new error came up:

raise RuntimeError('Ghostscript is not installed')
RuntimeError: Please make sure that Ghostscript is installed

This happened even though pip show ghostscript reported the package as installed.

The solution is to run the commands below:
sudo apt-get update
sudo apt-get install ghostscript

Here is the thing. The ghostscript package I installed through pip install is a Python interface to the Ghostscript C API, and it doesn't include Ghostscript itself. The Python package interacts with the Ghostscript library but doesn't install the Ghostscript command-line executable (gs). That's why the commands above resolved my problem: they install the executable itself.
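A quick way to confirm the executable is actually on PATH, independent of what pip show reports, is a check like the sketch below. The find_executable helper is made up for this example; gs is the usual name on Linux and macOS, and gswin64c / gswin32c are the Windows console builds. This is not necessarily how camelot performs its own detection, so treat it only as a sanity check.

```python
import shutil


def find_executable(candidates=("gs", "gswin64c", "gswin32c")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    path = find_executable()
    if path:
        print("Ghostscript executable found at:", path)
    else:
        print("No Ghostscript executable on PATH; install it with your "
              "system package manager (e.g. apt-get install ghostscript).")
```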
