Installation Problem #117

PanosHatz · 2024-03-18T16:21:22Z

Hi, first of all, is this project still active?

When I try to install on Windows 11 with Anaconda, the pip install . command fails with the following error:

ERROR: Could not find a version that satisfies the requirement tensorflow==2.13.1 (from invoicenet) 
(from versions: 1.13.1, 1.13.2, 1.14.0, 1.15.0, 1.15.2, 1.15.3, 1.15.4, 1.15.5, 2.0.0, 2.0.1, 2.0.2, 2.0.3, 2.0.4, 2.1.0, 
2.1.1, 2.1.2, 2.1.3, 2.1.4, 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 
2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 
2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 
2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 
2.11.0rc2, 2.11.0)
ERROR: No matching distribution found for tensorflow==2.13.1

Can anyone help me?

GREGOR2000 · 2024-03-18T19:13:05Z

Change two lines (258, 259) in setup.py so that tensorflow and numpy are no longer pinned to a specific version:

install_requires=[
    "tensorflow",
    "numpy",
    "six~=1.15.0",
    "datefinder==0.7.1",
    "opencv-python==4.5.1.48",
    "pdf2image==1.14.0",
    "pdfplumber==0.5.27",
    "PyPDF2==1.27.9",
    "pytesseract==0.3.7",
    "python-dateutil==2.8.1",
    "PyYAML==5.4.1",
    "simplejson==3.17.2",
    "tqdm==4.59.0",
    "google-api-python-client",
    "google-cloud-vision"
])
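
One possible reason pip offers nothing newer than 2.11.0 is the Python version of the environment. To my knowledge, TensorFlow 2.11 was the last release with Python 3.7 wheels, and 2.13 needs Python 3.8 to 3.11. Here is a minimal sketch to check this before installing. The function name and the version bounds are my own assumptions, not something taken from InvoiceNet:

```python
import sys

# Assumed supported range for tensorflow 2.13.x:
# inclusive lower bound, exclusive upper bound.
TF_213_MIN = (3, 8)
TF_213_MAX = (3, 12)


def python_ok_for_tf_213(version=None):
    """Return True if the (major, minor) version falls in the assumed range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return TF_213_MIN <= major_minor < TF_213_MAX


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    print("OK for tensorflow 2.13:", python_ok_for_tf_213())
```

If this prints False, either create a new conda environment with a newer Python or leave tensorflow unpinned as shown above.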

PanosHatz · 2024-03-18T20:19:48Z

Thank you very much, it worked!

eshsu · 2024-03-21T10:12:50Z

Have you gotten this repo working successfully on Windows?

GREGOR2000 · 2024-03-21T10:17:21Z

Yes. On Win 10 with miniconda.

PanosHatz · 2024-03-21T12:03:01Z

I ran into some other problems and more or less gave up. Any idea if it works on Windows 11?

GREGOR2000 · 2024-03-21T12:06:23Z

Please tell us what problems or errors you have.

PanosHatz · 2024-03-21T13:24:55Z

Thanks a lot for the quick response. Actually, I think I managed to make it work after a fresh reinstall.
Just two questions:
Can I train using a regular CPU? And will it work if my invoices are in Greek?

GREGOR2000 · 2024-03-21T14:31:21Z

You can easily train the network using only the CPU. TensorFlow detects the available hardware on its own and falls back to the CPU when no GPU is found.
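
If you want to see what TensorFlow has picked up, here is a small sketch. It relies on tf.config.list_physical_devices, which is part of TensorFlow 2.x. The helper names are mine and are not part of InvoiceNet:

```python
def describe_devices(gpu_names):
    """Turn a list of GPU device names into a short status message."""
    if gpu_names:
        return "GPU available: " + ", ".join(gpu_names)
    return "No GPU found, training will run on the CPU"


def detect_gpus():
    """Ask TensorFlow which GPUs it can see (needs tensorflow installed)."""
    import tensorflow as tf  # imported lazily so the helper above works without it
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    try:
        print(describe_devices(detect_gpus()))
    except ImportError:
        print("tensorflow is not installed in this environment")
```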

As for the language, Tesseract OCR uses English by default. The program must be told explicitly to use Greek or English+Greek. Tesseract's language code for modern Greek is ell (grc is Ancient Greek), and languages are combined with a plus sign, e.g. eng+ell.
https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html

In the file InvoiceNet\invoicenet\common\util.py, line 95, change

data = pytesseract.image_to_data(img, output_type=Output.DICT)

to

data = pytesseract.image_to_data(img, lang='ell', output_type=Output.DICT)

GREGOR2000 · 2024-03-21T15:59:03Z

You also need to check which languages your Tesseract installation supports (note the quotes, because the path contains a space):

"c:\Program Files\Tesseract-OCR\tesseract.exe" --list-langs
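
The same check can be done from Python. The sketch below assumes modern Greek (Tesseract code ell) plus English, and uses pytesseract.get_languages, which to my knowledge is available from pytesseract 0.3.7 on. The helper names are hypothetical and not part of InvoiceNet:

```python
def build_lang_arg(available, wanted=("eng", "ell")):
    """Return a 'eng+ell' style string, or raise if language data is missing."""
    missing = [code for code in wanted if code not in available]
    if missing:
        raise ValueError("Missing Tesseract language data: " + ", ".join(missing))
    return "+".join(wanted)


def installed_languages():
    """List the languages Tesseract reports (needs pytesseract and Tesseract)."""
    import pytesseract  # imported lazily so build_lang_arg works without it
    return pytesseract.get_languages(config="")


if __name__ == "__main__":
    try:
        print(build_lang_arg(installed_languages()))
    except Exception as exc:
        print("Could not determine languages:", exc)
```

The resulting string can be passed as the lang argument of pytesseract.image_to_data.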

PanosHatz · 2024-03-25T23:05:21Z

Hi, I tried training using only the CPU, and it took a huge amount of time. Can I somehow use Google Colab's free GPUs for this? Would I have to make any modifications to the code?

GREGOR2000 · 2024-03-26T08:09:53Z

On a normal computer, processing and training on 5,000 invoices takes a few hours. You only need to do it once. After that, the trained network runs quickly.

The only Google-specific thing I see in the code is the Google OCR setup in util.py, line 37:

# API keys for google ocr
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="google_api_keys.json"
