textract.exceptions.ShellError: The command antiword is not installed on your system. Please make sure the appropriate dependencies are installed before using textract
#444 · Open · faridelya opened this issue on Oct 27, 2022 · 0 comments
**Cannot execute antiword in production under Gunicorn, although it works in development on the same computer**
I installed all the dependencies on Ubuntu before installing textract (link here). apt reports every package as already installed:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'python-dev-is-python2' instead of 'python-dev'
libjpeg-dev is already the newest version (8c-2ubuntu8).
antiword is already the newest version (0.37-16).
flac is already the newest version (1.3.3-1build1).
lame is already the newest version (3.100-3).
libmad0 is already the newest version (0.15.1b-10ubuntu1).
libsox-fmt-mp3 is already the newest version (14.4.2+git20190427-2).
pstotext is already the newest version (1.9-6build1).
python-dev-is-python2 is already the newest version (2.7.17-4).
sox is already the newest version (14.4.2+git20190427-2).
swig is already the newest version (4.0.1-5build1).
tesseract-ocr is already the newest version (4.1.1-2build2).
unrtf is already the newest version (0.21.10-clean-1).
libxml2-dev is already the newest version (2.9.10+dfsg-5ubuntu0.20.04.4).
libxslt1-dev is already the newest version (1.1.34-4ubuntu0.20.04.1).
poppler-utils is already the newest version (0.86.1-0ubuntu1.1).
ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).
0 upgraded, 0 newly installed, 0 to remove and 44 not upgraded.
Everything below was done on the same server.
When I run gunicorn -b 0.0.0.0:8000 wsgi:app --workers 3 --timeout 600 by hand, the application converts all docx and doc files to txt.
**Problem:**
After I deploy changes with sudo systemctl restart gunicorn.service and sudo systemctl restart nginx, the production service cannot convert docx and doc files to txt. When I check the gunicorn status, it shows this error:
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: byte_string = self.extract(filename, **kwargs)
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: File "/home/ubuntu/web-server/env/lib/python3.7/site-packages/textract/parsers/doc_parser.py", line 9, in extract
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: stdout, stderr = self.run(['antiword', filename])
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: File "/home/ubuntu/web-server/env/lib/python3.7/site-packages/textract/parsers/utils.py", line 96, in run
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: ' '.join(args), 127, '', '',
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: textract.exceptions.ShellError: The command antiword /home/ubuntu/web-server/data/test_cvs/Yassin.docx failed because the executable
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: antiword is not installed on your system. Please make
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: sure the appropriate dependencies are installed before using
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: textract:
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: http://textract.readthedocs.org/en/latest/installation.html
However, antiword is installed on Ubuntu. I checked by running which antiword, and the output is:
ubuntu@:~/web-server/data/test_cvs$ which antiword
/usr/bin/antiword
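The contrast above (the shell finds antiword, the service does not) can be reproduced in isolation. The sketch below is only an illustration of one plausible mechanism, not a confirmed diagnosis of this issue: a child process is located through the PATH of the process that launches it, so the same executable can be found in an interactive shell and missing in a service started with a different PATH. The helper name `can_launch` is hypothetical, and `true` stands in for antiword so the example has no side effects.

```python
import subprocess


def can_launch(executable, search_path):
    """Try to start `executable` with PATH set to `search_path` only.

    Returns False when the lookup fails with FileNotFoundError, which is
    what Python raises when no directory on PATH contains the executable.
    """
    try:
        subprocess.run(
            [executable],
            env={"PATH": search_path},      # replace the inherited PATH entirely
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,                    # only the launch matters, not the exit code
        )
    except FileNotFoundError:
        return False
    return True


if __name__ == "__main__":
    # Same executable, two different PATH values (both are assumptions
    # chosen for illustration).
    print(can_launch("true", "/usr/bin:/bin"))
    print(can_launch("true", "/nonexistent-dir"))
```

If the service's PATH is the cause here, the same executable name would launch with a PATH that contains /usr/bin and fail without it, even though the package is installed.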
I also uninstalled and reinstalled antiword, but the problem persists. I am stuck: it does not work in production, but when I run Gunicorn on port 8000 it works and I get output. Why can't Gunicorn execute antiword? Any help would be appreciated. Thanks.
python version = 3.7
OS = Ubuntu 20.04
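One way to narrow this down is to record, from inside the running application, which PATH the worker process actually sees and whether antiword resolves on it. This is a minimal diagnostic sketch under stated assumptions: `describe_lookup` is a hypothetical helper, not part of textract or Gunicorn, and where to call it (for example from a temporary debug route or at startup) is left to the application.

```python
import os
import shutil


def describe_lookup(executable, environ=None):
    """Report the PATH in effect and where `executable` resolves on it.

    `environ` defaults to the current process environment; passing a dict
    makes it possible to compare different PATH values side by side.
    """
    env = os.environ if environ is None else environ
    search_path = env.get("PATH", os.defpath)
    return {
        "executable": executable,
        "PATH": search_path,
        # shutil.which returns None when nothing on `search_path` matches.
        "resolved": shutil.which(executable, path=search_path),
    }


if __name__ == "__main__":
    # Run once from the interactive shell and once inside the service,
    # then compare the two reports.
    print(describe_lookup("antiword"))
```

If the report produced inside the service shows a PATH without /usr/bin and `resolved` is `None`, the difference between the manual run on port 8000 and the systemd-managed service would be explained by the environment rather than by the antiword installation.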
Service status reported for the Gunicorn unit (the log lines that follow it in the same output are the traceback quoted above):
● app.service - Gunicorn instance to serve myproject
Loaded: loaded (/etc/systemd/system/app.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2022-10-27 06:11:06 UTC; 5min ago
Main PID: 389929 (gunicorn)
Tasks: 13 (limit: 38087)
Memory: 4.9G
CGroup: /system.slice/app.service
├─389929 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
├─389992 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
├─389993 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
└─389994 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600