lexnlp & blackstone dependency incompatibility #3

Open
pj-simpson opened this issue Jul 28, 2022 · 4 comments
@pj-simpson

Description

Whilst I can build and run the ready-made Docker image, as per the project's README, I cannot build the Dockerfile locally without hitting incompatible-requirements errors.

To Reproduce

Attempt to build the project's Dockerfile. A sample of the console output:

...
#7 94.43 ERROR: lexnlp 2.0.0 has requirement joblib==0.14.0, but you'll have joblib 1.1.0 which is incompatible.
#7 94.43 ERROR: lexnlp 2.0.0 has requirement nltk==3.5, but you'll have nltk 3.6.2 which is incompatible.
#7 94.43 ERROR: lexnlp 2.0.0 has requirement numpy==1.19.1, but you'll have numpy 1.19.5 which is incompatible.
#7 94.43 ERROR: lexnlp 2.0.0 has requirement pandas==1.1.3, but you'll have pandas 1.1.5 which is incompatible.
#7 94.43 ERROR: lexnlp 2.0.0 has requirement regex==2020.11.13, but you'll have regex 2022.3.15 which is incompatible.
#7 94.43 ERROR: lexnlp 2.0.0 has requirement requests==2.24.0, but you'll have requests 2.27.1 which is incompatible.
...

I've chopped and changed versions and reordered what gets installed as much as I can, but I just can't get them to work together. My suspicion is that because libraries like pandas and NumPy use C extensions, there is an added layer of complexity here. I've even tried running a Linux container and installing Python and all the required C dependencies from scratch, but to no avail: I always come up against this dependency hell!
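To see at a glance which pins are fighting each other, the conflict lines from the build log can be parsed into structured records. The following is a minimal sketch, not part of the project: the function name `parse_conflicts` is made up for illustration, and the regular expression assumes pip's message wording stays exactly as quoted in the log above.

```python
import re

# Assumption: this pattern mirrors the wording of the pip messages quoted in
# the build log above; other pip versions may phrase conflicts differently.
CONFLICT_RE = re.compile(
    r"ERROR: (?P<parent>\S+) (?P<parent_version>\S+) has requirement "
    r"(?P<package>[A-Za-z0-9_.\-]+)(?P<spec>[^,]+), "
    r"but you'll have (?P=package) (?P<installed>\S+) which is incompatible\."
)


def parse_conflicts(log_text):
    """Return one dict per conflict message found in the build output."""
    return [match.groupdict() for match in CONFLICT_RE.finditer(log_text)]


# Two lines copied from the log above.
sample = """\
#7 94.43 ERROR: lexnlp 2.0.0 has requirement joblib==0.14.0, but you'll have joblib 1.1.0 which is incompatible.
#7 94.43 ERROR: lexnlp 2.0.0 has requirement nltk==3.5, but you'll have nltk 3.6.2 which is incompatible.
"""

for conflict in parse_conflicts(sample):
    print(
        f"{conflict['package']}: {conflict['parent']} wants "
        f"{conflict['spec']}, installed {conflict['installed']}"
    )
```

Piping the full `docker build` output through this gives a short list of packages to look at, rather than scrolling through the whole log.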

Current Workaround

Given that fewer endpoints depend on Blackstone, I've just stripped Blackstone out of my branch. This loses me the 'abbreviation', 'legislation' and 'named-entity' endpoints, but means I can reliably build and run the container.

My requirements.txt after removing Blackstone:

asgiref==3.4.1
backports.zoneinfo==0.2.1
certifi==2022.6.15
chardet==3.0.4
click==8.0.4
dataclasses==0.8
datefinder-lexpredict==0.6.2.1
dateparser==0.7.2
docopt==0.6.2
ecdsa==0.18.0
fastapi==0.68.1
fastapi-cloudauth==0.4.0
gensim==3.8.3
h11==0.13.0
idna==2.10
importlib-metadata==4.8.3
importlib-resources==5.4.0
jellyfish==0.6.1
joblib==1.1.0
lexnlp==2.0.0
nltk==3.6.2
num2words==0.5.10
numpy==1.19.1
pandas==1.1.3
pyasn1==0.4.8
pycountry==20.7.3
pydantic==1.9.1
python-dateutil==2.8.2
python-jose==3.3.0
pytz==2022.1
pytz-deprecation-shim==0.1.0.post0
regex==2022.3.15
reporters-db==2.0.3
requests==2.24.0
rsa==4.9
scikit-learn==0.23.1
scipy==1.5.1
six==1.16.0
smart-open==6.0.0
starlette==0.14.2
threadpoolctl==3.1.0
tqdm==4.64.0
typing-extensions==4.1.1
tzdata==2022.1
tzlocal==4.2
Unidecode==1.1.1
urllib3==1.25.11
us==2.0.2
uvicorn==0.15.0
zipp==3.6.0
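A pinned file like this can be checked against the versions lexnlp asks for before attempting a build. This is a rough sketch under stated assumptions: `LEXNLP_PINS` contains only the six pins visible in the log excerpt earlier in this issue, not lexnlp 2.0.0's full requirement list, and `parse_pins` and `find_mismatches` are hypothetical helper names.

```python
# Assumption: these are only the lexnlp 2.0.0 pins that appear in the build
# log quoted in this issue; the package's real requirement list is longer.
LEXNLP_PINS = {
    "joblib": "0.14.0",
    "nltk": "3.5",
    "numpy": "1.19.1",
    "pandas": "1.1.3",
    "regex": "2020.11.13",
    "requests": "2.24.0",
}


def parse_pins(requirements_text):
    """Map lower-cased package names to their '==' pinned versions."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue  # skip blanks and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, required):
    """Return {package: (pinned, required)} where the two versions differ."""
    return {
        name: (pins[name], wanted)
        for name, wanted in required.items()
        if name in pins and pins[name] != wanted
    }


# The six relevant lines from the requirements file above.
excerpt = """\
joblib==1.1.0
nltk==3.6.2
numpy==1.19.1
pandas==1.1.3
regex==2022.3.15
requests==2.24.0
"""

for name, (pinned, wanted) in find_mismatches(parse_pins(excerpt), LEXNLP_PINS).items():
    print(f"{name}: file pins {pinned}, lexnlp 2.0.0 asks for {wanted}")
```

Run against the excerpt, this flags joblib, nltk and regex as still differing from what the log says lexnlp 2.0.0 requires, while numpy, pandas and requests match.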

Dockerfile

FROM python:3.6.8

ENV SPACY_MODEL=en_core_web_sm

COPY requirements-all-but-black.txt .

RUN pip install -r requirements-all-but-black.txt

EXPOSE 80

COPY ./app /app

CMD ["python", "/app/main.py"]

Possible resolutions

  1. If someone knows the trick to getting Blackstone, lexnlp and all of their dependencies installed and running within the same Docker container, please let me know!
  2. Strip Blackstone out of the project completely. It serves only a minority of endpoints, and there are some grumblings on that project's GitHub page that it's poorly maintained (last updated in January), compared with lexnlp (last updated 17 days ago).
  3. Introduce Docker Compose and run a Blackstone-based service in its own container. The main FastAPI app could consume endpoints from there when it requires the functionality provided by Blackstone.
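For option 3, the main app would only need a thin HTTP client for the Blackstone container. The sketch below shows one way that could look using just the standard library. Everything specific in it is an assumption for illustration: the service address `http://blackstone:8000`, the `{"text": ...}` payload shape, and the function names are hypothetical; only the endpoint name `legislation` comes from this issue.

```python
import json
import urllib.request

# Hypothetical address: "blackstone" would be the Compose service name and
# 8000 the port it listens on. Neither exists in the project today.
BLACKSTONE_URL = "http://blackstone:8000"


def build_blackstone_request(endpoint, text, base_url=BLACKSTONE_URL):
    """Build a JSON POST request for one endpoint of the Blackstone service."""
    payload = json.dumps({"text": text}).encode("utf-8")  # assumed payload shape
    url = f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}"
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_blackstone(endpoint, text, base_url=BLACKSTONE_URL, timeout=10.0):
    """Send the request and return the service's decoded JSON reply."""
    request = build_blackstone_request(endpoint, text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A route in the main app could then call `call_blackstone("legislation", text)` instead of importing Blackstone directly, which keeps the two sets of pinned dependencies in separate images.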
@ryanmcdonough
Contributor

ryanmcdonough commented Dec 23, 2022

Hey @pj-simpson - glad to see you here, although I didn't see this issue when it was originally raised many months ago! I'll certainly take a look as to why a fresh creation isn't recreating the rebuilt container, as I'd expect it to run into the exact same issue.

@ryanmcdonough ryanmcdonough self-assigned this Dec 23, 2022
@ryanmcdonough ryanmcdonough added the bug Something isn't working label Dec 23, 2022
@ryanmcdonough
Contributor

@pj-simpson I think with this issue, I'll resolve it in the same way you have - I'll remove the blackstone references and build a separate module to perform legislation extraction from text.

@pj-simpson
Author

Sounds good @ryanmcdonough! All the best and have a Merry Xmas! 🎄

@ryanmcdonough
Contributor

@pj-simpson and you too! Also, removing Blackstone and adding the legislation replacement gets the image down from 6 GB to 1.5 GB, with 150 MB of RAM usage on load, so not bad at all!
