Unable to run scikit-learn housing example with documented dependencies #342

Open
koverholt opened this issue Sep 9, 2020 · 1 comment


Overview

The scikit-learn example at https://algorithmia.com/developers/model-deployment/scikit does not work with the specified dependencies (numpy and scikit-learn>=0.14,<0.18) in either Python 3.x environment (legacy or IPA). Calling the algorithm throws an error.

Steps to reproduce

  1. Upload the data files from https://github.com/algorithmiaio/sample-apps/tree/master/algo-dev-demo/scikit-learn-demo/data to a hosted data collection.

  2. Create a Python 3.x (legacy) environment.

  3. Specify the dependencies as noted in the documentation:

numpy
scikit-learn>=0.14,<0.18

  4. Paste the example code from https://algorithmia.com/developers/model-deployment/scikit with the apply() function and edit the data path.

  5. Save and build the algorithm.

  6. Pass the path to the test data as an input in the test console.

  7. The following error occurs:

> "data://koverholt/scikit/boston_test_data.csv"
Error: Algorithm process exited
Traceback (most recent call last):
  File "/opt/algorithm/bin/pipe.py", line 14, in <module>
    algorithm = __import__('src.'+config['algoname'], fromlist=["apply"])
  File "/opt/algorithm/src/scikit.py", line 7, in <module>
    from sklearn.datasets import load_boston
  File "/opt/algorithm/dependencies/sklearn/datasets/__init__.py", line 24, in <module>
    from .twenty_newsgroups import fetch_20newsgroups
  File "/opt/algorithm/dependencies/sklearn/datasets/twenty_newsgroups.py", line 54, in <module>
    from ..feature_extraction.text import CountVectorizer
  File "/opt/algorithm/dependencies/sklearn/feature_extraction/__init__.py", line 10, in <module>
    from . import text
  File "/opt/algorithm/dependencies/sklearn/feature_extraction/text.py", line 29, in <module>
    from ..preprocessing import normalize
  File "/opt/algorithm/dependencies/sklearn/preprocessing/__init__.py", line 31, in <module>
    from .imputation import Imputer
  File "/opt/algorithm/dependencies/sklearn/preprocessing/imputation.py", line 9, in <module>
    from scipy import stats
  File "/opt/anaconda3/lib/python3.5/site-packages/scipy/stats/__init__.py", line 340, in <module>
    from .morestats import *
  File "/opt/anaconda3/lib/python3.5/site-packages/scipy/stats/morestats.py", line 16, in <module>
    from numpy.testing.decorators import setastest
ImportError: No module named 'numpy.testing.decorators'
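
The last frames of the traceback show the preinstalled scipy (under /opt/anaconda3) failing to import numpy.testing.decorators, which suggests that the numpy resolved from the unpinned requirement does not match what that scipy expects. A minimal sketch of a check that could be run inside the environment to confirm which modules import cleanly follows. The helper names can_import and diagnose are hypothetical, and the code avoids f-strings because the traceback shows Python 3.5.

```python
import importlib


def can_import(module_name):
    """Return True if module_name imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def diagnose(modules=("numpy", "scipy", "numpy.testing.decorators")):
    """Map each module name to whether it can be imported in this environment.

    The default module list is taken from the traceback in this issue.
    """
    return {name: can_import(name) for name in modules}


if __name__ == "__main__":
    for name, ok in sorted(diagnose().items()):
        print("{}: {}".format(name, "ok" if ok else "MISSING"))
```

If numpy.testing.decorators is reported as missing while numpy itself imports, that would be consistent with the error above.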

Suggested fix

  1. Update the steps to name a specific algorithm environment (Python 3.x CPU) to select when creating the algorithm.

  2. Update the dependencies in https://github.com/algorithmiaio/sample-apps/blob/master/algo-dev-demo/scikit-learn-demo/demo/requirements.txt and the documentation page at https://algorithmia.com/developers/model-deployment/scikit.

The following pinned dependencies worked for me using the Python 3.x legacy environment:

numpy==1.11.3
scikit-learn==0.17.1
scipy==1.2.1

although there might be other version specs/ranges that work as well.
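
To check whether a given environment actually ended up with these pins, something like the following sketch could be used. It assumes exact `==` pins only, as in the list above. The names parse_pins, installed_version, and mismatches are hypothetical helpers, not part of the sample app. The version lookup falls back to pkg_resources on Pythons that lack importlib.metadata.

```python
# Pins reported as working in the Python 3.x legacy environment (from this issue).
PINS = """
numpy==1.11.3
scikit-learn==0.17.1
scipy==1.2.1
"""


def parse_pins(text):
    """Parse 'name==version' lines into a dict; other lines are ignored."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # older Python without importlib.metadata
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    result = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            result[name] = (wanted, found)
    return result


if __name__ == "__main__":
    for name, (wanted, found) in sorted(mismatches(parse_pins(PINS)).items()):
        print("{}: wanted {}, found {}".format(name, wanted, found))
```

An empty result from mismatches means every pinned package is installed at the listed version. It does not by itself show that the algorithm builds or runs.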

Note that these dependencies do not work with the Python 3.7 IPA environment, where the build fails. Hence the recommendation to specify the Python 3.x legacy environment. We should also look into that build failure.

@koverholt
Author

This code example is fixed in algorithmiaio/sample-apps#57 and algorithmiaio/sample-apps#58 with updated code and dependencies.

As a result, the dev-center documentation for the scikit-learn example is now outdated. We can either update the docs, link to the README in https://github.com/algorithmiaio/sample-apps/tree/master/algo-dev-demo/scikit-learn-demo, or handle it some other way.
