Cannot import package due to sklearn dependency #10

Open
kellieotto opened this issue Feb 5, 2020 · 7 comments

Comments

@kellieotto

This issue is part of your JOSS review.

I was able to install the package but can't import it. I get the following error:

ImportError: cannot import name 'Imputer' from 'sklearn.preprocessing' (/opt/conda/envs/py3-primary/lib/python3.7/site-packages/sklearn/preprocessing/__init__.py)

This seems related to https://stackoverflow.com/questions/59439096/importerror-cannnot-import-name-imputer-from-sklearn-preprocessing
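
For context on the error above: `sklearn.preprocessing.Imputer` was deprecated in scikit-learn 0.20 and removed in 0.22, with `sklearn.impute.SimpleImputer` as its replacement. The following is a minimal sketch of a version-tolerant lookup, not the fix that was actually applied in datacleanbot. The helper name `resolve_imputer` is hypothetical, and the two classes do not take identical arguments, so calling code may still need adjusting.

```python
import importlib


def resolve_imputer():
    """Return an imputer class from whichever scikit-learn layout is installed.

    Hypothetical helper for illustration; returns None when scikit-learn
    is not installed or exposes neither class.
    """
    candidates = [
        ("sklearn.impute", "SimpleImputer"),   # scikit-learn 0.20 and later
        ("sklearn.preprocessing", "Imputer"),  # older releases; removed in 0.22
    ]
    for module_name, class_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this installation, try the next one
        cls = getattr(module, class_name, None)
        if cls is not None:
            return cls
    return None


imputer_cls = resolve_imputer()
print("Imputer class found:", imputer_cls)
```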

@Ji-Zhang
Owner

Hi @kellieotto, this bug has been fixed. Please let me know if you run into any other problems. Thanks.

@kellieotto
Author

@Ji-Zhang That error seems to be fixed, great! I'm still running into import issues.

ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

Seems to be coming from tensorflow.

@Ji-Zhang
Owner

Ji-Zhang commented Mar 8, 2020

Hi @kellieotto, may I ask in which environment you are using TensorFlow? There is not much I can do in the package to fix this.

If you are using TensorFlow with GPU, you need to install CUDA and cuDNN. Please follow instructions on https://www.tensorflow.org/install/

If you have already installed CUDA and cuDNN but still get this error, you probably forgot to export your libraries: on Linux, you may need to set LD_LIBRARY_PATH to include the CUDA libraries.

If the above does not fix the problem, please let me know so I can help further. Thanks.
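
As a quick way to check the LD_LIBRARY_PATH suggestion above, here is a small sketch that asks the dynamic loader for `libcuda.so.1` directly, outside of TensorFlow. The function name `cuda_driver_loadable` is hypothetical. Note that `libcuda.so.1` normally ships with the NVIDIA driver rather than with the CUDA toolkit, so on a machine without an NVIDIA GPU a CPU-only TensorFlow build may be the simpler route.

```python
import ctypes
import os


def cuda_driver_loadable(library_name="libcuda.so.1"):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is not found on the loader's search path
        return False
    return True


# Show the directories currently exported through LD_LIBRARY_PATH
search_path = os.environ.get("LD_LIBRARY_PATH", "")
print("LD_LIBRARY_PATH entries:", [p for p in search_path.split(":") if p])
print("libcuda.so.1 loadable:", cuda_driver_loadable())
```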

@kellieotto
Author

Hi @Ji-Zhang, sorry for the huge time delay between responses on this.

I have installed everything now. I am running the code example you have in the README and get this error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-15-b513b18ad249> in <module>
      1 import datacleanbot.dataclean as dc
----> 2 Xy = dc.autoclean(Xy, data.name, features)

~/miniconda3/lib/python3.7/site-packages/datacleanbot/dataclean.py in autoclean(Xy, dataset_name, features)
   1345     features = unify_name_consistency(features)
   1346     features_new, Xy_filled = handle_missing(features, Xy)
-> 1347     Xy_cleaned = handle_outlier(features_new, Xy_filled)
   1348     return Xy_cleaned

~/miniconda3/lib/python3.7/site-packages/datacleanbot/dataclean.py in handle_outlier(features, Xy)
   1286     X = Xy[:,:-1]
   1287     y = Xy[:,-1]
-> 1288     best = predict_best_anomaly_algorithm(X, y)
   1289     df = pd.DataFrame(Xy)
   1290     display(HTML('<h4>Visualize Outliers ... </h4>'))

~/miniconda3/lib/python3.7/site-packages/datacleanbot/dataclean.py in predict_best_anomaly_algorithm(X, y)
   1050 
   1051     # load meta learner
-> 1052     metalearner = joblib.load(urlopen("https://github.com/Ji-Zhang/datacleanbot/blob/master/process/AutomaticOutlierDetection/metalearner_rf.pkl?raw=true"))
   1053     best_anomaly_algorithm = metalearner.predict(mf)
   1054     if best_anomaly_algorithm[0] == 0:

~/miniconda3/lib/python3.7/site-packages/sklearn/externals/joblib/numpy_pickle.py in load(filename, mmap_mode)
    586         filename = getattr(fobj, 'name', '')
    587         with _read_fileobject(fobj, filename, mmap_mode) as fobj:
--> 588             obj = _unpickle(fobj)
    589     else:
    590         with open(filename, 'rb') as f:

~/miniconda3/lib/python3.7/site-packages/sklearn/externals/joblib/numpy_pickle.py in _unpickle(fobj, filename, mmap_mode)
    524     obj = None
    525     try:
--> 526         obj = unpickler.load()
    527         if unpickler.compat_mode:
    528             warnings.warn("The file '%s' has been generated with a "

~/miniconda3/lib/python3.7/pickle.py in load(self)
   1083                     raise EOFError
   1084                 assert isinstance(key, bytes_types)
-> 1085                 dispatch[key[0]](self)
   1086         except _Stop as stopinst:
   1087             return stopinst.value

~/miniconda3/lib/python3.7/pickle.py in load_global(self)
   1371         module = self.readline()[:-1].decode("utf-8")
   1372         name = self.readline()[:-1].decode("utf-8")
-> 1373         klass = self.find_class(module, name)
   1374         self.append(klass)
   1375     dispatch[GLOBAL[0]] = load_global

~/miniconda3/lib/python3.7/pickle.py in find_class(self, module, name)
   1421             elif module in _compat_pickle.IMPORT_MAPPING:
   1422                 module = _compat_pickle.IMPORT_MAPPING[module]
-> 1423         __import__(module, level=0)
   1424         if self.proto >= 4:
   1425             return _getattribute(sys.modules[module], name)[0]

ModuleNotFoundError: No module named 'sklearn.ensemble._forest'

I think it's related to package versions + pickling. I found this issue that seems related.
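
To illustrate why package versions and pickling interact this way: a pickle stores only the module path and class name of each object, and the loader imports that path when reading the file (the `find_class` frame in the traceback above). If the module was renamed between library versions, the import fails. The sketch below reproduces the effect with the standard library only. The module name `fractions_old` and the class `RenamingUnpickler` are made up for the demonstration, and remapping names like this is only an assumption-laden workaround: it helps when the class internals are unchanged, which is not guaranteed across scikit-learn releases. Matching the scikit-learn version used to train the model, or retraining, is usually safer.

```python
import io
import pickle
import pickletools
from fractions import Fraction

# Protocol 2 writes class references with the GLOBAL opcode,
# the same opcode handled by load_global in the traceback.
payload = pickle.dumps(Fraction(1, 3), protocol=2)

# The pickle records "module name" pairs, not the class definition itself.
recorded = [arg for op, arg, _ in pickletools.genops(payload) if op.name == "GLOBAL"]
print("Recorded references:", recorded)

# Simulate a pickle written against a module path that does not exist here.
broken = payload.replace(b"cfractions\n", b"cfractions_old\n")
try:
    pickle.loads(broken)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = exc
print("Plain load failed with:", load_error)


class RenamingUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that maps a stale module path to the current one."""

    RENAMES = {"fractions_old": "fractions"}

    def find_class(self, module, name):
        return super().find_class(self.RENAMES.get(module, module), name)


restored = RenamingUnpickler(io.BytesIO(broken)).load()
print("Restored with renaming:", restored)
```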

@Ji-Zhang
Owner

Ji-Zhang commented Jun 7, 2020

Hi @kellieotto, this bug should be fixed now. Could you please test it again? Thanks in advance.

@kellieotto
Author

Sorry @Ji-Zhang, it's still not working when I run dc.autoclean. I see Important Features, Statistical Information, Discover Data Types, etc., but when it gets to Outliers, the error posted above appears.

I ran pip install datacleanbot==0.8 and pip install joblib, but I still get the same error.
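
When a reinstall does not seem to change the behaviour, it can help to confirm which versions the running interpreter actually sees, since a notebook kernel may use a different environment than the shell where pip was run. A minimal sketch, assuming Python 3.8 or later for `importlib.metadata`; the helper name `installed_version` is hypothetical and the package list is only a guess at what is relevant here.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI (scikit-learn, not sklearn)
for name in ("datacleanbot", "scikit-learn", "joblib", "tensorflow"):
    print(name, installed_version(name))
```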

@Ji-Zhang
Owner

Hi @kellieotto, sorry for the inconvenience. I changed how the trained model is loaded. Could you please try again with pip install datacleanbot==0.9?
