TypeError: Feature names #323

Open
jpinus opened this issue Feb 9, 2023 · 2 comments

jpinus commented Feb 9, 2023

I installed concoct via conda and followed the basic usage guide (https://concoct.readthedocs.io/en/latest/usage.html).
Everything works until I run concoct:

command: concoct --composition_file contigs_10K.fa --coverage_file coverage_table.tsv -b concoct_output/ --thread 12

output:
Up and running. Check [...]/concoct_output/log.txt for progress
Traceback (most recent call last):
  File "[...]/.conda/envs/concoct_env/bin/concoct", line 90, in <module>
    results = main(args)
  File "[...]/.conda/envs/concoct_env/bin/concoct", line 37, in main
    transform_filter, pca = perform_pca(
  File "[...]/.conda/envs/concoct_env/lib/python3.10/site-packages/concoct/transform.py", line 5, in perform_pca
    pca_object = PCA(n_components=nc, random_state=seed).fit(d)
  File "[...]/.conda/envs/concoct_env/lib/python3.10/site-packages/sklearn/decomposition/_pca.py", line 435, in fit
    self._fit(X)
  File "[...]/.conda/envs/concoct_env/lib/python3.10/site-packages/sklearn/decomposition/_pca.py", line 485, in _fit
    X = self._validate_data(
  File "[...]/.conda/envs/concoct_env/lib/python3.10/site-packages/sklearn/base.py", line 529, in _validate_data
    self._check_feature_names(X, reset=reset)
  File "[...]/.conda/envs/concoct_env/lib/python3.10/site-packages/sklearn/base.py", line 396, in _check_feature_names
    feature_names_in = _get_feature_names(X)
  File "[...]/.conda/envs/concoct_env/lib/python3.10/site-packages/sklearn/utils/validation.py", line 1903, in _get_feature_names
    raise TypeError(
TypeError: Feature names are only supported if all input features have string names, but your input has ['int', 'str'] as feature name / column name types. If you want feature names to be stored and validated, you must convert them all to strings, by using X.columns = X.columns.astype(str) for example. Otherwise you can remove feature / column names from your input data, or convert them all to a non-string data type.
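
The condition described in the error message can be illustrated without scikit-learn. The following is a minimal sketch, not scikit-learn's actual code: `name_types` and `stringify` are hypothetical helpers, and the column labels are made-up stand-ins for a table whose columns mix integer and string names. With a pandas DataFrame, the counterpart of `stringify` would be the `X.columns = X.columns.astype(str)` call quoted in the message.

```python
def name_types(columns):
    """Return the sorted type names of the column labels."""
    return sorted({type(c).__name__ for c in columns})


def stringify(columns):
    """Convert every column label to a string."""
    return [str(c) for c in columns]


# Hypothetical column labels: integer labels mixed with a string label.
columns = [0, 1, 2, "cov_mean_sample_1"]

print(name_types(columns))             # ['int', 'str']
print(name_types(stringify(columns)))  # ['str']
```

Once every label is a string, only one name type remains, which is the state the error message asks for.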

@jakob-wirbel

I had the same issue with a fresh install of concoct via conda. I think the problem comes from the sklearn version, which is too recent in a fresh install.

For the conda install that did not work, I got the following output:

python
Python 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:23:14) [GCC 10.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> sklearn.__version__
'1.2.1'
>>> quit()

I instead used a singularity container that had a working version of concoct:

singularity shell docker://quay.io/biocontainers/concoct:1.1.0--py27h88e4a8a_0
python
Python 2.7.15 | packaged by conda-forge | (default, Jul  2 2019, 00:39:44) 
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> sklearn.__version__
'0.20.3'
>>> quit() 

I suggest adding an upper bound for the sklearn package (or pinning the exact sklearn version that gets installed).
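
A version bound like this could also be checked at runtime before running concoct. This is a rough sketch under assumptions: the `1.2` cutoff is inferred only from the versions reported in this thread (1.2.1 fails, older versions work), and `parse_release`, `is_affected`, and `installed_version` are hypothetical helper names, not part of concoct or scikit-learn.

```python
import re
from importlib import metadata

# Assumed cutoff, inferred from this thread: 1.2.1 fails, older versions work.
FIRST_AFFECTED = (1, 2)


def parse_release(version):
    """Extract (major, minor) from a version string such as '1.2.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(version):
    """True if the version is at or above the assumed cutoff."""
    return parse_release(version) >= FIRST_AFFECTED


def installed_version(package="scikit-learn"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("scikit-learn is not installed")
    elif is_affected(found):
        print(f"scikit-learn {found} may raise the TypeError above")
    else:
        print(f"scikit-learn {found} is below the assumed cutoff")
```

Run inside the conda environment, this reports whether the installed scikit-learn falls at or above the assumed cutoff.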

@jakob-wirbel

The problem seems to be this line, shown here in an older version of sklearn:
https://github.com/scikit-learn/scikit-learn/blob/ffc0f66676b4835eb1bdd3f3ecab025e9c1be9fe/sklearn/utils/validation.py#L1859

It works with these packages installed:

name: concoct
channels:
  - conda-forge
  - bioconda
dependencies:
  - scikit-learn=1.1.0
  - concoct=1.1.0
