
AttributeError when calling run_scenicplus #76

Closed
alexwang1001 opened this issue Dec 15, 2022 · 9 comments
Labels
question Further information is requested

Comments

@alexwang1001

alexwang1001 commented Dec 15, 2022

Hi! I was running the scenicplus PBMC 3K tutorial using the Singularity container. When I ran the following code at the indicated step:

from scenicplus.wrappers.run_scenicplus import run_scenicplus
try:
    run_scenicplus(
        scplus_obj = scplus_obj,
        variable = ['GEX_celltype'],
        species = 'hsapiens',
        assembly = 'hg38',
        tf_file = 'pbmc_tutorial/data/utoronto_human_tfs_v_1.01.txt',
        save_path = os.path.join(work_dir, 'scenicplus'),
        biomart_host = biomart_host,
        upstream = [1000, 150000],
        downstream = [1000, 150000],
        calculate_TF_eGRN_correlation = True,
        calculate_DEGs_DARs = True,
        export_to_loom_file = True,
        export_to_UCSC_file = True,
        path_bedToBigBed = 'pbmc_tutorial',
        n_cpu = 12,
        _temp_dir = os.path.join(tmp_dir, 'ray_spill'))
except Exception as e:
    #in case of failure, still save the object
    dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
    raise(e)

I got this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[25], line 23
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
---> 23     raise(e)

Cell In[25], line 3
      1 from scenicplus.wrappers.run_scenicplus import run_scenicplus
      2 try:
----> 3     run_scenicplus(
      4         scplus_obj = scplus_obj,
      5         variable = ['GEX_celltype'],
      6         species = 'hsapiens',
      7         assembly = 'hg38',
      8         tf_file = 'pbmc_tutorial/data/utoronto_human_tfs_v_1.01.txt',
      9         save_path = os.path.join(work_dir, 'scenicplus'),
     10         biomart_host = biomart_host,
     11         upstream = [1000, 150000],
     12         downstream = [1000, 150000],
     13         calculate_TF_eGRN_correlation = True,
     14         calculate_DEGs_DARs = True,
     15         export_to_loom_file = True,
     16         export_to_UCSC_file = True,
     17         path_bedToBigBed = 'pbmc_tutorial',
     18         n_cpu = 12,
     19         _temp_dir = os.path.join(tmp_dir, 'ray_spill'))
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)

File /opt/venv/lib/python3.8/site-packages/scenicplus/wrappers/run_scenicplus.py:309, in run_scenicplus(scplus_obj, variable, species, assembly, tf_file, save_path, biomart_host, upstream, downstream, region_ranking, gene_ranking, simplified_eGRN, calculate_TF_eGRN_correlation, calculate_DEGs_DARs, export_to_loom_file, export_to_UCSC_file, tree_structure, path_bedToBigBed, n_cpu, _temp_dir, **kwargs)
    307 if export_to_loom_file is True:
    308     log.info('Exporting to loom file')
--> 309     export_to_loom(scplus_obj,
    310            signature_key = 'Gene_based',
    311            tree_structure = tree_structure,
    312            title =  'Gene based eGRN',
    313            nomenclature = assembly,
    314            out_fname=os.path.join(save_path,'SCENIC+_gene_based.loom'))
    315     export_to_loom(scplus_obj,
    316            signature_key = 'Region_based',
    317            tree_structure = tree_structure,
    318            title =  'Region based eGRN',
    319            nomenclature = assembly,
    320            out_fname=os.path.join(save_path,'SCENIC+_region_based.loom'))
    322 if export_to_UCSC_file is True:

File /opt/venv/lib/python3.8/site-packages/scenicplus/loom.py:174, in export_to_loom(scplus_obj, signature_key, out_fname, eRegulon_metadata_key, auc_key, auc_thr_key, keep_direct_and_extended_if_not_direct, selected_features, selected_cells, cluster_annotation, tree_structure, title, nomenclature)
    170     cv = CountVectorizer(
    171         lowercase=False, token_pattern=r'(?u)\b\w\w+\b:\b\w\w+\b-\b\w\w+\b')
    172 regulon_mat = cv.fit_transform(regulons.values())
    173 regulon_mat = pd.DataFrame(regulon_mat.todense(
--> 174 ), columns=cv.get_feature_names(), index=regulons.keys())
    175 regulon_mat = regulon_mat.reindex(columns=feature_names, fill_value=0).T
    176 if keep_direct_and_extended_if_not_direct is True:

AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

Do you know why and could you help me fix it?
Thank you!
Li

@alexwang1001 alexwang1001 added the question Further information is requested label Dec 15, 2022
@satrapankti

Use .get_feature_names_out() instead of get_feature_names().

@alexwang1001
Author

.get_feature_names_out() instead of get_feature_names()

That is what I thought as well. Is this a bug in run_scenicplus that needs to be fixed?

@SeppeDeWinter
Collaborator

Hi both

You're right. get_feature_names got replaced by get_feature_names_out (see: scikit-learn/scikit-learn#18444). I will update the code.

Best,

Seppe
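Until the fix is released, calling code can tolerate both method names. The following is only a minimal sketch, not the change made in scenicplus: the helper name get_feature_names_compat and the two stand-in classes are hypothetical, and with scikit-learn installed you would pass a fitted CountVectorizer instead.

```python
def get_feature_names_compat(vectorizer):
    """Return feature names from a fitted vectorizer, whichever API it offers."""
    if hasattr(vectorizer, "get_feature_names_out"):
        # Newer scikit-learn releases expose get_feature_names_out().
        return list(vectorizer.get_feature_names_out())
    # Older releases only have get_feature_names().
    return list(vectorizer.get_feature_names())


# Hypothetical stand-ins that mimic the two APIs, so the sketch runs
# without scikit-learn installed.
class OldStyleVectorizer:
    def get_feature_names(self):
        return ["chr1:100-200", "chr2:300-400"]


class NewStyleVectorizer:
    def get_feature_names_out(self):
        return ("chr1:100-200", "chr2:300-400")


old_names = get_feature_names_compat(OldStyleVectorizer())
new_names = get_feature_names_compat(NewStyleVectorizer())
print(old_names == new_names)
```

Checking for the new method first means the helper keeps working on releases where the old name no longer exists.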

@JoGraesslin

JoGraesslin commented Apr 28, 2023

Hi everyone, I am running into the same issue using Scenic+1.01:

, line 174, in export_to_loom
    ), columns=cv.get_feature_names(), index=regulons.keys())
AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

I created my pyscenic file using the method described in #48 (comment), as I am trying to get scenicplus to run for zebrafish.

I would be very grateful for any ideas!
Best,
Jo

@colquittlab

I received the same error as the OP while running run_scenicplus on the PBMC tutorial using scenicplus 1.0.1.dev2+g26677cb.

@jflucier

A patch seems to be available in the development branch.

@SeppeDeWinter, should I switch my scenicplus git checkout to the development branch?

Do you plan to merge the fix into the master branch?

Thanks in advance for your help.

@solvi808
Collaborator

Getting this error as well using scenicplus v. 1.0.1.dev4+ge4bdd9f

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[59], line 23
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
---> 23     raise(e)

Cell In[59], line 3
      1 from scenicplus.wrappers.run_scenicplus import run_scenicplus
      2 try:
----> 3     run_scenicplus(
      4         scplus_obj = scplus_obj,
      5         variable = [ KEY_TO_GROUP_BY_1 ],
      6         species = 'mmusculus', # hsapiens mmusculus
      7         assembly = 'mm10', # hg38 mm10
      8         tf_file = '/media/solvi/WORKSITE1001/refDBs/allTFs_mm.txt',
      9         save_path = os.path.join(work_dir, 'scenicplus'),
     10         biomart_host = biomart_host,
     11         upstream = [1000, 150000],
     12         downstream = [1000, 150000],
     13         calculate_TF_eGRN_correlation = True,
     14         calculate_DEGs_DARs = True,
     15         export_to_loom_file = True,
     16         export_to_UCSC_file = True,
     17         path_bedToBigBed = 'MU4',
     18         n_cpu = NCPUS ,
     19         _temp_dir = os.path.join(tmpDir, 'ray_spill'))
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)

File ~/scenicplus/src/scenicplus/wrappers/run_scenicplus.py:323, in run_scenicplus(scplus_obj, variable, species, assembly, tf_file, save_path, biomart_host, upstream, downstream, region_ranking, gene_ranking, simplified_eGRN, calculate_TF_eGRN_correlation, calculate_DEGs_DARs, export_to_loom_file, export_to_UCSC_file, tree_structure, path_bedToBigBed, n_cpu, _temp_dir, save_partial, **kwargs)
    321 if export_to_loom_file is True:
    322     log.info('Exporting to loom file')
--> 323     export_to_loom(scplus_obj, 
    324            signature_key = 'Gene_based',
    325            tree_structure = tree_structure,
    326            title =  'Gene based eGRN',
    327            nomenclature = assembly,
    328            out_fname=os.path.join(save_path,'SCENIC+_gene_based.loom'))
    329     export_to_loom(scplus_obj, 
    330            signature_key = 'Region_based',
    331            tree_structure = tree_structure,
    332            title =  'Region based eGRN',
    333            nomenclature = assembly,
    334            out_fname=os.path.join(save_path,'SCENIC+_region_based.loom'))
    336 if export_to_UCSC_file is True:

File ~/scenicplus/src/scenicplus/loom.py:174, in export_to_loom(scplus_obj, signature_key, out_fname, eRegulon_metadata_key, auc_key, auc_thr_key, keep_direct_and_extended_if_not_direct, selected_features, selected_cells, cluster_annotation, tree_structure, title, nomenclature)
    170     cv = CountVectorizer(
    171         lowercase=False, token_pattern=r'(?u)\b\w\w+\b:\b\w\w+\b-\b\w\w+\b')
    172 regulon_mat = cv.fit_transform(regulons.values())
    173 regulon_mat = pd.DataFrame(regulon_mat.todense(
--> 174 ), columns=cv.get_feature_names(), index=regulons.keys())
    175 regulon_mat = regulon_mat.reindex(columns=feature_names, fill_value=0).T
    176 if keep_direct_and_extended_if_not_direct is True:

AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

@jflucier

Hi,

I got around this problem (if I remember correctly) using the following Singularity container. Here is the recipe to build the container:


# to build: singularity build --force --fakeroot scenicplus.sif scenicplus.def

BootStrap: docker
From: ubuntu:22.04

%setup

%environment
    export PATH=/miniconda3/bin:$PATH
    export PATH=/ucsc.v386:$PATH

%post
    apt-get update && apt-get -y upgrade

    ln -fs /usr/share/zoneinfo/America/New_York /etc/localtime

    # # needed for concoct
    export DEBIAN_FRONTEND=noninteractive
    apt-get -y install \
    build-essential \
    wget \
    git \
    less \
    rsync \
    curl libcurl4 \
    python3 python3-dev python3-pybedtools

    cd /
    wget -c https://repo.anaconda.com/miniconda/Miniconda3-py39_4.11.0-Linux-x86_64.sh
    /bin/bash Miniconda3-py39_4.11.0-Linux-x86_64.sh -bfp /miniconda3
    export PATH=/miniconda3/bin:$PATH

    conda config --file /miniconda3/.condarc --add channels defaults
    conda config --file /miniconda3/.condarc --add channels conda-forge
    conda config --file /miniconda3/.condarc --add channels bioconda
    conda config --file /miniconda3/.condarc --add channels ursky

    echo ". /miniconda3/etc/profile.d/conda.sh" >> $SINGULARITY_ENVIRONMENT
    echo "conda activate scenicplus" >> $SINGULARITY_ENVIRONMENT

    . /miniconda3/etc/profile.d/conda.sh

    conda create --name scenicplus python=3.8
    conda activate scenicplus

    cd /
    mkdir /ucsc.v386
    cd /ucsc.v386
    wget -O bedToBigBed http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedToBigBed
    chmod a+x /ucsc.v386/*

    cd /
    wget https://github.com/macs3-project/MACS/archive/refs/tags/v2.2.7.1.tar.gz -O MACS.tar.gz
    tar -xvf MACS.tar.gz
    cd MACS-2.2.7.1
    sed -i 's/install_requires = \[f"numpy>={numpy_requires}",\]/install_requires = \[f"numpy{numpy_requires}",\]/' setup.py
    pip install -e .

    conda install --channel conda-forge --channel bioconda bedtools htslib pyrle pybedtools scanpy python-igraph leidenalg
    
    cd /
    git clone https://github.com/aertslab/scenicplus
    cd scenicplus
    
    # patch https://github.com/aertslab/scenicplus/commit/821ee7b719afbd1d1e74aadb3ffda9e27165c930
    # match the call including its parentheses so an already patched file is left unchanged
    sed -i 's/get_feature_names()/get_feature_names_out()/' /scenicplus/src/scenicplus/loom.py
    pip install -e .

    conda install --channel conda-forge numpy=1.23.5 --force
    pip install louvain

Hope this helps!
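The one-line patch in the recipe can also be written in Python. This is a rough sketch under my own assumptions: the function name patch_source is hypothetical, and the pattern only rewrites calls that still use the old name, so running it twice does not change the text again.

```python
import re

# Matches the old call only; "get_feature_names_out(" does not match because
# the opening parenthesis must directly follow "get_feature_names".
OLD_CALL = re.compile(r"\bget_feature_names\(")


def patch_source(text):
    """Replace calls to get_feature_names( with get_feature_names_out(."""
    return OLD_CALL.sub("get_feature_names_out(", text)


# The failing line from loom.py as shown in the traceback.
line = "), columns=cv.get_feature_names(), index=regulons.keys())"
patched_once = patch_source(line)
patched_twice = patch_source(patched_once)
print(patched_once)
```

To apply it to a file, read loom.py as text, pass the contents through patch_source, and write the result back.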

@Umaarasu

Umaarasu commented Mar 16, 2024

Hi both

You're right. get_feature_names got replaced by get_feature_names_out (see: scikit-learn/scikit-learn#18444). I will update the code.

Best,

Seppe
@SeppeDeWinter Hi, does this mean we just have to update scikit-learn?
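For context, as far as I know get_feature_names was deprecated in scikit-learn 1.0 and removed in 1.2, so the error shows up on newer scikit-learn releases, not older ones. A small sketch for checking which side of that line an environment is on; the function names here are hypothetical and the 1.2 cutoff is my assumption from the scikit-learn release notes.

```python
import re
from importlib import metadata


def removed_get_feature_names(version):
    """Return True if this scikit-learn version no longer has get_feature_names()."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumed cutoff: the old method was removed in 1.2.
    return (major, minor) >= (1, 2)


def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


version = installed_sklearn_version()
if version is not None:
    print(version, removed_get_feature_names(version))
```

If the check returns True, the unpatched loom.py will raise the AttributeError, and either the scenicplus patch or an older scikit-learn is needed.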
