TypeError with PCA Initialization in TensorFlow-Modisco: Seeking Advice for Sparse Matrix Handling #111

Open
JoneSu1 opened this issue Jan 19, 2024 · 1 comment

JoneSu1 commented Jan 19, 2024

Hello TensorFlow-Modisco Community,

I recently ran into an issue while running a TensorFlow-Modisco workflow that involves t-SNE computations. The workflow fails with a TypeError about PCA initialization, raised when t-SNE receives a sparse input matrix. The code I ran and the resulting traceback are as follows:

# install TF-modisco
!pip install modisco==0.5.16.0

import sklearn
print(sklearn.__version__)

import modisco

# check parameters
import inspect
def get_default_args(func):
    signature = inspect.signature(func)
    return {
        k: v.default
        for k, v in signature.parameters.items()
        if v.default is not inspect.Parameter.empty
    }

get_default_args(modisco.tfmodisco_workflow.workflow.TfModiscoWorkflow)

from importlib import reload
# reload(modisco.util)
# reload(modisco.pattern_filterer)
# reload(modisco.aggregator)
# reload(modisco.core)
# reload(modisco.seqlet_embedding.advanced_gapped_kmer)
# reload(modisco.affinitymat.transformers)
# reload(modisco.affinitymat.core)
# reload(modisco.affinitymat)
# reload(modisco.cluster.core)
# reload(modisco.cluster)
# reload(modisco.tfmodisco_workflow.seqlets_to_patterns)
# reload(modisco.tfmodisco_workflow)
# reload(modisco)

null_per_pos_scores = modisco.coordproducers.LaplaceNullDist(num_to_samp=5000)

# prepare TF-modisco function to run for dev or hk
def my_tfmodisco(task):
    ...
    [Rest of your code here]
    ...
    return tfmodisco_results

# function to visualize motifs
from collections import Counter
import numpy as np

from modisco.visualization import viz_sequence
reload(viz_sequence)
from matplotlib import pyplot as plt

import modisco.affinitymat.core
reload(modisco.affinitymat.core)
import modisco.cluster.phenograph.core
reload(modisco.cluster.phenograph.core)
import modisco.cluster.phenograph.cluster
reload(modisco.cluster.phenograph.cluster)
import modisco.cluster.core
reload(modisco.cluster.core)
import modisco.aggregator
reload(modisco.aggregator)

def modisco_motif_plots(task):
    ...
    [Rest of your code here]
    ...
    hdf5_results.close()

# Run TF-Modisco - takes around 10mins
dev_tfmodisco_results = my_tfmodisco('Dev_contrib_scores')

# save results
import h5py  # needed for h5py.File below
import modisco.util
reload(modisco.util)
grp = h5py.File("/content/drive/MyDrive/DeepSTARR_tutorial/Dev_modisco_results.hdf5", "w")
dev_tfmodisco_results.save_hdf5(grp)
grp.close()

TypeError Traceback (most recent call last)
[<ipython-input-121-086db4f84169>] in <cell line: 2>()
    1 # Run TF-Modisco - takes around 10mins
----> 2 dev_tfmodisco_results = my_tfmodisco('Dev_contrib_scores')
    3 
    4 
    5 # save results
5 frames
[/usr/local/lib/python3.10/dist-packages/sklearn/manifold/_t_sne.py] in _fit(self, X, skip_num_points)
    833 
    834         if isinstance(self.init, str) and self.init == "pca" and issparse(X):
--> 835             raise TypeError(
    836                 "PCA initialization is currently not supported "
    837                 "with the sparse input matrix. Use "
TypeError: PCA initialization is currently not supported with the sparse input matrix. Use init="random" instead.


@AvantiShri
Collaborator

Hi @JoneSu1, the fix for this was helpfully contributed by another user in v0.5.16.3, but I hadn't pushed it to PyPI. I just pushed it, so if you install the latest version you shouldn't get this error. Best, Avanti

(btw "tf" here stands for "transcription factor", it was given that prefix to distinguish it from Eclipse Modisco)
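
To check whether an environment already has the release mentioned above, something like the following sketch may help. It uses only the standard library; the helper names `version_tuple` and `has_fix` are hypothetical, and the comparison assumes a purely numeric dotted version string such as `0.5.16.3` (a suffix like `.dev1` would need a proper version parser).

```python
from importlib import metadata

FIXED_VERSION = "0.5.16.3"  # release named in the reply above


def version_tuple(version):
    """Turn a dotted numeric version such as '0.5.16.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(installed, fixed=FIXED_VERSION):
    """Return True if the installed version is at least the fixed release."""
    return version_tuple(installed) >= version_tuple(fixed)


try:
    installed = metadata.version("modisco")
except metadata.PackageNotFoundError:
    print("modisco is not installed")
else:
    status = "includes the fix" if has_fix(installed) else "predates the fix"
    print(installed, status)
```

In a notebook like the one in the original post, replacing the pinned `!pip install modisco==0.5.16.0` with `!pip install --upgrade modisco` should pull in the newer release.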
