AssertionError: 50 1 in assert len(fwd)==len(rev),str(len(fwd))+" "+str(len(rev)) #74

Open
ofiryaish opened this issue Dec 6, 2020 · 4 comments

Comments

@ofiryaish

Hey, I installed modisco 5.9.0 (updated to it, actually), and I get an error while it is running:

TF-MoDISco is using the TensorFlow backend.
WARNING:tensorflow:From /users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.neighbors.kde module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
  warnings.warn(message, FutureWarning)
MEMORY 1.009188864
On task task0
Computing windowed sums on original
Generating null dist
peak(mu)= -3.022349007487297
Computing threshold
Thresholds from null dist were -0.017756938934326172  and  0.00016736984252929688
Passing windows frac was 0.9985476992143659 , which is above  0.2 ; adjusting
Final raw thresholds are -4.143244409561158  and  4.143244409561158
Final transformed thresholds are -0.8  and  0.8
saving plot to figures/scoredist_0.png
Got 10934 coords
After resolving overlaps, got 10934 seqlets
Across all tasks, the weakest transformed threshold used was: 0.7999
MEMORY 1.039724544
10934 identified in total
min_metacluster_size_frac * len(seqlets) = 109 is more than min_metacluster_size=100.
Using it as a new min_metacluster_size
Reducing weak_threshold_for_counting_sign to match weakest_transformed_thresh, from 0.8 to 0.7999
2 activity patterns with support >= 109 out of 2 possible patterns
Metacluster sizes:  [10203, 731]
Idx to activities:  {0: '-1', 1: '1'}
MEMORY 1.040121856
On metacluster 1
Metacluster size 731
Relevant tasks:  ('task0',)
Relevant signs:  (1,)
TfModiscoSeqletsToPatternsFactory: seed=1234
(Round 1) num seqlets: 731
(Round 1) Computing coarse affmat
MEMORY 1.069858816
Beginning embedding computation
Computing embeddings
Using TensorFlow backend.
MAKING A SESSION
2020-12-06 15:26:34.713266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-06 15:26:34.714747: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      
Finished embedding computation in 7.69 s
Starting affinity matrix computations
Normalization computed in 0.68 s
Cosine similarity mat computed in 0.79 s
Normalization computed in 0.77 s
Cosine similarity mat computed in 0.85 s
Finished affinity matrix computations in 1.64 s
(Round 1) Compute nearest neighbors from coarse affmat
MEMORY 1.347534848
Computed nearest neighbors in 0.09 s
MEMORY 1.356439552
(Round 1) Computing affinity matrix on nearest neighbors
MEMORY 1.356439552
Launching nearest neighbors affmat calculation job
MEMORY 1.356570624
Parallel runs completed
MEMORY 1.316503552
Job completed in: 12.76 s
MEMORY 1.316507648
Launching nearest neighbors affmat calculation job
MEMORY 1.316466688
Parallel runs completed
MEMORY 1.328021504
Job completed in: 12.96 s
MEMORY 1.332158464
(Round 1) Computed affinity matrix on nearest neighbors in 25.91 s
MEMORY 1.336352768
Filtered down to 633 of 731
(Round 1) Retained 633 rows out of 731 after filtering
MEMORY 1.336487936
(Round 1) Computing density adapted affmat
MEMORY 1.336487936
[t-SNE] Computing 31 nearest neighbors...
[t-SNE] Indexed 633 samples in 0.001s...
[t-SNE] Computed neighbors for 633 samples in 0.007s...
[t-SNE] Computed conditional probabilities for sample 633 / 633
[t-SNE] Mean sigma: 0.225459
(Round 1) Computing clustering
MEMORY 1.336500224
Beginning preprocessing + Leiden
  0%|                                                                                                                                                                                                                                                           | 0/50 [00:00<?, ?it/s]Quality: 0.5709345452688861
  2%|████▊                                                                                                                                                                                                                                              | 1/50 [00:00<00:06,  8.05it/s]Quality: 0.5713024674728743
  4%|█████████▋                                                                                                                                                                                                                                         | 2/50 [00:00<00:07,  6.67it/s]Quality: 0.571337497320167
  6%|██████████████▌                                                                                                                                                                                                                                    | 3/50 [00:00<00:07,  6.19it/s]Quality: 0.5725037634010955
 16%|██████████████████████████████████████▉                                                                                                                                                                                                            | 8/50 [00:01<00:06,  6.55it/s]Quality: 0.5726370747530787
 18%|███████████████████████████████████████████▋                                                                                                                                                                                                       | 9/50 [00:01<00:06,  5.87it/s]Quality: 0.5731774961658465
 26%|██████████████████████████████████████████████████████████████▉                                                                                                                                                                                   | 13/50 [00:02<00:06,  5.84it/s]Quality: 0.5755771740992032
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:08<00:00,  5.60it/s]
Got 11 clusters after round 1
Counts:
{1: 123, 2: 97, 5: 44, 6: 37, 7: 34, 3: 84, 4: 67, 8: 16, 0: 127, 10: 2, 9: 2}
MEMORY 1.337581568
(Round 1) Aggregating seqlets in each cluster
MEMORY 1.337581568
Aggregating for cluster 0 with 127 seqlets
MEMORY 1.337581568
Trimming eliminated 0 seqlets out of 127
Skipped 57 seqlets
Aggregating for cluster 1 with 123 seqlets
MEMORY 1.337581568
Trimming eliminated 0 seqlets out of 123
Skipped 59 seqlets
Aggregating for cluster 2 with 97 seqlets
MEMORY 1.337667584
Trimming eliminated 0 seqlets out of 97
Skipped 40 seqlets
Aggregating for cluster 3 with 84 seqlets
MEMORY 1.337790464
Trimming eliminated 0 seqlets out of 84
Skipped 38 seqlets
Aggregating for cluster 4 with 67 seqlets
MEMORY 1.337864192
Trimming eliminated 0 seqlets out of 67
Skipped 28 seqlets
Aggregating for cluster 5 with 44 seqlets
MEMORY 1.33789696
Trimming eliminated 0 seqlets out of 44
Skipped 23 seqlets
Aggregating for cluster 6 with 37 seqlets
MEMORY 1.337933824
Trimming eliminated 0 seqlets out of 37
Skipped 19 seqlets
Aggregating for cluster 7 with 34 seqlets
MEMORY 1.337933824
Trimming eliminated 0 seqlets out of 34
Skipped 19 seqlets
Aggregating for cluster 8 with 16 seqlets
MEMORY 1.337933824
Trimming eliminated 0 seqlets out of 16
Skipped 4 seqlets
Aggregating for cluster 9 with 2 seqlets
MEMORY 1.337950208
Trimming eliminated 0 seqlets out of 2
Skipped 1 seqlets
Aggregating for cluster 10 with 2 seqlets
MEMORY 1.337950208
Trimming eliminated 0 seqlets out of 2
Skipped 1 seqlets
(Round 2) num seqlets: 344
(Round 2) Computing coarse affmat
MEMORY 1.337950208
Beginning embedding computation
Computing embeddings
Finished embedding computation in 2.76 s
Starting affinity matrix computations
Normalization computed in 0.3 s
Cosine similarity mat computed in 0.36 s
Normalization computed in 0.34 s
Cosine similarity mat computed in 0.39 s
Finished affinity matrix computations in 0.75 s
(Round 2) Compute nearest neighbors from coarse affmat
MEMORY 1.329205248
Computed nearest neighbors in 0.05 s
MEMORY 1.324064768
(Round 2) Computing affinity matrix on nearest neighbors
MEMORY 1.324064768
Launching nearest neighbors affmat calculation job
MEMORY 1.324101632
Parallel runs completed
MEMORY 1.326985216
Job completed in: 6.71 s
MEMORY 1.326723072
Launching nearest neighbors affmat calculation job
MEMORY 1.326460928
Parallel runs completed
MEMORY 1.32646912
Job completed in: 6.71 s
MEMORY 1.32646912
(Round 2) Computed affinity matrix on nearest neighbors in 13.5 s
MEMORY 1.32646912
Not applying filtering for rounds above first round
MEMORY 1.32646912
(Round 2) Computing density adapted affmat
MEMORY 1.32646912
[t-SNE] Computing 31 nearest neighbors...
[t-SNE] Indexed 344 samples in 0.000s...
[t-SNE] Computed neighbors for 344 samples in 0.003s...
[t-SNE] Computed conditional probabilities for sample 344 / 344
[t-SNE] Mean sigma: 0.232394
(Round 2) Computing clustering
MEMORY 1.32646912
Beginning preprocessing + Leiden
  0%|                                                                                                                                                                                                                                                           | 0/50 [00:00<?, ?it/s]Quality: 0.5417845542582778
Quality: 0.5466505032751225
  8%|███████████████████▍                                                                                                                                                                                                                               | 4/50 [00:00<00:02, 17.08it/s]Quality: 0.5473188417171921
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:03<00:00, 16.05it/s]
Got 10 clusters after round 2
Counts:
{1: 64, 7: 19, 5: 23, 6: 20, 2: 49, 0: 71, 3: 49, 8: 19, 4: 28, 9: 2}
MEMORY 1.326731264
(Round 2) Aggregating seqlets in each cluster
MEMORY 1.326731264
Aggregating for cluster 0 with 71 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 71
Skipped 11 seqlets
Aggregating for cluster 1 with 64 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 64
Skipped 7 seqlets
Aggregating for cluster 2 with 49 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 49
Skipped 7 seqlets
Aggregating for cluster 3 with 49 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 49
Skipped 8 seqlets
Removed 1 duplicate seqlets
Aggregating for cluster 4 with 28 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 28
Skipped 4 seqlets
Aggregating for cluster 5 with 23 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 23
Skipped 3 seqlets
Aggregating for cluster 6 with 20 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 20
Skipped 6 seqlets
Aggregating for cluster 7 with 19 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 19
Skipped 7 seqlets
Aggregating for cluster 8 with 19 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 19
Skipped 10 seqlets
Aggregating for cluster 9 with 2 seqlets
MEMORY 1.326731264
Trimming eliminated 0 seqlets out of 2
Skipped 1 seqlets
Dropping cluster 9 with 1 seqlets due to sign disagreement
Got 9 clusters
Splitting into subclusters...
MEMORY 1.326731264
Inspecting for spurious merging
Wrote graph to binary file in 0.010907888412475586 seconds
Running Louvain modularity optimization
After 1 runs, maximum modularity is Q = 0.0044247
Louvain completed 21 runs in 1.5680828094482422 seconds
Similarity is 0.91470367; is_dissimilar is False
Inspecting for spurious merging
Wrote graph to binary file in 0.010420799255371094 seconds
Running Louvain modularity optimization
After 1 runs, maximum modularity is Q = 0.004879
After 3 runs, maximum modularity is Q = 0.00499837
After 4 runs, maximum modularity is Q = 0.00521547
After 16 runs, maximum modularity is Q = 0.00521548
Louvain completed 36 runs in 2.6976420879364014 seconds
Similarity is 0.8170022; is_dissimilar is False
Inspecting for spurious merging
Wrote graph to binary file in 0.005519390106201172 seconds
Running Louvain modularity optimization
After 1 runs, maximum modularity is Q = 0.00386571
After 3 runs, maximum modularity is Q = 0.00456507
Louvain completed 23 runs in 1.697765588760376 seconds
Similarity is 0.8003647; is_dissimilar is False
Inspecting for spurious merging
Wrote graph to binary file in 0.005071401596069336 seconds
Running Louvain modularity optimization
After 1 runs, maximum modularity is Q = 0.00160869
After 2 runs, maximum modularity is Q = 0.00210899
After 4 runs, maximum modularity is Q = 0.00252618
After 5 runs, maximum modularity is Q = 0.00263157
After 6 runs, maximum modularity is Q = 0.00266519
After 8 runs, maximum modularity is Q = 0.0026652
Louvain completed 28 runs in 2.269061803817749 seconds
Similarity is 0.9035732; is_dissimilar is False
Merging on 9 clusters
MEMORY 1.32550656
On merging iteration 1
Numbers for each pattern pre-subsample: [60, 57, 42, 40, 24, 20, 14, 12, 9]
Numbers after subsampling: [60, 57, 42, 40, 24, 20, 14, 12, 9]
TF-MoDISco is using the TensorFlow backend.
TF-MoDISco is using the TensorFlow backend.
TF-MoDISco is using the TensorFlow backend.
TF-MoDISco is using the TensorFlow backend.
2020-12-06 15:27:51.339844: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-06 15:27:51.340008: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-06 15:27:51.340128: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-06 15:27:51.340421: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
WARNING:tensorflow:From /users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:tensorflow:From /users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:tensorflow:From /users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:tensorflow:From /users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.neighbors.kde module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
  warnings.warn(message, FutureWarning)
/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.neighbors.kde module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
  warnings.warn(message, FutureWarning)
/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.neighbors.kde module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
  warnings.warn(message, FutureWarning)
/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.neighbors.kde module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
  warnings.warn(message, FutureWarning)
Applying left/right pad of 0 and 1 for (7479, 61, 111) with total sequence length 110
Traceback (most recent call last):
  File "/data/yaishof/deg_project/deg_project/general/run.py", line 47, in <module>
    GPU=False
  File "/data/yaishof/deg_project/deg_project/NN/NN_train_test_models_utilies.py", line 302, in evaluate_model_type
    model_id,  model_type, data_type, preforme_TF_modisco)
  File "/data/yaishof/deg_project/deg_project/NN/NN_train_test_models_utilies.py", line 346, in evaluate_TF_modisco
    TF_modisco.run_modisco(hyp_impscores, impscores, onehot_data)
  File "/data/yaishof/deg_project/deg_project/NN/TF_modisco.py", line 59, in run_modisco
    one_hot=onehot_data)#,
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/tfmodisco_workflow/workflow.py", line 377, in __call__
    metacluster_seqlets)
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/tfmodisco_workflow/seqlets_to_patterns.py", line 913, in __call__
    patterns=split_patterns) 
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/aggregator.py", line 845, in __call__
    self.pattern_comparison_settings.track_names) 
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/core.py", line 149, in create_seqlets
    attribute_names=attribute_names))
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/core.py", line 159, in create_seqlet
    attribute_names=attribute_names) 
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/core.py", line 165, in augment_seqlet
    data_track=self.track_name_to_data_track[track_name])
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/core.py", line 456, in add_snippet_from_data_track
    snippet = data_track.get_snippet(coor=self.coor)
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/core.py", line 111, in get_snippet
    has_pos_axis=self.has_pos_axis)
  File "/users/agnon/year2016/yaishof/.conda/envs/ofir_env/lib/python3.6/site-packages/modisco/core.py", line 18, in __init__
    assert len(fwd)==len(rev),str(len(fwd))+" "+str(len(rev))
AssertionError: 50 1

Something in the update has probably stopped working.
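
For readers of the traceback: the failing assertion formats its message as the two lengths it compares, so `50 1` means the forward track had length 50 and the reverse track had length 1. The sketch below only mirrors the assertion line quoted in the traceback. `SnippetSketch` and `try_build` are hypothetical names used for illustration, not modisco's actual classes, and the sketch does not explain why the lengths differed in this run.

```python
class SnippetSketch:
    """Hypothetical stand-in holding a forward and a reverse track."""

    def __init__(self, fwd, rev):
        # Same check and message format as the line shown in the traceback:
        # the message is "<len(fwd)> <len(rev)>".
        assert len(fwd) == len(rev), str(len(fwd)) + " " + str(len(rev))
        self.fwd = fwd
        self.rev = rev


def try_build(fwd, rev):
    """Return 'ok', or the formatted error if the lengths disagree."""
    try:
        SnippetSketch(fwd, rev)
        return "ok"
    except AssertionError as err:
        return "AssertionError: " + str(err)


# A forward track of length 50 and a reverse track of length 1
# reproduce the message from the issue title.
message = try_build([0.0] * 50, [0.0])
print(message)
```

Matching lengths, for example `try_build([0.0] * 50, [0.0] * 50)`, pass the check and return `"ok"`.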

Another problem I saw in this version is that you call tf.compat.v1.disable_v2_behavior(). If modisco is part of a pipeline that performs other operations in TensorFlow 2, this causes a problem: for example, it seems to disable eager execution. When I tried calling tf.compat.v1.enable_v2_behavior() after using modisco, it still did not work, so another solution is needed.

Thank you for your work.

@AvantiShri AvantiShri mentioned this issue Dec 10, 2020
@AvantiShri
Collaborator

AvantiShri commented Dec 10, 2020

Hi @ofiryaish, thanks for bringing this error to my attention. Can you try out the fix in PR #76? If it works, I can create a new release with the fix. Let me know if you have any confusion.

As for the issue with tf.compat.v1.disable_v2_behavior(), unfortunately I'm not sure there's a simple way to fix this without breaking backwards compatibility with tensorflow 1 (which some codebases are still on). If you run tfmodisco in a separate script or (if you are running in a jupyter notebook) if you restart the jupyter kernel after running tfmodisco, then tf.compat.v1.enable_v2_behavior might work.
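
One way to apply the "separate script" suggestion from inside a larger pipeline is to launch the step in a child Python process, so that any process-wide state it changes ends when the child exits. The sketch below is a minimal illustration under assumptions: `CHILD_CODE` is a hypothetical stand-in that changes an environment variable instead of importing modisco or TensorFlow, and `run_isolated` is an illustrative helper, not part of either library.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a script that would import modisco and run it.
# Here it only mutates process-wide state (an environment variable) to show
# that the change stays inside the child process.
CHILD_CODE = """
import os
os.environ["DEMO_GLOBAL_STATE"] = "changed-in-child"
print("child finished")
"""


def run_isolated(code):
    """Run `code` in a fresh Python interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit
    )


result = run_isolated(CHILD_CODE)
print(result.stdout.strip())
# The parent process never sees the child's change to its own environment.
print("DEMO_GLOBAL_STATE" in os.environ)
```

In a real pipeline the child would typically write its results to a file that the parent reads afterwards, since in-memory objects are not shared between the two processes.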

@ofiryaish
Author

@AvantiShri
Seems to work.

Regarding the TensorFlow problem, I understand; I thought it had an easy fix. It is hard to combine modisco into a pipeline that involves other operations in TF2.

Thank you

@AvantiShri
Collaborator

FYI @ofiryaish, my more recent versions of TF-MoDISco have switched to a different default way of computing the gapped kmer embeddings that doesn't rely on the GPU (and hence does not require importing tensorflow). The old way of computing gapped kmer embeddings is still supported, but if you use the new default settings then it won't need the GPU, and you might find that easier. Let me know if you have any questions.

@ofiryaish
Author

I will try it in the future. Thank you.
