
PCmetrics not being calculated for some clusters #74

Open
JRicardo24 opened this issue Sep 6, 2022 · 5 comments

Comments

JRicardo24 commented Sep 6, 2022

Hi guys, I was checking out the function in the metrics module that calculates the Principal Component related metrics; the screenshot below was taken from it:
[screenshot: Github_askJosh]
The condition on line 286 seems to be working just fine: I tested it on my own dataset and it doesn't calculate the metrics for clusters with 20 spikes or fewer (I also tested a 100-spike threshold). However, some other clusters in my dataset with more than 20 spikes, including one with roughly 30k spikes, also have no metrics calculated (their rows in the DataFrame are filled with NaN values). I was wondering if this has to do with the conditions on lines 284 or 285, since I haven't fully understood what they accomplish. Any help is appreciated :) @jsiegle

jsiegle (Collaborator) commented Sep 7, 2022

Just to clarify – have you done any manual curation on this dataset? If not, then do you know which of the four conditions (all_pcs.shape[0] > 10, not (all_labels == cluster_id).all(), etc.) is causing it to skip the calculation for the units with lots of spikes?
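For reference, the four-way guard jsiegle describes can be sketched like this. The function name and the `min_spikes` parameter are mine; the condition names (`all_pcs`, `all_labels`, `cluster_id`) come from the thread. This is an assumed reconstruction for discussion, not the actual metrics-module code:

```python
import numpy as np

def should_compute_pc_metrics(all_pcs, all_labels, cluster_id, min_spikes=20):
    """Sketch of the guard conditions discussed above (assumed form)."""
    return bool(all_pcs.shape[0] > 10                              # enough PC rows overall
                and np.sum(all_labels == cluster_id) > min_spikes  # enough spikes in this unit
                and not np.all(all_labels == cluster_id))          # other units exist to compare against

# toy data: 30 spikes labelled 240, 30 labelled 7
labels = np.array([240] * 30 + [7] * 30)
pcs = np.zeros((60, 3))

print(should_compute_pc_metrics(pcs, labels, 240))  # True
print(should_compute_pc_metrics(pcs, labels, 999))  # False: no spikes carry that label
```

The last case is the interesting one for this issue: a cluster ID that never appears in `all_labels` fails the spike-count condition even if the unit itself has many spikes elsewhere.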

JRicardo24 (Author) commented

Sorry for the delay @jsiegle.
No manual curation has been done on the dataset.
After some testing, I found that the condition causing it to skip the calculation for units with lots of spikes, like our cluster with 30917 spikes, is (sum(all_labels == cluster_id) > 100). That is, in this case the cluster is 240, but when I print sum(all_labels == cluster_id) the result is 0. I truly don't know why all_labels does not contain 240 among its elements, but that is what is preventing the calculation of the PC_metrics.
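A quick way to surface every cluster affected by this is to compare the IDs the metrics loop iterates over against what is actually present in `all_labels`. The data here is a hypothetical toy example; the variable names follow the thread:

```python
import numpy as np

# Hypothetical sanity check: list every cluster ID the metrics loop would
# visit that has zero matching entries in all_labels. A unit like 240
# showing up here explains a NaN row despite a large spike count elsewhere.
all_labels = np.array([7, 7, 239, 239, 241, 241])  # toy labels; 240 is absent
cluster_ids = [7, 239, 240, 241]

missing = [c for c in cluster_ids if not np.any(all_labels == c)]
print(missing)  # [240]
```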

jsiegle (Collaborator) commented Oct 20, 2022

Can you print the values in relative_counts for this cluster? It's possible that the count scaling is causing there to be zero PCs included in the calculation.
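The scaling jsiegle refers to could plausibly zero out small units. A minimal sketch, under the assumption that the counts are rescaled proportionally to fit within `max_spikes_for_unit` and then truncated to integers (an assumed formula, not the library's actual code):

```python
import numpy as np

def scaled_counts(spike_counts, max_spikes_for_unit=2000):
    """Assumed form of the count scaling: shrink per-unit counts
    proportionally so the total subsample stays within
    max_spikes_for_unit, then truncate to integers."""
    counts = np.asarray(spike_counts, dtype=float)
    total = counts.sum()
    if total > max_spikes_for_unit:
        counts = counts / total * max_spikes_for_unit
    return counts.astype(int)

# one ~30k-spike unit swamps two tiny neighbours, whose counts truncate
# to 0, leaving zero of their PCs included in the calculation
print(scaled_counts([30917, 10, 5]))  # 1999, 0, 0
```

Under this assumption, a zero in the scaled counts would exclude a unit's PCs entirely, which is the failure mode jsiegle is asking about.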

JRicardo24 (Author) commented

[screenshot: Git_resposta]
This is the relative_counts for cluster 240. The value printed below it, 41, is just its length.
This test was made with a max_spikes_for_unit value of 2000.
Here's more info that might be helpful, @jsiegle:
[screenshot: aditional_info]

jsiegle (Collaborator) commented Oct 31, 2022

I'm not sure what could be causing this. Let me know if you're able to gain any more insight into the problem.
