Moran's I as bio conservation metric #245

Open. Hrovatin (Contributor) wants to merge 1 commit into base: main.

@Hrovatin Hrovatin commented Jul 1, 2021

I have finally added Moran's I as a bio conservation metric.

Motivation:

A metric that captures bio preservation** without cell type labels. Useful when no annotation is available or the annotation is unreliable (e.g. for cell subtypes*).
Method:
Moran's I measures how clearly genes show spatially variable patterns across the embedding. Thus, if genes vary non-randomly on non-integrated data, one would also want them to vary non-randomly across the embedding of the integrated data.

  • Compute HVGs on non-integrated data.
  • Compute Moran's I for these genes on integrated data. Case B: report mean(integrated_I).
  • Optional: Compute Moran's I for each non-integrated sample. Case A: report mean(integrated_I - max(sample_I))

I have used only case B so far. But if the same HVGs are used across different integrations, B might suffice (with less computation). However, this would need to be tested.
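The method steps above can be sketched roughly as follows. This is a NumPy-only illustration, not the code in this PR (which relies on Scanpy's Moran's I implementation). The function names `knn_graph`, `morans_i`, `case_a` and `case_b`, the binary symmetrised kNN weights, and the brute-force dense graph are all assumptions made for the example. HVG selection is not shown: `expr` is assumed to already hold the HVG columns, none of which is constant.

```python
import numpy as np


def knn_graph(embedding, k=15):
    """Binary, symmetrised kNN graph (dense, brute force; toy sizes only)."""
    n = embedding.shape[0]
    k = min(k, n - 1)
    dist = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(w, w.T)


def morans_i(w, expr):
    """Moran's I per gene; expr is cells x genes, w is cells x cells."""
    z = expr - expr.mean(axis=0)
    num = (z * (w @ z)).sum(axis=0)  # sum_ij w_ij * z_i * z_j for each gene
    den = (z ** 2).sum(axis=0)       # zero for constant genes (assumed absent)
    return (expr.shape[0] / w.sum()) * num / den


def case_b(integrated_emb, expr, k=15):
    """Case B: mean Moran's I of the HVGs on the integrated embedding."""
    return float(morans_i(knn_graph(integrated_emb, k), expr).mean())


def case_a(integrated_emb, unintegrated_emb, expr, batch, k=15):
    """Case A: mean(integrated_I - max(sample_I)), one graph per sample."""
    i_int = morans_i(knn_graph(integrated_emb, k), expr)
    i_samples = np.stack([
        morans_i(knn_graph(unintegrated_emb[batch == b], k), expr[batch == b])
        for b in np.unique(batch)
    ])
    return float((i_int - i_samples.max(axis=0)).mean())
```

Case A needs one extra graph per sample, which is where the additional computation relative to case B comes from.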

** This corresponded strongly to the ranking from other metrics on my pancreatic beta cell subtypes.

* I think this is a very important point. Conservation across cell subtypes may sometimes be much worse than across cell types, as I have observed when comparing scVI and the autoencoder from scArches (not trVAE).

TODO:

  • Please check that the code matches your parameter naming etc., or add any other parameters you may find necessary. For example, I always recompute connectivities.

  • Please check that it runs in your env (Moran's I was added to Scanpy only recently). I did not try to run it in any of your envs, just in mine.

@Hrovatin Hrovatin requested a review from LuckyMD July 1, 2021 10:16
@Hrovatin (Contributor Author) commented

Known issue: If any of the batches has <15 cells, this will raise an error, since PCA with 15 components cannot be computed.
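One possible guard, shown only as a sketch and not part of this PR: cap the number of components by the batch size before running PCA, and skip batches that are too small. The helper name `capped_n_comps` is hypothetical, and the strict inequality assumes an ARPACK-style truncated solver, which needs fewer components than min(n_cells, n_genes).

```python
def capped_n_comps(n_cells, n_genes, requested=15):
    """Largest usable number of PCs, or None if the batch is too small."""
    # Assumption: the solver needs n_comps < min(n_cells, n_genes).
    limit = min(n_cells, n_genes) - 1
    if limit < 1:
        return None  # caller should skip this batch
    return min(requested, limit)
```

A batch with 10 cells would then be embedded with 9 components instead of raising, at the cost of per-sample graphs that are built from embeddings of different dimensionality.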

@mumichae mumichae added this to To do in scib maintenance Mar 16, 2022
@mumichae mumichae requested a review from MxMstrmn March 29, 2022 14:02
@mumichae mumichae moved this from To do to In progress in scib maintenance Apr 19, 2022
@mumichae mumichae mentioned this pull request May 5, 2022