Adjust ontology levels figure #66

Closed
3 tasks done
bschilder opened this issue Apr 10, 2024 · 5 comments

bschilder commented Apr 10, 2024

  • See what p-value vs. ontology level looks like
  • Remove specificity.
  • Add labels to the y-axis to make it more obvious what ontology level means (more broad terms <--> more specific terms)
Screenshot 2024-04-10 at 13 38 32

bschilder commented Apr 10, 2024

p-values (all)

When you plot all phenotype-cell type p-values against ontology level, this really just reflects the number of significant associations. It shows what we expect: fewer significant p-values with more specific phenotypes.

p_all

p-values (significant only)

If you subset the tests to only those that passed the FDR<0.05 threshold and then plot the p-values, there's a highly significant (but very small) effect: more specific phenotypes tend to have more significant p-values.
p_sig
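
A "highly significant but very small" trend is what a large number of tests tends to produce. The sketch below uses simulated data only (not the real results) and hypothetical column names `ont_lvl` and `p`; it shows one way such a trend could be checked with a linear model, not necessarily how the figure above was produced.

```r
set.seed(1)  # simulated data only; these are NOT the real results
n <- 20000

# Hypothetical columns: `ont_lvl` (ontology level) and `p` (p-value of a significant test).
sim <- data.frame(ont_lvl = sample(0:10, n, replace = TRUE))

# Build in a tiny downward trend in p with ontology level, plus a lot of noise,
# then clip so every value stays inside (0, 0.05].
sim$p <- 0.025 - 0.0002 * sim$ont_lvl + rnorm(n, sd = 0.01)
sim$p <- pmin(pmax(sim$p, 1e-6), 0.05)

fit <- lm(p ~ ont_lvl, data = sim)

# Slope estimate and its p-value: the slope is tiny, yet n is large enough to detect it.
summary(fit)$coefficients["ont_lvl", c("Estimate", "Pr(>|t|)")]

# R-squared shows how little of the variance the trend explains.
summary(fit)$r.squared
```

With this many rows the slope can be estimated precisely even when it explains almost none of the variance, so it is worth reading the effect size alongside the test's p-value.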

@bschilder bschilder self-assigned this Apr 10, 2024
@bschilder bschilder added the enhancement New feature or request label Apr 10, 2024
@bschilder

Preview of the arrows annotating the y-axis:

ontLvl_arrows

@bschilder bschilder reopened this Apr 16, 2024
@bschilder

When I checked whether logging the p-values and plotting only the significant ones changed anything, I don't think it made any difference. But I'll go back and try again manually to make sure.

@bschilder

It would also be worth plotting against Information Content, a metric that approximates an ontology term's specificity while normalising for different branch depths.
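
For reference, a minimal base R sketch of one common structure-based definition of Information Content: the negative log of the fraction of ontology terms that sit at or below a given term. The toy ontology and term names here are hypothetical, and the packages used in this repo may compute IC differently (for example from annotation frequencies).

```r
# Toy ontology as a child -> parent table (hypothetical terms, not real HPO IDs).
edges <- data.frame(
  child  = c("B", "C", "D", "E"),
  parent = c("A", "A", "B", "D"),
  stringsAsFactors = FALSE
)
terms <- union(edges$parent, edges$child)

# Collect a term plus all of its descendants by walking the edges recursively.
descendants <- function(term) {
  kids <- edges$child[edges$parent == term]
  unique(c(term, unlist(lapply(kids, descendants))))
}

# Structure-based IC: -log of the fraction of terms at or below each term.
n_desc <- vapply(terms, function(term) length(descendants(term)), integer(1))
ic <- -log(n_desc / length(terms))
ic
```

The root term gets an IC of 0 and leaf terms get the highest IC regardless of how deep their branch is, which is why IC can be a fairer specificity measure than raw ontology level.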

@bschilder

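# `hpo` (the HPO ontology object) is assumed to already be loaded in the session.
results <- MSTExplorer::load_example_results()
results <- HPOExplorer::add_hpo_name(results, hpo = hpo)
# Annotate each phenotype with its ontology level.
results <- HPOExplorer::add_ont_lvl(results)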

Ontology level vs. Genes, Cell Types (sig), and P-values

plot_ontology_levels_out <- MSTExplorer::plot_ontology_levels(
  results = results,
  ctd_list = ctd_list,
  x_vars = c("genes","cell types","p"),
  nrow = 1)

Image

Ontology level vs. Genes, Cell Types (sig), and P-values (sig)

plot_ontology_levels_out <- MSTExplorer::plot_ontology_levels(
  results = results,
  ctd_list = ctd_list,
  x_vars = c("genes","cell types","p"),
  sig_vars = c(FALSE, TRUE, TRUE),
  log_vars = c(FALSE, FALSE, FALSE),
  nrow = 1)

Image

Ontology level vs. Genes, Cell Types (sig), and P-values (logged)

When logging p-values, I replace p-values of exactly 0 to avoid infinite values.
All p-values == 0 are set to the smallest positive (normalised) double that R can represent:

> .Machine$double.xmin
[1] 2.225074e-308

plot_ontology_levels_out <- MSTExplorer::plot_ontology_levels(
  results = results,
  ctd_list = ctd_list,
  x_vars = c("genes","cell types","p"),
  sig_vars = c(FALSE, TRUE, FALSE),
  log_vars = c(FALSE, FALSE, TRUE),
  nrow = 1)

Image
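
The zero-replacement step described above can be sketched in base R as follows. The example p-values are made up, and this is only an illustration of the idea, not the code inside `plot_ontology_levels`.

```r
# Example p-values (made up); the first one is exactly 0.
p <- c(0, 1e-300, 0.01, 0.5)

# log10(0) is -Inf, so swap exact zeros for the smallest positive normalised double.
p_safe <- p
p_safe[p_safe == 0] <- .Machine$double.xmin

# -log10 is now finite for every value, so nothing is dropped from the plot.
neg_log_p <- -log10(p_safe)
all(is.finite(neg_log_p))
```

Note that the replaced zeros end up at roughly 308 on the -log10 scale, so they will sit far from the rest of the points and can stretch the axis.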

Ontology level vs. Genes, Cell Types (sig), and P-values (sig, logged)

plot_ontology_levels_out <- MSTExplorer::plot_ontology_levels(
  results = results,
  ctd_list = ctd_list,
  x_vars = c("genes","cell types","p"),
  sig_vars = c(FALSE, TRUE, TRUE),
  log_vars = c(FALSE, FALSE, TRUE),
  nrow = 1)

Conclusion

Plotting all non-logged p-values seems to be the most interpretable to me.

Image
