New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The taxon list table for the differential heat tree #317
Comments
That information is in the table returned by library(metacoder)
library(dplyr)
# Parse data for plotting
x = parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
class_regex = "^(.+)__(.+)$")
# Convert counts to proportions
x$data$otu_table <- calc_obs_props(x, data = "tax_data", cols = hmp_samples$sample_id)
#> Calculating proportions from counts for 50 columns for 1000 observations.
# Get per-taxon counts
x$data$tax_table <- calc_taxon_abund(x, data = "otu_table", cols = hmp_samples$sample_id)
#> Summing per-taxon counts from 50 columns for 174 taxa
# Calculate difference between groups
x$data$diff_table <- compare_groups(x, data = "tax_table",
cols = hmp_samples$sample_id,
groups = hmp_samples$body_site)
# Subset the table for p-value
signif_taxon_diff <- filter(x$data$diff_table, p.adjust(wilcox_p_value) < 0.05)
# Add taxon name
signif_taxon_diff$taxon_name <- taxon_names(x)[signif_taxon_diff$taxon_id]
signif_taxon_diff
#> # A tibble: 45 x 8
#> taxon_id treatment_1 treatment_2 log2_median_ratio median_diff mean_diff
#> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 ad Nose Saliva -5.42 -0.267 -0.262
#> 2 ae Nose Saliva 5.12 0.597 0.581
#> 3 al Nose Saliva -5.68 -0.248 -0.243
#> 4 am Nose Saliva 5.12 0.597 0.581
#> 5 ap Nose Saliva -5.82 -0.173 -0.196
#> 6 ay Nose Saliva -5.68 -0.248 -0.243
#> 7 az Nose Saliva 5.13 0.597 0.583
#> 8 bc Nose Saliva -5.82 -0.173 -0.196
#> 9 bf Nose Saliva -3.26 -0.154 -0.210
#> 10 bx Nose Saliva -6.86 -0.170 -0.186
#> # … with 35 more rows, and 2 more variables: wilcox_p_value <dbl>,
#> # taxon_name <chr> Created on 2021-08-23 by the reprex package (v2.0.0) |
Thank you very much for your help! Your script is very useful. |
No problem, glad it helped! |
Hola Zachary, I am trying to do something similar here - I am actually just trying to see which taxa overlap in both treatments so I figured I'd only use the median_diff =0 instead of wilcox_p_value column (? or is this complete nonsense?!). I tried running this code in my own data, but I had some errors. I thought of trying out the one here, but I still get an error that the wilcox_p_value column (?) is not found:
I tried changing the code thinking its a matter of specifying which column, but I still get an error (although a different one).
When I try this with my own data, using the code below, I do get some 'signif_taxon_diff' except its now a time series table populated with NA!
Sorry if this is all a bit convoluted for what I attempt to do - maybe there is even an easier way to just show which taxa are present in both treatments, but any help will be very much appreciated! |
Hola @DanielSoutoV, There is probably an easier way to do that, if I understand your goal correctly. Is something like this what you want? library(metacoder)
#> This is metacoder verison 0.3.5 (stable)
hmp_samples <- dplyr::ungroup(hmp_samples) # Needed until next package update
# Parse data from abundance matrix
x = parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
class_regex = "^(.+)__(.+)$")
# Calculate the taxon abundance for groups of columns (e.g. treatments)
x$data$taxon_abund <- calc_taxon_abund(x, "tax_data", cols = hmp_samples$sample_id, groups = hmp_samples$body_site)
#> Summing per-taxon counts from 50 columns in 5 groups for 174 taxa
# Check which taxa overlap
taxon_names(x)[x$data$taxon_abund$Nose > 0 & x$data$taxon_abund$Saliva > 0]
#> ab ac ad
#> "Root" "Proteobacteria" "Bacteroidetes"
#> ae af ag
#> "Actinobacteria" "Firmicutes" "Fusobacteria"
#> aj ak al
#> "Gammaproteobacteria" "Flavobacteria" "Bacteroidia"
#> am an ao
#> "Actinobacteria" "Betaproteobacteria" "Bacilli"
#> ap aq as
#> "Clostridia" "Fusobacteria" "Epsilonproteobacteria"
#> aw ax ay
#> "Pasteurellales" "Flavobacteriales" "Bacteroidales"
#> az ba bc
#> "Actinomycetales" "Neisseriales" "Clostridiales"
#> bd be bf
#> "Burkholderiales" "Fusobacteriales" "Lactobacillales"
#> bg bi bk
#> "Gemellales" "Enterobacteriales" "Coriobacteriales"
#> bl bq br
#> "Campylobacterales" "Pasteurellaceae" "Flavobacteriaceae"
#> bs bt bu
#> "Porphyromonadaceae" "Propionibacteriaceae" "Neisseriaceae"
#> bw bx bz
#> "Bacteroidaceae" "Veillonellaceae" "Burkholderiaceae"
#> ca cb cd
#> "Corynebacteriaceae" "Actinomycetaceae" "Fusobacteriaceae"
#> ce cf cg
#> "Streptococcaceae" "Lachnospiraceae" "Prevotellaceae"
#> ch ck cm
#> "Gemellaceae" "Enterobacteriaceae" "Coriobacteriaceae"
#> cn cp cr
#> "Ruminococcaceae" "Carnobacteriaceae" "Peptostreptococcaceae"
#> cx dm dn
#> "Micrococcaceae" "Haemophilus" "Capnocytophaga"
#> do dp dq
#> "Porphyromonas" "Propionibacterium" "Neisseria"
#> ds dt dv
#> "Bacteroides" "Selenomonas" "Lautropia"
#> dw dx dz
#> "Corynebacterium" "Actinomyces" "Veillonella"
#> ea eb ed
#> "Leptotrichia" "Streptococcus" "Prevotella"
#> ee ef eg
#> "Gemella" "Oribacterium" "Shuttleworthia"
#> em es et
#> "Fusobacterium" "Ruminococcus" "Granulicatella"
#> ey ez fb
#> "Dialister" "Peptostreptococcus" "Parabacteroides"
#> fg fi fk
#> "Roseburia" "Mitsuokella" "Catonella"
#> fm gi gl
#> "Rothia" "Tannerella" "Providencia" Created on 2022-02-02 by the reprex package (v2.0.1) |
Excellent Zachary thanks a lot. |
Regarding the differential heat tree, how I can print the taxon list table for the differential heat tree after Wilcoxon rank Sum test? The taxon list should include the more abundant taxons for both comparative groups.
The text was updated successfully, but these errors were encountered: