The taxon list table for the differential heat tree #317

biosciences · 2021-08-23T12:02:41Z

Regarding the differential heat tree, how I can print the taxon list table for the differential heat tree after Wilcoxon rank Sum test? The taxon list should include the more abundant taxons for both comparative groups.

zachary-foster · 2021-08-24T04:19:34Z

That information is in the table returned by compare_groups or calc_diff_abund_deseq2. You can subset it to just the significant results and add other information to it as needed. For example:

library(metacoder)
library(dplyr)

# Parse data for plotting
x = parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                   class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                   class_regex = "^(.+)__(.+)$")

# Convert counts to proportions
x$data$otu_table <- calc_obs_props(x, data = "tax_data", cols = hmp_samples$sample_id)
#> Calculating proportions from counts for 50 columns for 1000 observations.

# Get per-taxon counts
x$data$tax_table <- calc_taxon_abund(x, data = "otu_table", cols = hmp_samples$sample_id)
#> Summing per-taxon counts from 50 columns for 174 taxa

# Calculate difference between groups
x$data$diff_table <- compare_groups(x, data = "tax_table",
                                    cols = hmp_samples$sample_id,
                                    groups = hmp_samples$body_site)

# Subset the table for p-value
signif_taxon_diff <- filter(x$data$diff_table, p.adjust(wilcox_p_value) < 0.05)

# Add taxon name
signif_taxon_diff$taxon_name <- taxon_names(x)[signif_taxon_diff$taxon_id]
signif_taxon_diff
#> # A tibble: 45 x 8
#>    taxon_id treatment_1 treatment_2 log2_median_ratio median_diff mean_diff
#>    <chr>    <chr>       <chr>                   <dbl>       <dbl>     <dbl>
#>  1 ad       Nose        Saliva                  -5.42      -0.267    -0.262
#>  2 ae       Nose        Saliva                   5.12       0.597     0.581
#>  3 al       Nose        Saliva                  -5.68      -0.248    -0.243
#>  4 am       Nose        Saliva                   5.12       0.597     0.581
#>  5 ap       Nose        Saliva                  -5.82      -0.173    -0.196
#>  6 ay       Nose        Saliva                  -5.68      -0.248    -0.243
#>  7 az       Nose        Saliva                   5.13       0.597     0.583
#>  8 bc       Nose        Saliva                  -5.82      -0.173    -0.196
#>  9 bf       Nose        Saliva                  -3.26      -0.154    -0.210
#> 10 bx       Nose        Saliva                  -6.86      -0.170    -0.186
#> # … with 35 more rows, and 2 more variables: wilcox_p_value <dbl>,
#> #   taxon_name <chr>

^{Created on 2021-08-23 by the reprex package (v2.0.0)}

biosciences · 2021-08-29T02:09:18Z

Thank you very much for your help! Your script is very useful.

zachary-foster · 2021-08-30T05:48:36Z

No problem, glad it helped!

DanielSoutoV · 2022-02-01T20:49:33Z

Hola Zachary,

I am trying to do something similar here - I am actually just trying to see which taxa overlap in both treatments so I figured I'd only use the median_diff =0 instead of wilcox_p_value column (? or is this complete nonsense?!). I tried running this code in my own data, but I had some errors.

I thought of trying out the one here, but I still get an error that the wilcox_p_value column (?) is not found:

signif_taxon_diff <- filter(x$data$diff_table, p.adjust(wilcox_p_value) < 0.05)
Error in p.adjust(wilcox_p_value) : object 'wilcox_p_value' not found

I tried changing the code thinking its a matter of specifying which column, but I still get an error (although a different one).

signif_taxon_diff <- filter(x$data$diff_table, p.adjust(x$data$diff_table$wilcox_p_value) < 0.05)
Error in filter(x$data$diff_table, p.adjust(x$data$diff_table$wilcox_p_value) < :
missing values in 'filter'

When I try this with my own data, using the code below, I do get some 'signif_taxon_diff' except its now a time series table populated with NA!

mediandiff<- filter(obj$data$diff_table, obj$data$diff_table$median_diff == 0)
mediandiff
Time Series:
Start = 1
End = 283
Frequency = 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
1 NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA
7 NA NA NA NA NA NA NA
8 NA NA NA NA NA NA NA
9 NA NA NA NA NA NA NA
10 NA NA NA NA NA NA NA
.......
136 NA NA NA NA NA NA NA
137 NA NA NA NA NA NA NA
138 NA NA NA NA NA NA NA
139 NA NA NA NA NA NA NA
140 NA NA NA NA NA NA NA
141 NA NA NA NA NA NA NA
142 36521 268 268 NaN 3.5 -1.6754 58.81118
[ reached getOption("max.print") -- omitted 141 rows ]

Sorry if this is all a bit convoluted for what I attempt to do - maybe there is even an easier way to just show which taxa are present in both treatments, but any help will be very much appreciated!

zachary-foster · 2022-02-02T08:36:24Z

Hola @DanielSoutoV,

There is probably an easier way to do that, if I understand your goal correctly. Is something like this what you want?

library(metacoder)
#> This is metacoder verison 0.3.5 (stable)
hmp_samples <- dplyr::ungroup(hmp_samples) # Needed until next package update

# Parse data from abundance matrix
x = parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                   class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                   class_regex = "^(.+)__(.+)$")

# Calculate the taxon abundance for groups of columns (e.g. treatments)
x$data$taxon_abund <- calc_taxon_abund(x, "tax_data", cols = hmp_samples$sample_id, groups = hmp_samples$body_site)
#> Summing per-taxon counts from 50 columns in 5 groups for 174 taxa

# Check which taxa overlap
taxon_names(x)[x$data$taxon_abund$Nose > 0 & x$data$taxon_abund$Saliva > 0]
#>                      ab                      ac                      ad 
#>                  "Root"        "Proteobacteria"         "Bacteroidetes" 
#>                      ae                      af                      ag 
#>        "Actinobacteria"            "Firmicutes"          "Fusobacteria" 
#>                      aj                      ak                      al 
#>   "Gammaproteobacteria"         "Flavobacteria"           "Bacteroidia" 
#>                      am                      an                      ao 
#>        "Actinobacteria"    "Betaproteobacteria"               "Bacilli" 
#>                      ap                      aq                      as 
#>            "Clostridia"          "Fusobacteria" "Epsilonproteobacteria" 
#>                      aw                      ax                      ay 
#>        "Pasteurellales"      "Flavobacteriales"         "Bacteroidales" 
#>                      az                      ba                      bc 
#>       "Actinomycetales"          "Neisseriales"         "Clostridiales" 
#>                      bd                      be                      bf 
#>       "Burkholderiales"       "Fusobacteriales"       "Lactobacillales" 
#>                      bg                      bi                      bk 
#>            "Gemellales"     "Enterobacteriales"      "Coriobacteriales" 
#>                      bl                      bq                      br 
#>     "Campylobacterales"       "Pasteurellaceae"     "Flavobacteriaceae" 
#>                      bs                      bt                      bu 
#>    "Porphyromonadaceae"  "Propionibacteriaceae"         "Neisseriaceae" 
#>                      bw                      bx                      bz 
#>        "Bacteroidaceae"       "Veillonellaceae"      "Burkholderiaceae" 
#>                      ca                      cb                      cd 
#>    "Corynebacteriaceae"      "Actinomycetaceae"      "Fusobacteriaceae" 
#>                      ce                      cf                      cg 
#>      "Streptococcaceae"       "Lachnospiraceae"        "Prevotellaceae" 
#>                      ch                      ck                      cm 
#>           "Gemellaceae"    "Enterobacteriaceae"     "Coriobacteriaceae" 
#>                      cn                      cp                      cr 
#>       "Ruminococcaceae"     "Carnobacteriaceae" "Peptostreptococcaceae" 
#>                      cx                      dm                      dn 
#>        "Micrococcaceae"           "Haemophilus"        "Capnocytophaga" 
#>                      do                      dp                      dq 
#>         "Porphyromonas"     "Propionibacterium"             "Neisseria" 
#>                      ds                      dt                      dv 
#>           "Bacteroides"           "Selenomonas"             "Lautropia" 
#>                      dw                      dx                      dz 
#>       "Corynebacterium"           "Actinomyces"           "Veillonella" 
#>                      ea                      eb                      ed 
#>          "Leptotrichia"         "Streptococcus"            "Prevotella" 
#>                      ee                      ef                      eg 
#>               "Gemella"          "Oribacterium"        "Shuttleworthia" 
#>                      em                      es                      et 
#>         "Fusobacterium"          "Ruminococcus"        "Granulicatella" 
#>                      ey                      ez                      fb 
#>             "Dialister"    "Peptostreptococcus"       "Parabacteroides" 
#>                      fg                      fi                      fk 
#>             "Roseburia"           "Mitsuokella"             "Catonella" 
#>                      fm                      gi                      gl 
#>                "Rothia"            "Tannerella"           "Providencia"

^{Created on 2022-02-02 by the reprex package (v2.0.1)}

DanielSoutoV · 2022-02-02T14:07:54Z

Excellent Zachary thanks a lot.
Yes, definitely much easier this way.
Best,

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The taxon list table for the differential heat tree #317

The taxon list table for the differential heat tree #317

biosciences commented Aug 23, 2021

zachary-foster commented Aug 24, 2021

biosciences commented Aug 29, 2021

zachary-foster commented Aug 30, 2021

DanielSoutoV commented Feb 1, 2022

zachary-foster commented Feb 2, 2022

DanielSoutoV commented Feb 2, 2022

The taxon list table for the differential heat tree #317

The taxon list table for the differential heat tree #317

Comments

biosciences commented Aug 23, 2021

zachary-foster commented Aug 24, 2021

biosciences commented Aug 29, 2021

zachary-foster commented Aug 30, 2021

DanielSoutoV commented Feb 1, 2022

zachary-foster commented Feb 2, 2022

DanielSoutoV commented Feb 2, 2022