Error with parse_tax_data and installation issues #334

Open
zachary-foster opened this issue Mar 1, 2022 · 8 comments

@zachary-foster
Contributor

Transferred from ropensci/taxa#210 for @emankhalaf

I have a feature table with taxonomy collapsed to the genus level. The first column is the taxonomy (ranks separated by ;), and the remaining columns are sample IDs giving the read count of each feature. I need to split the taxonomy column into 6 taxonomic ranks using the parse_tax_data function.
I used this code:

obj <- parse_tax_data(feature-table-with-taxonomyl6,
                      class_cols = "taxonomy",
                      class_sep = ";",
                      class_regex = "^([a-z]{0,1})_{0,2}(.*)$",
                      class_key = c("tax_rank" = "taxon_rank", "name" = "taxon_name"))
print(obj)

then I got this error:

Error in parse_tax_data(feature - table - with - taxonomyl6, class_cols = "taxonomy",  :    could not find function "parse_tax_data"

However, I had already loaded the taxa package, although I ran into a problem when installing devtools.

Thanks!
Eman

@zachary-foster changed the title from "Failure when installing metacoder in RStudio in both windows and Ubuntu 20.04" to "Error with parse_tax_data and installation issues" on Mar 1, 2022
@zachary-foster
Contributor Author

Can you give me part of the input data so I can see how it is formatted?
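
A side note on the original error, offered as a hedged sketch rather than a confirmed diagnosis. "could not find function" means that no attached package exports parse_tax_data. The assumption here is that the function is provided by metacoder (this issue was transferred to the metacoder repository), so loading only taxa may not be enough. The echoed call `feature - table - with - taxonomyl6` also suggests that R reads the hyphenated object name as a subtraction; such names need backticks. The snippet uses base R only and does not require metacoder to be installed:

```r
# 1. Check whether the function is visible before calling it.
if (!exists("parse_tax_data", mode = "function")) {
  # Assumption: parse_tax_data is exported by metacoder.
  if (requireNamespace("metacoder", quietly = TRUE)) {
    library(metacoder)
  } else {
    message("metacoder is not installed, so parse_tax_data is unavailable")
  }
}

# 2. A hyphenated name is parsed as subtraction unless it is backtick-quoted.
parsed <- quote(feature-table-with-taxonomyl6)
class(parsed)   # a call to `-`, not a single object name

quoted <- quote(`feature-table-with-taxonomyl6`)
class(quoted)   # a single name (symbol)
```

Renaming the object to something like my_table (a hypothetical name) avoids the backticks altogether.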

> I need to split the taxonomy column into 6 taxonomic ranks

If you are just trying to split the taxonomy column into 6 per-rank columns and don't need other metacoder functions that require the taxmap objects produced by parse_tax_data, you can use:

library(tidyr)
# Backticks are needed because the object name contains hyphens
separate(`feature-table-with-taxonomyl6`, taxonomy, c("Kingdom", "Phylum", "Class", "Order", "etc..."), sep = ';')
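
For illustration, here is a minimal, self-contained sketch of the same tidyr::separate call on a toy table. The table contents, the object name `feature-table`, and the three-rank depth are made up for the example; the real data has six ranks and many more samples.

```r
library(tidyr)

# Toy stand-in for the real feature table (hypothetical values).
`feature-table` <- data.frame(
  taxonomy = c("d__Bacteria;p__Firmicutes;c__Bacilli",
               "d__Bacteria;p__Bacteroidota;c__Bacteroidia"),
  S1 = c(10, 0),
  S2 = c(3, 7)
)

# One output column per rank; the number of names should match the
# number of ";"-separated pieces in each taxonomy string.
ranks <- c("Kingdom", "Phylum", "Class")
split_table <- separate(`feature-table`, taxonomy, into = ranks, sep = ";")

names(split_table)
```

The taxonomy column is replaced by the per-rank columns, and the sample columns are carried over unchanged.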

@emankhalaf

emankhalaf commented Mar 1, 2022

I did the following:

my_table <- read_csv("file.csv", col_names = TRUE) #  readr function
GT <- separate(my_table, taxonomy, c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep = ";")
head(GT)

I got this error:

Error:
! Must extract column with a single valid subscript.
x Subscript `var` has the wrong type `function`.
ℹ It must be numeric or character.
Backtrace:
  1. tidyr::separate(...)
  2. tidyr:::separate.data.frame(...)
  3. tidyselect::vars_pull(names(data), !!enquo(col))
  4. tidyselect:::pull_as_location2(loc, n, vars)
 13. vctrs::vec_as_subscript2(i, arg = "var", logical = "error")
 14. vctrs:::result_get(...)
 Error: 
x Subscript `var` has the wrong type `function`.
ℹ It must be numeric or character.

Any recommendations?
Many thanks!

@zachary-foster
Contributor Author

What does the table look like?

@emankhalaf

It is a feature table with taxonomy, originally a txt file that I converted to csv.
The first row is the header: taxonomy, S1, S2, ...
The row names are the taxonomy (d_kingdom down to s_species), followed by the abundance/read count of each feature across samples. I can email the file to you if you do not mind!

Thank you!
Eman

@zachary-foster
Contributor Author

Yeah, it would be helpful if you emailed the file to me or attached it here.

zacharyfoster1989@gmail.com

@zachary-foster
Contributor Author

You also have a column at the end named taxonomy. Because two columns share the same name, readr::read_csv renames both of them (to taxonomy...1 and taxonomy...58), so a column called taxonomy no longer exists, which is why your code did not work. Note that readr::read_csv reports the renaming in the output below. Does this do what you wanted?

library(readr)
library(tidyr)
my_table <- read_csv("~/Downloads/feature-table-with-taxonomyl6.csv", col_names = TRUE) #  readr function
#> New names:
#> * taxonomy -> taxonomy...1
#> * taxonomy -> taxonomy...58
#> Rows: 308 Columns: 58
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (1): taxonomy...1
#> dbl (56): 1P-GH-R1, 1P-GH-R2, P1, P10, P11, P12b, P13, P14b, P15, P16, P17, ...
#> lgl  (1): taxonomy...58
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
GT <- separate(my_table, "taxonomy...1", c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus"), sep = ";") # No species rank in data
GT # Don't need head() for tibbles
#> # A tibble: 308 × 63
#>    Kingdom     Phylum Class Order Family Genus `1P-GH-R1` `1P-GH-R2`    P1   P10
#>    <chr>       <chr>  <chr> <chr> <chr>  <chr>      <dbl>      <dbl> <dbl> <dbl>
#>  1 d__Bacteria p__Pr… c__G… o__E… f__Er… g__P…          0          0  8276  6048
#>  2 d__Bacteria __     __    __    __     __             0          0     0     2
#>  3 d__Bacteria p__Ch… c__D… o__S… f__S0… g__S…          0          0     0    53
#>  4 d__Bacteria p__Ba… c__B… o__S… f__Sp… g__S…          0          0     0     0
#>  5 d__Bacteria p__Fi… c__B… o__B… f__Ba… g__B…          0          0     0  1283
#>  6 d__Bacteria p__Ba… c__B… o__C… __     __             0          0     0     0
#>  7 d__Bacteria p__Fi… c__S… o__S… f__Sy… g__C…          0          0     0     0
#>  8 d__Bacteria p__Ba… c__B… o__C… f__Cy… g__S…          0          0    26    11
#>  9 d__Bacteria p__Pr… c__G… __    __     __             0          0     0     0
#> 10 d__Bacteria p__Ba… c__B… o__F… f__We… g__C…          0          0    34    32
#> # … with 298 more rows, and 53 more variables: P11 <dbl>, P12b <dbl>,
#> #   P13 <dbl>, P14b <dbl>, P15 <dbl>, P16 <dbl>, P17 <dbl>, P19 <dbl>,
#> #   P2 <dbl>, P20 <dbl>, P21 <dbl>, P22 <dbl>, P23 <dbl>, P24 <dbl>, P25 <dbl>,
#> #   P26 <dbl>, P27 <dbl>, P28 <dbl>, P29 <dbl>, P31 <dbl>, P32 <dbl>,
#> #   P33 <dbl>, P34b <dbl>, P35 <dbl>, P36b <dbl>, P37 <dbl>, P38 <dbl>,
#> #   P39b <dbl>, P40b <dbl>, P41 <dbl>, P42 <dbl>, P43 <dbl>, P44 <dbl>,
#> #   P45 <dbl>, P46 <dbl>, P47 <dbl>, P48 <dbl>, P49 <dbl>, P4b <dbl>, …

Created on 2022-03-02 by the reprex package (v2.0.1)
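
If you would rather keep the original column name than refer to taxonomy...1, one possible approach is to drop the empty duplicate column and rename the remaining one before splitting. This is only a sketch on made-up inline data: the CSV text, the two-rank depth, and the object names raw, clean, and GT_toy are all hypothetical, and it assumes the trailing taxonomy column is entirely empty, as the lgl type in the output above suggests.

```r
library(readr)
library(tidyr)

# Inline toy CSV with a duplicated, empty "taxonomy" column at the end.
csv_text <- paste(
  "taxonomy,S1,S2,taxonomy",
  "d__Bacteria;p__Firmicutes,10,3,",
  "d__Bacteria;p__Bacteroidota,0,7,",
  sep = "\n"
)

# read_csv makes duplicated names unique by appending the column position.
raw <- read_csv(I(csv_text), show_col_types = FALSE)
names(raw)

# Drop columns that contain only missing values, then restore the name.
keep <- !vapply(raw, function(x) all(is.na(x)), logical(1))
clean <- raw[, keep]
names(clean)[names(clean) == "taxonomy...1"] <- "taxonomy"

GT_toy <- separate(clean, taxonomy, c("Kingdom", "Phylum"), sep = ";")
names(GT_toy)
```

Deleting the extra column in the source file achieves the same result without any code.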

@emankhalaf

@zachary-foster
Thank you so much! Now it works. I exported the file as tsv and deleted the extra taxonomy column.

@zachary-foster
Contributor Author

No problem! Glad it's working.
