how to get a data frame from split_sheet? #10

Open
benmarwick opened this issue Feb 21, 2017 · 1 comment

benmarwick commented Feb 21, 2017

I'd like to get the output of jailbreakr::split_sheet() as a data frame. How can I do that?

Here's what I've tried:

mg <- rexcel::rexcel_read("mini-gap.xlsx")
mg_split_sheet <- jailbreakr::split_sheet(mg)
thing <- mg_split_sheet[[1]]$values()
thing
#>      [,1]           [,2]        [,3]   [,4]      [,5]    [,6]       
#> [1,] "country"      "continent" "year" "lifeExp" "pop"   "gdpPercap"
#> [2,] "Algeria"      "Africa"    1952   43.077    9279525 2449.008   
#> [3,] "Angola"       "Africa"    1952   30.015    4232095 3520.61    
#> [4,] "Benin"        "Africa"    1952   38.223    1738315 1062.752   
#> [5,] "Botswana"     "Africa"    1952   47.622    442308  851.2411   
#> [6,] "Burkina Faso" "Africa"    1952   31.975    4469979 543.2552

It prints like a matrix or array, but str() reports a list (is that normal for an array? I don't work with them much):

str(thing)
#> List of 36
#>  $ : chr "country"
#>  $ : chr "Algeria"
#>  $ : chr "Angola"
#>  $ : chr "Benin"
#>  $ : chr "Botswana"
#>  $ : chr "Burkina Faso"
#>  $ : chr "continent"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "Africa"
#>  $ : chr "year"
#>  $ : num 1952
#>  $ : num 1952
#>  $ : num 1952
#>  $ : num 1952
#>  $ : num 1952
#>  $ : chr "lifeExp"
#>  $ : num 43.1
#>  $ : num 30
#>  $ : num 38.2
#>  $ : num 47.6
#>  $ : num 32
#>  $ : chr "pop"
#>  $ : num 9279525
#>  $ : num 4232095
#>  $ : num 1738315
#>  $ : num 442308
#>  $ : num 4469979
#>  $ : chr "gdpPercap"
#>  $ : num 2449
#>  $ : num 3521
#>  $ : num 1063
#>  $ : num 851
#>  $ : num 543
#>  - attr(*, "dim")= Named int [1:2] 6 6
#>   ..- attr(*, "names")= chr [1:2] "row" "col"
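
On the parenthetical question: an object that prints like a matrix while str() reports a list is usually called a list-matrix in R, that is, a list carrying a dim attribute. The sketch below rebuilds a small one by hand to show that both is.list() and is.matrix() hold for it. The names `cells` and `m` and the values are illustrative assumptions, not taken from linen or jailbreakr.

```r
# Hypothetical stand-in for the object returned by $values():
# a list of mixed-type cells, filled column by column.
cells <- list("country", "Algeria", "Angola",
              "year", 1952, 1952)
m <- array(cells, dim = c(3, 2))

is.list(m)        # TRUE
is.matrix(m)      # TRUE
typeof(m)         # "list"
class(m[[2, 2]])  # "numeric": each cell keeps its own type
```

Because every cell is a separate list element, a column can mix strings and numbers, which is why the header row and the data can sit in the same object.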

I see in https://github.com/rsheets/linen/blob/master/R/export.R that the values element is created by array(lapply(...), ...), which is why I posted here rather than at the jailbreakr repo.

I tried as.data.frame.array, but that gives a data frame of lists:

thing_asdfa <- as.data.frame.array(thing)

str(thing_asdfa)
#> 'data.frame':    6 obs. of  6 variables:
#>  $ V1:List of 6
#>   ..$ : chr "country"
#>   ..$ : chr "Algeria"
#>   ..$ : chr "Angola"
#>   ..$ : chr "Benin"
#>   ..$ : chr "Botswana"
#>   ..$ : chr "Burkina Faso"
#>  $ V2:List of 6
#>   ..$ : chr "continent"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>   ..$ : chr "Africa"
#>  $ V3:List of 6
#>   ..$ : chr "year"
#>   ..$ : num 1952
#>   ..$ : num 1952
#>   ..$ : num 1952
#>   ..$ : num 1952
#>   ..$ : num 1952
#>  $ V4:List of 6
#>   ..$ : chr "lifeExp"
#>   ..$ : num 43.1
#>   ..$ : num 30
#>   ..$ : num 38.2
#>   ..$ : num 47.6
#>   ..$ : num 32
#>  $ V5:List of 6
#>   ..$ : chr "pop"
#>   ..$ : num 9279525
#>   ..$ : num 4232095
#>   ..$ : num 1738315
#>   ..$ : num 442308
#>   ..$ : num 4469979
#>  $ V6:List of 6
#>   ..$ : chr "gdpPercap"
#>   ..$ : num 2449
#>   ..$ : num 3521
#>   ..$ : num 1063
#>   ..$ : num 851
#>   ..$ : num 543

And data frame-style indexing gives unexpected results:

thing[1,]
#> [[1]]
#> [1] "country"
#> 
#> [[2]]
#> [1] "continent"
#> 
#> [[3]]
#> [1] "year"
#> 
#> [[4]]
#> [1] "lifeExp"
#> 
#> [[5]]
#> [1] "pop"
#> 
#> [[6]]
#> [1] "gdpPercap"
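
The list output above is consistent with how single-bracket indexing behaves on a list-matrix: `[` returns a list of cells, while `[[` or unlist() unwraps them. A minimal sketch on a hand-built stand-in (the object `m` and its values are assumptions for illustration, not the real sheet):

```r
# Hypothetical list-matrix with a header row, filled column by column.
m <- array(list("country", "Algeria", "Angola",
                "year", 1952, 1952),
           dim = c(3, 2))

header <- unlist(m[1, ])    # character vector built from the first row
cell   <- m[[2, 1]]         # a single cell, unwrapped from the list
years  <- unlist(m[-1, 2])  # numeric once the header row is dropped
mixed  <- unlist(m[, 2])    # coerced to character because of "year"
```

Dropping the header row before calling unlist() matters: with the header cell included, the whole column is coerced to character.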

I've also tried reshape2::melt and plyr::adply, but had no luck with those.

What do you recommend for getting this into a basic data frame?
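
One possible approach, sketched in base R under stated assumptions: treat the first row as column names, then unlist each remaining column so that it collapses to a single atomic type. The helper `list_matrix_to_df()` is hypothetical and not part of jailbreakr or linen. It assumes the first row really is a header and that there are no empty (NULL) cells, since unlist() would silently drop those and leave columns of unequal length.

```r
# Hypothetical helper (not part of jailbreakr or linen): convert a
# list-matrix whose first row holds column names into a data frame.
list_matrix_to_df <- function(m) {
  header <- vapply(m[1, ], as.character, character(1))
  body <- m[-1, , drop = FALSE]
  cols <- lapply(seq_len(ncol(body)), function(j) {
    # unlist() collapses the cells of one column to their common type
    unlist(body[, j], use.names = FALSE)
  })
  names(cols) <- header
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Small hand-built stand-in for the output of $values()
m <- array(list("country", "Algeria", "Angola",
                "year", 1952, 1952,
                "lifeExp", 43.077, 30.015),
           dim = c(3, 3))
df <- list_matrix_to_df(m)
str(df)
```

If those assumptions hold for the sheet in question, the same call on mg_split_sheet[[1]]$values() should give one character or numeric column per spreadsheet column. stringsAsFactors = FALSE is passed explicitly because R 3.3.2 would otherwise convert the character columns to factors.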

Session info
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.2 (2016-10-31)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_Australia.1252      
#>  tz       America/Los_Angeles         
#>  date     2017-02-20
#> Packages ------------------------------------------------------------------
#>  package    * version    date       source                             
#>  assertthat   0.1        2013-12-06 CRAN (R 3.2.2)                     
#>  backports    1.0.5      2017-01-18 CRAN (R 3.3.2)                     
#>  cellranger   1.1.0.9000 2017-02-20 Github (rsheets/cellranger@024d5ba)
#>  devtools     1.12.0     2016-06-24 CRAN (R 3.3.2)                     
#>  digest       0.6.12     2017-01-27 CRAN (R 3.3.2)                     
#>  evaluate     0.10       2016-10-11 CRAN (R 3.3.1)                     
#>  htmltools    0.3.5      2016-03-21 CRAN (R 3.2.4)                     
#>  jailbreakr   0.0.1      2017-02-20 Github (rsheets/jailbreakr@2fbec5f)
#>  knitr        1.15.1     2016-11-22 CRAN (R 3.3.2)                     
#>  lazyeval     0.2.0.9000 2016-11-07 Github (hadley/lazyeval@c155c3d)   
#>  linen        0.0.4      2017-02-20 Github (rsheets/linen@7618a13)     
#>  magrittr     1.5        2014-11-22 CRAN (R 3.3.1)                     
#>  memoise      1.0.0      2016-01-29 CRAN (R 3.2.5)                     
#>  R6           2.2.0      2016-10-05 CRAN (R 3.3.1)                     
#>  Rcpp         0.12.9     2017-01-14 CRAN (R 3.3.2)                     
#>  rexcel       0.0.1      2017-02-20 Github (rsheets/rexcel@e8dd5d3)    
#>  rmarkdown    1.3.1      2017-02-16 Github (rstudio/rmarkdown@e672d41) 
#>  rprojroot    1.2        2017-01-16 CRAN (R 3.3.2)                     
#>  stringi      1.1.2      2016-10-01 CRAN (R 3.3.1)                     
#>  stringr      1.1.0      2016-08-19 CRAN (R 3.3.1)                     
#>  tibble       1.2        2016-08-26 CRAN (R 3.3.1)                     
#>  withr        1.0.2      2016-06-20 CRAN (R 3.3.0)                     
#>  xml2         1.1.1      2017-02-20 Github (hadley/xml2@c84db5e)       
#>  yaml         2.1.14     2016-11-12 CRAN (R 3.3.2)

P.S. Thanks for this project, and for the reprex package, which makes it easier to post code and output!

jennybc (Member) commented Feb 22, 2017

This is a question for @richfitz I think ...
