rnoaa::lcd() Error in safe_read_csv() #405

Open
reblake opened this issue Dec 10, 2021 · 6 comments

Comments

@reblake

reblake commented Dec 10, 2021

Issue: Receiving this error.
Error in safe_read_csv(path, col_types = col_types) :
Attempt to override column 34 <> of inherent type 'string' down to 'float64' ignored. Only overrides to a higher type are currently supported. If this was intended, please coerce to the lower type afterwards.

R code that generates the error:

library(rnoaa)
df <- rnoaa::lcd(station = "70341025507", year = 2002)  # year doesn't seem to influence the error 

What used to happen a month ago in early Nov 2021:
The above line of code would download the specified data without errors.

Expected use:
I used to be able to use the above code in a custom function I've written to download and clean data from this station for the years 1999 to 2021. I expect to be able to keep using the above code in my custom function, as it worked in Nov 2021.

Session Info:
R version 4.1.1 (2021-08-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.5 (Ootpa)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rnoaa_1.3.8 lubridate_1.8.0 stringr_1.4.0 tidyr_1.1.4 rvest_1.0.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 pillar_1.6.4 compiler_4.1.1 tools_4.1.1 digest_0.6.29
[6] jsonlite_1.7.2 lifecycle_1.0.1 tibble_3.1.6 gtable_0.3.0 pkgconfig_2.0.3
[11] rlang_0.4.12 DBI_1.1.1 cli_3.1.0 crul_1.2.0 curl_4.3.2
[16] gridExtra_2.3 httr_1.4.2 dplyr_1.0.7 xml2_1.3.3 generics_0.1.1
[21] vctrs_0.3.8 rappdirs_0.3.3 triebeard_0.3.0 grid_4.1.1 tidyselect_1.1.1
[26] data.table_1.14.2 glue_1.5.1 httpcode_0.3.0 R6_2.5.1 fansi_0.5.0
[31] XML_3.99-0.8 ggplot2_3.3.5 purrr_0.3.4 hoardr_0.5.2 magrittr_2.0.1
[36] urltools_1.7.3 scales_1.1.1 ellipsis_0.3.2 assertthat_0.2.1 colorspace_2.0-2
[41] utf8_1.2.2 stringi_1.7.6 munsell_0.5.0 crayon_1.4.2

@reblake
Author

reblake commented Dec 20, 2021

Anyone have ideas on what is causing this problem? Some investigation into how to solve this would really help! Thanks!

sckott added a commit to sckott/rnoaa that referenced this issue Dec 20, 2021
essentially allow warnings that still result in the data being read to not stop lcd
@sckott
Contributor

sckott commented Dec 20, 2021

Previous maintainer here. @djhocking, perhaps this will work: remotes::install_github("sckott/rnoaa@lcd-allow-warnings") (sckott@89e004d?w=1). There are some warnings that the helper function currently catches, which stops the file read from happening. We could just let those through and only stop on errors.
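To illustrate the difference between stopping on warnings and stopping only on errors, here is a minimal base R sketch. It is not the actual rnoaa code: noisy_read, strict_read and lenient_read are hypothetical names, and noisy_read merely stands in for a file read that warns but still returns data.

```r
# Hypothetical reader: emits a warning but still returns the data,
# standing in for the real file read inside lcd().
noisy_read <- function() {
  warning("column type override ignored")
  data.frame(x = c("1", "0s"), stringsAsFactors = FALSE)
}

# Strict wrapper: any warning is turned into an error, so no data comes back.
strict_read <- function(reader) {
  tryCatch(
    reader(),
    warning = function(w) stop(conditionMessage(w), call. = FALSE)
  )
}

# Lenient wrapper: only errors are intercepted; warnings pass through
# to the caller and the data is still returned.
lenient_read <- function(reader) {
  tryCatch(
    reader(),
    error = function(e) stop("read failed: ", conditionMessage(e), call. = FALSE)
  )
}

lenient <- suppressWarnings(lenient_read(noisy_read))  # a data frame
strict <- try(strict_read(noisy_read), silent = TRUE)  # a "try-error"
```

With the lenient wrapper the caller still sees the warning (suppressed here only to keep the example quiet) and can decide what to do with the affected columns.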

@djhocking
Collaborator

The issue seems to be with using colClasses in data.table::fread when "numeric" is requested but the column contains non-numeric characters. In dailycoolingdegreedays there is a value of "0s". fread doesn't convert such strings to NA; it ignores the override and keeps the column as character.

I implemented the fix proposed by @sckott, and it works: the data are brought in as characters, with a warning. For now, if you install from GitHub it should work to read in the data. If you use those columns, they'd have to be cleaned up afterwards.

I'm not sure about this as a long term solution. Alternatively, it could be read in as characters and then converted within the function.
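For anyone who wants to see the behaviour in isolation, here is a small sketch that should reproduce it. It assumes the data.table package is installed, and the two-row CSV is made-up sample data (only the "0s" value and the column name come from this thread).

```r
library(data.table)

# Made-up sample data: one value carries a trailing flag letter.
csv_text <- "station,dailycoolingdegreedays\nA,0\nA,0s\n"

# Ask for numeric on a column that contains "0s". fread is expected to warn
# that the override is ignored and to keep the column as character.
warn_msg <- NULL
dt_num <- withCallingHandlers(
  fread(text = csv_text, colClasses = c(dailycoolingdegreedays = "numeric")),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)  # keep the message for inspection
    invokeRestart("muffleWarning")
  }
)
class(dt_num$dailycoolingdegreedays)

# Asking for character explicitly reads the same values without a warning.
dt_chr <- fread(text = csv_text, colClasses = c(dailycoolingdegreedays = "character"))
class(dt_chr$dailycoolingdegreedays)
```

In both cases the column comes back as character, so any numeric conversion has to happen after the read.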

@reblake
Author

reblake commented Jan 4, 2022

Thanks for taking a look @djhocking! The re-installed package from GitHub works for me because I only need the hourlydrybulbtemperatureF data column. Any idea what changed from Nov 2021 to Dec 2021 to cause this problem? The non-numeric characters were in the data before, as they are used for flags for certain data conditions.

@djhocking
Collaborator

I think it was a change in the data.table package's fread function, but I'm not positive. That is what was causing the error, but it could also have been a change in our internal function that calls fread, so that the message is passed on as an error rather than a warning. fread doesn't convert strings to numeric if some of the values contain non-numeric characters. The conversion can be done with as.numeric after reading the values in as strings with data.table::fread. I think the philosophy with readr is to avoid making decisions (converting non-numeric strings to NA) when converting "to a lower" type. That way, things get read in as they are stored and the user can then make explicit decisions. The fix we implemented lets these through with a warning rather than throwing an error.

Changing the input to the col_types argument is another way around this: tell fread to read the column in as character and then convert it afterwards.
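The "convert afterwards" step could look something like the sketch below. flagged_to_numeric is a hypothetical helper, not part of rnoaa, and whether a flagged value such as "0s" should be kept as 0 or turned into NA is a decision the user has to make explicitly.

```r
# Hypothetical helper: convert a character column that may carry trailing
# flag letters (such as "0s") to numeric.
flagged_to_numeric <- function(x, keep_flagged = TRUE) {
  x <- trimws(x)
  if (keep_flagged) {
    # Drop trailing letters so that "0s" is read as 0.
    x <- sub("[A-Za-z]+$", "", x)
  }
  # Anything that still isn't a number becomes NA.
  suppressWarnings(as.numeric(x))
}

raw <- c("0", "0s", "12.5")
flagged_to_numeric(raw)                        # flagged value kept as a number
flagged_to_numeric(raw, keep_flagged = FALSE)  # flagged value becomes NA
```

Applied to a column that was read in as character, this keeps the original strings available for inspecting the flags while giving a numeric column to work with.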

Glad it's working for you currently.

@steeleb

steeleb commented Jul 13, 2022

Further update here: this is still an issue in the CRAN version, but it is fixed in the current main branch (install with remotes::install_github("ropensci/rnoaa"); just make sure you uninstall the CRAN version and restart your R session first). I tried for some time to coerce columns to character using the col_types argument of the lcd function, but couldn't get it to work.
