Timeout error when attempting to retrieve data #685

Open
danfished opened this issue Dec 19, 2023 · 5 comments
@danfished

I get a curl timeout error when attempting to download data using the example code from the USGS page, shown below:

library("dataRetrieval")

siteNo <- "01540500"
pCode <- "00060"
start.date <- "2022-08-01"
end.date <- "2022-09-30"

danville <- readNWISuv(siteNumbers = siteNo,
                       parameterCd = pCode,
                       startDate = start.date,
                       endDate = end.date)

Error message:

"Error in curl::curl_fetch_memory(url, handle = handle): Timeout was reached: [nwis.waterservices.usgs.gov] Failed to connect to nwis.waterservices.usgs.gov port 443 after 7519 ms: Timed out
Request failed [ERROR]. Retrying in 1 seconds..."

It usually tries two more times before the final timeout.
I have tried increasing the timeout limit, but it always times out at around 7000 ms.

I am attempting to use dataRetrieval on a state network, on a state-issued computer. I have reached out to our IT department, but their only suggestion was to update R and RStudio. That opens up another issue that I and others have run into with updates: without the proper combination of the two, our network won't download packages properly either. We also tried updating the .Renviron file with the following, but it didn't seem to change anything:

http_proxy=http://proxy.state.gov/
http_proxy_user=user:pw

http_proxy=http://waterservices.usgs.gov/
http_proxy_user=user:pw

I have spoken with people at other offices who are having the same issue, so it clearly appears to be a problem with our network/firewall/proxy settings. If someone could provide any insight for my simple brain to pass on to IT, it would be greatly appreciated.
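
One detail that may be worth checking in the .Renviron attempt above: the failing request goes to port 443 (HTTPS), and libcurl consults the https_proxy variable, not http_proxy, for HTTPS URLs. The value also has to be the proxy server's own address, not the waterservices URL. A minimal sketch of setting both variables for the current session, assuming the proxy accepts a plain host and port; set_proxy_env, the host name, and the port are placeholders invented for illustration and would need to come from IT:

```r
# Hypothetical helper: point both HTTP and HTTPS traffic at one proxy.
# The host and port used below are placeholders, not known-good values.
set_proxy_env <- function(host, port) {
  proxy_url <- sprintf("http://%s:%d", host, as.integer(port))
  # libcurl reads https_proxy for https:// URLs and http_proxy for http:// URLs
  Sys.setenv(http_proxy = proxy_url, https_proxy = proxy_url)
  invisible(proxy_url)
}

set_proxy_env("proxy.example.org", 8080)

# Confirm what the current R session now sees
Sys.getenv(c("http_proxy", "https_proxy"))
```

If this turns out to help, the equivalent NAME=value lines could go into .Renviron so they apply at every startup.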

Session info:

R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
system code page: 65001

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dataRetrieval_2.7.14

loaded via a namespace (and not attached):
[1] Rcpp_1.0.10 rstudioapi_0.15.0 magrittr_2.0.3 units_0.8-1 tidyselect_1.2.0 R6_2.5.1
[7] rlang_1.1.0 fansi_1.0.4 httr_1.4.6 dplyr_1.1.2 tools_4.1.2 grid_4.1.2
[13] KernSmooth_2.23-20 utf8_1.2.3 cli_3.6.1 e1071_1.7-13 DBI_1.1.3 class_7.3-21
[19] tibble_3.2.1 lifecycle_1.0.3 sf_1.0-12 vctrs_0.6.1 curl_5.0.0 glue_1.6.2
[25] proxy_0.4-27 compiler_4.1.2 pillar_1.9.0 generics_0.1.3 classInt_0.4-9 pkgconfig_2.0.3

@lstanish-usgs
Contributor

Hello @danfished and thanks for the question. Doing some digging, but just an FYI that many of our resident experts are away for the holidays so it will take a bit longer than normal.

@lstanish-usgs
Contributor

No guarantees that this will solve the problem, but the URL for waterservices should be https://waterservices.usgs.gov/

@ldecicco-USGS
Collaborator

(Copying this from #270 ):

You'll have to set up the R client to use the proxy. It should hopefully be fairly straightforward. I don't have a proxy to test this against, but...

This should hopefully get you your proxy info. If not, the config script should.

curl::ie_proxy_info()

Then you need to set up httr to use that proxy. There is a function, use_proxy, whose result needs to be passed to set_config.

library(httr)
set_config(use_proxy(url="abc.com",port=8080, username="username", password="password"))

Note: username and password may be optional.

That should hopefully fix your issue, but keep in mind that it only lasts for the life of your R session. You'd need to re-run it when you restart R, or put it into an .Rprofile file so it runs every time on startup (.Renviron only holds NAME=value environment variables, not R code).

Let me know if that doesn't solve the problem.

@danfished
Author

@ldecicco-USGS Thanks for your reply. I was able to verify I'm using the correct proxy (though I'm still not 100% sure that 8080 is the correct port):

curl::ie_proxy_info()
$AutoDetect
[1] FALSE

$AutoConfigUrl
[1] "http://o365proxy.pa.gov"

$Proxy
NULL

$ProxyBypass
NULL

But unfortunately I'm still getting a similar timeout error after using set_config, both with and without username/pw.

@ldecicco-USGS
Collaborator

So what exactly (OK, not exactly... don't paste in a password) do you have written in your .Renviron or .Rprofile? Is it:

library(httr)
set_config(use_proxy(url="o365proxy.pa.gov",
                     port=8080, 
                     username="user:pw"))

(Above, it sounded like you might be putting the waterservices URL into the use_proxy function; that would not be correct.)

Assuming that looks good, do you know if you use a PAC file? What do you get when you run:
curl::ie_get_proxy_for_url("https://waterservices.usgs.gov")
I ask because a PAC file might call for a slightly different way of dealing with proxies:
https://www.opencpu.org/posts/curl-release-0-9-2/
https://stackoverflow.com/questions/33538695/how-to-tell-r-to-use-proxy-auto-config-script-pac-in-windows
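
If ie_get_proxy_for_url() does return a proxy, its result could in principle be fed straight into httr. A rough sketch, assuming the call returns a single "host:port" string with no scheme prefix (it may instead return NULL, or something in a different shape on your network); parse_proxy is a hypothetical helper written for this example, not part of curl or httr:

```r
# Hypothetical helper: split a "host:port" string into its parts.
# Assumes a single proxy entry with no "http://" prefix; port 80 is only
# a placeholder default for when no port is present.
parse_proxy <- function(proxy) {
  parts <- strsplit(proxy, ":", fixed = TRUE)[[1]]
  list(
    host = parts[1],
    port = if (length(parts) >= 2) as.integer(parts[2]) else 80L
  )
}

# ie_get_proxy_for_url() is only available on Windows builds of curl.
if (.Platform$OS.type == "windows" &&
    requireNamespace("curl", quietly = TRUE) &&
    requireNamespace("httr", quietly = TRUE)) {
  proxy <- curl::ie_get_proxy_for_url("https://waterservices.usgs.gov")
  if (is.character(proxy) && length(proxy) == 1) {
    p <- parse_proxy(proxy)
    httr::set_config(httr::use_proxy(url = p$host, port = p$port))
  }
}
```

This has not been tested against a real PAC setup, so treat it as a starting point for experimenting only.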

We can also set options in httr like this:

library(httr)
set_config(verbose())
set_config(progress())
daily <- readNWISdv("05427718", "00060")

I'm not sure if the "verbose" output would tell us any more information, but it's worth a shot.
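
To compare attempts with and without proxy settings, it may also help to take dataRetrieval out of the picture and make one plain request. A small sketch, assuming httr is installed; check_connection is a hypothetical helper and the 10-second timeout is an arbitrary choice:

```r
library(httr)

# Hypothetical helper: try one request and report what happened.
# Extra httr config (for example use_proxy(...)) can be passed through `...`.
check_connection <- function(url, ...) {
  tryCatch({
    resp <- GET(url, timeout(10), ...)
    paste("HTTP status", status_code(resp))
  }, error = function(e) {
    # Return the underlying curl error text instead of stopping
    paste("Failed:", conditionMessage(e))
  })
}

check_connection("https://waterservices.usgs.gov")
```

Running it once as-is and once with a use_proxy(...) argument would show whether the proxy setting changes the outcome at all.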

Another thing to try is some other examples, to make sure they are working. Can you get all of these lines to work (using your proxy information)?
https://gist.github.com/jeroen/5127c288f8914bdb20be

Here are some links I've been looking at. It's always tricky to help with proxy questions because we don't have a proxy to work with.
https://stackoverflow.com/questions/6467277/proxy-setting-for-r
https://stackoverflow.com/questions/4832560/how-do-i-tell-the-r-interpreter-how-to-use-the-proxy-server
jeroen/curl#1
