http_400 Invalid argument when gcs_list_objects() returns exactly 1000 rows #179

Open
lisovyk opened this issue Jul 31, 2023 · 8 comments

Comments

@lisovyk

lisovyk commented Jul 31, 2023

Today my Shiny app started returning the following error when calling gcs_list_objects() on one of my buckets:

> gcs_list_objects('my-images')
ℹ 2023-07-31 17:40:24 > Request Status Code:  400
Error in `abort_http()`:
! http_400 Invalid argument.
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace()
<error/http_400>
Error in `abort_http()`:
! http_400 Invalid argument.
---
Backtrace:
    ▆
 1. └─googleCloudStorageR::gcs_list_objects("my-images")
 2.   └─googleAuthR::gar_api_page(...)
 3.     └─googleAuthR (local) f(pars_arguments = l)
 4.       └─googleAuthR:::doHttrRequest(...)
 5.         └─googleAuthR:::retryRequest(...)
 6.           └─googleAuthR:::abort_http(status_code, error)

I'm using the latest version. Please help me debug it. Is the problem on my side, or has the Google API changed? I have not found any information about API changes.

@lisovyk
Author

lisovyk commented Jul 31, 2023

Further debugging leads me to think the problem is with pagination: it arose when the bucket reached 1000 entries, which is the point where pagination starts to matter.

The page_f parameter of gar_api_page() is set to page_f = function(x) attr(x, "nextPageToken"). Renaming nextPageToken to anything else removes the error, but then pagination does not work: only 1000 entries are returned.

I added another item to the bucket manually, so it now has 1001 entries, and the problem disappeared! I guess now I'm waiting until we get 2000 items in the bucket :)
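To make the boundary case concrete, here is a minimal base R sketch of a token-driven paging loop. It is not the code used by googleCloudStorageR or googleAuthR: fake_list_page(), has_next_page() and list_all() are hypothetical names, and the "server" is an in-memory stand-in. It only illustrates one unconfirmed hypothesis, namely that a full last page combined with a missing or empty token could trip a pager that does not treat such a token as the end of the listing.

```r
# Hypothetical in-memory stand-in for a paged listing endpoint.
# This is NOT the real GCS API; the token is simply the next start index.
fake_list_page <- function(all_names, page_token = NULL, page_size = 1000L) {
  start <- if (is.null(page_token)) 1L else as.integer(page_token)
  end <- min(start + page_size - 1L, length(all_names))
  items <- if (start <= length(all_names)) all_names[start:end] else character(0)
  # A token is only issued when more objects remain after this page.
  next_token <- if (end < length(all_names)) as.character(end + 1L) else NULL
  list(items = items, nextPageToken = next_token)
}

# Treat a missing, NA, or empty token as "no more pages".
has_next_page <- function(token) {
  !is.null(token) && length(token) == 1L && !is.na(token) && nzchar(token)
}

# Collect every page, stopping as soon as no usable token comes back.
list_all <- function(all_names, page_size = 1000L) {
  out <- character(0)
  token <- NULL
  repeat {
    page <- fake_list_page(all_names, token, page_size)
    out <- c(out, page$items)
    token <- page$nextPageToken
    if (!has_next_page(token)) break
  }
  out
}
```

With this sketch, length(list_all(paste0("obj", 1:1000))) is 1000 after a single request, and a 1001-object listing takes two requests. Whether the real failure came from the token handling or from the API side is not established in this thread.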

@lisovyk changed the title from "http_400 Invalid argument when using gcs_list_objects()" to "http_400 Invalid argument when gcs_list_objects() returns exactly 1000 rows" on Jul 31, 2023
@MarkEdmondson1234
Collaborator

Weird that it started to go wrong. I will check whether the API response has changed.

@lisovyk
Author

lisovyk commented Aug 22, 2023

@MarkEdmondson1234 hey, have you had time to look into it?

I happened to reach 1000 entries in another bucket, and here adding an entry by hand does not solve the problem.
I get the same error, but can still get "some" results by passing a delimiter parameter:

> dim(gcs_list_objects('my-images', delimiter = ""))
ℹ 2023-08-22 07:54:16 > Request Status Code:  400
Error in `abort_http()`:
! http_400 Invalid argument.
Run `rlang::last_trace()` to see where the error occurred.
> dim(gcs_list_objects('my-images', delimiter = "a"))
[1] 830   3

@MarkEdmondson1234
Collaborator

Can I see your sessionInfo()?

@lisovyk
Author

lisovyk commented Aug 31, 2023

Sorry for the late reply. I have reverted the code to the commit where the issue was present (I had since removed the gcs_list_objects()-related functionality from the app), but I cannot reproduce it now; the function works as intended for me.

Here is the session info in any case. The same issue was present on the Ubuntu 18 server that runs ShinyProxy with the app.

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.5.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] googleCloudStorageR_0.7.0 RColorBrewer_1.1-3        stringdist_0.9.8         
 [4] plotly_4.10.0             ggplot2_3.3.6             stringi_1.7.8            
 [7] stringr_1.4.1             lubridate_1.8.0           dotenv_1.0.3             
[10] mongolite_2.6.2           rclipboard_0.1.6          shinyvalidate_0.1.2      
[13] ssh_0.8.1                 emojifont_0.5.5           data.table_1.14.2        
[16] rhandsontable_0.3.8       shinyBS_0.61.1            shinyjs_2.1.0            
[19] shinydashboardPlus_2.0.3  shinydashboard_0.7.2      shiny_1.7.4.1            
[22] httr_1.4.6                DT_0.25                  

loaded via a namespace (and not attached):
 [1] tidyr_1.2.1       viridisLite_0.4.1 jsonlite_1.8.7    showtext_0.9-5    assertthat_0.2.1 
 [6] askpass_1.1       showtextdb_3.0    renv_0.17.3       yaml_2.3.5        pillar_1.8.1     
[11] glue_1.6.2        digest_0.6.33     promises_1.2.0.1  googleAuthR_2.0.1 colorspace_2.0-3 
[16] htmltools_0.5.5   httpuv_1.6.11     pkgconfig_2.0.3   sysfonts_0.8.8    purrr_0.3.4      
[21] xtable_1.8-4      scales_1.2.1      later_1.3.1       tibble_3.1.8      openssl_2.1.0    
[26] generics_0.1.3    ellipsis_0.3.2    cachem_1.0.8      withr_2.5.0       lazyeval_0.2.2   
[31] credentials_1.3.2 cli_3.6.1         proto_1.0.0       magrittr_2.0.3    mime_0.12        
[36] memoise_2.0.1     fs_1.6.3          fansi_1.0.3       tools_4.2.1       gargle_1.5.2     
[41] lifecycle_1.0.3   munsell_0.5.0     zip_2.2.1         compiler_4.2.1    rlang_1.1.1      
[46] grid_4.2.1        rstudioapi_0.14   sys_3.4.2         htmlwidgets_1.5.4 gtable_0.3.1     
[51] curl_5.0.1        R6_2.5.1          knitr_1.40        dplyr_1.0.10      fastmap_1.1.1    
[56] utf8_1.2.2        parallel_4.2.1    Rcpp_1.0.11       vctrs_0.4.1       tidyselect_1.1.2 
[61] xfun_0.33 

@MarkEdmondson1234
Collaborator

This could have been there for a while but only shown up intermittently, if it happens exactly when the number of results equals the page size.

I will look through the API documentation to see if anything has changed recently: https://cloud.google.com/storage/docs/json_api/v1/objects/list

@MarkEdmondson1234
Collaborator

This looks different (from the description of the delimiter parameter):

Returns results in a directory-like mode, with / being a common value for the delimiter.

    items[] contains object metadata for objects whose names do not contain delimiter, or whose names only have instances of delimiter in their prefix.
    prefixes[] contains truncated object names for objects whose names contain delimiter after any prefix. Object names are truncated beyond the first applicable instance of the delimiter, mimicking a directory. If multiple objects have the same truncated name, duplicates are omitted. Truncated object names in prefixes[] always end with /.

Must be set to / when used with the matchGlob parameter to filter results in a directory-like mode.
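The quoted delimiter behaviour can be mimicked locally with a few lines of base R. This is only a sketch of the documented semantics under the assumption of an empty prefix; split_by_delimiter() is a hypothetical helper, not part of googleCloudStorageR or the GCS API. It may help explain why delimiter = "a" returned fewer rows above: names containing the delimiter would move out of items and into prefixes.

```r
# Hypothetical local model of the documented delimiter semantics
# (empty prefix assumed); it does not call the GCS API.
split_by_delimiter <- function(object_names, delimiter = "/") {
  # An empty delimiter means a flat listing: everything is an item.
  if (!nzchar(delimiter)) {
    return(list(items = object_names, prefixes = character(0)))
  }
  # Position of the first literal occurrence of the delimiter, or -1.
  pos <- as.integer(regexpr(delimiter, object_names, fixed = TRUE))
  has_delim <- pos > 0
  # Truncate just after the first delimiter and drop duplicates.
  cut_at <- pos[has_delim] + nchar(delimiter) - 1L
  prefixes <- unique(substr(object_names[has_delim], 1L, cut_at))
  list(items = object_names[!has_delim], prefixes = prefixes)
}

example_names <- c("cat.png", "dog.png", "bird/x.png", "bird/y.png")
by_slash <- split_by_delimiter(example_names, "/")
by_a <- split_by_delimiter(example_names, "a")
```

Here by_slash keeps "cat.png" and "dog.png" as items and collapses the two bird/ objects into the single prefix "bird/", while by_a drops "cat.png" from items because its name contains "a".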

@wlandau

wlandau commented Nov 9, 2023

For what it's worth, I just tested this for ropensci/targets#1172 using version 0.7.0, and gcs_list_objects() worked fine on my end even when there were exactly 1000 objects. Maybe somebody already solved this on the Google Cloud API end?
