dir_ls crashes R 4.2.1 when reading very large directories #447

Open
arthurgailes opened this issue Mar 21, 2024 · 3 comments

Comments

@arthurgailes

Hello,

The following command crashes R when reading through a directory with dozens of folders and over 600k files in total. I'm not sure what the inflection point is, but it works fine in a similar directory with 100k files in total. This problem does not occur in R 4.3.1.

fs::dir_ls(mydir, recurse = TRUE)
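
Since the inflection point is unknown, one way to narrow it down might be to build a synthetic tree of a known size and list it. This is only a minimal sketch, not something from the original report: `make_tree_and_list()` is a hypothetical helper, the folder and file names are made up, and the sizes in the loop are illustrative. Empty files may not reproduce the crash if it depends on path length or on the file system involved.

```r
# Hypothetical helper: build a throwaway tree of `n_dirs` folders holding
# `n_files` empty files in total, list it recursively with fs::dir_ls(),
# and return how many files were listed.
make_tree_and_list <- function(n_files, n_dirs = 10L) {
  root <- tempfile("dir_ls_repro_")
  dirs <- file.path(root, sprintf("d%03d", seq_len(n_dirs)))
  for (d in dirs) dir.create(d, recursive = TRUE)

  # Spread the files round-robin across the folders.
  files <- file.path(
    rep_len(dirs, n_files),
    sprintf("f%07d.txt", seq_len(n_files))
  )
  stopifnot(all(file.create(files)))

  listed <- fs::dir_ls(root, recurse = TRUE, type = "file")
  unlink(root, recursive = TRUE)
  length(listed)
}

# Start small, then raise the sizes step by step (for example toward 1e5
# and 6e5) to bracket the point where the session stops returning.
for (n in c(1e2, 1e3)) {
  cat(n, "files ->", make_tree_and_list(n), "listed\n")
}
```

Running the same script under both R 4.2.1 and R 4.3.1 would show whether the size alone is enough to trigger the difference.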

@gaborcsardi
Member

Can you show the output, and also the stack trace after the crash?

@arthurgailes
Author

Well, I can't do a traceback because R crashes. Here's the Rterm output:

& Rterm --no-save --no-restore --verbose -e "fs::dir_ls('my/dir/path', recurse = TRUE)"
'verbose' and 'quietly' are both true; being verbose then ..
now dyn.load("W:/R/R-4.2.1/library/methods/libs/x64/methods.dll") ...

R version 4.2.1 (2022-06-23 ucrt) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

'verbose' and 'quietly' are both true; being verbose then ..
'verbose' and 'quietly' are both true; being verbose then ..
Garbage collection 1 = 0+0+1 (level 2) ...
12.1 Mbytes of cons cells used (35%)
2.8 Mbytes of vectors used (4%)
now dyn.load("W:/R/R-4.2.1/library/utils/libs/x64/utils.dll") ...
'verbose' and 'quietly' are both true; being verbose then ..
now dyn.load("W:/R/R-4.2.1/library/grDevices/libs/x64/grDevices.dll") ...
'verbose' and 'quietly' are both true; being verbose then ..
now dyn.load("W:/R/R-4.2.1/library/graphics/libs/x64/graphics.dll") ...
'verbose' and 'quietly' are both true; being verbose then ..
now dyn.load("W:/R/R-4.2.1/library/stats/libs/x64/stats.dll") ...
ending setup_Rmainloop(): R_Interactive = 0 {main.c}

R_ReplConsole(): before "for(;;)" {main.c}
fs::dir_ls('my/dir/path', recurse = TRUE)
now dyn.load("W:/R/R-4.2.1/library/fs/libs/x64/fs.dll") ...
Garbage collection 2 = 1+0+1 (level 0) ...
15.4 Mbytes of cons cells used (45%)
3.5 Mbytes of vectors used (5%)
Garbage collection 3 = 2+0+1 (level 0) ...
17.7 Mbytes of cons cells used (51%)
6.5 Mbytes of vectors used (10%)
Garbage collection 4 = 3+0+1 (level 0) ...
21.4 Mbytes of cons cells used (62%)
11.5 Mbytes of vectors used (18%)
Garbage collection 5 = 4+0+1 (level 0) ...
24.3 Mbytes of cons cells used (71%)
16.0 Mbytes of vectors used (25%)
Garbage collection 6 = 5+0+1 (level 0) ...
26.5 Mbytes of cons cells used (77%)
18.7 Mbytes of vectors used (29%)
Garbage collection 7 = 6+0+1 (level 0) ...
28.3 Mbytes of cons cells used (82%)
20.8 Mbytes of vectors used (32%)
Garbage collection 8 = 6+1+1 (level 1) ...
29.6 Mbytes of cons cells used (86%)
22.7 Mbytes of vectors used (35%)
Garbage collection 9 = 6+1+2 (level 2) ...
30.6 Mbytes of cons cells used (44%)
24.0 Mbytes of vectors used (37%)
Garbage collection 10 = 7+1+2 (level 0) ...
39.4 Mbytes of cons cells used (56%)
34.7 Mbytes of vectors used (54%)
Garbage collection 11 = 8+1+2 (level 0) ...
46.3 Mbytes of cons cells used (66%)
48.0 Mbytes of vectors used (75%)
Garbage collection 12 = 9+1+2 (level 0) ...
51.6 Mbytes of cons cells used (73%)
54.5 Mbytes of vectors used (85%)
Garbage collection 13 = 9+2+2 (level 1) ...
55.8 Mbytes of cons cells used (79%)
59.5 Mbytes of vectors used (93%)
Garbage collection 14 = 9+2+3 (level 2) ...
59.0 Mbytes of cons cells used (47%)
60.9 Mbytes of vectors used (68%)
Garbage collection 15 = 10+2+3 (level 0) ...
73.6 Mbytes of cons cells used (59%)
88.7 Mbytes of vectors used (100%)
Garbage collection 16 = 10+2+4 (level 2) ...
73.9 Mbytes of cons cells used (59%)
84.1 Mbytes of vectors used (72%)
Garbage collection 17 = 11+2+4 (level 0) ...
85.2 Mbytes of cons cells used (68%)
97.8 Mbytes of vectors used (84%)
Garbage collection 18 = 11+3+4 (level 1) ...
94.0 Mbytes of cons cells used (75%)
108.4 Mbytes of vectors used (93%)
Garbage collection 19 = 11+3+5 (level 2) ...
100.8 Mbytes of cons cells used (49%)
116.7 Mbytes of vectors used (75%)
Garbage collection 20 = 12+3+5 (level 0) ...
123.7 Mbytes of cons cells used (61%)
148.5 Mbytes of vectors used (95%)
Garbage collection 21 = 12+3+6 (level 2) ...
15.4 Mbytes of cons cells used (9%)
7.0 Mbytes of vectors used (6%)

@gaborcsardi
Member

Sorry, I meant a stack trace from a low-level debugger, like gdb or drmingw.
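
For reference, a rough sketch of how such a trace could be captured with gdb from an R session. It assumes gdb is installed and on the PATH and that `Rterm` resolves to the R 4.2.1 binary. `gdb_backtrace_args()` is a hypothetical helper, and the directory path is the placeholder from the output above. The gdb options used (`--batch`, `-ex`, `--args`) are standard, but whether the resulting backtrace is readable depends on the debug symbols available.

```r
# Hypothetical helper: build the arguments for a non-interactive gdb run
# that starts Rterm, evaluates `expr`, and prints a backtrace afterwards.
gdb_backtrace_args <- function(rterm, expr) {
  c(
    "--batch",        # quit gdb once the commands below have run
    "-ex", "run",     # start the program under the debugger
    "-ex", "bt",      # print the backtrace of wherever it stopped
    "--args", rterm,  # everything after this is passed to Rterm
    "--no-save", "--no-restore",
    "-e", expr
  )
}

args <- gdb_backtrace_args(
  "Rterm",
  "fs::dir_ls('my/dir/path', recurse = TRUE)"
)

# Uncomment to run; this needs gdb on the PATH.
# out <- system2("gdb", shQuote(args), stdout = TRUE, stderr = TRUE)
# writeLines(out)
```

If R exits normally instead of crashing, gdb has no stack to print, so a backtrace only appears when the process actually stops on a fault.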
