Can't list files with chinese characters on windows 10 #316
Comments
Hi. What happens when you run |
Invalid argument @ rb_file_s_lstat - D:/??.pdf
Also, if it helps, I made sure to check the UTF-8 checkbox in the Ruby Windows installer. |
I found this bug report which might be related: https://bugs.ruby-lang.org/issues/14591 What is the output of |
ruby -e 'puts Encoding.find("filesystem")'
Filesystem is NTFS. For example, colorls in WSL (Ubuntu on Windows) is able to display the same file correctly. |
I guess that's bad, as you cannot properly encode Chinese characters with CP 1251. That's probably why you only see those question marks in the output. What happens if you change the codepage of the console to Unicode: |
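To illustrate why CP 1251 produces question marks, here is a minimal sketch in plain Ruby. It only demonstrates string transcoding with `String#encode`. That Ruby's Windows file APIs lose the characters in the same way is an assumption suggested by the `??.pdf` in the error message, not something this snippet proves.

```ruby
name = "杨帆.pdf" # UTF-8 string literal (Ruby's default source encoding)

# Strict conversion: Windows-1251 has no mapping for these characters,
# so String#encode raises instead of returning a string.
strict_error = begin
  name.encode("Windows-1251")
  nil
rescue Encoding::UndefinedConversionError => e
  e
end
puts "strict: #{strict_error.class}"

# Lossy conversion: each unmappable character is replaced. For a
# non-Unicode target encoding the default replacement is "?".
lossy = name.encode("Windows-1251", undef: :replace)
puts "lossy:  #{lossy}"
```

The lossy result is `??.pdf`, the same name that shows up in the `rb_file_s_lstat` error, and no file with that name exists on disk.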
chcp 65001 doesn't help |
Did you try with |
You could also try |
RUBY_DEBUG doesn't work. The same ??.pdf error. |
FTR, here is how Ruby initializes the encoding it uses for filesystem operations on Windows: https://github.com/ruby/ruby/blob/c5eb24349a4535948514fe765c3ddb0628d81004/localeinit.c#L124-L130
To make it work, you need to use a codepage which properly encodes all of the characters you want to use with it. I.e. you would want that |
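A quick way to check whether a given encoding can represent a file name is to try the conversion and see whether it fails. The sketch below does that for the current filesystem encoding and two fixed ones. `representable?` is a hypothetical helper written for this illustration; it is not part of Ruby or colorls.

```ruby
# Hypothetical helper (not part of Ruby or colorls): true if every
# character of `name` can be converted to `encoding`.
def representable?(name, encoding)
  name.encode(encoding)
  true
rescue EncodingError
  # Covers undefined conversions, invalid bytes and missing converters.
  false
end

name = "杨帆.pdf"

candidates = [
  Encoding.find("filesystem"),   # whatever Ruby picked on this machine
  Encoding.find("Windows-1251"), # the codepage reported in this issue
  Encoding::UTF_8
]

candidates.each do |enc|
  puts "#{enc}: #{representable?(name, enc)}"
end
```

On the machine from this report the first two lines would presumably both be `false`, since the filesystem encoding was reported as Windows-1251.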
After lots of back-and-forth (https://github.com/avdv/clocale/pull/45/commits), I discovered that
That way, I can force the Encoding to be UTF-8, which seems to work: https://ci.appveyor.com/project/avdv/clocale/build/job/48acvkeq5madf01d#L74
When displaying the strings on the console I probably have to encode them to the
Could you save the following code to a file and run

    # Show which encodings Ruby picked up from the environment.
    puts Encoding.find('filesystem')
    puts Encoding.default_external
    puts Encoding.default_internal

    # The path to inspect is passed as the first command line argument.
    arg = ARGV[0]
    puts arg
    @name = File.basename(arg)
    @stats = File.lstat(arg)
    if @stats.directory?
      begin
        # Ask for the entry names as UTF-8 instead of the filesystem encoding.
        @contents = Dir.entries(arg, encoding: Encoding::UTF_8)
        @contents.each do |item|
          # Re-encode each name for the console and print it with the file size.
          puts "#{item.encode(Encoding.default_external, Encoding::UTF_8)}: #{File.lstat(File.join(arg, item)).size}"
        rescue => e
          warn "#{item}: #{e}"
        end
      rescue StandardError => e
        puts "#{e}: #{e.backtrace}"
      end
    end |
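For anyone who wants to try the idea without an existing test directory, here is a self-contained variant of the script above, as a minimal sketch. It creates a temporary directory with one file, lists it with `Dir.entries(..., encoding: Encoding::UTF_8)` and prints the names as they are. The file content `"dummy"` and the variable names are made up for the illustration, and whether the file can be created at all on a Windows system with a non-Unicode codepage is untested here.

```ruby
require "tmpdir"

name = "杨帆.pdf"
listed = nil

Dir.mktmpdir do |dir|
  # Create a small file whose name contains Chinese characters.
  File.write(File.join(dir, name), "dummy")

  # Request the entry names as UTF-8 and skip the "." and ".." entries.
  entries = Dir.entries(dir, encoding: Encoding::UTF_8) - [".", ".."]

  # lstat every entry via its UTF-8 name and remember name and size.
  listed = entries.map do |entry|
    [entry, File.lstat(File.join(dir, entry)).size]
  end

  # Print the names as they are, without re-encoding them.
  listed.each { |entry, size| puts "#{entry}: #{size}" }
end
```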
Here's the output Windows-1251 . |
OK, there was an encoding problem. But the output here looks OK, without the need to encode it explicitly. Did it also show up this way on your console? Or was it somehow mangled? |
Yes, the console output was ok |
OK, then it is probably better to simply print it as is, without re-encoding and substituting the undefined characters (in the target encoding) with the replacement character, i.e.
I'll see whether I can come up with a PR this evening... if nobody beats me to it. |
I have created a PR, but I am still missing test coverage. If you want, check out the branch and see if it works for you. |
Hi @vasiby, long time no progress here... Sorry! But after fixing #352 I am revisiting your issue. Are you still using colorls on Windows? Could you perhaps check whether the latest changes done in #353 work for your use case too?
You can install a pre release including the latest fixes via |
Description
Shows an error and doesn't list files when any file in the directory has Chinese characters in its name, for example
杨帆.pdf
Error
Invalid argument @ rb_file_s_lstat - D:/??.pdf
Windows 10
colorls 1.2.0
ruby 2.5.5p157 (2019-03-15 revision 67260) [x64-mingw32]