Download paths with diacritical marks on Windows #783

Open
Alberto706 opened this issue Apr 18, 2023 · 3 comments

@Alberto706

While working with the tz.cpp library on Windows, I noticed that downloading the time zone database failed because a std::string holding the download path, which contains the character í, was converted incorrectly to a std::wstring.

This conversion is performed in the function convert_utf8_to_utf16, which calls MultiByteToWideChar with CP_UTF8 as the CodePage parameter. After some tests, I solved the problem by using the code page CP_ACP instead.

I am not very familiar with this system function, so I am not sure whether this solution breaks other character conversions.

@HowardHinnant
Owner

Thank you for this report.

I'm not familiar with Windows and don't have a Windows machine to test on. But some care has been taken to never send a / path delimiter to the Windows OS, for example: https://github.com/HowardHinnant/date/blob/master/src/tz.cpp#L171-L175

Apparently the code must hardwire / somewhere instead of using folder_delimiter. After a brief search I've been unable to locate where. I see a few hardwired '/' but they are #ifdef'd out on Windows. If you have the inclination and discover where the / is coming from, I'd love a report on that. A stack trace leading down to the failing convert_utf8_to_utf16 would help too.

Thanks.

@Alberto706
Author

The character I mentioned is an i with an acute accent: i + ´ = í (the italics made it look like a /, sorry). I guess any other character with a diacritical mark would have the same problem; this is just the one that caused trouble for me.
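
To make the byte-level difference concrete, here is a minimal portable sketch. It assumes the path string was encoded in a single-byte ANSI code page such as Windows-1252, where í is the single byte 0xED, whereas in UTF-8 it is the two bytes 0xC3 0xAD. The helper name `is_valid_utf8` is hypothetical and not part of tz.cpp; it only shows why a UTF-8 decoder would not accept the single-byte form.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of tz.cpp): returns true if `s` is
// well-formed UTF-8. Rejects overlong forms, surrogates, and code
// points above U+10FFFF.
bool is_valid_utf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size())
    {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra;     // continuation bytes expected after `lead`
        unsigned long cp;      // code point decoded so far
        unsigned long min_cp;  // smallest code point valid for this length

        if (lead < 0x80)                { extra = 0; cp = lead;        min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else return false;              // stray continuation or invalid lead byte

        if (s.size() - i - 1 < extra)
            return false;               // sequence is cut off at end of string

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;           // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;               // overlong, out of range, or surrogate

        i += extra + 1;
    }
    return true;
}
```

With this check, `is_valid_utf8("\xC3\xAD")` (í in UTF-8) is true, while `is_valid_utf8("\xED")` (í in Windows-1252) is false, because 0xED looks like the start of a three-byte sequence that never arrives. The same applies to other accented letters in that code page, which would be consistent with the guess that any character with a diacritical mark is affected.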

@Erroneous1

According to the MultiByteToWideChar documentation, CP_ACP is the system default Windows ANSI code page, that is, whatever multibyte encoding your machine is using with its current settings. Is it possible that the download path is not being encoded as UTF-8 before it is set? I'd recommend using wide strings with the various Windows functions and converting to a UTF-8 string, using WideCharToMultiByte with a code page of CP_UTF8, before interacting with the library.
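
A rough sketch of that suggestion follows. The function name `wide_to_utf8` is hypothetical and not part of the library. The Windows branch uses the usual two-call pattern for WideCharToMultiByte: one call to size the buffer, one to convert. The non-Windows branch is only there so the sketch compiles elsewhere, and it assumes wchar_t holds one full Unicode code point.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>

// Windows: wchar_t holds UTF-16 code units, so let the OS convert.
std::string wide_to_utf8(const std::wstring& wide)
{
    if (wide.empty())
        return std::string();

    const int wide_len = static_cast<int>(wide.size());

    // First call: ask how many bytes the UTF-8 result needs.
    const int bytes = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS,
                                          wide.data(), wide_len,
                                          nullptr, 0, nullptr, nullptr);
    if (bytes <= 0)
        throw std::runtime_error("WideCharToMultiByte failed (sizing)");

    // Second call: convert into a buffer of exactly that size.
    std::string utf8(static_cast<std::size_t>(bytes), '\0');
    const int written = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS,
                                            wide.data(), wide_len,
                                            &utf8[0], bytes, nullptr, nullptr);
    if (written != bytes)
        throw std::runtime_error("WideCharToMultiByte failed (converting)");
    return utf8;
}
#else
// Fallback for other platforms. Assumption: wchar_t is wide enough to
// hold a whole code point (for example 32-bit wchar_t on Linux).
std::string wide_to_utf8(const std::wstring& wide)
{
    std::string utf8;
    for (wchar_t wc : wide)
    {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");

        if (cp < 0x80)
        {
            utf8 += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            utf8 += static_cast<char>(0xC0 | (cp >> 6));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            utf8 += static_cast<char>(0xE0 | (cp >> 12));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            utf8 += static_cast<char>(0xF0 | (cp >> 18));
            utf8 += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return utf8;
}
#endif
```

For example, `wide_to_utf8(L"\u00ED")` yields the two bytes 0xC3 0xAD. A path obtained from a wide-character Windows API could be passed through such a helper before it is handed to the library, so that convert_utf8_to_utf16 receives the UTF-8 input it expects.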
