Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failures on Windows #13

Open
jayvdb opened this issue Apr 2, 2019 · 10 comments · May be fixed by #16
Open

Failures on Windows #13

jayvdb opened this issue Apr 2, 2019 · 10 comments · May be fixed by #16

Comments

@jayvdb
Copy link

jayvdb commented Apr 2, 2019

Copying the same .appveyor.yml from PiDelport/backports.test.support#5 exposes lots of Windows failures.

e.g. py27 https://ci.appveyor.com/project/jayvdb/backports-os/build/job/6kr6vh0cirw1kos5

======================================================================
FAIL: test_identity (test_os.FSEncodingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\projects\backports-os\tests\test_os.py", line 35, in test_identity
    self.assertEqual(os.fsdecode(bytesfn), fn)
AssertionError: u'unicodeL' != u'unicode\u0141'
- unicodeL
?        ^
+ unicode\u0141
?        ^

And extras test_text_roundtrip, test_encode_surrogates & test_decode_surrogates

On Python 3.6 test_decode_binary and test_encode_text fail.

@jayvdb
Copy link
Author

jayvdb commented Apr 2, 2019

@jaraco ping re https://github.com/jaraco/path.py which depends on this.

@pombredanne
Copy link

@jayvdb does it make any sense to even try to run this on Python 3? IMHO the native fsencode/fsdecode should be used there and not the backport. Furthermore, IMHO the whole bytes vs. unicode is a not an issue on Windows as long as you always feed unicode to path operations.

@jayvdb
Copy link
Author

jayvdb commented Apr 2, 2019

https://github.com/PiDelport/backports.os/blob/master/.travis.yml has lots of Python 3 targets, so I added one to AppVeyor.

However I am more concerned about the Python 2 results.

One approach would be for these functions to reject bytes as inputs on Windows, and the relevant tests to be disabled.

That would at least show whether all of the errors are bytes related, or if some are also present when unicode is used.

@jayvdb
Copy link
Author

jayvdb commented Apr 2, 2019

It is failing to roundtrip unicode->fs->unicode on Windows, which is the core functionality needed for any sane Python 2 code which is mostly working in unicode and only switching to bytes when necessary.

@jayvdb
Copy link
Author

jayvdb commented Apr 3, 2019

In jayvdb@5975cd0 I split the os tests, and remove the extra tests, so it is easier to see the main problem.

Results at https://ci.appveyor.com/project/jayvdb/backports-os/builds/23554348/job/ak61ymmy6ej40erp

In short

backports.os.fsdecode(backports.os.fsencode(u'unicode\u0141')) != u'unicode\u0141'

As a result, the safest thing to do is not use this backport library on Windows until someone digs into these problems.

It is useful on Linux, but it has marginal/negligible benefits on Windows/etc and it is better to just not use these backported functions on Windows, instead always use unicode when interacting with the FS functions as @pombredanne explains above, or use the routines we used to use before os.fsencode/decode, as explained at https://github.com/henrysher/duplicity/blob/4ac4322/duplicity/util.py (fwiw, they have a decent collection of homegrown backports, e.g. https://github.com/henrysher/duplicity/blob/master/duplicity/tempdir.py , and file ops e.g. https://github.com/henrysher/duplicity/blob/master/duplicity/selection.py)

And someone should check this backport on Mac and other platforms before using it on those platforms too.

@jaraco
Copy link

jaraco commented Apr 3, 2019

In my experience, macOS and Windows do handle Unicode filenames well, and it's only on Linux that bytes handling is still relied upon, so I agree with the instinct. What I'd really like is for the interface, whatever it is, to be uniform. It shouldn't be the responsibility of each client to write open(os.fsencode(name) if platform.name() == 'Linux' else name) for every use of the name. Instead, os.fsencode should provide passthrough on platforms that don't need it or at least should provide a wrapper for fsencode that disables it when unneeded, maybe something like open(os.ifneeded(os.fsencode)(name)).

Looking at the stdlib on Python 3.7.3, I see this on macOS:

>>> os.fsdecode(os.fsencode('unicode\u0141'))                                                                                                                                  
'unicodeŁ'

Similar on Windows:

~ # python
Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.fsdecode(os.fsencode('unicode\u0141'))
'unicodeŁ'

So if the backport isn't handling those conditions similarly, that does seem like a bug to me.

@jayvdb
Copy link
Author

jayvdb commented Apr 3, 2019

Could someone on macos try backports.os.fsdecode(backports.os.fsencode(u'unicode\u0141')) on Python 2.7 , and Python 3.4 too would be helpful. It would be nice to know if this is a Windows specific problem, or a non-Linux problem.

@techalchemy
Copy link

@jayvdb https://dev.azure.com/tech-alchemy/backports.os/_build/results?buildId=3

I set up an azure pipeline for this, as @jaraco rightly pointed out (and he would definitely know this as well as anyone, go look at some of the libraries he has written) this is just an implementation bug.

techalchemy added a commit to techalchemy/backports.os that referenced this issue Apr 4, 2019
- Mirrors the new python 3.7 implementation
- Taken from `vistir` (my other library) -> discussion over at
  sarugaku/vistir#54
- Fixes PiDelport#13
- Fixes PiDelport#6 (I think?)

Signed-off-by: Dan Ryan <dan@danryan.co>
@jaraco
Copy link

jaraco commented Apr 5, 2019

Here's what I see on macOS on Python 3.4 and 2.7:

~ $ python2.7 -m pip-run backports.os -- -c "import backports.os; print(backports.os.fsdecode(backports.os.fsencode(u'unicode\u0141')))"                                       
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Collecting backports.os
  Downloading https://files.pythonhosted.org/packages/20/c2/193ff30cab747f205535a3a8c24fcc7647dbaf0aa667b7dd580a740b8de4/backports.os-0.1.1-py2-none-any.whl
Collecting future (from backports.os)
Installing collected packages: future, backports.os
Successfully installed backports.os-0.1.1 future-0.17.1
unicodeÅ
~ $ python3.4 -m pip-run backports.os -- -c "import backports.os; print(backports.os.fsdecode(backports.os.fsencode(u'unicode\u0141')))"                                       
Collecting backports.os
  Using cached https://files.pythonhosted.org/packages/2b/e0/e7f8afefb8219e6b686722923db073b5bd54e1336b5a8b32da66c2edd0e8/backports.os-0.1.1-py3-none-any.whl
Installing collected packages: backports.os
Successfully installed backports.os-0.1.1
You are using pip version 18.1, however version 19.0.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
unicodeŁ

@jayvdb
Copy link
Author

jayvdb commented Apr 5, 2019

@techalchemy master...techalchemy:bugfix/os-fs-encode is looking really good . https://dev.azure.com/tech-alchemy/backports.os/_build/results?buildId=8 has most problems fixed except for Python 3 on Windows. 👍

techalchemy added a commit to techalchemy/backports.os that referenced this issue Apr 17, 2019
- Mirrors the new python 3.7 implementation
- Taken from `vistir` (my other library) -> discussion over at
  sarugaku/vistir#54
- Fixes PiDelport#13
- Fixes PiDelport#6 (I think?)

Signed-off-by: Dan Ryan <dan@danryan.co>
@techalchemy techalchemy linked a pull request Apr 17, 2019 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants