Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: insert paths from PYTHONPATH manager before system's PYTHONPATH #17066

Open
oscargus opened this issue Dec 22, 2021 · 37 comments · May be fixed by #21769
Open

Suggestion: insert paths from PYTHONPATH manager before system's PYTHONPATH #17066

oscargus opened this issue Dec 22, 2021 · 37 comments · May be fixed by #21769

Comments

@oscargus
Copy link
Contributor

oscargus commented Dec 22, 2021

Problem Description

It would be convenient if the paths added to the PYTHONPATH were added before (or could be made to be added before) the system PYTHONPATH.

Use case: I want to run a package from source (for development reasons) that I already have installed through Conda or pip. Right now, on Windows, I have to add it to the system PYTHONPATH, which, sort of, removes the benefit of the Spyder PYTHONPATH manager.

I noted #16429 and maybe that solves the problem (by importing and then moving the source to top), but maybe there can be an alternative solution as well, like a check box selecting if the context of the PYTHONPATH manager should be prepended or appended? I ping @mrclary directly here, to confirm that it can be done when/if #16429 is merged and here opinions on a possible implementation as part of that.

@dalthviz
Copy link
Member

Hi @oscargus thank you for the feedback! What do you think @spyder-ide/core-developers ?

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

@oscargus, I agree that whatever the user controls in Spyder's PYTHONPATH Manager should take precedence, i.e. should be placed in front of any other paths (site-packages, etc.). This gives the user the most control.

I will have to double-check whether #16429 enforces this or not. I'm pretty sure that this is the case for command completions in the Editor. I'm uncertain about the handling of sys.path in the IPython console, though. I'll report back.

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

The Python Language Server does not place PYTHONPATH Manager paths at the front of sys.path for Jedi completions in the Editor and #16429 cannot change this. The offending line is pylsp.workspace.Document:266 in jedi_script.
This

sys_path = self.sys_path(environment_path, env_vars=env_vars) + extra_paths

should be changed to this

sys_path = extra_paths + self.sys_path(environment_path, env_vars=env_vars)

I can submit a merge request to palantir/python-language-server python-lsp/python-lsp-server but there is no guarantee that it will be accepted.

@dalthviz
Copy link
Member

Just in case @mrclary I think the change then should be done over python-lsp/python-lsp-server (the org and repo we migrated from the palantir one and that we also support), right @ccordoba12 @andfoy ?

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

The IPython console does not place PYTHONPATH Manager paths at the front of sys.path and #16429 cannot change this. The offending line is spyder_kernels.customize.spydercustomize:730 in set_spyder_pythonpath.
This

sys.path.extend(pathlist)

should be changed to this

[sys.path.insert(0, p) for p in pathlist[::-1]]

I can submit a merge request to spyder/spyder-kernels, subject to approval by the Spyder team.

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

Just in case @mrclary I think the change then should be done over python-lsp/python-lsp-server (the org and repo we migrated from the palantir one and that we also support), right @ccordoba12 @andfoy ?

@dalthviz, you are correct.

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

Changing spydercustomize.py does not appear to behave as expected. Some further investigation into spyder-kernels will be required. Nevertheless, if the requested feature is accepted, I'll invest the time to get it working properly.

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

Changing spydercustomize.py does not appear to behave as expected. Some further investigation into spyder-kernels will be required. Nevertheless, if the requested feature is accepted, I'll invest the time to get it working properly.

Nevermind, it works as expected.

@mrclary mrclary changed the title Suggestion: insert paths from PYTHONPATH manager before or after system's PYTHONPATH Suggestion: insert paths from PYTHONPATH manager before system's PYTHONPATH Dec 22, 2021
@ccordoba12
Copy link
Member

@mrclary, what if the user has a module like string.py in one of the directories added to Spyder's PYTHONPATH manager? With your PRs, that would make their code and Jedi fail in bizarre ways because those directories would take precedence over standard library modules.

So, instead of doing that unconditionally, I think we should add an option in our PYTHONPATH manager (off by default) to ask users if they want the added paths to be in front of everything else (as @oscargus suggested). That way, users that really know what they are doing won't be surprised by this, and the rest would be safe.

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

@ccordoba12, I don't think this should be a problem after #16429 is merged. Currently there are some idiosyncrasies in the handling of these paths for command completion, which may be why Jedi may fail strangely. With #16429 Jedi's execution environment is isolated from the paths provided as search paths. Thus, per your example, Jedi's runtime environment should never see the user's module string.py, which would have caused it to break.

Now, if we want to add a feature to PYTHONPATH Manager that allows the user to choose where to place the paths within sys.path for IPython consoles and command completion, I think it can be incorporated into #16429, with some additional changes to spyder-kernels and python-lsp-server.

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

@ccordoba12, even without #16429 I could not reproduce any Jedi failures with string.py in the completion search path. Is there perhaps some other conditions required?
pypath-priority

@mrclary
Copy link
Contributor

mrclary commented Dec 22, 2021

Disabling the path in PYTHONPATH Manager restores the completion target to the standard library module.

@oscargus
Copy link
Contributor Author

Thanks @mrclary for the exhaustive investigation!

I can clearly see @ccordoba12's worry about users having files with standard names in the path. It may lead to an increased number of "bug" reports and although it works if you know what you are doing there is a certain degree of risk to it. (I've experienced bug report for SymPy where users just have some other random file in their path which breaks SymPy and although it states clearly in the error message which file was accessed, report are still posted.)

I can also imagine having the system path as a separate item in the manager and then be able to locate it exactly, so some can go before and some after. However, it may be a safer bet to have a check-box, possibly with a warning pop-up, if the paths in the manager should be added before the system path.

@mrclary
Copy link
Contributor

mrclary commented Dec 25, 2021

Thanks @mrclary for the exhaustive investigation!

Thank you for the kind words.

I can clearly see @ccordoba12's worry about users having files with standard names in the path. It may lead to an increased number of "bug" reports and although it works if you know what you are doing there is a certain degree of risk to it. (I've experienced bug report for SymPy where users just have some other random file in their path which breaks SymPy and although it states clearly in the error message which file was accessed, report are still posted.)

We certainly do not want to cause problems for our users and unnecessarily drive more bug reports. I was responding to @ccordoba12's comments targeting Jedi, specifically. However, you are correct that we should be concerned about the order of paths in sys.path in the IPython console environment. While for the language server and Jedi the execution environment can be separate from the completion search paths, any manipulation of sys.path in the IPython console environment will have direct implications for code that users execute therein, such as your example of SymPy.

I can also imagine having the system path as a separate item in the manager and then be able to locate it exactly, so some can go before and some after.

If you mean the base sys.path to remain as a contiguous block, and PYTHONPATH Manager paths can individually be placed before or after the base sys.path, then I think this may be possible. If you mean manipulating the paths in the base sys.path, then I am very incredulous that this can be done with any reliability, since these are determined by the site package of individual environments and beyond the scope of Spyder to control.

However, it may be a safer bet to have a check-box, possibly with a warning pop-up, if the paths in the manager should be added before the system path.

I would recommend label text that provides the warning over a pop-up window, similar to other instructional information that we have elsewhere, e.g. in the Preferences widget.

So, here are the two possible approaches that I would recommend.

  1. Have a single checkbox in the PYTHONPATH Manager that, when selected, inserts all the paths in order in the front of the base sys.path. The default would be deselected, in which case all the paths are appended to the back of the base sys.path (current behavior). The checkbox label could include a warning about placing paths in front of the base sys.path.
  2. Have a "dummy" PYTHONPATH Manager entry that is always greyed out but visible that represents the base sys.path, with a default position at the top of the list (replicating current behavior). The user can then arrange any path before or after the base sys.path. The PYTHONPATH Manager window could include text in the main frame warning the user about placing paths above the base sys.path.

Either of these options could be included in the #17077, python-lsp/python-lsp-server#139, and spyder-ide/spyder-kernels#352 PR chain. These would complement, but not depend on, #16429.

What do @spyder-ide/core-developers think?

@mrclary
Copy link
Contributor

mrclary commented Dec 25, 2021

I was responding to @ccordoba12's comments targeting Jedi, specifically.

Well, actually, I forgot about Spyder's option to use Jedi for command completion in IPython consoles. So, in this case, Jedi, like any other module, could fail when executed in the IPython console environment if users have a module name conflict with a standard module and the user's module is found before the standard module.

@oscargus
Copy link
Contributor Author

oscargus commented Jan 5, 2022

If you mean the base sys.path to remain as a contiguous block

Yes, that is what I meant. Although the alternative may be "nice", I cannot really see any practical use case for it, and I agree that it may indeed be quite challenging...

@mrclary
Copy link
Contributor

mrclary commented Jan 5, 2022

Okay, upon further consideration, I think that the paths listed in PYTHONPATH Manager should be placed at the front of sys.path because that is what Python does with the contents of the PYTHONPATH environment variable.

If a user has the PYTHONPATH environment variable defined, when starting Python from the command line sys.path becomes [''] + <PYTHONPATH paths> + <standard library paths>. However, Spyder places the user defined paths at the end of sys.path (after the standard library paths), which is contrary to standard Python behavior; in particular, [''] + <standard library paths> + <extra_paths>.

Additionally, I think that the "current working directory" entry '' should always be first to be consistent with standard Python behavior. #17077, python-lsp/python-lsp-server#139, and spyder-ide/spyder-kernels#352 currently result in <extra_paths> + [''] + <standard library paths> and should be updated to [''] + <extra_paths> + <standard library paths>.

@ccordoba12 expressed concern about this in a previous comment; however, I was unable to reproduce the issue. I believe we have a regression test for this as well (don't we?), so I don't think we should see any surge in bug reports.

Currently, Spyder has idiosyncratic behavior with respect to the users PYTHONPATH environment variable and Spyder's PYTHONPATH Manager. #16429 resolves these idiosyncrasies and homogenizes the behavior across platforms and launch mechanisms so that <extra_paths> will always be the contents of PYTHONPATH Manager, regardless of platform and launch mechanism.

@spyder-ide/core-developers, I invite your response.

@impact27
Copy link
Contributor

impact27 commented Jan 5, 2022

My opinion is that most people will add something to the python path so python can find it and they can use a script from wherever. Replacing installed package is probably a more niche use case. It might be better to add after by default and add an option to add before?

@mrclary
Copy link
Contributor

mrclary commented Jan 5, 2022

My opinion is that most people will add something to the python path so python can find it and they can use a script from wherever. Replacing installed package is probably a more niche use case. It might be better to add after by default and add an option to add before?

@impact27, regardless of the motivations of the OP, I think that user-supplied paths should go in front since that is the standard Python behavior, i.e. running a script with Python from the command line should have the same path structure as running the script in Spyder's IPython console. Can you comment on this point?

@impact27
Copy link
Contributor

impact27 commented Jan 5, 2022

That makes sense, but in the other hand, it is more easy to set up a path and forget in Spyder than when running from the command line.

@ccordoba12
Copy link
Member

ccordoba12 commented Jan 6, 2022

Replacing installed package is probably a more niche use case. It might be better to add after by default and add an option to add before?

I agree with @impact27. Spyder makes quite easy to add new paths to PYTHONPATH, so the chance for users to shoot themselves in their foot is much higher. What I mean by that is that Spyder would allow them to create code that (most likely) would fail to work outside of it (see below).

@ccordoba12 expressed concern about this in a previous comment; however, I was unable to reproduce the issue. I believe we have a regression test for this as well (don't we?), so I don't think we should see any surge in bug reports.

Even if the PyLSP and spyder-kernels work by putting our additions to PYTHONPATH in front of sys.path, that would allow users to (very easily) shadow standard library modules (like in the example shown by @mrclary in #17066 (comment)).

Since most of those module names are rather simple (random.py, logging.py, etc), they are usually picked up by users for their own projects. So this change would introduce a terrible confusion: why my code works perfectly fine inside Spyder but fails horribly outside of it?

I think that user-supplied paths should go in front since that is the standard Python behavior

I'm -1 on this for the reasons I mentioned above. However, I also understand that it's confusing that Spyder says it has a PYTHONPATH manager and nevertheless doesn't handle it in the same way as Python does. That's why I propose to rename that functionality to something like Add other paths to import code (or something like that) because it's what we are really doing at the moment.

@ccordoba12
Copy link
Member

ccordoba12 commented Jan 6, 2022

Additionally, I think that the "current working directory" entry '' should always be first to be consistent with standard Python behavior

I disagree with this too because (again) it would allow people to shadow standard library modules by creating their own ones in their cwd, which usually breaks other libraries. This was a very common problem and a source of a lot of confusion before we "fixed" it a couple of years ago.

I guess people are also confused when Python fails to import their modules and instead imports the corresponding ones in the standard library. But (i) their code doesn't break with incomprehensible error messages, so they don't need to open bugs or StackOverflow questions about it; and (ii) I guess they eventually learn there are other modules that take precedence over theirs, so they are forced to rename them (which at the end is a good thing).

I understand that for power users it's frustrating that Spyder has these small differences with regular Python. But, at least for me, the Python defaults are a terrible UX for newbies. And now that Python became the de facto language for science and engineering, we have to design things for them, not for power users.

@mrclary
Copy link
Contributor

mrclary commented Jan 6, 2022

Since most of those module names are rather simple (random.py, logging.py, etc), they are usually picked up by users for their own projects. So this change would introduce a terrible confusion: why my code works perfectly fine inside Spyder but fails horribly outside of it?

Perhaps you mis-typed, but the confusion "why my code works perfectly fine inside Spyder but fails horribly outside of it?" is the current state of affairs and would not be introduced by the changes that I propose. The changes I propose may elicit "why my code used to work perfectly fine but now fails horribly inside Spyder, and it also fails horribly outside of it?".

Nevertheless, it seems that the prevailing opinion is to keep the user-defined paths after the standard library paths. Are we then in agreement to provide a mechanism to allow users to place it before the standard library paths, according to my previous comment option 1?

@CAM-Gerlach
Copy link
Member

I don't see the problem with at least an opt-in option to enable the standard PYTHONPATH behavior, perhaps with a suitable warning. While the standard Python behavior may be confusing and frustrating to beginner users in certain situations, the non-standard Spyder behavior with no way to override it is likely to be the same for power users in others, at least without any way to override it. If users actively click the UI element (with a warning, if necessary) to enable this, after opening the PYTHONPATH manager to begin with (which already I don't think a beginner would likely do; I don't think I ever have) then we can reasonably assume they know what they are doing.

@impact27
Copy link
Contributor

Well I don't think a warning is necessary if the option is opt-in. I would be open to have the default python behaviour as the only option, but I am unclear on when the issue would manifest. I think I would lean on the "having an option" side. The question is, what is more likely: a user confused because he created a "threading.py" file or a user confused because the file he created doesn't replace the library?

@mrclary
Copy link
Contributor

mrclary commented Mar 22, 2022

@spyder-ide/core-developers, I think I understand better now what is going on.
First, the current situation:

  • An issue where the IPython Console would crash on startup when the user had shadowed standard libraries in their PYTHONPATH or PYTHONPATH Manager was resolved by Builtins can still be shadowed, causing kernel startup to fail, when a project is open #13519, in which @ccordoba12 introduced the SPY_PYTHONPATH work around. This avoids the issue by launching the console before adding the offending paths to its sys.path.
  • There is an inconsistency directly related to the order in which paths are added to sys.path. From Builtins can still be shadowed, causing kernel startup to fail, when a project is open #13519, the PYTHONPATH Manager paths are added to the end of sys.path in the IPython Console at startup. However, SpyderKernel.update_syspath adds these paths to position 1 of sys.path (almost at the front) when the PYTHONPATH Manager is updated after the IPython Console is already launched. So disabling then re-enabling paths after the console starts will change the position of the paths.

So what does all this mean?

  • For @oscargus, an immediate workaround for the OP may be to simply disable paths in the PYTHONPATH Manager, click OK, then re-enable them. This will put the paths at the front of sys.path. Well, sort-of, since they'll be at position 1 instead of 0. FYI, position 0 is <env>/lib/python39.zip, and the new paths would still be in front of all other paths including <env>/lib/python3.9/site-packages, so you could shadow all installed modules except those in <env>/lib/python39.zip.
  • The specific modules alluded to in previous comments, random.py and logging.py, for example, are installed in <env>/lib/python3.9 so users would see issues if they shadowed these modules and disabled/re-enabled their PYTHONPATH Manager paths. Maybe this doesn't happen very often, but if it were even a small problem I think we would have seen some complaints from users about this after Builtins can still be shadowed, causing kernel startup to fail, when a project is open #13519.

In short, I think we are safe to move these paths to the front of sys.path in the IPython Console. This will be in better compliance with standard (expected) Python and IPython behavior and SPY_PYTHONPATH already prevents the console from crashing in the scenarios that @ccordoba12 and others are concerned about.

I think #17512 and spyder-ide/spyder-kernels#378 should resolve this issue as well as resolve the issue where the paths in sys.path would be inconsistent if the user disables and re-enables paths.

@ccordoba12
Copy link
Member

@mrclary, could you take care of this one for Spyder 6? I have an idea on how to provide a really simple way to add paths from our PPM to the front of sys.path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment