
Dependency management (file requirements.txt and friends) is far too lenient: Too old versions and incompatible new versions. #3734

Open
aknrdureegaesr opened this issue Jan 10, 2024 · 12 comments

@aknrdureegaesr
Contributor

Environment

Python Version: 3.9

Nikola Version: Git b96f05a

Operating System: Debian Bullseye

Description:

The present Nikola dependency version requirements are far too lenient.

I wish to raise two separate problems with the present requirements*.txt files:

Allowing too new versions

Nikola allows very new versions of its dependencies. When some library introduces incompatible changes and upgrades to a new major version (in full compliance with semver), an unsuspecting user who runs pip install Nikola[extras] might be the first person in the world who actually mixes that version with Nikola. And it may break.

In such a scenario, I see no fault with our dependency, nor with the user, so the fault lies with Nikola.

A moral equivalent of this happened to me when a pull request of mine collided with a new, incompatible version of lxml that came out between the time I set up my venv and the time the pull request was tested, giving rise to #3732.

Allowing too old versions

I fiddled a bit to find the oldest versions of all Nikola dependencies that would install on my machine and still satisfy all declared version constraints. The result is requirements-also_doesntwork.txt. Running pytest with those ancient packages installed fails.

Analyzing that particular failure, I found that the requirement Pygments>=1.6 had been added to requirements.txt with git eab48d8 on 2014-07-17. Later, on 2020-03-19 (git 7b792fe) and again on 2022-04-24 (git 7e2fd4f), code was added that uses pygments.formatters._formatter_cache. That attribute does not yet exist in Pygments 1.6. So ever since then, Nikola has been boasting that it can run with that old Pygments version while in fact it could not. I believe this to be just the tip of an iceberg, and that many more compatibility problems of this sort are hiding in the code.
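For illustration, here is a minimal sketch of how that mismatch surfaces (the attribute name comes from the commits above; the surrounding probe code is mine, not Nikola's):

```python
# Minimal probe, assuming only what the commits above establish:
# Pygments 1.6 predates the private _formatter_cache attribute,
# so accessing it there raises AttributeError.
import pygments.formatters

try:
    cache = pygments.formatters._formatter_cache
except AttributeError:
    print("This Pygments version is older than requirements.txt admits.")
```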

@Kwpolska
Member

I agree that the minimum dependencies listed in our requirements.txt file may be too old, and it would be a good idea to update the minimum versions there — perhaps to something at most a year old, or perhaps to the most recent major version (while verifying this specific setup works).

On the other hand, the maximum dependency versions should not be limited, unless the package has a track record of breaking backwards compatibility. There should be no pinning, especially in requirements.txt (unless the package’s maintainers break backwards compatibility very often, in which case we should reconsider the dependency).

The problem with dependencies in Python is caused by Python’s package management being awful, 14 competing standards and all. Python ties its packages to an environment — which might be system-wide (including user packages, but that doesn’t matter) or virtual.

There are three main ways in which someone may install Nikola: in a venv, system/user-wide with pip, or using their system package manager. If someone is using a venv exclusively to run Nikola, everything would be fine, and we could be using pinned dependencies or very narrow ranges (maintenance burden notwithstanding). But if someone is using a system-wide install, we could cause total breakage if we demand very specific versions.

Things are even worse if you include distro packages in the mix: Nikola is packaged in some distros, like Fedora, Arch Linux, or Gentoo. The way Python works means there is only one version of Pygments in the distro repositories. This one version in Arch Linux’s repos must work not only for Nikola, but also for 84 other things (and that doesn’t include transitive dependencies). Some distros are more eager to upgrade things, while others are more conservative, so we can’t just support the latest version — we must have some leeway.

Package versioning is hard, and different entities have different approaches to backwards compatibility. Microsoft will fight tooth and nail to keep things compatible (I have done some cursed package combinations in the .NET land, mixing 5-year-old versions with the latest-and-greatest), but CPython is eager to break stuff (#3719 is one example, but many things are removed with a 2-year warning). The packages Nikola depends on have many different maintainers with different views on backwards compatibility. Some of them follow semver, some follow a scheme detached from backwards compatibility, and some of them don’t have a predictable version numbering scheme. And even then, if we say >=1.2.0, <1.3.0, things can still break if the maintainer introduces a regression. The only predictable thing would be to pin specific versions, but that would get us removed from Linux distros and would make us break people’s systems. We instead choose to hope the latest version doesn’t break — and it often doesn’t, while also making it more likely for us to have support for new Python versions without requiring a new Nikola release.

To sum up: some things can certainly be improved, and the versions in requirements.txt are a lie that needs fixing. But we can’t pin dependencies, use very tight dependency ranges, or use <= unless we absolutely need it, because that would cause more trouble for our users and downstreams.

@aknrdureegaesr
Contributor Author

aknrdureegaesr commented Jan 10, 2024

I have no "magic bullet" for the upper version limit problem.

My personal intuition would be: let us trust that everybody is using semantic versioning. That means "pin to the same major version we tested". We could also have something automatic that alerts us when a dependency releases a new major version, so we can test it before relaxing our requirements to accept it.
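Expressed in requirements.txt syntax, that intuition would look roughly like this (package names and bounds are illustrative, not a concrete proposal for the actual files):

```
lxml>=4.5.2,<5
Pygments>=2.12,<3
```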

So much for the theory. In practice, semantic versioning is largely not followed in the Python world: 23% of all of our dependencies and transitive dependencies sport a version number of 0.something. If you actually read semver 2.0.0 (rule 4), that translates to "this package does not claim to have any stable API yet".

So much for the bad news. As for the good news, I have at least some idea for the other end of the problem.

We could generate a requirements-oldest.txt, hopefully on the fly as part of our automatic tests. That file would pin all the oldest versions we think should still work. (The idea is that it can be generated by simple string replacement: replace every >= with ==; see the sketch below.) Then, in one automatic test run, we install all those oldest versions into a single Python environment and, of course, see whether our tests still pass. If not, some new code has introduced what you call a "lie", and we need to tighten our lower bound on some dependency.
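A minimal sketch of that generation step (file names as discussed in this thread; the helper name is made up):

```python
# Derive requirements-oldest.txt from requirements.txt by turning
# every lower bound into an exact pin, as proposed above.
from pathlib import Path

def pin_lower_bounds(src: str = "requirements.txt",
                     dst: str = "requirements-oldest.txt") -> None:
    Path(dst).write_text(Path(src).read_text().replace(">=", "=="))

pin_lower_bounds()
```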

Does that sound like something you'd merge if I managed to code it?

@Kwpolska
Member

I would accept testing of requirements-oldest.txt on the oldest supported Python version on Linux (the newest one is unlikely to work with it). I would also accept pinning all dependencies to <= the latest major version as of ~2 months ago, or to the specific version that was current ~12 months ago (with no >= anywhere).

@felixfontein
Contributor

I don't like upper version pins (unless needed for explicitly known incompatibilities); I've seen them cause problems with dependent projects and users too often...

If we do use upper version pins, we really need test infrastructure that keeps track of new major releases and runs the tests against them.
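As a rough sketch of what such infrastructure could do, something along these lines would flag pins that a new major release has outgrown (this assumes the packaging library and PyPI's JSON API; nothing here is settled):

```python
# Compare each requirement's specifier set against the newest release
# on PyPI and report requirements the latest version no longer satisfies.
import json
import urllib.request

from packaging.requirements import Requirement
from packaging.version import Version

def latest_version(package: str) -> Version:
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as response:
        return Version(json.load(response)["info"]["version"])

with open("requirements.txt") as handle:
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("#", "-")):
            continue  # skip blanks, comments, and pip options
        req = Requirement(line)
        newest = latest_version(req.name)
        if not req.specifier.contains(newest):
            print(f"{req.name}: latest {newest} is outside '{req.specifier}'")
```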

@aknrdureegaesr
Contributor Author

"with no >= anywhere" Why not, Chris?

I thought rather the opposite: judiciously ruling out ancient versions seems to be the no-brainer here, though it takes some effort in practice. It's the future versions that are the difficult part, even conceptually. Or at least that's what I thought.

Not ruling out ancient versions essentially claims, e.g., "Nikola is fine with any version of Pygments that ever existed." Why would we want to claim that if we know it is not true?

@aknrdureegaesr
Contributor Author

Fortunately, PyPI is quite scriptable and can be scraped easily. Here is a list of all versions of the primary requirements of Nikola currently available on PyPI:

pypi_requirements_available_versions.json
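For reference, the scraping boils down to something like this (a sketch rather than the actual script; PyPI's JSON API exposes every release of a package):

```python
# List every released version PyPI knows for a given package.
import json
import urllib.request

def available_versions(package: str) -> list[str]:
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as response:
        return list(json.load(response)["releases"])

print(available_versions("Pygments"))
```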

@Kwpolska
Member

"with no >= anywhere" Why not, Chris?

I might have messed up the equality signs. I’m against pinning the maximum/upper versions.

@aknrdureegaesr
Contributor Author

Over the last few days, my PC has tried many combinations of dependency versions to see whether the Nikola tests still pass with them. "Experimental computer science." It is not done yet. But the oldest requirements known to me that make the tests pass are (output of pip freeze):

aiohttp==3.8.6
aiosignal==1.3.1
argh==0.31.0
asttokens==2.4.1
async-timeout==4.0.3
atomicwrites==1.4.1
attrs==23.2.0
Babel==2.12.0
beautifulsoup4==4.12.2
bleach==6.1.0
blinker==1.3
certifi==2023.11.17
charset-normalizer==3.3.2
cloudpickle==3.0.0
comm==0.2.1
coverage==7.4.0
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
docutils==0.19
doit==0.33.1
exceptiongroup==1.2.0
executing==2.0.1
fastjsonschema==2.19.1
feedparser==6.0.2
flake8==7.0.0
freezegun==0.2.5
frozenlist==1.4.1
ghp-import==0.2.0
hsluv==5.0.0
html5lib==0.2
idna==3.6
importlib-metadata==7.0.1
ipykernel==6.25.2
ipython==8.18.1
ipython-genutils==0.2.0
jedi==0.19.1
Jinja2==3.1.0
jsonschema==4.20.0
jsonschema-specifications==2023.12.1
jupyter-client==8.6.0
jupyter-core==5.7.1
jupyterlab-pygments==0.3.0
lxml==4.5.2
Mako==1.0.9
Markdown==3.0
MarkupSafe==2.1.3
matplotlib-inline==0.1.6
mccabe==0.7.0
micawber==0.2.2
mistune==3.0.2
more-itertools==10.2.0
multidict==6.0.4
natsort==4.0.0
nbclient==0.9.0
nbconvert==7.14.1
nbformat==5.9.2
nest-asyncio==1.5.9
notebook==6.0.0
packaging==23.2
pandocfilters==1.5.0
parso==0.8.3
pathtools==0.1.2
pexpect==4.9.0
phpserialize==1.3
piexif==1.0.0
Pillow==9.2.0
platformdirs==4.1.0
pluggy==1.3.0
prometheus-client==0.19.0
prompt-toolkit==3.0.43
psutil==5.9.7
ptyprocess==0.7.0
pure-eval==0.2.2
py==1.11.0
pycodestyle==2.11.1
pyflakes==3.2.0
pygal==2.0.11
Pygments==2.12.0
pyinotify==0.9.6
Pyphen==0.8
PyRSS2Gen==1.1
pytest==4.3.0
pytest-cov==2.9.0
python-dateutil==2.8.2
PyYAML==6.0.1
pyzmq==25.1.2
referencing==0.32.1
requests==2.31.0
rpds-py==0.17.1
ruamel.yaml==0.15.98
Send2Trash==1.8.2
sgmllib3k==1.0.0
six==1.16.0
smartypants==2.0.1
soupsieve==2.5
stack-data==0.6.3
terminado==0.18.0
tinycss2==1.2.1
toml==0.7.1
tornado==6.4
traitlets==5.14.1
typing-extensions==4.9.0
typogrify==2.0.2
Unidecode==0.4.20
urllib3==2.1.0
watchdog==0.7.1
wcwidth==0.2.13
webencodings==0.5.1
yarl==1.9.4
zipp==3.17.0

These are the tests run without coverage reports (to speed things up).

@aknrdureegaesr
Contributor Author

Now that #3744 has been merged, it is possible to create one version of requirements-oldest.txt by simply replacing all >= with == in the three requirements files. This approach is not concerned with the age of the versions included; my script has simply searched for old versions that still make the tests succeed.

Are you interested in the script itself, or only in the result (the latter being in #3744)?

@aknrdureegaesr
Contributor Author

(The script is far from perfect. There could be even older versions that also make the tests succeed, in particular if several of our dependencies depend on certain versions of each other without announcing that fact to pip.)

@Kwpolska
Member

You can post the script in this issue for future reference, we probably don’t need to add it to the repository.

@aknrdureegaesr
Contributor Author

aknrdureegaesr commented Jan 17, 2024

find_oldest_working_dependencies.zip actually has two scripts.

  • The first goes to PyPI and fetches all versions of our dependencies. The result is a JSON file.
  • The second tries to decrease version numbers and runs the tests, to find out which lowest versions still work (a rough sketch appears below). It needs no dependencies itself, just plain Python (tested with 3.9).

The scripts are meant to sit in a subdirectory of scripts.

Deficiencies of the second script:

  • It will not terminate even when the algorithm can no longer find any lower working versions (endless loop).
  • It writes a file (easily parseable) that could be used, if the program was interrupted, to continue where it left off. However, parsing that file was never implemented, as I improved the format after one long run. Instead, there is one-off code with known "good" versions.
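For future reference alongside the zip, the heart of the second script's search is roughly the following (a simplified sketch, not the attached code; function names are mine):

```python
# Step one dependency down through progressively older releases and keep
# the oldest version with which the test suite still passes.
import subprocess

def tests_pass() -> bool:
    # Run the suite without coverage reports, as in the experiments above.
    return subprocess.run(["python", "-m", "pytest", "-x", "-q"]).returncode == 0

def oldest_working(package: str, versions_newest_first: list[str]) -> str:
    oldest_ok = versions_newest_first[0]
    for version in versions_newest_first[1:]:
        subprocess.run(["pip", "install", f"{package}=={version}"], check=True)
        if not tests_pass():
            break  # too old: keep the previous (newer) version
        oldest_ok = version
    return oldest_ok
```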
