
Increase total number to 360 * 2 or maybe even 360 * 3 to include more packages #40

Open
chhantyal opened this issue May 1, 2017 · 12 comments


@chhantyal
Owner

From E-mail sent to me:

Current progress of 343/360 = 95% looks good, but it is misleading because there are many more than 360 active packages...

So my suggestion is to increase 360 to 3 * 360 = 1080. Why 1080? Because it keeps the same style for the graph (you can select a green or white colour for each sector based on a small group of 3 items).

More packages - more fun to track progress. :-)

@anarcat

anarcat commented Sep 13, 2018

In fact 120 is weird: why not make it a round number like 100, 500 or 1000?

@anarcat

anarcat commented Sep 13, 2018

Oh, never mind, I see why it's 120: it's a circle and those are degrees. Sorry for the dumb comment. :)

@hugovk
Contributor

hugovk commented Sep 14, 2018

This is how the top 1,080 would look:

https://hugovk.github.io/py3readiness/

master...hugovk:top-1080

[screenshots of the top-1,080 preview]

@pklapperich

There are 150,000 projects on PyPI. Ideally, the cutoff for the "top X packages" should be chosen based on some metric from PyPI. I'm not sure what that metric would be (downloads/month, maybe?), but it would be nice to be able to say that any package not represented on the readiness list is either inactive or represents a tiny user base.

@hugovk
Contributor

hugovk commented Sep 17, 2018

#57 is increasing the limit to 720. I think that's a good limit.

If the number is too big, it'll take a long time to generate results, make a lot of requests to the PyPI server, and take a long time to load on the py3readiness page, or need a radical redesign.

The current 360 (and soon 720) are already chosen based on a metric from PyPI: the number of downloads over the last 365 days.

By the way, the number 1 (pip) is at 249,545,448 downloads,

number 720 (azure-mgmt-datalake-store) at 924,266 downloads,

and number 5,000 (django-haystack-elasticsearch) at 21,412.

https://hugovk.github.io/top-pypi-packages/
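
If anyone wants to sanity-check those cutoffs themselves, here is a minimal sketch (not part of this repo) that loads a ranked download-count dump and prints the package and count at each proposed cutoff. It assumes a top-pypi-packages-style JSON file with a "rows" list of {"project": ..., "download_count": ...} entries; the exact URL and field names below are assumptions, so check that project for the real file and schema.

# Sketch only: inspect download counts at candidate cutoffs.
# DUMP_URL and the "rows"/"project"/"download_count" schema are assumptions
# based on the top-pypi-packages project; verify them before relying on this.
import requests

DUMP_URL = "https://hugovk.github.io/top-pypi-packages/top-pypi-packages.json"  # assumed URL

def downloads_at_rank(rows, rank):
    # rank is 1-based; rows are assumed to be sorted by download count, descending.
    row = rows[rank - 1]
    return row["project"], row["download_count"]

rows = requests.get(DUMP_URL, timeout=30).json()["rows"]
for cutoff in (360, 720, 1080, 1440, 1800):
    if cutoff <= len(rows):
        name, count = downloads_at_rank(rows, cutoff)
        print("#{}: {} with {:,} downloads".format(cutoff, name, count))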

@cclauss
Contributor

cclauss commented Apr 7, 2019

Do an update and then increase the number yet again?

@CAM-Gerlach

@hugovk @chhantyal Given all packages have been green for a while now, any chance we can get this bumped to 360 * 3 = 1080? It seems like a good number: compared to 720, it is above 1000, which "feels" much larger in base 10 and adds more authority to the resulting Python 3 percentage; compared to 1440, it reduces page size and bandwidth costs while still being above 1000 and (subjectively) feeling like a more "significant" number (thanks to 1080p), and it doesn't expand the list too quickly, so we can always increase it later. Expanding the list to a substantially larger number is important to help identify the remaining packages that need porting, deprecation/replacement, or addition as an exception in caniusepython3.

Assuming a number is decided upon, anything we can do to help make this happen?

For reference, here are the 30-day and 365-day download counts at the currently proposed cutoffs (for simplicity, the breakpoints don't account for ignored packages, which affects the rank by ~2%). 1800 has some attractive properties: zeros in the last two places, below 100 k downloads per month, and below 1 million downloads per year. However, it's a pretty big jump...

Cutoff   30-day     365-day
360      1400 k     11450 k
720       526 k      4566 k
1080      245 k      2083 k
1440      143 k      1242 k
1800       91 k       793 k

Thoughts?

@hugovk
Contributor

hugovk commented Nov 17, 2019

I've updated the preview for 1,080: https://hugovk.github.io/py3readiness/

[screenshots of the updated 1,080 preview]

@CAM-Gerlach

CAM-Gerlach commented Nov 18, 2019

Thanks @hugovk! Another great property of 1080 is that the three columns will be the 360, 720 and 1080 ones, respectively.

Looking over the 16 entries in the 360-720 bracket that aren't tagged for Python 3: 2 of them are dead/unmaintained projects whose Python 3 replacements (in the stdlib and third-party, respectively) aren't drop-in (pathtools and our oauth); 1 is in maintenance mode with no immediate Py3 porting plans, but has a Py3 stdlib replacement with most (though not all) of its features that may even be at least partially code compatible (ruamel-ordereddict); and the rest are all Python 3 compatible but simply not tagged, almost all of them with no trove classifiers at all (the 5 sphinxcontrib packages do have them, but due to an apparent packaging error all the PyPI metadata is inline in the readme instead of being literal Python code). So the real number is likely closer to ~98-99% Python 3, not 94%.
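
For anyone repeating this kind of audit, here is a rough sketch of how the classifier check can be done by hand against the public PyPI JSON API; this is not py3readiness code, and the package names are just illustrative examples from the list above.

# Sketch only: flag packages that lack a "Programming Language :: Python :: 3"
# trove classifier, using the public PyPI JSON API.
import requests

SAMPLE = ["pathtools", "sphinxcontrib-websupport"]  # illustrative examples only

def has_py3_classifier(name):
    resp = requests.get("https://pypi.org/pypi/{}/json".format(name), timeout=30)
    resp.raise_for_status()
    classifiers = resp.json()["info"]["classifiers"]
    return any(c.startswith("Programming Language :: Python :: 3") for c in classifiers)

for name in SAMPLE:
    print(name, "tagged for Python 3:", has_py3_classifier(name))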

@hugovk
Contributor

hugovk commented Nov 18, 2019

As mentioned, some of those can be turned green by adding Trove classifiers, or updating caniusepython3.
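
For maintainers of the untagged packages, declaring Python 3 support is a one-line metadata change. A minimal setup.py-style sketch (the project name and versions are placeholders; setup.cfg or pyproject.toml equivalents work just as well):

# Sketch only: the classifiers entry is what lets caniusepython3 /
# py3readiness mark a package as Python 3 ready; everything else here
# is placeholder metadata.
from setuptools import setup

setup(
    name="example-package",   # placeholder
    version="1.0.0",          # placeholder
    classifiers=[
        "Programming Language :: Python :: 2",
        "Programming Language :: Python :: 3",
    ],
)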

Note, yesterday I ran make generate a few times to update the numbers locally, and on the second or third run hit this:

(767, 1080, u'cfgv')
(768, 1080, u'azure-mgmt-machinelearningcompute')
(769, 1080, u'azure-mgmt-commerce')
Traceback (most recent call last):
  File "generate.py", line 21, in <module>
    main()
  File "generate.py", line 15, in main
    annotate_wheels(packages)
  File "/Users/hugo/github/py3readiness/src/utils.py", line 24, in annotate_wheels
    if caniusepython3.check(projects=[package['name']]):
  File "/usr/local/lib/python2.7/site-packages/caniusepython3/__init__.py", line 51, in check
    elif not pypi.supports_py3(dependency):
  File "/usr/local/lib/python2.7/site-packages/caniusepython3/pypi.py", line 82, in supports_py3
    request = requests.get("https://pypi.org/pypi/{}/json".format(project_name))
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /pypi/azure-mgmt-commerce/json (Caused by SSLError(SSLError("bad handshake: SysCallError(60, 'ETIMEDOUT')",),))
make: *** [generate] Error 1

Possibly rate limiting? Anyway, if the data is only updated once in a while, 1,080 should be fine technically.
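
If it is rate limiting or just a flaky connection, one low-effort mitigation would be to retry the check with a short backoff instead of letting a single SSL timeout abort the whole run. A rough sketch: the package["name"] access and the caniusepython3.check call match the traceback above, but the retry wrapper itself is an assumption, not existing py3readiness code.

# Sketch only: retry caniusepython3.check() a few times so one transient
# network error does not kill the whole `make generate` run.
import time

import caniusepython3

def check_with_retries(name, attempts=3, delay=5):
    for attempt in range(attempts):
        try:
            return caniusepython3.check(projects=[name])
        except Exception:  # e.g. requests SSLError / ConnectionError
            if attempt == attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff

# Usage inside the annotate_wheels loop would be roughly:
#     if check_with_retries(package["name"]):
#         ...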

@CAM-Gerlach

As mentioned, some of those can be turned green by adding Trove classifiers, or updating caniusepython3.

@hugovk Yup, I've helped a bit with each of those before. Especially if it looks like this change will be approved, I was planning on going through all of them (presuming someone else doesn't get to it first), sending PRs or bumping existing ones for the packages that are already Py3 compatible, and getting caniusepython3 exceptions added for the nonresponsive ones (that support Py3). What's a good timeout from submitting or bumping a PR and then just giving up and getting an exception added to caniusepython3?

@hugovk
Contributor

hugovk commented Nov 18, 2019

What's a good timeout from submitting or bumping a PR and then just giving up and getting an exception added to caniusepython3?

Two weeks? One month?
