Python 3.13: ModuleNotFoundError: No module named 'cgi' #2014

Open
1 of 3 tasks
hroncok opened this issue Jan 5, 2024 · 9 comments · May be fixed by #2021

Comments

@hroncok

hroncok commented Jan 5, 2024

I'm submitting a ...

  • [x] bug report
  • [ ] feature request
  • [ ] question about the decisions made in the repository

What is the current behavior?

With Python 3.13.0a2, importing cherrypy fails:

...
    import cherrypy
/usr/lib/python3.13/site-packages/cherrypy/__init__.py:66: in <module>
    from ._cperror import (
/usr/lib/python3.13/site-packages/cherrypy/_cperror.py:135: in <module>
    from cherrypy.lib import httputil as _httputil
/usr/lib/python3.13/site-packages/cherrypy/lib/httputil.py:15: in <module>
    from cgi import parse_header
E   ModuleNotFoundError: No module named 'cgi'

This is reproducible by:

$ python3.13 -m venv venv3.13
$ venv3.13/bin/pip install cherrypy
...
$ venv3.13/bin/python
>>> import cherrypy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    import cherrypy
  File ".../venv3.13/lib64/python3.13/site-packages/cherrypy/__init__.py", line 66, in <module>
    from ._cperror import (
  File ".../venv3.13/lib64/python3.13/site-packages/cherrypy/_cperror.py", line 135, in <module>
    from cherrypy.lib import httputil as _httputil
  File ".../venv3.13/lib64/python3.13/site-packages/cherrypy/lib/httputil.py", line 15, in <module>
    from cgi import parse_header
ModuleNotFoundError: No module named 'cgi'

I see the code in the main branch:

from cgi import parse_header

What is the expected behavior?

The cgi module was deprecated in Python 3.11 and removed in Python 3.13 (PEP 594). Expected behavior is that CherryPy does not use it.

What is the motivation / use case for changing the behavior?

We are testing all Fedora packages with early pre-releases of Python 3.13 to ensure everything is working once the final 3.13 is out.

Please tell us about your environment:

  • Cheroot version: ?
  • CherryPy version: main
  • Python version: 3.13.0a2
  • OS: Fedora Linux 39, 40
  • Browser: N/A
@webknjaz
Member

webknjaz commented Jan 5, 2024

Thanks for the report, Miro!

It looks like I attempted to replace cgi in 6b7c2cd exactly a year ago, then reverted it with 7338b83 a few days later and suppressed the deprecation warning in a follow-up (97ae5b7).

I don't remember the exact reason, but it probably made the CI fail somehow and I didn't have an immediate solution.

It seems like the change needs to be re-introduced with some edits, and the suppression of the deprecation warning reverted. This probably requires some investigation.

@webknjaz webknjaz added bug help wanted CherryPy code task critical Hacktoberfest reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR labels Jan 5, 2024
@radez
Contributor

radez commented Jan 9, 2024

I tried reintroducing the change and a bunch of tests failed. I'll try and find some time to help look at this.

@radez
Contributor

radez commented Jan 9, 2024

The original patch ran into a fundamental difference between how email.message.EmailMessage and cgi.parse_header are meant to be used.

The email.message module is trying to create a higher level object that is intelligent about the headers it has defined. The cgi.parse_header function is a very low level unintelligent function that simply parses header content based on structure and does not inspect the contents.

Here's the cgi.parse_header code: https://github.com/python/cpython/blob/3.12/Lib/cgi.py#L226C1-L256C22
Here's the email.message.EmailMessage.get_content_type() code: https://github.com/python/cpython/blob/3.12/Lib/email/message.py#L596-L618

Because the email.message object is this opinionated, cherrypy can't use it in the generic way it was using cgi.parse_header. The suggested replacement funnels every value through logic that assumes a Content-Type header and requires a slash in the value, whereas cgi.parse_header treats the value as a generic header. When the value has no slash, the email module returns text/plain instead.

So when the value passed to the email module is ISO-8859-1, 0.7, utf-8, *, or gzip, as test_tools.py::ToolTests.testCombinedTools exercises, the suggested replacement always returns text/plain.
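
To make the failure mode concrete, here is a minimal sketch of an email-based replacement in the style PEP 594 suggests. The helper name parse_header_via_email is hypothetical, and it uses email.message.Message rather than the exact code from the reverted commit, so treat it as an illustration only.

```python
from email.message import Message


def parse_header_via_email(value):
    """Hypothetical helper: parse a header value via the email package."""
    msg = Message()
    msg['content-type'] = value
    # get_content_type() falls back to 'text/plain' unless the value
    # contains exactly one slash.
    main_value = msg.get_content_type()
    # get_params() returns (name, value) pairs; the first pair is the
    # main value itself, so it is skipped here.
    params = dict(msg.get_params()[1:])
    return main_value, params


for header in ('application/json; charset="utf8"', 'gzip', '0.7'):
    print(header, '->', parse_header_via_email(header))
```

A real Content-Type value survives the round trip, but gzip and 0.7 both come back as text/plain, which is the mismatch described above.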

I have looked through the email and http modules. The closest thing I have found so far to what cgi.parse_header was doing is the email.message._splitparam function.

>>> email.message._splitparam('application/json; charset="utf8"')
('application/json', 'charset="utf8"')
>>> email.message._splitparam('gzip')
('gzip', None)
>>> email.message._splitparam('0.7')
('0.7', None)

Even then, there's more massaging to do compared to the cgi module:

>>> cgi.parse_header('application/json; charset="utf8"')
('application/json', {'charset': 'utf8'})
>>> cgi.parse_header('gzip')
('gzip', {})

The email.message._parseparam() function seems to split multiple params into a list of k=v pairs.

So, using a combination of _parseparam and _splitparam and further parsing the k=v pairs we could recreate what cgi.parse_header was doing.
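
A rough sketch of that combination might look like the following. The function name parse_header_from_email_helpers is hypothetical, and _splitparam and _parseparam are private helpers of the email package, so this relies on internals that could change between Python versions.

```python
from email import message as email_message
from email.utils import unquote


def parse_header_from_email_helpers(line):
    """Hypothetical helper approximating cgi.parse_header."""
    # NOTE: _splitparam and _parseparam are private to the email package
    # and may change without notice.
    main_value, rest = email_message._splitparam(line)
    params = {}
    if rest:
        for item in email_message._parseparam(rest):
            name, sep, value = item.partition('=')
            if sep:
                # email.utils.unquote also strips <...> wrappers, which
                # cgi.parse_header did not do; this is one known difference.
                params[name.strip().lower()] = unquote(value.strip())
    return main_value, params


print(parse_header_from_email_helpers('application/json; charset="utf8"'))
print(parse_header_from_email_helpers('gzip'))
```

For the two inputs shown earlier this gives the same tuples as cgi.parse_header, but it has not been checked against every edge case the cgi function handled.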

Would you consider simply replacing the httputil.parse() function with a copy of the cgi.parse_header code, kept in the cherrypy code base?

I'll try and mock up a patch for each scenario to validate them.

@webknjaz
Member

webknjaz commented Jan 10, 2024

Yeah, I also did some research, and here are the logs of the differences:

$ python -c 'import cherrypy, sys; accept_header = cherrypy.lib.httputil.AcceptElement.from_str("iso-8859-5"); print(f"{accept_header=}")'
elementstr='iso-8859-5' ===> initial_value='text/plain' &&& params={}
accept_header=<cherrypy.lib.httputil.AcceptElement object at 0x7f3b6aa03e50>
$ python -c 'import cherrypy, sys; accept_header = cherrypy.lib.httputil.AcceptElement.from_str("0.8"); print(f"{accept_header=}")'
elementstr='0.8' ===> initial_value='text/plain' &&& params={}
accept_header=<cherrypy.lib.httputil.AcceptElement object at 0x7fde09f03e50>
$ python -c 'import cherrypy, sys; accept_header = cherrypy.lib.httputil.AcceptElement.from_str("unicode-1-1"); print(f"{accept_header=}")'
elementstr='unicode-1-1' ===> initial_value='text/plain' &&& params={}
accept_header=<cherrypy.lib.httputil.AcceptElement object at 0x7fae2f503e50>
$ python -c 'import cherrypy, sys; accept_header = cherrypy.lib.httputil.AcceptElement.from_str("iso-8859-5"); print(f"{accept_header=}")'
elementstr='iso-8859-5' ===> initial_value='iso-8859-5' &&& params={}
accept_header=<cherrypy.lib.httputil.AcceptElement object at 0x7f9f7d703e50>
$ python -c 'import cherrypy, sys; accept_header = cherrypy.lib.httputil.AcceptElement.from_str("0.8"); print(f"{accept_header=}")'
elementstr='0.8' ===> initial_value='0.8' &&& params={}
accept_header=<cherrypy.lib.httputil.AcceptElement object at 0x7f294b547f90>
$ python -c 'import cherrypy, sys; accept_header = cherrypy.lib.httputil.AcceptElement.from_str("unicode-1-1"); print(f"{accept_header=}")'
elementstr='unicode-1-1' ===> initial_value='unicode-1-1' &&& params={}
accept_header=<cherrypy.lib.httputil.AcceptElement object at 0x7eff5d603e50>

So it sounds like you're right and there's no easy way to reproduce the original behavior with the email module.

I guess, it'd be a good idea to vendor the removed function from CPython, putting it into https://github.com/cherrypy/cherrypy/blob/main/cherrypy/_cpcompat.py.
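
For reference, a vendored copy could look roughly like the sketch below. It follows the CPython 3.12 cgi.py implementation linked earlier in the thread; where it would live in _cpcompat.py and what it would be called there are assumptions, and the eventual PR may differ.

```python
def _parseparam(s):
    """Yield the semicolon-separated parts of s, honouring quoted strings."""
    while s[:1] == ';':
        s = s[1:]
        end = s.find(';')
        # Skip semicolons that fall inside a double-quoted value.
        while end > 0 and (s.count('"', 0, end) - s.count('\\"', 0, end)) % 2:
            end = s.find(';', end + 1)
        if end < 0:
            end = len(s)
        f = s[:end]
        yield f.strip()
        s = s[end:]


def parse_header(line):
    """Parse a header value into a main value and a dict of parameters."""
    parts = _parseparam(';' + line)
    key = next(parts)
    pdict = {}
    for p in parts:
        i = p.find('=')
        if i >= 0:
            name = p[:i].strip().lower()
            value = p[i + 1:].strip()
            # Strip surrounding quotes and undo backslash escapes.
            if len(value) >= 2 and value[0] == value[-1] == '"':
                value = value[1:-1]
                value = value.replace('\\\\', '\\').replace('\\"', '"')
            pdict[name] = value
    return key, pdict


print(parse_header('application/json; charset="utf8"'))
print(parse_header('gzip'))
```

Because the function never inspects the main value, gzip, 0.8, and unicode-1-1 pass through unchanged, matching the second set of log lines above.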

But before doing that, could you please add tests for AcceptElement and HeaderElement that pass against the current implementation? And only then make a separate PR or commit that replaces cgi.parse_header() with a vendored copy?

That would document, as code, how HeaderElement is supposed to behave.

@radez
Contributor

radez commented Jan 10, 2024

Sounds good. I'll work on making some progress on it.

@webknjaz
Member

@radez hey, did you manage to get to that?

@radez
Contributor

radez commented Feb 19, 2024

I've got a fix but I've not had a chance to work on the unit tests yet.

@webknjaz
Member

@radez it'd be a bit more transparent if you could make a draft PR out of it.

@radez
Contributor

radez commented Feb 19, 2024

Sure, lemme post what I have. I had intended to get the tests in a draft state too and push it all up at once but work has taken a front seat and I wasn't able to make it happen all at once. I'll put a patch up today.

radez added a commit to radez/cherrypy that referenced this issue Feb 19, 2024
the module was deprecated in py 3.11 and removed in py 3.13
there are examples of using the email module to resolve this:
PEP 594
python-babel/babel#873
https://stackoverflow.com/questions/69068527/python-3-cgi-parse-header

This method doesn't seem to work for cherrypy.
The email.message module is trying to create a higher level
object that is intelligent about the headers it has defined. The
cgi.parse_header function is a very low level unintelligent function
that simply parses header content based on structure and does not
inspect the contents. Because of the intelligence of the email.message
object cherrypy can't use it in the very generic way that the
cgi.parse_header function was being used.

Fix cherrypy#2014
@radez radez linked a pull request Feb 19, 2024 that will close this issue
7 tasks
radez added a commit to radez/cherrypy that referenced this issue Apr 2, 2024
This is in preparation for issue cherrypy#2014
Establishing tests to validate these objects so we can
pull in the deprecated cgi.parse_header code
radez added a commit to radez/cherrypy that referenced this issue Apr 3, 2024
radez added a commit to radez/cherrypy that referenced this issue May 16, 2024