Regression on parsing invalid URLs #2382

Open
kamil-certat opened this issue Jun 22, 2023 · 0 comments
Labels
bug Indicates an unexpected problem or unintended behavior component: bots component: core

Comments

@kamil-certat
Contributor

As a continuation of #2377, we have a regression in parsing invalid URLs. Previously, urllib was much more liberal in processing URLs; now it rejects many more cases.

We use it to sanitize URLs, and html_parser is an example of a bot that relies on the liberal behavior in its tests:

    EXAMPLE_EVENT2['source.url'] = "http://[D] lingvaworld.ru/media/system/css/messg.jpg"

    def test_event_without_split(self):
        self.sysconfig = {"columns": ["time.source", "source.url", "malware.hash.md5",
                                      "source.ip", "__IGNORE__"],
                          "skip_head": True,
                          "default_url_protocol": "http://",
                          "type": "malware-distribution"}
        self.run_bot()
        self.assertMessageEqual(0, EXAMPLE_EVENT2)

In patched Python versions (e.g. 3.11.4), this URL is rejected. We need to either decide against allowing such URLs, or redesign our sanitization.
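To reproduce the difference outside the bot, a minimal sketch along these lines may help. It assumes the rejection surfaces as a ValueError from urllib.parse.urlsplit; the helper name lenient_urlsplit is hypothetical and not part of IntelMQ.

```python
from urllib.parse import urlsplit


def lenient_urlsplit(url):
    """Return the SplitResult for url, or None if urllib rejects it.

    Hypothetical helper for illustration only; not IntelMQ code.
    """
    try:
        return urlsplit(url)
    except ValueError:
        # Patched Python versions validate the bracketed host and raise here.
        return None


url = "http://[D] lingvaworld.ru/media/system/css/messg.jpg"
result = lenient_urlsplit(url)
if result is None:
    print("rejected by urllib")
else:
    print("accepted, netloc =", result.netloc)
```

Which branch is taken depends on the Python version in use, so running this on both an older and a patched interpreter should show whether the regression applies.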

Temporarily, the test is skipped to unblock other work.

@kamil-certat kamil-certat added bug Indicates an unexpected problem or unintended behavior component: bots component: core labels Jun 22, 2023
kamil-certat added a commit to kamil-certat/intelmq that referenced this issue Jun 22, 2023
Stricter validation in urllib causes trouble
when processing invalid URLs. The correct solution
on our side is unclear at the moment, see certtools#2382