Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URL.render_as_string doesn't always give an URL that is parsed same way #11234

Open
AlexeyDmitriev opened this issue Apr 4, 2024 · 8 comments
Labels
bug Something isn't working easy a one / two liner type of thing that anyone can do in short order. also see "fairly easy" engine engines, connections, transactions, isolation levels, execution options
Milestone

Comments

@AlexeyDmitriev
Copy link

AlexeyDmitriev commented Apr 4, 2024

Describe the bug

So the code below fails because database is not escaped

Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected

No response

SQLAlchemy Version in Use

2.0.1

DBAPI (i.e. the database driver)

sqlite

Database Vendor and Major Version

sqlite

Python Version

3.11

Operating system

linux

To Reproduce

import sqlalchemy
from sqlalchemy.engine import URL
first_url = URL.create('sqlite', username=None, password=None, host=None, port=None, database="a?b=c") # I could have this filename
first_url_as_string = first_url.render_as_string()
second_url = sqlalchemy.make_url(first_url_as_string)
assert first_url.database == second_url.database, f"Fail, expected: {first_url.database} found: {second_url.database}"

Error

AssertionError: Fail, expected: a?b=c found: a

Additional context

I didn't see any doc that claims that this should be roundtrip conversion, but it seems it's intended to be because there's escaping of other fields.

@AlexeyDmitriev AlexeyDmitriev added the requires triage New issue that requires categorization label Apr 4, 2024
@CaselIT
Copy link
Member

CaselIT commented Apr 4, 2024

Hi,

I think this is expected behaviour. A string is interpreted as an url, so the ? delimits the query from the path.

@CaselIT CaselIT added expected behavior that's how it's meant to work. consider the "documentation" label in addition engine engines, connections, transactions, isolation levels, execution options and removed requires triage New issue that requires categorization labels Apr 4, 2024
@zzzeek
Copy link
Member

zzzeek commented Apr 4, 2024

it looks like a bug . they are creating the URL programattically. The name of the database is a?b=c. that should round trip.

That said, nobody would ever have a database name like that. so what's the XY problem here? (e.g. what do you actually need to do ?)

@AlexeyDmitriev
Copy link
Author

@CaselIT Hi, I might have been confusing:
I understand that if I pass sqlite:///a?b=c then, b=c are parsed as query and it is expected

What in my view is not expected is the fact, that first_url_as_string contains this ? directly (instead of some kind of escaping, e.g urlencoded, etc), so that it would roundtrip well

@zzzeek
Copy link
Member

zzzeek commented Apr 4, 2024

this is the patch, this is for 2.1 only as URL adjustments are extremely high risk

diff --git a/lib/sqlalchemy/engine/url.py b/lib/sqlalchemy/engine/url.py
index 1eeb73a23..7565252c7 100644
--- a/lib/sqlalchemy/engine/url.py
+++ b/lib/sqlalchemy/engine/url.py
@@ -641,7 +641,7 @@ class URL(NamedTuple):
         if self.port is not None:
             s += ":" + str(self.port)
         if self.database is not None:
-            s += "/" + self.database
+            s += "/" + quote_plus(self.database)
         if self.query:
             keys = list(self.query)
             keys.sort()
@@ -888,11 +888,10 @@ def _parse_url(name: str) -> URL:
             query = None
         components["query"] = query
 
-        if components["username"] is not None:
-            components["username"] = unquote(components["username"])
+        for comp in "username", "password", "database":
+            if components[comp] is not None:
+                components[comp] = unquote(components[comp])
 
-        if components["password"] is not None:
-            components["password"] = unquote(components["password"])
 
         ipv4host = components.pop("ipv4host")
         ipv6host = components.pop("ipv6host")

@zzzeek zzzeek added bug Something isn't working easy a one / two liner type of thing that anyone can do in short order. also see "fairly easy" and removed expected behavior that's how it's meant to work. consider the "documentation" label in addition labels Apr 4, 2024
@zzzeek zzzeek added this to the 2.1 milestone Apr 4, 2024
@AlexeyDmitriev
Copy link
Author

AlexeyDmitriev commented Apr 4, 2024

@zzzeek

I tried to understand whether I needed to escape my path when building an URL as a string (we don't have any strange character in the path to database but I wanted to fix it just in case). I've found that I can use URL class.
But while I was playing with it, I've noticed that behaviour is not what I expected

So, this behaviour doesn't bother me, I've already solved my problem, just decided to report what for me looks like a bug

@CaselIT
Copy link
Member

CaselIT commented Apr 4, 2024

I fear that the patch above will break existing use cases, since I'm not sure what all the various db driver would do in this case.

tried to create a db named "my?db=1" in pg and all kind of tools are failing to connect. Like I can't manage to connect using psql... Psycopg2 works though

@zzzeek
Copy link
Member

zzzeek commented Apr 4, 2024

this is less sensitive than "password". this breaks if someone had a database named "a%3Fb%3Dc" with URL escape symbols in it, that's basically it.

@CaselIT
Copy link
Member

CaselIT commented Apr 4, 2024

sure, probably not a big dial, but needs a breaking change node

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working easy a one / two liner type of thing that anyone can do in short order. also see "fairly easy" engine engines, connections, transactions, isolation levels, execution options
Projects
None yet
Development

No branches or pull requests

3 participants