trino python client integrating with OAUTH2 enabled trino cluster #207

Open
eadapa29 opened this issue Jul 22, 2022 · 11 comments

Comments

@eadapa29

Hello,
I am trying to connect to an OAuth2-enabled Trino cluster using the Python client. Authentication succeeds and everything works fine, but I have to click a link, which is an additional step. Is it possible to authenticate automatically, without having to open the URL externally?

In [1]: import trino
   ...: conn = trino.dbapi.connect(
   ...:     host='trino.somedomain.net',
   ...:     port=443,
   ...:     user='first.last',
   ...:     catalog='iceberg',
   ...:     schema='ds_scratch',
   ...:     http_scheme='https',
   ...:     auth=trino.auth.OAuth2Authentication(),
   ...: )
   ...: cur = conn.cursor()

In [2]: cur.execute('SELECT * FROM system.runtime.nodes')

Open the following URL in the browser for the external authentication:
https://trino.somedomain.net/oauth2/token/initiate/042f6e4167d4e6a3f70068ec4389037c2b9c34f3ec356ddc5522a3e13e179fd9

I am talking about the last step, where it says "Open the following URL in the browser...".
Can the redirect be handled within the Python client?

@eadapa29 changed the title from "Support for OKTA Group provider" to "trino python client integrating with OAUTH2 enabled trino cluster" on Jul 22, 2022
@mdesmet
Contributor

mdesmet commented Jul 22, 2022

Note in our docs:

https://trinodb.slack.com/archives/CFPVDCDHV/p1657883633509519

A callback to handle the redirect url can be provided via param redirect_auth_url_handler of the trino.auth.OAuth2Authentication class. By default, it will try to launch a web browser (trino.auth.WebBrowserRedirectHandler) to go through the authentication flow and output the redirect url to stdout (trino.auth.ConsoleRedirectHandler). Multiple redirect handlers are combined using the trino.auth.CompositeRedirectHandler class.
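
As a rough illustration of the paragraph above, a custom handler can be a callable that receives the redirect URL. This sketch is not taken from the docs: `StreamRedirectHandler` and `build_auth` are hypothetical names, and it assumes `OAuth2Authentication` accepts any callable taking the URL as `redirect_auth_url_handler`.

```python
import sys


class StreamRedirectHandler:
    """Hypothetical handler: writes the redirect URL to a stream instead of opening a browser."""

    def __init__(self, stream=None):
        # Default to stderr so the URL does not mix with query output on stdout.
        self.stream = stream if stream is not None else sys.stderr

    def __call__(self, url: str) -> None:
        self.stream.write(f"Authenticate at: {url}\n")
        self.stream.flush()


def build_auth(handler):
    """Wire the handler into the client (requires the trino package)."""
    import trino.auth  # imported lazily so the handler itself has no trino dependency

    return trino.auth.OAuth2Authentication(redirect_auth_url_handler=handler)
```

Passing `build_auth(StreamRedirectHandler())` as `auth=` to `trino.dbapi.connect` would then replace the default browser-plus-console behavior. The user still has to complete the login at that URL.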

@eadapa29
Author

Correct. But what would the redirect_auth_url_handler be? Should I write my own handler? Is that what it implies?

@mdesmet
Contributor

mdesmet commented Jul 25, 2022

True. By default it launches a web browser, and it is up to the user to authenticate with the OAuth provider. If you want to automate that, you would have to perform the authentication automatically and ensure your OAuth provider redirects to the Trino server and provides it with an access/refresh token as per the OAuth protocol. That token will then be picked up by the trino-python-client.

@mdesmet
Contributor

mdesmet commented Jul 25, 2022

Are you saying that it doesn't launch your web browser?

Can you try the following code:

import webbrowser

webbrowser.open_new("https://trino.io/")

If that doesn't work, could you tell us your operating system and Python version? Are you on a notebook server, or is it simply a local Python script?
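
To gather those details in one go, something like the following standard-library sketch could be run in the same environment. `browser_diagnostics` is a hypothetical helper name, not part of the Trino client.

```python
import platform
import webbrowser


def browser_diagnostics() -> dict:
    """Collect the details asked for above: OS, Python version, browser availability."""
    try:
        # webbrowser.get() raises webbrowser.Error when no usable browser is found.
        browser = type(webbrowser.get()).__name__
    except webbrowser.Error:
        browser = None
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "browser": browser,
    }


if __name__ == "__main__":
    print(browser_diagnostics())
```

A `browser` value of `None` would suggest Python cannot find a browser to launch, which is common on remote notebook servers and headless machines.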

@eadapa29
Author

eadapa29 commented Jul 25, 2022 via email

@JeevansSP

Now if you want to automate that

I got this far, but I did not follow the further steps. Background: I have my Trino clusters running on Kubernetes, and I also have a website that I can visit for the UI. I have configured Azure Active Directory (OAuth2) with my Trino clusters.

@mdesmet
Contributor

mdesmet commented Aug 10, 2022

You need to supply your own handler as explained in the docs.

Your handler needs to ensure that the passed URL is opened by the user so they can authenticate. After that, the Trino Python client polls the Trino server, receives the token once user authentication is complete, and executes your query. Note that the caching implementation is probably not exactly what you want for a multi-user web application and is currently not pluggable. PRs are welcome!
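
For a web application, one possible shape for such a handler is to hand the URL over to the part of the app that talks to the user. This is a minimal sketch under assumptions: `QueueRedirectHandler` and `next_login_url` are hypothetical names, and it assumes the query call waits while the client polls, so the query would have to run on a worker thread separate from the UI.

```python
import queue


class QueueRedirectHandler:
    """Hypothetical handler: hands the login URL to another part of the application."""

    def __init__(self):
        self.pending = queue.Queue()

    def __call__(self, url: str) -> None:
        # Return immediately; the client then polls the server for the token.
        self.pending.put(url)


def next_login_url(handler: QueueRedirectHandler, timeout: float = 30.0) -> str:
    """Called from the UI side (e.g. a web view) to fetch the URL to show the user."""
    # Raises queue.Empty if no login URL arrives within the timeout.
    return handler.pending.get(timeout=timeout)
```

The web layer could then render the returned URL as a link or redirect. This sketch does not address the per-user token caching limitation mentioned above.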

@IceS2

IceS2 commented Aug 17, 2022

Hey folks, I'm not sure I should ask this here, but I've been having an issue with token caching.
I'm prompted with the web URL at every single query.

I'm connecting with this code:

        self._conn = trino.dbapi.connect(
            host=self.host,
            port=self.port,
            user=self.user,
            http_scheme="https",
            auth=trino.auth.OAuth2Authentication()
        )

And executing my queries like this:

with self._conn as conn:
    cur = conn.cursor()
    cur.execute(query)

I've tried it both with pip install 'trino[external-authentication-token-cache]' and without it.

Any idea of what I might be doing wrong?
Thanks!

@IceS2

IceS2 commented Aug 17, 2022

Alright, I found out that when I reuse the same cursor, it doesn't prompt for new authentication.
I thought it would make sense to have that behavior when reusing the connection too, no?
Thoughts?

Thanks folks (=

@mdesmet
Contributor

mdesmet commented Aug 17, 2022

Hi @IceS2, that's not how it's supposed to work.

There are unit tests that cover the in-memory token cache with cursors over multiple threads reusing the same connection. See:

def test_multithreaded_oauth2_authentication_flow(sample_post_response_data):
    redirect_handler = RedirectHandler()
    auth = trino.auth.OAuth2Authentication(redirect_auth_url_handler=redirect_handler)

    token_server = MultithreadedTokenServer(sample_post_response_data)

    class RunningThread(threading.Thread):
        lock = threading.Lock()

        def __init__(self):
            super().__init__()
            self.token = None

        def run(self) -> None:
            request = TrinoRequest(
                host="coordinator",
                port=constants.DEFAULT_TLS_PORT,
                client_session=ClientSession(
                    user="test",
                ),
                http_scheme=constants.HTTPS,
                auth=auth)
            for i in range(10):
                # apparently HTTPretty in the current version is not thread-safe
                # https://github.com/gabrielfalcao/HTTPretty/issues/209
                with RunningThread.lock:
                    response = request.post("select 1")
                self.token = response.request.headers["Authorization"].replace("Bearer ", "")

    threads = [RunningThread(), RunningThread(), RunningThread()]

    # run and join all threads
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # should issue only 1 token and each thread should reuse it
    assert len(token_server.tokens) == 1
    for thread in threads:
        assert thread.token in token_server.tokens

    # should start only 1 challenge
    assert len(token_server.challenges.keys()) == 1

    for challenge_id, challenge in token_server.challenges.items():
        assert f"{REDIRECT_RESOURCE}/{challenge_id}" in redirect_handler.redirect_server
        assert challenge.attempts == 0
        assert len(_get_token_requests(challenge_id)) == 1

    # 3 threads * (10 POST /statement each + 1 replied request by authentication)
    assert len(_post_statement_requests()) == 31

Can you create a new issue and provide reproduction steps?
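
For the reproduction, it may help to count how often the redirect handler fires rather than watching the console. This is a small sketch under assumptions: `CountingRedirectHandler` is a hypothetical name, and it assumes a plain callable is accepted as `redirect_auth_url_handler`, as with the handler object in the test above.

```python
import threading


class CountingRedirectHandler:
    """Hypothetical handler for a reproduction script: counts authentication prompts."""

    def __init__(self):
        self._lock = threading.Lock()
        self.urls = []

    def __call__(self, url: str) -> None:
        # Record every URL the client asks the user to open.
        with self._lock:
            self.urls.append(url)
            count = len(self.urls)
        print(f"auth prompt #{count}: {url}")

    @property
    def prompts(self) -> int:
        return len(self.urls)
```

A reproduction script could pass an instance to `trino.auth.OAuth2Authentication(redirect_auth_url_handler=...)`, run several queries on one connection, and report `prompts`. If the token cache behaves as the test above expects, that number should presumably stay at 1.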

@IceS2

IceS2 commented Aug 17, 2022

Thanks for the answer @mdesmet, and sorry for hijacking the thread!

I'll create some boilerplate code, add the context information and create a new issue soon!!
