Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commit a7ebc26 disables Open Telemetry's httpx instrumentation in some scenarios #1144

Open
1 task done
palvarezcordoba opened this issue Feb 9, 2024 · 8 comments
Open
1 task done
Assignees
Labels
bug Something isn't working

Comments

@palvarezcordoba
Copy link

palvarezcordoba commented Feb 9, 2024

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

This commit a7ebc26, which was introduced in PR #966, for release v1.3.9, disables httpx instrumentation in some cases.

Specifically, it is disabled when openai is imported before instrumenting httpx.
This is because opentelemetry.instrumentation.httpx .HTTPXClientInstrumentor._instrument creates subclasses of httpx.Client and httpx.AsyncClient. And then replaces the original clients with those subclasses, which adds telemetry.
In the above commit, openai creates SyncHttpxClientWrapper and AsyncHttpxClientWrapper subclasses of httpx's clients.

That means that when we instrument first, and then import openai, the client wrappers inherit from the instrumented clients.
When we import openai first, and then instrument httpx, the wrapper clients inherit from the original httpx clients.

Maybe this should be addressed at the opentelemetry-python-contrib side, but even so, v1.3.9 broke telemetry.

Could the change implemented in #966 be implemented in a backward-compatible way?
If not, at the very least, the changelog should be updated, warning about this.

To Reproduce

  1. Import openai
  2. Use HTTPXClientInstrumentor from opentelemetry.instrumentation.httpx to instrument httpx.

Code snippets

Running this, openai httpx requests are instrumented:

from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor, _InstrumentedClient
HTTPXClientInstrumentor().instrument()
from openai._base_client import SyncHttpxClientWrapper
assert issubclass(SyncHttpxClientWrapper, _InstrumentedClient)

But this version is not instrumented, and the assertion fails:

from openai._base_client import SyncHttpxClientWrapper
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor, _InstrumentedClient
HTTPXClientInstrumentor().instrument()
assert issubclass(SyncHttpxClientWrapper, _InstrumentedClient)

OS

any

Python version

any

Library version

openai from v1.3.9.0 to v1.12.0

@rattrayalex
Copy link
Collaborator

rattrayalex commented Feb 11, 2024

Thank you for reporting, @palvarezcordoba !

cc @RobertCraigie , can you take a look?

@RobertCraigie
Copy link
Collaborator

Thanks for the detailed report and for bisecting @palvarezcordoba! Unfortunately I don't think there's anything we can realistically do here and the fact that open telemetry worked independently of the import order was pure coincidence.

It looks like opentelemetry-python-contrib supports instrumenting individual client instances as well which should work with the openai SDK independently of the import order if instrumenting before importing isn't feasible for certain cases.

I think we'll want to add a section to the README.md showcasing instrumentation and mentioning these points cc @rattrayalex

@palvarezcordoba
Copy link
Author

@RobertCraigie Yeah, I didn't expect the issue to be solved here, although it would have been nice.
As I said, this isn't a bug, but v1.3.9 broke telemetry. Shouldn't this be stated in the changelog?

Writing an example in the README would be very good.
I was going to instrument a client like in the link you shared, but I have some uncertainty about that.

Your default clients are in src/openai/_base_client.py. Since it starts with an underscore, I assume it is not considered part of the public API and should not be relied upon.
So, I must not create instances of those. And I don't want to make my own clients, because if Openai makes more changes to their clients, I'm going to be using a different version.

For that reason, an example would be appreciated.

Thanks, @rattrayalex and @RobertCraigie for your fast response.

@RobertCraigie
Copy link
Collaborator

Ah of course, you shouldn't have to touch anything in _base_client, here's an example that should work (I haven't tested it):

import httpx
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
from openai import OpenAI

http_client = httpx.Client(transport=telemetry_transport)
HTTPXClientInstrumentor.instrument_client(http_client)

openai_client = OpenAI(http_client=http_client)

# use `openai_client` as normal

Shouldn't this be stated in the changelog?

Sure we can update the changelog, the only reason it wasn't included in the first place is that we were unaware it would break this case.

@pamelafox
Copy link
Contributor

@RobertCraigie Where does telemetry_transport come from in that code? I'm testing this out in our samples, as I'm not seeing good traces for the OpenAI calls currently.

@RobertCraigie
Copy link
Collaborator

Ah sorry it comes from this example: https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-httpx#using-transports-directly

import httpx
from opentelemetry.instrumentation.httpx import (
    SyncOpenTelemetryTransport,
)

transport = httpx.HTTPTransport()
telemetry_transport = SyncOpenTelemetryTransport(transport)

@tonybaloney
Copy link

Ah sorry it comes from this example: https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-httpx#using-transports-directly

import httpx
from opentelemetry.instrumentation.httpx import (
    SyncOpenTelemetryTransport,
)

transport = httpx.HTTPTransport()
telemetry_transport = SyncOpenTelemetryTransport(transport)

Sadly it's a bit more involved.

HTTPClient (or AsyncHTTPClient) needs base_url set which is an argument for OpenAIClient but a factory for AzureOpenAIClient, you need limits set to DEFAULT_LIMITS, you need follow_redirects needs to be enabled.

This seems just as fragile as HTTPXClientInstrumentor.instrument_client(openai_client._client)

@tonybaloney
Copy link

I'm using the opentelemetry-instrumentation-openai for now, this works well and puts the patch at the API level instead of relying on underlying implementation details

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants