Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.NET 8: Could not load type 'OpenTelemetry.Trace.TracerProviderBuilderExtensions' from assembly 'OpenTelemetry.Instrumentation.AspNetCore, Version=1.0.0.0, Culture=neutral, PublicKeyToken=7bd6737fe5b67e3c'.' #5569

Closed
maxgolov opened this issue Apr 25, 2024 · 8 comments
Labels
question Further information is requested

Comments

@maxgolov
Copy link

maxgolov commented Apr 25, 2024

Bug Report

Could not load type 'OpenTelemetry.Trace.TracerProviderBuilderExtensions' from assembly 'OpenTelemetry.Instrumentation.AspNetCore, Version=1.0.0.0, Culture=neutral, PublicKeyToken=7bd6737fe5b67e3c'.'

Packages - latest everything in my service code. All the latest libs get pulled at runtime:

    <PackageReference Include="OpenTelemetry" Version="1.8.1" />
    <PackageReference Include="OpenTelemetry.Api" Version="1.8.1" />
    <PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.8.1" />
    <PackageReference Include="OpenTelemetry.Exporter.Geneva" Version="1.7.0" />
    <PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.8.1"
    <PackageReference Include="OpenTelemetry.Api.ProviderBuilderExtensions" Version="1.8.1" />
    <PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.8.1" />

Plus, external nuget packages that on a build server may have had the v1.6.0-beta3 in their build folder. One of the dependencies explicitly references v1.6.0-beta3, others do not. However, even these other packages end up being compiled with wrong ABI.

Runtime version:

  • Multitargeting net6.0 and net8.0 in one build project. Problem happens with .NET 8.0

Symptom

Code builds successfully. However, HTTP tracing / AspNetCore functionality causes a crash at runtime:

image

System.TypeLoadException: 'Could not load type 'OpenTelemetry.Trace.TracerProviderBuilderExtensions' from assembly 'OpenTelemetry.Instrumentation.AspNetCore, Version=1.0.0.0, Culture=neutral, PublicKeyToken=7bd6737fe5b67e3c'.'

What is the expected behavior?

Code should not crash.

What is the actual behavior?

ASP.NET web service code crashes during DI resolution phase. This could be soon on start, or later if the code is lazy-loaded.

What did you see instead?

Reproduce

This gets harder. Because I believe it's an issue that happens due to the following root cause - you're at a mercy of nuget package resolution/loading order! Even a subtle reordering might fix it. We are building with a set of framework libraries, which themselves depend on OpenTelemetry.

Concrete issue with the API breaking change

image

Different versions of OpenTelemetry, specifically the classes under OpenTelemetry.Trace namespaces in OpenTelemetry.Instrumentation.AspNetCore.dll have been refactored between 1.6.0-beta.3 and 1.8.1.

You can see this if you unpack the nuget file and inspect the contents of OpenTelemetry.Instrumentation.AspNetCore.xml :

v1.6.0-beta3:

<member name="T:OpenTelemetry.Trace.TracerProviderBuilderExtensions">
    <summary>
    Extension methods to simplify registering of ASP.NET Core request instrumentation.
    </summary>
</member>

v1.8.0:

<member name="T:OpenTelemetry.Trace.AspNetCoreInstrumentationTracerProviderBuilderExtensions">
    <summary>
    Extension methods to simplify registering of ASP.NET Core request instrumentation.
    </summary>
</member>

TracerProviderBuilderExtensions was changed to AspNetCoreInstrumentationTracerProviderBuilderExtensions. However, the AssemblyVersion attribute of the corresponding library OpenTelemetry.Instrumentation.AspNetCore.dll has not changed. It stayed at v1.0.0.0, as if there was no breaking change. You can also find a few similar cases, where API got broken between the two released versions of the lib.

Now what happens - if a build system + runtime, specifically .NET 8 in this case - if it uses DI to load the class, it might as well resolve to the wrong version of the DLL when it needs to lazily load things. Because both DLLs were reporting the same exact ABI/API version. But in reality, are binary incompatible. This would result in a crash at runtime later on. You won't catch it at build time.

My thoughts are that this issue (not versioning the DLL properly) - is actually the root cause of the other bug reported previously here: #4300 - which essentially illustrates this same issue, where reordering may or may not fix things. If you are lucky (or unlucky) enough, depending on your dependencies and build order, one of the transient libs using incompatible OpenTelemetry - may pull the wrong version of the library into the build at compile time , so wrong code is being generated. Now when you deploy with latest - you'd be puzzled to debug why you are running and pulling the latest nuget in your service code, but your code crashes. Because some of these transient downstream deps 3-4 levels deep messed up your instrumentation. Presently the build system can't differentiate between which one is good vs bad: both versions are versioned with the same AssemblyVersion=v1.0.0.0 , despite ABI/API breaking changes between them.

image

Additional Context

I believe that if the XML of the public API changes, in this case OpenTelemetry.Instrumentation.AspNetCore.xml. Then it'd be best to ensure that you bump-up the AssemblyVersion attribute of it as well. If a C# library changes its API without updating the AssemblyVersion, then it can lead to runtime loading issues, especially in a complex project with many nuget package dependencies. This is because the AssemblyVersion is used by the .NET runtime to identify different versions of an assembly.

When two different NuGet package versions are pulled into a project, and both contain a library with the same DLL name and the same AssemblyVersion, but different APIs within (due to refactoring or other changes!), then the runtime will have difficulty determining which version to load exactly. This can result in a binary-level break, where client assemblies compiled against the older version of the API may not load correctly at runtime with the new version - see https://stackoverflow.com/questions/1456785/a-definitive-guide-to-api-breaking-changes-in-net

To prevent such issues, it’s crucial to increment the AssemblyVersion when making ABI/API breaking changes to the public API of a library. Feel free to reach out to me for additional context. Unfortunately, I cannot share the source code / list of nuget packages that I am referencing, but most of them reference the latest version. There could be one package in the build system (fairly sophisticated one and I am a consumer of the binaries from there), that references the said Beta version of OpenTelemetry, and that one spoils the build for the rest of it.

This issue might also be related to #5564 , as it appears like the symptoms are very similar.

TL;DR - enforce ABI compat at build time, prefer to break the build over experiencing the crashes later at runtime, when DI fails to resolve something, especially with lazy-loading.

@maxgolov maxgolov added the bug Something isn't working label Apr 25, 2024
@maxgolov
Copy link
Author

maxgolov commented May 7, 2024

Current outstanding issue:

  • OpenTelemetry is now Released. But all of its released assemblies are still always labeled v1.0.0.0.

  • This will cause a runtime crash for anything that relies on DI to load OpenTelemetry. For example, one of our libraries that wraps around OpenTelemetry has a relaxed check for a nuget package OpenTelemetry >=v1.6.0-beta[2-3. But ultimately this check is not good: due to ABI-breaking changes and no proper assembly versioning, we run into subtle runtime failures that are not otherwise detectable at build time.

  • For the current ASP.NET instrumentation v1.8.1 , which still has its assembly labeled v1.0.0.0 ... if ABI changes again in the next release v1.9 - how do we know that there were no ABI breaking changes, specifically around the Dependency Injection / lazy loading? This is fragile.

Our internal library is being updated to use the latest released bits. But the proper AssemblyVersion strategy needs to be developed to avoid similar crashes in future.

@cijothomas
Copy link
Member

OpenTelemetry is now Released. But all of its released assemblies are still always labeled v1.0.0.0

Sorry, I didn't fully analyze this issue, but just want to share that the assembly versioning is following the library guidelines from https://learn.microsoft.com/en-us/dotnet/standard/library-guidance/versioning#assembly-version, enacted via #1697

image

@reyang
Copy link
Member

reyang commented May 9, 2024

  • This will cause a runtime crash for anything that relies on DI to load OpenTelemetry. For example, one of our libraries that wraps around OpenTelemetry has a relaxed check for a nuget package OpenTelemetry >=v1.6.0-beta[2-3. But ultimately this check is not good: due to ABI-breaking changes and no proper assembly versioning, we run into subtle runtime failures that are not otherwise detectable at build time.

This sounds very problematic to me, the following version upgrades never have compatibility guarantee:

  1. from 1.6.0-beta2 to 1.6.0-beta3
  2. from 1.6.0-beta3 to 1.6.0 (stable)
  3. from 1.6.0-beta2 to 1.8.0

Essentially, I believe a flexible range such as ">=v1.6.0-beta" should never be used.

@reyang reyang added question Further information is requested and removed bug Something isn't working labels May 10, 2024
@maxgolov
Copy link
Author

maxgolov commented May 15, 2024

@reyang @cijothomas

Essentially, I believe a flexible range such as ">=v1.6.0-beta" should never be used.

Now that the OpenTelemetry v1.8.0 is released and its assembly version remains constant 1.0.0.0 , leading to any ABI breaking change undetectable at compile time, especially when the libraries get pulled via external nuget package that itself pulls OpenTelemetry in - what is the flexible range that can be safely used?

@maxgolov
Copy link
Author

OpenTelemetry is now Released. But all of its released assemblies are still always labeled v1.0.0.0

Sorry, I didn't fully analyze this issue, but just want to share that the assembly versioning is following the library guidelines from https://learn.microsoft.com/en-us/dotnet/standard/library-guidance/versioning#assembly-version, enacted via #1697

image

@cijothomas - last rule you quoted says:

❌ DO NOT have a fixed AssemblyVersion.

At this time, OpenTelemetry always uses a fixed AssemblyVersion=v1.0.0.0. This leads to runtime crashes when ABI breaking changes are introduced.

@cijothomas
Copy link
Member

OpenTelemetry is now Released. But all of its released assemblies are still always labeled v1.0.0.0

Sorry, I didn't fully analyze this issue, but just want to share that the assembly versioning is following the library guidelines from https://learn.microsoft.com/en-us/dotnet/standard/library-guidance/versioning#assembly-version, enacted via #1697
image

@cijothomas - last rule you quoted says:

❌ DO NOT have a fixed AssemblyVersion.

At this time, OpenTelemetry always uses a fixed AssemblyVersion=v1.0.0.0. This leads to runtime crashes when ABI breaking changes are introduced.

Will update assemblyversion when OTel nuget does major version bump (like 2.*).

@maxgolov
Copy link
Author

maxgolov commented May 15, 2024

@cijothomas - does it mean that if v1.9 gets released, it will still be fully ABI backwards compatible with v1.8.1? I'd like to avoid the next runtime crash when our services upgrade from v1.8.1 to v1.9.x. This is something that could have been prevented at build time by strict enforcement of AssemblyVersion. If you intend to introduce a breaking change in v1.9 prior to v2, then an update to Minor tuple of the version is required.

As of note - a flexible range was set by our partner library that acts as a "distro" of OpenTelemetry. As a customer - it was relatively hard for me to understand where the breakage is coming from.

What flexible range can we suggest our partner library to use, i.e. would [1.8.1,2.0) -
1.8.1 ≤ x < 2.0 --- be a safe version range for a nuget package that uses OpenTelemetry bits? Looking at nuget.org dynamics, it looks like a minor OpenTelemetry.NET release gets posted every 2 months. Are you intending to go straight to v2.0 after v1.8, or can you give us ABI stability promise for v1.9 that it will not be causing us crashes at runtime when DI fails to resolve a method?

Please let me know. Thank you.

@reyang
Copy link
Member

reyang commented May 21, 2024

does it mean that if v1.9 gets released, it will still be fully ABI backwards compatible with v1.8.1?

Yes.

What flexible range can we suggest our partner library to use, i.e. would [1.8.1,2.0) - 1.8.1 ≤ x < 2.0 --- be a safe version range for a nuget package that uses OpenTelemetry bits?

Correct.

Are you intending to go straight to v2.0 after v1.8...

No, it'll be 1.9.0, 1.10.0, 1.11.0, etc.

@reyang reyang closed this as completed May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants