Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve deps.json documentation #19958

Merged
merged 3 commits into from Aug 18, 2020
Merged

Improve deps.json documentation #19958

merged 3 commits into from Aug 18, 2020

Conversation

jzabroski
Copy link
Contributor

@jzabroski jzabroski commented Aug 11, 2020

Summary

I frequently have to refer back to a comment Nate McMaster made on his blog to me a long time ago to find any reference to System.AppContext.GetData("APP_CONTEXT_DEPS_FILES") when debugging assembly loading issues. It would be very convenient to mention this here, as I frequently ask my customers to dump this value to troubleshoot their problems. In particular, Azure App Service AI SQL Diagnostics feature causes issues loading System.Threading.AccessControl, and customers somehow think this is my fault, and it would be nice to be able to show them exactly step by step what is going on and that the root cause lies with Azure App Service injecting a random assembly version requirement into their default AssemblyLoadContext.

@jzabroski jzabroski requested a review from a team as a code owner August 11, 2020 13:50
@dotnet-bot dotnet-bot added this to the August 2020 milestone Aug 11, 2020
@BillWagner
Copy link
Member

Thanks for making this change @jzabroski

I'm checking with the product team to make sure this is the best recommendation. Once I hear back, we'll merge or suggest alternatives.

@jzabroski
Copy link
Contributor Author

@BillWagner Thanks, I actually want to write up a Medium blog post or equivalent about how to troubleshoot .NET modularity issues. I find that I have rather arcane knowledge collected through supporting a lot of customers over the years that almost nobody knows about, and I will likely forget in a few years if I don't write it down. Also, by writing it down, I can get more feedback and improve and see holes in my own logic. Several of my friends have sent me messages saying, something like, "Hey, I found your post on this GitHub issue and your workaround solved my problem, even though you wrote it's not the ideal way to solve the problem."

@vitek-karas
Copy link
Member

Given that you say you want "detailed, step-by-step trace", did you try using the host tracing?
https://docs.microsoft.com/en-us/dotnet/core/dependency-loading/default-probing#how-do-i-debug-the-probing-properties-construction

This provides very detailed log of what the host did and how it resolved assemblies on app startup. This includes any additional deps processing and more.

The APP_CONTEXT_DEPS_FILES is something I personally would like to get rid of eventually. Runtime itself doesn't use this, it's basically an implementation detail of the host and how it resolves assemblies on startup. And for diagnosing assembly resolution problems it's just one of many pieces of information which may be needed to figure out the problem. The host tracing should ideally provide all that information (and if there's missing info there, I'd be happy to add it to the trace).

@jzabroski
Copy link
Contributor Author

jzabroski commented Aug 11, 2020

@vitek-karas Well, it's an awfully big implementation detail for you to just say the runtime doesn't use it, and I am dead set against that unless you give me features in exchange - I nearly quit .NET a few years ago due to how horrific it is to debug assembly loading failures. As for whether or not the runtime uses this, I'm not sure that's the point. The point is to provide providence information for how a value got to where it is. As for why I don't use your host tracing API today, it's because it's too noisy and logs a bunch of stuff and I have no use for any of it. I can see why you might find it more helpful, but it's not. I can give you so, so many examples where I could not figure something out by looking at the host tracing because its too ideopathic.

Example 1. When .NET Standard 2.0 was introduced, some packages were packages against both net472 and netstandard2.0, and .NET Framework 7.2 was advertised as quasi-supporting .NET Standard 2.0. So, what ended up happening is the System.ComponentModel.DataAnnotations nuget package shipped as version 4.7.0, but internally the assembly versions for the net472 and netstandard2.0 packages had different build number. So, the MSBuild PackageReference subsystem will allow you to silently reference a netstandard2.0 assembly in a net472 console application, but the net472 console application can't compose with the netstandard2.0 assembly if it also has a transitive dependency on System.ComponentModel.DataAnnotations package through a net472 dependency. And all of this is a silent failure. Yes, you can say this is all outside the runtime, but who cares if it costs me weeks of my life? I literally read every article I could about .NET runtime behavior to try to understand why this error was happening. I read articles by a now retired Microsoft engineer on how type forwarding works, to see if the issue was with type forwarding, since of course this assembly was type forwarded on top of being packaged in a way that causes these silent design time failures to occur.

Example 2. Azure App Service injects ambient dependencies into the default assembly load context. Arguably, this is something Microsoft engineers should discuss internally and rework due to it being a bad practice. But it raises a lot of questions about how to enable injection of profiling tools into an AssemblyLoadContext.

Example 3. I am told by several Microsoft engineers that the whole reason deps.json exists (despite you calling it an implementation detail) is to smooth over issues with binding redirects by capturing the transitive closure of all dependencies. But, in those days, we had file paths as references, not magic PackageReference and FrameworkReference. With magic, comes extra requirements where you can't simply do host tracing. Because then the runtime folks get to say, "Well, that's the MSBuild SDK guys problem, not my problem." And it's a Microsoft customer problem at the end of the day. You can't ship systems that take forever to debug dependencies for.

@jzabroski
Copy link
Contributor Author

The other, practical matter, is that since PowerShell is a .NET Core process, the current host tracing is awful when triggered from POwerShell. You have to use a 40 year old cmd.exe to start dotnet.exe so that you don't get host tracing from PowerShell itself AND the process you care about. So yeah, host tracing so far is no good for me and I've been waiting for it to get much better. I created issues for this and they've been open for years, so I feel a bit threatened as a customer when I'm told this stuff I find pretty useful is just an implementation detail of the host I'm using. I read this as, "You shouldn't use this because we reserve the right to kill it off." What I want to read is, "Let's work together to figure out the user stories that make .NET an awesome platform for composing third party dependencies in a modular way."

@jzabroski
Copy link
Contributor Author

jzabroski commented Aug 11, 2020

The host tracing should ideally provide all that information (and if there's missing info there, I'd be happy to add it to the trace).

I think the basic challenge to your offer is what I recall Andrew Appel calling the data representation independence problem in bootstrapping a compiler. You can't give me the providence information I want, because the providence information is in csproj as PackageReference and FrameworkReference. In the legacy .NET Framework days, when we had concrete file paths or file path hints and a GAC, all of the "assembly fusion logging" made sense because the providence information was all extremely tangible, and the only magic was whether you enabled AutoGenerateBindingRedirects, which did not change which files on disk you loaded but rather how the runtime interpreted the version constraints to decide if a dependency satisfied the version requirements of the application. - This very "data representation indepence problem" is why Nuget packages themselves have to export the dependencies they take, because, for example, you can't solve transitive dependencies without it.

Obviously, the ideal but not realistic approach is to "compile globally at HEAD" since that guarantees we are all using consistent latest versions of everything and have no potential for dependency version conflicts. (This is basically what abstracting assembly dependencies away from concrete file paths and towards minimum PackageReference Versions in conjunction "roll forward LatestMinor" tries to achieve, in a way, but in an ambiguous way.) However, in the real world, applications are composed of mutually suspicious third parties, and Composite Application Frameworks must assume dependency conflicts are natural part of working among different teams with different stakeholders, goals and requirements (e.g., one example I brought up to Rob Relyea years ago was that XAML has no support for scope disambiguation, and so it cannot know what to do when two assemblies with different names, like DevExpress.Grid.v11.2.dll and DevExpress.Grid.v12.1.dll, share the same namespaces, and so you cannot have two versions of the same data grid in one WPF application). In a very real sense, different groups working on a composite application can be unintentional adversaries and cause problems in building products modularly. In the case of "roll forward LatestMinor," some who practice semver would even argue the wrong default was chosen, and it should be "roll forward LatestFeature." I would probably just say at this point, "Dang, I start to see why JavaScript developers don't care that it takes forever to download 6,000 nodejs modules and don't care about what .NET is doing, because they've fully isolated all their dependencies, so they don't deal with any of this magic."

The other ideal approach is to reify dependencies as objects and use objects themselves as modules. This is the approach championed by Gilad Bracha, and the one adopted by EcmaScript 6 working group for its modules standard. I would have preferred this to all the magic contained in FrameworkReference and "runtime packs" and "design time reference assemblies". But I also get that Microsoft tries to ship extremely lean, resource efficient designs, and that to this date, the system we have for .NET core is fairly lean and resource efficient. It also is not very respectful of my time debugging applications that don't require lean, resource efficient dependency resolution.

@vitek-karas
Copy link
Member

I'm sorry if your experience is that we don't care - that's definitely not the case. I was just reacting to this specific issue - where you state that knowing the additional deps is valuable in some diagnostic scenarios. I'm not questioning that fact, what I was wondering is why it's "Easier" to have it accessible from managed code where you need to modify the app to get that information, as compared to the tracing where you just modify the environment how to run the app to get that information.

Side note on PowerShell - I would be interested to understand why host tracing is a problem here - I use powershell as my main shell and I use host tracing almost daily without any issues - $env:COREHOST_TRACE=1 dotnet app $env:COREHOST_TRACE=0 works well for me. Regardless if there's another way you would prefer to be able to trigger the tracing I'm definitely interested learning about it.

I do understand that diagnosing the entire system from the project file down to the actual assembly loading at runtime is difficult. I'd be happy to discuss what can we do to improve the experience.

@jzabroski
Copy link
Contributor Author

jzabroski commented Aug 11, 2020

I don't think you don't care. I genuinely believe most people want to help other people. I just got a bit jumpy when I heard that this was an implementation detail and everything is solved by core tracing. All I want is user stories written down. A few of them are captured, for examples, including yours:

  1. https://github.com/dotnet/core-setup/blob/master/Documentation/design-docs/framework-version-resolution.md
  2. https://github.com/dotnet/designs/blob/master/accepted/2019/runtime-binding.md
  3. https://github.com/dotnet/designs/blob/master/accepted/2020/targeting-packs/targeting-packs.md
  4. https://github.com/dotnet/designs/blob/master/accepted/2019/targeting-packs-and-runtime-packs.md

I had also outlined the following feature request:

dotnet/runtime#12506

I can also see, through looking through the references, I am not alone in this regard: dotnet/runtime#7693 (comment) But I see I put the problem rather pithily there, "Arbitrary assembly loading is a bug". And also mentioned troubleshooting use cases I had forgotten about: x64 -> AnyCPU -> x32

Side note on PowerShell - I would be interested to understand why host tracing is a problem here - I use powershell as my main shell and I use host tracing almost daily without any issues - $env:COREHOST_TRACE=1 dotnet app $env:COREHOST_TRACE=0 works well for me. Regardless if there's another way you would prefer to be able to trigger the tracing I'm definitely interested learning about it.

With Assembly Fusion Logging, I could leave the registry setting enabled and not have a global dump everywhere. Even with the addition of a trace file output, I get a global output. As your coworker @heaths commented on in dotnet/runtime#12506 , why doesn't this go through event pipe so that we can tailor what we see (e.g., not need to see PowerShell)? In the current design, I have to always remember to disable a global variable after exiting a program. If you don't find that annoying, I've got no more feedback for you. But you also can't run a new PowerShell window while you're tracing any .net core application. You would have to also do the inverse: use cmd.exe to update COREHOST_TRACE=0 then run pwsh.exe

@jzabroski
Copy link
Contributor Author

By the way, an example of what I mean by user stories is provided in the repo root of my open source project, FluentMigrator. See: https://github.com/fluentmigrator/fluentmigrator/blob/master/EXAMPLE_STORY.md - I believe this user-action-benefit triple provides a high watermark for figuring out what it means to deliver a complete feature. I frankly often fall short of it myself.

@jzabroski
Copy link
Contributor Author

Also, one last point:

what I was wondering is why it's "Easier" to have it accessible from managed code where you need to modify the app to get that information, as compared to the tracing where you just modify the environment how to run the app to get that information.

In this case, easier is about supporting customers who complain about exceptions that, to them, appear to originate in my code stack or somehow caused by how I bundle dependencies. As you are probably aware, a hard part of software development is reproducing customer problems. By creating managed code that can automatically dump the information I need to ask them intelligent questions, I save lots and lots of time, which is all I really care about. I do agree that it would be nicer if I could tell them to just flip a switch and send me a dump file, but also some customers are reluctant to share that much information about their application, as it may contain information i should not see. Especially for an open source project.

@vitek-karas
Copy link
Member

In .NET 5 we implemented the assembly loader tracing (also called binder tracing, or loader tracing, ...) - unfortunately we don't have a doc about it (yet - working on it), but in short there's a set of new tracing events from the runtime which describe when/how/why the runtime tried to load assemblies and what was the result of the load operation. This uses event pipe (just like all other runtime tracing). It's not exactly like fusion logs from .NET Framework but I hope it will serve similar purpose (and more, it's more detailed in some areas).

Host tracing is different because it runs before the runtime is started - so it's currently technically very challenging to use event pipe for this tracing (all the infrastructure for event pipe is in the runtime, but that has not been loaded yet). So certain parts of host tracing will probably have to remain as they are. I do hope that one day we will be able to move more of the host assembly resolution logic into the runtime (so basically perform the detailed assembly resolution after the runtime has started), in which case it would be much easier to support event pipe for tracing.

I absolutely agree that implementation details and technical decisions leak out in this case and it makes the experience... not optimal.

@jzabroski
Copy link
Contributor Author

In .NET 5 we implemented the assembly loader tracing (also called binder tracing, or loader tracing, ...) - unfortunately we don't have a doc about it (yet - working on it)

That's cool. I would encourage standardizing on one vocabulary phrase so that it's easier to Google. I would probably call it assemblyloadcontext tracing, but perhaps assembly loader tracing is more general and logs more than what happens inside AssemblyLoadContext class.

As for host tracing not having access to that stuff, that did not occur to me and it was certainly an Aha moment, so thank you for pointing that out.

@BillWagner
Copy link
Member

thanks for the discussion.

Proposal:

The correct fix is to add docs for the .NET 5 features @vitek-karas describes.

Question for @vitek-karas : Is there an issue for that? Is someone assigned to that task?

If not, let's create a new issue for that.

Once resolved, I recommend closing this PR pointing to that issue.

@jzabroski
Copy link
Contributor Author

Unless I'm mistaken (since the document to learn how this works is not available): The .NET 5 features do not provide traceability to deps.json, only improve host tracing.

What this PR addresses is basically the equivalent of the following log statements in the old assembly fusion log viewer:

LOG: No application configuration file found.
LOG: Using host configuration file: 
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework\v4.0.xxxxx\config\machine.config.

and

LOG: This bind starts in default load context.
LOG: Using application configuration file: d:\src\private\FusionLog\FusionLogRunner\bin\Debug\FusionLogRunner.exe.Config
LOG: Using host configuration file: 
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework\v4.0.30319\config\machine.config.
LOG: Redirect found in application configuration file: 3.2.1.0 redirected to 4.0.0.0.

The specific line that says "LOG: Using application configuration file: d:\src\private\FusionLog\FusionLogRunner\bin\Debug\FusionLogRunner.exe.Config" is one feature that is missing from .NET Core, and the only way to polyfill this behavior is with the behavior I describe in this PR. Some may say the polyfill is imperfect and argue technical things about it being an implementation detail, but de facto it works.

If the .NET 5 feature handles this, then it is still useful documentation for non .NET 5 customers.

@vitek-karas
Copy link
Member

The loader tracing we're adding in .NET 5 is not going to include information about where the resolution came from. Currently that information is "hidden" inside the host and runtime doesn't know about it. I filed dotnet/runtime#40947 to track this potential future improvement.

I actually tried to create a new docs issue for the .NET loader tracing now, but it doesn't let me - the create issue button is greyed out - I'm not sure what rules are setup there... GitHub doesn't tell me why it won't let me create a new issue. If you can create one - please do so.

Co-authored-by: Vitek Karas <vitek.karas@microsoft.com>
@jzabroski
Copy link
Contributor Author

Thank you, @vitek-karas and @BillWagner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants