Reading entry assembly in AppHost / SingleFileHost #519

Open
caesay opened this issue Jan 25, 2024 · 4 comments
Labels
dotnet (Issues related to AsmResolver.DotNet), enhancement

Comments

@caesay

caesay commented Jan 25, 2024

Problem Description

Epic library!

When given a random EXE file, it's currently cumbersome to locate the dotnet entry assembly.

  • For .NET Framework (full framework) EXEs, you can use AssemblyDefinition.FromFile.
  • For a .NET SingleFileHost (PublishSingleFile), you need to use BundleManifest.FromFile and then guess the entry DLL from the file name of the BundleFileType.RuntimeConfigJson entry.
  • For a .NET AppHost, reading seems to be entirely unsupported. We would need to read the placeholder to get the relative path to the entry DLL. I can see there is support for writing the placeholder, but not for reading one.
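The single-file guess described in the second bullet can be sketched as follows. This is a minimal illustration that works on a plain list of relative paths, not on AsmResolver's bundle types; the function name `guess_entry_dll` and the assumption that the entry DLL shares its base name with the `*.runtimeconfig.json` entry are hypothetical and only mirror the heuristic described above.

```python
from typing import Iterable, Optional

RUNTIME_CONFIG_SUFFIX = ".runtimeconfig.json"


def guess_entry_dll(bundle_paths: Iterable[str]) -> Optional[str]:
    """Guess the entry DLL from the name of the runtimeconfig.json entry.

    Assumption: the entry assembly is named <stem>.dll where
    <stem>.runtimeconfig.json is present in the bundle.
    """
    paths = list(bundle_paths)
    # Case-insensitive lookup that still returns the original spelling.
    by_lower = {p.lower(): p for p in paths}
    for path in paths:
        if path.lower().endswith(RUNTIME_CONFIG_SUFFIX):
            stem = path[: -len(RUNTIME_CONFIG_SUFFIX)]
            match = by_lower.get((stem + ".dll").lower())
            if match is not None:
                return match
    return None
```

It returns `None` when no matching pair exists, which is one reason this remains a guess rather than a reliable answer.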

Proposal

Ideally, there would be an API to read the placeholder / relative entry dll path, so given an EXE we could locate the entry dotnet assembly.

Furthermore, if there was an API which could "detect" the entry assembly given an arbitrary EXE, that would also be neat - but some documentation describing how to do this yourself would also be good as an alternative.

Alternatives

No response

Additional Context

No response

@Washi1337
Owner

I am not sure this is possible without applying some heuristics.

The biggest issue is that the offset at which the application binary path is stored in bundle files is not well-defined. Microsoft's reference implementation of the bundler also finds the right place simply by searching for a known placeholder in a template file. Unfortunately, this placeholder is (as its name implies) replaced with the path of the final entry assembly at compile-time, which destroys the one marker we could use to locate it automatically.
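To illustrate why the marker disappears, here is a minimal sketch of the search-and-replace step on raw bytes. It is not the bundler's actual code. To my knowledge the placeholder used by the .NET host templates is the hex SHA-256 of "foobar"; treat that, the helper name `patch_template`, and the simplification that the path must fit within the placeholder's own length as assumptions.

```python
import hashlib

# Assumption: the template placeholder is the hex SHA-256 digest of "foobar".
PLACEHOLDER = hashlib.sha256(b"foobar").hexdigest().encode("ascii")


def patch_template(template: bytes, app_path: str) -> bytes:
    """Overwrite the placeholder with app_path, zero-padded to the same length."""
    encoded = app_path.encode("utf-8")
    if len(encoded) > len(PLACEHOLDER):
        raise ValueError("path does not fit in the placeholder (sketch limit)")
    offset = template.find(PLACEHOLDER)
    if offset < 0:
        raise ValueError("placeholder not found; file is probably already patched")
    padded = encoded.ljust(len(PLACEHOLDER), b"\0")
    return template[:offset] + padded + template[offset + len(PLACEHOLDER):]
```

After patching, `template.find(PLACEHOLDER)` no longer succeeds on the output, so a reader has nothing well-known left to search for.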

Generally speaking, I am hesitant to add heuristics to AsmResolver (especially when they involve disassembling and interpreting code or similar) unless they are very frequently used and reliable. However, I am open to suggestions.

Washi1337 added the dotnet label Jan 25, 2024
@caesay
Author

caesay commented Jan 25, 2024

I'm aware of the difficulty, but I suspected you knew something I didn't - because the WriteUsingTemplate function can replace the path in an already-written apphost where the placeholder text is no longer present, using an offset from the signature.

If not that, I suppose we could build a dictionary that maps apphost file hashes to offsets. It would be trivial to scrape all of the app hosts from NuGet (e.g. https://www.nuget.org/packages/Microsoft.NETCore.App.Host.win-x86) and record the placeholder offset for each unique file.

@Washi1337
Owner

The WriteUsingTemplate method, combined with BundlerParameters.FromExistingBundle, uses the original main file path itself (padded with zeroes up to the original length of the placeholder) as the search heuristic instead of the standard placeholder. It also strips all EOF/overlay data and replaces it with the new bundle manifest (see BundleManifest.cs:400-409). This is fast and reliable for standard apphost/singlefilehost files, but definitely not perfect (hence the warning in the docs).
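The padded-path heuristic can be sketched like this. It is a rough illustration on raw bytes, not the code in BundleManifest.cs: the names `padded_signature` and `repatch` are hypothetical, the placeholder length of 64 bytes is an assumption, and the overlay stripping step is left out.

```python
PLACEHOLDER_LENGTH = 64  # assumption: length of the original placeholder in bytes


def padded_signature(path: str, length: int = PLACEHOLDER_LENGTH) -> bytes:
    """Encode a path and pad it with zero bytes up to the placeholder length."""
    encoded = path.encode("utf-8")
    if len(encoded) > length:
        raise ValueError("path is longer than the placeholder (sketch limit)")
    return encoded.ljust(length, b"\0")


def repatch(existing: bytes, old_path: str, new_path: str) -> bytes:
    """Replace the zero-padded old path with the zero-padded new path."""
    signature = padded_signature(old_path)
    offset = existing.find(signature)
    if offset < 0:
        raise ValueError("padded old path not found in file")
    replacement = padded_signature(new_path)
    return existing[:offset] + replacement + existing[offset + len(signature):]
```

Note that `repatch` needs `old_path` as input. That appears to be why the approach works for rewriting an existing bundle but does not by itself answer the question of discovering the path.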

Maintaining a dictionary of well-known template file-offset pairs is also not trivial: what would the keys of that dictionary be? I don't think raw hashes of the files will work, because existing files will have their placeholders replaced, and Windows binaries will also have their own Win32 resources.
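The objection to raw file hashes as keys can be shown with a small sketch. Everything here is illustrative: `MARKER` is a hypothetical stand-in for the real placeholder, and the "templates" are synthetic byte strings rather than real apphost files.

```python
import hashlib
from typing import Dict, Iterable, Optional

MARKER = b"<PLACEHOLDER>"  # hypothetical stand-in for the real placeholder


def build_offset_table(templates: Iterable[bytes]) -> Dict[str, int]:
    """Map the SHA-256 of each pristine template to its placeholder offset."""
    table: Dict[str, int] = {}
    for template in templates:
        offset = template.find(MARKER)
        if offset >= 0:
            table[hashlib.sha256(template).hexdigest()] = offset
    return table


def lookup_offset(table: Dict[str, int], data: bytes) -> Optional[int]:
    """Look a file up by its raw hash; returns None when the hash is unknown."""
    return table.get(hashlib.sha256(data).hexdigest())
```

A pristine template is found, but as soon as the marker bytes are overwritten with a real path (or resources are added), the file hash changes and `lookup_offset` returns `None`. Any workable key would have to ignore the regions that change.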

@caesay
Author

caesay commented Jan 25, 2024

All good points. I don't have any other suggestions, other than: All the Win32 resources are written to the entry DLL, and then copied to the final exe. As long as that's how the apphost is built, the "OriginalFilename" resource will be the name of the entry DLL. It's certainly not foolproof but it's better than guessing based on the file name.
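A crude way to try the "OriginalFilename" idea without a full resource parser is to scan the raw bytes for the UTF-16 key and read the string that follows. This is only a sketch under stated assumptions: it relies on the version resource storing the key and value as null-terminated UTF-16LE strings separated by zero padding, it takes the first occurrence it finds, and it will misbehave if the value is empty. The function name `scan_original_filename` is hypothetical.

```python
from typing import Optional

KEY = "OriginalFilename".encode("utf-16-le")


def scan_original_filename(data: bytes) -> Optional[str]:
    """Return the UTF-16 string stored after the first 'OriginalFilename' key."""
    start = data.find(KEY)
    if start < 0:
        return None
    pos = start + len(KEY)
    # Skip the key's null terminator and any alignment padding (zero code units).
    while pos + 1 < len(data) and data[pos:pos + 2] == b"\0\0":
        pos += 2
    # Read UTF-16 code units until the value's null terminator.
    end = pos
    while end + 1 < len(data) and data[end:end + 2] != b"\0\0":
        end += 2
    value = data[pos:end].decode("utf-16-le", errors="replace")
    return value or None
```

A proper implementation would walk the PE resource directory and parse the version info structure instead of scanning, and the result is still only as trustworthy as the build process that copied the resources.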
