Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asset Processor does not reprocess engine dependencies with wildcards if the assets were updated. #17876

Open
Twolewis opened this issue May 3, 2024 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/content Categorizes an issue or PR as relevant to SIG Content.

Comments

@Twolewis
Copy link
Contributor

Twolewis commented May 3, 2024

Describe the bug
AssetProcessor won't reprocess dependency xml files with wildcards, if new files were introduced that the wildcard would catch.

Steps to reproduce

  1. Create an XML file as follows:
    <EngineDependencies versionnumber="1.0.0"> <Dependency path="some/path/*" optional="false" /> </EngineDependencies>
    and place in your project folder.
  2. Create some assets in some/path - textures, models, whatever.
  3. Run the Asset Processor.
    --> Note: the dependency list from the dependency xml is updated to reflect all assets in some/path/*
  4. Add new assets in some/path (models, textures, etc). Do not touch or alter the xml file.
  5. Run the Asset Processor.
    --> Note: The dependency list of the xml file is not updated to contain the additional assets.
  6. Using the AssetBundler, create a new seedlist and add the xml from (1) to it, and create a new bundle.
    --> Error: The bundle will have the assets from (3) but will not have the new ones from (4)

Expected behavior
Given a wildcard dependency, I expect the dependency list to be updated every time I run the Asset Processor, reflecting the changes to the wildcard capture, so that bundles can be created correctly.

Actual behavior
The dependencies are only updated if the xml is modified in some way (changing its signature), triggering the Asset Processor to process it and update its dependency list

@Twolewis Twolewis added kind/bug Categorizes issue or PR as related to a bug. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 3, 2024
@nick-l-o3de
Copy link
Contributor

nick-l-o3de commented May 6, 2024

Just some more information here since I have been doing some code diving.

There is a builder that is supposed to do this, the XmlBuilderWorker, but it is incapable of doing this correctly, which I'll explain below

The XmlBuilderWorker's job is to read XML files and convert them into XML files in the cache while also emitting any dependencies they have. It does this by reading associated .xmlschema files which can be opened and modified in the Asset Editor window int he Editor (Create new / edit existing/ etc).
An XML Schema file essentially consists of 2 pieces

  1. a file matcher part which says which kind of files they match - for example, *.xml or more likely *_dependencies.xml as a better example.
  2. What to look for in that file if it matches. For example, it might say 'Any XML Element you see with the name <Depedency.../>, take its attribute called path and that is a Product Dependency. Another example is for example in font xml files, it finds the "fontshader" element and uses the "name" attribute. Is file pattern matcher is "Fonts/*.xml". See the existing xmlschema files in the Asset Editor for examples of this - there are a few that show how it works. Match the file, then tell it what to do when certain elements are encountered.

However, for a file to reprocess a source file into its corresponding output data (including dependencies) one of three things has to happen:
A) The source file data needs to change
B) Its builder fingerprint or job fingerprint needs to change
C) One of its emitted source dependencies must change (recursion is automatic)

Note that there are 2 main kinds of dependencies. Source dependencies, product dependendencies.

Source Dependencies are "This source file, for example blah.tif depends on Globals/imagesettings.xml. So if that second file changes, you need to rebuild blah.tif." another way of thinking about a source dependency is "The final output, which consists of both the actual cached data and also metadata such as dependencies, will mutate if this other file changes`. Source dependencies can be wildcards

Product dependencies are "The output of this build job, ie, a crunched asset, will depend, at runtime, on another crunched asset being present on disk or in the pak file.". It has no bearing on build. A material might establish a Product Dependency on a texture file it needs, but it doesn't mean that the material file itself needs to be rebuilt when the artist changes a few pixels in the texture file.

The XmlBuilder does not emit any fingerprint information at all, so B is out.
When adding a new file into a folder, matching a wildcard, the original source file did not change. So A) does not happen.

You'd expect the dependencies listed in the XmlBuilderWorker's schema parser to function here and satisfy C), since you can actually tell the XML Schema to look for certain elements and emit them as Source Dependencies, not just Product Dependencies... but looking at the code, the XmlBuilderWorker does not inspect that file for source dependencies in CreateJobs, it only looks for product dependencies in ProcessJob. The only source dependency it emits is a single source dependency on any schema file htat matches it.

So for example, if you have a schema like Schema/enginedependencies.xmlschema which says "I apply to any xml file that matches *_Dependencies.xml, the current XmlBuilder will process a file like engine_dependencies.xml as follows:

CreateJobs - the function where source dependencies should be emitted

  • Currently determines which xmlschema files match this file being processed. Any which match are added to the Source Dependencies. But only the schema files themselves.
  • Currently it does not actually attempt to apply the schema or read the xml being processed here.
  • It emits 1 single job (per platform) to convert the XMl

ProcessJobs - its too late to emit source deps here. Its where you emit product deps, which do not trigger rebuilds

  • It actually applies the xmlschema to find out the dependencies.
  • Emits product dependencies only
  • Copies the XML file.

So the problem here is at no point does CreateJobs actually look inside the actual XML to say what files it depends on.

Note that in Asset Processor, it is in fact legal to emit a source dependency as a wildcard during CreateJobs. You can in fact say that this file depends on *.tif as a source dependency, and that will cause THIS file to be reprocessed anytime anything changes on disk that matches *.tif. But the Xml Builder isn't applying the XML schema itself during CreateJobs, at all. Its just figuring out WHICH schemas might apply to the file. This might be intentional - CreateJobs, where Source Dependencies must currently be emitted, happens single threaded and is time critical. Reading every XML file in a project every time you encounter one may increase your full analysis time extensively.

So a correct short term fix here would be something like

  1. Update CreateJobs to actually apply the XML schema to look for source dependencies in the file, similar to how it applies it during ProcessJobs to find product dependencies. This may involve a couple extra functions in there. Do this only for files that actually match a schema which ideally is a small number.
  2. Make sure that the emitted source dependencies emit correctly, that is, assume relative paths to the file being emitted. Avoid ending up with very broad wildcards. If Textures/my_dependencies.xml says it depends on ".jpg", this should be a wildcard like "Textures/.jpg", not just "*.jpg" which would match the entire project and potentially heavily slow down the AP.

A longer term fix might be to improve the asset system itself, so that you don't need to emit Source Dependencies during CreateJobs and can defer that to ProcessJob. This would vastly improve createjob time in general since a lot of file types do have to read their files during CreateJobs (analysis) in order to properly emit dependencies.

Its needless, as most build systems have realized that in general, you either have never processed a file before, in which case you MUST process it, or you have already processed a file before, in which case you have the data from when you processed it. This infers that its not necessary to declare source dependencies up front at all, since they have no bearing on whether or not to process a file if its never been processed before (it will always be processed) and by the time you need to consider whether to process it again, you have the data from the last time you processed it.

@michalpelka michalpelka added sig/content Categorizes an issue or PR as relevant to SIG Content. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/content Categorizes an issue or PR as relevant to SIG Content.
Projects
None yet
Development

No branches or pull requests

3 participants