Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First pass at implementing verify_trust #928

Open
wants to merge 5 commits into
base: dev_new_recipe_logic
Choose a base branch
from

Conversation

n8felton
Copy link
Member

No description provided.

@nmcspadden
Copy link
Contributor

Thanks for this, this is a great start! Comments left inline

Code/autopkglib/recipes/__init__.py Outdated Show resolved Hide resolved
# We need to determine if a recipe in the chain is an override and thus contains trust
# if it contains trust, we then go validate that the trust is correct
# if there are no overrides, this always returns True (but maybe we print out that we did nothing)
# for recipe in self.recipes:
for recipe in self.recipes:
if recipe.is_override and recipe.trust_info:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than wrap the entire function in this if-block, I'd flip it - if the recipe isn't an override, it won't have trust info.

So this raises a pretty interesting question we haven't really answered - which is that there are really three possible states:

  1. Trust info is present and valid
  2. Trust info is present but invalid
  3. Trust info is not present

Should this function return true if trust info is not present? Does that make sense, conceptually?

Or maybe instead of returning a bool, we return a different object that represents the trust state, and also has the opportunity of representing what did or didn't work in trust validation. It's more work, but I think it's much clearer that way because we don't to split the logic around too many different places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking something as simple as this?

@dataclass
class TrustValidation:
    present: bool = False
    valid: bool = False
    message: str | None = None

Maybe RecipeChain gets a new variable?

self.validation = Optional[TrustValidation] = None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this fresh, I think I'm changing my mind a little bit.

When we compile a recipe chain with recipe_chain.build(), we would know right at that moment if we have trust info (i.e. if there is an override in the chain). If the recipe chain has an override, it must have trust, and the trust must be valid. So maybe we just need to add two more variables to RecipeChain:

  • self.contains_override: bool = False - the recipe chain contains an override, defined by a recipe for which recipe.is_override is True;
  • self.trust_is_valid: bool = False - the recipe trust was validated at build time

If contains_override and trust_is_valid are not both True, raise a warning/Error depending on the FAIL_TRUST preference.

If contains_override is False, then we never verify trust because it isn't expected to contain any. Therefore we should never even validate trust because we don't expect anything to validate.

It might be worth putting in some kind of assert, for the RCs at least, that trust_is_valid can never be True while contains_override is False, because that means we have a logic error.

Thus, we actually don't verify recipe trust at the time we execute it (which is historically done by the execution phase of a recipe run), but rather at the time we consider it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've made some progress on this, but it's late and my code needs to be cleaned up before I'm ready for another PR. I'm hoping to make time to clean it up and submit a PR tomorrow.

Some notes (and reminder to myself when I try to read my code again):
Something like self.contains_override in a RecipeChain object is likely not needed and/or complicates things.

Currently, if we detect recipe.is_override, we actively have the Recipe object that also contains the trust information.

If we use recipe.is_override to set self.contains_override, we have to go searching for which recipe was the override again, or we have to do something like self.override_recipe = Recipe

After playing with a few iterations of how to potentially implement this, I think it's best that if during recipe_chain.build(), there is already a loop through all the recipes in the RecipeChain. When we hit if recipe.is_override:, we punt the Recipe over to verify_trust(), which can process the already existing trust information nested in the Recipe object.

Essentially, looks almost like the existing PR, but moves these two lines (259-260) to RecipeChain.build() in the for loop at

for recipe in self.recipes:
self.input.update(recipe.input)
self.process.extend(recipe.process)
# Set our minimum version to the highest we see
if version_equal_or_greater(self.minimum_version, recipe.minimum_version):
self.minimum_version = recipe.minimum_version

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, some new commits are pushed. I have no idea if it's much better than what was there before. Might be worth a 30 minute real-time chat to go over some of the code/logic together.

Comment on lines 261 to 262
# We need a way to map the Recipes in our RecipeChain to the parent
# recipes in the recipe.trust_info.parent_recipes dictionary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RecipeChain contains a list of identifiers already, and a list of all paths in the chain for all parents.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but this method allows me to access the Recipe objects that are already generated with their ParentRecipeTrustInfo objects included in trust_info.

Maybe I'm missing something though, happy to take suggestions on a better implementation for this.

This helps when some overrides have paths such as
~/Library/AutoPkg/RecipeRepos (with home as ~) versus
/Users/<user>/Library/AutoPkg/RecipeRepos
@nmcspadden
Copy link
Contributor

So I checked this out and did a test:

% ./autopkg verify-trust-info GoogleChrome.download
Looking for GoogleChrome.download...
Did not find GoogleChrome.download in recipe map
Rebuilding recipe map with current working directories...
Traceback (most recent call last):
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/./autopkg", line 2624, in <module>
    sys.exit(main(sys.argv))
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/./autopkg", line 2620, in main
    return subcommands[verb]["function"](argv)
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/./autopkg", line 1631, in verify_trust_info
    recipe = old_load_recipe(
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/./autopkg", line 308, in old_load_recipe
    recipe_file = old_locate_recipe(
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/./autopkg", line 203, in old_locate_recipe
    calculate_recipe_map(skip_cwd=False)
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/autopkglib/recipes/__init__.py", line 506, in calculate_recipe_map
    globalRecipeMap["overrides"].update(map_key_to_paths("overrides", override))
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/autopkglib/recipes/__init__.py", line 526, in map_key_to_paths
    recipe = Recipe(match, for_map=False)
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/autopkglib/recipes/__init__.py", line 332, in __init__
    self.from_file(filename, for_map)
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/autopkglib/recipes/__init__.py", line 367, in from_file
    self._parse_trust_info(recipe_dict)
  File "/Users/nmcspadden/Documents/GitHub/autopkg/Code/autopkglib/recipes/__init__.py", line 385, in _parse_trust_info
    trust_blob = TrustBlob(**nc_proc)
TypeError: TrustBlob.__init__() missing 1 required positional argument: 'git_hash'

I'm going to start digging in to this more, but I think there's an order of operations problem.

Right now, when it loads up the Recipe object simply by looking at a file, it tries to calculate the relevant hashes. Unfortunately, as part of our attempt to speed up the recipe map generation time, we commented out the code that fulfills that data at load time, so it's missing from the objects in memory. When it goes to parse the trust info, that missing info causes trust blobs to fail because it doesn't have another way to generate it.

So I think what has to happen is:

  1. The recipe map needs to ignore hashes when building and loading recipes;
  2. The recipe map needs to load up a recipe object, collect the identifier, and then dispense of the object to avoid memory bloat (which it does not do today, so autopkg eats up a fuckton of ram when generating the recipe map)
  3. We only calculate the hashes when we are generating new trust info (i.e. autopkg update-trust-info or autopkg make-override)
  4. We read in the hashes if and only if they are present in the trust info already; we do not generate new ones during a read operation

That should address this circle of problems

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants