Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Porch: PackageVariant Redux #3973

Open
johnbelamaric opened this issue May 23, 2023 · 12 comments
Open

Porch: PackageVariant Redux #3973

johnbelamaric opened this issue May 23, 2023 · 12 comments
Assignees
Labels
area/porch enhancement New feature or request triaged Issue has been triaged by adding an `area/` label

Comments

@johnbelamaric
Copy link
Contributor

johnbelamaric commented May 23, 2023

We have been working with the PackageVariant resource for the last couple weeks in Nephio. Functionally, we're probably OK (apart from bugs which I will work on), but I think a few improvements are warranted. Some issues:

  1. Modifying package context seems to be a bad idea. I kind of suspected this but went ahead with it anyway. However, it seems kpt and Porch both reconstruct it from scratch when cloning the package. This means that you cannot set defaults in there and have PV override them; they are destroyed and there is a bit of a delay before PV reconciles and re-adds them. I think this is just trying to use it in a way that was never intended.
  2. Config injection using the injectors mechanism is awkward. It is designed with the idea that the common use case would be "match all GVKs in the cluster with this name, and associate them to injection points by GVK". But I don't think that's the right model; I think the right model is "I have these injection points, how should I satisfy them". In other words, "this package supports these configuration APIs, how should I set the values". Currently it also means that you can not inject multiple resources of the same GVK; only a single instance per GVK is allowed (well, others are allowed but they will all receive the same values).
  3. Package context and pipeline allow the package variant author to tweak the contents of the downstream with data directly in the PV resource, but config injection only supports tweaking it by inserting contents from the cluster. This means that for simple use cases, we need at least two resources (the PV and the injected resource). It would be nice to be able to directly inject from the PV itself (though doing this for things other than configmaps may not be great).
  4. When PackageVariantSet generates PackageVariants, it uses hash-based naming. This can cause issues because we use the PackageVariant name as part of the function name in the pipeline. (fixed in Change PV naming and fix pointer bug #3986)

I think 1, 2, and 3 can all be addressed with a simplified injection mechanism that works as follows.

  • The PV author would specify a struct for each injection point they want to fulfill, including the name of the injection point, and optionally the apiVersion and kind can also be specified (based on how we do condition types, package authors are required to make Kind and Name sufficient to identify an injection point).
  • The PV author would then be able to either 1) select an in-cluster object by referring to its name; or 2) specifying the resource directly in the PV as an unstructured resource.
  • The user would also be able to specify whether or not to create the injection point if it does not exist. This allows us to create an input config map for apply-setters, for example. This obviates the need for the packageContext construct we use now.

I think this satisfies nearly all, if not all, the original use cases of injection and package context manipulation, but in a more easily understandable way.

I would add this as a new field in PackageVariant, so we would have all the mechanisms available for a while, but the plan would be to mark packageContext and injectors as deprecated and replaced by this (maybe injection?).

For 4), I think we can solve this with a functionQualifier string in the PipelineTemplate. For PV, we would default this to the PV name, so this is totally backward compatible. For PVS, we would set this to the PVS name by default, but also allow the user to specify it like other fields in the PackageVariantTemplate. (this probably is not needed with the fix made in #3986)

cc @henderiw @natasha41575 @mortent @SimonTheLeg

@johnbelamaric johnbelamaric added the enhancement New feature or request label May 23, 2023
@henderiw
Copy link

Maybe PackageVariantContext iso injection (just an idea, as I am thinking of it as providing context to how we want to consume the package for this use case). It would also be nice to have this capability for offline use (meaning just with kpt). We have scenarios where we have a common package we want to be able to use before porch kick(s). E.g. creating repositories for porch where with some PackageVariantContext we can leverage the same package in different ways. E.g. have a regular repo a staging repo, etc.

@johnbelamaric
Copy link
Contributor Author

I'm trying to avoid package-specific APIs. The idea of injection is to allow the package to advertise how it can be configured. Another term @justinsb used for a similar concept is "binding" - so, the package has binding points you can "attach" a resource to. In that case he was dealing more with "binding" two packages together, rather than "injecting" from the cluster. These are similar concepts we may want to unite in some way, but I do think the distinction between "binding packages via an interface" and "injecting config from the cluster via an interface" are subtly different.

@henderiw
Copy link

Yes agree I was not implying package-specific APIs. The thought is to leverage some of this in a semi offline fashion. The way I see it this allows to provide specific context to the package. Whether it is through another package or via a CR/KRM resource are all options

@johnbelamaric
Copy link
Contributor Author

Extending the idea above, we can also allow it to inject resources from other packages, encompassing the "binding" concept. So, it would look something like:

  injection:
    # example of injecting from the cluster and attaching to a named, in-package binding point (aka "injection point")
  - bindingPoint: binding-point-1 # we can enforce unique binding point names across all GVKs
    clusterResource:
      name: in-cluster-resource-name # GVK is implied by bindingPoint
      # eventually when we work out permissions, we can allow namespace here
      # possibly one day, even values from a *different* cluster?
  - bindingPoint: binding-point-2
    directResource:
      apiVersion: foo.bar/v1alpha1    # we can get this from the binding point, but needs thought whether we should
      kind: FooBar
      spec:
        my: value
  - bindingPoint: binding-point-3
    packageResource:
      repo: foo
      package: bar
      revision: v4 # or maybe only allow it from the most recent published revision? needs discussion
      apiVersion: foo.bar/v1alpha1    # we can get this from the binding point, but needs thought whether we should
      kind: FooBar
      name: my-resource-name

Omitting the bindingPoint would mean to insert the resource with a generated name, and to own it completely (like we do for pipeline functions) - it would be added/removed/updated by the PV controller as the PV changes.

There are some details to work out. When we know the binding point, we know the GVK; maybe we should make it optional across the different types of injections. For the package one, we have namespaces for the repo/package and for the resource inside it, so we may want to create separate structures. In general a few more iterations on the API for usability would be useful, but I think this is better than what we have.

@johnbelamaric
Copy link
Contributor Author

Oh, I misunderstood your PackageVariantContext thing - you were suggesting the naming instead of injection? "Context" might be a good word, yes, let's think about names that may give that flavor.

@johnbelamaric
Copy link
Contributor Author

Also, in Nephio we found a need to inject multiple resources of the same type. For example, we use a sentinel CR to represent a demand for a certain set of resources; we then inject that resource from other packages. Something like what is described here (though the actual implementation is a little different): https://github.com/nephio-project/nephio/pull/263/files

I think we could extend the concept described here to manage this type of dependency as well. At least, it's worth discussing if that belongs here or as a separate post-fanout controller (aka "specializer" in Nephio terminology).

@johnbelamaric
Copy link
Contributor Author

johnbelamaric commented Jun 26, 2023

"Context" might be a good word, yes, let's think about names that may give that flavor.

Another point on this, "context" here makes sense in some cases, but I wonder if there are other cases that are a slightly different idea. Things like the cluster the package is going to, the region, dev/test/stage/prod all make sense as "context".

But there are other things, like "configure the capacity of this UPF". It's not really about adjusting the workload to the context it is running in, but more about what we want out of the workload itself. I am not sure this matters at all, but I am trying to tease apart the types of configuration APIs that a package may advertise, and whether there is a meaningful pattern as to how, when (in the overall provisioning process), and by whom they are used.

@johnbelamaric
Copy link
Contributor Author

johnbelamaric commented Sep 20, 2023

Another way to think about this, after a discussion with @liamfallon is to use an analogy to programming. You can think of each binding point name as a variable name. The package author is declaring a sort of function signature - you can set this variable (binding point) to a value of this type (GVK). The PV is like calling the function - you are deciding which values to assign to the variables. In this updated design, PV allows you to take the values from the cluster, from another package, or set them to a literal value. Similarly, each variable (binding point) can be required or optional. Optional ones take the default values if not set.

The key points are:

  1. Package authors get to define the typed (schema'd), versioned variable "inputs" to the package, and have a name they can use internal to access the values passed by the user.
  2. The user can pick ("bind") the input values from various places.
  3. The system can monitor the input values, and if they change, it can initiate a new Draft with the new values.

@johnbelamaric
Copy link
Contributor Author

Another thought based on @tliron's tko demo: should we allow (at least) the "direct" binding to be a patch as opposed to a complete resource? This allows more flexibility around defaulting.

In fact today "injection" is effectively a "patch" of the entire spec...so maybe there's something more general there.

See also #3121 (I need to read that one thoroughly too...I think there are some gotchas in here).

@johnbelamaric
Copy link
Contributor Author

It would also be nice to have this capability for offline use (meaning just with kpt). We have scenarios where we have a common package we want to be able to use before porch kick(s).

that's an interesting idea - could PV, or some kernel of PV, live in a package and be resolved client-side?

It's a bit of a tangent, but in general, I'd like to move towards APIs that can be resolved by different tools. So, for example, instead of embedding upstream information in the Kptfile, we would create an "Upstream" resource that lives in the package. It could be processed client-side by kpt, or server-side by Porch. In time, this would become a collection of package-oriented APIs that even other config tools could support. This also raises the idea of whether our current PackageRevisionResources is the right model for mapping package contents into K8s APIs. I would prefer something that makes those more "real" in the server, but of course without any actuation mechanism. This is a bigger discussion I am digressing into...I'll try to write up some thoughts in a design proposal soon.

@henderiw
Copy link

henderiw commented Jan 2, 2024

Have been looking a bit deeper into this and looking at the pipeline changes. What would be a good capability to have here is what the intention was with package-context but apply this to a new resource and use cel-functions to set the values of the parameters.

E.g.

  • bindingPoint: binding-point-1
    clusterResource:
    name: cel-function(x)

@henderiw
Copy link

henderiw commented Jan 2, 2024

Also on the clusterResourceBinding point would we not use ?

  • apiVersion
  • kind
  • name
    and maybe namespace in the future?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/porch enhancement New feature or request triaged Issue has been triaged by adding an `area/` label
Projects
None yet
Development

No branches or pull requests

4 participants