Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move init phase of KPR configuration outside of Hive Runtime #32216

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
8 changes: 7 additions & 1 deletion daemon/cmd/cells_test.go
Expand Up @@ -37,10 +37,16 @@ func TestAgentCell(t *testing.T) {

logging.SetLogLevelToDebug()

h := hive.New(Agent)

NewAgentCmd(h)
option.Config.Populate(h.Viper())

// Populate config with default values normally set by Viper flag defaults
option.Config.IPv4ServiceRange = AutoCIDR
option.Config.IPv6ServiceRange = AutoCIDR

err := hive.New(Agent).Populate(hivetest.Logger(t))
err := h.Populate(hivetest.Logger(t))

assert.NoError(t, err, "Populate()")
}
12 changes: 8 additions & 4 deletions daemon/cmd/daemon.go
Expand Up @@ -358,15 +358,19 @@ func newDaemon(ctx context.Context, cleaner *daemonCleanup, params *daemonParams
}
}

// Do the partial kube-proxy replacement initialization before creating BPF
// Do the partial kube-proxy replacement probing before creating BPF
// maps. Otherwise, some maps might not be created (e.g. session affinity).
//
// During the populate phase, initKubeProxyReplacementOptions is run to finalize
// some of the options prior to runtime.
//
// finishKubeProxyReplacementInit(), which is called later after the device
// detection, might disable BPF NodePort and friends. But this is fine, as
// the feature does not influence the decision which BPF maps should be
// created.
if err := initKubeProxyReplacementOptions(params.Sysctl, params.TunnelConfig); err != nil {
log.WithError(err).Error("unable to initialize kube-proxy replacement options")
return nil, nil, fmt.Errorf("unable to initialize kube-proxy replacement options: %w", err)
if err := probeKubeProxyReplacementOptions(params.Sysctl); err != nil {
log.WithError(err).Error("unable to probe kube-proxy replacement options")
return nil, nil, fmt.Errorf("unable to probe kube-proxy replacement options: %w", err)
}

ctmap.InitMapInfo(option.Config.EnableIPv4, option.Config.EnableIPv6, option.Config.EnableNodePort)
Expand Down
19 changes: 17 additions & 2 deletions daemon/cmd/daemon_main.go
Expand Up @@ -1574,6 +1574,7 @@ var daemonCell = cell.Module(
"Legacy Daemon",

cell.Provide(
initKPROptions,
newDaemonPromise,
newRestorerPromise,
func() k8s.CacheStatus { return make(k8s.CacheStatus) },
Expand Down Expand Up @@ -1647,7 +1648,21 @@ type daemonParams struct {
CompilationLock datapath.CompilationLock
}

func newDaemonPromise(params daemonParams) promise.Promise[*Daemon] {
func initKPROptions(tunnelCfg tunnel.Config) (*option.FinalDaemonConfig, error) {
// Prior to creating the daemonPromise, we do the first phase of kpr
// initialization.
// This looks at existing configuration to make any necessary option
// overrides related to kpr.
// Following this, probeKubeProxyReplacementOptions(...) and then
// finally finishKubeProxyReplacementInit(...) are run inside newDeamon
// which is done at Hive runtime.
if err := initKubeProxyReplacementOptions(tunnelCfg); err != nil {
return nil, err
}
return (*option.FinalDaemonConfig)(option.Config), nil
}

func newDaemonPromise(params daemonParams, _ *option.FinalDaemonConfig) (promise.Promise[*Daemon], error) {
daemonResolver, daemonPromise := promise.New[*Daemon]()

// daemonCtx is the daemon-wide context cancelled when stopping.
Expand Down Expand Up @@ -1690,7 +1705,7 @@ func newDaemonPromise(params daemonParams) promise.Promise[*Daemon] {
return nil
},
})
return daemonPromise
return daemonPromise, nil
}

// startDaemon starts the old unmodular part of the cilium-agent.
Expand Down
14 changes: 9 additions & 5 deletions daemon/cmd/kube_proxy_replacement.go
Expand Up @@ -31,16 +31,18 @@ import (
"github.com/cilium/cilium/pkg/safeio"
)

// initKubeProxyReplacementOptions will grok the global config and determine
// initKubeProxyReplacementOptions will grok the daemon config and determine
// if we strictly enforce a kube-proxy replacement.
// Note: Runtime probing is performed separately in probeKubeProxyReplacementOptions.
// This function should be run prior hive runtime.
//
// if we determine the config denotes a "strict" kube-proxy replacement, the
// returned boolean will be true, when we detect a "non-strict" configuration the
// return boolean is false.
//
// if this function cannot determine the strictness an error is returned and the boolean
// is false. If an error is returned the boolean is of no meaning.
func initKubeProxyReplacementOptions(sysctl sysctl.Sysctl, tunnelConfig tunnel.Config) error {
func initKubeProxyReplacementOptions(tunnelConfig tunnel.Config) error {
if option.Config.KubeProxyReplacement != option.KubeProxyReplacementTrue &&
option.Config.KubeProxyReplacement != option.KubeProxyReplacementFalse {
return fmt.Errorf("Invalid value for --%s: %s", option.KubeProxyReplacement, option.Config.KubeProxyReplacement)
Expand Down Expand Up @@ -184,9 +186,7 @@ func initKubeProxyReplacementOptions(sysctl sysctl.Sysctl, tunnelConfig tunnel.C
return fmt.Errorf("Failed to initialize maglev hash seeds: %w", err)
}
}
}

if option.Config.EnableNodePort {
if option.Config.TunnelingEnabled() && tunnelConfig.Protocol() == tunnel.VXLAN &&
option.Config.LoadBalancerUsesDSR() {
return fmt.Errorf("Node Port %q mode cannot be used with %s tunneling.", option.Config.NodePortMode, tunnel.VXLAN)
Expand Down Expand Up @@ -251,12 +251,16 @@ func initKubeProxyReplacementOptions(sysctl sysctl.Sysctl, tunnelConfig tunnel.C
return nil
}

return probeKubeProxyReplacementOptions(sysctl)
return nil
}

// probeKubeProxyReplacementOptions checks whether the requested KPR options can be enabled with
// the running kernel.
func probeKubeProxyReplacementOptions(sysctl sysctl.Sysctl) error {
if option.Config.DryMode {
return nil
}

if option.Config.EnableNodePort {
if probes.HaveProgramHelper(ebpf.SchedCLS, asm.FnFibLookup) != nil {
return fmt.Errorf("BPF NodePort services needs kernel 4.17.0 or newer")
Expand Down
4 changes: 1 addition & 3 deletions daemon/cmd/kube_proxy_replacement_test.go
Expand Up @@ -7,10 +7,8 @@ import (
"strings"

. "github.com/cilium/checkmate"
"github.com/spf13/afero"
"github.com/spf13/cobra"

"github.com/cilium/cilium/pkg/datapath/linux/sysctl"
"github.com/cilium/cilium/pkg/datapath/tunnel"
"github.com/cilium/cilium/pkg/hive"
"github.com/cilium/cilium/pkg/option"
Expand Down Expand Up @@ -65,7 +63,7 @@ func (cfg *kprConfig) set() {
}

func (cfg *kprConfig) verify(c *C, tc tunnel.Config) {
err := initKubeProxyReplacementOptions(sysctl.NewDirectSysctl(afero.NewOsFs(), "/proc"), tc)
err := initKubeProxyReplacementOptions(tc)
if err != nil || cfg.expectedErrorRegex != "" {
c.Assert(err, ErrorMatches, cfg.expectedErrorRegex)
if strings.Contains(cfg.expectedErrorRegex, "Invalid") {
Expand Down
3 changes: 2 additions & 1 deletion pkg/mtu/cell.go
Expand Up @@ -33,6 +33,7 @@ type mtuParams struct {
IPsec types.IPsecKeyCustodian
CNI cni.CNIConfigManager
TunnelConfig tunnel.Config
Config *option.FinalDaemonConfig
}

func newForCell(lc cell.Lifecycle, p mtuParams) MTU {
Expand All @@ -57,7 +58,7 @@ func newForCell(lc cell.Lifecycle, p mtuParams) MTU {
option.Config.EnableIPSec,
p.TunnelConfig.ShouldAdaptMTU(),
option.Config.EnableWireguard,
option.Config.EnableHighScaleIPcache && option.Config.EnableNodePort,
option.Config.EnableHighScaleIPcache && p.Config.NodePortEnabled(),
configuredMTU,
externalIP,
)
Expand Down
8 changes: 8 additions & 0 deletions pkg/option/config.go
Expand Up @@ -1356,6 +1356,14 @@ func LogRegisteredOptions(vp *viper.Viper, entry *logrus.Entry) {
}
}

// FinalDaemonConfig is a alias for DaemonConfig used to differentiate what stage of daemon option init
// the config is when it is provided.
type FinalDaemonConfig DaemonConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand the intent for this. It's a parallel type and not an alias and it wraps an exported field. Is there an interface at play here?

Copy link
Contributor

@joamaki joamaki May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems more complicated than it needs to be. Could we instead just use the DaemonConfig type and have the provider for it also do the KPR etc. init/validation? (daemon/cmd/cells.go:91).

I'm worried that things outside NewDaemon might also want to use this flag and they might not be ordered correctly if they don't use FinalDaemonConfig. So seems safer to just have one type whose provider correctly initializes it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@derailed The aliased type allows us to "re-provide" the same var (i.e. *option.DaemonConfig) but in a later state of initialization. That is, we Provide DaemonConfig to hive for convenience but the underlying data is a global variable in pkg/option, this var is modified at various points at both init/runtime for things like overriding required config for KPR.

This presents the problem that this breaks assumptions about Hive dependency ordering, as well as having configuration that cannot be finalized prior to runtime.

So by providing the FinalDaemonConfig after the kpr init stuff, we provide a synchronization point to Hive where we know that the KPR init has already occured.

i.e.:

// Arbitrary ordering, KPR init may not be complete prior to this: 
cell.Provide(func(_ *option.DaemonConfig) {})

// This depends on KPR init, so using fields mutated by that routine is "safe": 
cell.Provide(func(_ *option.FinalDaemonConfig) {})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joamaki this was the approach I tried initially, albeit with a "pre-init" DaemonConfig alias, but since this is global we might be able to just use the global var there and then provide DaemonConfig as a "post-kpr-init" state as the primary dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Final" part of this is actually misleading because this is only "final" for the KPR related fields so that might cause confusion - so I think maybe reworking this makes sense.

If these approaches prove unweidly, it may make sense to just step back and come up with a plan for how to fix this in its entirety (i.e. get rid of the global *option.DaemonConfig). In the meantime we could also solve this by having the KPR init provider just provide some token dependency (ex. postKPRInit{}) where anything depending on that would be forced to wait for that to complete.


func (c *FinalDaemonConfig) NodePortEnabled() bool {
return c.EnableNodePort
}

// DaemonConfig is the configuration used by Daemon.
type DaemonConfig struct {
CreationTime time.Time
Expand Down