
Error in revision-controller handling sidecar images when using ko.local #1093

Closed
trisberg opened this issue Jun 7, 2018 · 10 comments
Labels: area/API (API objects and controllers), kind/bug (Categorizes issue or PR as related to a bug)

Comments

@trisberg
Member

trisberg commented Jun 7, 2018

Expected Behavior

Deploying development build using minikube and ko.local should work for creating sample apps

Actual Behavior

The revision-controller throws an error related to the image specified for queueSidecarImage.

Using an image in the controller args like ko.local/github.com/knative/serving/cmd/queue:80201dca6c68711ee1cd8d2a2f4978530dc9e51d920fe53217ada2844102dabf fails, while changing it to the GCR release image gcr.io/knative-releases/github.com/knative/serving/cmd/queue@sha256:1a5de85ab28dc940c5770516d045d5e9d91c3e5b794e57a09655efeadbe240bc works fine.

Steps to Reproduce the Problem

  1. Build and deploy knative-serving to minikube using ko, with KO_DOCKER_REPO='ko.local'.
  2. Deploy the primer sample app.
  3. The deployment never gets created, and the controller log contains an error message like:
knative-serving-system/controller-c568fc76-6d86z[controller]: {"level":"error","logger":"controller.revision-controller","caller":"revision/revision.go:773","msg":"Error resolving deployment{error 25 0  Get http://ko.local/v2/: dial tcp: lookup ko.local on 10.96.0.10:53: no such host}"

Additional Info

Log message:

knative-serving-system/controller-c568fc76-6d86z[controller]: {"level":"error","logger":"controller.revision-controller","caller":"revision/revision.go:773","msg":"Error resolving deployment{error 25 0  Get http://ko.local/v2/: dial tcp: lookup ko.local on 10.96.0.10:53: no such host}","knative.dev/controller":"revision-controller","knative.dev/namespace":"default","knative.dev/revision":"primertemplate-00001","stacktrace":"github.com/knative/serving/pkg/controller/revision.(*Controller).reconcileDeployment\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:773\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).createK8SResources\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:658\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).reconcileOnceBuilt\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:607\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).reconcileWithImage\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:282\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).syncHandler\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:277\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).(github.com/knative/serving/pkg/controller/revision.syncHandler)-fm\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:210\ngithub.com/knative/serving/pkg/controller.(*Base).processNextWorkItem.func1\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/controller.go:219\ngithub.com/knative/serving/pkg/controller.(*Base).processNextWorkItem\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/controller.go:227\ngithub.com/knative/serving/pkg/controller.(*Base).RunController.func1\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/con
troller.go:172\ngithub.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntknative-serving-system/controller-c568fc76-6d86z[controller]: {"level":"error","logger":"controller.revision-controller","caller":"revision/revision.go:659","msg":"Failed to create a deployment{error 25 0  Get http://ko.local/v2/: dial tcp: lookup ko.local on 10.96.0.10:53: no such host}","knative.dev/controller":"revision-controller","knative.dev/namespace":"default","knative.dev/revision":"primertemplate-00001","stacktrace":"github.com/knative/serving/pkg/controller/revision.(*Controller).createK8SResources\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:659\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).reconcileOnceBuilt\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:607\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).reconcileWithImage\n\t/Users/trisberg/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:282\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).syncHandler\n\t/Users/trisberg/go/src/github.com/knativeknative-serving-system/controller-c568fc76-6d86z[controller]: E0607 17:01:00.880004       1 controller.go:230] error syncing "default/primertemplate-00001": Get http://ko.local/v2/: dial tcp: lookup ko.local on 10.96.0.10:53: no such host
@google-prow-robot added the area/API and kind/bug labels on Jun 7, 2018
@scothis
Contributor

scothis commented Jun 13, 2018

/assign pivotal-sukhil-suresh

@sukhil-suresh
Contributor

sukhil-suresh commented Jun 18, 2018

Followed the repro steps listed by @trisberg and got the same error.

  • Error Log:
{
  "level": "error",
  "logger": "controller.revision-controller",
  "caller": "revision/revision.go:653",
  "msg": "Error resolving deployment{error 25 0  Get http://ko.local/v2/: dial tcp: lookup ko.local on 10.96.0.10:53: no such host}",
  "knative.dev/controller": "revision-controller",
  "knative.dev/namespace": "default",
  "knative.dev/revision": "primertemplate-00001",
  "stacktrace": "github.com/knative/serving/pkg/controller/revision.(*Controller).reconcileDeployment\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:653\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).createK8SResources\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:534\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).reconcileOnceBuilt\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:488\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).reconcileWithImage\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:322\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).syncHandler\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:317\ngithub.com/knative/serving/pkg/controller/revision.(*Controller).(github.com/knative/serving/pkg/controller/revision.syncHandler)-fm\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:245\ngithub.com/knative/serving/pkg/controller.(*Base).processNextWorkItem.func1\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/controller.go:205\ngithub.com/knative/serving/pkg/controller.(*Base).processNextWorkItem\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/controller.go:213\ngithub.com/knative/serving/pkg/controller.(*Base).RunController.func1\n\t/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/controller.go:158\ngithub.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/Users/pivotal/go/src/github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/Users/pivotal/go/src/github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/knative/serving
/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/Users/pivotal/go/src/github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"
}
  • Stacktrace:
github.com/knative/serving/pkg/controller/revision.(*Controller).reconcileDeployment
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:653
github.com/knative/serving/pkg/controller/revision.(*Controller).createK8SResources
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:534
github.com/knative/serving/pkg/controller/revision.(*Controller).reconcileOnceBuilt
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:488
github.com/knative/serving/pkg/controller/revision.(*Controller).reconcileWithImage
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:322
github.com/knative/serving/pkg/controller/revision.(*Controller).syncHandler
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:317
github.com/knative/serving/pkg/controller/revision.(*Controller).(github.com/knative/serving/pkg/controller/revision.syncHandler)-fm
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/revision/revision.go:245
github.com/knative/serving/pkg/controller.(*Base).processNextWorkItem.func1
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/controller.go:205
github.com/knative/serving/pkg/controller.(*Base).processNextWorkItem
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/controller.go:213
github.com/knative/serving/pkg/controller.(*Base).RunController.func1
	/Users/pivotal/go/src/github.com/knative/serving/pkg/controller/controller.go:158
github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/Users/pivotal/go/src/github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/Users/pivotal/go/src/github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait.Until
	/Users/pivotal/go/src/github.com/knative/serving/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88

PS: If required, I have a handy script to tear down/start up minikube and deploy the app on knative to gather logs.

@sukhil-suresh
Contributor

sukhil-suresh commented Jun 18, 2018

Debug attempts so far:

  1. Got a rough manual trace of the call path from knative into the ko library:
github.com/knative/serving/pkg/controller/revision/revision.go - reconcileDeployment()
github.com/knative/serving/pkg/controller/revision/resolve.go - Resolve()
github.com/knative/serving/vendor/github.com/google/go-containerregistry/v1/remote/image.go - Image()
github.com/knative/serving/vendor/github.com/google/go-containerregistry/v1/remote/transport/transport.go - New()
github.com/knative/serving/vendor/github.com/google/go-containerregistry/v1/remote/transport/ping.go - ping()
  2. Tried out ko apply -L -f with warm-image using minikube, and it worked.

  3. Altered the istio.sidecar.includeOutboundIPRanges value in config-network.yaml to use the minikube value 10.0.0.1/24 instead of *: no impact; it still failed with the same error.

  4. Added logs to capture the exact parameters passed by the knative revision controller to the ko library:

remote.Image(tag, auth, r.transport)
func Image(ref name.Reference, auth authn.Authenticator, t http.RoundTripper) (v1.Image, error) {}

Parameters logged when failing with Get http://ko.local/v2/: dial tcp: lookup ko.local on 10.96.0.10:53: no such host

tag: ko.local/github.com/knative/serving/cmd/queue:a1bd90556a21b707a5cc18874c7d46be7139f6310f8939586274f04faa3540bc
auth: &{}
r.transport: &{{0 0} false map[] map[{ https gcr.io:443}:0xc4205a2d20 { http ko.local:80}:0xc4200a25a0] {<nil> map[]} {0 0} map[] {0 0} {map[https:{0xc4202a1220}]} 0x67b1f0 0x690ec0 <nil> <nil> 0xc420bce000 10s false false 100 0 1m30s 0s 1s map[h2:0x688000] map[] 0 {{0 0} 1} 0xc4202a1220}

@sukhil-suresh
Contributor

At this point, I am convinced that the flaw lies in the vendored github.com/google/go-containerregistry/cmd/ko library.
Ref: https://github.com/knative/serving/blob/master/Gopkg.toml#L30

Why ko? As mentioned in the previous message, the revision controller, when reconciling the deployment (on knative), tries to resolve a remote image using the vendored github.com/knative/serving/vendor/github.com/google/go-containerregistry/v1/remote/image.go. This is the point of failure.

Unsure as to why the logged value for the image tag parameter is ko.local/github.com/knative/serving/cmd/queue:a1bd90556a21b707a5cc18874c7d46be7139f6310f8939586274f04faa3540bc

To convince myself that the fault lay with the ko codebase, I tried a sample app with the same input params and got the same lookup failure. Maybe special handling is required for ko.local-prefixed image tags?

At this point, to proceed further, I need to better understand how ko handles ko.local-prefixed URLs for operations besides remote image lookup, e.g. pushing images.

Tried the sample app at the knative/serving vendor constraint of 8bdfdfc3bd4146f40b7709f6b1b7d979a3385da8 for github.com/google/go-containerregistry/cmd/ko and also at the latest HEAD b32ffadc0522f1db02422bf6c01e0ad62dc64c97, with the same error message.

Sample app snippet for 8bdfdfc3bd4146f40b7709f6b1b7d979a3385da8 :

package main

// Import paths follow the 8bdfdfc3 layout of go-containerregistry
// (before the packages moved under pkg/).
import (
	"fmt"
	"net/http"

	"github.com/google/go-containerregistry/authn"
	"github.com/google/go-containerregistry/name"
	"github.com/google/go-containerregistry/v1/remote"
)

func main() {
	fmt.Println("test app for #1093")

	tag, err := name.NewTag("ko.local/github.com/knative/serving/cmd/queue:77177e7d6964bffca86536c07863620eb1bb8e4a009fc32f22f600341e2e693c", name.WeakValidation)
	if err != nil {
		panic("tag creation failed. " + err.Error())
	}

	img, err := remote.Image(tag, authn.Anonymous, http.DefaultTransport)
	if err != nil {
		panic("remote image resolution failed. " + err.Error())
	}

	fmt.Printf("remote image resolved: %v\n", img)
}

Note: there are minor variations in the imports and remote.Image call between 8bdfdfc3bd4146f40b7709f6b1b7d979a3385da8 and latest HEAD

@jchesterpivotal
Contributor

During our further investigation today, we found that this issue also affects Docker for Desktop almost identically. We don't think it's exclusive to minikube.

@sukhil-suresh
Copy link
Contributor

The root error occurs when a ko.local address lookup is attempted against kube-dns by the go-containerregistry library. The solutions @jchesterpivotal and I considered are:

  1. Patch the go-containerregistry library to special-case the ko.local address, i.e. replace ko.local references with the local docker registry address.

  2. Stop using ko apply -L -f config/ for a local knative deploy. Instead, pipe the output of ko resolve -L -f config/, replace references to ko.local with the local docker registry address, and pass the processed yaml files to kubectl apply -f -.

We are convinced that solution 2 is the better approach, since it does not introduce special handling for any address. The bigger challenge at the moment is identifying a local docker registry address that can be used from within the kubernetes cluster.
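[Editor's note] The post-processing step in solution 2 can be sketched as a simple string rewrite over the resolved YAML. This is a minimal sketch, not actual serving or ko code; the registry address "registry.example:5000" is a placeholder, since identifying the real in-cluster address was the open question.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteRegistry replaces ko.local image references in resolved YAML
// with a registry address reachable from inside the cluster.
func rewriteRegistry(yaml, registry string) string {
	return strings.Replace(yaml, "ko.local", registry, -1)
}

func main() {
	resolved := "image: ko.local/github.com/knative/serving/cmd/queue:abc123"
	// "registry.example:5000" is a hypothetical placeholder address.
	fmt.Println(rewriteRegistry(resolved, "registry.example:5000"))
	// prints: image: registry.example:5000/github.com/knative/serving/cmd/queue:abc123
}
```

The rewritten YAML would then be piped to kubectl apply -f -.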

@jchesterpivotal
Contributor

@pivotal-sukhil-suresh and I kept investigating yesterday.

Option 2 does not work, because the components we need to adjust (queue and autoscaler) are not part of the ko apply that option 2 affects.

We started on Option 1 and got it half-working. We can detect ko.local as a special case and prevent go-containerregistry from trying to perform a DNS lookup that will never succeed. But go-containerregistry still assumes that it will pull from a docker registry, meaning we would need to provide whatever magic value that would point it at the docker daemon internal to minikube.

@sukhil-suresh
Contributor

sukhil-suresh commented Jun 22, 2018

@jchesterpivotal and I kept at it yesterday.

The approach we explored yesterday was enabling the minikube addon registry.

The knative codebase (through go-containerregistry) can communicate with the addon registry and tries to pull images from there.

The challenge at the moment is getting images into this addon registry using ko from outside the minikube kubernetes cluster. This is because the registry is configured to be insecure (non-TLS) by default, while the docker client code used by ko insists on TLS by default.

The policy of allowing communication with an insecure registry is configured on the docker daemon and cannot be set from the docker CLI.

We are able to successfully talk to the addon registry (inside kubernetes, inside minikube) from the docker CLI (outside minikube) by adding it as an insecure registry to our host docker daemon (outside minikube, in our case Docker for Desktop). But ko does not appear to rely on that docker daemon and does not pick up the insecure-registry configuration, so it fails to communicate with the addon registry from outside minikube. We're also unsure whether this holds for ko when it is invoked inside minikube by knative.

@sukhil-suresh
Contributor

@jchesterpivotal and I paired on this:

Most of yesterday was spent trying to get ko (which insists on talking securely) to talk to the minikube addon registry, which is insecure by default. Today we pivoted and focused on enabling TLS for the minikube registry so that ko can talk to it. This work remains incomplete as of EOD, but we are documenting the effort/approach we took:

How-To on controlling ingress traffic rules for the addon registry using the Istio ingress-gateway.
Ref: https://istio.io/docs/tasks/traffic-management/ingress/#add-a-secure-port-https-to-our-gateway

How-To for generating the certificate for addon registry (signed with the kubernetes cluster CA)
Ref: https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster/#requesting-a-certificate

@mattmoor
Member

I think the most expedient solution to this would be to simply restrict the tag-to-digest resolution to pod.Containers[i].Name == userContainerName.

Sorry I didn't see this previously.
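[Editor's note] The restriction @mattmoor proposes can be sketched as follows. This is an illustrative sketch, not the actual serving code; userContainerName is assumed to be "user-container", and the real constant and types live in the serving codebase.

```go
package main

import "fmt"

// userContainerName is an assumed value; the real constant is defined in knative/serving.
const userContainerName = "user-container"

// Container is a simplified stand-in for the pod container spec.
type Container struct {
	Name  string
	Image string
}

// resolveDigests performs tag-to-digest resolution only for the user
// container, leaving sidecar images (queue, fluentd, etc.) untouched.
func resolveDigests(containers []Container, resolve func(string) string) {
	for i := range containers {
		if containers[i].Name == userContainerName {
			containers[i].Image = resolve(containers[i].Image)
		}
	}
}

func main() {
	containers := []Container{
		{Name: "user-container", Image: "example/app:v1"},
		{Name: "queue-proxy", Image: "ko.local/queue:abc"},
	}
	// A stub resolver standing in for the real registry lookup.
	resolveDigests(containers, func(img string) string { return img + "@sha256:deadbeef" })
	for _, c := range containers {
		fmt.Println(c.Name, c.Image)
	}
}
```

With this restriction, the controller would never attempt a DNS lookup for a ko.local sidecar image, which is the failure described in this issue.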

mchmarny pushed a commit that referenced this issue Jun 29, 2018
By default the controller is configured to skip resolving
for the following registries:
- ko.local
- dev.local

This should address issue #1093

Co-authored-by: Sukhil Suresh <ssuresh@pivotal.io>
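[Editor's note] The default skip behavior this commit describes can be sketched as a registry allow/deny check before resolution. The function and variable names here are illustrative, not the actual serving code, and the registry parsing is simplified (real image references go through go-containerregistry's name package).

```go
package main

import (
	"fmt"
	"strings"
)

// registriesToSkip mirrors the default skip list described in the commit.
var registriesToSkip = map[string]bool{
	"ko.local":  true,
	"dev.local": true,
}

// shouldResolve reports whether an image reference should go through
// tag-to-digest resolution. The registry is taken as the part of the
// reference before the first "/", a simplification for this sketch.
func shouldResolve(image string) bool {
	registry := strings.SplitN(image, "/", 2)[0]
	return !registriesToSkip[registry]
}

func main() {
	fmt.Println(shouldResolve("ko.local/github.com/knative/serving/cmd/queue:abc")) // false
	fmt.Println(shouldResolve("gcr.io/knative-releases/queue@sha256:abc"))          // true
}
```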