cloud-proxy-server never has an external IP assigned #1867

Open
dcfranca opened this issue Apr 2, 2024 · 8 comments
@dcfranca

dcfranca commented Apr 2, 2024

Describe the bug
I'm deploying Pixie locally to a Colima cluster for testing and PoC purposes.
Running ./dev_dns_updater seems to get stuck, so I checked the LoadBalancer services and found a weird situation:

NAME                  TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                       AGE
cloud-proxy-service   LoadBalancer   10.43.209.160   <pending>     443:30758/TCP,4444:30058/TCP,5555:32671/TCP   5m16s
❯ kubectl get service vzconn-service -n plc
vzconn-service   LoadBalancer   10.43.53.124   192.168.5.1   51600:31468/TCP   17d

As you can see, vzconn-service worked fine and has an IP assigned, but for some reason cloud-proxy-service doesn't, which I think might be the root cause of the issue with dev_dns_updater.

If neither had an IP, I would assume something is wrong with the load balancer assignment, but if it worked for one, why didn't it work for the other?
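
For anyone comparing several services at once, here is a minimal sketch (not part of Pixie) that lists every LoadBalancer service still waiting for an address. It reads the standard Service fields `spec.type` and `status.loadBalancer.ingress`; the function name `pending_load_balancers` and the trimmed sample are my own and only mirror the output above.

```python
def pending_load_balancers(service_list):
    """Return (namespace, name) for LoadBalancer services with no ingress address."""
    pending = []
    for svc in service_list.get("items", []):
        if svc.get("spec", {}).get("type") != "LoadBalancer":
            continue
        # An unassigned service has an empty or missing ingress list.
        ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        if not ingress:
            meta = svc["metadata"]
            pending.append((meta.get("namespace", "default"), meta["name"]))
    return pending


# Sample trimmed to the fields used; values mirror the kubectl output above.
sample = {
    "items": [
        {
            "metadata": {"namespace": "plc", "name": "cloud-proxy-service"},
            "spec": {"type": "LoadBalancer"},
            "status": {"loadBalancer": {}},
        },
        {
            "metadata": {"namespace": "plc", "name": "vzconn-service"},
            "spec": {"type": "LoadBalancer"},
            "status": {"loadBalancer": {"ingress": [{"ip": "192.168.5.1"}]}},
        },
    ]
}

print(pending_load_balancers(sample))
```

On a real cluster the input would be the parsed output of `kubectl get services --all-namespaces -o json` rather than the hand-written sample.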

I checked the events for the service and the pod, but I don't see anything wrong there:

Service
Events:
  Type    Reason                Age    From                Message
  ----    ------                ----   ----                -------
  Normal  EnsuringLoadBalancer  7m21s  service-controller  Ensuring load balancer
  Normal  AppliedDaemonSet      7m21s  service-controller  Applied LoadBalancer DaemonSet kube-system/svclb-cloud-proxy-service-80a58f80
Pod
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  16m   default-scheduler  Successfully assigned plc/cloud-proxy-7897b497cb-sx82r to colima
  Normal  Pulled     16m   kubelet            Container image "gcr.io/pixie-oss/pixie-prod/cloud/proxy_server_image:0.1.7" already present on machine
  Normal  Created    16m   kubelet            Created container cloud-proxy-server
  Normal  Started    16m   kubelet            Started container cloud-proxy-server
  Normal  Pulled     16m   kubelet            Container image "envoyproxy/envoy:v1.12.2@sha256:b36ee021fc4d285de7861dbaee01e7437ce1d63814ead6ae3e4dfcad4a951b2e" already present on machine
  Normal  Created    16m   kubelet            Created container envoy
  Normal  Started    16m   kubelet            Started container envoy

The only thing I see are some warnings on the cloud-proxy-server container, but I don't think they are an issue:

2024/04/02 16:13:23 [warn] 8#8: could not build optimal variables_hash, you should increase either variables_hash_max_size: 1024 or variables_hash_bucket_size: 64; ignoring variables_hash_bucket_size
nginx: [warn] could not build optimal variables_hash, you should increase either variables_hash_max_size: 1024 or variables_hash_bucket_size: 64; ignoring variables_hash_bucket_size
Stream closed EOF for plc/cloud-proxy-7897b497cb-sx82r (cloud-proxy-server)

Any idea what could be preventing the service from getting an external IP?

To Reproduce
Steps to reproduce the behavior:

  1. Install Pixie on Colima running locally
  2. Observe that the cloud-proxy-service service never gets an external IP

Expected behavior
An external IP is assigned to cloud-proxy-service.

App information (please complete the following information):

  • Pixie version: 0.1.7
  • K8s cluster version: v1.27.1+k3s1
  • Node Kernel version
  • Browser version
@dcfranca
Author

dcfranca commented Apr 4, 2024

I removed the tcp-https port from the cloud-proxy-service service, leaving only the tcp-grpc and tcp-http2 ports, and now it gets an IP assigned (not sure if this is the right thing to do, but vzconn-service doesn't have one either):

❯ kubectl get service cloud-proxy-service -n plc
NAME                  TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                         AGE
cloud-proxy-service   LoadBalancer   10.43.209.160   192.168.5.1   4444:30058/TCP,5555:32671/TCP   43h

❯ kubectl get service vzconn-service -n plc
NAME             TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)           AGE
vzconn-service   LoadBalancer   10.43.53.124   192.168.5.1   51600:31468/TCP   19d
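
The fact that the IP shows up once port 443 is dropped would be consistent with a host-port conflict, though nothing in this thread confirms it. k3s's bundled ServiceLB (the source of the svclb-... DaemonSet in the events above) exposes each service port as a host port on the node, and per the k3s documentation a LoadBalancer stays pending when no node has that port free. Below is a rough sketch for spotting ports claimed by more than one LoadBalancer service. The function name `shared_lb_ports` is my own, and the `kube-system/traefik` entry in the sample is hypothetical, included only to show what a conflict would look like.

```python
from collections import defaultdict


def shared_lb_ports(service_list):
    """Map each (port, protocol) claimed by several LoadBalancer services to its claimants."""
    claims = defaultdict(list)
    for svc in service_list.get("items", []):
        spec = svc.get("spec", {})
        if spec.get("type") != "LoadBalancer":
            continue
        meta = svc["metadata"]
        owner = f'{meta.get("namespace", "default")}/{meta["name"]}'
        for port in spec.get("ports", []):
            # Protocol defaults to TCP in the Service spec.
            claims[(port["port"], port.get("protocol", "TCP"))].append(owner)
    return {key: owners for key, owners in claims.items() if len(owners) > 1}


# Ports for the two plc services come from the output above.
# The traefik service is HYPOTHETICAL and may not exist on a Colima cluster.
sample = {
    "items": [
        {
            "metadata": {"namespace": "plc", "name": "cloud-proxy-service"},
            "spec": {
                "type": "LoadBalancer",
                "ports": [{"port": 443}, {"port": 4444}, {"port": 5555}],
            },
        },
        {
            "metadata": {"namespace": "plc", "name": "vzconn-service"},
            "spec": {"type": "LoadBalancer", "ports": [{"port": 51600}]},
        },
        {
            "metadata": {"namespace": "kube-system", "name": "traefik"},
            "spec": {"type": "LoadBalancer", "ports": [{"port": 80}, {"port": 443}]},
        },
    ]
}

print(shared_lb_ports(sample))
```

This only compares Service objects. A port held on the node by something that is not a LoadBalancer service would not show up here, so an empty result does not rule out a conflict.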

Still, ./dev_dns_updater gets stuck, although it now shows a bit more logging:

INFO[0000] DNS Entries                                   entries="dev.withpixie.dev, work.dev.withpixie.dev" service=cloud-proxy-service
INFO[0003] Update                                        addr=192.168.5.1 service=cloud-proxy-service

@dcfranca
Author

dcfranca commented Apr 4, 2024

I manually added the host to the hosts file:

192.168.5.1      dev.withpixie.dev work.dev.withpixie.dev

That at least resolves the address, but the connection to the server fails with a timeout:

❯ curl -vv dev.withpixie.dev:5555
*   Trying 192.168.5.1:5555...
* connect to 192.168.5.1 port 5555 failed: Operation timed out
* Failed to connect to dev.withpixie.dev port 5555 after 75002 ms: Couldn't connect to server
* Closing connection
curl: (28) Failed to connect to dev.withpixie.dev port 5555 after 75002 ms: Couldn't connect to server
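
A timeout and a refused connection usually point in different directions: a timeout tends to mean the packets are dropped or the address is not routable from the host, while a refusal means the host was reached but nothing is listening on that port. The sketch below is a small helper of my own (the name `probe` and the default timeout are arbitrary) for telling these cases apart across ports without waiting for curl's 75-second default.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return f"error: {exc}"
```

For the setup above, calling `probe("192.168.5.1", 5555)`, `probe("192.168.5.1", 4444)` and `probe("192.168.5.1", 51600)` would show whether the address is unreachable as a whole or whether only the cloud-proxy-service ports fail.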

@dcfranca
Author

dcfranca commented Apr 8, 2024

Anyone?

@dcfranca
Author

Any suggestion on how I can solve this?

@dcfranca
Author

Do you need more details?

@dcfranca
Author

dcfranca commented May 2, 2024

anyone?

@dcfranca
Author

dcfranca commented May 8, 2024

@JamesMBartlett

@JamesMBartlett
Member

Hi @dcfranca.

It's hard for us to debug issues in environments we don't officially support.

I'm not too familiar with Colima. However, it seems like they have an option to enable exposing an external IP: https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#the-virtual-machines-ip-is-not-reachable

Have you tried running colima with that flag?
