Swarmpit on Docker Swarm behind Traefik: resource problems result in 504 Gateway Timeouts #662

Open
setcooki opened this issue Jun 14, 2023 · 1 comment

Comments


setcooki commented Jun 14, 2023


BUG REPORT

Description

Swarmpit on Docker Swarm behind Traefik has resource problems that result in 504 Gateway Timeouts (see https://monosnap.com/direct/Hg8sSZiDBCSBF8PikFDmaBYoQWb8YS), even though the host server is only at 40% RAM and 30% CPU load. It does not load at all with SWARMPIT_DOCKER_API=1.4.1; it loads partially with SWARMPIT_DOCKER_API=1.30. I have the same docker-compose and Swarm/Traefik setup running in at least 5 other projects and have never had any issues there.

So far I have tried the following:

  • Restarted the Docker engine
  • Changed the resource settings
  • Changed the Docker API value
  • Changed the ulimits values

None of these produced a running Swarmpit app.

Docker Engine

 Engine:
  Version:          20.10.23
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.10
  Git commit:       6051f14
  Built:            Thu Jan 19 17:42:57 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.15
  GitCommit:        5b842e528e99d4d4c1686467debf2bd4b88ecd86
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Compose file:

version: "3.4"

services:
  app:
    image: swarmpit/swarmpit:latest
    environment:
      - "SWARMPIT_DB=http://db:5984"
      - "SWARMPIT_INFLUXDB=http://influxdb:8086"
      - "SWARMPIT_DOCKER_HTTP_TIMEOUT=10000"
      - "SWARMPIT_DOCKER_API=1.30"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - swarm
      - network
    deploy:
      labels:
        - "APP=swarmpit"
        - "traefik.enable=true"
        - "traefik.docker.network=traefik_traefiknet"
        - "traefik.http.routers.swarmpit.entrypoints=websecure"
        - "traefik.http.routers.swarmpit.tls=true"
        - "traefik.http.routers.swarmpit.tls.certresolver=lets-encrypt"
        - "traefik.http.routers.swarmpit.rule=Host(`swarmpit.onblocktrust.com`)"
        - "traefik.http.services.swarmpit.loadbalancer.server.port=8080"
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          cpus: '0.75'
          memory: 2048M
        reservations:
          cpus: '0.50'
          memory: 1024M
    ulimits:
      nofile:
        soft: 20000
        hard: 40000
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:8080" ]
      interval: 60s
      timeout: 10s
      retries: 3

  db:
    image: couchdb:2.3.0
    volumes:
      - db-data:/opt/couchdb/data
    networks:
      - network
    deploy:
      labels:
        - "traefik.enable=false"
        - "traefik.http.services.swarmpit-db.loadbalancer.server.port=80"
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          cpus: '0.30'
          memory: 512M
        reservations:
          cpus: '0.15'
          memory: 256M

  influxdb:
    image: influxdb:1.8
    volumes:
      - influx-data:/var/lib/influxdb
    networks:
      - network
    deploy:
      labels:
        - "traefik.enable=false"
        - "traefik.http.services.swarmpit-influxdb.loadbalancer.server.port=80"
      placement:
        constraints:
          - node.role == manager
      resources:
        reservations:
          cpus: '0.3'
          memory: 128M
        limits:
          cpus: '0.6'
          memory: 512M

  agent:
    image: swarmpit/agent:latest
    environment:
      - "DOCKER_API_VERSION=1.30"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - network
    deploy:
      labels:
        - "traefik.enable=false"
        - "traefik.http.services.swarmpit-agent.loadbalancer.server.port=80"
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          cpus: '0.10'
          memory: 64M
        reservations:
          cpus: '0.05'
          memory: 32M

networks:
  swarm:
    external:
      name: traefik_network
  network:
    driver: overlay
    attachable: true

volumes:
  db-data:
    driver: local
  influx-data:
    driver: local

Container logs:

swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:27 63780e639727 INFO [swarmpit.server:91] - Swarmpit is starting...
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:27 63780e639727 INFO [swarmpit.database:30] - Waiting for CouchDB ...
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:27 63780e639727 INFO [swarmpit.database:33] - ... CouchDB connected in 0 sec
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:28 63780e639727 INFO [swarmpit.database:63] - Swarmpit DB already exist
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:28 63780e639727 INFO [swarmpit.database:30] - Waiting for InfluxDB ...
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:28 63780e639727 INFO [swarmpit.database:33] - ... InfluxDB connected in 0 sec
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:28 63780e639727 INFO [swarmpit.database:76] - InfluxDB RP: #{"autogen" "an_hour" "a_day"}
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:28 63780e639727 INFO [swarmpit.database:77] - InfluxDB CQ: #{"cq_tasks_1m" "cq_hosts_1m" "cq_max_usage_services_30m" "cq_services_1m"}
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:28 63780e639727 INFO [swarmpit.server:95] - Swarmpit running on port 8080
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:29 63780e639727 INFO [swarmpit.setup:14] - Docker API: 1.41
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:29 63780e639727 INFO [swarmpit.setup:15] - Docker ENGINE: 20.10.23
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:29 63780e639727 INFO [swarmpit.setup:16] - Docker SOCK: /var/run/docker.sock
swarmpit_app.1.80u1z39i7oca@master    | 23-06-14 15:09:29 63780e639727 INFO [swarmpit.setup:21] - Log level: info

dazinator commented Jul 1, 2023

I was also seeing lots of gateway timeouts while running behind Traefik. I then started seeing Java memory issues, which made me wonder whether the Docker health check was failing for Swarmpit and restarting the service. I need to look at it more on Monday.
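
One way to probe the health-check theory is to give the app task a startup grace period, so that probes failing while the JVM is still starting do not count toward `retries`. The fragment below is only a sketch, assuming the `app` service and health check from the compose file in the report. `start_period` is a standard compose health-check option (available from file format 3.4), but the 120s value is an illustrative assumption, not something taken from this thread or from the Swarmpit maintainers.

```yaml
services:
  app:
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:8080" ]
      interval: 60s
      timeout: 10s
      retries: 3
      # Assumption: 120s is an arbitrary illustrative grace period.
      # Probe failures inside this window are not counted toward `retries`.
      start_period: 120s
```

If restarts are indeed the cause, `docker service ps swarmpit_app` would be expected to list repeatedly replaced tasks before the change. If the task history is already clean, the timeouts probably have a different origin.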
