[bug] SwarmPit Deploy fails #684

Open
typoworx-de opened this issue Mar 13, 2024 · 3 comments


Description
Deploying Swarmpit on Photon OS fails with an error.

Steps to reproduce the issue:

  1. Install Photon OS minimal (https://vmware.github.io/photon/)
  2. Run: tdnf install docker
  3. Run: docker run -it --rm --name swarmpit-installer --volume /var/run/docker.sock:/var/run/docker.sock swarmpit/install:1.9

What happens:

Application deployment
Creating network swarm-pit_net
Creating service swarm-pit_agent
Creating service swarm-pit_app
Creating service swarm-pit_db
Creating service swarm-pit_influxdb
DONE.

Starting swarmpit.......................FAILED!
Swarmpit is not responding for a long time. Aborting installation...:(
Please check logs and cluster status for details.

docker service logs swarmpit_app | head -n 20

swarmpit_app.1.nafki909lywm@photon.typoworx.com    | # A fatal error has been detected by the Java Runtime Environment:
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | #
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | #  SIGSEGV (0xb) at pc=0x00007f2403e71529, pid=1, tid=0x00007f2404c41700
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | #
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | # JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-8u242-b08-1~deb9u1-b08)
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | # Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 compressed oops)
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | # Problematic frame:
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | # C  [libc.so.6+0x34529]  abort+0x269
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | #
swarmpit_app.1.nafki909lywm@photon.typoworx.com    | # Core dump written. Default location: /usr/src/app/core or core.1

I already tried rebooting the machine, leaving the Docker swarm, running docker prune -a, and starting from scratch, but every time the result is the same.

What should happen:
Swarmpit should start

Additional information (e.g. docker version, cluster setup,...):

cat /etc/os-release

NAME="VMware Photon OS"
VERSION="4.0"
ID=photon
VERSION_ID=4.0
PRETTY_NAME="VMware Photon OS/Linux"
ANSI_COLOR="1;34"
HOME_URL="https://vmware.github.io/photon/"
BUG_REPORT_URL="https://github.com/vmware/photon/issues"

--

docker version
Client: Docker Engine - Community
Version: 24.0.5
API version: 1.43
Go version: go1.20.12
Git commit: ced0996
Built: Thu Dec 21 03:16:59 2023
OS/Arch: linux/amd64
Context: default

Server: Docker Engine - Community
Engine:
Version: 24.0.5
API version: 1.43 (minimum version 1.12)
Go version: go1.20.12
Git commit: a61e2b4
Built: Thu Dec 21 03:18:08 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.8
GitCommit: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc:
Version: 1.1.12
GitCommit:
docker-init:
Version: 0.19.0
GitCommit: de40ad0

typoworx-de (author) commented

I cross-checked by increasing the RAM of the VM (a VMware ESXi guest), and the deployment still gets stuck. Additionally, I noticed this error on the VM console while deploying Swarmpit over an SSH session:
Memory cgroup out of memory: Killed process xxxx (beam.smp)

I don't know whether this is related to the error thrown inside the Swarmpit app container.
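
One hedged way to test that connection: the Erlang VM (`beam.smp`, which CouchDB runs on) can use far more memory when a container inherits a very large open-file limit, so it may be worth checking which limit containers actually get. This is an assumption about a possible cause, not a confirmed diagnosis. In the sketch below, the helper names `nofile_is_excessive` and `container_nofile` and the 1048576 default threshold are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if an open-file limit looks excessive.
# The default threshold of 1048576 is an arbitrary assumption.
nofile_is_excessive() {
  limit="$1"
  threshold="${2:-1048576}"
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -gt "$threshold" ]
}

# Prints the open-file limit seen inside a throwaway container.
# --entrypoint sh bypasses the image's own entrypoint script.
container_nofile() {
  docker run --rm --entrypoint sh "$1" -c 'ulimit -n'
}

# Example (requires a running Docker daemon):
#   limit="$(container_nofile couchdb:2.3.1)"
#   nofile_is_excessive "$limit" && echo "nofile=$limit looks very high"
#
# Kernel OOM messages can be listed with:
#   dmesg | grep -i 'out of memory'
```

If the reported limit is in the hundreds of millions or "unlimited", capping it per service is a cheap experiment.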

dcasota commented Mar 16, 2024

Hi,

Try the following custom setup. It has worked flawlessly so far. As a first attempt, it does not include any CPU or RAM limits and reservations for the individual services; those can be added later. It also includes a few modifications to avoid pitfalls, which are explained here. Hope this helps.
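
As a rough sketch of how such limits could be added once the stack described below is running: `docker service update` accepts `--limit-memory` and `--limit-cpu`. The helper `print_limit_cmd` is hypothetical and only prints the command so it can be reviewed first; the service name and the `512M` / `0.5` values are placeholders, not recommendations.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not run) a `docker service update`
# command that adds a memory and CPU limit to one service.
print_limit_cmd() {
  service="$1"
  mem="$2"
  cpus="$3"
  printf 'docker service update --limit-memory %s --limit-cpu %s %s\n' \
    "$mem" "$cpus" "$service"
}

# Placeholder values; adjust to the size of the host.
print_limit_cmd swarmpit_db 512M 0.5
```

Running the printed command applies the limits to the live service. To make them permanent they would also need to go into the compose file.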

On Photon OS, clone the Swarmpit GitHub repository.

cd $HOME
tdnf install -y git
git clone https://github.com/swarmpit/swarmpit -b master

Replace docker-compose.yml and deploy.

cat > swarmpit/docker-compose.yml << "EOFdockercompose"
version: '3.9'

services:
  app:
    image: swarmpit/swarmpit:1.9
    depends_on:
      - db
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    environment:
      - SWARMPIT_DB=http://db:5984
      - SWARMPIT_INFLUXDB=http://influxdb:8086
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    ports:
      - 888:8080
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080"]
      interval: 60s
      timeout: 10s
      retries: 3
    networks:
      - net
    deploy:
      placement:
        constraints:
          - node.role == manager

  db:
    image: couchdb:2.3.1
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - db-data:/opt/couchdb/data
    networks:
      - net

  influxdb:
    image: influxdb:1.7
    volumes:
      - influx-data:/var/lib/influxdb
    networks:
      - net

  agent:
    image: swarmpit/agent:2.2
    depends_on:
      - app
      - db
      - influxdb
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - net
    deploy:
      labels:
        swarmpit.agent: 'true'

networks:
  net:
    driver: overlay

volumes:
  db-data:
    driver: local
  influx-data:
    driver: local
EOFdockercompose

docker stack deploy -c swarmpit/docker-compose.yml swarmpit
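
To confirm that the deploy actually came up, the replica counts can be checked. The following is a minimal sketch, assuming the stack name `swarmpit` from the command above. `all_replicas_ready` is a hypothetical helper that reads `NAME RUNNING/DESIRED` lines, as produced by `docker stack services` with a `--format` template.

```shell
#!/bin/sh
# Hypothetical helper: reads "NAME RUNNING/DESIRED" lines on stdin and
# succeeds only if there is at least one service and every service has
# all of its desired replicas running.
all_replicas_ready() {
  awk '
    {
      split($2, r, "/")
      if (r[1] != r[2] || r[2] == 0) bad = 1
      seen = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}

# Example (requires a running Docker daemon):
#   docker stack services swarmpit --format '{{.Name}} {{.Replicas}}' \
#     | all_replicas_ready && echo "all services are running"
#
# The UI should then answer on the published port from the compose file:
#   curl -f http://localhost:888
```

If a service stays at 0/1, `docker service ps --no-trunc <service>` usually shows why its tasks are failing.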

zbalogh commented May 16, 2024

I had the same issue, but your example works.

Thanks, @dcasota!
