
✨ Add Metal3 provider #82 (Open)

chess-knight wants to merge 20 commits into main
Conversation

@chess-knight (Member) commented Apr 26, 2024

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #21

Special notes for your reviewer:
Tested in a virtualized environment; see the docs: https://book.metal3.io/quick-start

  1. Create an Ubuntu instance in gx-scs (flavor SCS-16V-64, 200 GiB disk), e.g.:
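    • via the OpenStack CLI (a sketch only; the image, network, and key names below are placeholders):
    openstack server create \
      --flavor SCS-16V-64 \
      --image "Ubuntu 22.04" \
      --boot-from-volume 200 \
      --network my-network \
      --key-name my-key \
      metal3-lab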
  2. Create libvirt network - https://book.metal3.io/quick-start#virtualized-configuration
    $ virsh net-info baremetal
    Name:           baremetal
    UUID:           ae14ef12-4ff1-4c54-90c8-38ebdec3542b
    Active:         yes
    Persistent:     yes
    Autostart:      no
    Bridge:         metal3
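    • the network above can be defined like this if needed (a minimal sketch; the book's XML may additionally configure forwarding/NAT):
    virsh net-define /dev/stdin <<'EOF'
    <network>
      <name>baremetal</name>
      <bridge name='metal3'/>
      <ip address='192.168.222.1' netmask='255.255.255.0'/>
    </network>
    EOF
    virsh net-start baremetal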
  3. Create VMs - https://book.metal3.io/quick-start#virtualized-configuration
    • e.g. create 1 for the control plane and 3 for workers:
    virt-install \
      --connect qemu:///system \
      --name bmh-vm-01 `# workers 02, 03, 04` \
      --description "Virtualized BareMetalHost" \
      --osinfo=ubuntu-lts-latest \
      --ram=12288 \
      --vcpus=2 `# e.g. 3 vcpus for workers` \
      --disk size=25 `# add second disk (--disk size=20) for workers if you want to install rook-ceph` \
      --graphics=none \
      --console pty \
      --serial pty \
      --pxe \
      --network network=baremetal,mac="00:60:2f:31:81:01" `# workers 02, 03, 04` \
      --noautoconsole
    $ virsh list
     Id   Name        State
    ---------------------------
     1    bmh-vm-01   running
     2    bmh-vm-02   running
     3    bmh-vm-03   running
     4    bmh-vm-04   running
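    • double-check the assigned boot MACs (they must match the bootMACAddress fields in step 10), e.g.:
    $ virsh domiflist bmh-vm-01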
  4. Install sushy-tools for Redfish communication - https://book.metal3.io/quick-start#sushy-tools---aka-the-bmc
    $ docker logs sushy-tools
     * Serving Flask app 'sushy_tools.emulator.main'
     * Debug mode: off
    WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
     * Running on http://192.168.222.1:8000
    Press CTRL+C to quit
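    • the emulated BMCs can be verified over Redfish, e.g.:
    $ curl http://192.168.222.1:8000/redfish/v1/Systems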
  5. Create KinD management cluster - https://book.metal3.io/quick-start#management-cluster
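    • e.g. (a sketch; the book may add extra configuration):
    $ kind create cluster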
  6. Install Dnsmasq - https://book.metal3.io/quick-start#dhcp-server
    • use this config:
    DHCP_HOSTS=00:60:2f:31:81:01,192.168.222.100;00:60:2f:31:81:02,192.168.222.101;00:60:2f:31:81:03,192.168.222.102;00:60:2f:31:81:04,192.168.222.103
    DHCP_IGNORE=tag:!known
    # IP of the host from VM perspective
    PROVISIONING_IP=192.168.222.1
    GATEWAY_IP=192.168.222.1
    DHCP_RANGE=192.168.222.100,192.168.222.149
    DNS_IP=provisioning
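    • once the VMs request addresses, the handed-out leases can be checked, e.g.:
    $ virsh net-dhcp-leases baremetal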
    
  7. Skip the image server (we use OSISM images) - https://book.metal3.io/quick-start#image-server
  8. Deploy Ironic - https://book.metal3.io/quick-start#deploy-ironic
  9. Deploy Bare Metal Operator - https://book.metal3.io/quick-start#deploy-bare-metal-operator
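    • e.g. verify that the Ironic and BMO pods are running (the namespace is an assumption based on the default deployment):
    $ kubectl get pods -n baremetal-operator-system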
  10. Create BareMetalHosts - https://book.metal3.io/quick-start#create-baremetalhosts
    • 1 for the control plane and 3 for workers:
    apiVersion: v1
    kind: Secret
    metadata:
      name: bml-01 # workers 02, 03, 04
    type: Opaque
    stringData:
      username: replaceme
      password: replaceme
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: bml-vm-01 # workers 02, 03, 04
      labels:
        type: control-plane # 'type: worker' for workers
    spec:
      online: true
      bootMACAddress: 00:60:2f:31:81:01 # workers 02, 03, 04
      bootMode: legacy
      hardwareProfile: libvirt
      bmc:
        address: redfish-virtualmedia+http://192.168.222.1:8000/redfish/v1/Systems/bmh-vm-01 # workers 02, 03, 04
        credentialsName: bml-01 # workers 02, 03, 04
    $ kubectl get bmh --show-labels
    NAME        STATE       CONSUMER   ONLINE   ERROR   AGE   LABELS
    bml-vm-01   available              true             11m   type=control-plane
    bml-vm-02   available              true             11m   type=worker
    bml-vm-03   available              true             11m   type=worker
    bml-vm-04   available              true             11m   type=worker
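    • inspection can take a while; e.g. wait for all hosts to become available:
    $ kubectl wait bmh --all --for=jsonpath='{.status.provisioning.state}'=available --timeout=30m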
  11. Deploy CAPI/CAPM3/CSO
    export CLUSTER_TOPOLOGY=true
    clusterctl init --infrastructure metal3
    # apply the Metal3ClusterTemplate CRD manually until the next CAPM3 release (current release is v1.7.0)
    kubectl apply -f https://raw.githubusercontent.com/metal3-io/cluster-api-provider-metal3/main/config/crd/bases/infrastructure.cluster.x-k8s.io_metal3clustertemplates.yaml
    kubectl label crd metal3clustertemplates.infrastructure.cluster.x-k8s.io cluster.x-k8s.io/v1beta1=v1beta1
    # install CSO in your favourite way
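    # e.g. verify that the controllers are running (default clusterctl namespaces)
    kubectl get pods -n capi-system
    kubectl get pods -n capm3-system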
  12. Create Cluster Stack
    apiVersion: clusterstack.x-k8s.io/v1alpha1
    kind: ClusterStack
    metadata:
      name: clusterstack
    spec:
      provider: metal3
      name: alpha
      kubernetesVersion: "1.28"
      channel: custom
      autoSubscribe: false
      noProvider: true
      versions:
      - v0-sha.b08777e
    $ kubectl get clusterstack
    NAME           PROVIDER   CLUSTERSTACK   K8S    CHANNEL   AUTOSUBSCRIBE   USABLE           LATEST                                       AGE   REASON   MESSAGE
    clusterstack   metal3     alpha          1.28   custom    false           v0-sha-b08777e   metal3-alpha-1-28-v0-sha-b08777e | v1.28.9   12m
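    • the ClusterClass rendered from the cluster stack release can be checked, e.g.:
    $ kubectl get clusterclass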
  13. Create Cluster
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    metadata:
      name: my-cluster
    spec:
      topology:
        class: metal3-alpha-1-28-v0-sha.b08777e
        version: v1.28.9
        controlPlane:
          replicas: 1
        workers:
          machineDeployments:
          - class: default-worker
            name: alpha
            replicas: 3
        variables:
    #   Required
        - name: controlPlaneEndpoint
          value:
            host: 192.168.222.150
    #        port: 6443
    #   Optional
        - name: workerHostSelector
          value:
            matchLabels:
              type: worker
        - name: controlPlaneHostSelector
          value:
            matchLabels:
              type: control-plane
    ##   Experiment with other optional variables, e.g. try rook-ceph
    #    - name: user
    #      value:
    #        name: user
    #        sshKey: ssh-ed25519 ABCD... user@example.com
    #    - name: image
    #      value:
    #        checksum: https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images/ubuntu-2204-kube-v1.28/ubuntu-2204-kube-v1.28.9.qcow2.CHECKSUM
    #        checksumType: sha256
    #        format: qcow2
    #        url: https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images/ubuntu-2204-kube-v1.28/ubuntu-2204-kube-v1.28.9.qcow2
    #    - name: rook_ceph_cluster_values
    #      value: |
    #        enabled: true
    #    - name: workerDataTemplate
    #      value: my-cluster-workers-template
    #    - name: controlPlaneDataTemplate
    #      value: my-cluster-controlplane-template
    #---
    #apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    #kind: Metal3DataTemplate
    #metadata:
    #  name: my-cluster-controlplane-template
    #spec:
    #  clusterName: my-cluster
    #---
    #apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    #kind: Metal3DataTemplate
    #metadata:
    #  name: my-cluster-workers-template
    #spec:
    #  clusterName: my-cluster
    $ kubectl get cluster,metal3cluster
    NAME                                  CLUSTERCLASS                       PHASE         AGE   VERSION
    cluster.cluster.x-k8s.io/my-cluster   metal3-alpha-1-28-v0-sha.b08777e   Provisioned   62m   v1.28.9
    
    NAME                                                             AGE   READY   ERROR   CLUSTER      ENDPOINT
    metal3cluster.infrastructure.cluster.x-k8s.io/my-cluster-srg2j   62m   true            my-cluster   {"host":"192.168.222.150","port":6443}
    $ clusterctl get kubeconfig my-cluster > kubeconfig.yaml
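    • e.g. watch the machines come up and check the nodes in the workload cluster:
    $ kubectl get machines
    $ kubectl --kubeconfig kubeconfig.yaml get nodes -o wide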
  14. Test kube-vip service load balancing
    $ kubectl --kubeconfig kubeconfig.yaml create deploy --image nginx --port 80 nginx
    # --load-balancer-ip needs to be specified because kube-vip-cloud-provider is missing
    $ kubectl --kubeconfig kubeconfig.yaml expose deployment nginx --port 80 --type LoadBalancer --load-balancer-ip 192.168.222.151
    $ curl 192.168.222.151
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    html { color-scheme: light dark; }
    body { width: 35em; margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>
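    • the external IP assigned by kube-vip can also be inspected directly, e.g.:
    $ kubectl --kubeconfig kubeconfig.yaml get svc nginx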

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squash commits
  • include documentation
  • add unit tests

Commits (all signed off by Roman Hros <roman.hros@dnation.cloud>); notes from the commit messages:

  • Also move image values to values.yaml and rename to k8s v1-29
  • Also switch back to v1-28 and fix node names
  • Also update metrics-server addon
  • We have only one variable 'image' which can be used for both types of nodes
  • Default values are based on test manifests
  • Should prevent too many restarts
  • Related to #77

@jschoone linked an issue May 3, 2024 that may be closed by this pull request
@chess-knight marked this pull request as ready for review May 20, 2024 14:27
Successfully merging this pull request may close these issues.

Cluster Stacks for Metal³