[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal as FITKO PoC #544

Open
anjastrunk opened this issue Apr 2, 2024 · 15 comments
Labels: SCS-VP10 (Related to tender lot SCS-VP10)

Comments

@anjastrunk
Contributor

anjastrunk commented Apr 2, 2024

Provide a productive SCS cluster as a PoC for FITKO. The cluster MUST be set up with Yaook, the open source lifecycle management tool for OpenStack and K8s, and MUST be SCS-compliant.

In contrast to #414, this productive SCS cluster is set up on bare metal.

@anjastrunk added the question (Further information is requested) and SCS-VP10 (Related to tender lot SCS-VP10) labels, then removed the question label, on Apr 2, 2024
@anjastrunk changed the title from "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook" to "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal" on Apr 2, 2024
@martinmo
Member

martinmo commented Apr 8, 2024

(Note: we initially used #414 as the sole issue to track both the provisioning of the virtual and the bare metal test clusters. For the sake of better documentation, we retroactively created this separate issue for the bare metal setup.)

The current state is that the installation is complete and we have a working bare metal Yaook cluster. Thanks to the contributions of @cah-hbaum, this OpenStack cluster is already SCS-compliant, i.e., it fulfills the stabilized standards of the IaaS track (details can be found in #415 and in https://github.com/SovereignCloudStack/standards/tree/do-not-merge/scs-compliant-yaook/Informational). Here's a recap of what happened until we got to this point:

Initial preparation for the Yaook bare metal installation started at the beginning of March. This involved a "rehearsal" of the installation procedure on an existing, simpler test cluster, because this was the first time I conducted such an installation and the hardware for the actual installation was not yet commissioned.

During the rehearsal we already encountered network setup issues that we needed to work around:

  • The first issues were related to VLAN tagging and caused connectivity problems between the Yaook Management Cluster and the install node. They were (probably) caused by incompatibilities or misconfiguration of the network interfaces of the virtual install node and/or KVM (Proxmox). The exact cause is not known, because we gave up debugging at some point and switched to a bare metal install node, which worked immediately.
  • The second set of issues was related to PXE booting for the actual nodes that are supposed to be provisioned by ironic. These turned out to be firmware issues (a subset of the servers "forgot" their PXE configuration after a reboot) as well as misleading and well-hidden BIOS settings (especially PXE boot timeouts).

The installation of the bare metal test cluster was conducted between March 11th and March 19th, but we again ran into a lot of technical difficulties. Debugging and fixing these was a bit more time-consuming than usual because I am not yet 100% accustomed to the interactions of all the components.

  • Because this is going to be a productive cluster, the network setup is a lot more complex than in the rehearsal install (more redundancy, stricter isolation). In addition to separate VLANs and subnets for different purposes, we also use several dedicated switches (e.g., for the Ceph nodes). It took several iterations and debugging sessions to get everything right, i.e., to make sure that all components that are supposed to communicate could actually reach each other.
  • Some trial and error was needed to get the netplan part of the cloud-init configuration right, partly because I misunderstood the configuration. (This is very unfortunate and we will make it more robust and easier to verify in the future, e.g., by switching to interface selection by MAC address via the match keyword; see the sketch after this list.)
  • During provisioning with ironic, a subset of the nodes repeatedly ended up in the "clean failed" state. It took some time to debug, but the Yaook bare metal logs contained the hint ("certificate not yet valid") and we finally figured out this was caused by an extremely out-of-sync hardware clock.
  • A similar firmware/BIOS-related error that cost us time was a still-active hardware RAID configuration on another subset of the nodes, which also led to provisioning failures with ironic.
  • Finally, we had to troubleshoot some software components that had worked flawlessly before: e.g., during the automated install we ran into the problem that the K8s APT repositories had been moved. Additionally, the CNI plugin (Calico) installation initially failed, which we fixed by switching to a different release.
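To illustrate the match approach mentioned above, here is a minimal netplan-style sketch. The interface name, MAC address, and addresses are placeholders (not the values used on this cluster), and the exact embedding in the cloud-init configuration may differ.

# hypothetical netplan snippet: select the NIC by MAC address instead of by kernel name
network:
  version: 2
  ethernets:
    mgmt0:
      match:
        macaddress: "aa:bb:cc:dd:ee:01"
      set-name: mgmt0
      addresses: [192.0.2.10/24]
      routes:
        - to: default
          via: 192.0.2.1

Matching by MAC address avoids surprises when the kernel enumerates NICs in a different order between reboots or hardware revisions.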

The next step will involve moving the hardware from our facility to its final location.

@anjastrunk
Contributor Author

@cah-hbaum Please provide the YAML output for SCS-compatible IaaS v4 to prove that the cluster is SCS-compliant.
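For reference, such a report can presumably be generated with the compliance checker from the SovereignCloudStack/standards repository. A minimal sketch, assuming the script and spec file names from that repo's Tests directory; the exact flags should be taken from the script's --help output:

git clone https://github.com/SovereignCloudStack/standards.git
cd standards/Tests
python3 scs-compliance-check.py --help
# assumed invocation (flag names not verified), writing the YAML report:
# python3 scs-compliance-check.py scs-compatible-iaas.yaml -o report.yaml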

@anjastrunk changed the title from "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal" to "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal as FitKom PoC" on Apr 10, 2024
@anjastrunk changed the title from "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal as FitKom PoC" to "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal as FitKo PoC" on Apr 10, 2024
@anjastrunk changed the title from "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal as FitKo PoC" to "[Other] Setup productive SCS cluster at Cloud&Heat with Yaook on bare metal as FITKO PoC" on Apr 10, 2024
@martinmo
Member

The next step will involve moving the hardware from our facility to its final location.

Status update:

  • Hardware was successfully moved to the final location (I am unsure whether I am allowed to reveal the location yet – please be patient, more info will come soon) and the cluster is stable.
  • The next step is a proper network link; we currently use a temporary one.

@shmelkin

shmelkin commented May 2, 2024

Status update for multiple working days:

  • Uplink configured
  • Provider network (1000) configured on computes and controllers
  • Network, subnet, and router configured in OpenStack
  • Firewalls configured for routing
  • Tested an "Ubuntu 22.04" VM with a publicly routable IP (see the CLI sketch below)
  • Configured and ran the SCS conformance tests on the cluster
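A minimal sketch of how such a VM smoke test can be done with the OpenStack CLI; the image, flavor, network, and key names are placeholders, not the actual values used on this cluster:

# boot a test VM (hypothetical resource names)
openstack server create --image "Ubuntu 22.04" --flavor SCS-2V-4 \
  --network tenant-net --key-name my-key smoke-test-vm
# allocate a floating IP from the external network and attach it
openstack floating ip create public
openstack server add floating ip smoke-test-vm <FLOATING_IP>
# verify public reachability and a clean boot
ping -c 3 <FLOATING_IP>
ssh ubuntu@<FLOATING_IP> cloud-init status --wait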

@shmelkin

shmelkin commented May 2, 2024

Status update:

  • Configured the OpenStack API and made it publicly available
  • Configured monitoring on the site-wide cluster and connected it to the C&H global monitoring
  • Started installing yk8s as a functionality proof
  • Benchmarked a VM
------------------------------------------------------------------------
Benchmark Run: Thu May 02 2024 07:01:31 - 07:29:25
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       49374945.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     7331.2 MWIPS (8.8 s, 7 samples)
Execl Throughput                               4367.4 lps   (29.7 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1700297.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          465706.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       4516692.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                             2643777.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 249035.2 lps   (10.0 s, 7 samples)
Process Creation                               4239.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3404.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   7774.8 lpm   (60.0 s, 2 samples)
System Call Overhead                        2399837.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   49374945.7   4230.9
Double-Precision Whetstone                       55.0       7331.2   1333.0
Execl Throughput                                 43.0       4367.4   1015.7
File Copy 1024 bufsize 2000 maxblocks          3960.0    1700297.0   4293.7
File Copy 256 bufsize 500 maxblocks            1655.0     465706.0   2813.9
File Copy 4096 bufsize 8000 maxblocks          5800.0    4516692.6   7787.4
Pipe Throughput                               12440.0    2643777.7   2125.2
Pipe-based Context Switching                   4000.0     249035.2    622.6
Process Creation                                126.0       4239.2    336.4
Shell Scripts (1 concurrent)                     42.4       3404.9    803.0
Shell Scripts (8 concurrent)                      6.0       7774.8  12958.0
System Call Overhead                          15000.0    2399837.6   1599.9
                                                                   ========
System Benchmarks Index Score                                        1995.8

------------------------------------------------------------------------
Benchmark Run: Thu May 02 2024 07:29:25 - 07:57:36
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      396360992.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    58344.0 MWIPS (9.8 s, 7 samples)
Execl Throughput                              20889.7 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks      12927118.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks         3677514.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks      22497528.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                            21037325.6 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1958050.1 lps   (10.0 s, 7 samples)
Process Creation                              44864.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  65052.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   9420.2 lpm   (60.0 s, 2 samples)
System Call Overhead                       19695065.7 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  396360992.9  33964.1
Double-Precision Whetstone                       55.0      58344.0  10608.0
Execl Throughput                                 43.0      20889.7   4858.1
File Copy 1024 bufsize 2000 maxblocks          3960.0   12927118.6  32644.2
File Copy 256 bufsize 500 maxblocks            1655.0    3677514.2  22220.6
File Copy 4096 bufsize 8000 maxblocks          5800.0   22497528.7  38788.8
Pipe Throughput                               12440.0   21037325.6  16911.0
Pipe-based Context Switching                   4000.0    1958050.1   4895.1
Process Creation                                126.0      44864.0   3560.6
Shell Scripts (1 concurrent)                     42.4      65052.2  15342.5
Shell Scripts (8 concurrent)                      6.0       9420.2  15700.3
System Call Overhead                          15000.0   19695065.7  13130.0
                                                                   ========
System Benchmarks Index Score                                       13756.0

@berendt
Member

berendt commented May 2, 2024

@shmelkin Can you please share how you benchmarked the VM? I would like to add this to the docs as a benchmark sample. We have only documented fio so far.

@shmelkin

shmelkin commented May 3, 2024

@berendt I generally use the open source tool UnixBench for this.
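For the record, a minimal sketch of a typical UnixBench run inside the VM, assuming the upstream byte-unixbench repository; the benchmark output above matches its default report format:

# build prerequisites (Ubuntu), then fetch and run UnixBench
sudo apt-get update && sudo apt-get install -y build-essential git
git clone https://github.com/kdlucas/byte-unixbench.git
cd byte-unixbench/UnixBench
./Run    # runs the suite single-threaded and with N parallel copies (N = number of CPUs)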

@shmelkin

shmelkin commented May 3, 2024

  • Issued Let's Encrypt certificates for the public OpenStack API endpoints
  • Debugged a network issue with OVN in the cluster; see the post from @horazont below for more info
  • Fixed the min_ram image parameter for all SCS-compliant images (see the sketch after this list)
  • Ran Tempest tests for the cluster (WIP)
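A minimal sketch of how the min_ram fix can be applied with the OpenStack CLI; the image name and value are placeholders:

# set the minimum RAM requirement (in MiB) on an image and verify it
openstack image set --min-ram 512 "Ubuntu 22.04"
openstack image show "Ubuntu 22.04" -c min_ram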

@horazont
Member

horazont commented May 3, 2024

Okay, this is interesting. Basically it's a "regression" from the OVS setups we are used to.

In Open vSwitch/L3-agent-based setups, the NAT rules for ingress (and I suppose also egress) traffic for floating IPs were set up regardless of whether the port to which the floating IP was bound was ACTIVE or DOWN.

In OVN, the NAT rules are only set up when the port is up.

That breaks a specific use case, which is the use of VRRP/keepalived in VMs to implement custom load balancers or other HA solutions.

(In particular, this breaks yaook/k8s which we tried to run as a "burn in" test.)

I'll bring this up in next week's IaaS call.

@shmelkin

shmelkin commented May 7, 2024

  • Installed the health monitor on l1.cloudandheat.com: https://health.l1.cloudandheat.sovereignit.cloud:3000/
  • Created credentials/projects for SCS members so that they can prepare the SCS Summit provider exchange
  • Further investigated the OVN NAT topic (no solution yet)
  • Finished the rollout of yk8s as a burn-in test

@horazont
Member

horazont commented May 7, 2024

We looked more into the OVN issue and it seems the only viable workaround is using allowed-address on the non-VRRP port. This is somewhat sad; we'll discuss it in the IaaS call tomorrow.
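A minimal sketch of that workaround with the OpenStack CLI, as I understand it; the port ID and VIP address are placeholders:

# allow the keepalived/VRRP VIP on the instance's regular port so the NAT rules apply to it
openstack port set --allowed-address ip-address=10.0.0.100 <instance-port-id>
# verify the allowed address pair
openstack port show <instance-port-id> -c allowed_address_pairs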

@berendt
Member

berendt commented May 8, 2024

In osism/terraform-base (used by osism/testbed) we do it this way (allowed-address) as well (VRRP is only used inside the virtual network and the managed VIP is only accessed from inside the same virtual network):

We do not reserve the VIPs by creating unassigned Neutron ports because we work with static IP addresses in osism/terraform-base. This is therefore not necessary.

It also looks as if it has always been done this way, independent of OVN. At least https://www.codecentric.de/wissens-hub/blog/highly-available-vips-openstack-vms-vrrp comes from a time when, IMO, there was no OVN in OpenStack (or no OVN at all?).

Searched for some more references:

@artificial-intelligence
Contributor

I think @berendt is right; if this worked without allowed_address_pairs, you would have a security issue.

By default, strict filters only allow traffic from the configured subnets and associated MACs to pass.
Via allowed_address_pairs you are given an allowlist to extend this where needed, e.g., for VRRP.

If it worked the other way around, arbitrary L2 or L3 traffic would be allowed to flow, which would of course be insecure.

@shmelkin

shmelkin commented May 8, 2024

Summary of working day:

  • Prepared renaming of the API endpoints
  • Tested using multiple FQDNs via ingress redirect (not successful yet)
  • Tested using multiple ingresses to serve the same endpoint (not successful yet)
  • Installed the Horizon dashboard (horizon.l1.cloudandheat.com, temporary name)
  • Removed erroneous and stuck volumes

@shmelkin

Summary of multiple working days:

  • Performed maintenance (deletion of stuck volumes), twice
  • Renamed the API for Horizon
  • Rolled out maintenance users on the firewalls, the YMC, the install node, and all controllers, computes, and storage nodes (for the C&H on-call duty) to make support for the SCS Summit possible
  • Looked into the issue that Cinder volumes constantly get stuck (ongoing...)
