Note: This is a continuation of a discussion originally started by @aureq in #8593 (comment), which has since been moved to Discussions, but I can't post a comment there.

@aureq, I have managed to automate this with Ansible, without any kind of wrapped Python. Here's my playbook:
```yaml
- hosts: k8s
  gather_facts: false
  serial: 1
  tasks:
    - name: Update apt cache on {{ inventory_hostname_short }}
      ansible.builtin.apt:
        update_cache: yes

    - name: Check if there are updates for {{ inventory_hostname_short }}
      ansible.builtin.command:
        cmd: apt list --upgradable
      register: updates

    - name: Cordon node {{ inventory_hostname_short }}
      delegate_to: localhost
      kubernetes.core.k8s_drain:
        state: cordon
        name: "{{ inventory_hostname_short }}"
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Evict Longhorn volumes from {{ inventory_hostname_short }}
      delegate_to: localhost
      kubernetes.core.k8s_json_patch:
        kind: nodes
        namespace: longhorn-system
        api_version: longhorn.io/v1beta2
        name: "{{ inventory_hostname_short }}"
        patch:
          - op: replace
            path: /spec/allowScheduling
            value: false
          - op: replace
            path: /spec/evictionRequested
            value: true
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Wait for Longhorn volume eviction on {{ inventory_hostname_short }}
      delegate_to: localhost
      kubernetes.core.k8s_info:
        kind: nodes
        namespace: longhorn-system
        api_version: longhorn.io/v1beta2
        name: "{{ inventory_hostname_short }}"
      register: replica_list
      until: "replica_list.resources[0] | community.general.json_query('status.diskStatus.*.scheduledReplica') | unique == [{}]"
      retries: 60
      delay: 10
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Drain node {{ inventory_hostname_short }}
      delegate_to: localhost
      # Unfortunately the k8s_drain module from kubernetes.core really struggles with Longhorn:
      # it very often throws a "429 Too Many Requests" error, in spite of all the attempts to
      # cleanly migrate volumes in Longhorn, so shell out to kubectl instead.
      ansible.builtin.shell: kubectl drain {{ inventory_hostname_short }} --ignore-daemonsets --delete-emptydir-data
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Upgrade all packages on node {{ inventory_hostname_short }}
      ansible.builtin.apt:
        update_cache: no
        upgrade: yes
        force: yes
        dpkg_options: 'force-confdef,force-confold'
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    # Restart required?
    - name: Check if reboot is needed for {{ inventory_hostname_short }}
      stat: path=/var/run/reboot-required
      register: check_reboot
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Reboot node {{ inventory_hostname_short }}
      ansible.builtin.reboot:
        connect_timeout: 5
        reboot_timeout: 600
        pre_reboot_delay: 0
        post_reboot_delay: 30
        test_command: whoami
        msg: "Reboot complete"
      # default(false) guards against check_reboot being skipped when there were no updates
      when: check_reboot.stat.exists | default(false) and updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Uncordon node {{ inventory_hostname_short }}
      delegate_to: localhost
      kubernetes.core.k8s_drain:
        state: uncordon
        name: "{{ inventory_hostname_short }}"
      tags:
        - always
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0

    - name: Re-enable Longhorn volumes on {{ inventory_hostname_short }}
      delegate_to: localhost
      kubernetes.core.k8s_json_patch:
        kind: nodes
        namespace: longhorn-system
        api_version: longhorn.io/v1beta2
        name: "{{ inventory_hostname_short }}"
        patch:
          - op: replace
            path: /spec/allowScheduling
            value: true
          - op: replace
            path: /spec/evictionRequested
            value: false
      when: updates.stdout_lines | reject('search', 'Listing...') | list | length > 0
```
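To run it (a sketch; the playbook and inventory file names below are placeholders, not from the original post), point `ansible-playbook` at an inventory containing the `k8s` group. `serial: 1` ensures only one node is cordoned, updated, and rebooted at a time:

```sh
# Hypothetical invocation; adjust the inventory and playbook names to your setup.
ansible-playbook -i inventory.yml update-k8s.yml
```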
The magic is in the `kubernetes.core.k8s_json_patch` tasks, which patch the Longhorn nodes and evict the volumes running on them. This causes Longhorn to rebalance them onto additional nodes, if any are available (in my case, they are), and things continue as you'd expect.
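If you want to try the eviction step by hand before wiring it into Ansible, the same JSON patch can be applied with plain kubectl (a sketch assuming a Longhorn node named `worker-1`, which is a placeholder; the resource, namespace, and field paths come from the playbook above):

```sh
# Placeholder node name "worker-1"; nodes.longhorn.io is the Longhorn Node CRD.
kubectl -n longhorn-system patch nodes.longhorn.io worker-1 --type=json \
  -p='[{"op": "replace", "path": "/spec/allowScheduling", "value": false},
       {"op": "replace", "path": "/spec/evictionRequested", "value": true}]'

# Watch the replicas move off the node: the wait task above polls this same
# field until every disk reports an empty scheduledReplica map.
kubectl -n longhorn-system get nodes.longhorn.io worker-1 \
  -o jsonpath='{.status.diskStatus.*.scheduledReplica}'
```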
There's almost certainly room for improvement in this playbook, but it works for me as a way to update my hosts automatically.