[TEST] Retry update setting #8582

yangchiu · 2024-05-16T05:53:03Z

What's the test to develop? Please describe

client.update(setting, value=setting_value) can fail intermittently with a conflict error when the setting object has been modified since it was read:

longhorn.ApiError: (ApiError(...), '500 : Operation cannot be fulfilled on settings.longhorn.io "taint-toleration": the object has been modified; please apply your changes to the latest version and try again\n{\'code\': 500, \'detail\': \'\', \'message\': \'Operation cannot be fulfilled on settings.longhorn.io "taint-toleration": the object has been modified; please apply your changes to the latest version and try again\', \'status\': 500}')

https://ci.longhorn.io/job/public/job/master/job/sles/job/arm64/job/longhorn-tests-sles-arm64/851/testReport/tests/test_settings/test_setting_toleration/

Add a retry mechanism for the setting update so that this transient conflict does not fail test cases.
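
A minimal sketch of what such a retry could look like on the test side. The helper name `update_setting_with_retry`, the `retries` and `interval` defaults, and matching on the error text are all assumptions for illustration and not necessarily what the eventual fix does; only `client.by_id_setting` and `client.update` come from the tests quoted in this issue.

```python
import time

# Substring taken from the error message quoted in this issue.
CONFLICT_TEXT = "the object has been modified"


def update_setting_with_retry(client, setting_id, value,
                              retries=5, interval=1.0):
    """Hypothetical helper: retry a setting update on conflict errors.

    `client` is expected to offer by_id_setting() and update() as used in
    the tests; `retries` and `interval` are illustrative defaults.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        # Re-read the setting on every attempt so the update is applied
        # to the latest version of the object.
        setting = client.by_id_setting(setting_id)
        try:
            return client.update(setting, value=value)
        except Exception as e:
            # Only retry the conflict error; re-raise anything else, and
            # give up once the attempts are exhausted.
            if CONFLICT_TEXT not in str(e) or attempt == retries:
                raise
            time.sleep(interval)
```

A test could then call `setting = update_setting_with_retry(client, SETTING_TAINT_TOLERATION, setting_value_str)` instead of calling `client.update` directly. Errors that are not conflicts (for example the `invalid effect` validation failure) are re-raised on the first attempt, so the `pytest.raises` checks in the tests would behave as before.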

Describe the tasks for the test

Additional context

https://ci.longhorn.io/job/public/job/master/job/sles/job/amd64/job/longhorn-tests-sles-amd64/921/testReport/junit/tests/test_settings/test_setting_priority_class/

core_api = <kubernetes.client.api.core_v1_api.CoreV1Api object at 0x7fd01662f150>
apps_api = <kubernetes.client.api.apps_v1_api.AppsV1Api object at 0x7fd01662ce10>
scheduling_api = <kubernetes.client.api.scheduling_v1_api.SchedulingV1Api object at 0x7fd018417e90>
priority_class = {'apiVersion': 'scheduling.k8s.io/v1', 'kind': 'PriorityClass', 'metadata': {'name': 'priority-class-1bht9e'}, 'value': 700232437}
volume_name = 'longhorn-testvol-gldc9m'

    def test_setting_priority_class(core_api, apps_api, scheduling_api, priority_class, volume_name):  # NOQA
        """
        Test that the Priority Class setting is validated and utilized correctly.
    
        1. Verify that the name of a non-existent Priority Class cannot be used
        for the Setting.
        2. Create a new Priority Class in Kubernetes.
        3. Create and attach a Volume.
        4. Verify that the Priority Class Setting can be updated with an attached
           volume.
        5. Generate and write `data1`.
        6. Detach the Volume.
        7. Update the Priority Class Setting to the new Priority Class.
        8. Wait for all the Longhorn system components to restart with the new
           Priority Class.
        9. Verify that UI, manager, and driver deployer don't have Priority Class
        10. Attach the Volume and verify `data1`.
        11. Generate and write `data2`.
        12. Unset the Priority Class Setting.
        13. Wait for all the Longhorn system components to restart with the new
            Priority Class.
        14. Verify that UI, manager, and driver deployer don't have Priority Class
        15. Attach the Volume and verify `data2`.
        16. Generate and write `data3`.
    
        Note: system components are workloads other than UI, manager, driver
         deployer
        """
        client = get_longhorn_api_client()  # NOQA
        count = len(client.list_node())
        name = priority_class['metadata']['name']
        setting = client.by_id_setting(SETTING_PRIORITY_CLASS)
    
        with pytest.raises(Exception) as e:
            client.update(setting, value=name)
        assert 'failed to get priority class ' in str(e.value)
    
        scheduling_api.create_priority_class(priority_class)
    
        volume = create_and_check_volume(client, volume_name)
        volume.attach(hostId=get_self_host_id())
        volume = wait_for_volume_healthy(client, volume_name)
    
>       setting = client.update(setting, value=name)

test_settings.py:527: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
longhorn.py:388: in update
    return self._put_and_retry(url, *args, **kw)
longhorn.py:401: in _put_and_retry
    raise e
longhorn.py:395: in _put_and_retry
    return self._put(url, data=self._to_dict(*args, **kw))
longhorn.py:74: in wrapped
    return fn(*args, **kw)
longhorn.py:312: in _put
    self._error(r.text, r.status_code)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <longhorn.Client object at 0x7fd0182ed150>
text = '{"actions":{},"code":"Internal Server Error","detail":"","links":{"self":"http://10.42.1.8:9500/v1/settings/priority-...bject has been modified; please apply your changes to the latest version and try again","status":500,"type":"error"}\n'
status_code = 500

    def _error(self, text, status_code):
>       raise ApiError(self._unmarshall(text), status_code)
E       longhorn.ApiError: (ApiError(...), '500 : Operation cannot be fulfilled on settings.longhorn.io "priority-class": the object has been modified; please apply your changes to the latest version and try again\n{\'code\': 500, \'detail\': \'\', \'message\': \'Operation cannot be fulfilled on settings.longhorn.io "priority-class": the object has been modified; please apply your changes to the latest version and try again\', \'status\': 500}')

longhorn.py:283: ApiError

https://ci.longhorn.io/job/public/job/master/job/sles/job/amd64/job/longhorn-tests-sles-amd64/921/testReport/junit/tests/test_settings/test_setting_toleration/

    def test_setting_toleration():
        """
        Test toleration setting
    
        1.  Set `taint-toleration` to "key1=value1:NoSchedule; key2:InvalidEffect".
        2.  Verify the request fails.
        3.  Create a volume and attach it.
        4.  Set `taint-toleration` to "key1=value1:NoSchedule; key2:NoExecute".
        5.  Verify that the toleration setting can be updated while a volume is attached.
        6.  Generate and write `data1` into the volume.
        7.  Detach the volume.
        8.  Set `taint-toleration` to "key1=value1:NoSchedule; key2:NoExecute".
        9.  Wait for all the Longhorn system components to restart with new
            toleration.
        10. Verify that UI, manager, and driver deployer don't restart and
            don't have new toleration.
        11. Attach the volume again and verify the volume `data1`.
        12. Generate and write `data2` to the volume.
        13. Detach the volume.
        14. Clean the `toleration` setting.
        15. Wait for all the Longhorn system components to restart with no
            toleration.
        16. Attach the volume and validate `data2`.
        17. Generate and write `data3` to the volume.
        """
        client = get_longhorn_api_client()  # NOQA
        apps_api = get_apps_api_client()  # NOQA
        core_api = get_core_api_client()  # NOQA
        count = len(client.list_node())
    
        setting = client.by_id_setting(SETTING_TAINT_TOLERATION)
    
        with pytest.raises(Exception) as e:
            client.update(setting,
                          value="key1=value1:NoSchedule; key2:InvalidEffect")
        assert 'invalid effect' in str(e.value)
    
        volume_name = "test-toleration-vol"  # NOQA
        volume = create_and_check_volume(client, volume_name)
        volume.attach(hostId=get_self_host_id())
        volume = wait_for_volume_healthy(client, volume_name)
    
        setting_value_str = "key1=value1:NoSchedule; key2:NoExecute"
        setting_value_dicts = [
            {
                "key": "key1",
                "value": "value1",
                "operator": "Equal",
                "effect": "NoSchedule"
            },
            {
                "key": "key2",
                "value": None,
                "operator": "Exists",
                "effect": "NoExecute"
            },
        ]
>       setting = client.update(setting, value=setting_value_str)

test_settings.py:158: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
longhorn.py:388: in update
    return self._put_and_retry(url, *args, **kw)
longhorn.py:401: in _put_and_retry
    raise e
longhorn.py:395: in _put_and_retry
    return self._put(url, data=self._to_dict(*args, **kw))
longhorn.py:74: in wrapped
    return fn(*args, **kw)
longhorn.py:312: in _put
    self._error(r.text, r.status_code)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <longhorn.Client object at 0x7fd018596550>
text = '{"actions":{},"code":"Internal Server Error","detail":"","links":{"self":"http://10.42.1.8:9500/v1/settings/taint-tol...bject has been modified; please apply your changes to the latest version and try again","status":500,"type":"error"}\n'
status_code = 500

    def _error(self, text, status_code):
>       raise ApiError(self._unmarshall(text), status_code)
E       longhorn.ApiError: (ApiError(...), '500 : Operation cannot be fulfilled on settings.longhorn.io "taint-toleration": the object has been modified; please apply your changes to the latest version and try again\n{\'code\': 500, \'detail\': \'\', \'message\': \'Operation cannot be fulfilled on settings.longhorn.io "taint-toleration": the object has been modified; please apply your changes to the latest version and try again\', \'status\': 500}')

longhorn.py:283: ApiError

yangchiu added the kind/test Request for adding test label May 16, 2024

yangchiu self-assigned this May 16, 2024

yangchiu mentioned this issue May 21, 2024

test: add retry mechanism for update setting longhorn/longhorn-tests#1909

Merged
