We got a business request to provide projects with a limited TTL. Our automation can check that the requester does not ask for more than the allowed maximum, but a requester can still create a project with a small TTL and then ask operations staff to extend it. If operations staff make a mistake, the project can live too long.
My idea is to add an optional setting (e.g. `--max-project-ttl`) to the operator that caps the TTL read from the K8s object.
E.g. correct state:
`--max-project-ttl` is set to `120h`
staff set `cs.sap.com/ttl: 48h` on the object
the operator does not change the label (48h < 120h); the current state is valid
E.g. wrong state:
`--max-project-ttl` is set to `720h`
staff set `cs.sap.com/ttl: 768h` on the object
the operator changes the label to `720h` (768h > 720h) so that it matches the configured maximum (768h -> 720h) and logs this step as a warning
This setting guards against mistakes by operations staff (e.g. a typo with extra zeroes at the end). It is also good practice for avoiding wasted resources when somebody wants a very long-lived project.
If you have a better idea for solving this problem, please share it.
How about exposing the creation time as a metric and computing the age from it? For example, a recording rule with expr: (time() - app_creation_time) / (60 * 60 * 24), and then an alert for however many days we want, with something like expr: app_uptime_days > 30.
We could either use something that is already available, such as creationTime := pod.ObjectMeta.CreationTimestamp, or set the metric ourselves when the application starts up, with appCreationTime.Set(float64(time.Now().Unix())).
OK, let's think it through: suppose we have this metric and alert. When the alert fires, the operations team must act and fix it, that is, set the correct value by hand. That is another opportunity for human error.
Why create opportunities for failure?
If we have this feature and the operations team sets a wrong value, the operator fixes it immediately and the bad state is gone for good.