
Use specific client set for operations #105

Open · wants to merge 1 commit into base: master

Conversation

machacekondra (Member)

This PR adds usage of a specific client set for each specific operation.
We now use the client set built from the kubeconfig parameter only for
creating and watching the experiment pod, and the client set built from
the litmuskubeconfig parameter for watching/updating the chaosengine pod,
which may run on a different cluster.

Signed-off-by: Ondra Machacek omachace@redhat.com
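The selection logic described above (fall back to the in-cluster config whenever a kubeconfig flag is empty, with one client set per cluster) can be sketched as follows. This is a simplified illustration, not the PR's actual code: the helper name `pickConfigSource` is hypothetical, and the real code would build `*rest.Config` values via `rest.InClusterConfig()` or `clientcmd.BuildConfigFromFlags` from k8s.io/client-go rather than return strings.

```go
package main

import (
	"flag"
	"fmt"
)

// pickConfigSource decides where a client set's config comes from: an
// explicit kubeconfig path when one is given, otherwise the in-cluster
// config (rest.InClusterConfig in the real code).
func pickConfigSource(path string) string {
	if path == "" {
		return "in-cluster"
	}
	return "kubeconfig:" + path
}

func main() {
	kubeconfig := flag.String("kubeconfig", "",
		"kubeconfig for the cluster running the experiment pod")
	litmuskubeconfig := flag.String("litmuskubeconfig", "",
		"kubeconfig for the cluster running the chaosengine")
	flag.Parse()

	// Two independent client sets, one per cluster, as the PR describes.
	fmt.Println("experiment client set from:", pickConfigSource(*kubeconfig))
	fmt.Println("chaosengine client set from:", pickConfigSource(*litmuskubeconfig))
}
```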

What this PR does / why we need it:

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:

Checklist:

  • Fixes #
  • Labelled this PR & related issue with documentation tag
  • PR message has documentation-related information
  • Labelled this PR & related issue with breaking-changes tag
  • PR message has breaking-changes-related information
  • Labelled this PR & related issue with requires-upgrade tag
  • PR message has upgrade-related information
  • Commit has unit tests
  • Commit has integration tests

flag.Parse()
// Use in-cluster config if kubeconfig path is not specified
if *litmuskubeconfig == "" {
	configLitmus, err := rest.InClusterConfig()
Member

Should this not be configExperiment?

Member Author

Actually no. Whatever is specified by the litmuskubeconfig parameter ends up in the configLitmus variable, and whatever is specified by the kubeconfig parameter ends up in the configExperiment variable.

@ksatchit (Member), Oct 23, 2020

Ok, got it. Could we name these better? Say litmusControlPlaneKubeConfig for the one that deals with the chaos-operator/runner/experiment CR/engine CR, and targetKubeConfig for the one that deals with the target application as well as the experiment job/helper pods and chaosresult resources (note that the chaosresult CR is created by the experiment and will therefore reside in the target cluster). The logic that reads chaosresults and patches them to the engine may need to be looked at.

Member Author

Will do.

 // Use in-cluster config if kubeconfig path is not specified
 if *kubeconfig == "" {
-	config, err := rest.InClusterConfig()
+	configExperiment, err := rest.InClusterConfig()
Member

And this, the configLitmus?

kubeconfig := flag.String("kubeconfig", "", "absolute path to the kubeconfig file")
litmuskubeconfig := flag.String("litmuskubeconfig", "", "absolute path to the kubeconfig file")
@ksatchit (Member), Oct 21, 2020

I suppose we can also distinguish the usage/help message to specify:

litmuskubeconfig - "absolute path to the kubeconfig file of target cluster where experiment job is launched"
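Applying this suggestion would just mean distinguishing the two usage strings at registration time, roughly as below. Only the litmuskubeconfig wording was proposed above; the plain kubeconfig text is left as it was in the PR. The `newFlagSet` helper is an illustration, not the actual code layout:

```go
package main

import (
	"flag"
	"fmt"
)

// newFlagSet registers the two kubeconfig flags with distinguished help
// strings, per the review suggestion.
func newFlagSet() *flag.FlagSet {
	fs := flag.NewFlagSet("chaos-runner", flag.ContinueOnError)
	fs.String("kubeconfig", "",
		"absolute path to the kubeconfig file")
	fs.String("litmuskubeconfig", "",
		"absolute path to the kubeconfig file of target cluster where experiment job is launched")
	return fs
}

func main() {
	fs := newFlagSet()
	fs.Parse([]string{})
	// The distinguished help text now shows up in -h output.
	fmt.Println(fs.Lookup("litmuskubeconfig").Usage)
}
```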

Member Author

Will do.


@@ -116,7 +116,7 @@ func BuildingAndLaunchJob(experiment *ExperimentDetails, clients ClientSets) err

 // launchJob spawn a kubernetes Job using the job Object received.
 func (experiment *ExperimentDetails) launchJob(job *batchv1.Job, clients ClientSets) error {
-	_, err := clients.KubeClient.BatchV1().Jobs(experiment.Namespace).Create(job)
+	_, err := clients.KubeClientExperiment.BatchV1().Jobs(experiment.Namespace).Create(job)
Member

Just trying to understand which config is used for which purpose:

  • kubeconfig & KubeClientExperiment for ChaosEngine, ChaosExperiment, ChaosRunner, ChaosOperator
  • litmuskubeconfig & k8sClientSetLitmus for TargetApplication, ExperimentJob/Pod, ChaosResult

Right? If this is the case, would you like to use clients.KubeClientExperiment to get the chaosresult verdict here?

Member Author

Well, I was thinking of having the chaosresult handled via kubeconfig. Do you think it should be handled via litmuskubeconfig?

@ksatchit (Member) commented Oct 23, 2020

Also, just thinking out loud about the impacts/benefits (use cases) of this change. Today, the chaos resources (CRs) and the associated pod/job resources are tightly tied together.

  • The engine details are referenced inside the experiments, and the chaos k8s events are generated against the chaosengine resource
  • The abort logic today works by patching the engine resource, with the operator itself removing the runner/experiment jobs/helpers etc.
  • The chaosresults are generated by the experiment and viewed by the runner in order to patch the engine (the result being a more involved descriptor of test details than enginestatus, in which just a summary is maintained)

This would mean passing multiple configs around across components. While feasible, I was wondering if this is really needed.

@machacekondra, there is a feature in the upcoming litmus-portal which allows you to connect an "external cluster" and execute chaos against it via an agent (auto-deployed when this 'cluster connect' happens). The target/external cluster also has the entire set of litmus components installed into it. The only difference is that in the portal we run chaos workflows (Argo workflows that embed the engine spec) rather than the engine directly. Do you think this satisfies your use case? Here is a ref to the user guide: https://docs.google.com/document/d/1fiN25BrZpvqg0UkBCuqQBE7Mx8BwDGC8ss2j2oXkZNA/edit#heading=h.solcj2aoqyws

@ispeakc0de (Member)

Adding to @ksatchit's comments: a few more complexities can arise if we use different kubeconfigs for the runner & experiment pod. In the case of node attributes/operations (node-selector, node-taint, etc.), we bind those pods to the same node so that they remain unaffected in case of node-level chaos. A separate cluster/kubeconfig may have different node names/labels, which can lead to failures in those cases.

@machacekondra (Member Author)

Just went quickly through the targeted-cluster doc, and it should solve the use case. I will try it.

@machacekondra (Member Author)

@ispeakc0de @ksatchit Is there any plan to have some API for creating the external cluster, rather than doing it only via the UI app?

@ksatchit (Member) commented Oct 29, 2020

> @ispeakc0de @ksatchit Is there any plan to have some API for the create external cluster, rather then doing it only via UI app?

@machacekondra that's a really good point. IMO, we should definitely have a well-structured and documented API for this, and I hope this is already being actively thought of. Tagging @gdsoumya and @rajdas98 to respond.

@gdsoumya (Member)

> @ispeakc0de @ksatchit Is there any plan to have some API for the create external cluster, rather then doing it only via UI app?

@machacekondra we have GraphQL APIs exposed, though they are currently undocumented for public use. It should be possible to interact with the server directly using those. We can give you a quick walkthrough if you want to try it immediately; we will document the APIs soon so they can be used directly.

@machacekondra (Member Author)

Well, I find it not easy to use multiple clusters from the portal, so I would welcome it if this could somehow be possible via multiple kubeconfig files. Another approach would be to have a special kubeconfig file only for the helper pod, if that could somehow close the gaps. What do you think?

@ksatchit (Member) commented Nov 3, 2020

Hey @machacekondra, sorry to hear you are not finding the portal convenient; hopefully this is addressed in the upcoming builds. Having said that, launching helpers alone (and maybe any other specific actions on the target clusters, via a different kubeconfig) on a case-by-case basis does seem like a better option than changing it in chaos-runner. It makes things simpler/less complicated by one level. I would like to hear @ispeakc0de's views on this too.

@gdsoumya (Member) commented Nov 3, 2020

Hi @machacekondra, can you share the concerns or issues you might be facing while trying to use the portal? Once we make the API doc public, it should be possible to automate the external cluster connects and also the chaos workflow deployments. If you want to try the API way of doing things, we can give you rough documentation for now.

@machacekondra (Member Author) commented Nov 4, 2020

@gdsoumya Well, I think you are then locked into using the litmus-portal only with chaos workflows. Users can't then simply use ChaosEngine etc., am I correct?

@machacekondra (Member Author)

> If you want to try the api way of doing things we can give you a rough documentation for now.

What I would need is the following:

  1. Create the external cluster.
  2. Obtain the kubectl command I should run on the external cluster to set it up.
  3. Run the experiment on the external cluster, while the management cluster would be the self cluster.

@gdsoumya (Member) commented Nov 5, 2020

Hi @machacekondra, yes, you are right: you cannot directly use chaosengines, but you can surely use them inside your chaos-workflow. If your goal is just to run an experiment, you can run a workflow with only one step that executes a chaosengine.

Yes, the above flow should be possible with the APIs. As I mentioned, our APIs are GraphQL APIs, so you can use any GraphQL client of your choice in your programming language, or even craft HTTP requests to perform the GQL operations directly.

Once you set up your portal (log in and set up the admin project) and have an exposed address for the backend server, you need to first get an auth token from the auth server using an API call, and then use that token as an auth header in your GQL queries. That's the only extra step you need to perform. After that, there are basically only two operations or API calls you need to make: the first is to get the manifest to connect the external cluster, and the second is to create the workflow that will be executed.

If you want, we can write a sample script/program in Go to show how you can do the above using the APIs through HTTP requests.
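The flow described above (fetch an auth token, then make two GraphQL calls over HTTP) might look roughly like this in Go. Everything specific here is an assumption, since the portal APIs were undocumented at the time: the `/query` path, the auth-header format, and the query texts are placeholders, and `gqlRequest`/`callPortal` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// gqlRequest builds the standard JSON body for a GraphQL-over-HTTP request.
func gqlRequest(query string, variables map[string]interface{}) ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"query":     query,
		"variables": variables,
	})
}

// callPortal posts a GraphQL operation to the portal backend, attaching the
// token previously obtained from the auth server.
func callPortal(server, token, query string, vars map[string]interface{}) (*http.Response, error) {
	body, err := gqlRequest(query, vars)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest("POST", server+"/query", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", token) // token from the auth server
	return http.DefaultClient.Do(req)
}

func main() {
	// Step 1 would be an HTTP call to the auth server to obtain the token.
	// Steps 2 and 3 are then the two GraphQL calls described above, e.g.:
	//   callPortal(server, token, "<query: cluster-connect manifest>", nil)
	//   callPortal(server, token, "<mutation: create chaos workflow>", nil)
	fmt.Println("see callPortal for the request shape")
}
```

The two "real" calls differ only in the query text, which is why a single helper covers the whole flow.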
