panic: runtime error: invalid memory address or nil pointer dereference #676

Open
ibreakthecloud opened this issue Sep 6, 2018 · 2 comments


kubicorn apply kubicorn-cluster

2018-09-06T12:23:36+05:30 [ℹ]  Selected [fs] state store
2018-09-06T12:23:36+05:30 [ℹ]  Loaded cluster: kubicorn-cluster
2018-09-06T12:23:36+05:30 [ℹ]  Init Cluster
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x140 pc=0x14f9068]

goroutine 1 [running]:
github.com/kubicorn/kubicorn/pkg/initapi.sshLoader(0xc0000e11e0, 0x15, 0x0, 0x0)
	/home/karn/go/src/github.com/kubicorn/kubicorn/pkg/initapi/ssh.go:29 +0x38
github.com/kubicorn/kubicorn/pkg/initapi.InitCluster(0xc0000e11e0, 0xc, 0x0, 0x0)
	/home/karn/go/src/github.com/kubicorn/kubicorn/pkg/initapi/init.go:48 +0x9c
github.com/kubicorn/kubicorn/cmd.runApply(0xc0000e99e0, 0x59a8e0, 0xc000613cc0)
	/home/karn/go/src/github.com/kubicorn/kubicorn/cmd/apply.go:118 +0x2dd
github.com/kubicorn/kubicorn/cmd.ApplyCmd.func1(0xc000026900, 0xc00022be00, 0x1, 0x1)
	/home/karn/go/src/github.com/kubicorn/kubicorn/cmd/apply.go:59 +0x68
github.com/kubicorn/kubicorn/vendor/github.com/spf13/cobra.(*Command).execute(0xc000026900, 0xc00022bdc0, 0x1, 0x1, 0xc000026900, 0xc00022bdc0)
	/home/karn/go/src/github.com/kubicorn/kubicorn/vendor/github.com/spf13/cobra/command.go:750 +0x2cc
github.com/kubicorn/kubicorn/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x2e67e20, 0xc000613f78, 0x15c71c9, 0x1b0a6cf)
	/home/karn/go/src/github.com/kubicorn/kubicorn/vendor/github.com/spf13/cobra/command.go:831 +0x2dc
github.com/kubicorn/kubicorn/vendor/github.com/spf13/cobra.(*Command).Execute(0x2e67e20, 0x407450, 0xc0000b6058)
	/home/karn/go/src/github.com/kubicorn/kubicorn/vendor/github.com/spf13/cobra/command.go:784 +0x2b
github.com/kubicorn/kubicorn/cmd.Execute()
	/home/karn/go/src/github.com/kubicorn/kubicorn/cmd/root.go:76 +0x2d
main.main()
	/home/karn/go/src/github.com/kubicorn/kubicorn/main.go:22 +0x20

teq0 commented Oct 22, 2018

I was having this problem, and it was caused by clusterAPI::spec::providerConfig being empty. The panic happens because ProviderConfig().SSH is nil here: https://github.com/kubicorn/kubicorn/blob/master/pkg/initapi/ssh.go#L29. If you fix the code to check for nil you still don't get very far, because everything else in that section is missing too.
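
For illustration, a nil check of the kind mentioned above could look roughly like the sketch below. The types `Cluster`, `ProviderConfig`, and `SSHConfig` and the function `loadSSH` are simplified stand-ins invented for this example; they are not kubicorn's actual definitions, and the real `sshLoader` may need a different shape.

```go
package main

import (
	"errors"
	"fmt"
)

// SSHConfig, ProviderConfig and Cluster are hypothetical stand-ins,
// not the real kubicorn types.
type SSHConfig struct {
	PublicKeyPath string
}

type ProviderConfig struct {
	SSH *SSHConfig
}

type Cluster struct {
	Name   string
	Config *ProviderConfig
}

// loadSSH is a hypothetical loader that reports a missing SSH section
// as an error instead of dereferencing a nil pointer.
func loadSSH(c *Cluster) (string, error) {
	if c == nil || c.Config == nil || c.Config.SSH == nil {
		return "", errors.New("provider config has no SSH section; cluster state may have been blanked")
	}
	return c.Config.SSH.PublicKeyPath, nil
}

func main() {
	// A cluster whose provider config was emptied, as described in the comment.
	_, err := loadSSH(&Cluster{Name: "kubicorn-cluster", Config: &ProviderConfig{}})
	fmt.Println("error:", err)
}
```

This only turns the panic into a readable error. As noted above, it does not bring back the rest of the missing configuration.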

I haven't figured out what yet, but something appears to blank out that section in cluster.yaml sometimes, possibly after certain errors occur. The workaround was to regenerate the file with kubicorn create, or just back it up so you can restore it if something overwrites it.

Edit: This appears to be because the ProviderConfig is cleared once the Reconciler has been created, so if another error occurs after that but before actual provisioning, the ProviderConfig is lost.

Edit2: It's here. At this point newCluster.ClusterAPI.Spec.ProviderConfig is empty. In the case of Azure the cluster object has been (re)constructed in https://github.com/kubicorn/kubicorn/blob/master/cloud/azure/public/resources/resourcegroup.go#L131, which starts with an empty ProviderConfig, and only copies a couple of fields into it from the resourceGroup, and these (in my case) are also empty.


teq0 commented Oct 25, 2018

I've now played with this a bit more and traced through it many times, and I must be missing something because I can't see how it ever worked. Has anyone successfully deployed an Azure cluster in the last few months?

The thing I can't understand is that, as per my previous comment, resourcegroup.immutableRender() returns a cluster object whose ProviderConfig is created from a new, empty struct, so most or all of the original ProviderConfig is lost at this point.
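
A stripped-down sketch of that pattern, using made-up types rather than kubicorn's: `renderFresh` rebuilds the config from an empty struct and copies only what the resource knows, while `renderPreserving` starts from a copy of the input. The field names (`Location`, `SSHUser`) and both function names are assumptions chosen for the example.

```go
package main

import "fmt"

// ProviderConfig, Cluster and ResourceGroup are hypothetical, simplified types.
type ProviderConfig struct {
	Location string
	SSHUser  string
}

type Cluster struct {
	Name   string
	Config ProviderConfig
}

type ResourceGroup struct {
	Location string
}

// renderFresh mimics the reported behaviour: it starts from an empty
// ProviderConfig and copies only the fields the resource itself holds.
func renderFresh(in Cluster, rg ResourceGroup) Cluster {
	out := Cluster{Name: in.Name}
	out.Config = ProviderConfig{Location: rg.Location}
	return out
}

// renderPreserving starts from a value copy of the input and overrides
// only the fields for which the resource has a non-empty value.
func renderPreserving(in Cluster, rg ResourceGroup) Cluster {
	out := in
	if rg.Location != "" {
		out.Config.Location = rg.Location
	}
	return out
}

func main() {
	in := Cluster{Name: "kubicorn-cluster", Config: ProviderConfig{Location: "eastus", SSHUser: "root"}}
	rg := ResourceGroup{} // empty, as in the case described above

	fmt.Printf("fresh:      %+v\n", renderFresh(in, rg).Config)
	fmt.Printf("preserving: %+v\n", renderPreserving(in, rg).Config)
}
```

With an empty resource group, the first variant returns a config with no fields set, and the second keeps the original values.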

What puzzles me even more is that AtomicReconciler.Actual() is passed a cluster config in the parameter known, but the function never uses it. Instead it starts with r.known, loops over the resources, and overwrites the actualCluster variable inside the loop, so it only ever returns a cluster object with the resource info of the last resource in the list.
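
To make the overwrite concrete, here is a minimal sketch of the loop shape described above next to an accumulating variant. Everything in it (`Resource`, `fakeResource`, `actualOverwriting`, `actualAccumulating`) is hypothetical and heavily simplified; it is not the real AtomicReconciler code, only an illustration of why the last resource wins.

```go
package main

import "fmt"

// Cluster is a hypothetical stand-in that records which resources were seen.
type Cluster struct {
	Resources map[string]string
}

// Resource is a hypothetical interface: Actual returns the given cluster
// plus this resource's own information.
type Resource interface {
	Actual(known Cluster) Cluster
}

type fakeResource struct{ name string }

func (f fakeResource) Actual(known Cluster) Cluster {
	out := Cluster{Resources: map[string]string{}}
	for k, v := range known.Resources {
		out.Resources[k] = v
	}
	out.Resources[f.name] = "found"
	return out
}

// actualOverwriting hands every resource the same starting cluster and
// overwrites the result each time, so only the last resource survives.
func actualOverwriting(start Cluster, rs []Resource) Cluster {
	var actual Cluster
	for _, r := range rs {
		actual = r.Actual(start)
	}
	return actual
}

// actualAccumulating threads the running result through the loop,
// so each resource builds on what the previous ones added.
func actualAccumulating(start Cluster, rs []Resource) Cluster {
	actual := start
	for _, r := range rs {
		actual = r.Actual(actual)
	}
	return actual
}

func main() {
	rs := []Resource{fakeResource{"resourcegroup"}, fakeResource{"vnet"}}
	fmt.Println(len(actualOverwriting(Cluster{}, rs).Resources))  // 1
	fmt.Println(len(actualAccumulating(Cluster{}, rs).Resources)) // 2
}
```

Whether accumulating is what the reconciler was meant to do is a guess; the sketch only shows the difference between the two loop shapes.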

At that point the original ProviderConfig details from cluster.yaml are lost, but they are still needed afterwards, for example to get the SSH details. The emptied config is then stored back to the original file, so the details are lost forever. You can't even retry.

Sorry if I'm just sharing my ignorance and I'm missing something pretty basic.
