
Is there a way to generate yml files that will produce the existing cluster? #24873

Closed

dcowden opened this issue Apr 27, 2016 · 35 comments

@dcowden

dcowden commented Apr 27, 2016

Given a Kubernetes cluster that is running some number of pods, services, deployments, etc., I would like to generate one or more files (yml format preferred) that would reproduce the current configuration when applied to a new cluster.

My use case is a promotion system. I have my 'stack files' as yml files in a git repo, but I need to allow humans to approve changes before they are applied to the cluster.

One way to do this is to use an 'open loop' system. I can use tags or other mechanisms to determine which versions have been applied to the cluster, and then compare the latest version available with the latest deployed version.

The problem with the open-loop system is that it does not consider that changes could have been made outside the files, or that changes applied could have had problems, etc.

If I could extract the 'equivalent' files from a running cluster, I could compare them with the ones that are about to be applied. This is a much stronger, 'closed loop' system: it correctly understands what will happen when the changes are applied, even if we have lost track of the real target state.
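
A minimal sketch of that closed-loop comparison, assuming hypothetical file paths and a hypothetical deployment named my-app:

kubectl get deployment my-app -o yaml > live/my-app.yaml
diff live/my-app.yaml repo/my-app.yaml

In practice the live output also contains server-populated fields, which is what much of the rest of this thread is about.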

If there were such a thing as kubectl apply -f --dry-run, which lists only the changes that would be made rather than actually making them, that would work as well. That is already being discussed over at issue #11488.

Does anyone have thoughts on this? We are new to Kubernetes, but we have built the functionality I'm describing above for our RedHat/Satellite rpm-based deployments, so I want to re-create it in k8s. Of course, in k8s we have the added complexity that the infrastructure itself can change, not just the installed package versions!

@mikedanese
Member

kubectl get po,deployment,rc,rs,ds,no,job -o yaml?

@dcowden
Author

dcowden commented Apr 29, 2016

Ah yes, of course! This works, but it is not quite what I was looking for. It answers my question, but it doesn't give me files that match the ones I used.

I learned that the answer to this question is to read the 'last-applied-configuration' annotation that kubectl adds. This gives the files that were used to produce the config.
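
For example, the annotation can be read directly with jsonpath (the deployment name here is a placeholder); it stores the object as it was last passed to kubectl apply:

kubectl get deployment my-app -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'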

@dcowden dcowden closed this as completed Apr 29, 2016
@mikedanese
Member

@dcowden also see kubectl get --export
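
A quick hedged example of that flag, with a placeholder deployment name (note that --export was later deprecated and removed, as discussed further down):

kubectl get deployment my-app -o yaml --export > my-app.yaml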

@dcowden
Author

dcowden commented Apr 29, 2016

Ah, that's even better! Thanks!

@alahijani

alahijani commented Oct 17, 2017

Combining other answers, this is what I came up with for bash:

for n in $(kubectl get -o=name pvc,configmap,serviceaccount,secret,ingress,service,deployment,statefulset,hpa,job,cronjob)
do
    mkdir -p $(dirname $n)
    kubectl get -o=yaml --export $n > $n.yaml
done

Edit: --export is no longer necessary (or even supported) in latest versions.
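
A hedged variant of the same loop for newer kubectl versions where --export no longer exists (the output will still contain server-populated fields such as status and resourceVersion):

for n in $(kubectl get -o=name pvc,configmap,serviceaccount,secret,ingress,service,deployment,statefulset,hpa,job,cronjob)
do
    mkdir -p $(dirname $n)
    kubectl get -o=yaml $n > $n.yaml
done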

@rakshazi

k8s 1.8

kubectl get all --export=true -o yaml

@LouisStAmour

For folks coming here from Google: on my test instance, the previous comment's all doesn't appear to include ingress, and you also have to pass --all-namespaces to get it to dump other namespaces.

Related: #42885 and #42954 (comment) etc.
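
A hedged one-liner reflecting both observations (adding ingress explicitly and dumping every namespace):

kubectl get all,ingress --all-namespaces -o yaml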

@acondrat

acondrat commented Jul 18, 2018

A variation on top of the solution provided by @alahijani

for n in $(kubectl get -o=name pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob | grep -v 'secret/default-token')
do
    kubectl get -o=yaml --export $n > $(dirname $n)_$(basename $n).yaml
done

This puts all the yaml files in a single dir for an easy kubectl apply -f. It also excludes the default service account secret, which cannot be exported.
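
A hedged usage sketch, assuming the loop above was run inside a dedicated, otherwise empty directory:

kubectl apply -f .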

@PidgeyBE

PidgeyBE commented Aug 27, 2018

Another version: exporting all YAMLs from all namespaces. A directory is created for each namespace.

  • including persistent volumes!
i=$((0))
for n in $(kubectl get -o=custom-columns=NAMESPACE:.metadata.namespace,KIND:.kind,NAME:.metadata.name pv,pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob --all-namespaces | grep -v 'secrets/default-token')
do
	if (( $i < 1 )); then
		namespace=$n
		i=$(($i+1))
		if [[ "$namespace" == "PersistentVolume" ]]; then
			kind=$n
			i=$(($i+1))
		fi
	elif (( $i < 2 )); then
		kind=$n
		i=$(($i+1))
	elif (( $i < 3 )); then
		name=$n
		i=$((0))
		echo "saving ${namespace} ${kind} ${name}"
		if [[ "$namespace" != "NAMESPACE" ]]; then
			mkdir -p $namespace
			kubectl get $kind -o=yaml --export $name -n $namespace > $namespace/$kind.$name.yaml
		fi
	fi
done

and for importing again:

path=$(pwd)
for n in $(ls -d */)
do
	echo "Creating namespace ${n:0:-1}"
	kubectl create namespace ${n:0:-1}

	for yaml in $(ls $path/$n)
	do
		echo -e "\t Importing $yaml"
		kubectl apply -f $path/$n$yaml -n ${n:0:-1}
	done

done

@mrwulf

mrwulf commented Aug 28, 2018

Another little tweak to exclude service account tokens:

#!/bin/env bash

## https://github.com/kubernetes/kubernetes/issues/24873#issuecomment-416189335

i=$((0))
for n in $(kubectl get -o=custom-columns=NAMESPACE:.metadata.namespace,KIND:.kind,NAME:.metadata.name pv,pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob --all-namespaces | grep -v 'secrets/default-token')
do
    if (( $i < 1 )); then
        namespace=$n
        i=$(($i+1))
        if [[ "$namespace" == "PersistentVolume" ]]; then
            kind=$n
            i=$(($i+1))
        fi
    elif (( $i < 2 )); then
        kind=$n
        i=$(($i+1))
    elif (( $i < 3 )); then
        name=$n
        i=$((0))
        if [[ "$namespace" != "NAMESPACE" ]]; then
            mkdir -p $namespace

            yaml=$((kubectl get $kind -o=yaml $name -n $namespace ) 2>/dev/null)
            if [[ $kind != 'Secret' || $yaml != *"type: kubernetes.io/service-account-token"* ]]; then
                echo "Saving ${namespace}/${kind}.${name}.yaml"
                kubectl get $kind -o=yaml --export $name -n $namespace > $namespace/$kind.$name.yaml
            fi
        fi
    fi
done

@oreasono

For those who work in Windows PowerShell, here's a one-liner:
Foreach ($i in $(kubectl get -o=name pvc,configmap,ingress,service,secret,deployment,statefulset,hpa,job,cronjob)) {If($i -notmatch "default-token") {kubectl get -o=yaml --export $i | Out-File -filepath $($i.Replace("/", "-") + ".yaml")}}

@4m3ndy

4m3ndy commented Nov 19, 2018

@mrwulf @acondrat
I think grep -v 'secrets/default-token' should be changed to grep -v 'secret/default-token'
secrets didn't work for me.

I'm using the following versions of kubectl and k8s cluster

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-21T11:34:22Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

@acondrat

@4m3ndy you are right! Thanks!

@4m3ndy

4m3ndy commented Dec 3, 2018

Hey guys, I just made this Docker image for exporting the required yaml files for each component per namespace. These backups are exported, then encrypted with a password and uploaded to an S3 bucket.

If anyone would like to commit any changes or share any comments, you're more than welcome 👍
ambient-innovation/k8s-backup

@xiaoping378
Contributor

Example: generate PV yaml.

kubectl get pv -o yaml --export | sed -e '/resourceVersion: "[0-9]\+"/d' -e '/uid: [a-z0-9-]\+/d' -e '/selfLink: [a-z0-9A-Z/]\+/d' -e '/status:/d' -e '/phase:/d' -e '/creationTimestamp:/d' > pvList.yaml

@4m3ndy

4m3ndy commented Mar 25, 2019

@xiaoping378
https://github.com/ambient-innovation/k8s-backup/blob/01c1bfe750136648fd91e14dd691ba39bb05f282/k8s-backup.sh#L38

This script should generate all PVCs for each namespace and then export the yaml file for each PV; have a look.

@ivalexm

ivalexm commented Oct 1, 2019

Create a folder ${HOME}/clusterstate/, then run:
kubectl cluster-info dump --all-namespaces --output-directory=${HOME}/clusterstate/ -o yaml
All your entities will be in a separate folder structure corresponding to the namespaces.
The .json extensions, e.g. deployments.json, are misleading, as the -o yaml flag will create yaml exports.

@harbdog

harbdog commented Jan 19, 2020

Create a folder ${HOME}/clusterstate/, then run:
kubectl cluster-info dump --all-namespaces --output-directory=${HOME}/clusterstate/ -o yaml
All your entities will be in a separate folder structure corresponding to the namespaces.
The .json extensions, e.g. deployments.json, are misleading, as the -o yaml flag will create yaml exports.

FYI, this appears to need a decent amount of RAM for large deployments; my 2 GB RAM CLI jumpbox VM can't handle it (probably needs 4 or 8, I'd imagine):

fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x1ab7c29, 0x16)
        /usr/local/go/src/runtime/panic.go:774 +0x72
runtime.sysMap(0xc068000000, 0x10000000, 0x2da7238)
        /usr/local/go/src/runtime/mem_linux.go:169 +0xc5
runtime.(*mheap).sysAlloc(0x2d8e9a0, 0x10000000, 0x0, 0x0)
        /usr/local/go/src/runtime/malloc.go:701 +0x1cd
runtime.(*mheap).grow(0x2d8e9a0, 0x8000, 0xffffffff)
        /usr/local/go/src/runtime/mheap.go:1255 +0xa3
runtime.(*mheap).allocSpanLocked(0x2d8e9a0, 0x8000, 0x2da7248, 0x42c7bc)
        /usr/local/go/src/runtime/mheap.go:1170 +0x266
runtime.(*mheap).alloc_m(0x2d8e9a0, 0x8000, 0x101, 0xc000103f18)
        /usr/local/go/src/runtime/mheap.go:1022 +0xc2

I reran it on my desktop and tracked the kubectl process's memory usage; it peaked at just around 4 GB, so 8 GB it is!

Apparently that roughly matches the total output size of the dump, which includes logs, and some of our pods (90 of them) are putting out logs over 100 MB in size. This suggests the dump command is holding everything in RAM even as it writes to disk; it could probably be optimized to release memory as each log finishes writing.
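
One hedged workaround for the memory usage, assuming per-namespace dumps are acceptable, is to dump one namespace at a time instead of all at once:

# dump each namespace into its own directory to keep kubectl's working set smaller
for ns in $(kubectl get namespaces -o name | cut -d/ -f2)
do
    kubectl cluster-info dump --namespaces="$ns" --output-directory="${HOME}/clusterstate/${ns}" -o yaml
done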

@apurvabhandari

apurvabhandari commented Feb 6, 2020

Can anyone tell me a command or script that will back up a cluster (namespaces, deployments, services, secrets, PVs, PVCs, and ConfigMap yaml files only) with all their information and restore it in a new cluster?
I have tried the --export flag, but then the service name is missing from the service yaml; if I take the backup without --export, it includes the clusterIP, nodePort, and loadBalancer IP, which prevents me from deploying it in a new cluster because those fields are immutable (clusterIP and loadBalancer).
Kubernetes cluster version 1.14 onward (1.15/16/17); trying to back up/restore on GCP GKE or AWS EKS.
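
A rough, hedged sketch of stripping the cluster-assigned fields before re-applying a service in a new cluster (the service name is a placeholder; a simple line filter like this does not clean the status block, and tools like kubectl-neat, mentioned further down in this thread, handle this more reliably):

kubectl get service my-svc -o yaml \
  | grep -vE '^[[:space:]]*(clusterIP|nodePort|resourceVersion|uid|selfLink|creationTimestamp):' \
  > my-svc.yaml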

@vhosakot

Thanks to kubectl api-resources! I've been able to get manifests (yaml files) of all resources in all namespaces in k8s using the following bash script:

#!/usr/bin/env bash

while read -r line
do
    output=$(kubectl get "$line" --all-namespaces -o yaml 2>/dev/null | grep '^items:')
    if ! grep -q "\[\]" <<< $output; then
        echo -e "\n======== "$line" manifests ========\n"
        kubectl get "$line" --all-namespaces -o yaml
    fi
done < <(kubectl api-resources | awk '{print $1}' | grep -v '^NAME')

The above bash script was tested with:

  • k8s v1.16.3
  • Ubuntu Bionic 18.04.3 OS
  • bash version 4.4.20(1)-release (x86_64-pc-linux-gnu)

@jacobsvante

@vhosakot Small simplification to your script.

You can replace: kubectl api-resources | awk '{print $1}' | grep -v '^NAME'
With: kubectl api-resources -o name
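
For reference, a hedged sketch of the full loop with that simplification applied:

#!/usr/bin/env bash

while read -r resource
do
    # skip resources that return an empty item list
    output=$(kubectl get "$resource" --all-namespaces -o yaml 2>/dev/null | grep '^items:')
    if ! grep -q "\[\]" <<< "$output"; then
        echo -e "\n======== ${resource} manifests ========\n"
        kubectl get "$resource" --all-namespaces -o yaml
    fi
done < <(kubectl api-resources -o name)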

@scones

scones commented Sep 17, 2020

#!/usr/bin/env bash

while read -r namespace
do
    echo "scanning namespace '${namespace}'"
    mkdir -p "${HOME}/cluster-backup/${namespace}"
    while read -r resource
    do
        echo "  scanning resource '${resource}'"
        mkdir -p "${HOME}/cluster-backup/${namespace}/${resource}"
        while read -r item
        do
            echo "    exporting item '${item}'"
            kubectl get "$resource" -n "$namespace" "$item" -o yaml > "${HOME}/cluster-backup/${namespace}/${resource}/$item.yaml"
        done < <(kubectl get "$resource" -n "$namespace" 2>&1 | tail -n +2 | awk '{print $1}')
    done < <(kubectl api-resources --namespaced=true 2>/dev/null | tail -n +2 | awk '{print $1}')
done < <(kubectl get namespaces | tail -n +2 | awk '{print $1}')

I extended the script above a little (and slowed it down). It loads all namespaces, loads all resources in each namespace, and then saves each object as a single file per resource per namespace. It is verbose and shows some errors, but the end result (the dump) should be clean.

@nathan-c

nathan-c commented Oct 22, 2020

#!/usr/bin/env bash
ROOT=${HOME}/clusterstate

while read -r resource
do
    echo "  scanning resource '${resource}'"
    while read -r namespace item x
    do
        mkdir -p "${ROOT}/${namespace}/${resource}"        
        echo "    exporting item '${namespace} ${item}'"
        kubectl get "$resource" -n "$namespace" "$item" -o yaml > "${ROOT}/${namespace}/${resource}/$item.yaml" &
    done < <(kubectl get "$resource" --all-namespaces 2>&1 | tail -n +2)
done < <(kubectl api-resources --namespaced=true 2>/dev/null | tail -n +2 | awk '{print $1}')

wait

Inspired by @scones, but runs a little quicker because of process forking and reduced loop nesting, which is useful if you have a lot of custom resource definitions!

@mohamedelhabib

Same as @nathan-c, but I removed events from the resources list to fix errors.

#!/usr/bin/env bash
ROOT=${HOME}/clusterstate
while read -r resource
do
    echo "  scanning resource '${resource}'"
    while read -r namespace item x
    do
        mkdir -p "${ROOT}/${namespace}/${resource}"        
        echo "    exporting item '${namespace} ${item}'"
        kubectl get "$resource" -n "$namespace" "$item" -o yaml > "${ROOT}/${namespace}/${resource}/$item.yaml" &
    done < <(kubectl get "$resource" --all-namespaces 2>&1  | tail -n +2)
done < <(kubectl api-resources --namespaced=true 2>/dev/null | grep -v "events" | tail -n +2 | awk '{print $1}')

wait

@dnzlde

dnzlde commented Jan 28, 2021

In case someone is still searching here for the answer, like me:
Currently there's an easier way to do it and actually get the last-applied yaml files, without added empty fields:
kubectl apply view-last-applied <resource> <name>
Like this:
kubectl apply view-last-applied configmap nginx-config
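
A hedged sketch combining this with the earlier -o=name loops to dump the last-applied manifests of a few kinds in one namespace (the kinds, namespace, and output directory are assumptions; objects that were never created with kubectl apply have no last-applied annotation and will produce an error):

mkdir -p last-applied
for n in $(kubectl get -o=name deployment,service,configmap -n default)
do
    kubectl apply view-last-applied "$n" -n default > "last-applied/$(dirname $n)_$(basename $n).yaml"
done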

@manoj-devops

The script looks good, but for a cloud-managed cluster it seems to have a slight issue connecting to the API. It sometimes times out and then reconnects. Maybe the export needs to slow down its connections to the cluster; some tweaks might do.

Unable to connect to the server: dial tcp 20.143.169.19:443: i/o timeout
Unable to connect to the server: dial tcp 20.143.169.19:443: i/o timeout
Unable to connect to the server: dial tcp 20.143.169.19:443: i/o timeout

@dnzlde

dnzlde commented Feb 5, 2021

The script looks good, but for a cloud-managed cluster it seems to have a slight issue connecting to the API. It sometimes times out and then reconnects. Maybe the export needs to slow down its connections to the cluster; some tweaks might do.

Unable to connect to the server: dial tcp 20.143.169.19:443: i/o timeout
Unable to connect to the server: dial tcp 20.143.169.19:443: i/o timeout
Unable to connect to the server: dial tcp 20.143.169.19:443: i/o timeout

This is obviously just a connection issue you are having.

@ameyaagashe

In case someone is still searching here for the answer, like me:
Currently there's an easier way to do it and actually get the last-applied yaml files, without added empty fields:
kubectl apply view-last-applied <resource> <name>
Like this:
kubectl apply view-last-applied configmap nginx-config

But unfortunately it does not work for all resources; check deployments, replica sets, and ingresses.

@o6uoq

o6uoq commented Feb 17, 2021

Use GitOps

@scones

scones commented Mar 2, 2021

@o6uoq Investing even more work to solve a small problem does not solve the problem of investing less work.

@jmorcar

jmorcar commented Mar 23, 2021

I found a very sophisticated project for this here:

https://github.com/pieterlange/kube-backup

It provides a number of short commands that reduce the amount of command-line work.

For example, you can get an inventory list to iterate over afterwards (with a command like this):

kubectl get \
namespace,replicaset,secret,nodes,job,daemonset,statefulset,ingress,configmap,pv,pvc,service,deployment,pod,elasticsearch.elasticsearch.k8s.elastic.co,kibana.kibana.k8s.elastic.co,serviceaccount \
--all-namespaces \
--ignore-not-found \
-o custom-columns=NAME:.metadata.name,KIND:.kind,NAMESPACE:.metadata.namespace

Then you can get the yaml for any item in that list with a simple kubectl get <item name> -n <namespace> -o yaml

The project I mentioned above has an excellent implementation; more details here:

https://github.com/pieterlange/kube-backup/blob/master/entrypoint.sh

I still need to check this code, which eliminates some of the default yaml status fields (I prefer yaml export, but the project uses json export and converts it to yaml, so it could be a better solution):

for resource in $GLOBALRESOURCES; do
    [ -d "$GIT_REPO_PATH/$GIT_PREFIX_PATH" ] || mkdir -p "$GIT_REPO_PATH/$GIT_PREFIX_PATH"
    echo "Exporting resource: ${resource}" >/dev/stderr
    kubectl get -o=json "$resource" | jq --sort-keys \
        'del(
          .items[].metadata.annotations."kubectl.kubernetes.io/last-applied-configuration",
          .items[].metadata.annotations."control-plane.alpha.kubernetes.io/leader",
          .items[].metadata.uid,
          .items[].metadata.selfLink,
          .items[].metadata.resourceVersion,
          .items[].metadata.creationTimestamp,
          .items[].metadata.generation
      )' | python -c 'import sys, yaml, json; yaml.safe_dump(json.load(sys.stdin), sys.stdout, default_flow_style=False)' >"$GIT_REPO_PATH/$GIT_PREFIX_PATH/${resource}.yaml"
done

@xykong

xykong commented Jun 9, 2021

Inspired by @mohamedelhabib, @nathan-c, and @scones, I made a script with options to make it easier to use.

https://gist.github.com/xykong/6efdb1ed57535d18cb63aa8e20da3f4b

Run the script like this:

./k8sdump.sh -n jmeter -r deployments -o /data/workspace

@andrey-gava

@mohamedelhabib @nathan-c @scones

Nice, but what about non-namespaced resources, like pv,crd,psp,clusterrolebindings,clusterroles,sc? It's great to have them too.
Is there a way to avoid adding one more while loop to the script?
(Warning: I use the kubectl-neat binary for the last get.)

while read -r resource
do
    echo "  scanning resource '${resource}'"
    while read -r item x
    do
        mkdir -p "${ROOT}/non-namespaced/${resource}"
        echo "    exporting item '${item}'"
        kubectl-neat get -- "$resource" "$item" -o yaml > "${ROOT}/non-namespaced/${resource}/$item.yaml" &
    done < <(kubectl get  "$resource" --all-namespaces 2>&1  | tail -n +2)
done < <(kubectl api-resources --namespaced=false 2>/dev/null | grep -vE "componentstatuses|nodes|mutatingwebhookconfigurations|validatingwebhookconfigurations|apiservices|tokenreviews|.*subject.*|certificatesigningrequests|runtimeclasses|csi.*|volumeattachments" | tail -n +2 | awk '{print $1}' | sort | uniq)

A second problem is repeated resources, which produce duplicate iterations. See the output with and without sort | uniq (v1.16.10):

kubectl api-resources --namespaced=true 2>/dev/null | grep -vE "events|controllerrevisions|endpoints|replicasets|pods" | tail -n +2 | awk '{print $1}' | wc -l
60
kubectl api-resources --namespaced=true 2>/dev/null | grep -vE "events|controllerrevisions|endpoints|replicasets|pods" | tail -n +2 | awk '{print $1}' | sort | uniq | wc -l
52

@fzyzcjy

fzyzcjy commented Aug 1, 2022

What about the CRDs?
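
A hedged sketch for CRDs specifically. Note that kubectl api-resources already lists custom kinds, so the api-resources loops above pick up their instances; this additionally saves the CRD definitions themselves (file names are placeholders):

kubectl get crd -o yaml > crds.yaml

# optionally, dump the instances of each custom resource as well
for crd in $(kubectl get crd -o name | cut -d/ -f2)
do
    kubectl get "$crd" --all-namespaces -o yaml > "crd-${crd}.yaml"
done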

@B-Whitt

B-Whitt commented Aug 16, 2022

kubectl cluster-info dump

Unfortunately this does not back up all the resources in the namespace.
