This repository has been archived by the owner on Jan 4, 2022. It is now read-only.

No space left on device /var/lib/machines (again) #281

Open
2 tasks
alban opened this issue Jun 30, 2018 · 7 comments

Comments

@alban
Member

alban commented Jun 30, 2018

To Reproduce:

  • Install Fedora 28 from https://cloud.fedoraproject.org/ (GP2 image) on AWS:
    • m4.large
    • Disk: at least 50GiB
    • ssh: ssh -i ~/.ssh/$KEY fedora@$IP
  • Start a kube-spawn Kubernetes cluster on the AWS EC2 instance:
export KUBERNETES_VERSION=v1.9.9 # or other version
export KUBE_SPAWN_VERSION=master
sudo setenforce 0
sudo dnf install -y btrfs-progs git go iptables libselinux-utils polkit qemu-img systemd-container make docker
mkdir go
export GOPATH=$HOME/go
curl -fsSL -O https://github.com/containernetworking/plugins/releases/download/v0.6.0/cni-plugins-amd64-v0.6.0.tgz
sudo mkdir -p /opt/cni/bin
sudo tar -C /opt/cni/bin -xvf cni-plugins-amd64-v0.6.0.tgz
mkdir -p $GOPATH/src/github.com/kinvolk
cd $GOPATH/src/github.com/kinvolk
git clone https://github.com/kinvolk/kube-spawn.git
cd kube-spawn/
git checkout $KUBE_SPAWN_VERSION
make DOCKERIZED=n
sudo make install
sudo -E kube-spawn create --kubernetes-version $KUBERNETES_VERSION
sudo -E kube-spawn start --nodes=3

And I get the error:

Got 17% of https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_developer_container.bin.bz2. 1min 31s left at 4.9M/s.
Failed to write file: Success
Failed to write file: Success
Failed to write file: No space left on device
Failed to retrieve image file. (Wrong URL?)
Exiting.
Failed to start cluster: error running machinectl pull-raw: exit status 1

/var/lib/machines is full.

Expected outcome

@alban
Member Author

alban commented Jun 30, 2018

Manually running the workaround suggested in #70 seems to work:

sudo umount /var/lib/machines
sudo qemu-img resize -f raw /var/lib/machines.raw $((10*1024*1024*1024))
sudo mount -t btrfs -o loop /var/lib/machines.raw /var/lib/machines
sudo btrfs filesystem resize max /var/lib/machines
sudo btrfs quota disable /var/lib/machines
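
The workaround above could be wrapped in a small helper that refuses to run when the image file is missing. This is only a sketch: `gib_to_bytes` and `print_resize_commands` are hypothetical names, not part of kube-spawn, and the helper only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper functions; the names are illustrative assumptions,
# not part of kube-spawn or systemd.

# Convert a size in GiB to bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the workaround commands for an image file, a mount point and a
# target size in GiB. Nothing is executed here. Returns 1 if the image
# file does not exist yet.
print_resize_commands() {
    raw_image=$1
    mount_point=$2
    size_gib=$3
    if [ ! -f "$raw_image" ]; then
        echo "missing image file: $raw_image" >&2
        return 1
    fi
    size_bytes=$(gib_to_bytes "$size_gib")
    echo "umount $mount_point"
    echo "qemu-img resize -f raw $raw_image $size_bytes"
    echo "mount -t btrfs -o loop $raw_image $mount_point"
    echo "btrfs filesystem resize max $mount_point"
    echo "btrfs quota disable $mount_point"
}
```

For example, `print_resize_commands /var/lib/machines.raw /var/lib/machines 10` prints the five commands from the workaround with the size already expanded to bytes; each line can then be run with sudo.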

@alban
Member Author

alban commented Jun 30, 2018

It seems I cannot run the workaround on a fresh install of Fedora because /var/lib/machines does not exist yet. I have to first run into the error, then apply the workaround, and then try kube-spawn again. I guess that's why #70 didn't work.

$ sudo umount /var/lib/machines
umount: /var/lib/machines: not mounted.
$ sudo qemu-img resize -f raw /var/lib/machines.raw $((10*1024*1024*1024))
qemu-img: Could not open '/var/lib/machines.raw': Could not open '/var/lib/machines.raw': No such file or directory
$ sudo mount -t btrfs -o loop /var/lib/machines.raw /var/lib/machines
mount: /var/lib/machines: failed to setup loop device for /var/lib/machines.raw.
$ sudo btrfs filesystem resize max /var/lib/machines
ERROR: not a btrfs filesystem: /var/lib/machines
$ sudo btrfs quota disable /var/lib/machines
ERROR: not a btrfs filesystem: /var/lib/machines
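
The failures above all come from the pool not being mounted as btrfs yet. One way to check this before attempting the workaround is to look up the mount point in the mount table. A minimal sketch, assuming a Linux `/proc/mounts` layout (device, mount point, filesystem type, ...); `mount_fstype` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type of a mount point.
# $1 is the mount point, $2 is an optional mount table file
# (defaults to /proc/mounts). Returns 1 if the path is not mounted.
mount_fstype() {
    awk -v mp="$1" '
        $2 == mp { t = $3 }              # keep the last (topmost) match
        END { if (t == "") exit 1; print t }
    ' "${2:-/proc/mounts}"
}
```

With this, `[ "$(mount_fstype /var/lib/machines)" = "btrfs" ]` succeeds only when the pool is already mounted as btrfs, which is the state the workaround expects.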

schu added a commit that referenced this issue Jul 9, 2018
While working on 924bb3b I did not
re-add the `EnlargeStoragePool` invocation, and first-time users
have encountered disk space issues more often ever since.

Also, add a note to the README and troubleshooting document to tell
users what to do.

Resolves #281
@schu
Contributor

schu commented Jul 9, 2018

@alban can you say which steps you took on a fresh Fedora system? I would like to add them to the docs.

@alban
Member Author

alban commented Jul 10, 2018

@schu Do you mean the steps to work around the issue? See the steps in #282; search for "First attempt to use kube-spawn" and "Workaround for 'no space left on device'".

@donbowman
Contributor

You can run e.g. 'sudo machinectl set-limit 20G' before you launch the first machine; this sets the maximum size limit before the btrfs pool is created.
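
To confirm that the limit was applied, the pool properties can be read back. This is a sketch under the assumption that `machinectl show` (without arguments) prints the manager properties as KEY=VALUE lines including `PoolLimit=`; `pool_limit` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read KEY=VALUE lines on stdin and print the value
# of PoolLimit, if present. Assumes the property is named PoolLimit.
pool_limit() {
    sed -n 's/^PoolLimit=//p'
}
```

Usage would be `sudo machinectl set-limit 20G` followed by `machinectl show | pool_limit`.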

@dongsupark
Copy link
Member

@donbowman Yes, we can document that approach.

Anyway, to fix this issue we need to merge #283, which looks good to me. I'm thinking about merging it tomorrow if there are no objections.
Documentation is still in progress, so I can make a follow-up PR to address the documentation issue.

@dongsupark
Member

Hmm, I didn't mean to close it. Will reopen it, as there's a documentation issue left.

@dongsupark dongsupark reopened this Jul 24, 2018
dongsupark pushed a commit that referenced this issue Jul 24, 2018
The PR #283 added some
missing code for fixing free space issues. Though in some corner
cases, we would still need to repair the volume manually.

Partly addresses #281
dongsupark pushed a commit that referenced this issue Aug 30, 2018
The PR #283 added some
missing code for fixing free space issues. Though in some corner
cases, we would still need to repair the volume manually.

Partly addresses #281