
Failed to destroy AWS node with volume: * aws_volume_attachment.jenkins_disk_attachment: Error waiting for Volume (vol-XXXX) to detach from Instance: i-XXXXX #1991

Closed
hashibot opened this issue Oct 20, 2017 · 5 comments
Labels
bug Addresses a defect in current functionality. service/ec2 Issues and PRs that pertain to the ec2 service. stale Old or inactive issues managed by automation, if no further action taken these will get closed.

Comments

@hashibot

This issue was originally opened by @evgeniagusakova as hashicorp/terraform#16167. It was migrated here as a result of the provider split. The original body of the issue is below.


Hi Team!

I ran into an issue where I'm not able to destroy an instance with a volume attached by Terraform.

Terraform Version

bash-4.3$ uname  -s
Darwin
bash-4.3$ terraform version
Terraform v0.10.6

Issue Explanation

I created an instance and attached a pre-created volume with the following code
(code not related to the issue, such as security groups, EIP, etc., is omitted):

data "aws_ebs_volume" "jenkins_disk" {
  most_recent = true

  filter {
    name   = "volume-id"
    values = ["vol-XXXXX"]
  }
}
resource "aws_volume_attachment" "jenkins_disk_attachment" {
  device_name = "/dev/sdx"
  volume_id   = "${data.aws_ebs_volume.jenkins_disk.id}"
  instance_id = "${aws_instance.jenkins_server.id}"
}

resource "aws_instance" "jenkins_server" {
  connection {
    type        = "ssh"
    user        = "${var.instance_user}"
    host        = "${aws_instance.jenkins_server.public_ip}"
    private_key = "${file("${var.jenkins_server_ssh_key_path}")}"
  }
...
}

And I got an error when I ran terraform destroy:

terraform destroy -var secret_key=<KEY> -var access_key=<KEY>
aws_security_group.jenkins_server_security_group: Refreshing state... (ID: sg-XXXXX)
<other resources here>
data.aws_ebs_volume.jenkins_disk: Refreshing state...
<other resources here>
aws_instance.jenkins_server: Refreshing state... (ID: i-XXXXX)
aws_volume_attachment.jenkins_disk_attachment: Refreshing state... (ID: vai-XXXXX)

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:
<full and correct list of resources to delete here>
Plan: 0 to add, 0 to change, 12 to destroy.
Do you really want to destroy?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.
  Enter a value: yes

<deleting resource>
aws_volume_attachment.jenkins_disk_attachment: Destroying... (ID: vai-XXXXX)
<deleting resources>
aws_volume_attachment.jenkins_disk_attachment: Still destroying... (ID: vai-XXXXX, 10s elapsed)
aws_volume_attachment.jenkins_disk_attachment: Still destroying... (ID: vai-XXXXX, 20s elapsed)
aws_volume_attachment.jenkins_disk_attachment: Still destroying... (ID: vai-XXXXX, 30s elapsed)
aws_volume_attachment.jenkins_disk_attachment: Still destroying... (ID: vai-XXXXX, 40s elapsed)
Error applying plan:
1 error(s) occurred:
* aws_volume_attachment.jenkins_disk_attachment (destroy): 1 error(s) occurred:
* aws_volume_attachment.jenkins_disk_attachment: Error waiting for Volume (vol-XXXXX) to detach from Instance: i-XXXXX

Debug Output

Expected Behavior

The instance is deleted; the volume is detached but not deleted.

Actual Behavior

The EIP is detached
Security groups are not deleted
The instance is not deleted
The volume is not detached

Workaround

On the node:

  1. Stop all processes using the disk mount point (lsof helps to find them)
  2. Sync the disk and unmount it
  3. Detach the volume using the Amazon console
  4. Remove the attachment from state: terraform state rm aws_volume_attachment.jenkins_disk_attachment
  5. Destroy the instance: terraform destroy -var ....

Additional details

I'm fairly sure the disk was not detached because it was mounted and in use. From the Amazon console I was able to detach it only after unmounting the disk.

But I expected a forced detach from Terraform after some (configurable?) timeout.

So if this is expected behaviour, I guess we need to convert this bug report into a feature request :)
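
For what it's worth, the provider documentation for aws_volume_attachment also describes a force_detach argument, which is roughly the forced detach asked for above; the docs warn it can cause data loss if the filesystem is still mounted. A minimal sketch, reusing the resource from the config above:

resource "aws_volume_attachment" "jenkins_disk_attachment" {
  device_name  = "/dev/sdx"
  volume_id    = "${data.aws_ebs_volume.jenkins_disk.id}"
  instance_id  = "${aws_instance.jenkins_server.id}"

  # Ask AWS to force the detach on destroy, even if the volume is still in use
  # on the instance. Last-resort option: a mounted filesystem can lose data.
  force_detach = true
}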

References

Not sure, but this issue looks close to mine.
It was with an old Terraform version, so I decided to create a new issue.

@abinashmeher999
Contributor

@evgeniagusakova You might want to try setting skip_destroy to true. https://www.terraform.io/docs/providers/aws/r/volume_attachment.html#skip_destroy

#1017 would be worth a look.
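
A minimal sketch of that suggestion, reusing the resource from the original report; per the linked documentation, skip_destroy leaves the volume attached at destroy time and only removes the attachment from Terraform state:

resource "aws_volume_attachment" "jenkins_disk_attachment" {
  device_name  = "/dev/sdx"
  volume_id    = "${data.aws_ebs_volume.jenkins_disk.id}"
  instance_id  = "${aws_instance.jenkins_server.id}"

  # Do not detach the volume when this resource is destroyed;
  # just drop the attachment from Terraform state.
  skip_destroy = true
}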

@radeksimko radeksimko added the service/ec2 Issues and PRs that pertain to the ec2 service. label Jan 28, 2018
@RC-2

RC-2 commented Jul 17, 2018

I am facing the same problem, and setting skip_destroy does help.

@romapi

romapi commented Mar 12, 2019

I had a similar issue when destroying an aws_ebs_volume attached to an instance.
For an application to be able to use the volume (disk space) on the instance, the following steps have to be done:

  • create aws volume
  • attach it to an instance
  • create a filesystem on that disk
  • create a mount point folder
  • add mountpoint to fstab
  • mount it
  • start using it (typically run an application)

So I tried to detach a mounted volume from the instance manually, and after a while it failed with a note:
.... /dev/xvdg (busy)
Based on that I came to the conclusion that Terraform cannot forcibly remove a mounted volume from an instance, which is correct. So in order to detach the volume I created a procedure that unmounts the volume safely before detaching it:

  • stop using volume (stop application)
  • unmount it
  • remove the mountpoint from fstab (optional; only if the removal is permanent)
  • remove the mount folder (optional; only if the removal is permanent)
  • detach from instance
  • remove aws volume

In code, it looks like this:

# Commands in variable to create 
# a filesystem and add to fstab
variable "dataCreate" {
  default = [ "echo 'BEGIN: mkfs.ext4 /dev/xvdg'",
              "sudo mkfs.ext4 /dev/xvdg > /dev/null 2>&1 && echo 'END: mkfs.ext4 /dev/xvdg done'",
              "echo 'BEGIN: create /data folder'",
              "sudo mkdir -p /data > /dev/null 2>&1 && echo 'END: create /data folder done'",
              "echo 'BEGIN: Add /dev/xvdg to fstab /data'",
              "sudo sh -c \"echo '/dev/xvdg /data ext4 defaults 1 2' >> /etc/fstab\" && echo 'END: Added /dev/xvdg to fstab /data'",
              "echo 'BEGIN: Mount /dev/xvdg as /data'",
              "sudo mount /bitcoin && echo 'END: Mount /data done'" ]
}
## Variable holding commands to stop application, 
## unmount and remove from fstab
variable "dataUmount" {
  default = [ "echo 'BEGIN: Stop application using /data''",
              "sudo systemctl stop {{APP}} && echo 'END: Application using /data stopped'",
              "echo 'BEGIN: unmount /data'",
              "sudo umount /data && echo 'END: unmount /data done'",
              "echo 'BEGIN: Remove /data from fstab''",
              "sudo sed -i '/data/d' /etc/fstab && echo 'END: Remove /data from fstab'",
              "echo 'BEGIN: Remove /data folder",
              "sudo rm -rfv /data && echo 'END: Added /dev/xvdg to fstab /data'"
            ]
}
## Create a data volume
resource "aws_ebs_volume" "data" {
  availability_zone = "eu-central-1a"
  size = "10"
}
## Attach to instance
resource "aws_volume_attachment" "data_attach" {
  device_name = "/dev/xvdg"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.instance.id}"
}

## Configure volume
resource "null_resource" "data_config" {
  depends_on = ["aws_volume_attachment.data_attach", "aws_instance.instance"]
  
 ## commands to run when creating
  provisioner "remote-exec" {
    connection {
      type = "ssh"
      agent = false
      timeout = "60s"
      host = "${aws_instance.instance.public_ip}"
      user = "ubuntu"
      private_key = "${file(var.privateKey)}"
    }
    inline = ["${var.dataCreate}"]
  }

  ## run unmount commands when destroying the data volume
  provisioner "remote-exec" {
    when = "destroy"
    on_failure = "continue"
    connection {
      type = "ssh"
      agent = false
      timeout = "60s"
      host = "${aws_instance.instance.public_ip}"
      user = "ubuntu"
      private_key = "${file(var.privateKey)}"
    }
    inline = ["${var.dataUmount}"]
  }
}

## Create instance
resource "aws_instance" "instance" {
  depends_on = ["aws_ebs_volume.data"]
  ami = .....
......
}

So the idea is to free the volume before it gets detached.
Hope it helps someone.

@github-actions

github-actions bot commented Mar 2, 2021

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@github-actions github-actions bot added the stale Old or inactive issues managed by automation, if no further action taken these will get closed. label Mar 2, 2021
@github-actions github-actions bot closed this as completed Apr 1, 2021
@ghost

ghost commented May 2, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators May 2, 2021