Reboot and Wait for #14413
Comments
add
as a workaround to avoid the connection shutting before Ansible can 'reap' the temp files and close the connection. |
Just as info, this is documented at https://support.ansible.com/hc/en-us/articles/201958037-Reboot-a-server-and-wait-for-it-to-come-back although it is not maintained in this repo's docs and does not show up at docs.ansible.com |
@bcoca Added as you said, but I still ran into the same error. I have to use ignore_errors: true to skip that error.
Error: |
I had the same problem with 2.0.0.2; this workaround helped me:
- name: Wait for server come back
  wait_for: >
    host="{{ inventory_hostname }}"
    port=22
    delay=15
    timeout=60
  delegate_to: localhost
|
You may want to try: shell: sleep 2 && /sbin/shutdown -r now |
@andyhky it worked!! :) thanks! The final solution that works in Ansible 2.1 is as follows:
- name: Restart server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "Ansible system package upgraded"
- name: waiting 30 secs for server to come back
  local_action: wait_for host={{ ansible_default_ipv4.address }} port=22 state=started delay=30 timeout=60
  become: false
|
@sayantandas does the solution still work for you? I am using ansible 2.1.1.0 and get the following: |
I found this answer that solved the problem for me: http://stackoverflow.com/a/39174307
- name: Restart server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "Ansible system package upgraded"
  async: 1
  poll: 0
|
The |
I can confirm this broke completely on 2.1 in our install. We had it working on 1.9 in the "1.9" way, upgraded Ansible to 2.1, modified the task to the "2.1" way, and it breaks every time. |
This solution kinda works with my install. However, the |
After a bit of trial and error with various solutions posted for various versions, the following is working for me on 2.1.2 with an Ubuntu 16.04 guest VM and OS X host using Vagrant (1.8.6) and VirtualBox (5.1.8).
- name: "Reboot if required"
  shell: sleep 2 && shutdown -r now 'Reboot required' removes=/var/run/reboot-required
  become: true
  async: 1
  poll: 0
  ignore_errors: true
- name: "Wait for reboot"
  local_action: wait_for host={{ ansible_default_ipv4.address }} port=22 delay=10 state=started
  become: false
@furiml: Not sure if this applies to what you're trying to do, but this second task will poll every 10 seconds (default) after a 10 second delay to see if port 22 on the guest machine is open before continuing, i.e. it won't take the full allocated timeout value. |
An update of the docs and/or the support article to use the preferred full YAML format for tasks would also be nice. This works for me:
- name: reboot nodes
  shell: sleep 2 && shutdown -r now "Ansible reboot"
  async: 1
  poll: 0
  ignore_errors: true
- name: wait for server to come back
  local_action: wait_for
  args:
    host: "{{ inventory_hostname }}"
    port: 22
    state: started
    delay: 30
    timeout: 300
|
I wrote something else to test this. Instead of waiting for a host to be up, I want to wait for it to be down.
- name: "Wait for the machine to be down"
  local_action: wait_for
  args:
    host={{target}}
    port=22
    state=stopped
    delay=1
    timeout=3600
  become: false
If I understood correctly, this will poll port 22 of my target every second and will only continue once it is closed. I shut down the machine myself, but Ansible has been stuck for 5 minutes now :( |
@martineg that works great! It's now included in the Galaxy role |
On ansible 2.2 this does not reboot my computer. It simply says that job is started, and then waits for 22 port. But node does not reboot! |
I have the same issue as @sashgorokhov on Ubuntu 16.04/ansible 2.2.1.0. Just says "OK" and doesn't reboot. |
@noaho maybe this tiny code snippet could help you:
tasks:
  - shell: shutdown -r now
This simply reboots the node without waiting for it (in my case I really don't need to wait for it to reboot) |
@sashgorokhov unfortunately I need it to reboot. I worked around it with the at command, but that wastes 1 minute before taking any action, so I'd prefer to have this working. |
Trying to get this working on CentOS 7.3 servers from an F25 workstation with Ansible 2.2.1, but it doesn't seem to be working. Any workaround? EDIT: |
Having the same problem. Any updates on this matter? Looks like many folks are facing this issue |
This is still a problem (Mac, version 2.3.0.0); the target is a Fedora instance in AWS. None of the above workarounds worked for me (they wouldn't error, but also didn't reboot it), so I did the following (where delayed_reboot is just a shell script that sleeps and then reboots):
- copy:
    src: files/delayed_reboot
    dest: /tmp/delayed_reboot
    owner: root
    group: root
    mode: 0700
- name: Restart machine
  shell: nohup /tmp/delayed_reboot &
  async: 1
  poll: 0
  ignore_errors: true
  become: true
  become_method: sudo
  when: new_kernel.changed or new_kernel_headers.changed
- name: Wait for machine to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=20
      timeout=300
      state=started
  become: false
  when: new_kernel.changed or new_kernel_headers.changed
|
ISSUE TYPE
COMPONENT NAME
Shared connection
ANSIBLE VERSION
CONFIGURATION
OS / ENVIRONMENT
Ansible host: macOS Sierra 10.12.4
SUMMARY
STEPS TO REPRODUCE
- name: install python and deps for ansible modules
  raw: dnf install -y python2 python2-dnf libselinux-python
- name: gather facts
  setup:
- name: Install new Kernel
  dnf:
    name: https://kojipkgs.fedoraproject.org//packages/kernel/4.9.13/201.fc25/x86_64/kernel-core-4.9.13-201.fc25.x86_64.rpm
  register: new_kernel
- name: Install new Kernel headers
  dnf:
    name: https://kojipkgs.fedoraproject.org//packages/kernel/4.9.13/201.fc25/x86_64/kernel-headers-4.9.13-201.fc25.x86_64.rpm
  register: new_kernel_headers
- name: Restart machine
  command: reboot
  async: 1
  poll: 0
  ignore_errors: true
  become: true
  become_method: sudo
  when: new_kernel.changed or new_kernel_headers.changed
- name: Wait for machine to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=20
      timeout=300
      state=started
  become: false
  when: new_kernel.changed or new_kernel_headers.changed
EXPECTED RESULTS
The target should reboot properly and ansible continue the playbook.
ACTUAL RESULTS
See output below with -vvv
|
@peterwillcn IMO You better use wait_for_connection instead of wait_for, see: http://docs.ansible.com/ansible/latest/wait_for_connection_module.html It's not just easier, it also works over jumphosts or proxies, using the exact same transport Ansible uses for the target node. |
@dagwieers how is the reboot action plugin coming along ? |
Found a couple roles out there to take care of this: |
@afeld that role looks great. |
My issue with all the examples I have seen is that they rely on some random wait time before starting to poll for the ssh port. Different hosts need different amounts of time to shut down their processes, depending on what they are doing, so you either set a long delay to cover the worst offender or you risk false positives.
The new wait_for_connection just uses ping and another random delay factor (see above), so again there is a huge risk of false positives (confirmed by Red Hat support).
The way I have made this slightly more robust is to use 2 tasks. The first one waits for the ssh port to be absent. It starts immediately, polls every second, and has a maximum wait of 15 mins. That should be plenty of time for server processes to shut down, and it means you only have to wait for the regular OS services to stop.
Once ssh is no longer running, task 2 starts: wait for the ssh port state to be started, after a 1 min delay.
Note that the wait_for port check doesn't rely on ssh; it uses a Python socket to determine whether the port is up.
Andy
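The two-task approach described above could be sketched roughly as follows. This is an illustration rather than the poster's actual playbook: the 10 minute timeout on the second task and the use of inventory_hostname as the probe address are assumptions, and the sleep option of wait_for is only available from Ansible 2.3. As noted later in the thread, probing the port from the controller only works when the controller can reach the host directly (no jumphost or proxy).

```yaml
# Sketch only. Timeout values and the probe address are assumptions.
- name: Wait for the SSH port to close (host is going down)
  wait_for:
    host: "{{ inventory_hostname }}"
    port: 22
    state: stopped      # succeeds once nothing is listening on the port
    sleep: 1            # poll every second
    timeout: 900        # give up after 15 minutes
  delegate_to: localhost
  become: false

- name: Wait for the SSH port to open again (host is back up)
  wait_for:
    host: "{{ inventory_hostname }}"
    port: 22
    state: started
    delay: 60           # start polling after 1 minute
    timeout: 600        # assumed upper bound, not from the thread
  delegate_to: localhost
  become: false
```

Because the first task only passes once the port has actually closed, the second task cannot be satisfied by the SSH daemon of a host that has not yet gone down.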
|
@akcrisp And the problem with your implementation is that it fails for anything but the simple direct-connection use-case. The wait_for_connection module used to do this as well, but it fails for ssh_proxy, or other proxied transport connections, so we had to remove it. You can make the delay-time configurable per system/group or other characteristics, but that's not ideal. |
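A per-group delay, as mentioned in the comment above, might look something like the sketch below. The variable name reboot_wait_delay and the group_vars file are hypothetical, chosen only for illustration; the fallback of 30 seconds is likewise an assumption.

```yaml
# Hypothetical: set a longer delay for slow hosts, e.g. in group_vars/databases.yml
#   reboot_wait_delay: 180

- name: Wait for the host to come back
  wait_for_connection:
    delay: "{{ reboot_wait_delay | default(30) }}"   # seconds before the first check
    timeout: 600                                      # assumed upper bound
```

This keeps the delay tunable per group or host, though as the comment says it is still a fixed guess rather than a real check that the host went down.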
Agreed, but it's not clear how you fix this without it. A non-deterministic, finger-in-the-air random delay?
|
Isn't it possible to add ssh check to |
this worked well for me:
|
This worked for me on Ansible 2.4.2.0 and Ubuntu 16.04 LTS on Azure
|
I guess my use case was much more complex. Here's mine written as a handler:
- name: Inform of reboot required
  listen: reboot machine
  debug:
    msg: "System {{ inventory_hostname }} needs to be rebooted for changes to take effect"
- name: Update GRUB to pick up changes to default config, if any
  command: update-grub2
  listen: reboot machine
# Send the reboot command and let it run in the background
# so we can disconnect...
- name: Send reboot command
  listen: reboot machine
  shell: '(sleep 5; shutdown -r now) &'
- name: Clear host errors
  listen: reboot machine
  meta: clear_host_errors
  failed_when: false
- name: Reset connection
  listen: reboot machine
  meta: reset_connection
  failed_when: false
- name: Wait for SSH to be available
  listen: reboot machine
  local_action: wait_for
  args:
    host: "{{ ansible_host }}"
    port: "{{ ansible_port | default('22') }}"
    delay: 60
    state: started
- name: Ansible ping
  listen: reboot machine
  local_action: ping
  register: result
  until: result.ping is defined and result.ping == 'pong'
  retries: 30
  delay: 10
- name: Run uptime
  listen: reboot machine
  command: uptime
# LACP and spanning-tree take a bit of time to start working
- name: Ping default gateway
  listen: reboot machine
  command: "ping -c 1 {{ ansible_default_ipv4.gateway }}"
  register: result
  until: result.rc == 0
  retries: 30
  delay: 10
|
Here's my solution (Ansible 2.4.2):
- name: restart machine
  shell: nohup sh -c '(sleep 5; shutdown -r now "Ansible restart") &' &>/dev/null
  become: yes
- name: wait for machine to restart
  wait_for_connection:
    delay: 60
    sleep: 5
    timeout: 300
|
this worked for me:
|
All these workarounds are interesting, but the real fix will be
Right? (from #14413 (comment)) |
Confirmed. |
looking forward to it |
I am interested to know whether any reboot module will support Unix flavours beyond Linux, i.e. AIX, Solaris, etc. I assume it works with Windows?
The point I made with my example, which most seem to have missed, is that by simply waiting on a timeout for port 22 it's entirely possible to get a false positive. If a host takes longer to shut down its processes (think of a large database) than the delay factor, then it may well not have actually rebooted yet. I have tested and proved that this can happen, hence my test to ensure ssh is absent first.
Andy
|
ansible-playbook test.yaml The error appears to have been in '/etc/ansible/test.yaml': line 4, column 10, but may The offending line appears to be:
help please ;-) |
Try this. |
Grief - I wish people would read what I have done. Everyone who just waits for a timeout risks a false positive. It only takes an app a long time to shut down and the check will think ssh is up after a reboot that has not happened yet. I have tested it.
You are far better off checking that ssh is absent first. That check doesn't rely on ssh; it uses a Python socket connection.
On 16 Apr 2018, at 09:21, Ben Abineri ***@***.***> wrote:
---
- hosts: all
  tasks:
    - name: restart the system
      # "&&" so that the reboot only fires after the 5 second sleep
      shell: "sleep 5 && reboot"
      async: 1
      poll: 0
    - name: wait for the system to reboot
      wait_for_connection:
        connect_timeout: 20
        sleep: 5
        delay: 5
        timeout: 60
Try this.
|
You're right - my comment wasn't an endorsement of the design, I just wanted to demonstrate the correct formatting. |
So we now have a reboot and win_reboot action plugin to reboot Unix and Windows servers. If you have any issues with the existing implementation, feel free to open a new issue with any specifics. |
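For readers arriving from the workarounds above, a minimal use of the reboot module might look like the sketch below. The timeout and message values are illustrative assumptions, not recommendations from the maintainers; the module handles both the reboot and the wait for the host to return, so no separate wait_for task is needed.

```yaml
# Sketch only. reboot_timeout and msg values are illustrative.
- name: Reboot and wait for the host to come back
  reboot:
    reboot_timeout: 300     # seconds to wait for the host to respond again
    msg: "Ansible reboot"   # message shown to logged-in users
  become: true
```

For Windows targets the equivalent is win_reboot, as the comment above mentions.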
ISSUE TYPE
COMPONENT NAME
wait_for
ANSIBLE VERSION
v2.2
SUMMARY
Hi,
I have the below as part of my playbook to upgrade all system packages, reboot the machine and wait for it to come back. The Ansible playbook exits when the machine reboots; it does not wait for the host to come back online and run the rest of the playbook. Can you please suggest a fix?
The reboot works, but the playbook is unusable: it lost its connection, as shown in the error above.
Let me know if any details required. Thanks.
Thanks,
Govind