vMX becomes unreachable after stop and restart #220

mbound · 2020-02-04T14:12:54Z

If I stop a vMX with docker stop or docker kill and then start it again, the restarted router is unreachable via both SSH and telnet.

I couldn't find anything obvious in the logs that would explain this silent failure, and I haven't been able to access the vCP or vFPC VMs from inside the container either.

Anyone had the same issue?


plajjan · 2020-02-05T21:47:12Z

In general, vrnetlab routers are not tested with stop / start. The assumption is that they are created (docker run) and then live until they are stopped and destroyed. Restarts aren't necessarily supported; if one works, it is more by happenstance than by design.

I think it could conceptually work to restart a container, but there might be pieces missing. I can assist you if you would like to look into the issue.
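
Until restarts work, the create-then-destroy lifecycle described above can be scripted. This is only a rough sketch, not part of vrnetlab: the helper names `recreate_commands` and `recreate` are made up for illustration, the container name and image tag are whatever you built locally, and `--privileged` is assumed because the vrnetlab images are normally started that way.

```python
import subprocess


def recreate_commands(name, image):
    """Build the docker commands that destroy a container and create a fresh one.

    Hypothetical helper: 'name' and 'image' are supplied by the caller.
    """
    return [
        # Remove the old container (force-stops it if it is still running).
        ["docker", "rm", "-f", name],
        # Start a brand new container; --privileged is an assumption here.
        ["docker", "run", "-d", "--privileged", "--name", name, image],
    ]


def recreate(name, image):
    """Run the commands: destroy the old router, then start a new one."""
    remove_cmd, run_cmd = recreate_commands(name, image)
    # The removal is allowed to fail, e.g. when no such container exists yet.
    subprocess.run(remove_cmd, check=False)
    # A failure to start the new container raises CalledProcessError.
    subprocess.run(run_cmd, check=True)
```

Note that removing the container discards whatever state the old router held, so any configuration has to be applied again after the new one boots.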

mbound · 2020-02-09T16:54:52Z

Thanks. Looking at the trace after a docker stop/start, I can see the following:

MPC NG 2E/3E JAM Plugin: load succeeded
 bcmsdk_5_9_x kld
 Loading BCMSDK module.....
Junosprocfs mounted on /junosproc.
@ 1581266781 [2020-02-09 16:46:21 UTC] mtx_init product vmx
@ 1581266781 [2020-02-09 16:46:21 UTC] mtx_init pvi_model
@ 1581266781 [2020-02-09 16:46:21 UTC] mtx_init tvp_mode 0
@ 1581266781 [2020-02-09 16:46:21 UTC] mgd start
Creating initial configuration:  ...

2020-02-09 16:46:31,145: launch     TRACE    OUTPUT VCP: mgd: error: commit-script
mgd: error:   unable to release privileges
mgd: error: 1 error reported by translation scripts
mgd: error: translation script failure
Warning: Failed to commit active configuration.
Warning: Trying to commit recovery mode configuration.

2020-02-09 16:46:35,151: launch     TRACE    OUTPUT VCP: mgd: error: Unable to find the rescue configuration file: /config/rescue.conf
Warning: Commit failed, activating partial configuration.
Warning: Edit the router configuration to fix these errors.
@ 1581266794 [2020-02-09 16:46:34 UTC] mgd done

 Lock Manager
RDM Embedded 7 [04-Aug-2006] http://www.birdstep.com
Copyright (c) 1992-2006 Birdstep Technology, Inc.  All Rights Reserved.

Unix Domain sockets Lock manager
Lock manager 'lockmgr' started successfully.

Database Initialization Utility
RDM Embedded 7 [04-Aug-2006] http://www.birdstep.com
Copyright (c) 1992-2006 Birdstep Technology, Inc.  All Rights Reserved.


2020-02-09 16:46:39,155: launch     TRACE    OUTPUT VCP: Profile database initialized
Set Enhanced BBE Default...
Enhanced BBE Default for vmx set to... 2
cp: /var/etc/login.conf: No such file or directory
invalid user: getpwuid failsSet Enhanced BBE to 2
lag enhanced disabled 0
No core dumps found.
chown: nobody: illegal user name
Prefetching /usr/sbin/rpd ...
Prefetching /usr/libexec64/rpd ...
Prefetching /usr/sbin/lacpd ...
Prefetching /usr/sbin/chassisd ...
Starting jlaunchhelperd.
Invoking jdid_diag_mode_setup.sh on junos
Starting cron.

Sun Feb  9 16:46:38 UTC 2020

But I don't think that mgd error is the main issue, because I should still be able to telnet into the vCP, unless it's causing JunOS to hang in some sort of loop.
My understanding is that it shouldn't even get to this stage if it gets a login prompt, which to me means the issue might be something earlier...?
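
For anyone comparing their own trace against the one above, here is a small sketch that pulls out just the `mgd: error:` messages so they are easier to spot among the boot noise. The function name `mgd_errors` is made up for illustration, and it assumes the trace is available as plain text in the same shape as the output pasted here.

```python
import re

# Matches the text that follows "mgd: error:" anywhere on a line, so it also
# catches lines prefixed with the launch TRACE timestamp.
MGD_ERROR = re.compile(r"mgd: error:\s*(.*)")


def mgd_errors(trace):
    """Return the message part of every 'mgd: error:' line in the trace text."""
    messages = []
    for line in trace.splitlines():
        match = MGD_ERROR.search(line)
        if match:
            messages.append(match.group(1).strip())
    return messages
```

It only filters text; it says nothing about which of the errors, if any, is the one that leaves the router unreachable.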

aaglenn mentioned this issue May 21, 2020

docker stop, then docker stop puts vMX container to Unhealthy state #226

Closed
