vMX becomes unreachable after stop and restart #220

Open
mbound opened this issue Feb 4, 2020 · 2 comments

@mbound

mbound commented Feb 4, 2020

If I stop a vMX with docker stop or docker kill and then start it again, the newly restarted router is unreachable, either via SSH or telnet.

I couldn't find anything obvious in the logs that would explain this silent failure, and I've not been able to access the vCP or vFPC VMs from inside the container either.

Anyone had the same issue?
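
For anyone trying to reproduce this, here is a minimal sketch of the stop/start sequence followed by a reachability probe. It only uses `docker stop` and `docker start` as described above. The container name, the host address and the choice of TCP ports 22 and 23 are assumptions for illustration, not values from this report. A successful TCP connect is also only a coarse signal, since a port forwarder in front of the VM may accept the connection even when the router behind it never answers.

```python
import socket
import subprocess
import time


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def restart_and_probe(container, host, ports=(22, 23), wait=600.0, interval=10.0):
    """Stop and start a container, then poll the TCP ports until `wait` expires.

    `container` and `host` are supplied by the caller (hypothetical values);
    ports 22 and 23 are assumed to be where SSH and telnet are exposed.
    """
    subprocess.run(["docker", "stop", container], check=True)
    subprocess.run(["docker", "start", container], check=True)

    deadline = time.monotonic() + wait
    reachable = {p: False for p in ports}
    while time.monotonic() < deadline and not all(reachable.values()):
        for p in ports:
            # Once a port has answered, keep it marked as reachable.
            reachable[p] = reachable[p] or port_open(host, p)
        if not all(reachable.values()):
            time.sleep(interval)
    return reachable
```

Calling `restart_and_probe("my-vmx", "172.17.0.2")` (both values hypothetical) returns a dict such as `{22: False, 23: False}` if neither port answers before the deadline.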

@plajjan
Collaborator

plajjan commented Feb 5, 2020

In general, vrnetlab routers are not tested with stop / start. It is assumed they are created (docker run) and then live until they are stopped and destroyed. Restarts aren't necessarily supported. If it works, it is more by happenstance than by design.

I think it could conceptually work to restart a container, but there might be pieces missing. I can assist you if you would like to look into the issue.
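
Until restarts are looked into, the lifecycle described above (create with docker run, then destroy) can be sketched as a "recreate instead of restart" helper. This is only an illustration under stated assumptions: the container name and image tag are hypothetical, and the `--privileged` flag is assumed from typical vrnetlab usage rather than taken from this thread. Note that recreating the container discards any state held inside it.

```python
import subprocess


def recreate_commands(name, image):
    """Build the docker commands that destroy and recreate a container.

    `name` and `image` are hypothetical placeholders supplied by the caller.
    """
    return [
        # Depending on the Docker version, this may fail if the container
        # does not exist yet.
        ["docker", "rm", "-f", name],
        ["docker", "run", "-d", "--privileged", "--name", name, image],
    ]


def recreate(name, image, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    cmds = recreate_commands(name, image)
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return cmds
```

With `dry_run=True` (the default) nothing is executed, so the commands can be reviewed before running them for real.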

@mbound
Author

mbound commented Feb 9, 2020

Thanks. Looking at the trace after a docker stop/start, I can see the following:

MPC NG 2E/3E JAM Plugin: load succeeded
 bcmsdk_5_9_x kld
 Loading BCMSDK module.....
Junosprocfs mounted on /junosproc.
@ 1581266781 [2020-02-09 16:46:21 UTC] mtx_init product vmx
@ 1581266781 [2020-02-09 16:46:21 UTC] mtx_init pvi_model
@ 1581266781 [2020-02-09 16:46:21 UTC] mtx_init tvp_mode 0
@ 1581266781 [2020-02-09 16:46:21 UTC] mgd start
Creating initial configuration:  ...

2020-02-09 16:46:31,145: launch     TRACE    OUTPUT VCP: mgd: error: commit-script
mgd: error:   unable to release privileges
mgd: error: 1 error reported by translation scripts
mgd: error: translation script failure
Warning: Failed to commit active configuration.
Warning: Trying to commit recovery mode configuration.

2020-02-09 16:46:35,151: launch     TRACE    OUTPUT VCP: mgd: error: Unable to find the rescue configuration file: /config/rescue.conf
Warning: Commit failed, activating partial configuration.
Warning: Edit the router configuration to fix these errors.
@ 1581266794 [2020-02-09 16:46:34 UTC] mgd done

 Lock Manager
RDM Embedded 7 [04-Aug-2006] http://www.birdstep.com
Copyright (c) 1992-2006 Birdstep Technology, Inc.  All Rights Reserved.

Unix Domain sockets Lock manager
Lock manager 'lockmgr' started successfully.

Database Initialization Utility
RDM Embedded 7 [04-Aug-2006] http://www.birdstep.com
Copyright (c) 1992-2006 Birdstep Technology, Inc.  All Rights Reserved.


2020-02-09 16:46:39,155: launch     TRACE    OUTPUT VCP: Profile database initialized
Set Enhanced BBE Default...
Enhanced BBE Default for vmx set to... 2
cp: /var/etc/login.conf: No such file or directory
invalid user: getpwuid failsSet Enhanced BBE to 2
lag enhanced disabled 0
No core dumps found.
chown: nobody: illegal user name
Prefetching /usr/sbin/rpd ...
Prefetching /usr/libexec64/rpd ...
Prefetching /usr/sbin/lacpd ...
Prefetching /usr/sbin/chassisd ...
Starting jlaunchhelperd.
Invoking jdid_diag_mode_setup.sh on junos
Starting cron.

Sun Feb  9 16:46:38 UTC 2020

But I don't think that mgd error is the main issue, because I should still be able to telnet into the vCP, unless it's causing JunOS to hang in a loop of sorts.
My understanding is it shouldn't even get to this stage if it gets a login prompt, which to me means the issue might be something earlier...?
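
To compare a fresh boot against a restarted one, it may help to pull just the error and warning lines out of each trace. The sketch below assumes the trace has been saved as text and that the lines of interest start with `mgd: error:` or `Warning:`, optionally behind the `launch TRACE OUTPUT VCP:` prefix, as in the excerpt above. The function name is hypothetical.

```python
import re

# Matches the timestamped "launch TRACE OUTPUT VCP: " prefix seen in the trace.
TRACE_PREFIX = re.compile(
    r"^\d{4}-\d{2}-\d{2} [\d:,]+: launch\s+TRACE\s+OUTPUT VCP: "
)


def extract_problems(trace):
    """Return the mgd error and warning lines from a boot trace, in order."""
    problems = []
    for raw in trace.splitlines():
        # Drop the launcher prefix, if present, so both forms compare equal.
        line = TRACE_PREFIX.sub("", raw).strip()
        if line.startswith("mgd: error:") or line.startswith("Warning:"):
            problems.append(line)
    return problems
```

Running this on the trace from a first boot and on the trace after a stop/start, then diffing the two lists, would show whether the commit-script failure only appears after the restart.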
