Troubleshooting
- Understanding Nodes and Devices
- Understanding Netdisco Jobs
- My Device details look all wrong!
- Devices are not being discovered
- Devices have the wrong names
- arpnip does not work
- After OS update or upgrade, Netdisco fails
- Run a netdisco-do Task with Debugging
- Dump an SNMP object for a Device
- Interactive SQL terminal on the Netdisco Database
- See how Netdisco has parsed the Configuration File
- Change the SNMP community string for a Device
- Database Schema Redeployment
- Installation on SLES 11 SP4
- perl-TermReadLine-Gnu operating system packages
Understanding Nodes and Devices
The two basic components in Netdisco’s world are Nodes and Devices.
Devices are your network hardware, such as routers, switches, and firewalls.
Devices respond to SNMP, and therefore can report useful information about themselves such as interfaces, operating system, IP addresses, as well as knowledge of other systems via MAC address and ARP tables. Devices are actively contacted by Netdisco during a discover (and other polling jobs such as macsuck, arpnip).
Netdisco discovers Devices using "neighbor protocols" such as CDP and LLDP, and peers configured in routing protocols BGP, OSPF, EIGRP and IS-IS. We assume your Devices are running these layer 2 or layer 3 protocols and learning about their connections to each other. If they aren’t, you’ll need to configure manual topology within the web interface (or simply have standalone Devices).
Nodes are the end-stations connected to Devices, such as workstations, servers, printers, and telephones.
Nodes, on the other hand, are passive as far as Netdisco is concerned. The only job that contacts a node is nbtstat, which makes NetBIOS queries. Nodes are learned about via the MAC and ARP tables on upstream Devices.
Because Netdisco only learns about Devices through certain protocols, it’s possible to run an SNMP agent on a Node (end-station) without Netdisco treating it as a Device. Only if the Node is also advertising itself via a neighbor or routing protocol will Netdisco treat it as a Device. This can account for undesired behaviour, such as a server (Node) being treated as a Device or, vice versa, a switch (Device) being recognised only as a Node.
To prevent discovery of devices, use the devices_no configuration setting.
If you don’t see links between Devices in Netdisco, it might be because they’re not running a neighbor protocol, or for some reason not reporting the relationships to Netdisco. Use the show command to troubleshoot this:
~/bin/netdisco-do show -d 192.0.2.1 -e c_id
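To check several Devices in one go, the show command above can be wrapped in a small shell loop. This is only a sketch: the function name check_neighbors and the NETDISCO_DO variable are hypothetical helpers introduced here, and the IP addresses are documentation placeholders.

```shell
# Path to netdisco-do; override NETDISCO_DO if yours lives elsewhere (assumption).
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# check_neighbors is a hypothetical helper: dump the c_id object
# for each Device IP given as an argument.
check_neighbors() {
  for ip in "$@"; do
    echo "== $ip =="
    "$NETDISCO_DO" show -d "$ip" -e c_id
  done
}

# Usage: check_neighbors 192.0.2.1 192.0.2.2
```

A Device that returns nothing for c_id is a likely candidate for manual topology configuration.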
Understanding Netdisco Jobs
Please read the section above, if you’ve not yet done so.
Netdisco has four principal job types:
- discover: Gather information about a Device, including interfaces, vlans, PoE status, and chassis components (modules). Also learns about potential new Devices via neighbor protocols and adds jobs for their discovery to the queue.
- macsuck: Gather MAC to port mappings from known Devices reporting Layer 2 capability. Wireless client information is also gathered from Devices supporting the 802.11 MIBs.
- arpnip: Gather MAC to IP mappings from known Devices reporting layer 3 capability.
- nbtstat: Poll a Node to obtain its NetBIOS name.
The jobs named above each operate on one device only. The complementary job types discoverall, macwalk, arpwalk, and nbtwalk enqueue one corresponding single-device job for each known device. The Netdisco backend daemon then processes the queue (in a random order).
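The pairing between single-device jobs and their queue-filling counterparts can be summarised as a small lookup. This is a sketch only: walk_job_for is a hypothetical helper written for illustration and is not part of Netdisco.

```shell
# walk_job_for is a hypothetical helper: print the job type that enqueues
# the given single-device job for every known device.
walk_job_for() {
  case "$1" in
    discover) echo discoverall ;;
    macsuck)  echo macwalk ;;
    arpnip)   echo arpwalk ;;
    nbtstat)  echo nbtwalk ;;
    *)        echo "unknown job type: $1" >&2; return 1 ;;
  esac
}

walk_job_for macsuck   # prints: macwalk
```

The printed name could then be handed to netdisco-do, assuming your installation accepts the walk job names as actions.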
My Device details look all wrong!
See the tips at Vendor Tips, or else contact the mailing list.
Devices are not being discovered
First of all, is the backend daemon running? Run the following command at the CLI to check:
~/bin/netdisco-backend status
Besides reading the whole of this manual page for general tips, take a look at the "SNMP Connect Failures" report under the Admin menu. Any devices listed have had multiple SNMP connect failures, indicating a possible configuration error on the device or in Netdisco’s configuration.
Devices have the wrong names
Netdisco uses neighbor protocols to discover devices and will use as the default identity for a device the interface IP advertised over those neighbor protocols. You can use the device_identity configuration setting to steer Netdisco towards using a different interface for the canonical device name.
arpnip does not work
Certain devices don’t expose their ARP tables via SNMP, or SNMP::Info does not support their proprietary ARP mapping. For these cases we offer an alternative via SSH on certain platforms. Check MetaCPAN for supported platforms (those in the App::Netdisco::SSHCollector::Platform namespace).
After OS update or upgrade, Netdisco fails
If you upgrade the operating system then your system libraries will change and Netdisco needs to be rebuilt (specifically the C library bindings, which affect Perl XS modules and other items).
The safest way to do this is to set up a new user and follow the same install instructions, connecting to the same database. Stop the web and backend daemons for the old user, and start them for the new user. Then delete the old user account.
Alternatively, if you do not mind the downtime: stop the web and backend daemons, then delete the ~/perl5 directory and reinstall from scratch. The configuration file, database, and MIBs can all be reused in-place.
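The destructive part of the in-place rebuild could be scripted roughly as follows. This is a sketch under assumptions: run and rebuild_prepare are hypothetical helper names, the daemons are assumed to be controlled with ~/bin/netdisco-web and ~/bin/netdisco-backend and to accept a stop argument, and nothing is executed unless DRY_RUN=0 is set.

```shell
# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# rebuild_prepare is a hypothetical helper: stop both daemons (assumed to
# accept "stop"), then remove the Perl library tree under ~/perl5.
rebuild_prepare() {
  run "$HOME/bin/netdisco-web" stop
  run "$HOME/bin/netdisco-backend" stop
  run rm -rf "$HOME/perl5"
}

# With the default DRY_RUN=1 this only prints the three commands.
rebuild_prepare
```

After a real run (DRY_RUN=0), reinstall by following the normal install instructions. The configuration file, database, and MIBs are not touched by these steps.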
Run a netdisco-do Task with Debugging
The netdisco-do command has several debug flags which will show what’s going on internally. You will usually add -D for general Netdisco debugging, then -I for SNMP::Info logging and -Q for SQL tracing. For example:
~/bin/netdisco-do discover -d 192.0.2.1 -DIQ
You will see that SNMP community strings and users are hidden by default, to make the output safe for sending to Netdisco developers. To show the community string and SNMPv3 protocols, set the SHOW_COMMUNITY environment variable:
SHOW_COMMUNITY=1 ~/bin/netdisco-do discover -d 192.0.2.1 -DIQ
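Debug output is long, so it is often easier to capture it in a file. The following is a minimal sketch: debug_job and the NETDISCO_DO variable are hypothetical helpers, and the default log file name is an arbitrary choice.

```shell
# Path to netdisco-do; override NETDISCO_DO if yours lives elsewhere (assumption).
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# debug_job is a hypothetical helper: run one job against one device with
# the -DIQ debug flags and save stdout and stderr to a log file.
debug_job() {
  job=$1
  ip=$2
  log=${3:-netdisco-$job-$ip.log}
  "$NETDISCO_DO" "$job" -d "$ip" -DIQ >"$log" 2>&1
  echo "debug output written to $log"
}

# Usage: debug_job discover 192.0.2.1
```

Because the helper does not set SHOW_COMMUNITY, the captured log keeps community strings hidden as long as that variable is not already set in your shell.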
Dump an SNMP object for a Device
This is useful when trying to work out why some information isn’t displaying correctly (or at all) in Netdisco. It may be that the SNMP response isn’t understood. Netdisco can dump any leaf or table, by name:
~/bin/netdisco-do show -d 192.0.2.1 -e interfaces
~/bin/netdisco-do show -d 192.0.2.1 -e ::interfaces
~/bin/netdisco-do show -d 192.0.2.1 -e Layer2::HP::interfaces
You can combine this with SNMP::Info debugging, shown above (-I).
Interactive SQL terminal on the Netdisco Database
Start an interactive terminal with the Netdisco PostgreSQL database. If you pass an SQL statement in the "-e" option then it will be executed.
~/bin/netdisco-do psql
~/bin/netdisco-do psql -e 'SELECT ip, dns FROM device'
~/bin/netdisco-do psql -e 'COPY (SELECT ip, dns FROM device) TO STDOUT WITH CSV HEADER'
The last example above is useful for sending data to Netdisco developers, as it’s more compact and readable than the standard tabular output (second example).
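The COPY form can be wrapped so that any SELECT statement is exported to a CSV file. This is a sketch only: export_csv and the NETDISCO_DO variable are hypothetical helpers, and the query is passed through unmodified, so it must be trusted input.

```shell
# Path to netdisco-do; override NETDISCO_DO if yours lives elsewhere (assumption).
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# export_csv is a hypothetical helper: wrap a SELECT statement in
# COPY ... TO STDOUT WITH CSV HEADER and write the result to a file.
export_csv() {
  query=$1
  outfile=$2
  "$NETDISCO_DO" psql -e "COPY ($query) TO STDOUT WITH CSV HEADER" >"$outfile"
}

# Usage: export_csv 'SELECT ip, dns FROM device' devices.csv
```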
See how Netdisco has parsed the Configuration File
At the command line you can dump the complete parsed configuration or one configuration setting (replace settingname with the setting key):
~/bin/netdisco-do dumpconfig
~/bin/netdisco-do dumpconfig -e settingname
Note that the device_auth and snmp_auth settings are not dumped by default. They can be retrieved for a specific device:
~/bin/netdisco-do dumpconfig -d 192.0.2.1 -e device_auth
Change the SNMP community string for a Device
If you change the SNMP community string in use on a Device, and update Netdisco’s configuration to match, then everything will continue to work fine.
However, if the Device happens to support two community strings then Netdisco can become "stuck" on the wrong one, as it caches the last-known-good community string to improve performance. To work around this, delete the device (either in the web GUI or using netdisco-do at the command line), and then re-discover it.
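At the command line, the delete and re-discover steps might look like the sketch below. It assumes that netdisco-do provides a delete action taking -d, which you should confirm against the netdisco-do documentation for your version. rediscover and NETDISCO_DO are hypothetical helper names.

```shell
# Path to netdisco-do; override NETDISCO_DO if yours lives elsewhere (assumption).
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# rediscover is a hypothetical helper: delete a device so that its cached
# community string is forgotten, then discover it again.
# Assumption: netdisco-do has a "delete" action that accepts -d.
rediscover() {
  ip=$1
  "$NETDISCO_DO" delete -d "$ip" &&
    "$NETDISCO_DO" discover -d "$ip"
}

# Usage: rediscover 192.0.2.1
```

The discover step only runs if the delete succeeded, so a failed delete does not leave you re-polling with the stale community string.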
Database Schema Redeployment
The database schema can be redeployed (even over an existing installation), in a safe way, using the following command:
~/bin/netdisco-db-deploy --redeploy-all
To drop all tables and data from the database, and redeploy (including setting up an initial web user), run:
❗ This is a destructive task! Back up your database first.
~/bin/netdisco-do psql -e 'DROP OWNED BY netdisco'
~/bin/netdisco-deploy
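One way to take the backup first is with PostgreSQL's pg_dump. This is a sketch under assumptions: the database and role are both assumed to be named netdisco, pg_dump is assumed to be able to authenticate (for example via ~/.pgpass), and backup_netdisco_db and PG_DUMP are hypothetical names introduced here.

```shell
# pg_dump binary to use; override PG_DUMP if it is not on your PATH (assumption).
PG_DUMP="${PG_DUMP:-pg_dump}"

# backup_netdisco_db is a hypothetical helper: dump the database to a
# timestamped SQL file in the current directory and print the file name.
# Assumption: database name and role are both "netdisco".
backup_netdisco_db() {
  outfile="netdisco-$(date +%Y%m%d-%H%M%S).sql"
  "$PG_DUMP" -U netdisco netdisco >"$outfile" && echo "$outfile"
}

# Usage: run backup_netdisco_db and check the file before dropping anything.
```

Confirm that the dump file is non-empty before running the destructive commands above.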
Installation on SLES 11 SP4
Try running the following command for installation:
curl -L http://cpanmin.us/ | CFLAGS="-DPERL_ARGS_ASSERT_CROAK_XS_USAGE" perl - --notest --local-lib ~/perl5 App::Netdisco
perl-TermReadLine-Gnu operating system packages
Certain versions of this package can make netdisco-deploy error out with:
Warning: unable to close filehandle properly: Bad file descriptor during global destruction.
Upgrading to a newer version (1.35) should fix the problem. If this is not an option, setting the environment variable PERL_RL=Perl can also work around the problem.
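The environment variable can be set for a single run rather than exported for the whole session. This is a minimal sketch, assuming netdisco-deploy lives in ~/bin. deploy_with_perl_readline and NETDISCO_DEPLOY are hypothetical names.

```shell
# Path to netdisco-deploy; override NETDISCO_DEPLOY if yours lives elsewhere (assumption).
NETDISCO_DEPLOY="${NETDISCO_DEPLOY:-$HOME/bin/netdisco-deploy}"

# deploy_with_perl_readline is a hypothetical helper: run netdisco-deploy
# with PERL_RL=Perl set for that one command only.
deploy_with_perl_readline() {
  PERL_RL=Perl "$NETDISCO_DEPLOY" "$@"
}

# Usage: deploy_with_perl_readline
```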