Process to troubleshoot a problem in the NSP

Purpose

Perform the following high-level sequence of actions, which follows the problem-solving model described in The troubleshooting process.

Stages
 

Establish an operational baseline for your network. See the NSP System Administrator Guide for more information.


When a problem occurs, identify the type of problem. The table below lists some general NSP problem types.

Table 1-1: General NSP problem types

Type

Example problems

Managed network

  • alarms raised against network objects

  • service degradation with no associated alarms

  • problem indications on topology maps

Service and network health

  • network health issues

  • error or warning messages related to configuration

  • problems encountered during diagnosis

NSP platform

  • pod failure

  • errored cluster member

  • disk capacity or performance issues

  • MDM server issues


Identify the root cause of the problem using the NSP or NFM-P procedures in this document:

  1. Use Table 1-2, NSP functions and dashboards problems or tasks, to identify the appropriate NSP function troubleshooting procedure for the problem.

  2. Use Table 1-3, NSP platform problems or tasks, to identify the appropriate NSP platform troubleshooting procedure for the problem.

  3. Use Table 1-4, NFM-P managed NE network problems or tasks, to identify the appropriate NFM-P managed NE network troubleshooting procedure for the problem.

  4. Use Table 1-5, NFM-P network management domain problems or tasks, to identify the appropriate NFM-P network management domain troubleshooting procedure for the problem.

  5. Use Table 1-6, NFM-P platform problems or tasks, to identify the appropriate NFM-P platform troubleshooting procedure for the problem.
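
For teams that keep a runbook script, the table selection in the list above could be sketched as a simple lookup. This is an illustrative sketch only: the domain keys and the table_for function are hypothetical names chosen for this example and are not part of the NSP or NFM-P software. Only the table titles are taken from this document.

```python
# Hypothetical triage helper: maps a problem domain to the lookup table
# named in the list above. The domain keys are illustrative labels chosen
# for this sketch; they are not NSP or NFM-P terms.
TROUBLESHOOTING_TABLES = {
    "nsp-function": "Table 1-2: NSP functions and dashboards problems or tasks",
    "nsp-platform": "Table 1-3: NSP platform problems or tasks",
    "nfmp-managed-ne": "Table 1-4: NFM-P managed NE network problems or tasks",
    "nfmp-mgmt-domain": "Table 1-5: NFM-P network management domain problems or tasks",
    "nfmp-platform": "Table 1-6: NFM-P platform problems or tasks",
}


def table_for(domain: str) -> str:
    """Return the title of the table to consult for a problem domain."""
    try:
        return TROUBLESHOOTING_TABLES[domain]
    except KeyError:
        known = ", ".join(sorted(TROUBLESHOOTING_TABLES))
        raise ValueError(
            f"unknown domain {domain!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    # Example: a pod failure is an NSP platform problem (see Table 1-1).
    print(table_for("nsp-platform"))
```

The lookup raises an error for an unrecognized domain instead of guessing, so a misclassified problem is sent back to the problem-type identification stage.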

Table 1-2: NSP functions and dashboards problems or tasks

Problem or task

Troubleshooting using alarms

Troubleshooting services and connectivity

Onboarding a service into NSP

Troubleshooting NSP Analytics

Troubleshooting data collection

Troubleshooting data storage

Troubleshooting Analytics reporting

Table 1-3: NSP platform problems or tasks

Problem or task

Troubleshooting NSP platform problems

To collect NSP log files

To retrieve a list of pods

To retrieve pod information

To recover pods

To recover executor pods

To retrieve a list of cluster members

To retrieve cluster member information

To retrieve detailed information about MDM servers

To rebalance NE load on MDM servers

To verify disk performance for etcd

To verify disk performance for NSP

Problem: NSP data synchronization is not 100%

Problem: Alarms not appearing for rapidly reoccurring faults

Table 1-4: NFM-P managed NE network problems or tasks

Problem or task

Troubleshooting services and connectivity

To identify whether a VPLS is part of an H-VPLS

To verify the operational and administrative states of service components

To verify the FIB configuration

To verify connectivity for all egress points in a service using MAC Ping and MAC Trace

To verify connectivity for all egress points in a service using MEF MAC Ping

To measure frame transmission size on a service using MTU Ping

To verify the end-to-end connectivity of a service using Service Site Ping

To verify the end-to-end connectivity of a service tunnel using Tunnel Ping

To verify end-to-end connectivity of an MPLS LSP using LSP Ping

To review the route for an MPLS LSP using LSP Trace

To review ACL filter properties

To view anti-spoof filters

To retrieve MIB information from a GNE using the snmpDump utility

Table 1-5: NFM-P network management domain problems or tasks

Problem or task

Troubleshooting network management LAN issues

Problem: All network management domain stations experience performance degradation

Problem: Lost connectivity to one or more network management domain stations

Problem: Another station can be pinged, but some functions are unavailable

Problem: Packet size and fragmentation issues

Troubleshooting using NFM-P client GUI warning messages

To respond to a GUI warning message

Troubleshooting with Problem Encountered forms

To view additional problem information

To collect problem information for technical support

Troubleshooting with the client activity log

To identify the user activity for a network object

To identify the user activity for an NFM-P object

To navigate to the object of a user action

To view the user activity records of an object

Table 1-6: NFM-P platform problems or tasks

Problem or task

Troubleshooting NFM-P platform problems

To collect NFM-P log files

Problem: Poor performance on a RHEL station

Problem: Device discovery fails because of exceeded ARP cache

Troubleshooting with the NFM-P LogViewer

To display logs using the LogViewer GUI

To configure the LogViewer using the GUI

To show or hide buttons from the LogViewer main tool bar

To set highlight colors and fonts for LogViewer components and levels

To automatically show or hide log messages

To manage filters using the GUI Filter Manager

To specify a plug-in using the LogViewer GUI

To display logs using the LogViewer CLI

To configure the LogViewer CLI

To specify plug-ins using the CLI

Troubleshooting the NFM-P database

Problem: NFM-P database corruption or failure

Problem: The database is running out of disk space

Problem: Frequent database backups create performance issues

Problem: An NFM-P database restore fails and generates a No backup sets error

Problem: NFM-P database redundancy failure

Problem: Primary or standby NFM-P database is down

Problem: Need to verify that Oracle database and listener services are started

Problem: Need to determine status or version of NFM-P database or Oracle proxy

Troubleshooting NFM-P server issues

Problem: Cannot start an NFM-P server, or unsure of NFM-P server status

Problem: NFM-P server and database not communicating

Problem: An NFM-P server starts up, and then quickly shuts down

Problem: Client not receiving server heartbeat messages

Problem: Main server unreachable from RHEL client station

Problem: Excessive NFM-P server-to-client response time

Problem: Unable to receive alarms on the NFM-P, or alarm performance is degraded

Problem: All SNMP traps from managed devices are arriving at one NFM-P server, or no SNMP traps are arriving

Problem: Cannot manage new devices

Problem: Cannot discover more than one device, or device resynchronization fails

Problem: Slow or failed resynchronization with network devices

Problem: Statistics are rolling over too quickly

Troubleshooting NFM-P GUI and OSS clients

Problem: Cannot start NFM-P client, or error message during client startup

Problem: NFM-P client unable to communicate with NFM-P server

Problem: Delayed server response to client activity

Problem: Cannot place newly discovered device in managed state

Problem: User performs action, such as saving a configuration, but cannot see any results

Problem: Device configuration backup not occurring

Problem: NFM-P client GUI shuts down regularly

Problem: Configuration change not displayed on NFM-P client GUI

Problem: List or search function takes too long to complete

Problem: Cannot select some menu options or save some configurations

Problem: The NFM-P client GUI does not display NE user accounts created, modified, or deleted using the CLI


Plan corrective action using information in the NSP User Guide and NSP System Administrator Guide.


Verify the solution.