Fault management

Overview

The fault management system that the CPAM provides includes RCA audit policies and CPAM alarms.

RCA audit policies

The CPAM allows you to perform on-demand verifications of the IP/MPLS configuration of IS-IS and OSPF routing protocols on NFM-P-managed NEs. RCA audit policies identify problems or errors in protocol-related and control plane configurations.

RCA audit policies can be created for IGP administrative domains to verify the configuration of network objects on NFM-P-managed routers. For each audit policy, you can specify one or more policy entries that define the scope of the configuration verified by the audit and the severity of the related problem.

See “RCA audit policies” in the NSP NFM-P Control Plane Assurance Manager User Guide for more information about RCA audit policies.

RCA audit policies workflow

The following workflow describes how to create and use RCA audit policies.

 

1. Create an RCA audit policy. Configure the attributes that are verified for each RCA audit policy entry.

2. Run the RCA audit policy on a specific object.

3. Identify the problems and correct the configuration.

RCA audit policies creation

The rca package is used for policy-driven IP/MPLS configuration audits for OSPF and IS-IS routing protocols.

If you create an RCA audit policy without specifying any audit policy entries, all entries are created and enabled for the audit by default. To audit only specific attributes, create audit policy entries for those attributes.

The following example shows the minimum XML tags required in a request that creates an OSPF RCA audit policy with an audit policy entry that audits only the MTU attribute of OSPF interfaces.

The generic.GenericObject.configureChildInstance method is used to create an RCA audit policy. The response returns the <objectFullName> attribute, which identifies the new RCA audit policy.

RCA audit policy creation parameters

Figure 21-21: OSPF RCA audit policy creation request example
<generic.GenericObject.configureChildInstance xmlns="xmlapi_1.0">
   <deployer>immediate</deployer>
   <distinguishedName>rca-mgr</distinguishedName>
   <childConfigInfo>
      <rca.AuditPolicy>
         <actionMask>
            <bit>create</bit>
         </actionMask>
         <id>100</id>
         <policyName>ospf</policyName>
         <children-Set>
            <rca.AuditPolicyEntry>
               <actionMask>
                  <bit>modify</bit>
               </actionMask>
               <objectFullName>rca-mgr:ID-100:enId-1</objectFullName>
               <mismatchAttributes>
                  <rca.MismatchAttribute>
                     <attributeName>mtu</attributeName>
                     <severity>critical</severity>
                     <enabled>true</enabled>
                  </rca.MismatchAttribute>
               </mismatchAttributes> 
            </rca.AuditPolicyEntry>
         </children-Set>           
      </rca.AuditPolicy>
   </childConfigInfo>
</generic.GenericObject.configureChildInstance>
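
A client that generates this request programmatically can derive it from a few parameters. The following Python sketch is illustrative only: build_audit_policy_request is a hypothetical helper name, and the pattern used for the entry objectFullName (rca-mgr:ID-<id>:enId-<n>) is inferred from the single example in the figure above, not from a documented rule. Only the mtu attribute name is taken from this guide.

```python
import xml.etree.ElementTree as ET

XMLAPI_NS = "xmlapi_1.0"  # namespace taken from the request figure


def _text_child(parent, tag, text):
    """Append <tag>text</tag> to parent and return the new element."""
    child = ET.SubElement(parent, tag)
    child.text = str(text)
    return child


def _action_mask(parent, bit):
    """Append an <actionMask> element holding a single <bit>."""
    _text_child(ET.SubElement(parent, "actionMask"), "bit", bit)


def build_audit_policy_request(policy_id, policy_name, attributes, entry_id=1):
    """Build the creation request; attributes is a list of (name, severity)."""
    root = ET.Element("generic.GenericObject.configureChildInstance",
                      {"xmlns": XMLAPI_NS})
    _text_child(root, "deployer", "immediate")
    _text_child(root, "distinguishedName", "rca-mgr")
    info = ET.SubElement(root, "childConfigInfo")
    policy = ET.SubElement(info, "rca.AuditPolicy")
    _action_mask(policy, "create")
    _text_child(policy, "id", policy_id)
    _text_child(policy, "policyName", policy_name)
    children = ET.SubElement(policy, "children-Set")
    entry = ET.SubElement(children, "rca.AuditPolicyEntry")
    _action_mask(entry, "modify")
    # Assumption: entry names follow the pattern seen in the figure.
    _text_child(entry, "objectFullName",
                f"rca-mgr:ID-{policy_id}:enId-{entry_id}")
    mismatches = ET.SubElement(entry, "mismatchAttributes")
    for name, severity in attributes:
        item = ET.SubElement(mismatches, "rca.MismatchAttribute")
        _text_child(item, "attributeName", name)
        _text_child(item, "severity", severity)
        _text_child(item, "enabled", "true")
    return ET.tostring(root, encoding="unicode")


# Reproduces the content of the figure: policy 100, MTU audited as critical.
request_xml = build_audit_policy_request(100, "ospf", [("mtu", "critical")])
```

Passing more than one (name, severity) pair adds one rca.MismatchAttribute element per attribute to the same entry.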
RCA audit policies configuration check method

You can trigger an RCA audit manually by using the rca.RcaManager.checkConfig method. The method validates the OSPF or IS-IS routing protocol configuration against the specified RCA audit policy.

Check config method

Figure 21-22: RCA audit execution request example
<rca.RcaManager.checkConfig xmlns="xmlapi_1.0">
   <deployer>immediate</deployer>
   <fdn>tpgy-mgr:name-AdminDomain1-AS-1:protocol-ospf-area-0.0.0.0</fdn>
   <fdnPolicy>rca-mgr:ID-100</fdnPolicy>
   <aInContext>
      <rca.Context/>
   </aInContext>
   <resultFilter/>
</rca.RcaManager.checkConfig>
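
The request above takes two inputs: the object to audit (fdn) and the policy to apply (fdnPolicy). As a minimal sketch, assuming a Python client, the request body could be generated as follows. The function name build_check_config_request is hypothetical, and the empty aInContext and resultFilter elements are copied from the figure without interpretation.

```python
import xml.etree.ElementTree as ET


def build_check_config_request(target_fdn, policy_fdn):
    """Return an rca.RcaManager.checkConfig request as an XML string."""
    root = ET.Element("rca.RcaManager.checkConfig", {"xmlns": "xmlapi_1.0"})
    ET.SubElement(root, "deployer").text = "immediate"
    ET.SubElement(root, "fdn").text = target_fdn        # object to audit
    ET.SubElement(root, "fdnPolicy").text = policy_fdn  # audit policy to apply
    # Empty context and result filter, as in the figure.
    ET.SubElement(ET.SubElement(root, "aInContext"), "rca.Context")
    ET.SubElement(root, "resultFilter")
    return ET.tostring(root, encoding="unicode")


check_xml = build_check_config_request(
    "tpgy-mgr:name-AdminDomain1-AS-1:protocol-ospf-area-0.0.0.0",
    "rca-mgr:ID-100",
)
```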

If there are problems with the configuration, the response is similar to the example in the following figure.

Figure 21-23: RCA audit problem response example
<rca.RcaManager.checkConfigResponse xmlns="xmlapi_1.0">
   <problemList>
      <rca.Problem>
         <lastTimeChanged>1344305122231</lastTimeChanged>
         <id>notFound</id>
         <causedByObjectFullNames>
            <pointer>network:198.51.100.32:router-1:ospf-v2:areaSite-0.0.0.0:interface-5</pointer>
         </causedByObjectFullNames>
         <isFixable>false</isFixable>
         <severity>indeterminate</severity>
         <probableCause>unknown</probableCause>
         <description>Far-End Not found</description>
         <ignoreFix>false</ignoreFix>
         <applicationEnum>unspecified</applicationEnum>
         <solution/>
         <policyFdn>rca-mgr:ID-100</policyFdn>
         <deploymentState>0</deploymentState>
         <objectFullName>network:198.51.100.32:router-1:ospf-v2:areaSite-0.0.0.0:interface-5:cause-unknown-notFound</objectFullName>
         <name>cause-unknown-notFound</name>
         <selfAlarmed>false</selfAlarmed>
         <children-Set/>
      </rca.Problem>
      
      <rca.Problem>
         <lastTimeChanged>1344305122232</lastTimeChanged>
         <id>aggregated</id>
         <causedByObjectFullNames>
            <pointer>ospf:area-0.0.0.0-version-2</pointer>
         </causedByObjectFullNames>
         <isFixable>false</isFixable>
         <severity>indeterminate</severity>
         <probableCause>aggregatedCause</probableCause>
         <description>Aggregated Cause</description>
         <ignoreFix>false</ignoreFix>
         <applicationEnum>unspecified</applicationEnum>
         <solution/>
         <policyFdn>rca-mgr:ID-100</policyFdn>
         <deploymentState>0</deploymentState>
         <objectFullName>ospf:area-0.0.0.0-version-2:cause-aggregatedCause-aggregated</objectFullName>
         <name>cause-aggregatedCause-aggregated</name>
         <selfAlarmed>false</selfAlarmed>
         <children-Set/>
      </rca.Problem>
   </problemList>
</rca.RcaManager.checkConfigResponse>
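
An OSS client typically needs only a few fields from each rca.Problem element in this response. The following Python sketch extracts them with the standard library. The function name parse_problems and the choice of fields are illustrative assumptions, and the embedded sample is a trimmed copy of the figure above, used only to exercise the parser.

```python
import xml.etree.ElementTree as ET
from collections import Counter

Q = "{xmlapi_1.0}"  # namespace-qualified tag prefix


def parse_problems(response_xml):
    """Return one dict per <rca.Problem> in a checkConfigResponse."""
    root = ET.fromstring(response_xml)
    problems = []
    for problem in root.iter(Q + "rca.Problem"):
        problems.append({
            "id": problem.findtext(Q + "id"),
            "severity": problem.findtext(Q + "severity"),
            "probableCause": problem.findtext(Q + "probableCause"),
            "description": problem.findtext(Q + "description"),
            "isFixable": problem.findtext(Q + "isFixable") == "true",
            "causedBy": [p.text for p in problem.iter(Q + "pointer")],
        })
    return problems


# Trimmed copy of the response figure (fields not read by the parser omitted).
SAMPLE = """
<rca.RcaManager.checkConfigResponse xmlns="xmlapi_1.0">
   <problemList>
      <rca.Problem>
         <id>notFound</id>
         <causedByObjectFullNames>
            <pointer>network:198.51.100.32:router-1:ospf-v2:areaSite-0.0.0.0:interface-5</pointer>
         </causedByObjectFullNames>
         <isFixable>false</isFixable>
         <severity>indeterminate</severity>
         <probableCause>unknown</probableCause>
         <description>Far-End Not found</description>
      </rca.Problem>
      <rca.Problem>
         <id>aggregated</id>
         <causedByObjectFullNames>
            <pointer>ospf:area-0.0.0.0-version-2</pointer>
         </causedByObjectFullNames>
         <isFixable>false</isFixable>
         <severity>indeterminate</severity>
         <probableCause>aggregatedCause</probableCause>
         <description>Aggregated Cause</description>
      </rca.Problem>
   </problemList>
</rca.RcaManager.checkConfigResponse>
"""

problems = parse_problems(SAMPLE)
by_severity = Counter(p["severity"] for p in problems)  # problems per severity
for p in problems:
    print(p["severity"], p["description"], p["causedBy"])
```

The causedBy list holds the full names of the objects that caused each problem, which is the starting point for the last workflow step: identifying and correcting the configuration.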
CPAM alarms

In addition to routing alarms generated by routers, there are threshold reaching alarms generated by the 7701 CPAA. The alarm generation process uses the routing data collected by the 7701 CPAA. The generated alarms are sent to the CPAM route controller. The alarms are then available to an OSS application in the same manner that alarms generated in the NFM-P are sent to the XML API clients.

See Chapter 11, Fault management for more information about retrieving alarms using the OSS interface.

All CPAM-related JMS event messages belong to the CPAM event category; that is, ALA_category = CPAM. When retrieving CPAM threshold alarms via JMS, include ALA_category=CPAM in the filter.
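
If the filter is supplied as a standard JMS message selector (an assumption; the exact mechanism depends on how the client subscribes), string values are written as single-quoted literals. The following Python sketch composes such a selector string. The helper names selector_equals and selector_and are hypothetical.

```python
def selector_equals(prop, value):
    """Return a JMS selector clause comparing a property to a string literal."""
    # JMS selectors use SQL-92 style literals: embedded quotes are doubled.
    return "{} = '{}'".format(prop, value.replace("'", "''"))


def selector_and(*clauses):
    """Combine clauses so that a message must satisfy all of them."""
    return " AND ".join("({})".format(c) for c in clauses)


cpam_filter = selector_equals("ALA_category", "CPAM")
print(cpam_filter)  # prints: ALA_category = 'CPAM'
```

selector_and can be used to combine the category clause with any other clauses that the client already applies.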

The following are not included in the CPAM category:

See “Threshold reaching alarms” in the NSP NFM-P Control Plane Assurance Manager User Guide for more information about the CPAM threshold reaching alarms for BGP, IS-IS, and OSPF.

The following table lists some of the CPAM alarm threshold packages and classes that are used to set the threshold attributes for specific alarms. The list is not complete. See the topology package in the XML API Reference for a complete list.

Table 21-7: CPAM alarm threshold objects

Routing alarm type                                  Package and class

ISIS LSP Alarm Threshold                            topology.IsisLspAlarmThreshold
ISIS Reachability Threshold                         topology.IsisReachabilityAlarmThreshold
OSPF LSA Rate Alarm Threshold                       topology.OspfLsaRatePerRouterAlarmThreshold
OSPF LSA Alarm Threshold                            topology.OspfLsaPerRouterAlarmThreshold
BGP Route Rate Threshold Per RT Threshold           topology.BgpRouteRateThresholdPerRTargetAlarmThreshold
BGP AS Path Length Per Monitored Prefix Threshold   topology.BgpAsPathLenPerMonPrefAlarmThreshold
BGP Packet Rate Threshold                           topology.BgpPktRateAlarmThreshold
BGP Route Count Threshold                           topology.BgpRouteCountAlarmThreshold
BGP Route Flap Rate Threshold                       topology.BgpRouteFlapRateAlarmThreshold
BGP Route Rate Threshold                            topology.BgpRouteRateAlarmThreshold
BGP Monitored Prefix Flap Rate Threshold Reached    topology.BgpMonitorPrefixFlapRateThresholdReached
BGP monitored Prefix Unreachable                    topology.BgpMonPrefixUnreachable
BGP Prefix Monitor Redundancy Loss                  topology.BgpMonPrefRedundancyLossThresholdReached

The following example shows the mandatory XML tags in a request that creates and sets a BGP alarm threshold.

The generic.GenericObject.configureChildInstance method is used to create a BGP route count alarm threshold. The response returns the <objectFullName> attribute, which identifies the new alarm threshold.

Alarm threshold creation parameters

Figure 21-24: BGP route count alarm threshold creation request example
<generic.GenericObject.configureChildInstance xmlns="xmlapi_1.0">
   <deployer>immediate</deployer>
   <distinguishedName>tpgy-rtr-alarm-mgr</distinguishedName>
   <childConfigInfo>
      <topology.BgpRouteCountAlarmThreshold>
         <actionMask>
            <bit>create</bit>
         </actionMask>
         <cpaaPointer>network:198.51.100.37:cpaa</cpaaPointer>
         <alarmName>topology.BgpRouteCountThresholdReached</alarmName>
         <threshold>1000</threshold>
         <administrativeState>up</administrativeState>
      </topology.BgpRouteCountAlarmThreshold>
   </childConfigInfo>
</generic.GenericObject.configureChildInstance>
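
A client that provisions thresholds for several CPAAs may prefer to generate this request from parameters. The following Python sketch is illustrative only. The function name build_threshold_request is hypothetical, and the tag set is taken solely from the topology.BgpRouteCountAlarmThreshold figure above; other threshold classes may require different attributes, so check the XML API Reference before reusing it for them.

```python
import xml.etree.ElementTree as ET


def build_threshold_request(threshold_class, cpaa_pointer, alarm_name, threshold):
    """Return a configureChildInstance request that creates an alarm threshold."""
    root = ET.Element("generic.GenericObject.configureChildInstance",
                      {"xmlns": "xmlapi_1.0"})
    ET.SubElement(root, "deployer").text = "immediate"
    ET.SubElement(root, "distinguishedName").text = "tpgy-rtr-alarm-mgr"
    info = ET.SubElement(root, "childConfigInfo")
    obj = ET.SubElement(info, threshold_class)
    ET.SubElement(ET.SubElement(obj, "actionMask"), "bit").text = "create"
    ET.SubElement(obj, "cpaaPointer").text = cpaa_pointer
    ET.SubElement(obj, "alarmName").text = alarm_name
    ET.SubElement(obj, "threshold").text = str(threshold)
    ET.SubElement(obj, "administrativeState").text = "up"
    return ET.tostring(root, encoding="unicode")


# Reproduces the content of the figure.
threshold_xml = build_threshold_request(
    "topology.BgpRouteCountAlarmThreshold",
    "network:198.51.100.37:cpaa",
    "topology.BgpRouteCountThresholdReached",
    1000,
)
```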

© 2023 Nokia. Nokia Confidential Information

Use subject to agreed restrictions on disclosure and use.