OAM fault and performance tools and protocols

OAM overview

Delivery of services requires that a number of operations occur properly and at different levels in the service delivery model. For example, operations such as the association of packets to a service, of VC labels to a service, and of each service to a service tunnel must be performed correctly in the forwarding plane for the service to function. To verify that a service is operational, a set of in-band, packet-based Operation, Administration, and Maintenance (OAM) tools is supported, with the ability to test each of the individual packet operations.

For in-band testing, the OAM packets closely resemble customer packets to effectively test the customer's forwarding path, but they are distinguishable from customer packets so they are kept within the service provider's network and not forwarded to the customer.

The suite of OAM diagnostics supplements the basic IP ping and traceroute operations with diagnostics specialized for the different levels in the service delivery model. There are diagnostics for MPLS LSPs, SR policies, SDPs, services, and VPLS MACs within a service.

LSP diagnostics for LDP, RSVP, and BGP labeled routes: LSP ping and LSP trace

The router LSP diagnostics include implementations of LSP ping and LSP trace based on RFC 8029, Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures. LSP ping provides a mechanism to detect data plane failures in MPLS LSPs. LSP ping and LSP trace are modeled after the ICMP echo request and reply used by ping and traceroute to detect and localize faults in IP networks.

For a specific LDP FEC, RSVP P2P LSP, or BGP IPv4 or IPv6 labeled route, LSP ping verifies whether the packet reaches the egress label edge router (LER). With LSP trace, the packet is sent to the control plane of each transit Label Switching Router (LSR), which performs various checks to determine whether it is intended to be a transit LSR for the path.

The downstream mapping TLV is used in LSP ping and LSP trace to provide a mechanism for the sender and responder nodes to exchange and validate interface and label stack information for each downstream hop in the path of an LDP FEC or an RSVP LSP.

Two downstream mapping TLVs are supported: the original Downstream Mapping (DSMAP) TLV, defined in RFC 4379, Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures (obsoleted by RFC 8029), and the new Downstream Detailed Mapping (DDMAP) TLV, defined in RFC 6424, Mechanism for Performing Label Switched Path Ping (LSP Ping) over MPLS Tunnels, and RFC 8029.

When the responder node has multiple equal cost next-hops for an LDP FEC prefix, the downstream mapping TLV can also be used to exercise a specific path of the ECMP set using the path-destination option. The behavior in this case is described in the ECMP sub-section that follows.

LSP ping and LSP trace for an LSP using a BGP IPv4 or IPv6 labeled route

This feature uses the Target FEC Stack TLV of type BGP Labeled IPv4 /32 Prefix as defined in RFC 8029.

The TLV is structured as shown in Target FEC stack TLV for BGP labeled IPv4 and IPv6 prefixes.

Figure 1. Target FEC stack TLV for BGP labeled IPv4 and IPv6 prefixes
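As an illustration of the sub-TLV above, the encoding can be sketched in a few lines of Python. This sketch assumes the RFC 8029 Target FEC Stack registry values (sub-type 12 for a BGP labeled IPv4 prefix, 13 for IPv6) and the value layout of prefix octets followed by a prefix-length octet, zero-padded to a 4-octet boundary; it is an illustration, not SR OS code.

```python
import ipaddress
import struct

# Sub-TLV types from the RFC 8029 Target FEC Stack registry
# (12 = BGP labeled IPv4 prefix, 13 = BGP labeled IPv6 prefix).
BGP_LABELED_IPV4 = 12
BGP_LABELED_IPV6 = 13

def bgp_labeled_prefix_subtlv(prefix: str) -> bytes:
    """Encode a BGP labeled prefix Target FEC Stack sub-TLV.

    Value = prefix (4 or 16 octets) + prefix length (1 octet),
    zero-padded to a 4-octet boundary as RFC 8029 requires.
    """
    net = ipaddress.ip_network(prefix)
    subtype = BGP_LABELED_IPV4 if net.version == 4 else BGP_LABELED_IPV6
    value = net.network_address.packed + bytes([net.prefixlen])
    length = len(value)                 # TLV length counts unpadded value
    pad = (-length) % 4
    return struct.pack("!HH", subtype, length) + value + b"\x00" * pad
```

Note that SR OS only accepts /32 IPv4 and /128 IPv6 prefixes for this FEC type, as stated below.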

The user issues an LSP ping using the following CLI command and specifies a bgp-label type of prefix:

oam lsp-ping bgp-label prefix ip-prefix/mask [src-ip-address ip-address] [fc fc-name [profile {in | out}]] [size octets] [ttl label-ttl] [send-count send-count] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]

This feature supports BGP label IPv4 prefixes with a prefix length of 32 bits only and supports IPv6 prefixes with a prefix length of 128 bits only.

The path-destination option is used to exercise specific ECMP paths in the network when the LSR performs hashing on the MPLS packet.

Similarly, the user issues an LSP trace using the following command:

oam lsp-trace bgp-label prefix ip-prefix/mask [src-ip-address ip-address] [fc fc-name [profile {in | out}]] [max-fail no-response-count] [probe-count probes-per-hop] [size octets] [min-ttl min-label-ttl] [max-ttl max-label-ttl] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]

The following is the process to send and respond to an LSP ping or LSP trace packet when the downstream mapping is set to the DSMAP TLV. The detailed procedures with the DDMAP TLV are presented in Using DDMAP TLV in LSP stitching and LSP hierarchy.

  • The next-hop of a BGP labeled route for an IPv4 /32 or an IPv6 /128 prefix can be resolved to either an IPv4 transport tunnel or an IPv6 transport tunnel. Thus, the sender node encapsulates the packet of the echo request message with a label stack that consists of the transport label stack as the outer labels and the BGP label as the inner label.

    If the packet expires on a node that acts as an LSR for the outer transport LSP and does not have context for the BGP label prefix, the outer label in the stack is validated. If the validation is successful, the node replies as it would for an echo request message for an LDP FEC that is stitched to a BGP IPv4 labeled route. In other words, it replies with return code 8 Label switched at stack-depth <RSC>.

  • An LSR node that is the next-hop for the BGP label prefix and the LER node which originated the BGP label prefix have full context for the BGP IPv4 or IPv6 target FEC stack and can, therefore, perform full validation of it.

  • If a BGP IPv4 labeled route is stitched to an LDP FEC, the egress LER for the resulting LDP FEC does not have context for the BGP IPv4 target FEC stack in the echo request message and replies with return code 4 Replying router has no mapping for the FEC at stack-depth <RSC>. This is the same behavior as an LDP FEC that is stitched to a BGP IPv4 labeled route when the echo request message reaches the egress LER for the BGP prefix.

Note:

Only BGP label IPv4 /32 prefixes and BGP IPv6 /128 prefixes are supported because only these are usable as tunnels on the Nokia router platforms. The BGP IPv4 or IPv6 label prefix is also supported with the prefix SID attribute if BGP segment routing is enabled on the routers participating in the path of the tunnel.
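The sender-side encapsulation described in the first bullet above can be sketched as follows. This Python fragment builds RFC 3032 label stack entries, with the transport labels outer and the BGP label innermost carrying the bottom-of-stack bit; the helper names are hypothetical and the fragment is illustrative only.

```python
import struct

def mpls_label_entry(label: int, tc: int = 0, bottom: bool = False,
                     ttl: int = 255) -> bytes:
    """Encode one 32-bit MPLS label stack entry (RFC 3032):
    20-bit label, 3-bit TC, 1-bit bottom-of-stack, 8-bit TTL."""
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)

def encapsulate_echo_request(transport_labels, bgp_label, payload: bytes) -> bytes:
    """Build the echo-request label stack: transport labels as the
    outer labels, the BGP label as the inner (bottom) label."""
    stack = b"".join(mpls_label_entry(l) for l in transport_labels)
    stack += mpls_label_entry(bgp_label, bottom=True)
    return stack + payload
```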

The responder node must have an IPv4 address to use as the source address of the IPv4 echo reply packet. SR OS uses the system interface IPv4 address. When an IPv4 BGP labeled route resolves to an IPv6 next-hop and uses an IPv6 transport tunnel, any LSR or LER node that responds to an LSP ping or LSP trace message must have an IPv4 address assigned to the system interface or the reply is not sent. In the latter case, the LSP ping or LSP trace probe times out at the sender node.

Similarly, in the case of a BGP-LU IPv6 labeled route resolved to an IPv4 or an IPv4-mapped IPv6 next-hop that is itself resolved to an IPv4 transport tunnel, the responder node must have an IPv6 address assigned to the system interface for use in the IPv6 echo reply packet.

LSP ping and LSP trace over unnumbered IP interface

LSP ping for Point-to-Point (P2P) and Point-to-Multipoint (P2MP) LSPs can operate over a network using unnumbered links without any changes. LSP trace, P2MP LSP trace, and LDP tree trace are modified such that the unnumbered interface is properly encoded in the downstream mapping (DSMAP/DDMAP) TLV.

In an RSVP P2P or P2MP LSP, the upstream LSR encodes the downstream router ID in the ‟Downstream IP Address” field and the local unnumbered interface index value in the ‟Downstream Interface Address” field of the DSMAP/DDMAP TLV as defined in RFC 8029. Both values are taken from the TE database.

In an LDP unicast FEC or mLDP P2MP FEC, the interface index assigned by the peer LSR is not readily available to the LDP control plane. In this case, the alternative method as defined in RFC 8029 is used. The upstream LSR sets the Address Type to IPv4 Unnumbered, the Downstream IP Address to a value of 127.0.0.1, and the interface index is set to 0. If an LSR receives an echo-request packet with this encoding in the DSMAP/DDMAP TLV, it bypasses interface verification but continues with label validation.
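The special unnumbered-interface encoding and the resulting responder behavior can be summarized in a small check. The following is a hypothetical sketch; the field names are illustrative and do not reflect an SR OS data structure.

```python
from dataclasses import dataclass

@dataclass
class DownstreamMapping:
    addr_type: str          # e.g. "ipv4" or "ipv4-unnumbered"
    downstream_ip: str
    interface_index: int

# Alternative encoding from RFC 8029 used when the peer's interface
# index is not known (LDP unicast FEC, mLDP P2MP FEC).
UNKNOWN_DOWNSTREAM = DownstreamMapping("ipv4-unnumbered", "127.0.0.1", 0)

def must_verify_interface(dsmap: DownstreamMapping) -> bool:
    """An LSR that receives the special encoding bypasses interface
    verification; label validation still proceeds separately."""
    return not (dsmap.addr_type == "ipv4-unnumbered"
                and dsmap.downstream_ip == "127.0.0.1"
                and dsmap.interface_index == 0)
```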

ECMP considerations for LSP ping and LSP trace

When the responder node has multiple equal cost next-hops for an LDP FEC or a BGP label prefix, it replies in the DSMAP TLV with the downstream information of the outgoing interface which is part of the ECMP next-hop set for the prefix.

When a BGP labeled route is resolved to an LDP FEC (of the BGP next-hop of the BGP labeled route), ECMP can exist at both the BGP and LDP levels. In this case, the next-hop is selected as follows:

  • For each BGP ECMP next-hop of the labeled route, a single LDP next-hop is selected even if multiple LDP ECMP next-hops exist. Thus, the number of ECMP next-hops for the BGP labeled route is equal to the number of BGP next-hops.

  • ECMP for a BGP labeled route is only supported at the PE router (BGP label push operation) and not at the ABR/ASBR (BGP label swap operation). Thus, at an LSR, a BGP labeled route is resolved to a single BGP next-hop, which itself is resolved to a single LDP next-hop.

  • LSP trace returns one downstream mapping TLV for each next-hop of the BGP labeled route. Furthermore, it returns exactly the LDP next-hop the datapath programmed for each BGP next-hop.

The following description of the behavior of LSP ping and LSP trace refers to a FEC in a generic way; the FEC can represent either an LDP FEC or a BGP labeled route. In addition, a reference to a downstream mapping TLV means either the DSMAP TLV or the DDMAP TLV.

  • If the user initiates an LSP trace of the FEC without the path-destination option specified, the sender node does not include multipath information in the DSMAP TLV in the echo request message (multipath type=0). In this case, the responder node replies with a DSMAP TLV for each outgoing interface, which is part of the ECMP next-hop set for the FEC.

    Note:

    The sender node selects the first DSMAP TLV only for the subsequent echo request message with incrementing TTL.

  • If the user initiates an LSP ping of the FEC with the path-destination option specified, the sender node does not include the DSMAP TLV. However, the user can use the interface option, part of the same path-destination option, to direct the echo request message at the sender node to be sent out a specific outgoing interface, which is part of an ECMP path set for the FEC.

  • If the user initiates an LSP trace of the FEC with the path-destination option specified but configured not to include a downstream mapping TLV in the MPLS echo request message using the CLI command downstream-map-tlv {none}, the sender node does not include the DSMAP TLV. However, the user can use the interface option, part of the same path-destination option, to direct the echo request message at the sender node to be sent out a specific outgoing interface which is part of an ECMP path set for the FEC.

  • If the user initiates an LSP trace of the FEC with the path-destination option specified, the sender node includes the multipath information in the downstream mapping TLV in the echo request message (multipath type=8). The path-destination option allows the user to exercise a specific path of a FEC in the presence of ECMP. The user enters a specific address from the 127/8 range, which is then inserted in the multipath type 8 information field of the DSMAP TLV. The CPM code at each LSR in the path of the target FEC runs the same hash routine as the datapath and replies in the downstream mapping TLV with the specific outgoing interface the packet would have been forwarded to if it had not expired at this node and if the DEST IP field in the packet’s header had been set to the 127/8 address value inserted in the multipath type 8 information. This hash is based on one of the following:

    • the {incoming port, system interface address, label-stack} when the lsr-load-balancing option of the incoming interface is configured to lbl-only. In this case, the 127/8 prefix address entered in the path-destination option is not used to select the outgoing interface; all packets received with the same label stack map to the same outgoing interface.

    • the {incoming port, system interface address, label-stack, SRC/DEST IP fields of the packet} when the lsr-load-balancing option of the incoming interface is configured to lbl-ip. The SRC IP field corresponds to the value entered by the user in the src-ip-address option (default system IP interface address). The DEST IP field corresponds to the 127/8 prefix address entered in the path-destination option. In this case, the CPM code maps the packet, as well as any packet in a sub-range of the entire 127/8 range, to one of the possible outgoing interfaces of the FEC.

    • the {SRC/DEST IP fields of the packet} when the lsr-load-balancing option of the incoming interface is configured to ip-only. The SRC IP field corresponds to the value entered by the user in the src-ip-address option (default system IP interface address). The DEST IP field corresponds to the 127/8 prefix address entered in the path-destination option. In this case, the CPM code maps the packet, as well as any packet in a sub-range of the entire 127/8 range, to one of the possible outgoing interfaces of the FEC.

    In all preceding cases, the user can use the interface option, part of the same path-destination option, to direct the echo request message at the sender node to be sent out a specific outgoing interface which is part of an ECMP path set for the FEC.

    Note:

    If the user enabled the system-ip-load-balancing hash option (config>system>system-ip-load-balancing), the LSR hashing is modified by applying the system IP interface, with differing bit-manipulation, to the hash of packets of all three options (lbl-only, lbl-ip, ip-only). This system level option enhances the LSR packet distribution such that the probability of the same flow selecting the same ECMP interface index or LAG link index at two consecutive LSR nodes is minimized.

  • The ldp-treetrace tool always uses the multipath type=8 and inserts a range of 127/8 addresses instead of a single address in order to exercise multiple ECMP paths of an LDP FEC. As such, it behaves in the same way as lsp-trace with the path-destination option enabled, as described in the preceding sections.

  • The path-destination option can also be used to exercise a specific ECMP path of an LDP FEC, which is tunneled over a RSVP LSP or of an LDP FEC stitched to a BGP FEC in the presence of BGP ECMP paths. The user must, however, enable the use of the new DDMAP TLV either globally (config>test-oam>mpls-echo-request-downstream-map ddmap) or within the specific ldp-treetrace or lsp-trace test (downstream-map-tlv ddmap option).
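The selection of hash inputs per lsr-load-balancing mode described above can be sketched as follows. This Python fragment is illustrative only: the actual datapath hash is not public, so a stand-in hash is used here; what the sketch demonstrates is which fields enter the hash in each mode.

```python
import hashlib

def lsr_hash_inputs(mode, incoming_port, system_ip, label_stack, src_ip, dest_ip):
    """Return the tuple of fields hashed for ECMP interface selection,
    per the lsr-load-balancing mode of the incoming interface."""
    if mode == "lbl-only":
        return (incoming_port, system_ip, tuple(label_stack))
    if mode == "lbl-ip":
        return (incoming_port, system_ip, tuple(label_stack), src_ip, dest_ip)
    if mode == "ip-only":
        return (src_ip, dest_ip)
    raise ValueError(mode)

def select_ecmp_interface(mode, ecmp_interfaces, **fields):
    """Map the hash of the selected fields onto one interface of the
    ECMP next-hop set (stand-in hash, not the real datapath routine)."""
    key = repr(lsr_hash_inputs(mode, **fields)).encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return ecmp_interfaces[digest % len(ecmp_interfaces)]
```

With lbl-only, two probes that differ only in their 127/8 destination address select the same outgoing interface, which matches the behavior described in the first sub-bullet.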

LSP ping for RSVP P2MP LSP (P2MP)

The P2MP LSP ping complies to RFC 6425, Detecting Data Plane Failures in Point-to-Multipoint Multiprotocol Label Switching (MPLS) - Extensions to LSP Ping.

An LSP ping can be generated by entering the following OAM command.

oam p2mp-lsp-ping lsp-name [p2mp-instance instance-name [s2l-dest-addr ip-address [...up to 5 max]]] [fc fc-name [profile {in | out}]] [size octets] [ttl label-ttl] [timeout timeout] [detail]

The echo request message is sent on the active P2MP instance and is replicated in the datapath over all branches of the P2MP LSP instance. By default, all egress LER nodes that are leaves of the P2MP LSP instance reply to the echo request message.

The user can reduce the scope of the echo reply messages by explicitly entering a list of addresses for the egress LER nodes that are required to reply. A maximum of five addresses can be specified in a single execution of the p2mp-lsp-ping command. If all five egress LER nodes are router nodes, they can parse the list of egress LER addresses and reply. RFC 6425 specifies that only the top address in the P2MP egress identifier TLV must be inspected by an egress LER. When interoperating with other implementations, the router egress LER responds if its address is anywhere in the list. Furthermore, if another vendor implementation is the egress LER, only the egress LER matching the top address in the TLV may respond.

If the user enters the same egress LER address more than once in a single p2mp-lsp-ping command, the head-end node displays a response to a single one and displays a single error warning message for the duplicate ones. When queried over SNMP, the head-end node issues a single response trap and issues no trap for the duplicates.
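The handling of the egress LER address list can be sketched as a small validation step. The following is a hypothetical illustration, assuming each duplicate produces one warning, per the behavior described above; it is not SR OS code.

```python
MAX_EGRESS_ADDRESSES = 5   # per execution of p2mp-lsp-ping

def build_egress_id_list(addresses):
    """Validate the user-supplied egress LER list: at most five
    entries; duplicates are answered once, with a warning each."""
    unique, warnings = [], []
    for addr in addresses:
        if addr in unique:
            warnings.append(f"duplicate egress address ignored: {addr}")
        else:
            unique.append(addr)
    if len(unique) > MAX_EGRESS_ADDRESSES:
        raise ValueError("at most five egress LER addresses per command")
    return unique, warnings
```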

The timeout parameter should be set to the time it would take to get a response from all probed leaves under no failure conditions. For that purpose, its range extends to 120 seconds for p2mp-lsp-ping, compared to 10 seconds for lsp-ping on a P2P LSP. The default value is 10 seconds.

The router head-end node displays a Send_Fail error when a specific S2L path is down only if the user explicitly listed the address of the egress LER for this S2L in the ping command.

Similarly, the router head-end node displays the timeout error when no response is received for an S2L after the expiry of the timeout timer only if the user explicitly listed the address of the egress LER for this S2L in the ping command.

The user can configure a specific value of the ttl parameter to force the echo request message to expire on a router branch node or a bud LSR node. The latter replies with a downstream mapping TLV for each branch of the P2MP LSP in the echo reply message.

Note:

A maximum of 16 downstream mapping TLVs can be included in a single echo reply message. The replying node also sets the multipath type to zero in each downstream mapping TLV and does not include any egress address information for the reachable egress LER nodes of this P2MP LSP.

If the router ingress LER node receives the new multipath type field with the list of egress LER addresses in an echo reply message from another vendor implementation, it ignores the field without raising an error while processing the downstream mapping TLV.

If the ping expires at an LSR node, which is performing a remerge or cross-over operation in the datapath between two or more ILMs of the same P2MP LSP, there is an echo reply message for each copy of the echo request message received by this node.

The output of the command without the detail parameter specified provides a high-level summary of error codes or success codes received.

The output of the command with the detail parameter specified shows a line for each replying node as in the output of the LSP ping for a P2P LSP.

The display is delayed until all responses are received or the timer configured in the timeout parameter expires. No other CLI commands can be entered while waiting for the display. A control-C (^C) command aborts the ping operation.

For more information about P2MP, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide.

LSP trace for RSVP P2MP LSP

The P2MP LSP trace is in accordance with RFC 6425. Generate an LSP trace using the following OAM command.

oam p2mp-lsp-trace lsp-name p2mp-instance instance-name s2l-dest-address ip-address [fc fc-name [profile {in | out}]] [size octets] [max-fail no-response-count] [probe-count probes-per-hop] [min-ttl min-label-ttl] [max-ttl max-label-ttl] [timeout timeout] [interval interval] [detail]

The LSP trace capability allows the user to trace a single S2L path of a P2MP LSP. Its operation is similar to that of the p2mp-lsp-ping command, but the sender of the echo request message includes the downstream mapping TLV to request the downstream branch information from a branch LSR or bud LSR. The branch LSR or bud LSR then also includes the downstream mapping TLV to report the information about the downstream branches of the P2MP LSP. An egress LER does not include this TLV in the echo response message.

The probe-count parameter operates in the same way as in LSP trace on a P2P LSP. It represents the maximum number of probes sent per TTL value before giving up on receiving the echo reply message. If a response is received from the traced node before reaching the maximum number of probes, no more probes are sent for the same TTL. The sender of the echo request then increments the TTL and uses the information it received in the downstream mapping TLV to start sending probes to the node downstream of the last node that replied. This process continues until the egress LER for the traced S2L path replies.

Because the command traces a single S2L path, the timeout and interval parameters keep the same value range as in LSP trace for a P2P LSP.
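The per-TTL probing logic described above can be sketched as a loop. The following is a hypothetical Python illustration with a pluggable send_probe callback; the function name and the reply dictionary shape are assumptions for illustration only.

```python
def p2mp_lsp_trace(send_probe, min_ttl=1, max_ttl=30, probe_count=3):
    """Sketch of the per-TTL probing loop.

    send_probe(ttl, ddmap) returns the reply for one probe, or None on
    timeout; a reply carries 'node', 'egress' (True at the traced
    egress LER), and the 'ddmap' used to aim the next hop's probes.
    """
    hops = []
    ddmap = None
    for ttl in range(min_ttl, max_ttl + 1):
        reply = None
        for _ in range(probe_count):      # up to probe_count tries per TTL
            reply = send_probe(ttl, ddmap)
            if reply is not None:
                break                     # answered: stop probing this TTL
        if reply is None:
            return hops, False            # gave up on this hop
        hops.append(reply["node"])
        if reply["egress"]:
            return hops, True             # traced egress LER replied
        ddmap = reply["ddmap"]            # aim next probes downstream
    return hops, False
```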

The P2MP LSP trace makes use of the Downstream Detailed Mapping (DDMAP) TLV. The following excerpt from RFC 6424 shows the format of the DDMAP TLV.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               MTU             | Address Type  |    DS Flags   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             Downstream Address (4 or 16 octets)               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Downstream Interface Address (4 or 16 octets)         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  Return Code  | Return Subcode|         Subtlv Length         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                       List of SubTLVs                         .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Downstream Detailed Mapping TLV format is derived from the Downstream Mapping (DSMAP) TLV format. The key change is that variable length and optional fields have been converted into sub-TLVs. The fields have the same use and meaning as in RFC 8029.
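The fixed portion of the DDMAP TLV body shown in the excerpt can be packed directly from those fields. The following Python sketch follows the field widths in the diagram; the TLV type/length header is omitted, and the helper name is illustrative.

```python
import ipaddress
import struct

def pack_ddmap(mtu, address_type, ds_flags, downstream_addr,
               downstream_if_addr, return_code, return_subcode,
               subtlvs=b""):
    """Pack the fixed part of a DDMAP TLV body per the RFC 6424
    layout: MTU(2), Address Type(1), DS Flags(1), Downstream Address,
    Downstream Interface Address, Return Code(1), Return Subcode(1),
    Subtlv Length(2), then the list of sub-TLVs."""
    body = struct.pack("!HBB", mtu, address_type, ds_flags)
    body += ipaddress.ip_address(downstream_addr).packed
    body += ipaddress.ip_address(downstream_if_addr).packed
    body += struct.pack("!BBH", return_code, return_subcode, len(subtlvs))
    return body + subtlvs
```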

Similar to P2MP LSP ping, an LSP trace probe results in all egress LER nodes eventually receiving the echo request message, but only the traced egress LER node replies to the last probe.

As well, any branch LSR node or bud LSR node in the P2MP LSP tree may receive a copy of the echo request message with the TTL in the outer label expiring at this node. However, only a branch LSR or bud LSR that has a downstream branch over which the traced egress LER is reachable must respond.

When a branch LSR or BUD LSR node responds to the sender of the echo request message, it sets the global return code in the echo response message to RC=14 - "See DDMAP TLV for Return Code and Return Sub-Code" and the return code in the DDMAP TLV corresponding to the outgoing interface of the branch used by the traced S2L path to RC=8 - "Label switched at stack-depth <RSC>".

Because a single egress LER address, that is, a single S2L path, can be traced, the branch LSR or bud LSR node sets the multipath type to zero in the downstream mapping TLV in the echo response message, as no egress LER addresses need to be included.

LSP trace behavior when S2L path traverses a remerge node

When a 7450 ESS, 7750 SR, or 7950 XRS LSR performs a remerge of one or more ILMs of the P2MP LSP to which the traced S2L sub-LSP belongs, it may block the ILM over which the traced S2L resides. This causes the trace to either fail or to succeed with a missing hop.

The following is an example of this behavior.

S2L1 and S2L2 use ILMs that remerge at node B. Depending on which ILM is blocked at B, the TTL=2 probe either yields two responses or times out.

S2L1 = ACBDF (to leaf F)
S2L2 = ABDE (to leaf E)
 
   A
 /  \
B -- C
|  
D
| \
F  E
  • tracing S2L1 when ILM on interface C-B blocked at node B

    For TTL=1, A receives a response from C only as B does not have S2L1 on the ILM on interface A-B.

    For TTL=2, assume A receives the response from B first, which indicates a success. It then builds the next probe with TTL=3. B only passes the copy of the message arriving on interface A-B and drops the one arriving on interface C-B (it treats that copy like a data packet because it does not expire at node B). This copy expires at F. However, F returns a DSMappingMismatched error message because the DDMAP TLV was the one provided by node B in the TTL=2 step. The trace aborts at this point in time. However, A knows it received a second response from node D for TTL=2 with a DSMappingMismatched error message.

    If A receives the response from D first with the error code, it waits to see if it gets a response from B or it times out. In either case, it logs this status as multiple replies received per probe in the last probe history and aborts the trace.

  • tracing S2L2 when ILM on interface A-B blocked at node B

    For TTL=1, B responds with a success. C does not respond as it does not have an ILM for S2L2.

    For TTL=2, B drops the copy coming on interface A-B. It receives a copy coming on interface B-C but drops it as the ILM does not contain S2L2. Node A times out. Next, node A generates a probe with TTL=3 without a DDMAP TLV. This time node D responds with a success and includes its downstream DDMAP TLV to node E. The rest of the path is discovered correctly. The traced path for S2L2 looks like: A-B-(*)-D-E.

The router ingress LER detects a remerge condition when it receives two or more replies to the same probe, that is, for the same TTL value. It displays the following message to the user, regardless of whether the trace operation successfully reached the egress LER or was aborted earlier: Probe returned multiple responses. Result may be inconsistent.

This warning message indicates the potential of a remerge scenario and that a p2mp-lsp-ping command for this S2L should be used to verify that the S2L path is not defective.

The router ingress LER behavior is to always proceed to the next TTL probe when it receives an OK response to a probe or when it times out on a probe. If, however, it receives replies with an error return code, it must wait until it receives an OK response or it times out. If it times out without receiving an OK reply, the LSP trace must be aborted.

Possible echo reply messages and corresponding ingress LER behaviors are described in Echo reply messages and ingress LER behavior.

Table 1. Echo reply messages and ingress LER behavior

  • One or more error return codes + OK: Display the OK return code and proceed to the next TTL probe. Display a warning message at the end of the trace.

  • OK + one or more error return codes: Display the OK return code and proceed to the next TTL probe right after receiving the OK reply, but keep state that more replies were received. Display a warning message at the end of the trace.

  • OK + OK: Should not happen for remerge, but the trace continues on the first OK reply. This is the case when one of the branches of the P2MP LSP is activating the P2P bypass LSP. In this case, the head-end node receives a reply from both a regular P2MP LSR that has the ILM for the traced S2L and from an LSR switching the P2P bypass for other S2Ls. The latter does not have context for the P2MP LSP being tunneled but responds after doing a label stack validation.

  • One error return code + timeout: Abort the LSP trace and display the error code. The ingress LER cannot tell whether the error occurred because of a remerge condition.

  • More than one error return code + timeout: Abort the LSP trace and display the first error code. Display a warning message at the end of the trace.

  • Timeout on probe without any reply: Display ‟*” and proceed to the next TTL probe.
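The decision table above can be expressed as a small classifier. The following is a hypothetical sketch: 'ok' and error strings stand in for return codes, and the second element of the result corresponds to the end-of-trace warning message.

```python
def classify_probe_replies(replies, timed_out):
    """Apply the decision table to the replies collected for one TTL
    probe. `replies` is a list of return codes ('ok' or an error
    string); `timed_out` is True when the probe timer expired without
    an OK reply. Returns (action, warn_at_end_of_trace)."""
    oks = [r for r in replies if r == "ok"]
    errors = [r for r in replies if r != "ok"]
    if oks:
        # Any OK reply lets the trace continue; error replies alongside
        # it only trigger the remerge warning at the end of the trace.
        return ("continue", bool(errors))
    if errors and timed_out:
        return ("abort", len(errors) > 1)
    if timed_out:
        return ("continue", False)       # plain timeout: print '*', next TTL
    return ("wait", False)               # errors so far, timer still running
```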

Downstream Detailed Mapping (DDMAP) TLV

The Downstream Detailed Mapping (DDMAP) TLV provides the same features as the DSMAP TLV, with the enhancement to trace the details of LSP stitching and LSP hierarchy. The latter is achieved using a sub-TLV of the DDMAP TLV called the FEC stack change sub-TLV. DDMAP TLV and FEC stack change sub-TLV show the structures of these two objects as defined in RFC 6424.

Figure 2. DDMAP TLV

The DDMAP TLV format is derived from the DSMAP TLV format. The key change is that variable length and optional fields have been converted into sub-TLVs. The fields have the same use and meaning as in RFC 8029. The FEC stack change sub-TLV is shown in FEC stack change sub-TLV.

Figure 3. FEC stack change sub-TLV

The operation type specifies the action associated with the FEC stack change. The following operation types are defined.

Type #     Operation
------     ---------
1          Push      
2          Pop

More details on the processing of the fields of the FEC stack change sub-TLV are provided later in this section.
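The push and pop operation types can be illustrated by applying a list of FEC stack changes to the sender's view of the target FEC stack. The following is a hypothetical sketch; the FEC string labels are illustrative only.

```python
def apply_fec_stack_changes(fec_stack, changes):
    """Apply FEC stack change operations reported by a responder to
    the sender's view of the target FEC stack. The top of the stack
    is the end of the list; ('pop',) removes it, ('push', fec) adds
    a new FEC on top."""
    stack = list(fec_stack)
    for change in changes:
        if change[0] == "push":
            stack.append(change[1])
        elif change[0] == "pop":
            if not stack:
                raise ValueError("pop on empty FEC stack")
            stack.pop()
        else:
            raise ValueError(f"unknown operation type: {change[0]}")
    return stack
```

For example, at a node stitching an LDP FEC to a BGP labeled route, the responder would report a pop of the LDP FEC followed by a push of the BGP FEC.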

The user can configure which downstream mapping TLV to use globally on a system by using the following command: configure test-oam mpls-echo-request-downstream-map {dsmap | ddmap}

This command specifies which format of the downstream mapping TLV to use in all LSP trace packets and LDP tree trace packets originated on this node. The Downstream Mapping (DSMAP) TLV is the original format in RFC 4379 (obsoleted by RFC 8029) and is the default value. The Downstream Detailed Mapping (DDMAP) TLV is the enhanced format specified in RFC 6424 and RFC 8029.

This command applies to LSP trace of an RSVP P2P LSP, an MPLS-TP LSP, a BGP labeled route, or an LDP unicast FEC, and to LDP tree trace of a unicast LDP FEC. It does not apply to LSP trace of an RSVP P2MP LSP, which always uses the DDMAP TLV.

The global DSMAP TLV setting impacts the behavior of both OAM LSP trace packets and SAA test packets of type lsp-trace and is used by the sender node when one of the following events occurs:

  • An SAA test of type lsp-trace is created (not modified) and no value is specified for the per-test downstream-map-tlv {dsmap | ddmap | none} option. In this case the SAA test downstream-map-tlv value defaults to the global mpls-echo-request-downstream-map value.

  • An OAM test of type lsp-trace test is executed and no value is specified for the per-test downstream-map-tlv {dsmap | ddmap | none} option. In this case, the OAM test downstream-map-tlv value defaults to the global mpls-echo-request-downstream-map value.

A consequence of the preceding rules is that a change to the value of the mpls-echo-request-downstream-map option does not affect the value inserted in the downstream mapping TLV of existing tests.
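The defaulting rules above amount to a snapshot taken at test creation or execution time. The following is a minimal sketch of this resolution; the function and value names are illustrative.

```python
def effective_downstream_map_tlv(per_test, global_setting):
    """Resolve which downstream mapping TLV a test uses: the per-test
    downstream-map-tlv value ('dsmap', 'ddmap', or 'none') wins when
    set; otherwise the global mpls-echo-request-downstream-map value
    in effect at creation/execution time is captured."""
    if per_test is not None:
        return per_test
    return global_setting          # captured value, not a live reference
```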

The following are the details of the processing of the DDMAP TLV:

  • When either the DSMAP TLV or the DDMAP TLV is received in an echo request message, the responder node includes the same type of TLV in the echo reply message with the correct downstream interface information and label stack information.

  • If an echo request message without a Downstream Mapping TLV (DSMAP or DDMAP) expires at a node that is not the egress for the target FEC stack, the responder node always includes the DSMAP TLV in the echo reply message. This can occur in the following cases:

    • The user issues an LSP trace from a sender node with a min-ttl value higher than 1 and a max-ttl value lower than the number of hops to reach the egress of the target FEC stack. This is the sender node behavior when the global configuration or the per-test setting of the Downstream Mapping TLV is set to DSMAP.

    • The user issues an LSP ping from a sender node with a ttl value lower than the number of hops to reach the egress of the target FEC stack. This is the sender node behavior when the global configuration of the Downstream Mapping TLV is set to DSMAP.

    • The behavior in the first case above is changed when the global configuration or the per-test setting of the Downstream Mapping TLV is set to DDMAP. In this case, the sender node includes the DDMAP TLV with the Downstream IP address field set to the all-routers multicast address, as per Section 3.4 of RFC 8029. The responder node then bypasses the interface and label stack validation and replies with a DDMAP TLV with the correct downstream information for the target FEC stack.

  • A sender node never includes the DSMAP or DDMAP TLV in an LSP ping message.
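The DSMAP/DDMAP reply rules above can be sketched as a small decision function; this is a hedged illustration with assumed names, not SR OS code.

```python
# Which downstream mapping TLV, if any, goes into the echo reply.
# Rule 1: mirror the TLV type received in the echo request.
# Rule 2: a request with no mapping TLV that expires at a transit node
#         (not the egress for the target FEC stack) gets a DSMAP TLV.

def reply_map_tlv(request_tlv, is_egress):
    """Return 'dsmap', 'ddmap', or None for the echo reply message."""
    if request_tlv in ("dsmap", "ddmap"):
        return request_tlv
    if request_tlv is None and not is_egress:
        return "dsmap"
    return None
```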

Using DDMAP TLV in LSP stitching and LSP hierarchy

In addition to performing the same features as the DSMAP TLV, the DDMAP TLV addresses the following scenarios:

  • Full validation of an LDP IPv4 FEC stitched to a BGP IPv4 labeled route. In this case, the LSP trace message is inserted from the LDP LSP segment or from the stitching point.

  • Full validation of a BGP IPv4 labeled route stitched to an LDP IPv4 FEC. The LSP trace message is inserted from the BGP LSP segment or from the stitching point.

  • Full validation of an LDP IPv4 FEC, which is stitched to a BGP IPv4 labeled route and stitched back into an LDP IPv4 FEC. In this case, the LSP trace message is inserted from the LDP segments or from the stitching points.

  • Full validation of an LDP IPv4 FEC stitched to an SR-ISIS or SR-OSPF IPv4 tunnel.

  • Full validation of an SR-ISIS or SR-OSPF IPv4 tunnel stitched to an LDP IPv4 FEC.

  • Full validation of an LDP FEC tunneled over an RSVP LSP or an SR-TE LSP using LSP trace.

  • Full validation of a BGP IPv4 labeled route or of a BGP IPv6 labeled route (with an IPv4 or an IPv4-mapped IPv6 next-hop) tunneled over an RSVP LSP, an LDP IPv4 FEC, an SR-ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, an SR-TE IPv4 LSP, or an IPv4 SR policy.

  • Full validation of a BGP IPv4 labeled route (with an IPv6 next-hop) or a BGP IPv6 labeled route tunneled over an LDP IPv6 FEC, an SR-ISIS IPv6 tunnel, an SR-OSPF3 IPv6 tunnel, an SR-TE IPv6 LSP, or an IPv6 SR policy.

  • Full validation of a BGP IPv6 labeled route (with an IPv4 or an IPv4-mapped IPv6 next-hop) recursively resolved to a BGP IPv4 labeled route which itself is tunneled over an LDP IPv4 FEC, an SR-ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, an RSVP-TE LSP, an SR-TE IPv4 LSP, or an IPv4 SR policy.

To properly check a target FEC which is stitched to another FEC (stitching FEC) of the same or a different type, or which is tunneled over another FEC (tunneling FEC), it is necessary for the responding nodes to provide details about the FEC manipulation back to the sender node. This is achieved via the use of the new FEC stack change sub-TLV in the Downstream Detailed Mapping TLV (DDMAP) defined in RFC 6424.

When the user configures the use of the DDMAP TLV on a trace for an LSP that does not undergo the stitching or tunneling operation in the network, the procedures at the sender and responder nodes are the same as in the case of the existing DSMAP TLV.

This feature, however, introduces changes to the target FEC stack validation procedures at the sender and responder nodes in the case of LSP stitching and LSP hierarchy. These changes pertain to the processing of the new FEC stack change sub-TLV in the new DDMAP TLV and the new return code 15 Label switched with FEC change. The following describes the main changes, which are a superset of the rules in Section 4 of RFC 6424 and allow greater interoperability with other vendor implementations.

Responder node procedures

This section describes responder-node behaviors.

  • As a responder node, the router always inserts a global return code of either:

    • 3 Replying router is an egress for the FEC at stack-depth <RSC>
    • 14 See DDMAP TLV for Return Code and Return Subcode.
  • When the responder node inserts a global return code of 3, it does not include a DDMAP TLV.

  • When the responder node includes the DDMAP TLV, it inserts a global return code, 14 See DDMAP TLV for Return Code and Return Subcode and:

    • On a success response, includes a return code of 15 in the DDMAP TLV for each downstream that has a FEC stack change TLV.

    • On a success response, includes a return code 8 Label switched at stack-depth <RSC> in the DDMAP TLV for each downstream if no FEC stack change sub-TLV is present.

    • On a failure response, includes an appropriate error return code in the DDMAP TLV for each downstream.

  • A tunneling node indicates that it is pushing a FEC (the tunneling FEC) on top of the Target FEC Stack TLV by including a FEC stack change sub-TLV in the DDMAP TLV with a FEC operation type value of PUSH. It also includes a return code 15 Label switched with FEC change. The downstream interface address and downstream IP address fields of the DDMAP TLV are populated for the pushed FEC. The remote peer address field in the FEC stack change sub-TLV is populated with the address of the control plane peer for the pushed FEC. The Label stack sub-TLV provides the full label stack over the downstream interface.

  • A node that is stitching a FEC indicates that it is performing a POP operation for the stitched FEC followed by a PUSH operation for the stitching FEC and potentially one PUSH operation for the transport tunnel FEC. It includes two or more FEC stack change sub-TLVs in the DDMAP TLV in the echo reply message. It also includes a return code 15 Label switched with FEC change. The downstream interface address and downstream address fields of the DDMAP TLV are populated for the stitching FEC. The remote peer address field in the FEC stack change sub-TLV of type POP is populated with a null value (0.0.0.0). The remote peer address field in the FEC stack change sub-TLV of type PUSH is populated with the address of the control plane peer for the tunneling FEC. The Label stack sub-TLV provides the full label stack over the downstream interface.

  • If the responder node is the egress for one or more FECs in the Target FEC Stack TLV, it must reply with no DDMAP TLV and with a return code 3 Replying router is an egress for the FEC at stack-depth <RSC>. RSC must be set to the depth of the topmost FEC. This operation is iterative in the sense that, on receipt of the echo reply message, the sender node pops the topmost FEC from the Target FEC Stack TLV and resends the echo request message with the same TTL value, as described in the sender node procedures. The responder node performs exactly the same operation as described in this step until all FECs are popped or until the topmost FEC in the Target FEC Stack TLV matches the tunneled or stitched FEC. In the latter case, processing of the Target FEC Stack TLV again follows the preceding steps.
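The return-code rules above can be summarized in a short sketch. The numeric codes follow RFC 8029 and RFC 6424, but the function and its inputs are assumptions for illustration only.

```python
# Responder-side return codes for a success response.
RC_EGRESS = 3          # Replying router is an egress for the FEC
RC_SWITCHED = 8        # Label switched at stack-depth <RSC>
RC_SEE_DDMAP = 14      # See DDMAP TLV for Return Code and Return Subcode
RC_FEC_CHANGE = 15     # Label switched with FEC change

def responder_codes(is_egress, downstream_has_fec_change):
    """Return (global_rc, per-downstream rc list).

    downstream_has_fec_change: one boolean per downstream, True when that
    downstream entry carries a FEC stack change sub-TLV (PUSH/POP).
    """
    if is_egress:
        return RC_EGRESS, []                  # no DDMAP TLV is included
    per_ds = [RC_FEC_CHANGE if changed else RC_SWITCHED
              for changed in downstream_has_fec_change]
    return RC_SEE_DDMAP, per_ds
```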

Sender node procedures

This section describes sender-node behaviors.

  • If the echo reply message contains the return code 14 See DDMAP TLV for Return Code and Return Subcode and the DDMAP TLV has a return code 15 Label switched with FEC change, the sender node adjusts the target FEC Stack TLV in the echo request message for the next value of the TTL to reflect the operation on the current target FEC stack as indicated in the FEC stack change sub-TLV received in the DDMAP TLV of the last echo reply message. In other words, one FEC is popped at most and one or more FECs are pushed as indicated.

  • If the echo reply message contains the return code 3 Replying router is an egress for the FEC at stack-depth <RSC>, then:

    • If the value for the label stack depth specified in the Return Sub-Code (RSC) field is the same as the depth of the current target FEC Stack TLV, the sender node considers the trace operation complete and terminates it. A responder node causes this case to occur as per the last step of the responder node procedures.

    • If the value for the label stack depth specified in the Return Sub-Code (RSC) field is different from the depth of the current target FEC Stack TLV, the sender node must continue the LSP trace with the same TTL value after adjusting the Target FEC Stack TLV by removing the top FEC.

      Note:

      This step continues iteratively until the value for the label stack depth specified in the Return Sub-Code (RSC) field is the same as the depth of the current target FEC Stack TLV, at which point the preceding step applies. A responder node causes this case to occur as per the last step of the responder node procedures.

    • If a DDMAP TLV with or without a FEC stack change sub-TLV is included, the sender node must ignore it, and processing proceeds as in the preceding steps. A responder node does not cause this case to occur, but a third-party implementation may.

  • As a sender node, the router can accept an echo-reply message with the global return code of either 14 (with DDMAP TLV return code of 15 or 8), or 15 and process properly the FEC stack change TLV as per step (1) of the sender node procedures.

  • If an LSP ping is performed directly to the egress LER of the stitched FEC, there is no DDMAP TLV included in the echo request message and the responder node, which is the egress node, replies with return code 4 Replying router has no mapping for the FEC at stack-depth <RSC>. This case cannot be resolved with this feature.
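The sender-side adjustments of the Target FEC Stack TLV described above can be sketched as follows. FECs are plain strings with index 0 as the topmost FEC; a real implementation carries full FEC sub-TLVs, so treat this as a hedged model only.

```python
# Sender-node handling of one echo reply, per the steps above:
#  - rc 14 with a FEC stack change (rc 15 in the DDMAP TLV): apply at most
#    one POP and one or more PUSHes, then probe the next TTL.
#  - rc 3: if the RSC equals the current stack depth, the trace is done;
#    otherwise strip the top FEC and resend with the same TTL.

def adjust_fec_stack(stack, global_rc, rsc=None, fec_changes=()):
    """Return (new_stack, action): 'next-ttl', 'same-ttl', or 'done'."""
    stack = list(stack)
    if global_rc == 14 and fec_changes:
        for op, fec in fec_changes:
            if op == "POP":
                stack.pop(0)
            elif op == "PUSH":
                stack.insert(0, fec)
        return stack, "next-ttl"
    if global_rc == 3:
        if rsc == len(stack):
            return stack, "done"
        stack.pop(0)                 # strip top FEC, keep the same TTL
        return stack, "same-ttl"
    return stack, "next-ttl"
```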

OAM support in Segment Routing with MPLS data plane

MPLS OAM supports Segment Routing extensions to lsp-ping and lsp-trace as defined in draft-ietf-mpls-spring-lsp-ping.

Segment Routing (SR) performs both shortest path and source-based routing. When the data plane uses MPLS encapsulation, MPLS OAM tools such as lsp-ping and lsp-trace can be used to check connectivity and trace the path to any midpoint or endpoint of an SR-ISIS or SR-OSPF shortest path tunnel, or of an SR-TE LSP.

The CLI options for lsp-ping and lsp-trace are under OAM and SAA for the following types of Segment Routing tunnels:

  • SR-ISIS and SR-OSPF node SID tunnels

  • SR-TE LSP

OAM support in IPv4 or IPv6 SR policies with MPLS data plane

This feature extends the support of LSP ping, LSP trace, and ICMP tunneling probes to IPv4 and IPv6 SR policies.

This feature describes the CLI options for the lsp-ping and lsp-trace commands under the OAM and SAA contexts for the following type of Segment Routing tunnel: sr-policy.

  • oam lsp-ping sr-policy {color integer <0..4294967295> endpoint ip-address<ipv4/ipv6>} [segment-list id<1..32>] [src-ip-address ip-address] [fc fc-name [profile {in|out}]] [size octets] [ttl label-ttl] [send-count send-count] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]

  • oam lsp-trace sr-policy {color integer <0..4294967295> endpoint ip-address<ipv4/ipv6>} [segment-list id<1..32>] [src-ip-address ip-address] [fc fc-name [profile {in|out}]] [max-fail no-response-count] [probe-count probes-per-hop] [size octets] [min-ttl min-label-ttl] [max-ttl max-label-ttl] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [downstream-map-tlv {dsmap | ddmap | none}] [detail]

The CLI does not require entry of the SR policy head-end parameter, which corresponds to the IPv4 address of the router where the static SR policy is configured or to which the BGP NLRI of the SR policy is sent by a controller or another BGP speaker. SR OS expects its IPv4 system address in the head-end parameter of both the IPv4 and IPv6 SR policy NLRIs; otherwise, SR OS does not import the NLRI.

A source IPv4 or IPv6 address can be specified for encoding in the echo request message of the LSP ping or LSP trace packet.

The endpoint command specifies the endpoint of the policy, which can be either an IPv4 address, matching an SR policy in the IPv4 tunnel table, or an IPv6 address, matching an SR policy in the IPv6 tunnel table.

The color command must correspond to the SR policy color attribute that is configured locally in the case of a static policy instance or signaled in the NLRI of the BGP signaled SR policy instance.

The endpoint and color commands test the active path (or instance) of the identified SR policy only.

The lsp-ping and lsp-trace commands can test one segment list at a time by specifying one segment list of the active instance of the policy or active candidate path. The segment list is selected with the segment-list id command; segment list 1 is tested by default. The segment-list ID corresponds to the same index that was used to save the SR policy instance in the SR policy database. In the case of a static SR policy, the segment-list ID matches the segment-list index entered in the configuration. In both the static and the BGP SR policies, the segment-list ID matches the index displayed for the segment list in the output of the show command of the policies.

The exercised segment list corresponds to a single SR-TE path with its own NHLFE or super NHLFE in the datapath.

The ICMP tunneling feature support with SR policy is described in ICMP-tunneling operation and does not require additional CLI commands.

LSP ping and LSP trace operation

The following operations are supported with both LSP ping and LSP trace.

  • The lsp-ping and lsp-trace features model the tested segment list as a NIL FEC target FEC stack.

  • Both an IPv4 SR policy (endpoint is an IPv4 address) and IPv6 SR policy (endpoint is an IPv6 address) can potentially contain a mix of IPv4 and IPv6 (node, adjacency, or adjacency set) SIDs in the same segment list or across segment lists of the same policy. While this is not a typical use of the SR policy, it is nonetheless allowed in the IETF standard and supported in SR OS. As a result, the downstream interface and node address information returned in the DSMAP or DDMAP TLV can have a different IP family across the path of the SR policy.

    Also, the IPv4 or IPv6 endpoint address can be null (0.0.0.0 or 0::0). This has no impact on the OAM capability.

  • Unlike a SR-TE LSP path, the type of each segment (node, adjacency, or adjacency set) in the SID list may not be known to the sender node, except for the top SID that is validated by the SR policy database and which uses this segment type to resolve the outgoing interface or interfaces and outgoing label or labels to forward the packet out.

  • The NIL FEC type is used to represent each SID in the segment list, including the top SID. The NIL FEC is defined in RFC 8029 and has three main applications:

    • Allow the sender node to insert a FEC stack sub-TLV into the target FEC TLV when the FEC type is not known to the sender node (for SIDs of the SR policy except the top SID) or if there is no explicit FEC associated with the label (for a label of a static LSP or a MPLS forwarding policy). This is the application applicable to the SR policy.

      Although the sender node knows the FEC type for the top SID in the segment list of a SR policy, the NIL FEC is used for consistency. However, the sender node does all the processing required to look up the top SID as per the procedures of any other explicit FEC type.

    • Allow the sender node to insert a FEC stack sub-TLV into the target FEC stack sub-TLV if a special purpose label (for example, Router Alert) is inserted in the packet's label stack to maintain the correct 1-to-1 mapping of the packet's stacked labels to the hierarchy of FEC elements in the target FEC stack TLV processing at the responder node.

      SR OS does not support this application in a sender node role but can process the NIL FEC if received by a third-party implementation.

    • Allow the responder node to hide from the sender node a FEC element that it is pushing or stitching to by adding a NIL FEC TLV with a PUSH or a POP and PUSH (equivalent to a SWAP) operation into the FEC stack change sub-TLV.

      SR OS does not support this application in a sender node role but can process the NIL FEC if received by a third-party implementation.

  • For lsp-ping, the sender node builds a target FEC Stack TLV which contains a single NIL FEC element corresponding to the last segment of the tested segment list of the SR policy.

  • For lsp-trace, the sender node builds a target FEC Stack TLV which contains a NIL FEC element for each SID in the segment list.

  • To support the processing of the NIL FEC in the context of the SR policy and the applications in RFC 8029, SR OS in a receiver node role performs the following operations:

    1. Looks up the label of the NIL FEC in the SR database to match against the SID of a resolved node, a resolved adjacency, a resolved adjacency SET or a binding SID.

    2. If a match exists, continues processing of the NIL FEC.

    3. Otherwise, looks up the label of the NIL FEC in the Label Manager.

    4. If a match exists, processes the FEC as per the POP or SWAP operation provided by the lookup and following the NIL FEC procedures in RFC 8029.

    5. Otherwise, fails the validation and sends a return code of 4 Replying router has no mapping for the FEC at stack-depth <RSC> in the MPLS echo reply message. The sender node fails the probe at this point.

  • A SID label associated with a NIL FEC and which is popped at an LSR, acting in a receiver node role, is first looked up. If the label is valid, the processing results in a return code of 3 <Replying router is an egress for the FEC at stack-depth <RSC>>.

    A label is valid if the LSR validates it in its Segment Routing (SR) database. Because the LSR does not know the actual FEC type and FEC value, the validation succeeds if the SR database indicates that a POP operation is programmed with that label for a node SID.

  • A SID label associated with a NIL FEC and which is swapped at an LSR, acting in a receiver node role, is first looked up. If the label is valid, the processing results in the return code of 8 Label switched at stack-depth <RSC> as per RFC 8029.

    A label is valid if the LSR validates it in its Segment Routing (SR) database. Because the LSR does not know the actual FEC type and FEC value, the validation succeeds if the SR database indicates that a SWAP operation is programmed with that label for a node SID, an adjacency SID, an adjacency SET SID, or a binding SID.

    The swap operation corresponds to swapping the incoming label to an implicit-null label toward the downstream router in the case of an adjacency and toward a set of downstream routers in the case of an adjacency set.

    The swap operation corresponds to swapping the incoming label to one or more labels toward a set of downstream routers in the case of a node SID and a binding SID.

  • The lsp-trace command is supported with the inclusion of the DSMAP TLV, the DDMAP TLV, or none of them by the sender node in the Echo Request message. The responder node returns in the DSMAP or DDMAP TLV the downstream interface information along with the egress label and protocol ID that corresponds to the looked up node SID, adjacency SID, adjacency SET SID, or binding SID.

  • When the Target FEC Stack TLV contains more than one NIL FEC element, the responder node that is the termination of a FEC element indicates the FEC POP operation implicitly by replying with a return code of 3 <Replying router is an egress for the FEC at stack-depth <RSC>>. When the sender node gets this reply, the sender node adjusts the Target FEC Stack TLV by stripping the top FEC before sending the next probe for the same TTL value. When the responder node receives the next Echo Request message with the same TTL value from the sender node, the responder node processes the next FEC element in the stack.

  • The responder node performs validation of the top FEC in the target FEC stack TLV provided that the depth of the incoming label stack in the packet's header is strictly higher than the depth of the target FEC stack TLV.

  • The ttl value in lsp-ping context can be set to a value lower than 255 and the responder node replies if the NIL FEC element in the Target FEC Stack TLV corresponds to a node SID resolved at that node. The responder node, however, fails the validation if the NIL FEC element in the Target FEC Stack TLV corresponds to an adjacency of a remote node. The return code in the echo reply message can be one of rc=4(NoFECMapping) or rc=10(DSRtrUnmatchLabel).

  • The min-ttl and max-ttl commands in lsp-trace context can be set to values other than the default. The min-ttl can, however, properly trace the partial path of an SR policy only if there is no segment termination before the node that corresponds to the min-ttl value. Otherwise, the validation fails and returns an error because the responder node receives a Target FEC Stack depth that is higher than the incoming label stack size. The return code in the echo reply message can be one of rc=4(NoFECMapping), rc=5(DSMappingMismatched), or rc=10(DSRtrUnmatchLabel).

    This is true when the downstream-map-tlv option is set to any of ddmap, dsmap, or none values.
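Steps 1 through 5 of the receiver-side NIL FEC processing above amount to a lookup cascade, sketched below. The dict-based databases stand in for the SR database and the Label Manager; the operation tuples are illustrative, not an SR OS API.

```python
RC_NO_MAPPING = 4    # Replying router has no mapping for the FEC

def validate_nil_fec(label, sr_db, label_mgr):
    """Resolve a NIL FEC label: SR database first, then the Label Manager.

    Each database maps a label to its programmed operation, for example
    ("POP",) or ("SWAP", [out_labels]); unknown labels fail validation.
    """
    if label in sr_db:          # node, adjacency, adjacency SET, or binding SID
        return sr_db[label]
    if label in label_mgr:      # static LSP / MPLS forwarding policy labels
        return label_mgr[label]
    return ("ERROR", RC_NO_MAPPING)   # sender fails the probe at this point
```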

ICMP-tunneling operation

The ICMP tunneling feature operates in the same way as in an SR-TE LSP. When the label TTL of a traceroute packet of a core IPv4 or IPv6 route or a VPN IPv4 or VPN IPv6 route expires at an LSR, the LSR generates an ICMP reply packet of type 11 (time exceeded) and injects it in the forward direction of the SR policy. When the packet is received by the egress LER or a BGP border router, SR OS performs a regular user packet route lookup in the datapath in the GRT context or in a VPRN context and forwards the packet to the destination. The destination of the packet is the sender of the original packet whose TTL expired at the LSR.
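The forward-injection behavior described above can be modeled conceptually as follows; the packet dictionaries and function name are assumptions for illustration, not a real datapath API.

```python
# Conceptual sketch of ICMP tunneling at an LSR: the time-exceeded reply is
# injected in the *forward* direction of the SR policy, so the egress LER or
# BGP border router returns it to the sender via a normal route lookup.

def handle_label_ttl_expiry(orig_packet, remaining_label_stack):
    icmp_reply = {
        "type": 11, "code": 0,           # ICMP time exceeded
        "dst": orig_packet["src"],       # destined to the original sender
    }
    # Reuse the LSP label stack instead of routing the reply backward.
    return {"labels": list(remaining_label_stack), "payload": icmp_reply}
```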

SR extensions for LSP ping and LSP trace

This section describes how MPLS OAM models the SR tunnel types.

An SR shortest path tunnel (an SR-ISIS or SR-OSPF tunnel) uses a single FEC element in the Target FEC Stack TLV. The FEC corresponds to the prefix of the node SID in a specific IGP instance.

IPv4 IGP-prefix segment ID illustrates the format for the IPv4 IGP-prefix segment ID:

Figure 4. IPv4 IGP-prefix segment ID

In this format, the fields are as follows:

  • IPv4 prefix

    This field carries the IPv4 prefix to which the segment ID is assigned. For anycast segment ID, this field carries the IPv4 anycast address. If the prefix is shorter than 32 bits, trailing bits must be set to zero.

  • Prefix length

    The Prefix Length field is one octet. It gives the length of the prefix in bits (values can be 1 to 32).

  • Protocol

    This field is set to 1 if the IGP protocol is OSPF and is set to 2 if the IGP protocol is IS-IS.
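Assuming the fields above are packed as prefix (4 octets), prefix length (1 octet), protocol (1 octet), plus a 2-octet zero pad in the style of RFC 8287, a minimal encoder can be sketched as follows; verify the exact sub-TLV layout against the RFC before relying on it.

```python
import ipaddress
import struct

PROTO_OSPF, PROTO_ISIS = 1, 2

def encode_ipv4_prefix_sid(prefix, protocol):
    """Pack the IPv4 IGP-prefix segment ID fields described above.

    strict=False masks host bits, so trailing bits beyond the prefix
    length are zeroed as the field description requires.
    """
    net = ipaddress.IPv4Network(prefix, strict=False)
    return struct.pack("!4sBB2x", net.network_address.packed,
                       net.prefixlen, protocol)
```

For example, encode_ipv4_prefix_sid("10.20.1.6/32", PROTO_ISIS) yields an 8-octet field for the node SID prefix used in the examples later in this section.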

IPv6 IGP prefix segment ID illustrates the format for the IPv6 IGP prefix segment ID.

Figure 5. IPv6 IGP prefix segment ID

In this format, the fields are as follows:

  • IPv6 prefix

    This field carries the IPv6 prefix to which the segment ID is assigned. For anycast segment ID, this field carries the IPv6 anycast address. If the prefix is shorter than 128 bits, trailing bits must be set to zero.

  • Prefix length

    The Prefix Length field is one octet. It gives the length of the prefix in bits (values can be 1 to 128).

  • Protocol

    This field is set to 1 if the IGP protocol is OSPF and is set to 2 if the IGP protocol is IS-IS.

An SR-TE LSP, as a hierarchical LSP, uses the Target FEC Stack TLV, which contains a FEC element for each node SID and for each adjacency SID in the path of the SR-TE LSP. Because the SR-TE LSP does not instantiate state in any LSR other than the ingress LSR, MPLS OAM is just testing a hierarchy of node SID and adjacency SID segments toward the destination of the SR-TE LSP. The format of the node SID is as illustrated in the preceding figures. IGP-adjacency segment ID illustrates the format for the IGP-adjacency segment ID.

Figure 6. IGP-adjacency segment ID

In this format, the fields are as follows:

  • Adj. type (adjacency type)

    This field is set to 1 when the adjacency segment is a parallel adjacency, as defined in section 3.5.1 of I-D.ietf-spring-segment-routing. This field is set to 4 when the adjacency segment is IPv4-based and is not a parallel adjacency, and to 6 when the adjacency segment is IPv6-based and is not a parallel adjacency.

  • Protocol

    This field is set to 1 if the IGP protocol is OSPF and is set to 2 if the IGP protocol is IS-IS.

  • Local interface ID

    This field is an identifier assigned by the local LSR for the link on which the adjacency segment ID is bound. This field is set to the local link address (IPv4 or IPv6). If unnumbered, the 32-bit link identifier defined in RFC 4203 and RFC 5307 is used. If the adjacency segment ID represents parallel adjacencies, as described in section 3.5.1 of I-D.ietf-spring-segment-routing, this field must be set to zero.

  • Remote interface ID

    This field is an identifier assigned by the remote LSR for the link on which the adjacency segment ID is bound. This field is set to the remote (downstream neighbor) link address (IPv4 or IPv6). If unnumbered, the 32-bit link identifier defined in RFC 4203 and RFC 5307 is used. If the adjacency segment ID represents parallel adjacencies, as described in section 3.5.1 of I-D.ietf-spring-segment-routing, this field must be set to zero.

  • Advertising node identifier

    This field specifies the advertising node identifier. When the Protocol field is set to 1, then the 32 rightmost bits represent the OSPF router ID. If the Protocol field is set to 2, this field carries the 48-bit IS-IS system ID.

  • Receiving node identifier

    This field specifies the downstream node identifier. When the Protocol field is set to 1, then the 32 rightmost bits represent OSPF router ID. If the Protocol field is set to 2, this field carries the 48-bit IS-IS system ID.
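The Adj. Type and node identifier rules above can be captured in two small helpers. The 6-octet node identifier width, with the OSPF router ID right-aligned in the 32 rightmost bits, is an assumption; check RFC 8287 for the exact field layout.

```python
import ipaddress

# Helpers reflecting the IGP-adjacency segment ID field rules above.
ADJ_PARALLEL, ADJ_IPV4, ADJ_IPV6 = 1, 4, 6
PROTO_OSPF, PROTO_ISIS = 1, 2

def adjacency_type(parallel, ipv6):
    """Adj. Type: 1 for a parallel adjacency, else 4 (IPv4) or 6 (IPv6)."""
    if parallel:
        return ADJ_PARALLEL
    return ADJ_IPV6 if ipv6 else ADJ_IPV4

def node_identifier(protocol, router_id=None, system_id=b""):
    """Advertising or receiving node identifier as an assumed 6-octet field."""
    if protocol == PROTO_OSPF:
        # 32 rightmost bits carry the OSPF router ID.
        return b"\x00\x00" + ipaddress.IPv4Address(router_id).packed
    return system_id    # 48-bit IS-IS system ID, already 6 octets
```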

Both lsp-ping and lsp-trace apply to the following contexts:

  • SR-ISIS or SR-OSPF shortest path IPv4 tunnel

  • SR-ISIS or SR-OSPF3 (OSPFv3 instance ID 0-31) shortest path IPv6 tunnel

  • IS-IS SR-TE IPv4 LSP and OSPF SR-TE IPv4 LSP

  • IS-IS SR-TE IPv6 LSP

  • SR-ISIS IPv4 tunnel stitched to an LDP IPv4 FEC

  • BGP IPv4 LSP or BGP IPv6 LSP (with an IPv4 or an IPv4-mapped-IPv6 next-hop) resolved over an SR‑ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, or an SR-TE IPv4 LSP. This includes support for BGP LSP across AS boundaries and for ECMP next-hops at the transport tunnel level.

  • BGP IPv4 LSP (with an IPv6 next-hop) or a BGP IPv6 LSP resolved over an SR-ISIS IPv6 tunnel, an SR-OSPF3 IPv6 tunnel, or an SR-TE IPv6 LSP; including support for BGP LSP across AS boundaries and for ECMP next-hops at the transport tunnel level.

  • SR-ISIS or SR-OSPF IPv4 tunnel resolved over IGP IPv4 shortcuts using RSVP-TE LSPs

  • SR-ISIS IPv6 tunnel resolved over IGP IPv4 shortcuts using RSVP-TE LSPs

  • LDP IPv4 FEC resolved over IGP IPv4 shortcuts using SR-TE LSPs

Operation on SR-ISIS or SR-OSPF tunnels

The following operations apply to lsp-ping and lsp-trace:

  • The sender node builds the Target FEC Stack TLV with a single FEC element corresponding to the node SID of the destination of the SR-ISIS or SR-OSPF tunnel.

  • A node SID label that is swapped at an LSR results in the return code of 8, ‟Label switched at stack-depth <RSC>” as defined in RFC 8029.

  • A node SID label that is popped at an LSR results in a return code of 3, ‟Replying router is an egress for the FEC at stack-depth <RSC>”.

  • The lsp-trace command is supported with the inclusion of the DSMAP TLV, the DDMAP TLV, or none (when none is configured, no Map TLV is sent). The downstream interface information is returned along with the egress label for the node SID tunnel and the protocol that resolved the node SID at the responder node.

The following diagram shows an example topology for an lsp-ping and lsp-trace of an SR-ISIS node SID tunnel.

Figure 7. Testing MPLS OAM with SR tunnels
LSP ping on DUT-A for target node SID of DUT-F
*A:Dut-A# oam lsp-ping sr-isis prefix 10.20.1.6/32 igp-instance 0 detail
LSP-PING 10.20.1.6/32: 80 bytes MPLS payload
Seq=1, send from intf int_to_B, reply from 10.20.1.6
       udp-data-len=32 ttl=255 rtt=1220324ms rc=3 (EgressRtr)
---- LSP 10.20.1.6/32 PING Statistics ----
1 packets sent, 1 packets received, 0.00% packet loss
round-trip min = 1220324ms, avg = 1220324ms, max = 1220324ms, stddev = 0.000ms
LSP trace on DUT-A for target node SID of DUT-F (DSMAP TLV)
*A:Dut-A# oam lsp-trace sr-isis prefix 10.20.1.6/32 igp-instance 0 detail
lsp-trace to 10.20.1.6/32: 0 hops min, 0 hops max, 108 byte packets
1  10.20.1.2  rtt=1220323ms rc=8(DSRtrMatchLabel) rsc=1
     DS 1: ipaddr=10.10.4.4 ifaddr=10.10.4.4 iftype=ipv4Numbered MRU=1496
           label[1]=26406 protocol=6(ISIS)
2  10.20.1.4  rtt=1220323ms rc=8(DSRtrMatchLabel) rsc=1
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1496
           label[1]=26606 protocol=6(ISIS)
3  10.20.1.6  rtt=1220324ms rc=3(EgressRtr) rsc=1
LSP trace on DUT-A for target node SID of DUT-F (DDMAP TLV)
*A:Dut-A# oam lsp-trace sr-isis prefix 10.20.1.6/32 igp-instance 0 downstream-map-tlv ddmap detail
lsp-trace to 10.20.1.6/32: 0 hops min, 0 hops max, 108 byte packets
1  10.20.1.2  rtt=1220323ms rc=8(DSRtrMatchLabel) rsc=1
     DS 1: ipaddr=10.10.4.4 ifaddr=10.10.4.4 iftype=ipv4Numbered MRU=1496
           label[1]=26406 protocol=6(ISIS)
2  10.20.1.4  rtt=1220324ms rc=8(DSRtrMatchLabel) rsc=1
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1496
           label[1]=26606 protocol=6(ISIS)
3  10.20.1.6  rtt=1220324ms rc=3(EgressRtr) rsc=1

Operation on SR-TE LSP

The following operations apply to lsp-ping and lsp-trace:

  • The sender node builds a target FEC Stack TLV that contains FEC elements.

    For lsp-ping, the Target FEC Stack TLV contains a single FEC element that corresponds to the last segment; that is, a node SID or an adjacency SID of the destination of the SR-TE LSP.

    For lsp-trace, the Target FEC Stack TLV contains a FEC element for each node SID and for each adjacency SID in the path of the SR-TE LSP, including that of the destination of the SR-TE LSP.

  • A node SID label popped at an LSR results in a return code of 3 ‟Replying router is an egress for the FEC at stack-depth <RSC>”.

    An adjacency SID label popped at an LSR results in a return code of 3, ‟Replying router is an egress for the FEC at stack-depth <RSC>”.

  • A node SID label that is swapped at an LSR results in the return code of 8, "Label switched at stack-depth <RSC>" as defined in RFC 8029, Detecting Multiprotocol Label Switched (MPLS) Data-Plane Failures.

    An adjacency SID label that is swapped at an LSR results in the return code of 8, "Label switched at stack-depth <RSC>" as defined in RFC 8029; for example, in SR OS, ‟rc=8(DSRtrMatchLabel) rsc=1”.

  • The lsp-trace command is supported with the inclusion of the DSMAP TLV, the DDMAP TLV, or none (when none is configured, no Map TLV is sent). The downstream interface information is returned along with the egress label for the node SID tunnel or the adjacency SID tunnel of the current segment as well as the protocol which resolved the tunnel at the responder node.

  • When the Target FEC Stack TLV contains more than one FEC element, the responder node that is the termination of a node SID or adjacency SID segment pops its own SID in the first operation. When the sender node receives this reply, it adjusts the Target FEC Stack TLV by stripping the top FEC before resending the probe with the same TTL value. When the responder node receives the next echo request message with the same TTL value from the sender node for the next node SID or adjacency SID segment in the stack, it performs a swap operation to that next segment.

  • When the path of the SR-TE LSP is computed by the sender node, the hop-to-label translation tool returns the IGP instance that was used to determine the labels for each hop of the path. When the path of an SR-TE LSP is computed by a PCE, the protocol ID is not returned in the SR-ERO by PCEP. In this case, the sender node performs a lookup in the SR module for the IGP instance that resolved the first segment of the path. In both cases, the determined IGP is used to encode the Protocol ID field of the node SID or adjacency SID in each of the FEC elements of a Target FEC Stack TLV.

  • The responder node performs validation of the top FEC in the Target FEC Stack TLV, provided that the depth of the incoming label stack in the packet’s header is higher than the depth of the Target FEC Stack TLV.

  • TTL values can be changed.

    The ttl value in lsp-ping can be set to a value lower than 255 and the responder node replies if the FEC element in the Target FEC Stack TLV corresponds to a node SID resolved at that node. The responder node, however, fails the validation if the FEC element in the Target FEC Stack TLV is the adjacency of a remote node. The return code in the echo reply message can be one of: ‟rc=4(NoFECMapping)” or ‟rc=10(DSRtrUnmatchLabel)”.

    The min-ttl and max-ttl values in lsp-trace can be set to values other than the default. The minimum TTL value can, however, properly trace the partial path of an SR-TE LSP only if there is no segment termination before the node that corresponds to the minimum TTL value. Otherwise, it fails validation and returns an error as the responder node would receive a target FEC stack depth that is higher than the incoming label stack size. The return code in the echo reply message can be one of: ‟rc=4(NoFECMapping)”, ‟rc=5(DSMappingMismatched)”, or ‟rc=10(DSRtrUnmatchLabel)”.

    This behavior applies when the downstream-map-tlv option is set to any of the ddmap, dsmap, or none values.
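The sender-side probe logic in the list above can be sketched as follows. This is an illustrative model only, not SR OS code; the function and data structures are hypothetical. Each hop that terminates the current segment answers with return code 3 (EgressRtr) for the top FEC, after which the sender strips that FEC; a hop that only swaps the label answers with return code 8 (DSRtrMatchLabel).

```python
# Illustrative model of the sender-side lsp-trace behavior described above:
# the Target FEC Stack TLV initially carries one FEC element per segment,
# and the top FEC is stripped each time a responder reports that it
# terminated (popped) the current segment. All names are hypothetical.

def trace_sr_te(segment_fecs, terminating_hops):
    """segment_fecs: FEC elements, top (current segment) first.
    terminating_hops: for each TTL, True if that node terminates the
    current segment (pops its SID), False if it only swaps the label.
    Returns (ttl, return_code, rsc) tuples, where rsc is the
    return-subcode (stack depth) as in 'rc=8(...) rsc=N'."""
    fec_stack = list(segment_fecs)
    replies = []
    for ttl, terminates in enumerate(terminating_hops, start=1):
        depth = len(fec_stack)
        if terminates:
            # Egress for the top FEC: rc=3 (EgressRtr) at the current depth.
            replies.append((ttl, 3, depth))
            fec_stack.pop(0)   # sender strips the top FEC for the next probe
            if fec_stack:
                # The same node swaps to the next segment: rc=8 at depth-1.
                replies.append((ttl, 8, depth - 1))
        else:
            # Pure transit LSR for the segment: label switched, rc=8.
            replies.append((ttl, 8, depth))
    return replies

# Strict-hop adjacency-SID path with three segments: every hop terminates
# one segment, mirroring the paired rc=3/rc=8 lines in the trace output.
replies = trace_sr_te(['adj(A-B)', 'adj(B-C)', 'adj(C-E)'], [True, True, True])
# replies: [(1, 3, 3), (1, 8, 2), (2, 3, 2), (2, 8, 1), (3, 3, 1)]
```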

The following sections show example outputs for lsp-ping and lsp-trace for some SR-TE LSPs. The first example uses a path with strict hops, each corresponding to an adjacency SID, while the second example uses a path with loose hops, each corresponding to a node SID. Assume the topology shown in Testing MPLS OAM with SR-TE LSP.

Figure 8. Testing MPLS OAM with SR-TE LSP

The following is an output example for LSP-PING and LSP-TRACE on DUT-A for a strict-hop adjacency SID SR-TE LSP, where:

  • source = DUT-A

  • destination = DUT-F

  • path = A-B, B-C, C-E, E-D, D-F

*A:Dut-A# oam lsp-ping sr-te "srteABCEDF" detail
LSP-PING srteABCEDF: 96 bytes MPLS payload
Seq=1, send from intf int_to_B, reply from 10.20.1.6
       udp-data-len=32 ttl=255 rtt=1220325ms rc=3 (EgressRtr)
---- LSP srteABCEDF PING Statistics ----
1 packets sent, 1 packets received, 0.00% packet loss
round-trip min = 1220325ms, avg = 1220325ms, max = 1220325ms, stddev = 0.000ms
*A:Dut-A# oam lsp-trace sr-te "srteABCEDF" downstream-map-tlv ddmap detail
lsp-trace to srteABCEDF: 0 hops min, 0 hops max, 252 byte packets
1  10.20.1.2  rtt=1220323ms rc=3(EgressRtr) rsc=5
1  10.20.1.2  rtt=1220322ms rc=8(DSRtrMatchLabel) rsc=4
     DS 1: ipaddr=10.10.33.3 ifaddr=10.10.33.3 iftype=ipv4Numbered MRU=1520
           label[1]=3 protocol=6(ISIS)
           label[2]=262135 protocol=6(ISIS)
           label[3]=262134 protocol=6(ISIS)
           label[4]=262137 protocol=6(ISIS)
2  10.20.1.3  rtt=1220323ms rc=3(EgressRtr) rsc=4
2  10.20.1.3  rtt=1220323ms rc=8(DSRtrMatchLabel) rsc=3
     DS 1: ipaddr=10.10.5.5 ifaddr=10.10.5.5 iftype=ipv4Numbered MRU=1496
           label[1]=3 protocol=6(ISIS)
           label[2]=262134 protocol=6(ISIS)
           label[3]=262137 protocol=6(ISIS)
3  10.20.1.5  rtt=1220325ms rc=3(EgressRtr) rsc=3
3  10.20.1.5  rtt=1220325ms rc=8(DSRtrMatchLabel) rsc=2
     DS 1: ipaddr=10.10.11.4 ifaddr=10.10.11.4 iftype=ipv4Numbered MRU=1496
           label[1]=3 protocol=6(ISIS)
           label[2]=262137 protocol=6(ISIS)
4  10.20.1.4  rtt=1220324ms rc=3(EgressRtr) rsc=2
4  10.20.1.4  rtt=1220325ms rc=8(DSRtrMatchLabel) rsc=1
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1496
           label[1]=3 protocol=6(ISIS)
5  10.20.1.6  rtt=1220325ms rc=3(EgressRtr) rsc=1

The following is an output example for LSP-PING and LSP-TRACE on DUT-A for a loose-hop node SID SR-TE LSP, where:

  • source = DUT-A

  • destination = DUT-F

  • path = A, B, C, E

*A:Dut-A# oam lsp-ping sr-te "srteABCE_loose" detail
LSP-PING srteABCE_loose: 80 bytes MPLS payload
Seq=1, send from intf int_to_B, reply from 10.20.1.5
       udp-data-len=32 ttl=255 rtt=1220324ms rc=3 (EgressRtr)
---- LSP srteABCE_loose PING Statistics ----
1 packets sent, 1 packets received, 0.00% packet loss
round-trip min = 1220324ms, avg = 1220324ms, max = 1220324ms, stddev = 0.000ms
*A:Dut-A# oam lsp-trace sr-te "srteABCE_loose" downstream-map-tlv ddmap detail
lsp-trace to srteABCE_loose: 0 hops min, 0 hops max, 140 byte packets
1  10.20.1.2  rtt=1220323ms rc=3(EgressRtr) rsc=3
1  10.20.1.2  rtt=1220322ms rc=8(DSRtrMatchLabel) rsc=2
     DS 1: ipaddr=10.10.3.3 ifaddr=10.10.3.3 iftype=ipv4Numbered MRU=1496
           label[1]=26303 protocol=6(ISIS)
           label[2]=26305 protocol=6(ISIS)
     DS 2: ipaddr=10.10.12.3 ifaddr=10.10.12.3 iftype=ipv4Numbered MRU=1496
           label[1]=26303 protocol=6(ISIS)
           label[2]=26305 protocol=6(ISIS)
     DS 3: ipaddr=10.10.33.3 ifaddr=10.10.33.3 iftype=ipv4Numbered MRU=1496
           label[1]=26303 protocol=6(ISIS)
           label[2]=26305 protocol=6(ISIS)
2  10.20.1.3  rtt=1220323ms rc=3(EgressRtr) rsc=2
2  10.20.1.3  rtt=1220323ms rc=8(DSRtrMatchLabel) rsc=1
     DS 1: ipaddr=10.10.5.5 ifaddr=10.10.5.5 iftype=ipv4Numbered MRU=1496
           label[1]=26505 protocol=6(ISIS)
     DS 2: ipaddr=10.10.11.5 ifaddr=10.10.11.5 iftype=ipv4Numbered MRU=1496
           label[1]=26505 protocol=6(ISIS)
3  10.20.1.5  rtt=1220324ms rc=3(EgressRtr) rsc=1

Operation on an SR-ISIS tunnel stitched to an LDP FEC

The following operations apply to lsp-ping and lsp-trace:

  • The lsp-ping tool only works when the responder node is in the same domain (SR or LDP) as the sender node.

  • The lsp-trace tool works throughout the LDP and SR domains. When used with the DDMAP TLV, lsp-trace provides the details of the SR-LDP stitching operation at the boundary node. The boundary node as a responder node replies with the FEC stack change TLV, which contains two operations:

    • a PUSH operation of the SR (LDP) FEC in the LDP-to-SR (SR-to-LDP) direction

    • a POP operation of the LDP (SR) FEC in the LDP-to-SR (SR-to-LDP) direction

  • The ICMP tunneling feature is supported for an SR-ISIS tunnel stitched to an LDP FEC.
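The FEC stack change operations described above can be sketched as a simple transformation on the sender's Target FEC Stack TLV. This is a hedged illustration with hypothetical data structures, not SR OS internals: when the boundary node replies with rc=15 (LabelSwitchedWithFecChange), the sender applies each reported POP and PUSH to its FEC stack before probing deeper.

```python
# Hedged sketch of applying the FEC stack change TLV returned by the SR-LDP
# boundary node: each POP removes the top FEC element and each PUSH adds a
# new one, so the Target FEC Stack TLV sent at the next TTL reflects the
# stitched domain. Data structures are illustrative only.

def apply_fec_changes(target_fec_stack, fec_changes):
    """target_fec_stack: FEC elements, top first.
    fec_changes: (op, fec) tuples in the order reported by the responder."""
    stack = list(target_fec_stack)
    for op, fec in fec_changes:
        if op == 'POP':
            stack.pop(0)            # remove the FEC popped at the boundary
        elif op == 'PUSH':
            stack.insert(0, fec)    # add the FEC pushed at the boundary
        else:
            raise ValueError(f'unknown FEC change operation: {op}')
    return stack

# LDP-to-SR direction: the boundary node pops the LDP FEC and pushes the
# SR FEC for the same /32 prefix, as in the rc=15 trace output.
stack = apply_fec_changes(
    [('LDP IPv4', '10.20.1.2/32')],
    [('POP', ('LDP IPv4', '10.20.1.2/32')),
     ('PUSH', ('SR IPv4 Prefix', '10.20.1.2/32'))])
# stack == [('SR IPv4 Prefix', '10.20.1.2/32')]
```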

The following is an output example of the lsp-trace command with the DDMAP TLV for LDP-to-SR direction (symmetric topology LDP-SR-LDP):

*A:Dut-E# oam lsp-trace prefix 10.20.1.2/32 detail downstream-map-tlv ddmap 
lsp-trace to 10.20.1.2/32: 0 hops min, 0 hops max, 108 byte packets
1  10.20.1.3  rtt=3.25ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.10.3.2 ifaddr=10.10.3.2 iftype=ipv4Numbered MRU=1496 
           label[1]=26202 protocol=6(ISIS)
           fecchange[1]=POP  fectype=LDP IPv4 prefix=10.20.1.2 remotepeer=0.0.0.0 (Unknown)
           fecchange[2]=PUSH fectype=SR Ipv4 Prefix prefix=10.20.1.2 remotepeer=10.10.3.2 
2  10.20.1.2  rtt=4.32ms rc=3(EgressRtr) rsc=1 

The following is an output example of the lsp-trace command with the DDMAP TLV for SR-to-LDP direction (symmetric topology LDP-SR-LDP):

*A:Dut-B# oam lsp-trace prefix 10.20.1.5/32 detail downstream-map-tlv ddmap sr-isis 
lsp-trace to 10.20.1.5/32: 0 hops min, 0 hops max, 108 byte packets
1  10.20.1.3  rtt=2.72ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.11.5.5 ifaddr=10.11.5.5 iftype=ipv4Numbered MRU=1496 
           label[1]=262143 protocol=3(LDP)
           fecchange[1]=POP  fectype=SR Ipv4 Prefix prefix=10.20.1.5 remotepeer=0.0.0.0 (Unknown)
           fecchange[2]=PUSH fectype=LDP IPv4 prefix=10.20.1.5 remotepeer=10.11.5.5 
2  10.20.1.5  rtt=4.43ms rc=3(EgressRtr) rsc=1

Operation on a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel, SR-OSPF IPv4 tunnel, or SR-TE IPv4 LSP

SR OS enhances lsp-ping and lsp-trace of a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, or an SR-TE IPv4 LSP. The SR OS enhancement reports the full set of ECMP next-hops for the transport tunnel at both the ingress PE and the ABR or ASBR. The list of downstream next-hops is reported in the DSMAP or DDMAP TLV.

When the user initiates an lsp-trace of the BGP IPv4 LSP with the path-destination option specified, the CPM hash code at the responder node selects the outgoing interface to be returned in the DSMAP or DDMAP TLV. The selection takes the hash value computed over the label stack or the IP headers of the echo request message (where the destination IP is replaced by the specific 127/8 prefix address in the multipath type 8 field of the DSMAP or DDMAP) modulo the number of outgoing interfaces in the ECMP set.
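The hash-modulo selection described above can be illustrated with a minimal sketch. The hash function here is a deterministic stand-in (CRC32); the actual CPM hash code is platform-specific, and all names are hypothetical.

```python
# Simplified illustration of responder-side ECMP interface selection: a
# hash computed over the echo request fields (with the destination IP
# replaced by the 127/8 multipath address) is taken modulo the number of
# outgoing interfaces in the ECMP set. zlib.crc32 is a stand-in hash.
import zlib

def select_ecmp_interface(label_stack, src_ip, multipath_addr, interfaces):
    key = ('-'.join(map(str, label_stack)) + src_ip + multipath_addr).encode()
    index = zlib.crc32(key) % len(interfaces)
    return interfaces[index]

# Three-way ECMP set, as in the DS 1/DS 2/DS 3 entries of the trace output.
ifaces = ['10.10.5.5', '10.10.11.4', '10.10.11.5']
chosen = select_ecmp_interface([27506, 262137], '10.20.1.1', '127.1.1.1', ifaces)
```

Because the hash is deterministic, repeated probes with the same 127/8 path-destination always report the same downstream interface, while varying the path-destination address walks the ECMP set.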

Example topology for BGP over SR-OSPF, SR-TE (OSPF), SR-ISIS, and SR-TE (ISIS) shows the topology used in the subsequent BGP over SR-OSPF, BGP over SR-TE (OSPF), BGP over SR-ISIS, and BGP over SR-TE (ISIS) examples.

Figure 9. Example topology for BGP over SR-OSPF, SR-TE (OSPF), SR-ISIS, and SR-TE (ISIS)

The following are example outputs of the lsp-trace command for a hierarchical tunnel consisting of a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, or an SR-TE IPv4 LSP.

BGP over SR-OSPF
*A:Dut-A# oam lsp-trace bgp-label prefix 11.21.1.6/32 detail downstream-map-tlv ddmap path-destination 127.1.1.1 
lsp-trace to 11.21.1.6/32: 0 hops min, 0 hops max, 168 byte packets
1  10.20.1.3  rtt=2.31ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.5.5 ifaddr=10.10.5.5 iftype=ipv4Numbered MRU=1496 
           label[1]=27506 protocol=5(OSPF)
           label[2]=262137 protocol=2(BGP)
     DS 2: ipaddr=10.10.11.4 ifaddr=10.10.11.4 iftype=ipv4Numbered MRU=1496 
           label[1]=27406 protocol=5(OSPF)
           label[2]=262137 protocol=2(BGP)
     DS 3: ipaddr=10.10.11.5 ifaddr=10.10.11.5 iftype=ipv4Numbered MRU=1496 
           label[1]=27506 protocol=5(OSPF)
           label[2]=262137 protocol=2(BGP)
2   10.20.1.4  rtt=4.91ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1492 
           label[1]=27606 protocol=5(OSPF)
           label[2]=262137 protocol=2(BGP)
3  10.20.1.6  rtt=4.73ms rc=3(EgressRtr) rsc=2 
3  10.20.1.6  rtt=5.44ms rc=3(EgressRtr) rsc=1 
BGP over SR-TE (OSPF)
*A:Dut-A# oam lsp-trace bgp-label prefix 11.21.1.6/32 detail downstream-map-tlv ddmap path-destination 127.1.1.1 
lsp-trace to 11.21.1.6/32: 0 hops min, 0 hops max, 236 byte packets
1  10.20.1.2  rtt=2.13ms rc=3(EgressRtr) rsc=4 
1  10.20.1.2  rtt=1.79ms rc=8(DSRtrMatchLabel) rsc=3 
     DS 1: ipaddr=10.10.4.4 ifaddr=10.10.4.4 iftype=ipv4Numbered MRU=1492 
           label[1]=3 protocol=5(OSPF)
           label[2]=262104 protocol=5(OSPF)
           label[3]=262139 protocol=2(BGP)
2  10.20.1.4  rtt=3.24ms rc=3(EgressRtr) rsc=3 
2  10.20.1.4  rtt=4.46ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1492 
           label[1]=3 protocol=5(OSPF)
           label[2]=262139 protocol=2(BGP)
3  10.20.1.6  rtt=6.24ms rc=3(EgressRtr) rsc=2 
3  10.20.1.6  rtt=6.18ms rc=3(EgressRtr) rsc=1  
BGP over SR-ISIS
A:Dut-A# oam lsp-trace bgp-label prefix 11.21.1.6/32 detail downstream-map-tlv ddmap path-destination 127.1.1.1 
lsp-trace to 11.21.1.6/32: 0 hops min, 0 hops max, 168 byte packets
1  10.20.1.3  rtt=3.33ms rc=8(DSRtrMatchLabel) rsc=2 
    DS 1:  ipaddr=10.10.5.5 ifaddr=10.10.5.5 iftype=ipv4Numbered MRU=1496 
           label[1]=28506 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
     DS 2: ipaddr=10.10.11.4 ifaddr=10.10.11.4 iftype=ipv4Numbered MRU=1496 
           label[1]=28406 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
     DS 3: ipaddr=10.10.11.5 ifaddr=10.10.11.5 iftype=ipv4Numbered MRU=1496 
           label[1]=28506 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
2  10.20.1.4  rtt=5.12ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1492 
           label[1]=28606 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
3  10.20.1.6  rtt=8.41ms rc=3(EgressRtr) rsc=2 
3  10.20.1.6  rtt=6.93ms rc=3(EgressRtr) rsc=1 
BGP over SR-TE (ISIS)
*A:Dut-A# oam lsp-trace bgp-label prefix 11.21.1.6/32 detail downstream-map-tlv ddmap path-destination 127.1.1.1 
lsp-trace to 11.21.1.6/32: 0 hops min, 0 hops max, 248 byte packets
1  10.20.1.2  rtt=2.60ms rc=3(EgressRtr) rsc=4 
1  10.20.1.2  rtt=2.29ms rc=8(DSRtrMatchLabel) rsc=3 
     DS 1: ipaddr=10.10.4.4 ifaddr=10.10.4.4 iftype=ipv4Numbered MRU=1492 
           label[1]=3 protocol=6(ISIS)
           label[2]=262094 protocol=6(ISIS)
           label[3]=262139 protocol=2(BGP)
2  10.20.1.4  rtt=4.04ms rc=3(EgressRtr) rsc=3 
2  10.20.1.4  rtt=4.38ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.9.6 ifaddr=10.10.9.6 iftype=ipv4Numbered MRU=1492 
           label[1]=3 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
3  10.20.1.6  rtt=6.64ms rc=3(EgressRtr) rsc=2 
3  10.20.1.6  rtt=5.94ms rc=3(EgressRtr) rsc=1

Assume the topology in Example topology for BGP over SR-ISIS in inter-AS option C and BGP over SR-TE (ISIS) in inter-AS option C, with the addition of an External Border Gateway Protocol (eBGP) peering between nodes B and C. The BGP IPv4 LSP then spans the AS boundary and resolves to an SR-ISIS tunnel or an SR-TE LSP within each AS.

Figure 10. Example topology for BGP over SR-ISIS in inter-AS option C and BGP over SR-TE (ISIS) in inter-AS option C

BGP over SR-ISIS in inter-AS option C
*A:Dut-A# oam lsp-trace bgp-label prefix 11.20.1.6/32 src-ip-address 11.20.1.1 detail downstream-map-tlv ddmap path-destination 127.1.1.1 
lsp-trace to 11.20.1.6/32: 0 hops min, 0 hops max, 168 byte packets
1  10.20.1.2  rtt=2.69ms rc=3(EgressRtr) rsc=2 
1  10.20.1.2  rtt=3.15ms rc=8(DSRtrMatchLabel) rsc=1 
     DS 1: ipaddr=10.10.3.3 ifaddr=10.10.3.3 iftype=ipv4Numbered MRU=0 
           label[1]=262127 protocol=2(BGP)
2  10.20.1.3  rtt=5.26ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.10.5.5 ifaddr=10.10.5.5 iftype=ipv4Numbered MRU=1496 
           label[1]=26506 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
           fecchange[1]=PUSH fectype=SR Ipv4 Prefix prefix=10.20.1.6 remotepeer=10.10.5.5 
3  10.20.1.5  rtt=7.08ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.10.6 ifaddr=10.10.10.6 iftype=ipv4Numbered MRU=1496 
           label[1]=26606 protocol=6(ISIS)
            label[2]=262139 protocol=2(BGP)
4  10.20.1.6  rtt=9.41ms rc=3(EgressRtr) rsc=2 
4  10.20.1.6  rtt=9.53ms rc=3(EgressRtr) rsc=1
BGP over SR-TE (ISIS) in inter-AS option C
*A:Dut-A# oam lsp-trace bgp-label prefix 11.20.1.6/32 src-ip-address 11.20.1.1 detail downstream-map-tlv ddmap path-destination 127.1.1.1 
lsp-trace to 11.20.1.6/32: 0 hops min, 0 hops max, 168 byte packets
1  10.20.1.2  rtt=2.77ms rc=3(EgressRtr) rsc=2 
1  10.20.1.2  rtt=2.92ms rc=8(DSRtrMatchLabel) rsc=1 
     DS 1: ipaddr=10.10.3.3 ifaddr=10.10.3.3 iftype=ipv4Numbered MRU=0 
           label[1]=262127 protocol=2(BGP)
2  10.20.1.3  rtt=4.82ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.10.5.5 ifaddr=10.10.5.5 iftype=ipv4Numbered MRU=1496 
           label[1]=26505 protocol=6(ISIS)
           label[2]=26506 protocol=6(ISIS)
           label[3]=262139 protocol=2(BGP)
           fecchange[1]=PUSH fectype=SR Ipv4 Prefix prefix=10.20.1.6            remotepeer=0.0.0.0 (Unknown)
           fecchange[2]=PUSH fectype=SR Ipv4 Prefix prefix=10.20.1.5            remotepeer=10.10.5.5 
3  10.20.1.5  rtt=7.10ms rc=3(EgressRtr) rsc=3 
3  10.20.1.5  rtt=7.45ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.10.6 ifaddr=10.10.10.6 iftype=ipv4Numbered MRU=1496 
           label[1]=26606 protocol=6(ISIS)
           label[2]=262139 protocol=2(BGP)
4  10.20.1.6  rtt=9.23ms rc=3(EgressRtr) rsc=2 
4  10.20.1.6  rtt=9.46ms rc=3(EgressRtr) rsc=1 

Operation on an SR-ISIS IPv4 tunnel, IPv6 tunnel, or SR-OSPF IPv4 tunnel resolved over IGP IPv4 shortcuts using RSVP-TE LSPs

When IGP shortcut is enabled in an IS-IS or an OSPF instance and the family SRv4 or SRv6 is set to resolve over RSVP-TE LSPs, a hierarchical tunnel is created whereby an SR-ISIS IPv4 tunnel, an SR-ISIS IPv6 tunnel, or an SR-OSPF tunnel resolves over the IGP IPv4 shortcuts using RSVP-TE LSPs.

The following example outputs are of the lsp-trace command for a hierarchical tunnel consisting of an SR‑ISIS IPv4 tunnel and an SR-OSPF IPv4 tunnel, resolving over an IGP IPv4 shortcut using an RSVP-TE LSP.

The topology, as shown in Example topology for SR-ISIS over RSVP-TE and SR-OSPF over RSVP-TE, is used for the following SR-ISIS over RSVP-TE and SR-OSPF over RSVP-TE example outputs.

Figure 11. Example topology for SR-ISIS over RSVP-TE and SR-OSPF over RSVP-TE
SR-ISIS over RSVP-TE
*A:Dut-F# oam lsp-trace sr-isis prefix 10.20.1.1/32 detail path-destination 127.1.1.1 igp-instance 1 
lsp-trace to 10.20.1.1/32: 0 hops min, 0 hops max, 180 byte packets
1  10.20.1.4  rtt=5.05ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.4.2 ifaddr=10.10.4.2 iftype=ipv4Numbered MRU=1500 
           label[1]=262121 protocol=4(RSVP-TE)
           label[2]=28101 protocol=6(ISIS)
2  10.20.1.2  rtt=5.56ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.1.1 ifaddr=10.10.1.1 iftype=ipv4Numbered MRU=1500 
           label[1]=262124 protocol=4(RSVP-TE)
           label[2]=28101 protocol=6(ISIS)
3  10.20.1.1  rtt=7.30ms rc=3(EgressRtr) rsc=2 
3  10.20.1.1  rtt=5.40ms rc=3(EgressRtr) rsc=1 
*A:Dut-F#
SR-OSPF over RSVP-TE
*A:Dut-F# oam lsp-trace sr-ospf prefix 10.20.1.1/32 detail path-destination 127.1.1.1 igp-instance 2 
lsp-trace to 10.20.1.1/32: 0 hops min, 0 hops max, 180 byte packets
1  10.20.1.4  rtt=3.24ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.4.2 ifaddr=10.10.4.2 iftype=ipv4Numbered MRU=1500 
           label[1]=262125 protocol=4(RSVP-TE)
           label[2]=27101 protocol=5(OSPF)
2  10.20.1.2  rtt=5.77ms rc=8(DSRtrMatchLabel) rsc=2
     DS 1: ipaddr=10.10.1.1 ifaddr=10.10.1.1 iftype=ipv4Numbered MRU=1500 
           label[1]=262124 protocol=4(RSVP-TE)
           label[2]=27101 protocol=5(OSPF)
3  10.20.1.1  rtt=7.19ms rc=3(EgressRtr) rsc=2 
3  10.20.1.1  rtt=8.41ms rc=3(EgressRtr) rsc=1 

Operation on an LDP IPv4 FEC resolved over IGP IPv4 shortcuts using SR-TE LSPs

When IGP shortcut is enabled in an IS-IS or an OSPF instance and the family IPv4 is set to resolve over SR-TE LSPs, a hierarchical tunnel is created whereby an LDP IPv4 FEC resolves over the IGP IPv4 shortcuts using SR-TE LSPs.

The following example outputs show the lsp-trace command for a hierarchical tunnel consisting of an LDP IPv4 FEC resolving over an IGP IPv4 shortcut using an SR-TE LSP.

The topology, as shown in Example topology for LDP over SR-TE (ISIS) and LDP over SR-TE (OSPF), is used for the following LDP over SR-TE (ISIS) and LDP over SR-TE (OSPF) example outputs.

Figure 12. Example topology for LDP over SR-TE (ISIS) and LDP over SR-TE (OSPF)
LDP over SR-TE (ISIS)
*A:Dut-F# oam lsp-trace prefix 10.20.1.1/32 detail path-destination 127.1.1.1  
lsp-trace to 10.20.1.1/32: 0 hops min, 0 hops max, 184 byte packets
1  10.20.1.4  rtt=2.33ms rc=8(DSRtrMatchLabel) rsc=3 
     DS 1: ipaddr=10.10.4.2 ifaddr=10.10.4.2 iftype=ipv4Numbered MRU=1492 
           label[1]=28202 protocol=6(ISIS)
           label[2]=28201 protocol=6(ISIS)
           label[3]=262138 protocol=3(LDP)
2  10.20.1.2  rtt=6.39ms rc=3(EgressRtr) rsc=3
2  10.20.1.2  rtt=7.29ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.1.1 ifaddr=10.10.1.1 iftype=ipv4Numbered MRU=1492
           label[1]=28101 protocol=6(ISIS)
           label[2]=262138 protocol=3(LDP)
3  10.20.1.1  rtt=8.34ms rc=3(EgressRtr) rsc=2 
3  10.20.1.1  rtt=9.37ms rc=3(EgressRtr) rsc=1

*A:Dut-F# oam lsp-ping prefix 10.20.1.1/32 detail 
LSP-PING 10.20.1.1/32: 80 bytes MPLS payload
Seq=1, send from intf int_to_D, reply from 10.20.1.1
     udp-data-len=32 ttl=255 rtt=8.21ms rc=3 (EgressRtr)
---- LSP 10.20.1.1/32 PING Statistics ----
1 packets sent, 1 packets received, 0.00% packet loss
round-trip min = 8.21ms, avg = 8.21ms, max = 8.21ms, stddev = 0.000ms
===============================================================================
LDP Bindings (IPv4 LSR ID 10.20.1.6)
             (IPv6 LSR ID fc00::a14:106)
===============================================================================
Label Status:
        U - Label In Use, N - Label Not In Use, W - Label Withdrawn
        WP - Label Withdraw Pending, BU - Alternate For Fast Re-Route
        e - Label ELC
FEC Flags:
        LF - Lower FEC, UF - Upper FEC, M - Community Mismatch, BA - ASBR Backup FEC
===============================================================================
LDP IPv4 Prefix Bindings
===============================================================================
Prefix                                      IngLbl                    EgrLbl
Peer                                        EgrIntf/LspId             
EgrNextHop                                                            
-------------------------------------------------------------------------------
10.20.1.1/32                                  --                      262138
10.20.1.1:0                                 LspId 655467              
10.20.1.1                                                             
                                                                       
10.20.1.1/32                                262070U                   262040
10.20.1.3:0                                   --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262070U                   --
10.20.1.4:0                                   --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262070U                   262091
10.20.1.5:0                                   --                      
  --                                                                  
                                                                       
10.20.1.1/32                                  --                      262138
fc00::a14:101[0]                              --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262070U                   262040
fc00::a14:103[0]                              --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262070U                   262091
fc00::a14:105[0]                              --                      
  --                                                                  
                                                                       
-------------------------------------------------------------------------------
No. of IPv4 Prefix Bindings: 7
===============================================================================
LDP over SR-TE (OSPF)
*A:Dut-F# oam lsp-trace prefix 10.20.1.1/32 detail path-destination 127.1.1.1 
lsp-trace to 10.20.1.1/32: 0 hops min, 0 hops max, 184 byte packets
1  10.20.1.4  rtt=2.73ms rc=8(DSRtrMatchLabel) rsc=3 
     DS 1: ipaddr=10.10.4.2 ifaddr=10.10.4.2 iftype=ipv4Numbered MRU=1492 
           label[1]=27202 protocol=5(OSPF)
           label[2]=27201 protocol=5(OSPF)
           label[3]=262143 protocol=3(LDP)
2  10.20.1.2  rtt=6.77ms rc=3(EgressRtr) rsc=3 
2  10.20.1.2  rtt=6.75ms rc=8(DSRtrMatchLabel) rsc=2 
     DS 1: ipaddr=10.10.1.1 ifaddr=10.10.1.1 iftype=ipv4Numbered MRU=1492 
           label[1]=27101 protocol=5(OSPF)
           label[2]=262143 protocol=3(LDP)
3  10.20.1.1  rtt=7.10ms rc=3(EgressRtr) rsc=2 
3  10.20.1.1  rtt=7.53ms rc=3(EgressRtr) rsc=1 
 
*A:Dut-F# oam lsp-ping prefix 10.20.1.1/32 detail 
LSP-PING 10.20.1.1/32: 80 bytes MPLS payload
Seq=1, send from intf int_to_D, reply from 10.20.1.1
       udp-data-len=32 ttl=255 rtt=8.09ms rc=3 (EgressRtr)
 
---- LSP 10.20.1.1/32 PING Statistics ----
1 packets sent, 1 packets received, 0.00% packet loss
round-trip min = 8.09ms, avg = 8.09ms, max = 8.09ms, stddev = 0.000ms
 
===============================================================================
LDP Bindings (IPv4 LSR ID 10.20.1.6)
             (IPv6 LSR ID fc00::a14:106)
===============================================================================
Label Status:
        U - Label In Use, N - Label Not In Use, W - Label Withdrawn
        WP - Label Withdraw Pending, BU - Alternate For Fast Re-Route
        e - Label ELC
FEC Flags:
        LF - Lower FEC, UF - Upper FEC, M - Community Mismatch, BA - ASBR Backup FEC
===============================================================================
LDP IPv4 Prefix Bindings
===============================================================================
Prefix                                      IngLbl                    EgrLbl
Peer                                        EgrIntf/LspId             
EgrNextHop                                                            
-------------------------------------------------------------------------------
10.20.1.1/32                                  --                      262143
10.20.1.1:0                                 LspId 655467              
10.20.1.1                                                             
                                                                       
10.20.1.1/32                                262089U                   262135
10.20.1.3:0                                   --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262089U                     --
10.20.1.4:0                                   --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262089U                   262129
10.20.1.5:0                                   --                      
  --                                                                  
                                                                       
10.20.1.1/32                                  --                      262143
fc00::a14:101[0]                              --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262089U                   262135
fc00::a14:103[0]                              --                      
  --                                                                  
                                                                       
10.20.1.1/32                                262089U                   262129
fc00::a14:105[0]                              --                      
  --                                                                  
                                                                       
-------------------------------------------------------------------------------
No. of IPv4 Prefix Bindings: 7
===============================================================================

OAM support in Segment Routing IPv6 (SRv6)

The router supports a comprehensive set of OAM tools for SRv6.

SRv6 OAM network setup shows an example configuration of segment routing using SRv6 that is discussed in this section. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for more information about the SRv6 feature.

Figure 13. SRv6 OAM network setup

As shown in SRv6 OAM network setup, the network administrator originates a ping or a traceroute probe on node R1 to test the path of an SRv6 locator of node D, an SRv6 segment identifier (SID) owned by node D, or an IP prefix resolved to an SRv6 tunnel toward node D. R1 is referred to as the sender node. Node D is referred to as the target node because it owns the target locator or SID being tested. A target node can be any router in the SRv6 network domain that owns the target locator or SID, or any router at which the OAM probe is extracted because a local route matches or because of the hop-limit field value in the packet.

The primary path to D is through R2 and R4. The link-protect TI-LFA backup path is through R3 as a PQ node and then R2 and R4.

The classic ping and traceroute OAM CLI commands are used to test an IPv4 or IPv6 prefix in a virtual routing and forwarding (VRF) table or in the base router table when resolved to an SRv6 tunnel, for example:

ping address [detail] [source ip-address]

traceroute address [detail] [source ip-address] [protocol {udp | tcp}] decode original-datagram

The same CLI commands are used to test the address of an SRv6 locator or a SID. In this case, the user enters the IPv6 address of the target locator prefix or the target SID.

The source address encoded in the outer IPv6 header of the ping or traceroute packet is selected using the following order of precedence:

  1. the user-entered source IPv6 address in the ping or traceroute command, which is checked for validity against a local interface address or a local locator prefix or SID

  2. the globally-configured IPv6 source address from the source-address application6 ping or source-address application6 traceroute commands

  3. the preferred primary IPv6 address of the system interface

  4. the IPv6 address of the outgoing interface
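The four-step selection above can be sketched as a simple precedence chain. All variable names are illustrative, and validity checking of a user-entered address against local interfaces, locators, or SIDs is reduced to a callback for the sketch.

```python
# Hedged sketch of the source-address selection order described above.
# The is_valid_local callback stands in for the check of a user-entered
# address against local interface addresses and locator prefixes/SIDs.

def select_source_address(user_addr, is_valid_local, configured_addr,
                          system_addr, egress_if_addr):
    if user_addr is not None:
        if not is_valid_local(user_addr):
            raise ValueError('source address is not a local address')
        return user_addr          # step 1: user-entered source address
    if configured_addr is not None:
        return configured_addr    # step 2: source-address application6 value
    if system_addr is not None:
        return system_addr        # step 3: system interface primary address
    return egress_if_addr         # step 4: outgoing interface address

# No user-entered or configured address: the system address is used.
addr = select_source_address(None, lambda a: True, None,
                             'fc00::a14:101', '2001:db8::1')
# addr == 'fc00::a14:101'
```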

Ping or traceroute of SRv6 remote locator or remote SID (End, End.X, End.DT4, End.DT6, End.DT46, End.DX2, End.DT2M, and End.DT2U)

The features in this section are in accordance with draft-ietf-6man-spring-srv6-oam, Operations, Administration, and Maintenance (OAM) in Segment Routing Networks with IPv6 Data plane (SRv6).

Ingress PE router (sender node) behavior

The packet is encoded with a destination address set to the remote locator prefix or the specific remote SID, and the next-header field is ICMPv6 (for the ping’s Echo Request message), UDP or TCP (for traceroute). The packet is encapsulated as shown in Packet encapsulation for ping or traceroute of a remote locator/SID. When the Topology-Independent Loop-Free Alternate (TI-LFA) or Remote LFA repair tunnel is activated, the LFA segment routing header (LFA SRH) is also pushed on the encapsulation of the SRv6 tunnel to node D.

Figure 14. Packet encapsulation for ping or traceroute of a remote locator/SID

The outer IPv6 header hop-limit field is set according to the operation of the probe. For ping, the hop limit uses the default 254 or a user-entered value.

For traceroute, the hop limit is incrementally increased using one of the following:

  • from 1 until the packet reaches the egress PE

  • from the configured minimum value to the maximum value or until the packet reaches the egress PE

The ingress PE looks up the prefix of the locator or SID in the routing table and, if a route exists, forwards the packet to the next hop. The ingress PE does not check whether the target SID or locator has been received in IS-IS or BGP.

Transit P router behavior

Ping and traceroute operate similarly to any data or OAM IPv6 packet when expiring (the value in the hop-limit field is equal to or less than 1) at a transit SRv6 node, whether this node is a SID termination or not.

The datapath at the ingress network interface where the packet is received extracts the packet to the CPM.

The CPM originates a TTL expiry ICMP reply message (Type: ‟Time Exceeded”, Code: ‟Time to Live exceeded in Transit”).

The CPM sends the reply to the SRv6 router whose address is encoded in the SA field of the outer IPv6 header in the received packet. The source address is set to the system IPv6 address, if configured, or the address of the interface used to forward the packet to the next hop.

When the transport protocol of the traceroute packet is UDP or TCP (as indicated in the traceroute command), the intermediate node copies a portion of the original datagram, up to 1232 bytes, into the reply message payload. This may include the outer IPv6 header and SRH headers. To copy the maximum information from the original datagram, the node must have the configure test-oam icmp ipv6 maximum-original-datagram option configured.

When the hop-limit field value is higher than 1, the packet is processed in the datapath in the same way as any SRv6 user data packet in the transit router. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for more information.

Egress PE router (target node) behavior

The datapath in the ingress network port in the destination router that owns the target SID extracts the packet to CPM.

A traceroute packet is extracted based on the hop-limit field value of 1 before the route lookup.

A ping packet is extracted after the route lookup matches a FIB entry of a local locator, End, or End.X SID.

The CPM checks that the target locator or SID address matches a local entry. This means that the locator or SID has either been configured manually by the user or it has been auto-allocated by the locator module for use by IS-IS or BGP.

A match on the locator requires an exact match on the locator field and that both the function and argument fields be zero.

A match on a SID requires both the locator and function fields to match. The argument field is not checked.
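These match rules can be sketched using 128-bit integers to stand in for SRv6 addresses; the field widths and helper names are illustrative, not the router's data model:

```python
def split_sid(addr, loc_bits=64, func_bits=16):
    """Split a 128-bit SRv6 address into locator, function, and argument."""
    arg_bits = 128 - loc_bits - func_bits
    locator = addr >> (func_bits + arg_bits)
    function = (addr >> arg_bits) & ((1 << func_bits) - 1)
    argument = addr & ((1 << arg_bits) - 1)
    return locator, function, argument

def matches_local_locator(addr, local_locator):
    # Locator match: exact locator, and function and argument both zero.
    locator, function, argument = split_sid(addr)
    return locator == local_locator and function == 0 and argument == 0

def matches_local_sid(addr, local_locator, local_function):
    # SID match: locator and function must match; argument is not checked.
    locator, function, _ = split_sid(addr)
    return locator == local_locator and function == local_function

LOCATOR = 0x20010DB800000001                       # 64-bit locator value
bare_locator = LOCATOR << 64                       # function=0, argument=0
end_x_sid = (LOCATOR << 64) | (0x0100 << 48) | 7   # function=0x0100, argument=7

print(matches_local_locator(bare_locator, LOCATOR))   # True
print(matches_local_locator(end_x_sid, LOCATOR))      # False
print(matches_local_sid(end_x_sid, LOCATOR, 0x0100))  # True
```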

When a match on a local locator or SID exists, the CPM replies with the following:

  • in the case of ICMPv6 ping

    CPM replies with an ICMPv6 echo reply message.

    The source address of the packet is set to the address in the DA field of the packet of the received echo request message.

  • in the case of UDP or TCP traceroute

    CPM replies with an ICMPv6 message (Type: ‟Destination unreachable”, Code: ‟Port Unreachable”).

    The source address is set to the system IPv6 address if it is configured or the address of the interface used to forward the packet to the next hop.

The target node copies a portion of the original datagram, up to 1232 bytes, of the received packet into the reply message payload. This may include the outer IPv6 header and any SRH headers. To copy the maximum information from the original datagram, the configure test-oam icmp ipv6 maximum-original-datagram option must be enabled on the node.

When the traceroute is carried over a TCP transport and the destination is not an interface address, there is no indication whether the TCP port is open or closed.

End-to-end packet encapsulation for ping or traceroute of a remote locator or SID (1) and End-to-end packet encapsulation for ping or traceroute of a remote locator or SID (2) show the packet encapsulation from ingress PE to egress PE for both the primary path and the backup path. For the backup path, both the PSP and USP types of the LFA SRH are shown.

Figure 15. End-to-end packet encapsulation for ping or traceroute of a remote locator or SID (1)
Figure 16. End-to-end packet encapsulation for ping or traceroute of a remote locator or SID (2)
Ping or traceroute of an IPv4/IPv6 VRF prefix resolved to an SRv6 shortest path or SRv6 policy tunnel

This feature implements the existing behavior of a ping or a traceroute packet, originated at the ingress PE node, for a prefix resolved to an SRv6 tunnel. If the OAM ping or traceroute packet received from the CE router expires (hop-limit field value equal to or less than 1), the ingress PE node responds according to the existing behavior. If the packet does not expire (hop-limit field value greater than 1), it is forwarded over the SRv6 tunnel as a datapath packet.

Ingress PE router (sender node) behavior

The CPM-originated ping or traceroute packet is encoded with:

  • the DA field set to the End.DT4, End.DT6, or End.DT46 SID
  • the next-header field set to IPv6 (for an IPv6 VRF prefix)
  • the outer IPv6 header hop-limit field set to the default value of 254
  • the next-header field of the inner IPv6 header set to ICMPv6 (for ping) or UDP or TCP (for traceroute)

The packet is encapsulated as shown in Packet encapsulation for ping or traceroute of a VRF IPv6 prefix over an SRv6 tunnel. The LFA SRH is also shown in the packet encapsulation of the SRv6 tunnel to node D when the TI-LFA or remote LFA repair tunnel is activated.

A similar encoding of the ping and traceroute packet is performed when testing a VRF IPv4 prefix that is resolved to an SRv6 tunnel. The difference is that the inner packet is in IPv4 packet format.

Figure 17. Packet encapsulation for ping or traceroute of a VRF IPv6 prefix over an SRv6 tunnel
Transit P router behavior

The packet is processed in the datapath, like any SRv6 user-data packet, by the transit router. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for more information.

Egress PE router behavior

At the target node, the packet is processed as follows:

  • If the DA field of the outer IPv6 header matches a service SID and the payload type is IPv4 or IPv6, the datapath removes the SRv6 headers and extracts the inner IPv4 or IPv6 packet to the CPM.
  • If the DA field of the packet matches a local locator prefix entry in the FIB and the payload type is either IPv4 or IPv6, the packet is handed to the SRv6 Forwarding Path Extension (FPE). The egress datapath of the SRv6 FPE removes the SRv6 headers and passes the inner IPv4 or IPv6 packet to the ingress datapath which performs the regular exception handling for a ping or a traceroute packet.

Ping or traceroute of an SRv6 policy

Use the ping tool to verify the connectivity of a specific static or BGP SRv6 policy and the traceroute tool to verify the path of the SRv6 policy. The objective of these tools is similar to LSP ping and LSP trace for an SR policy with an MPLS data plane; they test that the SIDs specified in the SRv6 policy segment lists are programmed and reachable through the SRv6 policy and that the endpoint of the SRv6 policy is reachable through that policy. Both UDP and TCP transport are supported for traceroute.

From an encapsulation perspective, the SRv6 encapsulation is pushed onto the ping or traceroute packet and the next-header field in the Segment Routing Header (SRH) is set to ICMPv6, UDP, or TCP (as applicable). Similar to seamless BFD (S-BFD), an additional address (the endpoint address of the SRv6 policy) is pushed as the final entry in the SRH to prevent SRH expiry in case the last SID in a segment list is a binding SID.

Use the following command to launch a ping for an SRv6 policy:

Note: You must configure at least the color and endpoint command options to launch a ping.
  • MD-CLI

    ping srv6-policy color color endpoint ipv6-address segment-list segment-list candidate-path protocol-owner bgp | static preference preference distinguisher distinguisher
  • classic CLI

    ping srv6-policy color color endpoint ipv6-address [segment-list segment-list] [candidate-path protocol-owner static | bgp [preference preference] [distinguisher distinguisher]]

Use the following command to launch a traceroute for an SRv6 policy:

Note: You must configure at least the color and endpoint command options to launch a traceroute.
  • MD-CLI

    traceroute srv6-policy color color endpoint ipv6-address segment-list segment-list candidate-path protocol-owner bgp | static preference preference distinguisher distinguisher
  • classic CLI

    traceroute srv6-policy color color endpoint ipv6-address [segment-list segment-list] [candidate-path protocol-owner static | bgp [preference preference] [distinguisher distinguisher]]

The router uses the tuple of color, endpoint, and optionally the segment list and candidate path, to match the SRv6 policy candidate path on which to send the ping or traceroute packet. If the candidate path is not specified, the matching active candidate path across all static and BGP SRv6 policies is selected. If the segment list ID is not specified, the router sends the ping or traceroute probe on the first available segment list that can be used for forwarding; that is, the lowest segment list that is up, meaning its top SID resolves or S-BFD is up on it. The test fails if the router does not find a matching programmed candidate path or combination of candidate path and segment list.
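The selection rules in the preceding paragraph can be sketched as follows; the data model (dicts keyed by color and endpoint) and the function name are purely illustrative:

```python
def select_probe_path(policies, color, endpoint, seg_list_id=None, candidate=None):
    """Return (candidate_path, segment_list_id) for the probe, or None.

    `policies` maps (color, endpoint) to a list of candidate paths; each
    candidate path is a dict with 'owner', 'preference', 'distinguisher',
    'active', and 'segment_lists' (id -> 'up' or 'down').
    """
    paths = policies.get((color, endpoint), [])
    if candidate is not None:
        # Match the explicitly requested candidate path.
        paths = [p for p in paths
                 if (p["owner"], p["preference"], p["distinguisher"]) == candidate]
    else:
        # Otherwise consider the active candidate path(s).
        paths = [p for p in paths if p["active"]]
    for path in paths:
        if seg_list_id is not None:
            if path["segment_lists"].get(seg_list_id) == "up":
                return path, seg_list_id
        else:
            # Lowest segment list that is up (top SID resolves / S-BFD up).
            for sl_id in sorted(path["segment_lists"]):
                if path["segment_lists"][sl_id] == "up":
                    return path, sl_id
    return None  # no matching programmed path or segment list: the test fails

policies = {("blue", "2001:db8::1"): [
    {"owner": "bgp", "preference": 100, "distinguisher": 1,
     "active": True, "segment_lists": {1: "down", 2: "up"}},
]}
print(select_probe_path(policies, "blue", "2001:db8::1")[1])  # 2
```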

LDP tree trace: end-to-end testing of paths in an LDP ECMP network

Network resilience using LDP ECMP shows an IP/MPLS network which uses LDP ECMP for network resilience. Faults that are detected through IGP or LDP are corrected as soon as IGP and LDP reconverge. The impacted traffic is forwarded on the next available ECMP path as determined by the hash routine at the node that had a link failure.

Figure 18. Network resilience using LDP ECMP

However, there are faults which the IGP/LDP control planes may not detect. These faults may result from a corruption of the control plane state or of the data plane state in a node. Although these faults are very rare and mostly caused by misconfiguration, the LDP tree trace OAM feature is intended to detect these ‟silent” data plane and control plane faults. For example, it is possible that the forwarding plane of a node has a corrupt Next Hop Label Forwarding Entry (NHLFE) and keeps forwarding packets over an ECMP path only to have the downstream node discard them. This data plane fault can only be detected by an OAM tool that can test all possible end-to-end paths between the ingress LER and the egress LER. A corruption of the NHLFE entry can also result from a corruption in the control plane at that node.

LDP ECMP tree building

When the LDP tree trace feature is enabled, the ingress LER builds the ECMP tree for a specific FEC (egress LER) by sending LSP trace messages and including the LDP IPv4 Prefix FEC TLV as well as the downstream mapping TLV. To build the ECMP tree, the router LER inserts an IP address range drawn from the 127/8 space. When received by the downstream LSR, it uses this range to determine which ECMP path is exercised by any IP address or a sub-range of addresses within that range based on its internal hash routine. When the MPLS echo reply is received by the router LER, it records this information and proceeds with the next Echo Request message targeted for a node downstream of the first LSR node along one of the ECMP paths. The sub-range of IP addresses indicated in the initial reply are used because the objective is to have the LSR downstream of the router LER pass this message to its downstream node along the first ECMP path.

The following example, adapted from RFC 8029, illustrates the behavior.

    PE1 ---- A ----- B ----- C ------ G ----- H ---- PE2
              \       \---- D ------/        /
               \       \--- E ------/       /
                \-- F ---------------------/

LSR A has two downstream LSRs, B and F, for the PE2 FEC. PE1 receives an echo reply from A with the Multipath Type set to 4, with low/high IP addresses of 127.1.1.1->127.1.1.255 for downstream LSR B and 127.2.1.1->127.2.1.255 for downstream LSR F. PE1 reflects this information to LSR B. B, which has three downstream LSRs, C, D, and E, computes that 127.1.1.1->127.1.1.127 would go to C and 127.1.1.128->127.1.1.255 would go to D. B would then respond with three Downstream Mappings: to C, with Multipath Type 4 (127.1.1.1->127.1.1.127); to D, with Multipath Type 4 (127.1.1.128->127.1.1.255); and to E, with Multipath Type 0.
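The range-splitting step performed at each LSR can be sketched as follows. The hash function passed in is a stand-in for the node's internal hash routine; only the partitioning mechanics mirror the example above:

```python
import ipaddress

def partition_range(first, last, next_hops, hash_fn):
    """Partition an assigned 127/8 address range among downstream next hops.

    Each address in [first, last] is run through the ECMP hash to pick a
    next hop; contiguous runs are reported as (next_hop, low, high).
    """
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    runs = []
    for addr in range(start, end + 1):
        nh = next_hops[hash_fn(addr) % len(next_hops)]
        if runs and runs[-1][0] == nh and runs[-1][2] == addr - 1:
            runs[-1][2] = addr  # extend the current contiguous run
        else:
            runs.append([nh, addr, addr])
    return [(nh, str(ipaddress.IPv4Address(lo)), str(ipaddress.IPv4Address(hi)))
            for nh, lo, hi in runs]

# With a hash that splits on the last octet, LSR B's computation from the
# example above is reproduced:
result = partition_range("127.1.1.1", "127.1.1.255", ["C", "D"],
                         lambda a: 0 if (a & 0xFF) <= 127 else 1)
print(result)
# [('C', '127.1.1.1', '127.1.1.127'), ('D', '127.1.1.128', '127.1.1.255')]
```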

The router supports multipath types 0 and 8, with a maximum multipath length of 36 bytes, and supports the LER part of the LDP ECMP tree building feature.

A user-configurable parameter sets the frequency of running the tree trace capability. The minimum and default value is 60 minutes, and the increment is 1 hour.

The router LER gets the list of FECs from the LDP FEC database. New FECs are added to the discovery list at the next tree trace and not when they are learned and added into the FEC database. The maximum number of FECs to be discovered with the tree building feature is limited to 500. The user can exclude FECs from discovery through the use of a policy profile.

Periodic path exercising

The periodic path exercising capability of the LDP tree trace feature runs in the background to test the LDP ECMP paths discovered by the tree building capability. The probe used is an LSP ping message with an IP address drawn from the sub-range of 127/8 addresses indicated by the output of the tree trace for this FEC.

The periodic LSP ping messages continuously probe an ECMP path at a user-configurable rate of at least one message per minute; this is the minimum and default value, and the increment is 1 minute. If an interface is down on a router LER, LSP ping probes that normally go out of this interface are not sent.

The LSP ping routine updates the content of the MPLS echo request message, specifically the IP address, as soon as the LDP ECMP tree trace has output the results of a new computation for the path in question.

Tunneling of ICMP reply packets over MPLS LSP

This feature enables the tunneling of ICMP reply packets over MPLS LSP at an LSR node as defined in RFC 3032. At an LSR node, including an ABR, ASBR, or datapath Router Reflector (RR) node, the user enables the ICMP tunneling feature globally on the system using the config>router>icmp-tunneling command.

This feature supports tunneling ICMP replies to a UDP traceroute message. It does not support tunneling replies to an ICMP ping message. The LSR part of this feature consists of crafting the ICMP reply packet of type=11, 'time exceeded', with a source address set to a local address of the LSR node, and appending the IP header and leading payload octets of the original datagram. The system skips the lookup of the source address of the sender of the label TTL expiry packet, which becomes the destination address of the ICMP reply packet. Instead, the CPM injects the ICMP reply packet in the forward direction of the MPLS LSP on which the label TTL expiry packet was received. The TTL of pushed labels should be set to 255.

The source address of the ICMP reply packet is determined as follows:

  • The LSR uses the address of the outgoing interface for the MPLS LSP.

    Note:

    With LDP LSP or BGP LSP, multiple ECMP next-hops can exist in which case the first outgoing interface is selected.

  • If the interface does not have an address of the same family (IPv4 or IPv6) as the ICMP packet, the system address of the same family is selected. If one is not configured, the packet is dropped.
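A minimal sketch of this source-address selection; the dict-based inputs and function name are illustrative:

```python
def icmp_reply_source(outgoing_if_addrs, system_addrs, family):
    """Pick the source address for the tunneled ICMP reply.

    Prefer an address of the packet's family on the LSP's (first)
    outgoing interface; otherwise fall back to the system address of
    that family; if neither exists, the reply is dropped (None).
    """
    return outgoing_if_addrs.get(family) or system_addrs.get(family)

print(icmp_reply_source({"ipv4": "192.0.2.1"}, {"ipv4": "203.0.113.9"}, "ipv4"))
# 192.0.2.1
print(icmp_reply_source({}, {"ipv4": "203.0.113.9"}, "ipv4"))  # 203.0.113.9
print(icmp_reply_source({}, {}, "ipv6"))  # None -> packet dropped
```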

When the packet is received by the egress LER, it performs a regular user packet lookup in the datapath in the GRT context for BGP shortcut, 6PE, and BGP labeled route prefixes, or in VPRN context for VPRN and 6VPE prefixes. It then forwards it to the destination, which is the sender of the original packet which TTL expired at the LSR.

If the egress LER does not have a route to the destination of the ICMP packet, it drops the packets.

The rate of the tunneled ICMP replies at the LSR can be directly or indirectly controlled by the existing IOM-level and CPM-level mechanisms. Specifically, the rate of the incoming UDP traceroute packets received with a label stack can be controlled at the ingress IOM using the distributed CPU protection feature. The rate of the ICMP replies by the CPM can also be directly controlled by configuring a system-wide rate limit for ICMP replies to MPLS TTL-expired packets that are successfully forwarded to the CPM, using the configure system security vprn-network-exceptions command.
Note:

While this command's name refers to VPRN service, this feature rate limits ICMP replies for packets received with any label stack, including VPRN and shortcuts.

The 7450 ESS, 7750 SR, and 7950 XRS router implementation supports appending the MPLS label stack object defined in RFC 4950 to the ICMP reply of type Time Exceeded. It does not include it in the ICMP reply of type Destination Unreachable.

The MPLS Label Stack object allows an LSR to include label stack information including label value, EXP, and TTL field values, from the encapsulation header of the packet that expired at the LSR node. The ICMP message continues to include the IP header and leading payload octets of the original datagram.

To include the MPLS Label Stack object, the SR OS implementation adds support of RFC 4884, Extended ICMP to Support Multi-Part Messages, which defines extensions for a multi-part ICMPv4/v6 message of type Time Exceeded. Section 5 of RFC 4884 defines backward compatibility of the new ICMP message with extension header with prior standard and proprietary extension headers.

To guarantee interoperability with third-party implementations deployed in customer networks, the router implementation is able to parse in the receive side all possible encapsulations formats as defined in Section 5 of RFC 4884. Specifically:

  • If the length attribute is zero, it is treated as a compliant message and the router implementation processes the original datagram field of size equal to 128 bytes and with no extension header.

  • If the length attribute is not included, it is treated as a non-compliant message and the router implementation processes the original datagram field of size equal to 128 bytes and also looks for a valid extension header following the 128-byte original datagram field. If the extension is valid, it is processed accordingly; if not, the remainder of the packet is assumed to be part of the original datagram field and is processed accordingly.

    Note:

    The router implementation only validates the ICMP extension version number and not the checksum field in the extension header. The checksum of the main time exceeded message is also not validated as per prior implementation.

  • An ICMP reply message is dropped if it includes more than one MPLS label object. In general, when a packet is dropped because of an error in the packet header or structure, the traceroute times out and an error message is not displayed.

  • When processing the received ICMP reply packet, an unsupported extension header is skipped.
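The receive-side handling of the compliant (length attribute of zero) and non-compliant (length attribute absent) cases above can be sketched as follows; the field sizes follow the text, while the function shape and the reduced version-number check are a simplification:

```python
ICMP_EXT_VERSION = 2  # version number carried in the extension header

def parse_time_exceeded_body(payload, length_attr):
    """Split a received Time Exceeded body into (original_datagram, extension).

    `length_attr` is the length attribute from the message, or None when
    the sender did not include one (non-compliant message). Only the two
    cases described above are modeled.
    """
    if length_attr == 0:
        # Compliant message: 128-byte original datagram, no extension header.
        return payload[:128], b""
    if length_attr is None:
        # Non-compliant: look for a valid extension after the 128-byte field.
        original, rest = payload[:128], payload[128:]
        if len(rest) >= 4 and (rest[0] >> 4) == ICMP_EXT_VERSION:
            return original, rest
        # No valid extension: the remainder is still the original datagram.
        return payload, b""
    raise ValueError("case not modeled in this sketch")

body = b"A" * 128 + bytes([ICMP_EXT_VERSION << 4, 0, 0, 0])
original, extension = parse_time_exceeded_body(body, None)
print(len(original), len(extension))  # 128 4
```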

On the transmit side, when the MPLS Label Stack object is added as an extension to the ICMP reply message, it is appended to the message immediately following the "original datagram" field taken from the payload of the received traceroute packet. The size of the appended "original datagram" field contains exactly 128 octets. If the original datagram did not contain 128 octets, the "original datagram" field is zero padded to 128 octets.
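The fixed 128-octet original datagram field can be sketched as (the function name is hypothetical; both arguments are raw bytes):

```python
def build_reply_extension_body(original_datagram, mpls_label_stack_object):
    """Assemble the reply body: the 'original datagram' field is exactly
    128 octets (truncated or zero-padded), immediately followed by the
    MPLS Label Stack object extension."""
    field = original_datagram[:128].ljust(128, b"\x00")
    return field + mpls_label_stack_object

body = build_reply_extension_body(b"\x45" * 40, b"EXT")
print(len(body), body[40:44], body[-3:])  # 131 b'\x00\x00\x00\x00' b'EXT'
```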

For example output of the traceroute OAM tool when the ICMP tunneling feature is enabled, see Traceroute with ICMP tunneling in common applications.

QoS handling of tunneled ICMP reply packets

When the ICMP reply packet is generated in CPM, its FC is set by default to NC1 with the corresponding default ToS byte value of 0xC0. The DSCP value can be changed by configuring a different value for an ICMP application under the config>router>sgt-qos icmp context.

When the packet is forwarded to the outgoing interface, the packet is queued in the egress network queue corresponding to its CPM assigned FC and profile parameter values. The marking of the packet's EXP is dictated by the {FC, profile}-to-EXP mapping in the network QoS policy configured on the outgoing network interface. The ToS byte, and DSCP value for that matter, assigned by CPM are not modified by the IOM.

Summary of UDP traceroute behavior with and without ICMP tunneling

At a high level, the major difference in the behavior of the UDP traceroute when ICMP tunneling is enabled at an LSR node is that the LSR node tunnels the ICMP reply packet toward the egress of the LSP without looking up the traceroute sender's address. When ICMP tunneling is disabled, the LSR looks it up and replies if the sender is reachable. However, there are additional differences between the two behaviors, summarized as follows:

  • icmp-tunneling disabled/IPv4 LSP/IPv4 traceroute

    Ingress LER, egress LER, and LSR attempt to reply to the UDP traceroute of both IPv4 and VPN-IPv4 routes.

    For VPN-IPv4 routes, the LSR attempts to reply, but it may not find a route, in which case the sender node times out. In addition, the ingress and egress ASBR nodes in VPRN inter-AS option B do not respond, as in the current implementation, and the sender times out.

  • icmp-tunneling disabled/IPv4 LSP/IPv6 traceroute

    Ingress LER and egress LER reply to traceroute of both IPv6 and VPN-IPv6 routes. LSR does not reply.

  • icmp-tunneling enabled/IPv4 LSP/IPv4 traceroute

    Ingress LER and egress LER reply directly to the UDP traceroute of both IPv4 and VPN-IPv4 routes. LSR tunnels the reply to the endpoint of the LSP to be forwarded from there to the source of the traceroute.

    For VPN-IPv4 routes, the ingress and egress ASBR nodes in VPRN inter-AS option B also tunnel the reply to the endpoint of the LSP; and therefore, there is no timeout at the sender node like in the case when icmp-tunneling is disabled.

  • icmp-tunneling enabled/IPv4 LSP/IPv6 traceroute

    Ingress LER and egress LER reply directly to the UDP traceroute of both IPv6 and VPN-IPv6 routes. LSR tunnels the reply to the endpoint of the LSP to be forwarded from there to the source of the traceroute.

    For VPN-IPv6 routes, the ingress and egress ASBR nodes in VPRN inter-AS option B also tunnel the reply to the endpoint of the LSP like in the case when icmp-tunneling is disabled.

In the presence of ECMP, CPM-generated UDP traceroute packets are not sprayed over multiple ECMP next hops. The first outgoing interface is selected. In addition, an LSR ICMP reply to a UDP traceroute is also forwarded over the first outgoing interface, regardless of whether ICMP tunneling is enabled. When ICMP tunneling is enabled, the packet is tunneled over the first downstream interface for the LSP when multiple next hops exist (LDP FEC or BGP labeled route). In all cases, the ICMP reply packet uses the outgoing interface address as the source address of the reply packet.

SDP diagnostics

The router SDP diagnostics are SDP ping and SDP MTU path discovery.

SDP ping

SDP ping performs in-band unidirectional or round-trip connectivity tests on SDPs. The SDP ping OAM packets are sent in-band, in the tunnel encapsulation, so they follow the same path as traffic within the service. The SDP ping response can be received out-of-band in the control plane, or in-band using the data plane for a round-trip test.

For a unidirectional test, SDP ping tests:

  • egress SDP ID encapsulation

  • ability to reach the far-end IP address of the SDP ID within the SDP encapsulation

  • path MTU to the far-end IP address over the SDP ID

  • forwarding class mapping between the near-end SDP ID encapsulation and the far-end tunnel termination

For a round-trip test, SDP ping uses a local egress SDP ID and an expected remote SDP ID. Because SDPs are unidirectional tunnels, the remote SDP ID must be specified and must exist as a configured SDP ID on the far-end router. SDP round-trip testing is an extension of SDP connectivity testing with the additional ability to test:

  • remote SDP ID encapsulation

  • potential service round trip time

  • round trip path MTU

  • round trip forwarding class mapping

SDP MTU path discovery

In a large network, network devices can support a variety of packet sizes transmitted across their interfaces. This capability is referred to as the Maximum Transmission Unit (MTU) of network interfaces. It is important to understand the MTU of the entire end-to-end path when provisioning services, especially for virtual leased line (VLL) services, where the service must support the ability to transmit the largest customer packet.

The path MTU discovery tool enables the service provider to determine the exact MTU supported by the network's physical links between the service ingress and service termination points, accurate to one byte.

Service diagnostics

The service diagnostics include the following:
  • service ping
  • IGMP snooping
  • various VPLS MAC diagnostic tools
  • various VLL diagnostic tools

Service ping

Nokia’s Service ping feature provides end-to-end connectivity testing for an individual service. Service ping operates at a higher level than the SDP diagnostics by verifying an individual service and not the collection of services carried within an SDP.

Service ping is initiated from a router to verify round-trip connectivity and delay to the far-end of the service. Nokia’s implementation functions for both GRE and MPLS tunnels and tests the following from edge-to-edge:

  • tunnel connectivity

  • VC label mapping verification

  • service existence

  • service provisioned parameter verification

  • round trip path verification

  • service dynamic configuration verification

IGMP snooping diagnostics

MFIB ping

The multicast forwarding information base (MFIB) ping OAM tool allows the user to easily verify inside a VPLS which SAPs would normally egress a specific multicast stream. The multicast stream is identified by a source unicast and destination multicast IP address, which are mandatory when issuing an MFIB ping command.

An MFIB ping packet is sent through the data plane and goes out with the data plane format containing a configurable VC label TTL. This packet traverses each hop using forwarding plane information for next hop, VC label, and so on. The VC label is swapped at each service-aware hop, and the VC TTL is decremented. If the VC TTL is decremented to 0, the packet is passed up to the management plane for processing. If the packet reaches an egress node, and would be forwarded out a customer facing port (SAP), it is identified by the OAM label below the VC label and passed to the management plane.

VPLS MAC diagnostics

While the LSP ping, SDP ping, and service ping tools enable transport tunnel testing and verify whether the correct transport tunnel is used, they do not provide the means to test the learning and forwarding functions on a per-VPLS-service basis.

It is conceivable that, while tunnels are operational and correctly bound to a service, an incorrect Forwarding Database (FDB) table for a service could cause connectivity issues in the service and not be detected by the ping tools. Nokia has developed VPLS OAM functionality to specifically test all the critical functions on a per-service basis. These tools are based primarily on the IETF document draft-stokes-vkompella-ppvpn-hvpls-oam-xx.txt, Testing Hierarchical Virtual Private LAN Services.

The VPLS OAM tools are:

  • MAC ping provides an end-to-end test to identify the egress customer-facing port where a customer MAC was learned. MAC ping can also be used with a broadcast MAC address to identify all egress points of a service for the specified broadcast MAC.

  • MAC trace provides the ability to trace a specified MAC address hop-by-hop until the last node in the service domain. An SAA test with MAC trace is considered successful when there is a reply from a far-end node indicating that it has the destination MAC address on an egress SAP or the CPM.

  • CPE ping provides the ability to check network connectivity to the specified client device within the VPLS. CPE ping returns the MAC address of the client, as well as the SAP and PE at which it was learned.

  • MAC populate allows specified MAC addresses to be injected in the VPLS service domain. This triggers learning of the injected MAC address by all participating nodes in the service. This tool is generally followed by MAC ping or MAC trace to verify if correct learning occurred.

  • MAC purge allows MAC addresses to be flushed from all nodes in a service domain.

MAC ping

For a MAC ping test, the destination MAC address (unicast or multicast) to be tested must be specified. A MAC ping packet is sent through the data plane. The ping packet goes out with the data plane format.

In the data plane, a MAC ping is sent with a VC label TTL of 255. This packet traverses each hop using forwarding plane information for next hop, VC label, and so on. The VC label is swapped at each service-aware hop, and the VC TTL is decremented. If the VC TTL is decremented to 0, the packet is passed up to the management plane for processing. If the packet reaches an egress node, and would be forwarded out a customer facing port, it is identified by the OAM label below the VC label and passed to the management plane.
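The per-hop handling can be sketched as follows; this is a simplified model of the behavior above, and the function and parameter names are hypothetical:

```python
def process_oam_packet_at_hop(vc_ttl, is_egress_node):
    """Model what a service-aware hop does with an in-band OAM packet.

    The VC label is swapped and its TTL decremented; a packet whose VC
    TTL reaches 0 is handed to the management plane, and at an egress
    node the OAM label beneath the VC label diverts the packet to the
    management plane instead of a customer-facing port.
    """
    vc_ttl -= 1  # VC TTL decremented at each service-aware hop
    if vc_ttl == 0 or is_egress_node:
        return vc_ttl, "management plane"
    return vc_ttl, "forward"

print(process_oam_packet_at_hop(255, False))  # (254, 'forward')
print(process_oam_packet_at_hop(1, False))    # (0, 'management plane')
print(process_oam_packet_at_hop(10, True))    # (9, 'management plane')
```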

MAC pings are flooded when they are unknown at an intermediate node. They are responded to only by the egress nodes that have mappings for that MAC address.

VXLAN ping supporting EVPN for VXLAN

EVPN is an IETF technology, defined in RFC 7432, that uses a new BGP address family and allows VPLS services to be operated as IP-VPNs, where the MAC addresses and the information to set up the flooding trees are distributed by BGP. EVPN VXLAN connections between VXLAN Tunnel Endpoints (VTEPs) use a connection-specific OAM protocol for on-demand connectivity verification. This connection-specific OAM tool, VXLAN ping, is described in the 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 2 Services and EVPN Guide, within the VXLAN section.

MAC trace

A MAC trace functions like an LSP trace with some variations. Operations in a MAC trace are triggered when the VC TTL is decremented to 0.

Like a MAC ping, a MAC trace is sent using the data plane.

When a traceroute request is sent via the data plane, the data plane format is used. The reply can be via the data plane or the control plane.

A data plane MAC traceroute request includes the tunnel encapsulation, the VC label, and the OAM label, followed by an Ethernet DLC header, a UDP header, and an IP header. If the mapping for the MAC address is known at the sender, the data plane request is sent down the known SDP with the appropriate tunnel encapsulation and VC label. If the mapping is not known, it is sent down every SDP (with the appropriate tunnel encapsulation per SDP and appropriate egress VC label per SDP binding).

The tunnel encapsulation TTL is set to 255. The VC label TTL is initially set to the min-ttl (default is 1). The OAM label TTL is set to 2. The destination IP address is the all-routers multicast address. The source IP address is the system IP of the sender.

The destination UDP port is the LSP ping port. The source UDP port is whatever the system provides (this source UDP port is the demultiplexer that identifies the particular instance that sent the request, when correlating the reply).

The Reply Mode is either 3 (that is, reply using the control plane) or 4 (that is, reply through the data plane), depending on the reply-control option. By default, the data plane request is sent with Reply Mode 3 (control plane reply).

The Ethernet DLC header source MAC address is set to either the system MAC address (if no source MAC is specified) or to the specified source MAC. The destination MAC address is set to the specified destination MAC. The EtherType is set to IP.
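For reference, the defaults above can be collected in one place; the MAC address values and the mapping of the reply-control option onto a boolean parameter are illustrative placeholders:

```python
def mac_trace_request_fields(min_ttl=1, src_mac=None, dst_mac="aa:bb:cc:dd:ee:ff",
                             system_mac="02:00:00:00:00:01",
                             reply_via_data_plane=False):
    """Collect the default header fields of a data plane MAC trace request.

    Values mirror the text above: tunnel TTL 255, VC TTL starting at
    min-ttl (default 1), OAM label TTL 2, all-routers multicast
    destination IP, and reply mode 3 (control plane, the default) or 4
    (data plane). The MAC addresses are placeholders.
    """
    return {
        "tunnel_ttl": 255,
        "vc_ttl": min_ttl,
        "oam_ttl": 2,
        "dst_ip": "224.0.0.2",  # IPv4 all-routers multicast address
        "reply_mode": 4 if reply_via_data_plane else 3,
        "src_mac": src_mac or system_mac,  # system MAC if none specified
        "dst_mac": dst_mac,
        "ethertype": "IP",
    }

fields = mac_trace_request_fields()
print(fields["tunnel_ttl"], fields["vc_ttl"], fields["oam_ttl"],
      fields["reply_mode"])  # 255 1 2 3
```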

CPE ping

The Nokia-specific CPE ping function provides a common approach to determine whether a destination IPv4 address can be resolved to a MAC address beyond the Layer 2 PE, in the direction of the CPE. The function is supported for both VPLS and Epipe services and on a number of different connection types. The service type determines the packet format for network connection transmissions. The packet transmitted from a PE egressing an access connection is a standard ARP packet. This allows next-hop resolution even for unmanaged service elements. In many cases, responses to ICMP echo requests are restricted to trusted network segments only; however, ARP packets are typically processed.

If the ARP response is processed on a local SAP connection on the same node from which the command was executed, the detailed SAP information is returned as part of the display function. If the response is not local, the format of the display depends on the service type.

The VPLS service construct is multipoint by nature, and simply returning a positive response to a reachability request would not supply enough information. For this reason, VPLS service CPE ping requests use the Nokia-specific MAC ping packet format. Execution of the CPE ping command generates a MAC ping packet using a broadcast Layer 2 address on all non-access ports. This packet allows for more information about the location of the target. A positive result displays the IP address of the Layer 2 PE and SAP information for the target location.

Each PE, including the local PE, that receives a MAC ping proxies an ARP request on behalf of the original source, as part of the CPE ping function. If a response is received for the ARP request, the Layer 2 PE processes the request, translates the ARP response, and responds back to the initial source with the appropriate MAC ping response and fields.

The MAC ping OAM tool makes it possible to detect whether a particular IPv4 address and MAC address have been learned in a VPLS, and on which SAP the target was found.

The Epipe service construction is that of cross-connection, and returning a positive response to a reachability request is an acceptable approach. For this reason, Epipe service CPE ping requests use standard ARP requests and proxy ARP processing. A positive result displays remote-SAP for any non-local responses. Because Epipe services are point-to-point, the path toward the remote SAP for the service should already be understood.

Nokia recommends that a source IP address of all zeros (0.0.0.0) be used, which prevents the exposure of the provider IP address to the CPE.
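For illustration, the ARP request payload that egresses the access connection can be sketched as follows. The builder function is an assumption for demonstration only; it follows the standard ARP layout and uses the recommended all-zeros sender IP:

```python
import struct

def build_cpe_ping_arp(sender_mac: bytes, target_ip: str) -> bytes:
    # ARP request payload as emitted toward the CPE; the all-zeros sender
    # IP (0.0.0.0) keeps the provider address hidden, as recommended above.
    htype, ptype, hlen, plen, oper = 1, 0x0800, 6, 4, 1  # Ethernet/IPv4, request
    sender_ip = bytes(4)                                 # 0.0.0.0
    target_mac = bytes(6)                                # unknown, all zeros
    target_ip_b = bytes(int(o) for o in target_ip.split("."))
    return (struct.pack("!HHBBH", htype, ptype, hlen, plen, oper)
            + sender_mac + sender_ip + target_mac + target_ip_b)
```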

The CPE ping function requires symmetrical datapaths for correct functionality. Issues may arise when the request egresses a PE and the response arrives on a related but different PE. When dealing with asymmetrical paths, the return-control option may be used to bypass some of the asymmetrical path issues. Asymmetrical paths can be common in all active multihoming solutions.

For all applications except basic VPLS services (SAP and SDP bindings without a PBB context), CPE ping functionality requires minimum FP2-based hardware for all connections that may be involved in the transmission or processing of the proxy function.

This approach should only be considered for unmanaged solutions where standard Ethernet CFM (ETH-CFM) functions cannot be deployed. ETH-CFM has a robust set of fault and performance functions that are purpose-built for Ethernet services and transport.

Connection types used to support VPLS and Epipes include:

  • SAPs
  • SDP bindings
  • B-VPLS
  • BGP-AD
  • BGP-VPWS
  • BGP-VPLS
  • MPLS-EVPN
CPE ping for PBB Epipe

The CPE ping command can also be used for local, distributed, and PBB Epipe services provisioned over a PBB VPLS. CPE ping for Epipe implements an alternative behavior to CPE ping for VPLS that enables fate sharing of the CPE ping request with the Epipe service. Any PE within the Epipe service (the source PE) can launch the CPE ping. The source PE builds an ARP request and encapsulates it to be sent in the Epipe as if it came from a customer device by using its chassis MAC as the source MAC address. The ARP request then egresses the remote PE device as any other packets on the Epipe. The remote CPE device responds to the ARP and the reply is transparently sent on the Epipe toward the source PE. The source PE then looks for a match on its chassis MAC in the inner customer DA. If a match is found, the source PE device intercepts this response packet.

This method is supported regardless of whether the network uses SDPs or SAPs. It is configured using the existing oam>cpe-ping CLI command.

Note: This feature does not support IPv6 CPEs.
MAC populate

MAC populate is used to send a message through the flooding domain to learn a MAC address as if a customer packet with that source MAC address had flooded the domain from that ingress point in the service. This allows the provider to craft a learning history and engineer packets in a particular way to test forwarding plane correctness.

The MAC populate request is sent with a VC TTL of 1, which means that it is received at the forwarding plane at the first hop and passed directly up to the management plane. The packet is then responded to by populating the MAC address in the forwarding plane, similar to a conventional learn, although the MAC is an OAM-type MAC in the FDB to distinguish it from customer MAC addresses.

This packet is then taken by the control plane and flooded out the flooding domain (squelching, appropriately, the sender and other paths that would be squelched in a typical flood).

This controlled population of the FDB is very important to manage the expected results of an OAM test. The same functions are available by sending the OAM packet as a UDP/IP OAM packet. It is then forwarded to each hop and the management plane has to do the flooding.

An option for MAC populate is to force the MAC in the table to type OAM (in case it already existed as a dynamic or static entry, or as an OAM-induced entry with some other binding). This prevents new dynamic learning from overwriting the existing OAM MAC entry and allows customer packets with this MAC to either ingress or egress the network while still using the OAM MAC entry.

Finally, an option to flood the MAC populate request causes each upstream node to learn the MAC, populate the local FDB with an OAM MAC entry, and to flood the request along the data plane using the flooding domain.

An age can be provided to age a particular OAM MAC after a different interval than other MACs in an FDB.
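The FDB behavior described above can be modeled with a minimal sketch. This is a toy model under assumed names, not the actual FDB implementation:

```python
# Toy model of MAC populate semantics: an OAM-type entry survives
# conventional dynamic learning (class and method names are assumptions).
class Fdb:
    def __init__(self):
        self.entries = {}  # mac -> {"type": ..., "age": ...}

    def learn_dynamic(self, mac):
        # A conventional learn; it must not overwrite an existing OAM entry.
        if self.entries.get(mac, {}).get("type") != "oam":
            self.entries[mac] = {"type": "dynamic", "age": None}

    def mac_populate(self, mac, force=False, age=None):
        # Install an OAM-type MAC; 'force' converts an existing dynamic,
        # static, or OAM-induced entry; 'age' gives the OAM MAC its own
        # aging interval, distinct from other MACs in the FDB.
        if force or mac not in self.entries:
            self.entries[mac] = {"type": "oam", "age": age}
```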

MAC purge

MAC purge is used to clear the FDBs of any learned information for a particular MAC address. This allows an OAM test to be performed in a controlled manner, without learning induced by customer packets. In addition to clearing the FDB of a particular MAC address, the purge can also indicate to the control plane not to allow further learning from customer packets. This allows the FDB to remain clean and to be populated only by a MAC populate.

MAC purge follows the same flooding mechanism as the MAC populate.
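A similarly minimal sketch of the purge semantics, under assumed names:

```python
# Toy model of MAC purge: clear the learned entry and, optionally,
# block further learning from customer packets for that MAC.
class PurgeFdb:
    def __init__(self):
        self.entries = {}
        self.blocked = set()   # MACs for which learning is disallowed

    def learn_dynamic(self, mac):
        # Learning from customer packets is suppressed for blocked MACs.
        if mac not in self.blocked:
            self.entries[mac] = "dynamic"

    def mac_purge(self, mac, block_learning=False):
        self.entries.pop(mac, None)
        if block_learning:
            self.blocked.add(mac)
```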

VLL diagnostics

The VLL diagnostics include the following:

  • VCCV ping
  • VCCV trace
VCCV ping

VCCV ping is used to check connectivity of a VLL in-band. It checks that the destination (target) PE is the egress for the Layer 2 FEC. It provides a cross-check between the data plane and the control plane. It is in-band, meaning that the VCCV ping message is sent using the same encapsulation and along the same path as user packets in that VLL. This is equivalent to the LSP ping for a VLL service. VCCV ping reuses an LSP ping message format and can be used to test a VLL configured over an MPLS or a GRE SDP.

VCCV-ping application

VCCV effectively creates an IP control channel within the pseudowire between PE1 and PE2. PE2 should be able to distinguish, on the receive side, VCCV control messages from user packets on that VLL. There are three possible methods of encapsulating a VCCV message in a VLL, which translate into three types of control channels:

  • Use of a Router Alert Label immediately above the VC label. This method has the drawback that if ECMP is applied to the outer LSP label (for example, transport label), the VCCV message does not follow the same path as the user packets. This effectively means it does not troubleshoot the appropriate path. This method is supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.

  • Use of the OAM control word as shown as follows.

                                                               
          0                   1                   2                   3
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |0 0 0 1| FmtID |   Reserved    |         Channel Type          |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    

    The first nibble is set to 0x1. The Format ID and the reserved fields are set to 0 and the channel type is the code point associated with the VCCV IP control channel as specified in the PWE3 IANA registry (RFC 4446). The channel type value of 0x21 indicates that the Associated Channel carries an IPv4 packet.

    The use of the OAM control word assumes that the draft-martini control word is also used on the user packets. This means that if the control word is optional for a VLL and is not configured, the PE node only advertises the router alert label as the CC capability in the Label Mapping message. This method is supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.

  • Set the TTL in the VC label to 1 to force the PE2 control plane to process the VCCV message. This method is not guaranteed to work under all circumstances. For instance, the draft mentions that some implementations of penultimate hop popping overwrite the TTL field. This method is not supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.
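As an aside, the four-byte associated channel header used by the second method (the OAM control word) can be packed and parsed with a short sketch; the function names are assumptions for illustration:

```python
import struct

def vccv_control_word(channel_type=0x0021):
    # First nibble 0x1, Format ID and Reserved set to 0, then the 16-bit
    # channel type (0x0021 = IPv4 carried in the associated channel).
    return struct.pack("!I", (0x1 << 28) | channel_type)

def channel_type_of(header):
    word, = struct.unpack("!I", header)
    if word >> 28 != 0x1:
        raise ValueError("not an associated channel header")
    return word & 0xFFFF
```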

When sending the label mapping message for the VLL, PE1 and PE2 must indicate which of the preceding OAM packet encapsulation methods (for example, which control channel type) they support. This is accomplished by including an optional VCCV TLV in the pseudowire FEC Interface Parameter field. The format of the VCCV TLV is shown as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      0x0c     |       0x04    |   CC Types    |   CV Types    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Note:

The absence of the optional VCCV TLV in the interface parameters field of the pseudowire FEC indicates the PE has no VCCV capability.

The Control Channel (CC) Type field is a bitmask used to indicate if the PE supports none, one, or many control channel types.

  • 0x00 None of the following VCCV control channel types are supported

  • 0x01 PWE3 OAM control word

  • 0x02 MPLS Router Alert Label

  • 0x04 MPLS inner label TTL = 1

If both PE nodes support more than one of the CC types, the router PE uses the one with the lowest type value. For instance, the OAM control word is used in preference to the MPLS router alert label.
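This selection rule amounts to taking the lowest set bit common to both bitmasks; a minimal sketch (the function name is an assumption):

```python
def select_cc_type(local_cc: int, remote_cc: int) -> int:
    # CC bitmask values: 0x01 OAM control word, 0x02 router alert label,
    # 0x04 TTL=1. When several types are common, the lowest value wins,
    # so the control word is preferred over the router alert label.
    common = local_cc & remote_cc
    if common == 0:
        return 0                 # no common control channel type
    return common & -common      # isolate the lowest set bit
```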

The Connectivity Verification (CV) bitmask field is used to indicate the specific type of VCCV packets to be sent over the VCCV control channel. The valid values are:

  • 0x00 None of the following VCCV packet types are supported.

  • 0x01 ICMP ping. Not applicable to a VLL over an MPLS or GRE SDP and therefore not supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.

  • 0x02 LSP ping. This is used in the VCCV ping application and applies to a VLL over an MPLS or a GRE SDP. This is supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.

A VCCV ping is an LSP echo request message as defined in RFC 8029. It contains the L2 FEC stack TLV, which must include the sub-TLV of type 10, "FEC 128 Pseudowire". It also contains a field that indicates to the destination PE which reply mode to use. There are four reply modes defined in RFC 8029:


  • Do not reply

    This mode is supported by the routers.

  • Reply via an IPv4/IPv6 UDP packet

    This mode is supported by the routers.

  • Reply with an IPv4/IPv6 UDP packet with a router alert

    This mode sets the router alert bit in the IP header; do not confuse this with the CC type that makes use of the router alert label. This mode is not supported by the routers.

  • Reply via application level control channel

    This mode sends the reply message inband over the pseudowire from PE2 to PE1. PE2 encapsulates the Echo Reply message using the CC type negotiated with PE1. This mode is supported by the routers.

The reply is an LSP Echo Reply message as defined in RFC 8029. The message is sent as per the reply mode requested by PE1. The return codes supported are the same as those supported in the router LSP ping capability.

The VCCV ping feature is in addition to the service ping OAM feature, which can be used to test a service between router nodes. The VCCV ping feature can test connectivity of a VLL with any third-party node, which complies with RFC 5085.

Figure 19. VCCV-ping application
VCCV ping in a multi-segment pseudowire

Pseudowire switching is a method for scaling a large network of VLL or VPLS services by removing the need for a full mesh of T-LDP sessions between the PE nodes as the number of these nodes grows over time. Pseudowire switching is also used whenever there is a need to deploy a VLL service across two separate routing domains.

VCCV ping over a multi-segment pseudowire shows an example of an application of VCCV ping over a multi-segment pseudowire.

In the network, a Termination PE (T-PE) is where the pseudowire originates and terminates. The Switching PE (S-PE) is the node which performs pseudowire switching by cross-connecting two spoke SDPs.

VCCV ping is able to ping to a destination PE. A VLL FEC ping is a message sent by T-PE1 to test the FEC at T-PE2. The operation at T-PE1 and T-PE2 is the same as in the case of a single-segment pseudowire. The pseudowire switching node, S-PE1, pops the outer label, swaps the inner (VC) label, decrements the TTL of the VC label, and pushes a new outer label. The S-PE1 node does not process the VCCV OAM control word unless the VC label TTL expires. In that case, the message is sent to the CPM for further validation and processing. This is the method defined in draft-hart-pwe3-segmented-pw-vccv.

Note:

The originator of the VCCV ping message does not need to be a T-PE node; it can be an S-PE node. The destination of the VCCV ping message can also be an S-PE node.

Use VCCV trace to trace the entire path of a pseudowire with a single command issued at the T-PE. This is equivalent to LSP trace and is an iterative process by which T-PE1 sends successive VCCV ping messages while incrementing the TTL value, starting from TTL=1. The procedure for each iteration is the same as described previously: each node at which the VC label TTL expires checks the FEC and replies with the FEC to the downstream S-PE or T-PE node. The process is terminated when the reply is from T-PE2 or when a timeout occurs.

Figure 20. VCCV ping over a multi-segment pseudowire
Automated VCCV-trace capability for multi-segment pseudowire

Although tracing of the MS-PW path is possible using the methods described in preceding sections, these require multiple manual iterations and require that the FEC of the last pseudowire segment to the target T-PE/S-PE is known a priori at the node originating the echo request message for each iteration. This mode of operation is referred to as a ping mode.

The automated VCCV-trace can trace the entire path of a pseudowire with a single command issued at the T-PE or at an S-PE. This is equivalent to LSP trace and is an iterative process by which the ingress T-PE or S-PE sends successive VCCV-ping messages while incrementing the TTL value, starting from TTL=1.

The method is described in draft-hart-pwe3-segmented-pw-vccv, VCCV Extensions for Segmented Pseudo-Wire, and is pending acceptance by the PWE3 working group. In each iteration, the source T-PE or S-PE builds the MPLS echo request message in a way similar to VCCV ping. The first message, with TTL=1, has the next-hop S-PE T-LDP session source address in the Remote PE Address field in the pseudowire FEC TLV. Each S-PE that terminates and processes the message includes in the MPLS echo reply message the FEC 128 TLV corresponding to the pseudowire segment to its downstream node. The inclusion of the FEC TLV in the echo reply message is allowed in RFC 8029. The source T-PE or S-PE can then build the next echo request message with TTL=2 to test the next-next hop for the MS-pseudowire. It copies the FEC TLV it received in the echo reply message into the new echo request message. The process is terminated when the reply is from the egress T-PE or when a timeout occurs. If specified, the max-ttl parameter in the vccv-trace command stops the trace at an S-PE before the T-PE is reached.

The results of VCCV trace can be displayed for a subset of the pseudowire segments of the end-to-end MS-pseudowire path. In this case, the min-ttl and max-ttl parameters are configured accordingly. However, the T-PE/S-PE node still probes all hops up to min-ttl to correctly build the FEC of the needed subset of segments.

Note:

This method does not require the use of the downstream mapping TLV in the echo request and echo reply messages.

VCCV for static pseudowire segments

MS pseudowire is supported with a mix of static and signaled pseudowire segments. However, VCCV ping and VCCV trace are not supported if at least one segment of the MS pseudowire is static. Users cannot test a static segment, nor can they test contiguous signaled segments of the MS-pseudowire. VCCV ping and VCCV trace are not supported in static-to-dynamic configurations.

Detailed VCCV-trace operation

VCCV ping over a multi-segment pseudowire shows how a trace can be performed on the MS-pseudowire originating from T-PE1 by a single operational command. The following process occurs:

  1. T-PE1 sends a VCCV echo request with TTL set to 1 and a FEC 128 containing the pseudowire information of the first segment (pseudowire1 between T-PE1 and S-PE) to S-PE for validation.

  2. S-PE validates the echo request with the FEC 128. Because it is a switching point between the first and second segment, it builds an echo reply with a return code of 8 and includes the FEC 128 of the second segment (pseudowire2 between S-PE and T-PE2) and sends the echo reply back to T-PE1.

  3. T-PE1 builds a second VCCV echo request based on the FEC 128 in the echo reply from the S-PE. It increments the TTL and sends the next echo request out to T-PE2.

    Note:

    The VCCV echo request packet is switched at the S-PE datapath and forwarded to the next downstream segment without any involvement from the control plane.

  4. T-PE2 receives and validates the echo request with the FEC 128 of the pseudowire2 from T-PE1. Because T-PE2 is the destination node or the egress node of the MS-pseudowire, it replies to T-PE1 with an echo reply with a return code of 3 (egress router), and no FEC 128 is included.

  5. T-PE1 receives the echo reply from T-PE2. T-PE1 recognizes that T-PE2 is the destination of the MS pseudowire because the echo reply does not contain the FEC 128 and because its return code is 3. The trace process is completed.
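The five steps above can be condensed into a toy simulation; the function name and the segment list are assumptions, while the return codes 8 (switching point) and 3 (egress router) follow the description above:

```python
# Toy simulation of the automated VCCV trace: each S-PE answers return
# code 8 together with the FEC 128 of its downstream segment; the egress
# T-PE answers return code 3 with no FEC included.
def vccv_trace(segments):
    # 'segments' lists the FEC 128 of each pseudowire segment in order.
    probes = []
    fec = segments[0]                     # FEC of the first segment
    for ttl in range(1, len(segments) + 1):
        is_egress = (ttl == len(segments))
        code = 3 if is_egress else 8      # 3 = egress router, 8 = S-PE
        probes.append((ttl, fec, code))
        if is_egress:
            break                         # trace completed at the T-PE
        fec = segments[ttl]               # FEC returned in the echo reply
    return probes
```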

Control plane processing of a VCCV echo message in a MS-pseudowire
Sending a VCCV echo request

When in the ping mode of operation, the sender of the echo request message requires the FEC of the last segment to the target S-PE/T-PE node. This information can either be configured manually or be obtained by inspecting the corresponding sub-TLVs of the pseudowire switching point TLV. However, the pseudowire switching point TLV is optional and there is no guarantee that all S-PE nodes populate it with their system address and the pseudowire ID of the last pseudowire segment traversed by the label mapping message. Thus, the router implementation always makes use of the user configuration for these parameters.

When in the trace mode operation, the T-PE automatically learns the target FEC by probing one by one the hops of the MS-pseudowire path. Each S-PE node includes the FEC to the downstream node in the echo reply message in a similar way that LSP trace causes the probed node to return the downstream interface and label stack in the echo reply message.

Receiving a VCCV echo request

Upon receiving a VCCV echo request, the control plane on S-PEs (or the target node of each segment of the MS pseudowire) validates the request and responds to the request with an echo reply consisting of the FEC 128 of the next downstream segment and a return code of 8 (label switched at stack-depth) indicating that it is an S-PE and not the egress router for the MS-pseudowire.

If the node is the T-PE or the egress node of the MS-pseudowire, it responds to the echo request with an echo reply with a return code of 3 (egress router) and no FEC 128 is included.

Receiving a VCCV echo reply

The operation to be taken by the node that receives the echo reply in response to its echo request depends on its current mode of operation such as ping or trace.

In ping mode, the node may choose to ignore the target FEC 128 in the echo reply and report only the return code to the user.

However, in trace mode, the node builds and sends the subsequent VCCV echo request with an incremented TTL and the information (such as the downstream FEC 128) it received in the echo reply, to test the next downstream pseudowire segment.

MPLS-TP on-demand OAM

Ping and trace tools for PWs and LSPs are supported with both IP encapsulation and the MPLS-TP on-demand CV channel for non-IP encapsulation (channel type 0x0025).

MPLS-TP LSPs: LSP ping/LSP trace

For lsp-ping and lsp-trace commands:

  • sub-type static must be specified. This indicates to the system that the rest of the command contains parameters specific to an LSP identified by a static LSP FEC.

  • The 7450 ESS, 7750 SR, and 7950 XRS routers support the use of the G-ACh with non-IP encapsulation, IPv4 encapsulation, or labeled encapsulation with IP de-multiplexing for both the echo request and echo reply for LSP ping and LSP trace on LSPs with a static LSP FEC (such as MPLS-TP LSPs).

  • A user can specify the target MPLS-TP MEP/MIP identifier information for LSP ping. If the target global ID and node ID are not included in the lsp-ping command, these parameters for the target MEP ID are taken from the context of the LSP. The tunnel-number tunnel-num and lsp-num lsp-num for the far-end MEP are always taken from the context of the path under test.

lsp-ping static <lsp-name> 
[force]
[path-type [active|working|protect]]
[fc <fc-name> [profile {in | out}]] 
[size <octets>] 
[ttl <label-ttl>] 
[send-count <send-count>] 
[timeout <timeout>] 
[interval <interval>]
[src-ip-address <ip-address>] 
[dest-global-id <dest-global-id> dest-node-id dest-node-id]
[assoc-channel none | non-ip | ipv4][detail]
lsp-trace static  <lsp-name> 
[force]
[path-type [active|working|protect]]
[fc <fc-name> [profile {in|out}]] 
[max-fail <no-response-count>] 
[probe-count <probes-per-hop>] 
[size <octets>] 
[min-ttl <min-label-ttl>] 
[max-ttl <max-label-ttl>] 
[timeout <timeout>] 
[interval <interval>]
[src-ip-address <ip-address>]
 [assoc-channel none | non-ip | ipv4]
[downstream-map-tlv <dsmap|ddmap>] 
[detail] 

The following commands are only valid if the sub-type static option is configured, implying that the LSP name refers to an MPLS-TP tunnel LSP:

path-type - Values: active, working, protect. Default: active.

dest-global-id global-id dest-node-id node-id - Default: to global-id:node-id from the LSP ID.

assoc-channel: If this is set to none, IP encapsulation over an LSP is used with a destination address in the 127/8 range. If this is set to ipv4, IPv4 encapsulation in a G-ACh over an LSP is used with a destination address in the 127/8 range. The source address is set to the system IP address, unless the user specifies a source address using the src-ip-address option. If this is set to non-ip, non-IP encapsulation over a G-ACh with channel type 0x0025 is used. This is the default for sub-type static.

Note:

The encapsulation used for the echo reply is the same as the encapsulation used for the echo request.
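The assoc-channel behavior above can be summarized in a small lookup sketch; the table is illustrative only, with keys mirroring the CLI values:

```python
# Summary of how the assoc-channel value selects the echo request
# encapsulation (illustrative only; values restate the text above).
ASSOC_CHANNEL = {
    "none":   {"encap": "IP over LSP",       "dst_127_8": True},
    "ipv4":   {"encap": "IPv4 in G-ACh",     "dst_127_8": True},
    "non-ip": {"encap": "G-ACh type 0x0025", "dst_127_8": False},  # default
}

def echo_request_encap(assoc_channel="non-ip"):
    # The echo reply uses the same encapsulation as the echo request.
    return ASSOC_CHANNEL[assoc_channel]
```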

downstream-map-tlv: LSP trace commands with this option can only be executed if the control channel is set to none. The DSMAP/DDMAP TLV is only included in the echo request message if the egress interface is either a numbered IP interface, or an unnumbered IP interface. The TLV is not included if the egress interface is of type unnumbered-mpls-tp.

For lsp-ping, the dest-node-id may be entered as a 4-octet IP address in the format a.b.c.d, or as a 32-bit integer in the range of 1 to 4294967295. For lsp-trace, the destination node ID and global ID are taken from the spoke-sdp context.

The send mode and reply mode are always taken to be an application level control channel for MPLS-TP.

The force parameter causes an LSP ping echo request to be sent on an LSP that has been brought operationally down by Bidirectional Forwarding Detection (BFD) (LSP ping echo requests would normally be dropped on operationally down LSPs). This parameter is not applicable to SAA.

The LSP ID used in the LSP ping packet is derived from a context lookup based on the LSP name and the path type (active/working/protect).

The dest-global-id and dest-node-id keywords refer to the target global/node ID. They do not need to be entered for end-to-end ping and trace, and the system uses the destination global ID and node ID from the LSP ID.

The same command syntax is applicable for SAA tests configured under config>saa>test.

MPLS-TP pseudowires: VCCV ping/VCCV trace

The 7450 ESS, 7750 SR, and 7950 XRS routers support VCCV ping and VCCV trace on single segment PWs and multi-segment PWs where every segment has static labels and a configured MPLS-TP PW Path ID. VCCV ping and trace on MS-PWs are supported, where a static MPLS-TP PW segment is switched to a dynamic T-LDP signaled segment.

Static MS-PW PWs are referred to with the sub-type static in the vccv-ping and vccv-trace commands. This indicates to the system that the rest of the command contains parameters that are applied to a static PW with a static PW FEC.

Two ACH channel types are supported: the IPv4 ACH channel type and the non-IP ACH channel type (0x0025). This is known as the non-IP associated channel. This is the default for type static. The Generic ACH Label (GAL) is not supported for PWs.

If the IPv4 associated channel is specified, the IPv4 channel type is used (0x0021). In this case, a destination IP address in the 127/8 range is used, while the source address in the UDP/IP packet is set to the system IP address, or may be explicitly configured by the user with the src-ip-address option. This option is only valid if the IPv4 control-channel is specified.

The reply mode is always assumed to be the same application level control channel type for type static.

As with other PW types, the downstream mapping and detailed downstream mapping TLVs (DSMAP/DDMAP TLVs) are not supported on static MPLS-TP PWs.

The following CLI command descriptions show the options that are only allowed if the type static option is configured. All other options are blocked.

vccv-ping static sdp-id:vc-id [target-fec-type pw-id-fec sender-src-address ip-addr remote-dst-address ip-address pw-id pw-id pw-type pw-type] [dest-global-id global-id dest-node-id node-id] [assoc-channel ipv4 | non-ip] [fc fc-name [profile {in | out}]] [size octets] [count send-count] [timeout timeout] [interval interval] [ttl vc-label-ttl] [src-ip-address ip-addr]

vccv-trace static sdp-id:vc-id [assoc-channel ipv4 | non-ip] [src-ip-address ipv4-address] [target-fec-type pw-id sender-src-address ip-address remote-dst-address ip-address pw-id pw-id pw-type pw-type] [detail] [fc fc-name [profile in | out]] [interval interval-value] [max-fail no-response-count] [max-ttl max-vc-label-ttl] [min-ttl min-vc-label-ttl] [probe-count probe-count] [size octets] [timeout timeout-value]

If the spoke SDP referred to by the sdp-id:vc-id has an MPLS-TP PW-Path-ID defined, those parameters are used to populate the static PW TLV in the target FEC stack of the vccv-ping or vccv-trace packet. If a global ID and node ID are specified in the command, these values are used to populate the destination node TLV in the vccv-ping or vccv-trace packet.

The global ID and node ID are only used as the target node identifiers if the vccv-ping is not end-to-end (for example, a TTL is specified in the vccv-ping or trace command and it is < 255); otherwise, the value in the PW Path ID is used. For vccv-ping, the dest-node-id may be entered as a 4-octet IP address a.b.c.d or 32-bit integer 1 to 4294967295. For vccv-trace, the destination node ID and global ID are taken from the spoke SDP context.

The same command syntax is applicable for SAA tests configured under config>saa>test.

VCCV ping and VCCV trace between static MPLS-TP and dynamic PW segments

The 7450 ESS, 7750 SR, and 7950 XRS routers support end-to-end VCCV ping and VCCV trace between a segment with a static MPLS-TP PW and a dynamic T-LDP segment by allowing the user to specify a target FEC type for the VCCV echo request message that is different from the local segment FEC type. That is, it is possible to send a VCCV ping or trace echo request containing a static PW FEC in the target stack TLV at a T-PE where the local egress PW segment is signaled, or a VCCV ping or trace echo request containing a PW ID FEC (FEC128) in the target stack TLV at a T-PE where the egress PW segment is a static MPLS-TP PW.

Note:

All signaled T-LDP segments and the static MPLS-TP segments along the path of the MS-PW must use a common associated channel type. Because only the IPv4 associated channel is supported in common between the two segments, this must be used. If a user selects a non-IP associated channel on the static MPLS-TP spoke SDP, vccv-ping and vccv-trace packets are dropped by the S-PE.

The target-fec-type option of the vccv-ping and vccv-trace commands is used to indicate that the remote FEC type is different from the local FEC type. For a vccv-ping initiated from a T-PE with a static PW segment with MPLS-TP parameters, attempting to ping a downstream FEC 128 segment, a target-fec-type of pw-id is configured with a static PW type. In this case, an assoc-channel type of non-ip is blocked, and vice versa. Likewise, the reply-mode must be set to control-channel. For a vccv-ping initiated from a T-PE with a FEC 128 PW segment, attempting to ping a downstream static PW FEC segment, a target-fec-type of static is configured with a pw-id PW type. In this case, a control-channel type of non-ip is blocked, and vice versa. Likewise, the reply-mode must also be set to control-channel.

When using VCCV trace, where the first node to be probed is not the first-hop S-PE, the initial TTL must be set to >1. In this case, the target-fec-type refers to the FEC at the first S-PE that is probed.

The same rules apply to the control-channel type and reply-mode as for the vccv-ping case.

MPLS Performance Monitoring (MPLS PM)

RFC 6374, Packet Loss and Delay Measurement for MPLS Networks, provides a standard packet format and process for measuring the delay of a unidirectional or bidirectional (MPLS-TP) LSP using the General Associated Channel (G-ACh), channel type 0x000C. Unidirectional LSPs, such as RSVP-TE, require an additional TLV to return a response to the querier (the launch point). RFC 7876, UDP Return Path for Packet Loss and Delay Measurement for MPLS Networks, defines the source IP information to include in the UDP Path Return TLV so that the responding node can reach the querier over an IP network. The MPLS DM PDU does not natively include any IP header information. With MPLS-TP, there is no requirement for the TLV defined in RFC 7876.

The function of MPLS delay measurement is similar regardless of LSP type. The querier sends the MPLS DM query message toward the responder, transported in an MPLS LSP. The responder extracts the required PDU information to respond appropriately.

Launching of MPLS DM tests is configured in the config>oam-pm>session session-name test-family mpls context. The basic architectural OAM-PM components must be completed along with the MPLS-specific configuration. The test PDU includes the following PDU settings:

  • Channel Type: 0x000C (MPLS DM)

  • Flags: Query

  • Control Code: out-of-band (unidirectional LSP) and in-band (bidirectional LSP)

  • Querier Timestamp Format: IEEE 1588-2008 (1588v2) Precision Time Protocol truncated timestamp format

  • Session Identifier: The configured oam-pm>session>mpls>dm test-id test-id

  • DSCP: The configured oam-pm>session>mpls>dscp dscp-name. This value is not used to convey or influence the CoS setting in the MPLS TC field, on either the querier or the reflector. The profile {in | out} and fc fc-name commands must be used to influence CoS markings at launch, and the MPLS TC influences CoS handling and marking upon reception.

  • Timestamp 1: Set to the local transmit time in PTP format

  • Timestamps 2 and 3: set to 0

TLVs can also be included, based on the configuration:

  • Padding (Type 0)

    Copy in Response: Padding is returned in the response when the oam-pm>session>mpls>dm>reflect-pad is configured.

  • Padding (Type 128)

    Do not copy in Response: Padding is not returned in the response when the oam-pm>session>mpls>dm>pad-tlv-size is configured without the reflect-pad command. This is the typical configuration with unidirectional LSPs.

  • UDP Return (Type 131)

    UDP Return object: The IP information used by the reflector to reach the querier for an out-of-band response, when the oam-pm>session>mpls>lsp is rsvp or rsvp-auto and the udp-return-object information is configured.

The maximum pad size of 257 bytes is a result of the structure of the defined TLV: the one-byte length field limits the value field to 255 bytes, and the type and length fields add two more bytes.

The reflector processes the inbound MPLS DM PDU and responds to the querier based on the received information, using the response flag setting. Specific to the timestamps, the responder replies using the Querier Timestamp Format, filling in the Timestamp 2 and Timestamp 3 values.

When the response arrives back at the querier, the delay metrics are computed. The common OAM-PM computation model and naming are used to simplify and rationalize the different technologies that leverage the OAM-PM infrastructure. The common methodology reports unidirectional and round-trip delay metrics for Frame Delay (FD), InterFrame Delay Variation (IFDV), and Frame Delay Range (FDR). The term "frame" is not indicative of the underlying technology being measured; it is a common cross-technology name that maps to the appropriate term for the measured technology. This common naming requires a mapping to the delay measurements defined in RFC 6374.

Table 2. Normalized naming mapping

  Description                                                  RFC 6374     OAM-PM
  A to B Delay                                                 Forward      Forward
  B to A Delay                                                 Reverse      Backward
  Two Way Delay (regardless of processing delays
  within the remote endpoint B)                                Channel      Round-Trip
  Two Way Delay (includes processing delay at the
  remote endpoint B)                                           Round-Trip   Not supported

Because OAM-PM uses a common reporting model, unidirectional (forward and backward) and round-trip delays are always reported. With unidirectional LSPs, the T3 and T4 timestamps are zeroed, but the backward and round-trip directions are still reported; in this case, the backward and round-trip values have no significance for the measured MPLS network.
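The delay computation described above can be sketched as follows. This is an illustrative sketch (function name assumed), not SR OS code, assuming the four RFC 6374 timestamps: T1 and T4 are taken by the querier, T2 and T3 by the responder.

```python
# Illustrative sketch of the OAM-PM delay metrics derived from the four
# RFC 6374 timestamps. For a unidirectional LSP, T3 and T4 are zeroed,
# so the backward and round-trip values carry no significance.

def delay_metrics(t1, t2, t3, t4):
    forward = t2 - t1                    # A-to-B delay ("Forward")
    backward = t4 - t3                   # B-to-A delay ("Backward")
    round_trip = (t4 - t1) - (t3 - t2)   # excludes responder processing time
    return forward, backward, round_trip

# Timestamps in microseconds, bidirectional (MPLS-TP) case:
print(delay_metrics(100, 350, 360, 605))  # (250, 245, 495)
```

Note that the round-trip value subtracts the responder's processing time (T3 - T2), matching the "regardless of processing delays" row of the normalized naming mapping.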

An MPLS DM test can measure the endpoints of the LSP when the TTL is set equal to or higher than the termination point distance. Midpoints along the path that support MPLS DM response functions can be queried by setting a TTL that expires along the path. The MPLS DM launch and reflection capability, including on mid-path transit nodes, is disabled by default. To launch and reflect MPLS DM test packets, config>test-oam>mpls-dm must be enabled.

The SR OS implementation supports the MPLS DM Channel Type 0x000C function from RFC 6374 for the following:

  • Label Switched Path types

    • RSVP-TE and RSVP-TE Auto LSPs set out-of-band response request and require the configuration of the UDP-Return-Object.

    • MPLS-TP sets in-band response request.

  • Querier and Responder

  • Traffic class indicator set to on (1)

  • Querier Timestamp format PTP

  • Mandatory TLV 0 – Copy padding

  • Optional TLV 128 – Do not copy padding

  • Optional TLV 131 – UDP-Return-Object (RFC 7876)

The following functions are not supported:

  • Packet Loss measurement

  • Throughput management

  • Dyadic measurements

  • Loopback

  • Mandatory TLVs

    • TLV 1 – Return Address

    • TLV 2 – Session Query Interval

    • TLV 3 – Loopback Request

  • Optional TLVs

    • TLV 129 – Destination Address

    • TLV 130 – Source Address

Configuring MPLS PM

The following example combines the different MPLS OAM-PM elements for the various LSP types. Only configuration on the querier is included, excluding the basic MPLS and IP configuration. The equivalent MPLS configuration must be completed on all responders. Enabling MPLS DM is required on all queriers and responders.

Accounting policy configuration

The following describes the accounting policy configuration:

config>log# info
----------------------------------------------
file-id 1
    description "OAM PM XML file Parameters"
    location cf2:
    rollover 10 retention 2
exit
accounting-policy 1
   description "Default OAM PM Collection Policy for 5-min Bins"
   record complete-pm
   collection-interval 5
   to file 1
  no shutdown
exit
log-id 1
exit
Enabling MPLS DM

The following configuration enables MPLS DM:

config>test-oam> #info
----------------------------------------------
        mpls-dm
            no shutdown
        exit
RSVP LSP configuration

The following shows the RSVP LSP configuration:

config>router> #info
--------------------------------------------
      mpls
            path "path-1"
                no shutdown
            exit
        lsp "LSP-PE-2-PE-1-via29"   //lsp-name for oam-pm//
             to 192.0.2.1
             cspf
             include "via-P-3"
             primary "path-1"
             exit
             no shutdown
         exit
RSVP-auto LSP configuration components

The following shows the RSVP-auto LSP configuration:

config>router> #info
-----------------------------------------------------
        mpls
            path "path-1"
                no shutdown
            exit
           
            lsp-template "auto-system-lsps" mesh-p2p //template name for oam-pm//
                from 1.1.1.31                        //from address for oam-pm//
                default-path "path-1"
                cspf
                no shutdown
            exit
         
            auto-lsp lsp-template "auto-system-lsps" policy "auto-lsp"
            no shutdown
        exit
#--------------------------------------------------
echo "Policy Configuration"
#--------------------------------------------------
        policy-options
            begin
            prefix-list "mesh-p2p"                 //to addressing for oam-pm//
                prefix 1.1.1.28/32 exact
                prefix 1.1.1.29/32 exact
            exit
            policy-statement "auto-lsp"
                entry 10
                    from
                        prefix-list "mesh-p2p"
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit 
MPLS-TP LSP configuration

The following shows the MPLS-TP LSP configuration:

config>router> #info
--------------------------------------------
      mpls
          mpls-tp
               global-id 135
               node-id 0.0.0.2
               tp-tunnel-id-range 100 200
               protection-template "ptcTemplate"
               exit
               oam-template "bfd-template"
               bfd-template "bfdTemplate"
               exit
               no shutdown
      exit
      lsp "LSP-PE-2-PE-1-static" mpls-tp 100
               to node-id 0.0.0.1
                dest-global-id 135
               dest-tunnel-number 100
               working-tp-path
                   in-label 129
                   out-label 131 out-link "int-PE-2-P-3" next-hop 192.168.23.1
                   mep
                       oam-template "bfd-template"
                       bfd-enable cc
                       no shutdown
                   exit
                   no shutdown
               exit
               no shutdown
      exit
MPLS OAM-PM configuration

The following shows the MPLS OAM-PM configuration:

config>oam-pm# info
----------------------------------------------
        bin-group 2 fd-bin-count 10 fdr-bin-count 10 ifdv-bin-count 10 create
            bin-type fd
                bin 1
                    lower-bound 1000
                exit
                bin 2
                    lower-bound 2000
                exit
                bin 3
                    lower-bound 3000
                exit
                bin 4
                    lower-bound 4000
                exit
                bin 5
                    lower-bound 5000
                exit
                bin 6
                    lower-bound 6000
                exit
                bin 7
                    lower-bound 7000
                exit
                bin 8
                    lower-bound 8000
                exit
                bin 9
                    lower-bound 9000
                exit
            exit
            bin-type fdr
                bin 1
                    lower-bound 1000
                exit
                bin 2
                    lower-bound 1500
                exit
                bin 3
                    lower-bound 2000
                exit
                bin 4
                    lower-bound 2500
                exit
                bin 5
                    lower-bound 3000
                exit
                bin 6
                    lower-bound 3500
                exit
                bin 7
                    lower-bound 4000
                exit
                bin 8
                    lower-bound 4500
                exit
                bin 9
                    lower-bound 5000
                exit
            exit
            bin-type ifdv
                bin 1
                    lower-bound 500
                exit
                bin 2
                    lower-bound 750
                exit
                bin 3
                    lower-bound 1000
                exit
                bin 4
                    lower-bound 1250
                exit
                bin 5
                    lower-bound 1500
                exit
                bin 6
                    lower-bound 1750
                exit
                bin 7
                    lower-bound 2000
                exit
                bin 8
                    lower-bound 2250
                exit
                bin 9
                    lower-bound 2500
                exit
            exit
            no shutdown
        exit
 
       session "mpls-dm-rsvp-PE-2-PE-1" test-family mpls session-type proactive create
            bin-group 2
            description "mpls dm testing rsvp"
            meas-interval 5-mins create
                accounting-policy 9
                event-mon
                    delay-events
                    no shutdown
                exit
            exit
            mpls
                dscp "af11"
                fc "af"
                lsp
                    rsvp
                        lsp "LSP-PE-2-PE-1-via29"
                        udp-return-object 192.0.2.2
                    exit
                exit
                profile in
                dm test-id 5 create
                    interval 2000
                    pad-tlv-size 257
                    no shutdown
                exit
            exit
        exit
        session "mpls-dm-static-PE-2-PE-1" test-family mpls session-type proactive create
            bin-group 2
            description "mpls dm testing static-mpls-tp"
            meas-interval 5-mins create
            exit
            mpls
                dscp "af11"
                fc "af"
                lsp
                    mpls-tp-static
                        lsp "LSP-PE-2-PE-1-static"
                    exit
                exit
                profile in
                ttl 5
                dm test-id 100 create
                    interval 2000
                    no shutdown
                exit
            exit
        exit
        session "mpls-dm-rsvp-auto-PE-2-PE-1" test-family mpls session-type proactive create
            bin-group 2
            description "mpls dm testing rsvp-auto-lsp"
            meas-interval 5-mins create
            exit
            mpls
                dscp "af11"
                fc "af"
                lsp
                    rsvp-auto
                        from 192.0.2.2
                        lsp-template "auto-system-lsps"
                        to 192.0.2.1
                        udp-return-object 192.0.2.2
                    exit
                exit
                profile in
                ttl 5
                dm test-id 200 create
                    interval 2000
                    no shutdown
                exit
            exit
        exit

BIER OAM

BIER supports the bier-ping and bier-trace OAM tools. FP4 or FP5 hardware is required, and only IPv4 is supported. The tools are not supported in the following cases:

  • on VSR

  • under the SAA tool or OAM-PM architecture

  • MCI

A bier-ping packet is sent in a specific subdomain. The user can specify the subdomain in which the BIER OAM packet is generated. In addition, the BIER OAM packet must be destined for a BFER or a set of BFERs. Multiple BFERs can be specified for bier-ping; a single BFER can be specified for bier-trace. The BFER can be specified in one of the following ways:

  • through a BIER prefix

    The BIER prefix is flooded to the IGP domain through the IGP BIER TLV. The TLV also contains the BFR-ID, so the BIER prefix can be used to find the BFER BFR-ID and its corresponding SI and bit position in the BIER header. Up to 16 BIER prefixes can be specified for bier-ping.

  • through BFR-ID

    SR OS supports 16 SIs and 255 bits for the BIER header, which means that 4K BFR-IDs (16 × 255 = 4080) are supported. A BFR-ID or a range of BFR-IDs can be specified in the OAM command to build the SI and bit positions in the BIER header. Up to 16 contiguous BFR-IDs can be specified for bier-ping.
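The mapping from a BFR-ID to its SI and bit position follows the standard BIER convention (RFC 8279). The following is an illustrative sketch (function name assumed), using the 255-bit bit-string length stated above:

```python
# Sketch of the BIER mapping from a BFR-ID to the Set Identifier (SI) and
# bit position used in the BIER header. With 16 SIs of 255 bits each,
# roughly 4K BFR-IDs are addressable (16 * 255 = 4080).

BSL = 255  # bit-string length assumed from the text above

def si_and_bit(bfr_id):
    si = (bfr_id - 1) // BSL             # which set the BFER falls into
    bit = ((bfr_id - 1) % BSL) + 1       # 1-based bit position within the set
    return si, bit

print(si_and_bit(1))     # (0, 1)
print(si_and_bit(255))   # (0, 255)
print(si_and_bit(256))   # (1, 1)
```

This is how a BFR-ID (or range of BFR-IDs) supplied on the OAM command line resolves to bits in the BIER header.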

ECMP and BIER OAM

ECMP is not supported for BIER on the 7x50. For BIER OAM, the Multipath Entropy data sub-TLV of the Downstream Detailed Mapping TLV is used for ECMP discovery. SR OS does not support this sub-TLV type; if SR OS receives a Multipath Entropy data sub-TLV in a BIER OAM packet, it responds with the return code "One or more of the TLVs was not understood".

Outbound time

BIER ping and BIER trace support only outbound time. Round-trip time is not supported because multicast is unidirectional: BIER ping travels in-band downstream, but the echo reply is out-of-band. The outbound time is calculated from the network processor (NP) of the root to the NP of the leaf nodes, where the packets are timestamped.

Negative outbound time

If negative outbound times are displayed for BIER OAM, the likely cause is that the root and leaf nodes are not time-synchronized. In this case, the user must ensure that the root and the leaf nodes are synchronized.

ICMP ping check: connectivity checking using ICMP echo request and response

In specific network configurations, it is not always possible to deploy preferred standards-based, purpose-built, robust connectivity verification tools such as Bidirectional Forwarding Detection (BFD) or Ethernet Connectivity Fault Management Continuity Check Message (ETH-CCM). When circumstances prevent the preferred connectivity validation methods, an ICMP echo request and response ping connectivity check (icmp ping check) using ping templates can be used as an alternative connectivity checking method. Before deploying this approach, an understanding of how these packets are treated on the involved network elements is required. The ping check affects the operational state of the VPRN or IES service IPv4 interface (the service IP interface) being verified.

Deployment of this feature requires the following:

  • configuring the ping template

  • the assignment of the ping template to a service IP interface

  • optionally, configuring the distributed CPU protection for the icmp-ping-check protocol

The ping template defines timers and thresholds that determine the basis for connectivity verification and influence the service IP interface operational state. The configuration of the ping template is located in the config>test-oam>icmp context. The configuration options are separated to allow different failure detection and recovery behaviors.

The transmission frequency (interval), loss detection (timeout), and threshold (failure-threshold) are used to check connectivity when the service IP interface is steady and operationally up, or steady and operationally down. When these values are monitoring connectivity and the service IP interface is operationally up, consecutive failures that reach the failure threshold transition the interface to operationally down. When these values are monitoring connectivity and the service IP interface is operationally down, a first success triggers the recovery values to complete the validation. For example, with interval 10 (seconds), timeout 5 (seconds), and failure-threshold 3 (count), failure detection takes 30 seconds.
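The example's arithmetic can be checked with a minimal sketch (function name assumed), following the document's model in which detection completes after failure-threshold probe intervals:

```python
# Rough arithmetic sketch of the failure detection time described above,
# assuming probes are sent every `interval` seconds and the interface is
# declared down after `failure_threshold` consecutive losses.

def detection_time(interval, failure_threshold):
    return interval * failure_threshold

print(detection_time(10, 3))  # 30 seconds, matching the example in the text
```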

When a service IP interface transitions from operationally up to operationally down because of icmp ping check the log event ‟UTC WARNING: SNMP #2004 vprn1000999 int-PE-CE-999. Interface int-PE-CE-999 is not operational” is generated.

When a service IP interface has transitioned from operationally up to operationally down because of icmp ping check, transmission continues at the specified interval until a successful ICMP echo response related to an ICMP echo request is received. When the first success is received, there is a possible transition from operationally down to operationally up, and the function moves to the recovering phase. The icmp ping check packets for the affected service IP interface start to transmit at the recovery frequency (reactivation-interval), invoking loss detection (reactivation-timeout) and a consecutive success count (reactivation-threshold). If the reactivation threshold is reached, the service IP interface transitions from operationally down to operationally up, and the transmission frequency (interval), loss detection (timeout), and threshold (failure-threshold) are again used to monitor the interface.

When a service IP interface transitions from operationally down to operationally up because of icmp ping check the log event ‟UTC WARNING: SNMP #2005 vprn1000999 int-PE-CE-999. Interface int-PE-CE-999 is operational” is generated.

If a failure occurs in the recovering phase, the reactivation-failure-threshold is consulted to determine the number of retries that should be attempted in this phase. This option allows a service IP interface a specified number of retries in this phase before returning to transmitting at interval and those associated values. The reactivation-failure-threshold parameter is bypassed if there was a previous success for the service IP interface in the recovery phase for the latest transition. This parameter determines the number of consecutive failures, without a previous success, before declaring the recovering is not proceeding and returns to the interval values. In larger scale environments this value may need to be increased.
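The recovering-phase decision logic described above can be sketched as follows. This is an illustrative sketch (names assumed), not SR OS code: the reactivation-failure-threshold only applies while no success has yet been seen in the current recovery phase.

```python
def recovery_outcome(results, react_threshold, react_fail_threshold):
    """Walk probe results (True = success) in the recovering phase.
    Returns 'up' after react_threshold consecutive successes,
    'abandon' after react_fail_threshold consecutive failures with no
    prior success in this phase (fall back to the steady interval),
    or 'recovering' if the sequence ends undecided."""
    consec_ok = 0
    consec_fail = 0
    seen_success = False
    for ok in results:
        if ok:
            seen_success = True
            consec_ok += 1
            consec_fail = 0
            if consec_ok >= react_threshold:
                return "up"
        else:
            consec_fail += 1
            consec_ok = 0
            # the failure threshold is bypassed once a success was seen
            if not seen_success and consec_fail >= react_fail_threshold:
                return "abandon"
    return "recovering"

print(recovery_outcome([True, True, True], 3, 9))   # up
print(recovery_outcome([False] * 9, 3, 9))          # abandon
```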

Only packets related to the icmp ping check, ICMP echo request and ARP packets specifically associated with the assigned local ping template, can be sent when the interface is operationally down because of an icmp ping check failure. Only packets related to the icmp ping check, ICMP echo response and ARP packets specifically associated with the assigned local ping template, can be received when the interface is operationally down because of the ping check failure.

A ping check function should never be configured on both peers; this leads to deadlock conditions that can only be resolved by manually disabling the ping template under the interface. As previously stated, only packets associated with the local ping template can be transmitted and received on a service IP interface when the interface is operationally down because of an icmp ping check failure.

The configured ping template values can be updated without changing the administrative state or existing references. However, a service IP interface that references a specific ping template imports the values only when the ping-template is administratively enabled under the interface; modified ping template values are not automatically propagated to referencing interfaces. To push the changes to the referencing service IP interfaces, use the tools>perform>test-oam>icmp>ping-template-sync template-name command. This command updates all interfaces that reference the specified ping-template; the update runs in the background after the command is accepted. The command does not survive an HA event. If an HA event occurs before the tools command has completed, the interfaces that were not yet updated at the time of the HA event do not receive the new values; if there is a concern that not all interfaces received the update, execute the command again on the newly active CPM.

For a service IP interface to import and start using the icmp ping check, the ping-template template-name must be enabled and the destination-address ip-address must be configured. When the ping-template is added to the service IP interface, the values associated with that ping-template are imported. When the ping-template administrative state under the service IP interface is enabled, the values are checked again to ensure that the latest values associated with the ping-template are being used. The source IP address of the packet is the primary IPv4 address of the service IP interface; this is not a configurable parameter.

When the ping-template command is administratively enabled under a service IP interface that is operationally up, the interface is assumed to have connectivity until proven otherwise. This means the interface state is not affected unless the ping template determines that there are connectivity issues based on the interval, timeout, and failure-threshold commands. If the desired behavior is for the ping-template to validate service IP interface connectivity before allowing the interface to become operational, administratively disable the service IP interface, enable the ping-template under that interface, and then administratively enable the interface. This is treated as operationally down because of underlying conditions.

When the ping-template command is administratively enabled under a service IP interface that is operationally down because of an underlying condition unrelated to icmp ping check, then when the underlying condition is cleared, the icmp ping check prevents the interface from entering the operationally up state until it can verify connectivity. When the underlying condition is cleared, the icmp ping check function enters the recovering phase using the reactivation-interval, reactivation-timeout, reactivation-threshold, and reactivation-failure-threshold values.

When a node is rebooted, service IP interfaces, with administratively enabled ping templates, must verify the interface connectivity before allowing it to progress to an operationally up state. This ensures that the interface does not bounce from operationally up to operationally down after a reboot and the service IP interface state is properly reflected when the reboot is complete. Service IP interfaces that have an administratively enabled ping-template enter the recovering phase using the reactivation-interval, reactivation-timeout, reactivation-threshold and the reactivation-failure-threshold values following a reboot.

When a soft reset condition is raised, the icmp ping check state for the service IP interface is held in the state it was in when the soft reset started, until the soft reset is complete. The interfaces exit the soft reset in the same phase they entered, but all counters are cleared. Service IP interfaces with an administratively enabled ping template enter this held state if they are related in any way to hardware undergoing the soft reset. Two examples demonstrate the expected behavior: when a service IP interface is related to a LAG, the interface enters the held state if a single port member in that LAG is affected by the soft reset; similarly, a service IP interface connected using an R-VPLS configuration enters the held state.

The protocol used by the icmp ping check function has been added to the distributed CPU protection list of protocols as icmp-ping-check. The distributed CPU protection function can be used to limit the amount of icmp ping check packets received on a service IP interface with an enabled ping template. This is an optional configuration that prevents crossover impact on unrelated service IP interfaces using icmp ping check because of a rogue interface.

The show>service>id>interface ip-int-name detail command has been updated with the ping-template values and operational information. The most effective way to view the output is to use a match criterion for ‟Ping Template Values in Use”. The ‟Ping Template Values in Use” section of the output reports the current values that were imported from the referenced config>test-oam>icmp>ping-template. The ‟Operational Data” section of the output includes the administrative state (Up or Down) and the destination address being tested (IP address or notConfigured). It also includes the current interval in use (interval or reactivation-interval) and the current state being reported (operational, notRunning, failed). There are also pass and fail counters reporting the number of consecutive passes or failures that have occurred in the current state, which provides a stability indicator. If these values are low, it may indicate that even though no operational state transitions have occurred, there are intermittent but frequent failures. If neither of these counters is incrementing, it is likely that an underlying condition has been detected and the icmp ping check is not attempting to send, and cannot receive, connectivity packets. These counters are cleared when moving between different intervals and on a soft reset.

show service id <service-id> interface <ip-int-name> detail | match "Ping Template Values in Use" post-lines 29
Ping Template Values in Use
Name             : customer-access-basic
Description      : basic service detection and recovery
Dscp             : nc1
Dot1p            : 7
Interval         : 10
Timeout          : 1
Failure Threshold: 3
React Fail Thresh: 9
React Interval   : 1
React Timeout    : 1
React Threshold  : 3
Size             : 56
TTL              : 1
Ping Template Operational Data
Admin State      : Up
Destination      : 192.9.99.2
Current Interval : Interval
Current State    : Operational
Ping Template Counters
Fail Counter     : 0
Pass Counter     : 107 

The show>test-oam>icmp>ping-template and show>test-oam>icmp>ping-template-using commands have been added to display the various config>test-oam>icmp>ping-template configurations and the services referencing the ping templates.

Using icmp ping check on service IP interfaces incurs longer recovery delays on failure and reboot because of the additional validation required for those interfaces.

The icmp ping check function supports IPv4 interfaces created on SAPs in VPRN, IES, and R-VPLS services, as well as Ethernet satellite (esat) connections. When the service IP interface uses an R-VPLS configuration, the interface between the VPRN or IES service and the VPLS service is a virtual connection. For the icmp ping check to function properly in R-VPLS environments, the connection used to validate the peer must be reachable over a SAP.

The icmp ping check should only be used when other purpose-built connectivity checking is not a deployable solution. Interaction with contending protocols may be unexpected.

Regarding the interaction between icmp ping check and the service IP interface hold-time: in general, the hold-time up option delays the deactivation of the associated IP interface by the specified number of seconds, and the hold-time down option delays the activation of the associated IP interface by the specified number of seconds.

With the hold-time up option, if a service IP interface is about to transition from operationally up to down because the port transitioned from operationally up to down (loss of signal, administrative down, and so on), the hold-time up timer is started and the interface remains operationally up until the timer expires. The icmp ping check runs in parallel because the underlying operational state change has been delayed; if the underlying condition lasts longer than the icmp ping check detection time, the check can fail while the timer is counting down. If the hold-time up timer expires, the interface transitions to operationally down, and the icmp ping check recognizes the underlying issue and stops trying to transmit. Normal underlying-condition recovery, noted earlier in this section, follows.

If however, the hold-time up is short circuited because the port returns to an operationally up state before the expiration of the hold-time up, the following interactions are noted:

  • If the icmp ping check has not failed before the port returns to operational up, the service IP interface stays operational and the icmp ping check continues at interval, without ever having affected the operational state of the interface.

  • If the icmp ping check has registered a failure during this time, the service IP interface transitions to operationally down because of the icmp ping check, and the icmp ping check must recover the interface using the reactivation-interval.

With the hold-time down option, if a service IP interface is about to transition from operationally down to up because the port transitioned from operationally down to up, the interface remains down until the down timer expires. When the timer expires, the normal underlying-condition recovery noted earlier in this section follows.

These validations do not support or impact IPv6 interfaces.

There is no support for Nokia-specific ICMP packets (config>system>enable-icmp-vse) on interfaces that are using ping templates.

ICMP ping check connectivity is only supported on FP3-based and above platforms and should not be configured on any service IP interfaces that are configured over hardware that does not meet this requirement.

IP Performance Monitoring (IP PM)

SR OS supports Two-Way Active Measurement Protocol (TWAMP), Two-Way Active Measurement Protocol Light (TWAMP Light), and Simple Two-Way Active Measurement Protocol (STAMP).

TWAMP

TWAMP provides a standards-based method for measuring the IP performance (packet loss, delay, and jitter) between two devices. TWAMP leverages the methodology and architecture of One-Way Active Measurement Protocol (OWAMP) to define a way to measure two-way or round-trip metrics.

There are four logical entities in TWAMP: the Control-Client, the Session-Sender, the server, and the Session-Reflector. The Control-Client and Session-Sender are typically implemented in one physical device (the ‟client”) and the server and Session-Reflector in a second physical device (the ‟server”). The router acts as the ‟server”.

The Control-Client and server establish a TCP connection and exchange TWAMP-Control messages over this connection. When a server accepts the TCP control session from the Control-Client, it responds with a server greeting message. This greeting includes the various modes supported by the server, in the form of a bit mask; each bit in the mask represents a functionality supported on the server. When the Control-Client wants to start testing, it communicates the test parameters to the server, requesting any of the modes that the server supports. If the server agrees to conduct the described tests, the test begins as soon as the Control-Client sends a Start-Sessions or Start-N-Sessions message. As part of a test, the Session-Sender sends a stream of UDP-based TWAMP test packets to the Session-Reflector, and the Session-Reflector responds to each received packet with a UDP-based TWAMP test response packet. When the Session-Sender receives the response packets from the Session-Reflector, the information is used to calculate two-way delay, packet loss, and packet delay variation between the two devices. The exchange of TWAMP test PDUs is referred to as a TWAMP-Test.

The TWAMP test PDU does not achieve symmetrical packet size in both directions unless the frame is padded with a minimum of 27 bytes. The Session-Sender is responsible for applying the required padding. After the frame is appropriately padded, the Session-Reflector reduces the padding by the number of bytes needed to provide symmetry.
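The 27-byte figure follows from the unauthenticated TWAMP-Test base packet sizes defined in RFC 4656 and RFC 5357: the sender header is 14 bytes and the reflector header is 41 bytes. A minimal sketch of the symmetry arithmetic (constant and function names are illustrative):

```python
# Unauthenticated TWAMP-Test base packet sizes (illustrative constants):
SENDER_BASE = 14      # Sequence Number (4) + Timestamp (8) + Error Estimate (2)
REFLECTOR_BASE = 41   # adds Rx timestamp, echoed sender fields, TTL, MBZ bytes

ASYMMETRY = REFLECTOR_BASE - SENDER_BASE  # the 27-byte difference

def reflector_padding(sender_padding):
    """The Session-Reflector trims the sender's padding by the base-size
    difference so both directions are the same size on the wire."""
    if sender_padding < ASYMMETRY:
        raise ValueError("at least 27 bytes of sender padding are needed")
    return sender_padding - ASYMMETRY
```

With 100 bytes of sender padding, the reflector keeps 73, so both directions carry 114 bytes of test payload.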

Server mode support includes:

  • individual session control (Mode Bit 4: Value 16)

  • reflected octets (Mode Bit 5: Value 32)

  • symmetrical size test packet (Mode Bit 6: Value 64)
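Because the greeting Modes field is a bit mask, a Control-Client can check support with simple bitwise operations. A hedged sketch of that check (names are illustrative):

```python
# Each bit in the server greeting Modes mask advertises a capability.
INDIVIDUAL_SESSION_CONTROL = 1 << 4   # value 16
REFLECT_OCTETS             = 1 << 5   # value 32
SYMMETRICAL_TEST_PACKET    = 1 << 6   # value 64

def server_supports(greeting_modes, requested_modes):
    """True if every mode bit the Control-Client requests is advertised."""
    return (requested_modes & ~greeting_modes) == 0
```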

TWAMP Light and STAMP

Overview

Note: For consistency within the SR OS, the twamp-light command name is used to send the IP PM packet regardless of the actual test PDU format, TWAMP Light or STAMP. The documentation follows the same nomenclature.

TWAMP Light was introduced as part of RFC 5357, A Two-Way Active Measurement Protocol (TWAMP), Appendix I (Informational). The RFC appendix defined a single-ended test without the requirement to use the TCP control channel over which the Control-Client and server negotiate test parameters. Using this approach, configuration on both entities, the Session-Sender and the Session-Reflector, replaces the control channel and provides the application-specific handling information required to launch and reflect test packets. In other words, TWAMP Light uses the TWAMP test packet for gathering IP performance information, but eliminates the need for the TWAMP TCP control channel. However, not all negotiated control parameters are replaced with local configuration. For example, QoS parameters communicated over the TWAMP control channel are replaced with a reply-in-kind approach for TWAMP Light. The reply-in-kind model reflects back the received QoS parameters, which can be influenced by the QoS policies of the Session-Reflector.

This informational work formed the baseline for the standardization of TWAMP Light as RFC 8762, Simple Two-Way Active Measurement Protocol (STAMP). The STAMP standard defined by RFC 8762 is backward compatible with the behavior described in the RFC 5357 appendix. As the STAMP work in the IETF continues to evolve, backward compatibility has remained largely unchanged. However, RFC 8972, Simple Two-Way Active Measurement Protocol Optional Extensions, introduces advanced capabilities to the base version of the protocol, which creates interoperability considerations.

The handling of some functions that were open to interpretation in the original TWAMP Light appendix is formalized in RFC 8972. To properly handle these mismatches, the SR OS provides commands that allow the user to define the wanted behavior for the Session-Sender (under the configure oam-pm context) and the Session-Reflector. Validation of configuration options is performed, and inconsistent configuration based on protocol handling is blocked.

Use the following command to configure the test packet type for the Session-Sender.

configure oam-pm session ip twamp-light session-sender-type

Use the following command to configure the test packet processing behavior for the Session-Reflector.

configure router twamp-light reflector type

The Session-Sender under the OAM-PM context uses the twamp-light and stamp options to determine the packet structure to transmit. For backward compatibility, the default packet format transmitted is twamp-light. When the twamp-light option is selected, any TLVs specific to STAMP test packets cannot be configured, including the pad-tlv-size. A Session-Sender should use the default pattern of 0 when communicating with a Session-Reflector acting in STAMP mode, to avoid conflicts when identifying the type of test packet arriving on the reflector. When the stamp option is selected, configuration options specific to TWAMP Light test packets cannot be configured, including the pad-size and a non-zero pattern.

The Session-Reflector uses the twamp-light and stamp options to determine its processing behavior for the test packet received from the Session-Sender. For backward compatibility, the default processing behavior is twamp-light. This type performs no TLV processing, treating any non-base packet octets as padding. If the required behavior for the Session-Reflector is to parse and process STAMP TLVs, the stamp command option must be used. Using this configuration, the Session-Reflector can accommodate both TWAMP Light and STAMP Session-Senders, processing the packet based on the TLV rules as defined in RFC 8972. Any Session-Sender that is transmitting TWAMP-Light-formatted test packets with additional padding must use an all-zero pattern to avoid ambiguity on the Session-Reflector.

The following describes PDU padding and the identification of STAMP TLVs.

  • TWAMP Light

The TWAMP Light test packet request and response sizes are asymmetrical by default; the Session-Sender and Session-Reflector packets are different sizes. To allow for symmetrical packets on the wire and packet size manipulation, the Session-Sender can configure the pad-size octets command to increase the size of the packet. These octets are added directly to the base packet. The default padding pattern is all zeros, which can be changed using the pattern command.

  • STAMP

    The STAMP packet request and response sizes are symmetrical by default. RFC 8762 defines a structured packet that ensures this behavior. To allow for general packet size manipulation, the STAMP Optional Extensions RFC 8972 defines a PAD TLV. This TLV is added after the base packet. STAMP padding uses the pad-tlv-size octets command to increase the size of the packet. An all-zero PAD pattern must be used for the PAD TLV.

SR OS was an early adopter of IP Performance Measurement (IP PM) under the OAM Performance Monitoring (OAM-PM) architecture, implementing the TWAMP Light behavior of RFC 5357 Appendix I. Subsequent features introduced to the SR OS have adopted the standardized versions, STAMP and the STAMP optional extensions.

In Link Measurement, the Session-Sender uses STAMP formatted packets.

In OAM-PM, the Session-Sender allows for a choice of TWAMP Light- or STAMP-formatted packets.

TWAMP Light Session-Reflector

The Session-Reflector receives and processes TWAMP Light test packets.

Use the following context to configure the Session-Reflector functions for base router reflection.

configure router twamp-light

Use the following context to configure Session-Reflector functions for per VPRN reflection.

configure service vprn twamp-light 

The TWAMP Light Session-Reflector function is configured per context and must be activated before reflection can occur; the function is not enabled by default for any context. The Session-Reflector requires the user to define the TWAMP Light UDP listening port that identifies the TWAMP Light protocol. All the prefixes that the reflector accepts as valid sources for a TWAMP Light request must also be configured. If the source IP address in the TWAMP Light test packet arriving on the server does not match the configured prefixes, the packet is dropped. Multiple prefix entries may be configured per context on the server. Configured prefixes can be modified without shutting down the reflector function.

Note:

The TWAMP Light Session-Reflector udp-port udp-port-number configured with the config>service>twamp-light and config>router>twamp-light create commands must fall within a restricted, reserved UDP port range: 862, or 64364 to 64373. Configurations outside this range cause the TWAMP Light Session-Reflector to fail or prevent an upgrade operation. If an In-Service Software Upgrade (ISSU) is invoked while the udp-port udp-port-number is outside the allowable range and the TWAMP Light Session-Reflector is in a no shutdown state, the ISSU operation cannot proceed. The user must, at a minimum, disable the TWAMP Light Session-Reflector to allow the ISSU to proceed; the Session-Reflector cannot be re-enabled until the port is within the allowable range. A non-ISSU upgrade can proceed regardless of the state (enabled or disabled) of the TWAMP Light Session-Reflector. The configuration can load; however, the TWAMP Light Session-Reflector remains inactive following the reload while the port is outside the allowable range. When the udp-port udp-port-number for a TWAMP Light Session-Reflector is modified, all tests using the services of that reflector must update their dest-udp-port udp-port-number configuration to match the new reflector listening port.
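The allowable listening-port values described in the note can be expressed as a simple membership check. This is an illustrative sketch, not an SR OS validation routine:

```python
# Reserved listening-port values from the note above: 862, or 64364-64373.
ALLOWED_REFLECTOR_PORTS = {862} | set(range(64364, 64374))

def reflector_port_valid(udp_port):
    """True if the configured reflector UDP port is in the reserved range."""
    return udp_port in ALLOWED_REFLECTOR_PORTS
```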

The TWAMP-Light Session-Reflector is stateful and supports unidirectional synthetic loss detection. An inactivity timeout under the config>oam-test>twamp>twamp-light command hierarchy defines the amount of time the TWAMP-Light Session-Reflector maintains an individual test session in the absence of arriving test packets.

The TWAMP-Light Session-Reflector responds using the timestamp format indicated in the test packet from the Session-Sender. The Error Estimate field is a two-byte field that includes an optional Z bit indicating the timestamp format. The TWAMP-Light Session-Reflector checks this field and replies using the same format for timestamp two (T2) and timestamp three (T3). The TWAMP-Light Session-Reflector does not interrogate or change the other bits in the Error Estimate field. Except for the processing of the Z bit, the received Error Estimate is reflected back to the Session-Sender.
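Assuming the standard Error Estimate layout (S bit, then Z bit, then six scale bits in the first octet, multiplier in the second), the reflector's format decision can be sketched as follows (function name is illustrative):

```python
def reflected_timestamp_format(error_estimate):
    """error_estimate: the 2-byte field as an integer.  Bit 15 is S, bit 14
    is Z.  Z = 1 means the PTP truncated format, Z = 0 means the NTP format;
    the reflector answers T2/T3 in the same format and echoes the rest."""
    z_bit = (error_estimate >> 14) & 1
    return "PTP" if z_bit else "NTP"
```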

Configurations that require an IPv6 UDP checksum of zero are increasingly common. In some cases, hardware timestamping writes into fields covered by the UDP checksum after the checksum has been computed, so the checksum is set to zero. Typically, packets that arrive with an IPv6 UDP checksum of zero are discarded. However, the optional allow-ipv6-udp-checksum-zero command allows those packets to be accepted and processed for the configured UDP port of the TWAMP Light Session-Reflector.

Multiple test sessions between peers are allowed. These test sessions are unique entities and may have different properties. Each test generates TWAMP test packets specific to its configuration. The TWAMP Light Session-Reflector includes the SSID defined by RFC 8972 as a fifth element augmenting the source IP, destination IP, source UDP port, and destination UDP port when maintaining the test state.
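A hypothetical sketch of that stateful session table: the RFC 8972 SSID joins the classic 4-tuple as a fifth key element, so two tests sharing addresses and ports remain distinct sessions (addresses and names below are illustrative):

```python
# Reflector session state keyed by 4-tuple plus SSID (illustrative sketch).
def session_key(src_ip, dst_ip, src_udp, dst_udp, ssid):
    return (src_ip, dst_ip, src_udp, dst_udp, ssid)

sessions = {}
# Same addresses and ports, different SSIDs: two independent sessions.
sessions[session_key("10.0.0.1", "10.0.0.2", 50000, 862, 1)] = "test A"
sessions[session_key("10.0.0.1", "10.0.0.2", 50000, 862, 2)] = "test B"
```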

As TWAMP Light evolved, the TWAMP Light Session-Reflector required a method to determine how to process arriving Session-Sender packets. The default processing behavior, type twamp-light, treats all additional bytes beyond the base TWAMP Light packet as padding. The type stamp setting attempts to locate and process the STAMP TLVs defined by RFC 8972.

See Link measurement and OAM performance monitoring for more information about the integration of TWAMP Light in those applications.

MPLS PM

TWAMP Light delay and loss for MPLS tunnels

The SR OS supports using TWAMP Light to measure the base router MPLS tunnel types in the following context.

configure oam-pm session ip tunnel mpls

See TWAMP Light and STAMP for more information about TWAMP Light.

When a TWAMP Light test configuration points to an MPLS tunnel, the complete test PDU is encapsulated in the MPLS tunnel and carried to the termination point of the tunnel based on MPLS forwarding rules. The session command must be configured to use the ip test family to allow for this mapping. The test family describes the underlying protocol used for testing, not the transport.

The following basic TWAMP Light IP configuration rules apply to the Session-Sender:
  • The source IP address must be part of the base router route table on the Session-Sender.

  • The destination must be an IP address in the base router on the terminating node of the MPLS tunnel where the Session-Reflector is configured.

  • The destination UDP port must be the listening UDP port of the Session-Reflector terminating the MPLS tunnel.

  • A Session-Reflector is required on the terminating node of the MPLS tunnel configured in the base router.

Use the commands in the following context to direct test packets to an MPLS transport.

configure oam-pm session ip tunnel

Users can specify the MPLS tunnel type to be used and the specific tunnel to carry the test packets. Entering a specific MPLS tunnel type sets it as the active configuration and deletes any configurations of different types under this context.

In the forward direction, test packets are encapsulated in the MPLS tunnel that matches the configuration. Because MPLS paths are unidirectional, the Session-Reflector performs an IP lookup in the base router route table and returns the packet using IP routing. This means IP reachability is required between the Session-Reflector and Session-Sender. Because the measurement is unidirectional delay, clock synchronization is required using PTP or an equivalently accurate time-distribution method. NTP does not have the accuracy to reliably produce unidirectional measurements.
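The need for clock synchronization can be illustrated numerically: a reflector clock offset shifts the two one-way values in opposite directions while leaving the round trip unchanged. An illustrative sketch (seconds as floats, names hypothetical):

```python
def directional_delays(t1, t2, t3, t4, reflector_offset=0.0):
    """t1/t4 are read from the sender clock, t2/t3 from the reflector clock.
    reflector_offset models the reflector clock running ahead of the sender;
    it skews the one-way results but cancels out of the round trip."""
    t2_skewed = t2 + reflector_offset
    t3_skewed = t3 + reflector_offset
    forward = t2_skewed - t1       # valid only with synchronized clocks
    backward = t4 - t3_skewed      # valid only with synchronized clocks
    round_trip = forward + backward
    return forward, backward, round_trip
```

With a 5 ms offset, a true 10 ms forward delay reads as 15 ms and the backward delay shrinks by the same amount, which is why PTP-grade synchronization is needed for unidirectional measurement.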

IP MPLS tunnel configurations and IP forwarding configurations are mutually exclusive under the following context:

configure oam-pm session ip forwarding

Either type of test PDU (twamp-light or stamp) can be used for MPLS tunnel PM testing. The OAM-PM infrastructure, binning, delay streaming, threshold alarms, delay, and loss statistics are available for this testing.

Test statistics on the Session-Sender should be disregarded when an MPLS tunnel carrying the test packets recomputes on the Session-Sender node. Statistics during the MPLS tunnel change are not representative of the time to converge. Test convergence adds additional seconds on top of the tunnel recovery. This testing methodology is used to measure steady-state MPLS tunnel performance, not convergence, at the head of the tunnel.

RFC 6374 delay for MPLS tunnels

RFC 6374, Packet Loss and Delay Measurement for MPLS Networks, provides a standard packet format and process for measuring delay of a unidirectional or bidirectional label switched path (LSP) using the General Associated Channel (G-ACh), channel type 0x000C. Unidirectional LSPs, such as RSVP-TE, require an additional TLV to return a response to the querier (the launch point). RFC 7876, UDP Return Path for Packet Loss and Delay Measurement for MPLS Networks, defines the source IP information to include in the UDP Path Return TLV so the responding node can reach the querier over an IP network; the MPLS DM PDU does not natively include any IP source information. With MPLS-TP, the TLV defined in RFC 7876 is not required.

The function of MPLS delay measurement is similar regardless of LSP type. The querier sends the MPLS DM query message toward the responder, transported in an MPLS LSP. The responder extracts the required PDU information and responds appropriately.

Launching MPLS DM tests is configured in the config>oam-pm>session session-name test-family mpls context. The basic architectural OAM-PM components must be completed along with the MPLS-specific configuration. The test PDU includes the following settings:

For the base PDU:

  • Channel Type: 0x000C (MPLS DM)

  • Flags: Query

  • Control Code: out-of-band (unidirectional LSP) and in-band (bidirectional LSP)

  • Querier Timestamp Format: IEEE 1588-2008 (1588v2) Precision Time Protocol truncated timestamp format

  • Session Identifier: The configured oam-pm>session>mpls>dm test-id test-id

  • DSCP: The configured oam-pm>session>mpls>dscp dscp-name (this value is not used to convey or influence the CoS setting in the MPLS TC field; the profile {in | out} and fc fc-name commands must be used to influence CoS markings)

  • Timestamp 1: Set to the local transmit time in PTP format

  • Timestamp 2 and 3: set to 0

TLVs can also be included, based on the configuration.

  • Padding (Type 0)

    Copy in Response: Padding is returned in the response when the oam-pm>session>mpls>dm>reflect-pad is configured.

  • Padding (Type 128)

    Do not copy in Response: Padding is not returned in the response (the typical configuration with unidirectional LSPs) when the oam-pm>session>mpls>dm>pad-tlv-size is configured without the reflect-pad command.

  • UDP Return (Type 131)

    UDP Return object: The IP information used by the reflector to reach the querier for an out-of-band response, when the oam-pm>session>mpls>lsp is rsvp or rsvp-auto and the udp-return-object information is configured.

The maximum pad size of 257 is a result of the structure of the defined TLV. The length field is one byte, limiting the overall value to 255 bytes.
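The 257-byte limit is simple arithmetic over the RFC 6374 TLV header (1-byte Type plus 1-byte Length), sketched below with illustrative names:

```python
TLV_HEADER = 2        # Type (1 byte) + Length (1 byte)
MAX_TLV_VALUE = 0xFF  # a one-byte Length field caps the Value at 255 bytes

def pad_tlv_total(pad_bytes):
    """Total on-the-wire size of an RFC 6374 padding TLV."""
    if pad_bytes > MAX_TLV_VALUE:
        raise ValueError("a one-byte length field caps the value at 255")
    return TLV_HEADER + pad_bytes
```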

The reflector processes the inbound MPLS DM PDU and responds to the querier based on the received information, using the response flag setting. Specific to timestamps, the responder honors the Querier Timestamp Format, filling in the Timestamp 2 and Timestamp 3 values.

When the response arrives back at the querier, the delay metrics are computed. The common OAM-PM computation model and naming are used to simplify and rationalize the different technologies that leverage the OAM-PM infrastructure. The common methodology reports unidirectional and round-trip delay metrics for Frame Delay (FD), InterFrame Delay Variation (IFDV), and Frame Delay Range (FDR). The term "frame" is not indicative of the underlying technology being measured; it is a normalized cross-technology name that maps to the appropriate term for each measured technology. This common normalized naming requires a mapping to the delay measurements defined in RFC 6374.

Table 3. Normalized naming mapping

Description | RFC 6374 | OAM-PM
A to B Delay | Forward | Forward
B to A Delay | Reverse | Backward
Two Way Delay (regardless of processing delays within the remote endpoint B) | Channel | Round-Trip
Two Way Delay (includes processing delay at the remote endpoint B) | Round-Trip |

Because OAM-PM uses a common reporting model, the unidirectional (forward and backward) and round-trip metrics are always reported. With unidirectional measurements, the T3 and T4 timestamps are zeroed; the backward and round-trip values are still reported but are not significant.
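One common interpretation of the three delay metrics over a window of samples can be sketched as follows (an illustrative computation, not the SR OS binning implementation):

```python
def delay_stats(delays):
    """delays: per-packet delay samples (seconds) for one measurement window.
    FD   - average frame delay over the window
    FDR  - each sample minus the window minimum (range above the floor)
    IFDV - absolute difference between consecutive samples
    Returns (fd, max_fdr, max_ifdv)."""
    fd = sum(delays) / len(delays)
    floor = min(delays)
    max_fdr = max(d - floor for d in delays)
    max_ifdv = max(abs(b - a) for a, b in zip(delays, delays[1:]))
    return fd, max_fdr, max_ifdv
```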

An MPLS DM test can measure the endpoints of the LSP when the TTL is set equal to or higher than the termination point. Midpoints along the path that support the MPLS DM response function can be targeted by setting a TTL that expires along the path. MPLS DM launch and reflection, including on mid-path transit nodes, is disabled by default. To launch and reflect MPLS DM test packets, config>test-oam>mpls-dm must be enabled.

The SR OS implementation supports the following RFC 6374 MPLS DM (Channel Type 0x000C) functions:

  • Label Switched Path types

    • RSVP-TE and RSVP-TE Auto LSPs: sets out-of-band response request and requires the configuration of the UDP-Return-Object

    • MPLS-TP: sets in-band response request

  • Querier and Responder

  • Traffic class indicator set to on (1)

  • Querier Timestamp format PTP

  • Mandatory TLV 0 – Copy padding

  • Optional TLV 128 – Do not copy padding

  • Optional TLV 131 – UDP-Return-Object (RFC 7876)

The following functions are not supported:

  • Packet Loss measurement

  • Throughput management

  • Dyadic measurements

  • Loopback

  • Mandatory TLVs

    • TLV 1 – Return Address

    • TLV 2 – Session Query Interval

    • TLV 3 – Loopback Request

  • Optional TLVs

    • TLV 129 – Destination Address

    • TLV 130 – Source Address

ETH-CFM

The IEEE and the ITU-T have cooperated to define the protocols, procedures, and managed objects to support service-based fault management. Both the IEEE 802.1ag standard and the ITU-T Y.1731 recommendation support a common set of tools that allow operators to deploy the necessary administrative constructs, management entities, and functionality: Ethernet Connectivity Fault Management (ETH-CFM). The ITU-T has also defined a set of advanced ETH-CFM and performance management functions and features that build on the proactive and on-demand troubleshooting tools.

CFM uses Ethernet frames and is distinguishable by Ethertype 0x8902. In specific cases, the different functions use a reserved multicast Layer 2 MAC address that can also be used to identify specific functions at the MAC layer. The multicast MAC addressing is not used for every function or in every case. The Operational Code (OpCode) in the common CFM header identifies the PDU type carried in the CFM packet. CFM frames are only processed by IEEE MAC bridges.

IEEE 802.1ag and ITU-T Y.1731 functions that are implemented are available on the SR and ESS platforms.

This section of the guide provides configuration examples for each of the functions. It also provides the various OAM command line options and show commands to operate the network. The individual service guides provide the complete CLI configuration and descriptions of the commands to build the necessary constructs and management points.

ETH-CFM acronym expansions lists and expands the acronyms used in this section.

Table 4. ETH-CFM acronym expansions

Acronym | Expansion | Supported platform
1DM | One way Delay Measurement (Y.1731) | All
AIS | Alarm Indication Signal | All
BNM | Bandwidth Notification Message (Y.1731 sub OpCode of GNM) | All
CCM | Continuity check message | All
CFM | Connectivity fault management | All
CSF | Client Signal Fail (Receive) | All
DMM | Delay Measurement Message (Y.1731) | All
DMR | Delay Measurement Reply (Y.1731) | All
ED | Ethernet Defect (Y.1731 sub OpCode of MCC) | All
GNM | Generic Notification Message | All
LBM | Loopback message | All
LBR | Loopback reply | All
LMM | (Frame) Loss Measurement Message | Platform specific
LMR | (Frame) Loss Measurement Response | Platform specific
LTM | Linktrace message | All
LTR | Linktrace reply | All
MCC | Maintenance Communication Channel (Y.1731) | All
ME | Maintenance entity | All
MA | Maintenance association | All
MD | Maintenance domain | All
MEP | Maintenance association endpoint | All
MEP-ID | Maintenance association endpoint identifier | All
MHF | MIP half function | All
MIP | Maintenance domain intermediate point | All
OpCode | Operational Code | All
RDI | Remote Defect Indication | All
TST | Ethernet Test (Y.1731) | All
SLM | Synthetic Loss Message | All
SLR | Synthetic Loss Reply (Y.1731) | All
VSM | Vendor Specific Message (Y.1731) | All
VSR | Vendor Specific Reply (Y.1731) | All

ETH-CFM building blocks

The IEEE and the ITU-T use their own nomenclature when describing administrative contexts and functions. This introduces a level of complexity to configuration, discussion, and comparison among different vendors' naming conventions. The SR OS CLI standardizes on the IEEE 802.1ag naming where overlap exists; ITU-T naming is used when no equivalent is available in the IEEE standard. In the following definitions, both the IEEE and ITU-T names are provided for completeness, using the format IEEE Name/ITU-T Name.

Maintenance Domain (MD)/Maintenance Entity (ME) is the administrative container that defines the scope, reach, and boundary for testing and faults. It is typically the area of ownership and management responsibility. The IEEE allows various formats for naming the domain, up to 45 characters depending on the format selected. The ITU-T supports only a format of none and does not accept the IEEE naming conventions. The following domain name format values are defined:

  • 0 is undefined and reserved by the IEEE.

  • 1 indicates no domain name.

  • 2, 3, and 4 provide the ability to input various textual formats, up to 45 characters. The string format (2) is the default; therefore, the keyword is not shown when looking at the configuration.

Maintenance Association (MA)/Maintenance Entity Group (MEG) is the construct that contains the different management entities. Each MA is uniquely identified by its MA-ID. The MA-ID comprises the MD level, the MA name, and its associated format. This is another administrative context where the linkage is made between the domain and the service using the bridging-identifier configuration option. The IEEE and the ITU-T use their own specific formats. The MA short name formats (0 to 255) are divided between the IEEE (0 to 31, 64 to 255) and the ITU-T (32 to 63), with five currently defined (1 to 4, and 32). Even though the different standards bodies do not have specific support for the other formats, a Y.1731 context can be configured using the IEEE format options.

The following formats are supported:

  • 1 (Primary VID): values 0 to 4094

  • 2 (String): raw ASCII, excluding control characters 0 to 31 decimal (0x00 to 0x1F) from the ASCII table

  • 3 (2-octet integer): values 0 to 65535

  • 4 (VPN ID): hex value as described in RFC 2685, Virtual Private Networks Identifier

  • 32 (icc-format): exactly 13 characters from ITU-T Recommendation T.50
Note: When a VID is used as the short MA name, 802.1ag does not support VLAN translation because the MA-ID must match on all the MEPs. The default format for a short MA name is an integer. Integer value 0 means the MA is not attached to a VID. This is useful for VPLS services on SR OS platforms because the VID is locally significant.
Note: The double quote character (") included in ITU-T Recommendation T.50 is not a supported character on the SR OS.

Maintenance Domain Level (MD Level)/Maintenance Entity Group Level (MEG Level) is the numerical value (0 to 7) representing the width of the domain. The wider the domain (the higher the numerical value), the farther the ETH-CFM packets can travel. It is important to understand that the level establishes the processing boundary for the packets. Strict rules control the flow of ETH-CFM packets and are used to ensure correct handling, forwarding, processing, and dropping of these packets. ETH-CFM packets with higher numerical level values flow through MEPs and MIPs configured with lower level values. This allows the operator to implement different areas of responsibility and nest domains within each other. A maintenance association (MA) includes a set of MEPs, each configured with the same MA-ID and MD level, used to verify the integrity of a single service instance.
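The level-boundary rule can be sketched as a small decision function (illustrative, not an SR OS algorithm): a MEP processes packets at its own level, discards lower-level packets arriving on its active side, and transparently forwards higher-level packets.

```python
def mep_action(packet_level, mep_level):
    """What a MEP's active side does with an arriving ETH-CFM packet (0-7)."""
    if packet_level == mep_level:
        return "process"   # same maintenance level: terminate and handle
    if packet_level < mep_level:
        return "discard"   # leaked from a narrower (lower-level) domain
    return "forward"       # wider (higher-level) domain: pass through
```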

Note: The domain format and its requirements, the association format and its requirements, and the level must all match on peer MEPs.

Maintenance Endpoints/MEG Endpoints (MEPs) are the workhorses of ETH-CFM. A MEP is uniquely identified within the association (1 to 8191); each MEP is identified by the MA-ID, MEP-ID tuple. This management entity is responsible for initiating, processing, and terminating ETH-CFM functions, following the nesting rules. MEPs form the boundaries that prevent ETH-CFM packets from flowing beyond their specific scope of responsibility. A MEP has a direction, up or down, which indicates the direction in which packets are generated: up toward the switch fabric, down toward the SAP, away from the fabric. Each MEP has an active and a passive side. Packets that enter the active side of the MEP are compared to the existing level and processed accordingly. Packets that enter the passive side of the MEP are passed transparently through the MEP. Each MEP contained within the same maintenance association and with the same level (MA-ID) represents a point within a single service. MEP creation on a SAP is allowed only for Ethernet ports with null, q-tag, or q-in-q encapsulations. MEPs may also be created on SDP bindings. A vMEP is a service-level MEP configuration that installs ingress (Down MEP-like) extraction on the supported ETH-CFM termination points within a VPLS configuration.

Maintenance Intermediate Points/MEG Intermediate Points (MIPs) are management entities between the terminating MEPs along the service path. MIPs provide insight into the service path connecting the MEPs. MIPs only respond to Loopback Messages (LBM) and Linktrace Messages (LTM). All other CFM functions are transparent to these entities.

MIP creation is the result of the mhf-creation mode and its interaction with related MEPs, including the direction of the MEP. Two different authorities can determine the MIPs that should be considered and instantiated: the domain and association hierarchy, or the default-domain hierarchy. Both match the configured bridge identifier and VLAN to the service ID and any configured primary VLAN. When a primary VLAN MIP is not configured, the VLAN is either ignored or configured as none.

The domain and association MIP creation function triggers a search for all ETH-CFM domain association bridge identifiers that match the service to which it is linked. A MIP candidate is then evaluated using the mhf-creation mode and the rules that govern the algorithm. The domain association mhf-creation modes and their uses are as follows:

  • none

    A MIP is not a candidate for creation using this domain association bridge identifier. This is the default mhf-creation mode for every bridge identifier under this hierarchy.

  • explicit

    A MIP is a candidate for creation using this domain association bridge identifier only if a lower-level MEP exists.

  • default

    A MIP is a candidate for creation using this domain association bridge identifier regardless of the existence of a lower-level MEP. If a lower-level MEP is present, this creation mode behaves in the same manner as explicit creation mode.

  • static

    A MIP is a candidate for creation using the domain association bridge identifier at the level of the domain. This creation mode is specific to MIPs with the primary-vlan-enabled parameter configured. Different VLANs maintain their own level hierarchies. Primary VLAN creation under this context requires static mode.

For all modes except static mode, only a single MIP can be created. All candidates are collected and the lowest-level valid MIP is created. In static mode, all valid MIPs are created for the bridge identifier VLAN pair. A MIP is considered invalid if the level of the MIP is equal to or below a downward-facing MEP, or below the level of an upward-facing MEP and the MIP shares the same service component as the Up MEP.
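A simplified sketch of the non-static selection logic described above (names and structure are illustrative; Up MEP interactions are omitted): candidates at or below a Down MEP's level are invalid, and only the lowest-level survivor is created.

```python
def select_mip(candidate_levels, down_mep_level=None):
    """candidate_levels: MD levels of matching domain association bridge
    identifiers.  Candidates at or below a Down MEP's level are invalid;
    of the survivors, only the single lowest-level MIP is created."""
    valid = [lvl for lvl in candidate_levels
             if down_mep_level is None or lvl > down_mep_level]
    return min(valid) if valid else None
```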

Not all creation modes require the mip creation statement within the service. The explicit and default mhf-creation modes may instantiate a MIP without the mip creation statement under the service if a lower-level MEP exists for the domain association bridge identifier. If a lower-level MEP does not exist, the default and static mhf-creation modes require the mip creation statement on the service connection.

MEPs require the domain and association configurations to ensure that all ETH-CFM PDUs can be supported. MIPs have restricted ETH-CFM PDU support: ETH-LB and ETH-LT. These two protocols do not require the configuration of a domain and association. MIPs may be created outside of the association context using the default-domain table.

The default-domain table is an object table populated with values that are used for MIP creation. The table is indexed by the bridge identifier and VLAN. An index entry is automatically added when the mip creation statement is added under a SAP or SDP binding. When an index entry is added, the bridge identifier is set to the service ID and the VLAN is set to the primary-vlan-enable vlan-id. If the MIP does not use primary VLAN functionality, the VLAN is configured as none. When the entry has been added to the default-domain table, the default values can be configured. Values not explicitly configured in the default-domain table defer to the system-wide, read-only values.

Because there are two different locations able to process the MIP creation logic, a per-bridge identifier VLAN authority must be determined. The authority is a component, table, or configuration that is responsible for executing the MIP creation algorithm. In general, any domain association bridge identifier that could be used to create a specific MIP is authoritative. Other configurations influence the authority, such as the type of MIP (primary VLAN or non-primary VLAN), the different mhf-creation modes, the interaction of those modes with MEPs, and the direction of the MEP.

The following rules provide some high-level guidelines to determine the authority:

  • rule 1

    The original model predating the default-domain is always applied first. If a MIP is created using the original model, the new model is not applied. The original model includes complex Up MEP MIP creation rules. If an Up MEP exists on a service connection, any service connection other than the one with the active Up MEP attempts to create the lowest higher-level MIP using the domain association bridge identifier table. If a higher-level MIP cannot be created, and no higher-level association exists, the default-domain table is consulted.

  • rule 2

    A mip creation statement is required under the service connection to use the default-domain table. This is different from the domain association table. The domain association table does not require the mip creation statement when the mhf-creation mode is configured as either explicit or default and a lower-level MEP is present.

  • rule 3

    If no domain association bridge identifier matches the service ID, the default-domain table is consulted.

  • rule 4

    If a domain association bridge identifier matches a service ID for the sole purpose of MEP creation, and no higher or lower domain association with the same bridge identifier exists, the default-domain table is consulted.

    • rule 4a

      Any domain association bridge identifier matching a service ID with a configured VLAN and a static mhf-creation mode is authoritative for all matching service IDs and MIPs with primary-vlan-enable configured with the same VLAN.

    • rule 4b

      Any domain association bridge identifier attempting to create a MIP with primary-vlan-enable configured is considered non-authoritative if the mhf-creation mode is anything other than static.

When the authority for MIP creation is determined, the MIP attributes are derived from that creation table. The default domain table defers to the read-only, system-wide MIP values and inherits those defaults. Some of the objects under the default-domain hierarchy must be configured using the same statement to avoid transient and unexpected MIP creation while the configuration is being completed. To this end, the mhf-creation mode and level have been combined in the same configuration statement.

The standard mhf-creation modes (none, default, explicit) are configurable as part of the default-domain table. Static mode can only be configured under the domain association bridge identifier. This is because default domain table indexing precludes multiple MIPs at different levels.

MIP creation requires configuration. The default values in both the domain association and the default domain table prevent MIP instantiation.

The show eth-cfm mip-instantiation command can be used to check the authority for each MIP.

There are two locations in the configuration where ETH-CFM is defined. In the first location, under the top-level eth-cfm command, the domains, associations (including links to the service), MIP creation method, common ETH-CFM functions, and remote MEPs are defined. The second location is within the service or facility.

The ETH-CFM support matrix is a general table that indicates ETH-CFM support for the different services and SAP or SDP bindings. It is not meant to indicate which services are supported or the requirements for those services on the individual platforms.

Table 5. ETH-CFM support matrix
Service     Ethernet connection        Down MEP   Up MEP   MIP   Virtual MEP
Epipe       SAP                        Yes        Yes      Yes   No
            Spoke-SDP                  Yes        Yes      Yes
            PW-SAP                     No         No       Yes
VPLS        SAP                        Yes        Yes      Yes   Yes
            Spoke-SDP                  Yes        Yes      Yes
            Mesh-SDP                   Yes        Yes      Yes
B-VPLS      SAP                        Yes        Yes      Yes   Yes
            Spoke-SDP                  Yes        Yes      Yes
            Mesh-SDP                   Yes        Yes      Yes
I-VPLS      SAP                        Yes        Yes      Yes   No
            Spoke-SDP                  Yes        Yes      Yes
M-VPLS      SAP                        Yes        Yes      Yes   No
            Spoke-SDP                  Yes        Yes      Yes
            Mesh-SDP                   Yes        Yes      Yes
PBB Epipe   SAP                        Yes        Yes      Yes   No
            Spoke-SDP                  Yes        Yes      Yes
Ipipe       SAP                        Yes        No       No    No
            Ethernet-Tunnel SAP        Yes        No       No
IES         SAP                        Yes        No       No    No
            Spoke-SDP (Interface)      Yes        No       No
            Subscriber Group-int SAP   Yes        No       No
VPRN        SAP                        Yes        No       No    No
            Spoke-SDP (Interface)      Yes        No       No
            Subscriber Group-int SAP   Yes        No       No

Figure 21. MEP and MIP

MEP creation illustrates the use of an Epipe on two different nodes that are connected using Ethernet SAP 1/1/2:100.31. SAP 1/1/10:100.31 is an access port that is not used to connect the two nodes.

Figure 22. MEP creation
NODE1
config>eth-cfm# info
----------------------------------------------
        domain 3 format none level 3
            association 1 format icc-based name "03-0000000101"
                bridge-identifier 100
                exit
            exit
        exit
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                exit
            exit
        exit

config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.31 create
                eth-cfm
                    mep 111 domain 3 association 1 direction down
                        mac-address d0:0d:1e:00:01:11
                        no shutdown
                    exit
                exit
            exit
            sap 1/1/10:100.31 create
                eth-cfm
                    mep 101 domain 4 association 1 direction up
                        mac-address d0:0d:1e:00:01:01
                        no shutdown
                    exit
                exit
            exit
            no shutdown
----------------------------------------------

NODE 2
eth-cfm# info
----------------------------------------------
        domain 3 format none level 3
            association 1 format icc-based name "03-0000000101"
                bridge-identifier 100
                exit
            exit
        exit
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                exit
            exit
        exit
----------------------------------------------
config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.31 create
                eth-cfm
                    mep 112 domain 3 association 1 direction down
                        mac-address d0:0d:1e:00:01:12
                        no shutdown
                    exit
                exit
            exit
            sap 1/1/10:100.31 create
                eth-cfm
                    mep 102 domain 4 association 1 direction up
                        mac-address d0:0d:1e:00:01:02
                        no shutdown
                    exit
                exit
            exit
            no shutdown
----------------------------------------------

Examining the configuration from NODE1, MEP 101 is configured with a direction of up, causing all ETH-CFM traffic originating from this MEP to be sent into the switch fabric and out the mate SAP, 1/1/2:100.31. MEP 111 uses the default direction of down, causing all ETH-CFM traffic generated from this MEP to be sent away from the fabric, egressing only the SAP on which it is configured, SAP 1/1/2:100.31.

Further examination of the domain constructs reveals that the configuration properly follows the domain nesting rules: in this case, the level 3 domain is completely contained within a level 4 domain.

MIP creation example (NODE1) illustrates the creation of an explicit MIP using the association MIP construct.

Figure 23. MIP creation example (NODE1)
NODE1
config>eth-cfm# info
----------------------------------------------
        domain 3 format none level 3
            association 1 format icc-based name "03-0000000101"
                bridge-identifier 100
                exit
            exit
        exit
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                exit
            exit
            association 2 format icc-based name "04-MIP0000102"
                bridge-identifier 100
                    mhf-creation explicit
                exit
            exit
        exit

config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.31 create
                eth-cfm
                    mep 111 domain 3 association 1 direction down
                        mac-address d0:0d:1e:00:01:11
                        no shutdown
                    exit
                exit
            exit
            sap 1/1/10:100.31 create
                eth-cfm
                    mep 101 domain 4 association 1 direction up
                        mac-address d0:0d:1e:00:01:01
                        no shutdown
                    exit
                exit
            exit
            no shutdown
----------------------------------------------

NODE 2
eth-cfm# info
----------------------------------------------
        domain 3 format none level 3
            association 1 format icc-based name "03-0000000101"
                bridge-identifier 100
                exit
            exit
        exit
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                exit
            exit
            association 2 format icc-based name "04-MIP0000102"
                bridge-identifier 100
                    mhf-creation explicit
                exit
            exit
        exit
----------------------------------------------

config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.31 create
                eth-cfm
                    mep 112 domain 3 association 1 direction down
                        mac-address d0:0d:1e:00:01:12
                        no shutdown
                    exit
                exit
            exit
            sap 1/1/10:100.31 create
                eth-cfm
                    mep 102 domain 4 association 1 direction up
                        mac-address d0:0d:1e:00:01:02
                        no shutdown
                    exit
                exit
            exit
            no shutdown
----------------------------------------------

The addition of association 2 under domain 4 includes the mhf-creation explicit statement. This means that when the level 3 MEP is assigned to SAP 1/1/2:100.31 using the definition in domain 3 association 1, the higher-level MIP is created on the same SAP. Because a MIP does not have directionality, both sides are active. The service configuration and the MEP configuration within the service did not change.

MIP creation default illustrates a simpler method that does not require the creation of the lower level MEP. The operator simply defines the association parameters and uses the mhf-creation default setting, then places the MIP on the SAP of their choice.

Figure 24. MIP creation default

NODE1:

config>eth-cfm# info
----------------------------------------------
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                exit
            exit
            association 2 format icc-based name "04-MIP0000102"
                bridge-identifier 100
                    mhf-creation default
                exit
            exit
        exit
----------------------------------------------


config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.31 create
                eth-cfm
                     mip mac d0:0d:1e:01:01:01
                exit
            exit
            sap 1/1/10:100.31 create
                eth-cfm
                    mep 101 domain 4 association 1 direction up
                        mac-address d0:0d:1e:00:01:01
                        no shutdown
                    exit
                exit
            exit
            no shutdown
----------------------------------------------

NODE2:

config>eth-cfm# info
----------------------------------------------
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                exit
            exit
            association 2 format icc-based name "04-MIP0000102"
                bridge-identifier 100
                    mhf-creation default
                exit
            exit
        exit
----------------------------------------------


config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.31 create
                eth-cfm
                    mip mac d0:0d:1e:01:01:02
                exit
            exit
            sap 1/1/10:100.31 create
                eth-cfm
                    mep 102 domain 4 association 1 direction up
                        mac-address d0:0d:1e:00:01:02
                        no shutdown
                    exit
                exit
            exit
            no shutdown
----------------------------------------------

MEP, MIP and MD levels shows the detailed IEEE representation of MEPs, MIPs, levels, and associations, using the standard-defined icons.

SAPs support a comprehensive set of rules, including wildcards, to map packets to services. For example, a SAP mapping packets to a service with a port encapsulation of QinQ may choose to only look at the outer VLAN and wildcard the inner VLAN: SAP 1/1/1:100.* maps all packets arriving on port 1/1/1 with an outer VLAN 100 and any inner VLAN to the service the SAP belongs to. However, these powerful abstractions extract inbound ETH-CFM PDUs only when there is an exact match to the SAP construct. In the example, extraction occurs only when an ETH-CFM PDU arrives on port 1/1/1 with a single VLAN of value 100 followed immediately by the Ethertype 0x8902 (ETH-CFM). Furthermore, the ETH-CFM PDUs generated to egress this specific SAP are sent with only a single tag of 100. The primary VLAN is required if the operator needs to extract or generate ETH-CFM PDUs on wildcard SAPs where the offset includes an additional VLAN that was not part of the SAP configuration.

Extraction comparison with primary VLAN shows how packets that would normally bypass ETH-CFM extraction are extracted when the primary VLAN is configured. This assumes that the processing rules for MEPs and MIPs are met: Ethertype 0x8902, levels, and OpCodes.

Table 6. Extraction comparison with primary VLAN
Port encapsulation       E-type   Ingress tags   Ingress SAP   No primary VLAN        With primary VLAN (10)
                                                               ETH-CFM extraction     ETH-CFM extraction
                                                               MEP        MIP         MEP        MIP
Dot1q                    0x8902   10             x/y/z:*       No         No          Yes        Yes
Dot1q                    0x8902   10.10          x/y/z:10      No         No          Yes        Yes
QinQ                     0x8902   10.10          x/y/z:10.*    No         No          Yes        Yes
QinQ (default behavior)  0x8902   10.10          x/y/z:10.0    No         No          Yes        Yes
Null                     0x8902   10             x/y/z         No         No          Yes        Yes

The mapping of the service data remains unchanged. The primary VLAN function allows for one additional VLAN offset beyond the SAP configuration, up to a maximum of two VLANs in the frame. If a fully qualified SAP specifies two VLANs (SAP 1/1/1:10.10) and a primary VLAN of 12 is configured for the MEP, there is no extraction of ETH-CFM for packets arriving tagged 10.10.12, because that exceeds the maximum of two tags.

The mapping of service data based on SAPs has not changed. ETH-CFM MP functionality remains SAP-specific. In instances where a service includes a specific SAP with a specified VLAN (1/1/1:50) and a wildcard SAP on the same port (1/1/1:*), it is important to understand how the ETH-CFM packets are handled. Any ETH-CFM packet with Ethertype 0x8902 arriving with a single tag of 50 is mapped to a classic MEP configured under SAP 1/1/1:50. Any packet arriving with an outer VLAN of 50 and a second VLAN of 10 is extracted by the 1/1/1:50 SAP and requires a primary VLAN-enabled MEP with a value of 10, assuming the operator wants to extract the ETH-CFM PDU. An inbound packet on 1/1/1 with an outer VLAN tag of 10 is mapped to SAP 1/1/1:*. If ETH-CFM extraction is required under SAP 1/1/1:*, a primary VLAN-enabled MEP with a value of 10 is required.

A packet generated from a MEP or MIP with the primary VLAN enabled includes that VLAN. The SAP then encapsulates the primary VLAN using the SAP encapsulation.

Primary VLAN support includes up MEPs, down MEPs, and MIPs on Ethernet SAPs, including LAG, as well as SDP bindings for Epipe and VPLS services. Classic MEPs (those without a primary VLAN enabled) and primary VLAN-enabled MEPs can coexist under the same SAP or SDP binding. Classic MIPs and primary VLAN-enabled MIPs may also coexist. A single classic MIP per SAP or SDP binding continues to be enforced; however, the operator may configure multiple primary VLAN-enabled MIPs on the same SAP or SDP binding. MIPs in the primary VLAN space must include the mhf-creation static configuration under the association and must also include the specific VLAN on the MIP creation statement under the SAP. The no version of the mip command must include the entire statement, including the VLAN information.

The eight MD levels (0 to 7) are specific to the context in which the Management Point (MP) is configured. This means the classic MPs have a level space discrete from the primary VLAN-enabled space, and each primary VLAN has its own eight-level MD space. Consideration must be given before allowing overlapping levels between customers and operators, should the operator provision a customer-facing MP, such as a MIP on a UNI. CPU protection extensions for ETH-CFM are VLAN-unaware and based on MD level and OpCode; any configured rates are applied to the level and OpCode as a group.

There are two configuration steps to enable the primary VLAN. First, the VLAN information must be configured under the bridging instance, contained within the association context (config>eth-cfm>domain>assoc>bridge). Second, the VLAN must be activated using the primary-vlan-enable option as part of the MEP creation step or the MIP statement (config>service>…>{sap | mesh-sdp | spoke-sdp}>eth-cfm). Until then, the VLAN specified under the bridging instance remains inactive. This ensures backward interoperability.
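As a sketch of these two steps (the VLAN value 10, MEP ID, and SAP are illustrative, not a definitive syntax reference), the VLAN is first declared under the bridging instance and then activated with the primary-vlan-enable option on the MEP:

config>eth-cfm# info
----------------------------------------------
        domain 4 format none level 4
            association 1 format icc-based name "04-0000000102"
                bridge-identifier 100
                    vlan 10
                exit
            exit
        exit
----------------------------------------------

config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.* create
                eth-cfm
                    mep 101 domain 4 association 1 direction up primary-vlan-enable vlan 10
                        no shutdown
                    exit
                exit
            exit
----------------------------------------------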

Primary VLAN functions require an FP2-based card or better. Primary VLAN is not supported for vpls-sap-templates, sub-second CCM intervals, or vMEPs.

Figure 25. MEP, MIP and MD levels

An operator may see the following INFO message (during configuration reload), or MINOR (error) message (during configuration creation) when upgrading to 11.0r4 or later if two MEPs are in a previously undetected conflicting configuration. The messaging is an indication that a MEP, the one stated in the message using format (domain md-index/association ma-index/mep mep-id), is already configured and has allocated that context. During a reload (INFO) a MEP that encounters this condition is created but its state machine is disabled. If the MINOR error occurs during a configuration creation this MEP fails the creation step. The indicated MEP must be correctly re-configured.

INFO: ETH_CFM #1341 Unsupported MA ccm-interval for this MEP - MEP 1/112/
21 conflicts with sub-second config on this MA
MINOR: ETH_CFM #1341 Unsupported MA ccm-interval for this MEP - MEP 1/112/
21 conflicts with sub-second config on this MA

Service data arriving at an ingress SAP performs several parsing operations to map the packet to the service as well as to VLAN functions. VLAN functions include determining the service-delineated VLANs based on the ingress configuration. Locally-generated CFM packets are unaware of the ingress VLAN functions. This may lead to service data and CFM data tagging alignment issues when the egress connection is a binding. For example, if the SDP is configured with vc-type vlan and the binding referencing the SDP does not specify the VLAN tag with the optional vlan-vc-tag vlan-id configuration, the service data crossing the service and the locally-generated CFM packets can use different VLAN tags. A problem can occur if this VLAN is significant to the peer. Similarly, an EVPN service cannot specify the vlan-vc-tag vlan-id to be used on the binding.

The optional cfm-vlan-tag <qtag1[.<qtag2>]> command used for MEP and MIP configurations supports the alignment of service data and the locally generated CFM packet VLAN tags for bindings that require matching VLAN tags. The qtag configuration should typically match the ingress SAP configuration. For example, a SAP configured with an encap-type of qinq and a SAP value of 100.* should consider using the cfm-vlan-tag 100 configuration option under the MEP or MIP when a misalignment of VLAN tags is encountered.

The cfm-vlan-tag <qtag1[.<qtag2>]> command option is supported for VPLS and Epipe services only.
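As a sketch of the alignment described above (the SAP, domain, association, and tag values are illustrative), an up MEP on QinQ SAP 1/1/2:100.* whose binding peer expects the outer tag in the CFM packets could be configured as follows:

config>service>epipe# info
----------------------------------------------
            sap 1/1/2:100.* create
                eth-cfm
                    mep 111 domain 3 association 1 direction up
                        cfm-vlan-tag 100
                        no shutdown
                    exit
                exit
            exit
----------------------------------------------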

The cfm-vlan-tag <qtag1[.<qtag2>]> command option is not supported for the following cases:

  • when the configuration requires the CFM packet to include more than two VLAN tags, which causes a configuration error; invalid configurations include a MEP or MIP configured with a primary-vlan-enable vlan-id and both tags of the cfm-vlan-tag <qtag1[.<qtag2>]> specified

  • when the MIP creation does not include the MIP configuration statement under the service, specifically, the default behavior MIP that is created solely based on the associated MHF creation

  • when the context is config>service>template vpls-sap-template

  • for G.8031 (ETH-tunnel) and G.8032 (ETH-ring)

  • for a MIP on a PW-SAP

Loopback

A loopback message is generated by a MEP to its peer MEP or a MIP (CFM loopback). The function is similar to an IP ping, verifying Ethernet connectivity between the nodes.

Figure 26. CFM loopback

The following loopback-related functions are supported:

  • Loopback message functionality on an MEP or MIP can be enabled or disabled.

  • MEP supports generating loopback messages and responding to loopback messages with loopback reply messages. The ETH-LB PDU format does not allow a MEP to have more than a single active ETH-LB session.

  • MIP supports responding to loopback messages with loopback reply messages when loopback messages are targeted to self.

  • SenderID TLV may optionally be configured to carry the ChassisID. When configured, this information is included in LBM messages.

    • Only the ChassisID portion of the TLV is included.

    • The Management Domain and Management Address fields are not supported on transmission.

    • As per the specification, the LBR function copies and returns any TLVs received in the LBM message. This means that the LBR message includes the original SenderID TLV.

    • Supported for both service (id-permission) and facility MEPs (facility-id-permission).

    • Supported for both MEP and MIP.

  • Displays the loopback test results on the originating MEP.

The ETH-LBM (loopback) function includes parameters for sub-second intervals, timeouts, and padding.

When an ETH-LBM command is issued using a sub second interval (100ms), the output success is represented with a ‟!” character, and a failure is represented with a ‟.” The updating of the display waits for the completion of the previous request before producing the next result. However, the packets maintain the transmission spacing based on the interval option specified in the command.

oam eth-cfm loopback 00:00:00:00:00:30 mep 28 domain 14 association 2 interval 1 
send-count 100 timeout 1
Eth-Cfm Loopback Test Initiated: Mac-Address: 00:00:00:00:00:30, out service: 5

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!

Sent 100 packets, received 100 packets [0 out-of-order, 0 Bad Msdu]
Packet loss 1.00% 

When the interval is one second or higher, the output provides information that includes the number of bytes (from the LBR), the source MEP ID (format md-index/ma-index/mep-id), and the sequence number as it relates to this test, along with the result.

oam eth-cfm loopback 00:00:00:00:00:30 mep 28 domain 14 association 2 interval 10  
send-count 10 timeout 1
Eth-Cfm Loopback Test Initiated: Mac-Address: 00:00:00:00:00:30, out service: 5

56 bytes from 14/2/28; lb_seq=1 passed
56 bytes from 14/2/28; lb_seq=2 passed
56 bytes from 14/2/28; lb_seq=3 passed
56 bytes from 14/2/28; lb_seq=4 passed
56 bytes from 14/2/28; lb_seq=5 passed
56 bytes from 14/2/28; lb_seq=6 passed
56 bytes from 14/2/28; lb_seq=7 passed
56 bytes from 14/2/28; lb_seq=8 passed
56 bytes from 14/2/28; lb_seq=9 passed
56 bytes from 14/2/28; lb_seq=10 passed

Sent 10 packets, received 10 packets [0 out-of-order, 0 Bad Msdu]
Packet loss 0.00%

Because ETH-LB does not support standard timestamps, no indication of delay is produced as these times are not representative of network delay.

By default, if no interval is included in the command, the default is back to back LBM transmissions. The maximum count for such a test is 5.

Loopback multicast

Multicast loopback also supports the new intervals (see Loopback). However, the operator must be careful when using this approach, because every MEP in the association responds to the request, multiplying the impact on system resources in large-scale tests. For example, if the multicast option is used with an interval of 1 (100 ms) and there are 50 MEPs in the association, the receive rate is 50 times that of a unicast approach (500 pps). Multicast displays are not updated until the test is completed. No packet loss percentage is calculated for multicast loopback commands.

This on-demand operational tool is used to quickly check the reachability of all MEPs within an association. A multicast address can be coded as the destination of an oam eth-cfm loopback command: either the specific class 1 multicast MAC address or the keyword ‟multicast” can be used as the destination. The class 1 ETH-CFM multicast address has the format 01:80:C2:00:00:3x, where x = 0 to 7 and matches the domain level of the source MEP. When the ‟multicast” option is used, the class 1 multicast destination is built according to the level of the local MEP initiating the test.

Remote MEPs configured at the equivalent level that receive the multicast loopback message terminate and process it, responding with the appropriate unicast loopback response (ETH-LBR). Regardless of whether a multicast or unicast ETH-LBM is used, there is no provision in the standard LBR PDU to carry the MEP ID of the responder. This means only the remote MEP MAC address is reported and subsequently displayed. MIPs do not extract a multicast LBM request; the LBM multicast is transparent to the MIP.

MEP loopback stats are not updated as a result of this test being run. That means the received, out-of-order and bad-msdu counts are not affected by multicast loopback tests. The multicast loopback command is meant to provide immediate connectivity troubleshooting feedback for remote MEP reachability only.

oam eth-cfm loopback multicast mep 28 domain 14 association 2 interval 1 send-
count 100
Eth-Cfm Loopback Test Initiated: Mac-Address: multicast, out service: 5


MAC Address          Receive Order
-------------------------------------------------------------------------------
00:00:00:00:00:30    1    2    3    4    5    6    7    8    9   10   11   12   13  
14   15   16   17   18   19   20   21   22   23   24   25   26   27   28   29   30 
31   32   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47
48   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   6
4    65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80  
81   82   83   84   85   86   87   88   89   90   91   92   93   94   95   96   97  
98   99  100
00:00:00:00:00:32    1    2    3    4    5    6    7    8    9   10   11   12   13  
14   15   16   17   18   19   20   21   22   23   24   25   26   27   28   29   30 
31   32   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47
48   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   6
4    65   66   67   68   69   70   71   72   73   74   75   76   77   78   79   80  
81   82   83   84   85   86   87   88   89   90   91   92   93   94   95   96   97  
98   99  100

Sent 100 multicast packets, received 200 packets

Linktrace

A linktrace message is originated by a MEP and targeted to a peer MEP in the same MA and within the same MD level (CFM linktrace). Linktrace traces a specific MAC address through the service. The peer MEP responds with a linktrace reply message after successful inspection of the linktrace message. The MIPs along the path also process the linktrace message, responding with linktrace replies to the originating MEP if the received linktrace message has a TTL greater than 1, and forwarding the linktrace message if a lookup of the target MAC address in the Layer 2 FDB is successful. The originating MEP expects to receive multiple linktrace replies and, by processing them, can piece together the route to the target bridge.

The traced MAC address, the target MAC, is carried in the payload of the linktrace message. Each MIP and MEP receiving the linktrace message checks whether it has learned the target MAC address; to use linktrace, the target MAC address must have been learned by the nodes in the network. If so, a linktrace reply is sent back to the originating MEP, and a MIP also forwards the linktrace message out of the port where the target MAC address was learned.

The linktrace message itself has a multicast destination address. On a broadcast LAN, it can be received by multiple nodes connected to that LAN. But, at most, one node sends a reply.

Figure 27. CFM linktrace

The following linktrace related functions are supported:

  • MEP supports generating linktrace messages and responding with linktrace reply messages. The ETH-LT PDU format does not allow a MEP to have more than a single active ETH-LT session.

  • MIP supports responding to linktrace messages with linktrace reply messages when the encoded TTL is greater than 1, and forwarding the linktrace messages accordingly if a lookup of the target MAC address in the Layer 2 FDB is successful.

  • Displays linktrace test results on the originating MEP.

  • SenderID TLV may optionally be configured to carry the ChassisID. When configured, this information is included in LTM and LTR messages.

    • Only the ChassisID portion of the TLV is included.

    • The Management Domain and Management Address fields are not supported on transmission.

    • The LTM message includes the SenderID TLV that is configured on the launch point. The LTR message includes the SenderID TLV information from the responder (MIP or MEP) if it is supported.

    • Supported for both service (id-permission) and facility MEPs (facility-id-permission).

    • Supported for both MEP and MIP.

The following output includes the SenderID TLV contents if it is included in the LTR.

oam eth-cfm linktrace 00:00:00:00:00:30 mep 28 domain 14 association 2
Index Ingress Mac          Egress Mac           Relay      Action
----- -------------------- -------------------- ---------- ----------
1     00:00:00:00:00:00    00:00:00:00:00:30    n/a        terminate
SenderId TLV: ChassisId (local)
              access-012-west
----- -------------------- -------------------- ---------- ----------
No more responses received in the last 6 seconds.

Continuity Check

A Continuity Check Message (CCM) is a multicast frame that is generated by a MEP and multicast to all other MEPs in the same MA. The CCM does not require a reply message. To identify faults, the receiving MEP maintains an internal list of remote MEPs it should be receiving CCMs from.

This list is based on the remote MEP ID configuration within the association in which the MEP is created. When the local MEP does not receive a CCM from one of the configured remote MEPs within a preconfigured period, the local MEP raises an alarm.

The following figure displays CFM CC.

Figure 28. CFM CC

The following figure displays a CFM CC failure scenario.

Figure 29. CFM CC failure scenario

A MEP may be configured to generate an ETH-CC packet using a unicast destination Layer 2 MAC address. This may help reduce the overhead in some operational models where Down MEPs per peer are not available. For example, mapping an I-VPLS to a PBB core where a hub is responsible for multiple spokes is one of the applicable models. When ETH-CFM packets are generated from an I-context toward a remote I-context, the packets traverse the B-VPLS context. Most B-contexts are multipoint. This means broadcast, unknown, and multicast (BUM) packets are flooded to all nodes in the B-context. When ETH‑CC multicast packets are generated, all the I-VPLS contexts in the association must be configured with all the appropriate remote MEP IDs. If direct spoke to spoke connectivity is not part of the validation requirement, the operational complexity can be reduced by configuring unicast DA addressing on the ‟spokes” and continuing to use multicast CCM from the ‟hub”. When the unicast MAC is learned in the forwarding database, traffic is scoped to a single node.

The following figure displays unicast CCM in a hub and spoke environment.

Figure 30. Unicast CCM in hub and spoke environments

Defect condition, reception, and processing remains unchanged for both hub and spokes. When an ETH-CC defect condition is raised on the hub or spoke, the appropriate defect condition is set and distributed throughout the association from the multicasting MEP. For example, if a spoke raises a defect condition or timeout, the hub sets the RDI bit in the multicast ETH-CC packet, which is received on all spokes. Any local hub MEP defect condition continues to be propagated in the multicast ETH-CC packet. Defect conditions are cleared as per normal behavior.

The forwarding plane must be considered before deploying this type of ETH-CC model. A unicast packet is handled as unknown when the destination MAC does not exist in the local forwarding table. If a unicast ETH-CC packet is flooded in a multipoint context, it reaches all the appropriate I-contexts. This causes the spoke MEPs to raise the ‟DefErrorCCM” condition, because an ETH-CC packet was received from a MEP that has not been configured as part of the receiving MEP’s database.

The remote unicast MAC address must be configured and is not automatically learned. A MEP cannot send both unicast and multicast ETH-CC packets. Unicast ETH-CC is only applicable to a local association with a single configured remote peer. There is no validation of MAC addresses for ETH-CC packets. The configured unicast destination MAC address of the peer MEP only replaces the multicast class 1 destination MAC address with a unicast destination.

Unicast CCM is not supported on any MEPs that are configured with sub-second CCM intervals.

The following functions are supported:

  • Enabling and disabling CC for a MEP

  • Configuring and deleting MEP entries in the CC MEP monitoring database manually. Only remote MEPs need to be provisioned; local MEPs are automatically added to the database when they are created.

  • CCM transmit interval: 10 ms, 100 ms, 1 s, 10 s, 60 s, 600 s. Default: 10 s. Interval support is platform dependent. When configuring MEPs with sub-second CCM intervals, bandwidth consumption must be taken into consideration. Each CCM PDU is approximately 100 bytes (800 bits). Taken individually, this is a small value. However, the bandwidth consumption increases rapidly as multiple MEPs are configured with 10 ms timers, 100 packets per second.

  • The following basic hierarchical considerations, software requirements, and configurations must be met when considering sub-second enabled MEPs:

    • Down MEPs only

    • Single peer only

    • Any MD level

      • As long as lower MD level MEPs are not CCM or ETH-APS enabled

        G.8031 Ethernet-Tunnels enables OpCode 39 Linear APS

        G.8032 Ethernet-Rings enables OpCode 40 Ring APS

      • As long as lower MD level MEPs are not receiving ETH-CCM or ETH-APS PDUs, even if they are not locally enabled or configured to do so

        The reception of lower MD level ETH-CCM and ETH-APS PDUs is processed by the sub-second CCM-enabled MEP, regardless of MD level

        All other ETH-CFM PDUs are handled by the MEP at the MD level matching the PDU that has arrived, assuming one has been configured

    • Service MEPs (excluding primary VLAN MEPs)

      Ethernet SAPs configured on port with any Ethernet encapsulation (null, dot1q, or QinQ)

    • Facility MEPs

      • Port
      • LAG
      • Base router interface
    • Service MEPs and facility MEPs can execute sub-second CCM simultaneously, as these are considered different MEP families.

    • General processing rules for service MEPs and facility MEPs must be met regardless of the CCM interval. These are included here because of the impact misunderstanding could have on the CCM extraction.

      • All the above rules apply

      • MD level hierarchy must be ensured across different families

      • Facility MEPs are the first processing routine for ETH-CFM PDUs

      • VLAN encapsulation uniqueness must exist when processing the ETH-CFM PDU across the two families

        Unique Example: An Ethernet port-based facility down MEP configured on port 1/1/1 and service down MEP SAP 1/1/1:100 (dot1q encaps) are unique

        Conflict Example: An Ethernet port-based facility down MEP configured on port 1/1/1 and service down MEP SAP 1/1/1 (null encaps) are in conflict and cannot coexist. All ETH-CFM PDUs arrive untagged and the facility MEP takes precedence.

    • G.8031 (Ethernet-Tunnels) supports both sub-second and 1-second CCM intervals, and optionally no CCM. When a MEP is created on a G.8031 Ethernet-Tunnel, no other MEP that is in any way connected to the G.8031 Ethernet-Tunnel can execute sub-second CCM intervals. Facility MEPs are not supported in conjunction with G.8031 Ethernet-Tunnel MEPs.

    • G.8032 (Ethernet-Ring) supports both sub-second and 1-second CCM intervals, and optionally no CCM. Facility MEPs are supported in combination with G.8032 MEPs. However, facility MEPs and G.8032 MEPs cannot both execute sub-second CCM where the infrastructure is shared. If the user configures this combination, the last updated sub-second MEP overwrites the previous sub-second MEP and interrupts the previously configured MEP, causing a defRemoteCCM condition.

  • The size of the CCM PDU may be increased by configuring the optional Data TLV. This is accomplished by configuring the ccm-padding-size command under the specific MEP. The configured value represents the total length of the Data TLV that is included with the other CCM PDU informational elements. The no form of this command removes the optional Data TLV from the CCM PDU. The user must consider that a CCM PDU is 83 bytes in length (75 bytes of base elements plus 8 bytes for port status and interface status). If the size of the optional TLV combined with the size of the CCM PDU exceeds 1500 bytes, the packet is dropped if the MTU is 1518/1522.

  • CCM declares a defect when:

    • it stops hearing from one of the remote MEPs for 3.5 times the CC interval

    • it hears from a MEP with a lower MD level

    • it hears from a MEP that is not part of the local MEPs MA

    • it hears from a MEP that is in the same MA but not in the configured MEP list

    • it hears from a MEP in the same MA with the same MEP ID as the receiving MEP

    • the CC interval of the remote MEP does not match the local configured CC interval

    • the remote MEP is declaring a fault

  • An alarm is raised and a trap is sent if the defect is greater than or equal to the configured low-priority-defect value.

  • Remote Defect Indication (RDI) is supported but by default is not recognized as a defect condition because the low-priority-defect setting default does not include RDI.

  • SenderID TLV may optionally be configured to carry the ChassisID. When configured, this information is included in CCM messages.

    • Only the Chassis ID portion of the TLV is included.

    • The Management Domain and Management Address fields are not supported on transmission.

    • The Sender ID TLV is not supported with sub-second CCM enabled MEPs.

    • Supported for both service (id-permission) and facility MEPs (facility-id-permission).

  • Alarm notification and reset times are configurable under the MEP. By default, the alarm notification times are set to zero, which means the behavior is immediate logging and resetting. When the value is zero and a previous higher-level alarm is reset, if a lower-level alarm exists and is above the low-priority defect, a log event is created. However, when either of the alarm notification timers is non-zero and a lower priority alarm exists, it is not logged.

    • Alarm (fng-alarm-time) delays the generation of the log event by the value configured. The alarm must be present for this amount of time before the log event is created. This is for only log event purposes.

    • Reset (fng-reset-time) is the amount of time the alarm must be absent before it is cleared.
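The defect conditions and low-priority-defect gating listed above can be sketched as follows (a simplified model, not the actual implementation; the priority ordering and threshold keywords follow IEEE 802.1ag):

```python
# Standard CFM defect priorities, lowest to highest.
PRIORITY = {"defRDICCM": 1, "defMACstatus": 2, "defRemoteCCM": 3,
            "defErrorCCM": 4, "defXconCCM": 5}

# low-priority-defect keywords mapped to the lowest defect priority that
# raises an alarm; the default (macRemErrXcon) excludes RDI, as noted above.
LOW_PRIORITY_DEFECT = {"allDef": 1, "macRemErrXcon": 2, "remErrXcon": 3,
                       "errXcon": 4, "xcon": 5}

def alarm_raised(active_defects, low_priority_defect="macRemErrXcon"):
    """Alarm when the highest active defect is >= the configured threshold."""
    if not active_defects:
        return False
    highest = max(PRIORITY[d] for d in active_defects)
    return highest >= LOW_PRIORITY_DEFECT[low_priority_defect]

print(alarm_raised({"defRDICCM"}))            # False: RDI alone is below the default
print(alarm_raised({"defRDICCM"}, "allDef"))  # True: allDef includes RDI
print(alarm_raised({"defRemoteCCM"}))         # True: peer timeout alarms by default
```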

The optional ccm-tlv-ignore command ignores the reception of interface-status and port-status TLVs in the ETH-CCM PDU on facility MEPs (port, LAG, QinQ tunnel, and router). No processing is performed on the ignored ETH-CCM TLV values.

Any TLV that is ignored is reported as absent for that remote peer, and the values in the TLV do not affect the ETH-CFM state machine. This is the same behavior as if the remote MEP had never included the ignored TLVs in the ETH-CCM PDU. If a TLV is not properly formed, the CCM PDU fails the packet parsing process, which causes it to be discarded, and a defect condition is raised.
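The effect of ignoring TLVs can be illustrated with a minimal sketch (a hypothetical helper for illustration; the real implementation operates on parsed CCM PDUs):

```python
def effective_tlvs(received_tlvs, ignored):
    """Return only the TLVs that remain visible to the ETH-CFM state machine;
    ignored TLVs are treated as if the remote MEP never sent them."""
    return {name: value for name, value in received_tlvs.items()
            if name not in ignored}

rx = {"interface-status": "down", "port-status": "up"}
print(effective_tlvs(rx, ignored={"interface-status"}))   # {'port-status': 'up'}
```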

There are various display commands that are available to show the status of the MEP and the list of remote peers.

CC remote peer auto discovery

As specified in the section ‟Continuity Check,” all remote MEP-IDs must be configured under the association using the remote-mepid command to accept them as peers. When a CCM is received from a MEP-ID that has not been configured, the ‟unexpected MEP” causes the defErrorCCM condition to be raised. The defErrorCCM is raised for all invalid CC reception conditions.

The auto-mep-discovery option allows for the automatic adding of remote MEP-IDs contained in the received CCM. When learned, automatically discovered MEPs behave the same as manually configured entries. This includes the handling and reporting of defect conditions. For example, if an auto-discovered MEP is deleted from its host node, it experiences the standard timeout on the node that auto-discovered it.

When this function is enabled, the ‟unexpected MEP” condition no longer exists. That is because all MEPs are accepted as peers and automatically added to the MEP database upon reception. There is an exception to this statement. If the maintenance association has reached its maximum MEP count, and no new MEPs can be added, the ‟unexpected MEP” condition raises the defErrorCCM defect condition. This is because the MEP was not added to the association and the remote MEP is still transmitting CCM.

The clear eth-cfm auto-discovered-meps [mep-id] domain md-index association ma-index command is available to remove auto-discovered MEPs from the association. When the optional mep-id is included as part of the clear command, only that specific MEP-ID within the domain and association is cleared. If the optional mep-id is omitted when the clear command is issued, all auto-discovered MEPs that match the domain and association are cleared. The clear command is only applicable to auto-discovered MEPs.

If there is a failure to add a MEP to the MEP database and the action was manual addition using the ‟remote-mepid” configuration statement, the error ‟MINOR: ETH_CFM #1203 Reached maximum number of local and remote endpoints configured for this association” is produced. When failure to add a MEP to the database through an auto discovery, no event is created. The CCM Last Failure indicator tracks the last CCM error condition. The decode can be viewed using the show eth-cfm mep mep-id domain md-index association ma-index command. An association may include both the manual addition of remote peers using the remote-mepid and the auto-mep-discovery option.

The all-remote-mepid display includes an additional column AD to indicate where a MEP has been auto discovered, using the indicator T.

Auto discovered MEPs do not survive a system reboot. These are not permanent additions to the MEP database and are not reloaded after a reboot. The entries are relearned when the CCM is received. Auto discovered MEPs can be changed to manually created entries simply by adding the appropriate remote-mepid statement to the correct association. At that point, the MEP is no longer considered auto discovered and can no longer be cleared.

If a remote-mepid statement is removed from the association context while auto-mep-discovery is configured, and a CC message arrives from that remote MEP, the MEP is added to the MEP database again, this time as an auto-discovered MEP.

The individual MEP database for an association must not exceed the maximum number of MEPs allowed. A MEP database consists of all local MEPs plus all configured remote-mepids and all auto-discovered MEPs. If the number of MEPs in the association has reached capacity, no new MEPs may be added. The number of MEPs must be brought below the maximum value before MEPs can be added. Also, the number of MEPs across all MEP databases must not exceed the system maximum. The number of MEPs supported per association and the total number of MEPs across all associations is platform dependent.
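The acceptance rules for received CCMs under auto-mep-discovery, including the capacity check, can be sketched as follows (a hypothetical model; the database structure and function name are invented):

```python
def on_ccm_rx(mep_db, mep_id, auto_discovery, max_meps):
    """Return 'known', 'auto-added', or 'defErrorCCM' for a received CCM."""
    if mep_id in mep_db:
        return "known"
    if auto_discovery and len(mep_db) < max_meps:
        mep_db[mep_id] = {"auto_discovered": True}   # behaves like a configured entry
        return "auto-added"
    return "defErrorCCM"   # unexpected MEP, or the association is at capacity

mep_db = {10: {"auto_discovered": False}}   # manually configured remote-mepid
print(on_ccm_rx(mep_db, 20, auto_discovery=True, max_meps=2))   # auto-added
print(on_ccm_rx(mep_db, 30, auto_discovery=True, max_meps=2))   # defErrorCCM
```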

ETH-CFM grace overview

ETH-CFM grace is an indication that MEPs on a node undergoing a maintenance operation may be unable to transmit or receive ETH-CC PDUs, failing to satisfy the peer's requirements. Without the use of a supporting grace function, CCM-enabled MEPs time out after an interval of 3.5 ✕ ccm-interval. During planned maintenance operations, the use of grace can extend the timeout condition to a longer interval.

The Ethernet CFM system-wide configuration eth-cfm>system>[no] grace-tx-enable command controls the transmission of ETH-CFM grace. The ETH-CFM grace function is enabled by the Soft Reset notification by default. The ETH-CFM grace function determines the individual MEP actions based on their configured parameters.

To transmit a grace PDU, the MEP must be administratively enabled and ETH-CC must also be enabled. The ETH-CC interval is ignored. Grace transmission uses the class 1 DA, with the last nibble (4 bits) indicating the domain level, for all grace-enabled MEPs. When a grace event occurs, all MEPs on a node that are configured for grace actively participate in the grace function until the grace event has completed. When a soft reset occurs, ETH-CFM does not determine which peers are directly affected by a soft reset of a specific IOM or line card. This means that all MEPs enter a grace state, regardless of their location on the local node.

The grace process prevents the local MEP from presenting a new timeout condition, and prevents its peer, also supporting a complementary grace process, from declaring a new timeout defect (DefRemoteCCM). Other defects, unrelated to timeout conditions, are processed as during normal operation. This includes the setting, transmission, and reception processing of the RDI flag in the CCM PDU. Because the timeout condition has been prevented, it can be assumed that the RDI is caused by some other unrelated CCM defect condition. Entering the grace period does not clear existing defect conditions, and any defect condition that exists at the start of the grace period is maintained and cleared using normal operation.

Two approaches are supported for ETH-CFM grace:

  • ETH-VSM grace (Nokia SR OS vendor-specific)

  • ITU-T Y.1731 ETH-ED

Both approaches use the same triggering infrastructure but have unique PDU formats and processing behaviors. Only one grace transmission function can be active under an individual MEP. MEPs can be configured to receive and process both grace PDU formats. If a MEP receives both types of grace PDUs, the last grace PDU received becomes the authority for the grace period, using its procedures. If the operator needs to clear a grace window or expected defect window on a receiving peer, the appropriate authoritative reception function can be disabled.

Active AIS server transmissions include a vendor-specific TLV that instructs the client to extend the timeout of AIS during times of grace. When the grace period is completed, the server MEP removes the TLV and the client reverts to standard timeout processing based on the interval in the AIS PDU.

ETH-VSM grace (Nokia SR OS vendor-specific)

The ETH-VSM Multicast Class 1 DA announcement includes the start of a grace period, the new remote timeout value of 90 s, and the completion of the grace process.

At the start of the maintenance operation, a burst of three packets is sent over a 3-second window to reduce the chance that a remote peer may miss the grace announcement. Following the initial burst, evenly-spaced ETH-VSM packets are sent at intervals of one third of the ETH-VSM grace window; this means that the ETH-VSM packets are sent every 30 seconds to all appropriate remote peers. Reception of an ETH-VSM grace packet refreshes the timeout calculation. The local node that is undergoing the maintenance operation also delays the CCM timeout of the local MEP during the grace window using the announced ETH-VSM interval. MEPs restart their timeout countdown when any ETH-CC PDU is received.

At the end of the maintenance operation, there is a burst of three ETH-VSM grace packets to signal that the maintenance operation has been completed. After the first of these packets has been received, the receiving peer transitions back to the ETH-CCM message and associated interval as the indication for the remote timeout (3.5 ✕ ccm-interval + hold (where applicable)).

CCM packets continue to be sent during this process, but loss of CCM packets during the advertised grace window does not affect the peer timeout. The only change to the CCM processing is the timeout value used during the grace operation. During the operation, the value that is announced as part of the ETH-VSM packet is used. If the grace value is lower than the standard timeout computation for the configured CCM interval (3.5 ✕ ccm-interval + hold (where applicable)), the grace value is not installed as the new timeout metric.
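The timing behavior described above can be summarized in a short sketch (illustrative only: a 90 s grace window with refreshes at one third of the window, and a receiver that never installs a grace value below the standard timeout computation):

```python
def vsm_refresh_interval(grace_window_s=90.0):
    """ETH-VSM refreshes are sent at one third of the grace window."""
    return grace_window_s / 3.0                     # 30 s for the 90 s window

def installed_timeout(grace_window_s, ccm_interval_s, hold_s=0.0):
    """The grace value is not installed if it is below the standard timeout."""
    standard = 3.5 * ccm_interval_s + hold_s        # 3.5 x ccm-interval + hold
    return max(grace_window_s, standard)

print(vsm_refresh_interval())                        # 30.0
print(installed_timeout(90.0, ccm_interval_s=1.0))   # 90.0: grace value used
print(installed_timeout(90.0, ccm_interval_s=60.0))  # 210.0: grace value not installed
```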

This is a value-added function that is applicable only to nodes that implement support for Nokia’s approach for announcing grace using ETH-VSM. This pre-dates the introduction of the ITU-T Y.1731 Ethernet-Expected Defect (ETH-ED) standard. As specified in the standards, when a node does not support a specific optional function such as ETH-VSM, the message is ignored and no processing is performed.

The ETH-VSM function is enabled by default for reception and transmission. The per-MEP configuration statements under the grace>eth-vsm-grace context can affect the transmission, reception, and processing of the ETH-VSM grace function.

ITU-T Y.1731 ETH-ED

The ETH-ED PDU is used to announce the expected defect window to peer MEPs. The peer MEPs use the expected defect window value to prevent ETH-CC timeout (DefRemoteCCM) conditions for the announcing MEP. The MEP announcing ETH-ED does not time out any remote peers during the expected defect window. The expected defect window is not a configurable value.

At the start of the operation, a burst of three packets is sent over a 3-second window to reduce the chance that a remote peer may miss the expected defect window announcement.

It is possible to restrict the value that is installed for the expected defect timer by configuring the max-rx-defect-window command for the receiving MEP. A comparison is used to determine the expected defect timer to be installed during grace: the lower of the received expected defect timer value in the ETH-ED PDU and the configured maximum is installed, provided it is larger than the standard computation for the ETH-CC timeout. The no max-rx-defect-window command is configured by default; therefore, the maximum received expected defect window is disabled and is not considered in determining the installed expected defect timer.

Subsequent ETH-ED packets are only transmitted at the completion of the Soft Reset function that triggered the grace function. The three-packet burst at the completion of the Soft Reset function contains an expected defect window size of 5 seconds. Receiving peers should use this new advertisement to reset the expected window to 5 seconds.

The termination of the grace window occurs when the expected defect window timer reaches zero, or when the receive function is manually disabled.
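The ETH-ED timer selection described above can be sketched as follows (a hypothetical model; a max_rx_defect_window_s of None represents the default no max-rx-defect-window):

```python
def expected_defect_timer(received_s, ccm_interval_s,
                          max_rx_defect_window_s=None, hold_s=0.0):
    """Install the lower of the received window and the configured maximum,
    but only when it exceeds the standard ETH-CC timeout computation."""
    standard = 3.5 * ccm_interval_s + hold_s
    candidate = received_s
    if max_rx_defect_window_s is not None:       # max-rx-defect-window configured
        candidate = min(candidate, max_rx_defect_window_s)
    return candidate if candidate > standard else standard

print(expected_defect_timer(300.0, ccm_interval_s=1.0))            # 300.0
print(expected_defect_timer(300.0, ccm_interval_s=1.0,
                            max_rx_defect_window_s=120.0))         # 120.0
print(expected_defect_timer(2.0, ccm_interval_s=1.0))              # 3.5
```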

CCM hold timers

In some cases, the requirement exists to prevent a MEP from entering the defRemoteCCM defect, remote peer timeout, for more time than the standard 3.5 times the ccm-interval. Both the IEEE 802.1ag standard and ITU-T Y.1731 recommendation provide a non-configurable 3.5 times the CCM interval to determine a peer time out. However, when sub-second CCM timers (10 ms/100 ms) are enabled, the carrier may want to provide additional time for different network segments to converge before declaring a peer lost because of a timeout. To maintain compliance with the specifications, the ccm-hold-timer down delay-down option artificially increases the amount of time it takes for a MEP to enter a failed state if the peer times out. This timer is only additive to CCM timeout conditions. All other CCM defect conditions, like defMACStatus, defXconCCM, and so on, maintain their existing behavior of transitioning the MEP to a failed state and raising the correct defect condition without delay.

When the ccm-hold-timer down delay-down option is configured, the following calculation is used to determine the remote peer time out: 3.5 ✕ ccm-interval + ccm-hold-timer down delay-down.

This command is configured under the association. Only sub-second CCM-enabled MEPs support this hold timer. Ethernet tunnel paths use a similar but slightly different approach and continue to use the existing method. Ethernet tunnels are blocked from using this new hold timer.

It is possible to change this command on the fly without deleting it first. Entering the command with new values changes the configuration without having to first delete the command.

It is possible to change the ccm-interval of a MEP on the fly without first deleting it. This means it is possible to change a sub-second CCM-enabled MEP to 1 second or more. The operator is prevented from changing an association from a sub-second CCM interval to a non-sub-second CCM interval when a ccm-hold-timer is configured in that association. The ccm-hold-timer must be removed using the no option before the transition from a sub-second to a non-sub-second CCM interval is allowed.
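The peer-timeout computation with the hold timer can be expressed as a one-line sketch (illustrative only; the function name is invented):

```python
def remote_peer_timeout(ccm_interval_s, hold_down_s=0.0):
    """3.5 x ccm-interval plus the optional ccm-hold-timer down delay-down."""
    return 3.5 * ccm_interval_s + hold_down_s

print(remote_peer_timeout(1.0))                              # 3.5: standard computation
print(round(remote_peer_timeout(0.01, hold_down_s=0.5), 3))  # 0.535: 10 ms CCM plus hold
```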

ITU-T Y.1731 ETH-AIS

Alarm Indication Signal (AIS) provides a MEP with the ability to signal a fault condition in the reverse direction of the MEP, out the passive side. When a fault condition is detected, the MEP generates AIS packets at the configured client levels and at the specified AIS interval until the condition is cleared. Currently, a MEP that is configured to generate AIS must do so at a level higher than its own. The MEP configured on the service receiving the AIS packets is required to have the active side facing the receipt of the AIS packet and must be at the same level as the AIS. The absence of an AIS packet for 3.5 times the AIS interval set by the sending node clears the condition on the receiving MEP.

AIS generation is not subject to the CCM low-priority-defect parameter setting. When enabled, AIS is generated if the MEP enters any defect condition; by default, this includes the CCM RDI condition.

To prevent the generation of AIS for the CCM RDI condition, the AIS version of the low-priority-defect parameter (under the ais-enable command) can be configured to ignore RDI by setting the parameter value to macRemErrXcon. The low-priority-defect parameter is specific and influences the protocol under which it is configured. When the low-priority-defect parameter is configured under CCM, it only influences CCM and not AIS. When the low-priority-defect parameter is configured under AIS, it only influences AIS and not CCM. Each protocol can make use of this parameter using different values.

AIS configuration has two components: receive and transmit. AIS reception is enabled when the command ais-enable is configured under the MEP. The transmit function is enabled when the client-meg-level is configured.

Alarm Indication Signal function is used to suppress alarms at the client (sub) layer following detection of defect conditions at the server (sub) layer. Because of independent restoration capabilities provided within the Spanning Tree Protocol (STP) environments, ETH-AIS is not expected to be applied in the STP environment.

Transmission of frames with ETH-AIS information can be enabled or disabled on a MEP. Frames with ETH-AIS information can be issued at the client MEG Level by a MEP, including a Server MEP, upon detecting the following conditions:

  • signal failure conditions in the case that ETH-CC is enabled

  • AIS condition in the case that ETH-CC is disabled

For a point-to-point ETH connection at the client (sub) layer, a client layer MEP can determine that the server (sub) layer entity providing connectivity to its peer MEP has encountered a defect condition upon receiving a frame with ETH-AIS information. Alarm suppression is straightforward because a MEP is expected to suppress defect conditions associated only with its peer MEP.

For multipoint ETH connectivity at the client (sub) layer, a client (sub) layer MEP cannot determine the specific server (sub) layer entity that has encountered defect conditions upon receiving a frame with ETH-AIS information. More importantly, it cannot determine the associated subset of its peer MEPs for which it should suppress alarms because the received ETH-AIS information does not contain that information. Therefore, upon receiving a frame with ETH-AIS information, the MEP suppresses alarms for all peer MEPs whether there is still connectivity or not.

Only a MEP, including a server MEP, is configured to issue frames with ETH-AIS information. Upon detecting a defect condition the MEP can immediately start transmitting periodic frames with ETH-AIS information at a configured client MEG Level. A MEP continues to transmit periodic frames with ETH-AIS information until the defect condition is removed. Upon receiving a frame with ETH-AIS information from its server (sub) layer, a client (sub) layer MEP detects AIS condition and suppresses alarms associated with all its peer MEPs. A MEP resumes alarm generation upon detecting defect conditions when AIS condition is cleared.

AIS may also be triggered or cleared based on the state of the entity over which it has been enabled. Including the optional command interface-support-enable under the ais-enable command tracks the state of the entity and invokes the appropriate AIS action. This means that operators are not required to enable CCM on a MEP to generate AIS if the only requirement is to track the local entity. If CCM is enabled on the MEP in addition to this function, both are used to act upon the AIS function. When both CCM and interface support are enabled, a fault in either triggers AIS. To clear the AIS state, the entity must be in an UP operational state and there must be no defects associated with the MEP. The interface support function is available on both service MEPs and facility MEPs, in the Down direction only, with the following exception: an Ethernet QinQ Tunnel Facility MEP does not support interface-support-enable. Many operational models for Ethernet QinQ Tunnel Facility MEPs are deployed with the SAP in the shutdown state.
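The AIS trigger and clear conditions when interface-support-enable is combined with CCM can be sketched as follows (a hypothetical model; the function and its parameters are invented for illustration):

```python
def ais_active(mep_defects, entity_up, ccm_enabled, interface_support):
    """AIS is raised on a MEP defect (CCM) or an entity fault (interface support);
    it clears only when the entity is up and no defects remain."""
    return (ccm_enabled and bool(mep_defects)) or \
           (interface_support and not entity_up)

# Entity fault alone triggers AIS when interface support is enabled, even without CCM:
print(ais_active(set(), entity_up=False, ccm_enabled=False, interface_support=True))  # True
# Entity up and no defects: AIS clears.
print(ais_active(set(), entity_up=True, ccm_enabled=True, interface_support=True))    # False
```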

The following specific configuration information is used by a MEP to support ETH-AIS:

  • client MEG level: MEG level at which the most immediate client layer MIPs and MEPs exist

  • ETH-AIS transmission period: determines the transmission period of frames with ETH-AIS information

  • priority: identifies the priority of frames with ETH-AIS information

  • drop eligibility: frames with ETH-AIS information are always marked as drop ineligible

  • interface-support-enable: optional configuration to track the state of the entity over which the MEP is configured

  • low-priority-defect: optional configuration to exclude the CCM RDI condition from triggering the generation of AIS

A MIP is transparent to frames with ETH-AIS information and therefore does not require any information to support ETH-AIS functionality.

It is important to note that Facility MEPs do not support the generation of AIS to an explicitly configured endpoint. An explicitly configured endpoint is an object that contains multiple individual endpoints, as in pseudowire redundancy.

AIS is enabled under the service and has two parts: receive and transmit. Both components have their own configuration option. The ais-enable command under the SAP allows for the processing of received AIS packets at the MEP level. The client-meg-level command is the transmit portion that generates AIS if the MEP enters a fault state.

When MEP 101 enters a defect state, it starts to generate AIS out the passive side of the MEP, away from the fault. In this case, the AIS is generated out sap 1/1/10:100.31 because MEP 101 is an up MEP on that SAP. The Defect Flag indicates that an RDI error state has been encountered. The Eth-Ais TxCount value increases, indicating that AIS is actively being sent.

A single network event may, in turn, cause the number of AIS transmissions to exceed the AIS transmit rate of the network element. A pacing mechanism is in place to help the network element gracefully handle this overload condition. Should an event occur that causes the AIS transmit requirements to exceed the AIS transmit resources, a credit system is used to grant access to the resources. When all the credits have been used, any remaining MEPs attempting to allocate a transmit resource are placed on a wait list, unable to transmit AIS. If a credit is released, when the condition that caused the MEP to transmit AIS is cleared, a MEP on the wait list consumes the newly available credit. If it is critical that AIS transmit resources be available for every potential event, consideration must be given to the worst-case scenario and the configuration should never exceed the potential. Access to the resources and the wait list is ordered and maintained on a first-come, first-served basis.

A MEP that is on the wait list only increments the ‟Eth-Ais Tx Fail” counter and not the ‟Eth-Ais TxCount” for every failed attempt while the MEP is on the wait list.
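The credit and wait-list behavior described above can be sketched as follows (a hypothetical model; the class and counter names are invented and do not reflect the actual implementation):

```python
from collections import deque

class AisCreditPool:
    """Grant AIS transmit credits; overflow MEPs wait in FIFO order."""
    def __init__(self, credits):
        self.credits = credits
        self.wait_list = deque()          # first come, first served
        self.tx_fail = {}                 # per-MEP failed-attempt counter

    def request(self, mep):
        if self.credits > 0:
            self.credits -= 1
            return True                   # MEP transmits AIS
        self.wait_list.append(mep)        # no credit: join the wait list
        self.tx_fail[mep] = self.tx_fail.get(mep, 0) + 1
        return False

    def release(self):
        self.credits += 1
        if self.wait_list:
            next_mep = self.wait_list.popleft()
            self.credits -= 1
            return next_mep               # waiting MEP consumes the freed credit
        return None

pool = AisCreditPool(credits=1)
print(pool.request("mep-101"))   # True: credit granted
print(pool.request("mep-102"))   # False: placed on the wait list
print(pool.release())            # mep-102
```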

There is no synchronization of AIS transmission state between peer nodes. This is particularly important when AIS is used to propagate fault in ETH-CFM MC-LAG linked designs.

ITU-T Y.1731 ETH-CSF

Client signal fail (CSF) is a method that allows for the propagation of a fault condition to a MEP peer, without requiring ETH-CC or ETH-AIS. The message is sent when a MEP detects an issue with the entity in the direction from the MEP to its peer MEP. A typical deployment model is an Up MEP configured on an entity that is not executing ETH-CC with its peer. When the entity over which the MEP is configured fails, the MEP can send the ETH-CSF fault message.

To process the reception of the ETH-CSF message, the csf-enable function must be enabled under the MEP. When processing of the received CSF message is enabled, CSF is used as another method to trigger fault propagation, assuming fault propagation is enabled. If CSF is enabled but fault propagation is not, the MEP simply shows the state of the CSF being received from the peer. When there is no fault condition, the CSF Rx State displays DCI (Client defect clear), indicating there are no existing failures, even if no CSF has been received. The CSF Rx State indicates the various fault and clear conditions received from the peer during the event.

CSF carries the type of defect that has been detected by the local MEP generating the CSF message.

  • 000 – LOS – Client Loss of Signal

  • 001 – FDI/AIS – Client forward defect indication

  • 010 – RDI – Client reverse defect indication

Clearing the CSF state can be either implicit (a timeout) or explicit (requiring the client to send the PDU with the clear indicator, 011 – DCI – Client defect clear indication). The receiving node uses the multiplier option to determine how to clear the CSF condition. When the multiplier is configured as non-zero (in increments of half seconds between 2 and 30), the CSF is cleared when CSF PDUs have not been received for that duration. A multiplier value of 0 means that the peer that generated the CSF must send the 011 – DCI flags; there is no timeout condition.
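The clearing logic can be sketched as follows (an illustrative Python model; treating the non-zero multiplier directly as the timeout duration in seconds is an assumption made for the example):

```python
def csf_cleared(multiplier, now, last_rx, last_flags):
    """Decide whether a CSF fault state should clear.

    multiplier: 0 means explicit clear only; a non-zero value (configurable
    in half-second increments) is treated here as the timeout duration in
    seconds, which is an illustrative assumption.
    last_flags: flags of the most recent CSF PDU ('DCI' clears explicitly).
    """
    if last_flags == 'DCI':      # 011 - Client defect clear indication
        return True
    if multiplier == 0:
        return False             # no timeout condition; wait for DCI
    return (now - last_rx) > multiplier
```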

Service-based MEP supports the reception of the ETH-CSF as an additional trigger for the fault propagation process. Primary VLAN and Virtual MEPs do not support the processing of the CSF PDU. CSF is transparent to MIPs. There is no support for the transmission of ETH-CSF packets on any MEP.

ITU-T Y.1731 ETH-TST

Ethernet test provides a MEP with the ability to send an in-service on-demand function to test connectivity between two MEPs. The test is generated on the local MEP and the results are verified on the destination MEP. Any ETH-TST packet generated that exceeds the MTU is silently dropped by the lower level processing of the node.

Specific configuration information required by a MEP to support ETH-test is the following:

  • MEG level (MEG level at which the MEP exists)

  • unicast MAC address of the peer MEP for which ETH-test is intended

  • data (optional element whose length and contents are configurable at the MEP)

  • priority (identifies the priority of frames with ETH-Test information)

  • drop eligibility (identifies the eligibility of frames with ETH-Test information to be dropped when congestion conditions are encountered)

A MIP is transparent to the frames with ETH-Test information and does not require any configuration information to support ETH-Test functionality.

Both nodes require the eth-test function to be enabled to successfully execute the test. Because this is a dual-ended test, initiated on the sender with results calculated on the receiver, both nodes need to be checked to see the results.

ITU-T Y.1731 ETH-1DM

One-way delay measurement provides a MEP with the ability to check unidirectional delay between MEPs. An ETH-1DM packet is timestamped by the generating MEP and sent to the remote node. The remote node timestamps the packet on receipt and generates the results. The results, available from the receiving MEP, indicate the delay and jitter. Jitter, or delay variation, is the difference in delay between tests, which means the delay variation on the first test is not valid. The clocks must be synchronized on both nodes for the results to be accurate. NTP can be used to achieve a level of clock synchronization between the nodes.

Note: Accuracy relies on the node's ability to timestamp the packet in hardware, and on the support of PTP for clock synchronization.
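The delay and jitter computation described above reduces to a simple calculation over successive (transmit, receive) timestamp pairs; the following is a minimal illustrative sketch:

```python
def one_way_results(tests):
    """Compute per-test one-way delay and jitter from (tx_ts, rx_ts) pairs.

    Jitter (delay variation) is the difference in delay between consecutive
    tests, so the variation for the first test is not valid (reported None).
    Accuracy depends on the two clocks being synchronized, since the two
    timestamps are taken on different nodes.
    """
    results = []
    prev_delay = None
    for tx_ts, rx_ts in tests:
        delay = rx_ts - tx_ts
        jitter = None if prev_delay is None else abs(delay - prev_delay)
        results.append((delay, jitter))
        prev_delay = delay
    return results
```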

ITU-T Y.1731 ETH-DMM

Two-way delay measurement is similar to one-way delay measurement except it measures the round trip delay from the generating MEP. In this case, clock synchronization issues do not influence the round-trip test results because four timestamps are used. This allows the time it takes for the remote node to process the frame to be removed from the calculation, and as a result, clock variances are not included in the results. The same consideration for first test and hardware based time stamping stated for one-way delay measurement are applicable to two-way delay measurement.
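The four-timestamp computation can be sketched as follows (illustrative Python; the timestamp names follow the Y.1731 DMM/DMR convention). The example shows how a clock offset between the two nodes cancels out of the result:

```python
def two_way_delay(t1, t2, t3, t4):
    """Round-trip delay from the four DMM/DMR timestamps:
    t1 = initiator transmit, t2 = responder receive,
    t3 = responder transmit, t4 = initiator receive.

    Subtracting (t3 - t2) removes the responder's processing time. Because
    t1/t4 and t2/t3 are each differenced on the same clock, any offset
    between the two clocks cancels out.
    """
    return (t4 - t1) - (t3 - t2)
```

For example, with the responder's clock offset by 100 000 units, a 10-unit round trip with 4 units of responder processing still yields a delay of 6.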

Delay can be measured using one-way and two-way on-demand functions. The two-way test results are available single-ended: the test is initiated, and the calculation and results are viewed, on the same node. There is no specific configuration under the MEP on the SAP to enable this function. An example of an on-demand test and its results is shown below. The latest test result is stored for viewing; further tests overwrite the previous results. Delay variation is only valid if more than one test has been executed.

ITU-T Y.1731 ETH-SLM

Note: Release 9.0 R1 uses pre-standard OpCodes and does not interoperate with any other release or future release.

This synthetic loss measurement approach is a single-ended feature that allows the operator to run on-demand and proactive tests to determine ‟in” loss, ‟out” loss, and ‟unacknowledged” packets. This approach can be used between peer MEPs in both point-to-point and multipoint services. Only remote MEP peers within the association and matching the unicast destination respond to the SLM packet.

The specification uses various sequence numbers to determine in which direction the loss occurred. Nokia has implemented the required counters to determine loss in each direction. To properly use the information that is gathered, the following terms are defined:

  • count

    This is the number of probes that are sent when the last frame is not lost. When the last frames are lost, the count + unacknowledged equals the number of probes sent.

  • out-loss (far-end)

    This represents packets lost on the way to the remote node, from test initiator to test destination.

  • in-loss (near-end)

    This represents packets lost on the way back from the remote node to the test initiator.

  • unacknowledged

    This is the number of packets at the end of the test that were not responded to.

The per probe specific loss indicators are available when looking at the on-demand test runs, or the individual probe information stored in the MIB. When tests are scheduled by Service Assurance Application (SAA) the per probe data is summarized and per probe information is not maintained. Any ‟unacknowledged” packets are recorded as ‟in-loss” when summarized.
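The relationship between these loss terms can be sketched as a simple model (illustrative Python; the counter inputs are simplified stand-ins for the sequence-number bookkeeping, not the actual implementation):

```python
def slm_summary(count, far_end_rx, near_end_rx, unacked, saa_summarized=False):
    """Rough illustration of the SLM result terms defined above.

    count:       probes within the acknowledged window (count + unacked
                 equals the total probes sent when trailing frames are lost)
    far_end_rx:  of those, probes the remote MEP received
    near_end_rx: responses that made it back to the initiator
    unacked:     trailing probes with no response (direction of loss unknown)
    """
    out_loss = count - far_end_rx        # far-end loss, initiator -> peer
    in_loss = far_end_rx - near_end_rx   # near-end loss, peer -> initiator
    if saa_summarized:                   # SAA records unacknowledged as in-loss
        in_loss += unacked
        unacked = 0
    return {"out-loss": out_loss, "in-loss": in_loss, "unacknowledged": unacked}
```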

The on-demand function can be executed from CLI or SNMP. The on-demand tests are meant to provide the carrier a means to perform on-the-spot testing; they are not meant as a method for storing archived data for later processing. The probe count for on-demand SLM has a range of one to 100 with configurable probe spacing between one second and ten seconds. This means a single test run can be up to 1000 seconds in length, although it is more likely that the majority of on-demand cases are run with 100 probes or fewer at a one-second interval. A node may only initiate and maintain a single active on-demand SLM test at any one time. A maximum of one storage entry per remote MEP is maintained in the results table; subsequent runs to the same peer overwrite the results for that peer. This means that when using on-demand testing, the test should be run and the results checked before starting another test.

The proactive measurement functions are linked to SAA. This backend provides the scheduling, storage, and summarization capabilities. Scheduling may be either continuous or periodic. It also allows for the interpretation and representation of data that may enhance the specification. As an example, an optional TLV has been included to allow for the measurement of both loss and delay/jitter with a single test. The implementation does not cause any interoperability issues because the optional TLV is ignored by equipment that does not support it. In mixed-vendor environments, loss measurement continues to be tracked, but delay and jitter only report round-trip times. Note that the round-trip times in this mixed-vendor environment include the remote node's processing time, because only two timestamps are included in the packet. In an environment where both nodes support the optional TLV to include timestamps, unidirectional and round-trip times are reported. Because all four timestamps are included in the packet, the round-trip time in this case does not include remote node processing time. Operators that want to run delay measurement and loss measurement at different frequencies are free to run both ETH-SL and ETH-DM functions; ETH-SL is not replacing ETH-DM. Service Assurance is only briefly discussed here to provide some background on the basic functionality.

The ETH-SL packet format contains a test-id that is internally generated and not configurable. The test-id is visible for the on-demand test in the display summary. It is possible that a remote node processing the SLM frames receives overlapping test-ids as a result of multiple MEPs measuring loss against the same remote MEP. For this reason, the uniqueness of the test is based on remote MEP-ID, test-id, and source MAC of the packet.

ETH-SL is applicable to up and down MEPs and as per the recommendation transparent to MIPs. There is no coordination between various fault conditions that could impact loss measurement. This is also true for conditions where MEPs are placed in shutdown state as a result of linkage to a redundancy scheme like MC-LAG. Loss measurement is based on the ETH-SL and not coordinated across different functional aspects on the network element. ETH-SL is supported on service based MEPs.

It is possible that two MEPs may be configured with the same MAC on different remote nodes. This causes various issues in the FDB for multipoint services and is considered a misconfiguration for most services. It is possible to have a valid configuration where multiple MEPs on the same remote node have the same MAC. In fact, this is somewhat likely. Only the first responder is used to measure packet loss. The second responder is dropped, because the same MAC for multiple MEPs is only truly valid on the same remote node.

There is no way for the responding node to understand when a test is completed. For this reason, a configurable inactivity-timer determines the length of time a test is valid. The timer maintains an active test as long as it is receiving packets for that specific test, defined by the test-id, remote MEP ID, and source MAC. When there is a gap between the packets that exceeds the inactivity-timer value, the responding node releases the index in the table and responds with a sequence number of 1, regardless of the sequence number sent by the instantiating node. Expiration of this timer causes the reflecting peer to expire the previous test; packets that follow the expiration are viewed as a new test. The default for the inactivity-timer is 100 seconds and the range is ten to 100 seconds.
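The responder-side behavior can be modeled with a short sketch (illustrative Python; the table layout and names are invented for the example):

```python
class SlmResponder:
    """Sketch of the responder-side test table, keyed by (test-id,
    remote MEP ID, source MAC) and expired by the inactivity timer."""

    def __init__(self, inactivity_timer=100):   # default 100 s, range 10-100
        self.inactivity_timer = inactivity_timer
        self.tests = {}                         # key -> (last_rx, next_seq)

    def on_slm(self, test_id, remote_mep_id, src_mac, now):
        """Process an SLM probe; return the sequence number to respond with.

        A gap exceeding the inactivity timer releases the table entry, so
        the response restarts at sequence number 1 regardless of the
        sequence number sent by the instantiating node.
        """
        key = (test_id, remote_mep_id, src_mac)
        entry = self.tests.get(key)
        if entry is None or now - entry[0] > self.inactivity_timer:
            seq = 1                             # new test, or expired test
        else:
            seq = entry[1]
        self.tests[key] = (now, seq + 1)
        return seq
```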

Only the configuration is supported by HA. There is no synchronization of data between active and standby. Any unwritten, or active tests are lost during a switchover and the data is not recoverable.

ETH-SL provides a mechanism for operators to pro-actively trend packet loss.

ITU-T Y.1731 ETH-LMM

The Ethernet Frame Loss Measurement (ETH-LMM) allows the collection of frame counters to determine the unidirectional frame loss between point-to-point ETH-CFM MEP peers. This measurement does not count its own PDU to determine frame loss. The ETH-LMM protocol PDU includes four counters which represent the data sent and received in each direction: Transmit Forward (TxFCf), Receive Forward (RxFCf), Transmit Backward (TxFCb) and the Receive Backward (RxFCb).

The ETH-LMM protocol is designed specifically for point-to-point connections. It is impossible for the protocol to accurately report loss if the point-to-point relationship is broken; for example, if a SAP or MPLS binding receives data from multiple peers, as can be the case in VPLS deployments, this protocol would not be a reliable indicator of frame loss.

The loss differential between transmit and receive is determined the first time an LMM PDU is sent. Each subsequent PDU for a specific test performs a computation of differential loss from that epoch. Each processing cycle for an LMR PDU determines if there is a new maximum or minimum loss window, adds any new loss to the frame loss ratio computation, and updates the four raw transmit and receive counters. The individual probe results are not maintained; these results are only used to determine a new minimum or maximum. A running total of all transmit and receive values is used to determine the average Frame Loss Ratio (FLR) at the completion of the measurement interval. The data set includes the protocol information in the opening header, followed by the frame counts in each direction, and finally the FLR percentages.
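The differential computation over the four counters can be sketched as follows (illustrative Python; the dictionary layout is invented for the example, while the counter names come from the PDU fields above):

```python
def lmm_interval_loss(prev, curr):
    """Differential frame loss between two consecutive LMR samples.

    prev/curr are dicts holding the four raw counters. Far-end loss is the
    transmit delta minus the receive delta in the forward direction;
    near-end loss is the same in the backward direction. Absolute values
    are used, since reordering can produce apparent frame gain (a negative
    value) in an interval.
    """
    far_end = (curr["TxFCf"] - prev["TxFCf"]) - (curr["RxFCf"] - prev["RxFCf"])
    near_end = (curr["TxFCb"] - prev["TxFCb"]) - (curr["RxFCb"] - prev["RxFCb"])
    return abs(far_end), abs(near_end)

def flr(total_loss, total_tx):
    """Average Frame Loss Ratio (percent) over the measurement interval."""
    return 100.0 * total_loss / total_tx if total_tx else 0.0
```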

The user must understand the restrictions of service before selecting this method of loss measurement. Statistics are maintained per forwarding complex. Multiple path environments may spread frames between the same two peers across different forwarding complexes (for example, link aggregation groups). The ETH-LMM protocol has no method to rationalize different transmit and receive statistics when there are complex changes or when any statistics are cleared on either of the peer entities. The protocol resynchronizes but the data collected for that measurement interval is invalid. The protocol has no method to determine if the loss is true loss or whether some type of complex switch has occurred or statistics were cleared. Consequently, the protocol cannot use any suspect flag to mark the data as invalid. Higher level systems must coordinate network events and administrative actions that can cause the counters to become non-representative of the service data loss.

Packet reordering also affects frame loss and gain reporting. If there is queuing contention on the local node or if path differences in the network cause interleaved or delayed frames, the counter stamped into the LMM PDU can introduce frame gain or loss in either direction. For example, if the LMM PDU is stamped with the TxFCf counter and the LMM PDU traffic is interleaved, the interleaving cannot be accounted for in the counter and a potential gain is realized in the forward direction. This is because the original counter included as the TxFCf value does not include the interleaved packets, while the RxFCf counter on the remote peer includes them. Gains and losses even out over the life of the measurement interval. Absolute values are used for any negative values, per interval or at the end of the measurement interval.

Launching a single-ended test is under the control of the OAM Performance Monitoring (OAM-PM) architecture, and the test adheres to the rules of OAM-PM. The ETH-LMM functionality is only available under the OAM-PM configuration. This feature is not available through interactive CLI or SAA. OAM-PM requires the configuration of a test ID for all OAM-PM tests. The ETH-LMM protocol does not define the necessity for this ID, nor does it carry the 4-byte test ID in the packet. This is for local significance and uniformity with other protocols under the control of the OAM-PM architecture.

Support is included for point-to-point Up and Down Service MEPs and Down Facility MEPs (port, LAG, and base router interfaces). Base router interface accuracy may be affected by the Layer 2 or Layer 3 inter-working functions, routing protocol, ACLs, QoS policies, and other Layer 3 functions that were never meant to be accounted for by an Ethernet frame loss measurement tool. Launch functions require IOM/IMM or later, as well as a SF/CPM3 or later.

Resource contention extends beyond the sharing of common LMM resources used for packet counting and extraction. There is also protocol-level contention. For example, Cflowd cannot be counted or sampled on an entity that is collecting LMM statistics. Collection of statistics per Ethernet SAP, per MPLS SDP binding, or per facility is not enabled by default.

ETH-LMM is not supported in the following models:

  • up MEPs in an I-VPLS or PBB Epipe that crosses a PBB infrastructure. This configuration results in LMM PDUs being discarded on the remote BVPLS node.

  • ETH-LMM when primary VLANs are configured against the MEP

  • nonoperational SAP or MPLS SDP bindings over which the Up MEP is configured. This configuration causes LMM or LMR transmissions to fail because the SAP which stores the counters is unavailable to the LMM PDU.

QinQ tunnel collection is the aggregate of all outer VLANs that share the VLAN with the tunnel. If the QinQ is configured to collect LMM statistics, then any service MEP that shares the same VLAN as the QinQ tunnel is blocked from configuring the respective collect-lmm-stats command. The reverse is also true; if a fully qualified SAP is configured to collect LMM statistics, the QinQ tunnel that shares the outer VLAN is blocked from configuring the respective collect-lmm-stats command.

QoS models contribute significantly to the accuracy of the LMM counters. If the QoS function is beyond the LMM counting function, it can lead to mismatches in the counter and transmit and receive information.

ETH-LMM single SAP counter

A single LMM counter per SAP or per MPLS SDP binding or per facility counter is the most common option for deployment of the LMM frame-based counting model. This single counter model requires careful consideration for the counter location. Counter integrity is lost when counting incurs entity conflicts, as is typical in facility MEP and service MEP overlap. The operator must choose one type of facility MEP or the service MEP. If a facility MEP is chosen (Port, LAG, QinQ Tunnel or Base Router Interface) care must be taken to ensure the highest configured MEP performs the loss collection routine.

Configuring loss collection on a lower-level MEP leads to additive gain introduced in both directions. Although the collection statement is not blocked by CLI or SNMP when there are potential conflicts, only one can produce accurate results, so the operator must be aware of lower-level resource conflicts. For example, a null-based service SAP, or any default SAP context or SAP that covers the entire port or facility resource, such as sap 1/1/1, always counts the frame-based loss counter against the SAP and never the port, regardless of the presence of a MEP or the collect-lmm-stats configuration on the SAP. Resource contention extends beyond the sharing of common resources used for packet counting and extraction.

For this feature to function with accurate measurements, the collect-lmm-stats is required under the ETH-CFM context for the Ethernet SAP or MPLS SDP binding or under the MEP in the case of the facility MEP. If this command is not enabled on the launch and reflector, the data in the ETH-LMM and ETH-LMR PDU is not representative and the data captured is invalid.

The show>service>sdp-using eth-cfm and show>service>sap-using eth-cfm commands have been expanded to include the collect-lmm-stats option for service based MEPs. The show>eth-cfm>cfm-stack-table facility command has been expanded to include collect-lmm-stats to view all facility MEPs. Using these commands with this new option displays the entities that are currently collecting LMM counters.

The counter includes all frames that are transmitted or received regardless of class of service or discard eligibility markings. Locally transmitted and locally terminated ETH-CFM frames on the peer collecting the statistics are not included in the counter. However, there are deployment models that introduce artificial frame loss or gain when the ETH-CFM launch node and the terminating node for some ETH-CFM packets are not the same peers. Mismatched LMM statistical counters demonstrates this issue.

Figure 31. Mismatched LMM statistical counters

ETH-LMM per forwarding class counter

Frame loss measurement can be deployed per forwarding class (FC) counter. The config>oam-pm>session>ethernet>lmm>enable-fc-collection command in the related oam-pm session enables frames to be counted on an FC basis, either in or out of profile. This counting method alleviates some of the ordering and interleaving issues that arise when using a single counter, but does not improve on the base protocol concerns derived from multiple paths and complex based counting.

This approach requires the operator to configure the individual FCs of interest and the profile status of the frames under the collect-lmm-fc-stats context. The command allows for the addition or removal of an individual FC by using a differential. The entire command with the wanted FC statements must be included. The system determines the new, deleted, and unchanged FCs. New FCs are allocated a counter. Deleted FCs stop counting. Unchanged FCs continue counting.

Support for per-FC collection includes SAPs, MPLS SDP bindings, and router interfaces.

The enable-fc-collection command must be coordinated between the ETH-LMM test and counting model to configure either single per SAP or MPLS SDP binding counter, or per FC counter. The command is disabled by default, and single per SAP or MPLS SDP binding counter is used.

Symmetrical QoS is required for correct collection of frame counters. The FC must match the priority of the OAM-PM ETH-LMM test. The ETH-LM PDUs must ensure that they are mapped to the correct FC on ingress and egress so that the appropriate counters are collected. Mismatches between the ETH-LMM PDUs and the collected FC cause incorrect or no data to be reported.

The show>eth-cfm>collect-lmm-fc-stats command displays the SAPs, MPLS SDP bindings, and router interfaces that are configured for per-FC collection, and whether the collection is priority aware or unaware. It also includes the base mapping of OAM-PM ETH-LMM priority to FC.

Interaction between single and per FC counters

Entities that support LMM collection may only use one of the following collection models:

  • single counter (collect-lmm-stats)

  • per FC counter (collect-lmm-fc-stats)

The collect-lmm-stats and collect-lmm-fc-stats commands are mutually exclusive.

OAM-PM rejects ETH-LMM test configurations from the same source MEP that have different enable-fc-collection configurations.

Ensure that the LMM collection model that is configured on the entity (collect-lmm-stats or collect-lmm-fc-stats) matches the configuration of the enable-fc-collection command within the OAM-PM session, and that the priority of the test maps to the required FC.

ETH-CFM destination options

ETH-CFM relies on Ethernet addressing and reachability. ETH-CFM destination addressing may be derived from the Ethernet encapsulation, or may be a target address within the ETH-CFM PDU. Addressing is the key to identifying both the source and the destination management points (MPs).

The SR OS implementation dynamically assigns the MP MAC address using the appropriate pool of available hardware addresses on the network element, which simplifies the configuration and maintenance of the MP. The MP MAC address is tied to the specific hardware element, and its addressing can change when the associated hardware is changed.

The optional mac-address mac-address configuration command can be used to eliminate the dynamic nature of the MEP MAC addressing. This optional configuration associates a configured MAC address with the MEP in place of dynamic hardware addressing. The optional mac-address configuration is not supported for all service types.

ETH-CFM tests can adapt to changing destination MAC addressing by using the remote-mepid mep-id command in place of the unicast statically-configured MAC address. SR OS maintains a learned remote MAC table (visible by using the show>eth-cfm>learned-remote-mac command) for all MEPs that are configured to use ETH-CC messaging. Usually, when the remote-mepid mep-id command is used as part of a supported test function, the test searches the learned remote MAC table for a unicast address that associates the local MEP and the requested remote MEP ID. If a unicast destination address is found for that relationship, it is used as the unicast destination MAC address.

The learned remote MAC table is updated and maintained by the ETH-CC messaging process. When an address is learned and recorded in the table, it is maintained even if the remote peer times out or the local MEP is shut down. The address is not maintained in the table if the remote-mepid statement is removed from the associated context by using the no remote-mepid mep-id command for a peer. The CCM database clears the peer MAC address and enters an all-0 MAC address for the entry when the peer times out. The learned remote MAC table maintains the previously learned peer MAC address. If an entry must be deleted from the learned remote MAC table, the clear>learned-remote-mac [mep mep-id [remote-mepid mep-id]] domain md-index association ma-index command can be used. Deleting a local MEP removes the local MEP and all remote peer relationships, including the addresses previously stored in the learned remote MAC table.

The individual ETH-CFM test scheduling functions that use the remote-mepid mep-id option have slightly different operational behaviors.

Global interactive CFM tests support the remote-mepid mep-id option as an alternative to mac-address. A test only starts if a learned remote MAC table contains a unicast MAC address for the remote peer, and runs to completion with that MAC address. If the table does not contain the required unicast entry associated with the specified remote MEP ID, the test fails to start.

SAA ETH-CFM test types support the remote-mepid mep-id option as an alternative to mac-address. If, at the scheduled start of the individual run, the learned remote MAC table contains a unicast learned remote MAC address for the remote peer, the test runs to completion with the initial MAC address. If the table does not contain the required entry, the test terminates after the lesser window of either the full test run or 300 s. A run that cannot successfully determine a unicast MAC address designates the last test result as ‟failed”. If a test is configured with the continuous configuration option, it is rescheduled; otherwise, the test is not rescheduled.

OAM-PM Ethernet test families, specifically DMM, SLM, and LMM, support the remote-mepid mep-id option as an alternative to the dest-mac ieee-address configuration. If the learned remote MAC table contains a unicast learned remote MAC address for the remote peer, the test uses this MAC address as the destination. OAM-PM adapts to changes for MAC addressing during the measurement interval when the remote-mepid mep-id option is configured. It should be expected that the measurement interval includes update-induced PM errors during the transition. If the table does not contain the required entry, the test does not attempt to transmit test PDUs, and presents the ‟Dest Remote MEP Unknown” detectable transmission error.

ITU-T Y.1731 ETH-BN

The Ethernet Bandwidth Notification (ETH-BN) function is used by a server MEP to signal changes in link bandwidth to a client MEP.

This functionality is used with point-to-point microwave radios to modify the downstream traffic rate toward the radio to match its microwave link rate. When a microwave radio uses adaptive modulation, the capacity of the radio can change based on the condition experienced by the microwave link. For example, in adverse weather conditions that cause link degradation, the radio can change its modulation scheme to a more robust one (which reduces the link bandwidth) to continue transmitting. This change in bandwidth is communicated from the server MEP on the radio, using ETH-BN Message (ETH-BNM), to the client MEP on the connected router. The server MEP transmits periodic messages with ETH-BN information including the interval, the nominal, and currently available bandwidth. A port MEP with the ETH-BN feature enabled processes the information in the CFM PDU. The operational port egress rate can be modified to adjust the rate of traffic sent to the radio.

A port MEP supports the client side reception and processing of the ETH-BNM sent by the server MEP. By default, processing is disabled. A port that can process an ETH-BNM is a configuration specific to that port, even when the port is a LAG member port. The ETH-BN configuration on the LAG member ports does not have to be the same. However, mismatches in the configuration on these member ports could lead to significant differences in operational egress rates within the same LAG. Different operational rates on the LAG member ports as a result of ETH-BN updates are not considered when hashing packets to the LAG member ports.

The configure port ethernet eth-cfm mep eth-bn receive CLI command and its no form set the ETH-BN processing state on the port MEP. A port MEP supports untagged packet processing of ETH-CFM PDUs at domain levels 0 and 1 only. The port client MEP sends the ETH-BN rate information received to be applied to the port egress rate in a QoS update. A pacing mechanism limits the number of QoS updates sent. The configure port ethernet eth-cfm mep eth-bn rx-update pacing CLI command allows the updates to be paced using a configurable range of 1 to 600 seconds (the default is 5 seconds). The pacing timer begins to count down following the most recent QoS update sent to the system for processing. When the timer expires, the most recent update that arrived from the server MEP is compared to the most recent value sent for system processing. If the value of the current bandwidth is different from the previously processed value, the update is sent and the process begins again. Updates with a different current bandwidth that arrive when the pacing timer has already expired are not subject to a timer delay. See the 7450 ESS, 7750 SR, 7950 XRS, and VSR Interface Configuration Guide for more information about these commands.
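The pacing behavior can be modeled with a short sketch (illustrative Python; an update arriving while the timer is running is simply held here, and the sending of a held update on timer expiry is not modeled):

```python
class BnPacer:
    """Simplified model of ETH-BN QoS update pacing."""

    def __init__(self, pacing=5):                # configurable 1-600 s
        self.pacing = pacing
        self.last_sent_bw = None                 # last value sent for QoS
        self.timer_expiry = 0.0                  # pacing timer not running
        self.pending_bw = None                   # update held by the timer

    def on_bnm(self, current_bw, now):
        """Process a BNM arrival; return the bandwidth to send for QoS
        processing, or None when paced or unchanged."""
        if now < self.timer_expiry:              # timer still counting down
            self.pending_bw = current_bw         # hold the most recent value
            return None
        if current_bw == self.last_sent_bw:      # no change: nothing to send
            return None
        self.last_sent_bw = current_bw
        self.timer_expiry = now + self.pacing    # countdown restarts on send
        return current_bw
```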

A complementary QoS configuration is required to allow the system to process nominal bandwidth updates from the CFM engine. The configure port ethernet eth-bn-egress-rate-changes CLI command is required to enable the QoS function to update the port egress rates based on the current available bandwidth updates from the CFM engine. By default, the function is disabled.

Both the CFM and the QoS functions must be enabled for the changes in current bandwidth to dynamically update the egress rate.

When the MEP enters a state that prevents it from receiving the ETH-BNM, the current bandwidth last sent for processing is cleared and the egress rate reverts to the configured rate. Under these conditions, the last update cannot be guaranteed as current. Explicit notification is required to dynamically update the port egress rate. The following types of conditions lead to ambiguity:

  • administrative MEP shutdown

  • port admin down

  • port link down

  • eth-bn no receive transitioning the ETH-BN function to disable

If the configure port ethernet eth-bn-egress-rate-changes command is disabled using the no option, CFM continues to send updates, but the updates are held without affecting the port egress rate.

The ports supporting ETH-BN MEPs can be configured for network, access, or hybrid modes. When ETH-BN is enabled on a port MEP and the config>port>ethernet>eth-cfm>mep>eth-bn>receive and the QoS config>port>ethernet>eth-bn-egress-rate-changes contexts are configured, the egress rate is dynamically changed based on the current available bandwidth indicated by the ETH-BN server.

The port egress rate is capped by the minimum of the configured egress rate and the maximum port rate. The minimum egress rate is 1 kb/s. If a current bandwidth of zero is received, it does not affect the egress port rate and the previously processed current bandwidth continues to be used.
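The rate selection described above can be sketched as a small function. This is illustrative only; the function name, the kb/s units, and the minimum-rate constant are assumptions, and the unit conversion from the BNM bandwidth fields is omitted.

```python
def effective_egress_rate(configured_rate, max_port_rate,
                          current_bandwidth, previous_bandwidth):
    """Illustrative calculation of the port egress rate under ETH-BN.

    All rates here are assumed to be in kb/s. A received current
    bandwidth of zero is ignored and the previously processed value
    continues to be used."""
    MIN_EGRESS_RATE = 1  # minimum egress rate (assumed here to be 1 kb/s)
    # A current bandwidth of zero does not affect the egress port rate.
    bandwidth = previous_bandwidth if current_bandwidth == 0 else current_bandwidth
    # Cap by the minimum of the configured egress rate and the max port rate.
    rate = min(bandwidth, configured_rate, max_port_rate)
    return max(rate, MIN_EGRESS_RATE)
```
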

The client MEP requires explicit notification of changes to update the port egress rate. The system does not time out previously processed current bandwidth rates. The specification does allow a timeout of the current bandwidth if a message has not been received within 3.5 times the ETH-BN interval; however, this implicit approach can lead to misrepresented conditions and has not been implemented.

When starting or restarting the system, the configured egress rate is used until an ETH-BNM arrives on the port with a new bandwidth request from the ETH-BN server MEP.

An event log is generated each time the egress rate is changed based on reception of an ETH-BNM. If an ETH-BNM is received that does not result in a bandwidth change, no event log is generated.

The destination MAC address can be a Class 1 multicast MAC address (that is, 01-80-C2-00-00-3x) or the configured MAC address of the port MEP. Standard CFM validation and identification must be successful to process any CFM PDU.

For information about the eth-bn-egress-rate-changes command, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Interface Configuration Guide.

The PDU used for ETH-BN information is called the Bandwidth Notification Message (BNM). It is identified by a sub-OpCode within the Ethernet Generic Notification Message (ETH-GNM).

BNM PDU format fields shows the BNM PDU format fields.

Table 7. BNM PDU format fields
Label Description

MEG Level

Carries the MEG level of the client MEP (0 to 7). This field must be set to either 0 or 1 to be recognized as a port MEP.

Version

The current version is 0.

OpCode

The value for this PDU type is GNM (32).

Flags

Contains one information element: Period (3 bits) to indicate how often ETH-BNM messages are transmitted by the server MEP. Valid values are:

  • 100 (1 frame/s)

  • 101 (1 frame/10 s)

  • 110 (1 frame/min)

TLV Offset

This value is set to 13.

Sub-OpCode

The value for this PDU type is BNM (1).

Nominal Bandwidth

The nominal full bandwidth of the link, in Mbytes/s.

This information is reported in the display but not used to influence QoS egress rates.

Current Bandwidth

The current bandwidth of the link in Mbytes/s. The value is used to influence the egress rate.

Port ID

A non-zero unique identifier for the port associated with the ETH-BN information, or zero if not used.

This information is reported in the display, but is not used to influence QoS egress rates.

End TLV

An all zeros octet value.
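The field layout in the table above can be illustrated with a minimal parser sketch. This is not SR OS code; it assumes the standard CFM common-header layout (MEG level and version sharing the first octet) and performs only the structural checks named in the table.

```python
import struct

def parse_bnm(pdu: bytes) -> dict:
    """Parse a BNM PDU according to the field layout in the table above.

    Illustrative only; a real implementation must also perform the
    standard CFM validation described in this section."""
    # Common CFM header: MEG level (3 bits) + Version (5 bits), OpCode,
    # Flags, TLV Offset.
    first, opcode, flags, tlv_offset = struct.unpack_from("!BBBB", pdu, 0)
    meg_level = first >> 5
    version = first & 0x1F
    if opcode != 32:
        raise ValueError("not a GNM PDU (OpCode must be 32)")
    if tlv_offset != 13:
        raise ValueError("unexpected TLV Offset (must be 13 for BNM)")
    sub_opcode, nominal, current, port_id = struct.unpack_from("!BIII", pdu, 4)
    if sub_opcode != 1:
        raise ValueError("not a BNM PDU (Sub-OpCode must be 1)")
    return {
        "meg_level": meg_level,            # must be 0 or 1 for a port MEP
        "version": version,                # current version is 0
        "period": flags & 0x07,            # Period: 3 bits of the Flags field
        "nominal_bandwidth": nominal,      # reported in the display only
        "current_bandwidth": current,      # used to influence the egress rate
        "port_id": port_id,                # zero if not used
    }
```
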

The show eth-cfm mep eth-bandwidth-notification display output includes the ETH-BN values received and extracted from the PDU, including the last reported value and the pacing timer. If n/a appears in the field, it means that field has not been processed.

The base show eth-cfm mep output is expanded to include the disposition of the ETH-BN receive function and the configured pacing timer.

The show port port-id detail is expanded to include an ETH-BNM section. This section includes the egress rate disposition and the current egress BN rate being used.

ETH-CFM statistics

A number of statistics are available to view the current overall processing requirements for CFM. Any packet that is counted against the CFM resource is included in the statistics counters. These counters do not include sub-second CCM, ETH-CFM PDUs generated by non-ETH-CFM functions (including OAM-PM and SAA), or PDUs filtered by an applicable security configuration.

SAA and OAM-PM use standard CFM PDUs. The reception of these packets is included in the receive statistics. However, these two functions are responsible for launching their own test packets and do not consume ETH-CFM transmission resources.

Per-system and per-MEP statistics are available with a per-OpCode breakdown. Use the show>eth-cfm>statistics command to view the statistics at the system level. Use the show>eth-cfm>mep mep-id domain md-index association ma-index statistics command to view the per-MEP statistics. These statistics may be cleared by substituting the clear command for the show command. The clear function only clears the statistics for that function. For example, clearing the system statistics does not clear the individual MEP statistics; each maintains its own unique counters.

All known OpCodes are listed in transmit and receive columns. Different versions of the same OpCode are not distinguished in this display. This does not imply that the network element supports all functions listed in the table. Unknown OpCodes are dropped.

It is also possible to view the top ten active MEPs on the system; in this context, active means any MEP that is in a no shutdown state. The tools dump eth-cfm top-active-meps command can be used to see the top ten active MEPs on the system. The counts are based on the period since the command was last issued with the clear option. MEPs that are in a shutdown state still terminate packets, but they do not appear on the active list.

These statistics help operators to determine the busiest active MEPs on the system, as well as a breakdown of per-OpCode processing at the system and MEP levels.

ETH-CFM packet debug

The debug infrastructure supports the decoding of both received and transmitted valid ETH-CFM packets for MEPs and MIPs that have been tagged for decoding. The eth-cfm hierarchy has been added to the existing debug CLI command tree. When a MEP or MIP is tagged by the debug process, valid ETH-CFM PDUs are decoded and presented to the logging infrastructure for operator analysis. Fixed queue limits restrict the overall packet rate for decoding. The receive and transmit ETH-CFM debug queues are serviced independently. Receive and transmit correlation is not guaranteed across the receive and transmit debug queues. The tools dump eth-cfm debug-packet command displays message queue exceptions.

Valid ETH-CFM packets must pass a multiple-phase validity check before being passed to the debug parsing function. The MAC addresses must be non-zero. If the destination MAC address is multicast, the last nibble of the multicast address must match the expected level of a MEP or MIP tagged for decoding. Packet length and TLV formation, usage, and, where applicable, field validation are performed. Finally, the OpCode-specific TLV structural checks are performed against the remainder of the PDU.
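The multicast level check from the validity phase above can be sketched as follows. This is illustrative only; it assumes the CFM convention that the final nibble of the CFM multicast destination address encodes the MD level, and the function name is hypothetical.

```python
def multicast_level_ok(dest_mac: bytes, tagged_levels: set) -> bool:
    """Illustrative sketch of the debug validity check: for a multicast
    destination MAC, the last nibble must match the level of a MEP or MIP
    tagged for decoding. MAC addresses must be non-zero."""
    if dest_mac == b"\x00\x00\x00\x00\x00\x00":
        return False                      # zero MAC addresses fail validation
    is_multicast = bool(dest_mac[0] & 0x01)
    if not is_multicast:
        return True                       # unicast: level check does not apply here
    level = dest_mac[5] & 0x0F            # last nibble of the multicast address
    return level in tagged_levels
```
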

An ETH-CFM packet that passes the validation process is passed to the debug decoding process for tagged MEPs or MIPs. The decoding process parses the PDU for analysis. Truncation of individual TLVs occurs when:

  • TLV processing requires multiple functions; this occurs with TLVs that include sub-fields

  • an Organizational Specific TLV exists

  • padding has been added, as in the case of the optional Data or Test TLVs

  • an unknown OpCode is detected; the decode procedure processes the generic ETH-CFM header with a hex dump for unknown fields and TLVs

The number of printable bytes is dependent on the reason for truncation.

Any standard fields in the PDU that are defined for a specific length with a Must Be Zero (MBZ) attribute in the specification are decoded based on the specification field length. There is no assumption that packets adhere to the MBZ requirement in the byte field; for example, the MEP-ID is a 2-byte field with three reserved MBZ bits, which translates into a standard MEP-ID range of 0 to 8191. If the MBZ bits are violated, then the 2-byte field is decoded using all non-zero bits in the 2-byte field.

The decoding function is logically positioned between ETH-CFM and the forwarding plane. Any ETH-CFM PDU discarded by an applicable security configuration is not passed to the debug function. Any packet that is discarded by squelching (using the config>service>sap>eth-cfm>squelch-ingress-levels and config>service>sap>eth-cfm>squelch-ingress-ctag-levels commands) or CPU protection (using the config>service>sap>eth-cfm>cpu-protection eth-cfm-monitoring command), bypasses the decoding function. Care must be taken when interpreting specific ETH-CFM PDU decodes. Those PDUs that have additional, subsequent, or augmented information applied by the forwarding mechanisms may not be part of the decoded packet. Augmentation includes the timestamp (the stamping of hardware based counters [LMM]) applied to ETH-CFM PDUs by the forwarding plane.

This function allows enhanced troubleshooting for ETH-CFM PDUs to and from tagged MEPs and MIPs. Only defined and node-supported functionality is decoded, possibly with truncation. Unsupported or unknown functionality on the node is treated on a best-effort basis, typically handled with a decode producing a truncated number of hex bytes.

This functionality does not support decoding of sub-second CCM, or any ETH-CFM PDUs that are processed by non-ETH-CFM entities (which includes SAA CFM transmit functions), or MIPs created using the default-domain table.

ETH-CFM CoS considerations

UP MEPs and DOWN MEPs have been aligned to better emulate service data. When an UP MEP or DOWN MEP is the source of the ETH-CFM PDU, the priority value configured as part of the MEP or specific test configuration is treated as the Forwarding Class (FC) by the egress QoS policy. The numerical ETH-CFM priority value resolves to FCs using the following mapping:

  • 0 — be

  • 1 — l2

  • 2 — af

  • 3 — l1

  • 4 — h2

  • 5 — ef

  • 6 — h1

  • 7 — nc
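The mapping above can be restated as a simple lookup. This is an illustrative restatement with hypothetical names, not SR OS code.

```python
# ETH-CFM priority value to Forwarding Class (FC) mapping, as listed above.
PRIORITY_TO_FC = {
    0: "be", 1: "l2", 2: "af", 3: "l1",
    4: "h2", 5: "ef", 6: "h1", 7: "nc",
}

def resolve_fc(priority: int) -> str:
    """Resolve a configured ETH-CFM priority value (0 to 7) to an FC name."""
    return PRIORITY_TO_FC[priority]
```
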

If there is no egress QoS policy, the priority value is mapped to the CoS values in the frame. An ETH-CFM frame using VLAN tags has its DEI bit marking the frame as ‟discard ineligible”; however, the egress QoS policy may overwrite this original value. The Service Assurance Agent (SAA) uses [fc {fc-name} [profile {in | out}]] to accomplish similar functionality.

UP MEPs and DOWN MEPs terminating an ETH-CFM PDU use the received FC as the return priority for the appropriate response, again feeding into the egress QoS policy as the FC.

This does not include the Ethernet Linktrace Response (ETH-LTR). The specification requires that the highest priority on the bridge port be used in response to an Ethernet Linktrace Message (ETH-LTM), giving the response the highest possible chance of returning to the source. Operators may configure the linktrace response priority of the MEP using the ccm-ltm-priority command. MIPs inherit the MEP's priority unless mip-ltr-priority is configured under the bridging instance for the association (config>eth-cfm>domain>assoc>bridge).

Silent CFM dropping with squelching

The ETH-CFM architecture defines the hierarchy that supports separation of Ethernet CFM OAM domains of responsibility. Typically, encapsulation methods are used to tunnel traffic transparently through intermediate segments. In a network topology as shown in ETH-CFM CPE to CPE, CE traffic arrives at the aggregation node and is encapsulated with a service-provider tag that hides the customer-specific tag as the packet moves through segments of the network. This method treats the ETH-CFM traffic in the same manner. The application of additional tags prevents ETH-CFM conflicts in the various Ethernet CFM OAM domains, even if the domain levels (in this example, level 4) are reused.

Figure 32. ETH-CFM CPE to CPE

In some scenarios, this additional tagging principle is not applied, which may result in conflicts and collisions. For example, as shown in ETH-CFM collision between OAM domains, an additional pair of Domain Level 4 UP MEPs are configured on the aggregation nodes. These aggregation nodes are c-tag aware. The ETH-CFM packets transmitted from the CE pass transparently through the passive side of the MEP (the side facing away from the ETH-CFM packet transmission) and arrive on the active side of the unintended peer MEP. This could cause a number of defect conditions to occur on the unexpectedly terminating MEP and on the unreachable MEP. For simplicity, only the direction from the left CE toward the right side of the network is shown, although the problem exists in both directions.

Figure 33. ETH-CFM collision between OAM domains

These issues can be resolved through communication of ETH-CFM domain-level ownership using a business agreement. However, this communication is only a business agreement and could be violated by misconfiguration. Network-level enforcement of this agreement is important to protect both the ETH-CFM OAM domains of responsibility.

ETH-CFM ingress squelching capabilities are available to enforce the agreement and prevent unwanted ETH-CFM packets from entering a domain of responsibility that should not be exposed to ETH-CFM packets from outside its domain. Enforcement of the CFM level agreement using squelching shows the generic enforcement of the agreement using squelching. In this agreement, Domain Levels 4 and lower are reserved by the provider of the EVC. Domain Levels 5 and above are outside the EVC provider’s scope and must pass transparently through the Ethernet CFM OAM domain. The EVC provider’s boundaries are configured to enforce this agreement and silently discard all ETH-CFM packets that arrive on the ingress points of the boundary at Domain Level 4 and below.

Figure 34. Enforcement of the CFM level agreement using squelching

Two different squelch functions are supported, using the squelch-ingress-levels command and the squelch-ingress-ctag-levels command.

The squelch-ingress-levels command configures an exact service delineation SAP and binding match at the ingress followed immediately by Ethernet type 0x8902. This configuration silently discards the appropriate ETH-CFM packets according to the configured levels of the command, regardless of the presence of an ingress ETH-CFM management point, MEP or MIP. This squelch function occurs before other ETH-CFM packet processing functions.

The squelch-ingress-ctag-levels command is supported for Epipe and VPLS services only. It configures an exact service delineation SAP and binding match at the ingress, skipping one additional tag (for a maximum total tag length of two tags), followed by Ethernet type 0x8902. This configuration silently discards the appropriate ETH-CFM packets according to the configured squelch levels when an additional tag beyond the service delineation exists. It ignores the value of the additional tag, exposing that entire range to this squelch function, if there is no ingress ETH-CFM management point (MEP or MIP) at one of the configured levels covered by the squelch configuration. This squelch function differs from the option configured by squelch-ingress-levels, because it occurs after the processing of an ingress MEP or ingress MIP configured with a primary VLAN within the configured squelch levels. When a primary VLAN ingress MEP or ingress MIP is configured at a VLAN within the squelch level, that entire primary VLAN ETH-CFM function follows regular ETH-CFM primary VLAN rules. In this configuration, the ingress ETH-CFM packets that do not have an ingress MEP or ingress MIP configured for that VLAN are exposed to the squelching rules instead of the primary VLAN rules of ETH-CFM processing. In this case, ETH-CFM primary VLAN ingress processing occurs before the squelch-ingress-ctag-levels functions.

Both variants can be configured together on supported connections and within their supported services. Logical processing chain and interaction shows the logical processing chain and interaction using an ingress QinQ SAP in the form VID.* and various ingress ETH-CFM MEPs. Although not shown in the figure, the processing rules are the same for ingress MIPs, which are ETH-LBM (loopback) and ETH-LTM (linktrace) aware.

Figure 35. Logical processing chain and interaction

There is no requirement to configure ingress MEPs or ingress MIPs if the goal is simply to silently discard ETH-CFM packets matching a domain level criterion. The squelch commands require contiguous levels configuration.
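A minimal sketch of the squelch decision, assuming the configured levels form a contiguous range starting at level 0 (the names here are hypothetical, not SR OS code):

```python
def squelch_discard(packet_level: int, squelch_levels: range) -> bool:
    """Illustrative squelch decision: silently discard an ingress ETH-CFM
    packet whose MD level falls within the configured contiguous levels.

    The squelch commands require a contiguous levels configuration,
    modeled here as range(0, n + 1) for a highest squelched level n."""
    return packet_level in squelch_levels

# Example: the EVC provider squelches domain levels 4 and below at its
# boundaries, letting levels 5 and above pass transparently.
provider_squelch = range(0, 5)
```
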

OAM mapping

OAM mapping is a mechanism that enables deploying OAM end-to-end in a network where different OAM tools are used in different segments. For instance, an Epipe service could span the network using Ethernet access (CFM used for OAM), pseudowire (T-LDP status signaling used for OAM), and Ethernet access (E-LMI used for OAM).

In the SR OS implementation, the Service Manager (SMGR) is used as the central point of OAM mapping. It receives and processes the events from different OAM components, then decides the actions to take, including triggering OAM events to remote peers.

Fault propagation for CFM is by default disabled at the MEP level to maintain backward compatibility. When required, it can be explicitly enabled by configuration.

Fault propagation for a MEP can only be enabled when the MA comprises no more than two MEPs (point-to-point).

Fault propagation cannot be enabled for eth-tun control MEPs (MEPs configured under the eth-tun primary and protection paths). However, failure of the eth-tun (meaning both paths fail) is propagated by SMGR because all the SAPs on the eth-tun go down.

CFM connectivity fault conditions

CFM MEP declares a connectivity fault when its defect flag is equal to or higher than its configured lowest defect priority. The defect can be any of the following depending on configuration.

  • DefRDICCM

    This is a Remote Defect Indication: the remote MEP is declaring a fault by setting the RDI bit in the CCM PDU, typically as a result of raising a local defect based on the CCM, or the lack of CCM, from an expected or unexpected peer. It is a feedback loop into the association that serves as a notification, because CCM is a multicast message with no response.

  • DefMACstatus

    This indicates a MAC layer issue. The remote MEP is indicating that the remote port or interface status is not operational.

  • DefRemoteCCM

    This indicates that there is no communication from the remote peer. The MEP is not receiving CCM from an expected remote peer. Timeout of CCM occurs after 3.5 × the CC interval.

  • DefErrorCCM

    This indicates that the remote configuration does not match local expectations: receiving CCM from a remote MEP with inconsistent timers, a lower MD/MEG level within the same MA/MEG, or a MEP receiving CCM with its own MEP ID within the same MA/MEG.

  • DefXconCCM

    This indicates cross-connected services: the MEP is receiving CCM from a different MA/MEG.

  • Reception of AIS for the local MEP level

    This is an additional fault condition that also applies to Y.1731 MEPs.
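The DefRemoteCCM detection time above reduces to a one-line calculation (illustrative only; the function name is hypothetical):

```python
def ccm_timeout_seconds(cc_interval_seconds: float) -> float:
    """DefRemoteCCM detection time: 3.5 times the CC interval, per the
    fault conditions described above."""
    return 3.5 * cc_interval_seconds

# Example: with a 1-second CCM interval, loss of the remote peer is
# declared after 3.5 seconds without a CCM.
```
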

Setting the lowest defect priority to allDef may cause problems when fault propagation is enabled in the MEP. In this scenario, when MEP A sends CCM to MEP B with interface status down, MEP B responds with a CCM with RDI set. If MEP A is configured to accept RDI as a fault, the two MEPs enter a deadlock state, where both declare a fault and are never able to recover. The default lowest defect priority is DefMACstatus. In general terms, when a MEP propagates a fault to a peer, the peer receiving the fault must not reciprocate with a fault back to the originating MEP with a fault condition equal to or higher than the originating MEP's low-priority-defect setting. It is also important that different Ethernet OAM strategies do not overlap each other's span. In some cases, independent functions attempting to perform their normal fault handling can negatively impact each other. This interaction can lead to fault propagation in the direction toward the original fault, a false positive, or worse, a deadlock condition that may require the operator to modify the configuration to escape the condition. For example, overlapping Link Loss Forwarding (LLF) and ETH-CFM fault propagation could cause these issues.

CFM fault propagation methods

When CFM is the OAM module at the other end, it is required to use any of the following methods (depending on local configuration) to notify the remote peer:

  • generating AIS for specific MEP levels

  • sending CCM with interface status TLV ‟down”

  • stopping CCM transmission

To use AIS for fault propagation, AIS must be enabled for the MEP. The AIS configuration needs to be updated to support the MD level of the MEP (currently, it only supports the levels above the local MD level).

Note that the existing AIS procedure still applies even when fault propagation is disabled for the service or the MEP. For example, when a MEP loses connectivity to a configured remote MEP, it generates AIS if it is enabled. The new procedure that is defined in this document introduces a new fault condition for AIS generation, fault propagated from SMGR, that is used when fault propagation is enabled for the service and the MEP.

The transmission of CCM with interface status TLV is triggered and does not wait for the expiration of the remaining CCM interval transmission. This rule applies to CFM fault notification for all services.

For a specific SAP and SDP-binding, CFM and SMGR can only propagate one single fault to each other for each direction (up or down).

When there are multiple MEPs (at different levels) on a single SAP and SDP-binding, the fault reported from CFM to SMGR is the logical OR of results from all MEPs. Basically, the first fault from any MEP is reported, and the fault is not cleared as long as there is a fault in any local MEP on the SAP and SDP-binding.

Epipe services

Down and up MEPs are supported for Epipe services as well as fault propagation. When there are both up and down MEPs configured in the same SAP and SDP-binding and both MEPs have fault propagation enabled, a fault detected by one of them is propagated to the other, which in turn propagates fault in its own direction.

CFM detected fault

When a MEP detects a fault and fault propagation is enabled for the MEP, CFM needs to communicate the fault to SMGR, so SMGR marks the SAP or SDP-binding faulty but still oper up. CFM traffic can still be transmitted to or received from the SAP and SDP-binding to ensure when the fault is cleared, the SAP goes back to normal operational state. Because the operational status of the SAP and SDP-binding is not affected by the fault, no fault handling is performed. For example, applications relying on the operational status are not affected.

If the MEP is an up MEP, the fault is propagated to the OAM components on the same SAP or SDP binding; if the MEP is a down MEP, the fault is propagated to the OAM components on the mate SAP or SDP-binding at the other side of the service.

SAP and SDP-binding failure (including pseudowire status)

When a SAP or SDP-binding becomes faulty (oper-down, admin-down, or pseudowire status faulty), SMGR needs to propagate the fault to up MEPs on the same SAP or SDP-binding, as well as to OAM components (such as down MEPs and E-LMI) on the mate SAP or SDP-binding.

Service down

This section describes procedures for the scenario where an Epipe service is down because of the following:

  • Service is administratively shut down. When the service is administratively shut down, the fault is propagated to the SAPs and SDP-bindings in the service.

  • If the Epipe service is used as a PBB tunnel into a B-VPLS, the Epipe service is also considered operationally down when the B-VPLS service is administratively shutdown or operationally down. If this is the case, fault is propagated to the Epipe SAP.

  • In addition, one or more SAPs or SDP-bindings in the B-VPLS can be configured to propagate fault to this Epipe (see fault-propagation-bmac below). If the B-VPLS is operationally up but all of these entities have detected fault or are down, the fault is propagated to this Epipe’s SAP.

Interaction with pseudowire redundancy

When a fault occurs on the SAP side, the pseudowire status bit is set for both active and standby pseudowires. When only one of the pseudowires is faulty, SMGR does not notify CFM. The notification occurs only when both pseudowires become faulty; SMGR then propagates the fault to CFM.

Because there is no fault handling in the pipe service, any CFM fault detected on an SDP binding is not used in the pseudowire redundancy’s algorithm to choose the most suitable SDP binding to transmit on.

Ipipe services

SAP or SDP-binding failure (including pseudowire status)

When a SAP or SDP-binding becomes faulty (oper-down, admin-down, or pseudowire status faulty), SMGR propagates the fault to OAM components on the mate SAP or SDP-binding.

Service administratively shutdown

When the service is administratively shutdown, SMGR propagates the fault to OAM components on both SAP or SDP-bindings.

Interaction with pseudowire redundancy

When the fault occurs on the SAP side, the pseudowire status bit is set for both active and standby pseudowires.

When only one of the pseudowires is faulty, SMGR does not notify CFM. The notification occurs only when both pseudowires become faulty; SMGR then propagates the fault to CFM. Because there is no fault handling in the pipe service, any CFM fault detected on an SDP-binding is not used in the pseudowire redundancy algorithm to choose the most suitable SDP-binding to transmit on.

VPLS service

For VPLS services, only down MEPs are supported for fault propagation.

CFM detected fault

When a MEP detects a fault and fault propagation is enabled for the MEP, CFM communicates the fault to the SMGR. The SMGR marks the SAP and SDP-binding as oper-down. Note that oper-down is used here in VPLS instead of ‟oper-up but faulty” in the pipe services. CFM traffic can be transmitted to or received from the SAP and SDP-binding to ensure that, when the fault is cleared, the SAP goes back to the normal operational state.

Note that as stated in CFM connectivity fault conditions, a fault is raised whenever a remote MEP is down (not all remote MEPs have to be down). When it is not desirable to trigger fault handling actions in some cases when a down MEP has multiple remote MEPs, operators can disable fault propagation for the MEP.

If the MEP is a down MEP, SMGR performs the fault handling actions for the affected services. Local actions done by the SMGR include (but are not limited to):

  • Flushing MAC addresses learned on the faulty SAP and SDP-binding.

  • Triggering transmission of MAC flush messages.

  • Notifying MSTP/RSTP about the topology change. If the VPLS instance is a management VPLS (mVPLS), all VPLS instances managed by the mVPLS inherit the MSTP/RSTP state change and react accordingly.

  • If the service instance is a B-VPLS, and one or more fault-propagation-bmac addresses are configured for the SAP and SDP-binding, SMGR performs a lookup using the B-MAC addresses to find out which pipe services need to be notified, then propagates a fault to these services. There can be up to four remote B-MAC addresses associated with a SAP and SDP-binding for the same B-VPLS.

SAP and SDP-binding failure (including pseudowire status)

If the service instance is a B-VPLS, and an associated B-MAC address is configured for the failed SAP and SDP-binding, the SMGR performs a lookup using the B-MAC address to find out which pipe services to notify, and then propagates the fault to these services.

Within the same B-VPLS service, all SAPs/SDP-bindings configured with the same fault propagation B-MACs must be faulty or oper down for the fault to be propagated to the appropriate pipe services.

Service down

When a VPLS service is down:

  • If the service is not a B-VPLS service, the SMGR propagates the fault to OAM components on all SAP and SDP-bindings in the service.

  • If the service is a B-VPLS service, the SMGR propagates the fault to OAM components on all SAP and SDP-bindings in the service as well as all pipe services that are associated with the B-VPLS instance.

Pseudowire redundancy and Spanning Tree Protocol

A SAP or SDP binding that has a down MEP fault is made operationally down. This causes pseudowire redundancy or Spanning Tree Protocol (STP) to take the appropriate actions.

However, the reverse is not true. If the SAP or SDP binding is blocked by STP, or is not tx-active because of pseudowire redundancy, no fault is generated for this entity.

IES and VPRN services

For IES and VPRN services, only down MEP is supported on Ethernet SAPs and spoke SDP bindings.

When a down MEP detects a fault and fault propagation is enabled for the MEP, CFM communicates the fault to the SMGR. The SMGR marks the SAP/SDP binding as operationally down. CFM traffic can still be transmitted to or received from the SAP and SDP-binding to ensure that, when the fault is cleared, the SAP goes back to the normal operational state.

Because the SAP and SDP-binding go down, they are not usable by upper applications. In this case, the IP interface on the SAP and SDP-binding goes down. The prefix is withdrawn from routing updates to the remote PEs. The same applies to subscriber group interface SAPs on the 7450 ESS and 7750 SR.

When the IP interface is administratively shutdown, the SMGR notifies the down MEP and a CFM fault notification is generated to the CPE through interface status TLV or suspension of CCM based on local configuration.

Pseudowire switching

When the node acts as a pseudowire switching node, meaning two pseudowires are stitched together at the node, the SMGR does not communicate pseudowire failures to CFM. Such failures are expected to be communicated by pseudowire status messages, and CFM runs end-to-end on the head end and tail end of the stitched pseudowire for failure notification.

LLF and CFM fault propagation

LLF and CFM fault propagation are mutually exclusive. CLI protection is in place to prevent enabling both LLF and CFM fault propagation in the same service, on the same node and at the same time. However, there are still instances where irresolvable fault loops can occur when the two schemes are deployed within the same service on different nodes. This is not preventable by the CLI. At no time should these two fault propagation schemes be enabled within the same service.

802.3ah EFM OAM mapping and interaction with service manager

802.3ah EFM OAM declares a link fault when any of the following occurs:

  • loss of OAMPDU for a specific period of time

  • receiving OAMPDU with link fault flags from the peer

When 802.3ah EFM OAM declares a fault, the port enters the operationally down state. The SMGR communicates the fault to the CFM MEPs in the service.

OAM fault propagation in the opposite direction (SMGR to EFM OAM) is not supported.

Bidirectional Forwarding Detection

Bidirectional Forwarding Detection (BFD) is an efficient, short-duration detection of failures in the path between two systems. If a system stops receiving BFD messages for a long enough period (based on configuration), it is assumed that a failure along the path has occurred and the associated protocol or service is notified of the failure.

BFD can provide a mechanism used for failure detection over any media, at any protocol layer, with a wide range of detection times and overhead, to avoid a proliferation of different methods.

SR OS supports asynchronous and on-demand modes of BFD in which BFD messages are sent to test the path between systems.

If multiple protocols are running between the same two BFD endpoints, only a single BFD session is established, and all associated protocols share the single BFD session.

As well as the typical asynchronous mode, there is also an echo function defined within RFC 5880, Bidirectional Forwarding Detection, that allows either of the two systems to send a sequence of BFD echo packets to the other system, which loops them back within that system’s forwarding plane. If a number of these echo packets are lost, the BFD session is declared down.

BFD control packet

The base BFD specification does not specify the encapsulation type to be used for sending BFD control packets; the encapsulation appropriate to the medium and network is used instead. The encapsulation for BFD over IPv4 and IPv6 networks is specified in RFC 5881, Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop), and RFC 5883, Bidirectional Forwarding Detection (BFD) for Multihop Paths. These specifications require that BFD control packets be sent over UDP with a destination port number of 3784 (single hop) or 4784 (multihop paths) and a source port number in the range 49152 to 65535.

Also, all transmitted BFD packets must have an IP TTL of 255. If authentication is not enabled, all received BFD packets must also have an IP TTL of 255. If authentication is enabled, the IP TTL should be 255, but packets can still be processed if it is not (assuming the packet passes the enabled authentication mechanism).

If multiple BFD sessions exist between two nodes, the BFD discriminator is used to demultiplex the BFD control packet to the appropriate BFD session.
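The encapsulation and demultiplexing rules above can be sketched as a receive-side check. This is a minimal illustration for the single-hop case only; the function and session-table names are hypothetical, not SR OS internals:

```python
# Sketch: receive-side validation of a single-hop BFD control packet per
# RFC 5881 (UDP destination port 3784, source port 49152-65535, TTL 255)
# before demultiplexing on the discriminator. All names are illustrative.

SINGLE_HOP_PORT = 3784

def accept_bfd_packet(dst_port, src_port, ip_ttl, your_disc, sessions,
                      auth_enabled=False):
    """Return the matching session, or None if the packet is discarded."""
    if dst_port != SINGLE_HOP_PORT:
        return None
    if not 49152 <= src_port <= 65535:          # required source-port range
        return None
    if ip_ttl != 255 and not auth_enabled:      # TTL check relaxed with auth
        return None
    # the discriminator demultiplexes among multiple sessions to a peer
    return sessions.get(your_disc)

sessions = {17: "session-to-peer-A", 18: "session-to-peer-B"}
print(accept_bfd_packet(3784, 50000, 255, 18, sessions))   # session-to-peer-B
print(accept_bfd_packet(3784, 50000, 254, 18, sessions))   # None: bad TTL
```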

Control packet format

The BFD control packet has two sections: a mandatory section and an optional authentication section.

Figure 36. Mandatory frame format
Table 8. BFD control packet field descriptions
Field Description

Vers

The version number of the protocol. The current protocol version is 1.

Diag

A diagnostic code specifying the local system’s reason for the last transition of the session from Up to some other state.

Possible values are:

0-No diagnostic

1-Control detection time expired

2-Echo function failed

3-Neighbor signaled session down

4-Forwarding plane reset

5-Path down

6-Concatenated path down

7-Administratively down

D Bit

The demand mode bit. (Not supported)

P Bit

The poll bit. If set, the transmitting system is requesting verification of connectivity, or of a parameter change.

F Bit

The final bit. If set, the transmitting system is responding to a received BFD control packet that had the poll (P) bit set.

Rsvd

Reserved bits. These bits must be zero on transmit and ignored on receipt.

Length

Length of the BFD control packet, in bytes.

My Discriminator

A unique, non-zero discriminator value generated by the transmitting system, used to demultiplex multiple BFD sessions between the same pair of systems.

Your Discriminator

The discriminator received from the corresponding remote system. This field reflects back the received value of my discriminator, or is zero if that value is unknown.

Desired Min TX Interval

This is the minimum interval, in microseconds, that the local system would like to use when transmitting BFD control packets.

Required Min RX Interval

This is the minimum interval, in microseconds, between received BFD control packets that this system is capable of supporting.

Required Min Echo RX Interval

This is the minimum interval, in microseconds, between received BFD echo packets that this system is capable of supporting. If this value is zero, the transmitting system does not support the receipt of BFD echo packets.
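The fields described above can be illustrated with a short encoder for the mandatory section. This is a sketch based on RFC 5880, not SR OS code; the state and detect-multiplier fields come from the RFC and are included for the encoding to be complete:

```python
import struct

# Sketch: encode the 24-byte mandatory section of a BFD control packet
# (RFC 5880). Field names map onto the descriptions in the table above.

def build_bfd_control(diag, state, poll, final, detect_mult, my_disc,
                      your_disc, des_min_tx, req_min_rx, req_min_echo_rx):
    vers = 1                                  # RFC 5880 protocol version
    byte0 = (vers << 5) | (diag & 0x1F)       # Vers (3 bits) + Diag (5 bits)
    flags = (state << 6) | (poll << 5) | (final << 4)  # Sta, P, F; D bit left 0
    length = 24                               # no authentication section
    return struct.pack('!BBBBIIIII', byte0, flags, detect_mult, length,
                       my_disc, your_disc, des_min_tx, req_min_rx,
                       req_min_echo_rx)

pkt = build_bfd_control(diag=0, state=3, poll=0, final=0, detect_mult=3,
                        my_disc=1, your_disc=2, des_min_tx=100000,
                        req_min_rx=100000, req_min_echo_rx=0)
print(len(pkt), pkt[3])    # 24 24: packet length matches the Length field
```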

Echo support

Echo support for BFD refers to support of the echo function within BFD. With BFD echo support, the router loops received BFD echo messages back to the original sender based on the destination IP address in the packet.

The echo function is useful when the local router does not have sufficient CPU power to handle a high-frequency periodic polling rate. In this case, the router relies on the echo sender to transmit BFD echo messages at a high rate through the receiver node, where they are processed only by the receiver's forwarding path. This allows the echo sender to send BFD echo packets at any rate.

SR OS does not support the sending of echo requests, only the response to echo requests.
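The forwarding-plane loopback described above can be sketched as follows. This is illustrative only (in practice echo handling happens in the receiver's forwarding hardware, and all names are hypothetical): the sender addresses echo packets to its own IP address, so the receiver's ordinary destination-based lookup returns them, and a run of lost echoes brings the session down.

```python
# Sketch: BFD echo loopback and loss-based session state (illustrative).

SENDER_IP = "192.0.2.1"

def peer_forward(pkt, local_ip="192.0.2.2"):
    """The peer forwards on destination IP only; an echo packet destined
    to the sender is simply routed back without control-plane work."""
    return "loop-back-to-sender" if pkt["dst_ip"] != local_ip else "to-cpu"

def echo_session_state(consecutive_lost, multiplier=3):
    # losing `multiplier` echo packets in a row declares the session down
    return "Down" if consecutive_lost >= multiplier else "Up"

echo = {"src_ip": SENDER_IP, "dst_ip": SENDER_IP, "seq": 7}
print(peer_forward(echo))          # loop-back-to-sender
print(echo_session_state(3))       # Down
```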

Centralized BFD

The following applications of centralized BFD require BFD to run on the SF/CPM.

IES over spoke SDP

One application for a centralized BFD implementation is support for BFD over spoke SDPs used to interconnect IES or VPRN interfaces. When spoke SDPs are used for interconnections over an MPLS network between two routers, BFD is used to speed up failure detection between nodes so that re-convergence of unicast and multicast routing information can begin as quickly as possible.

The MPLS LSP associated with the spoke SDP can ingress or egress on multiple interfaces on the router. BFD for these types of interfaces cannot exist on the IOM/XCM alone.

Figure 37. BFD for IES/VPRN over spoke SDP

BFD over LAG and VSM interfaces

A second application for a centralized BFD implementation is support for BFD over LAG or VSM interfaces. This is useful where BFD is used not for link failure detection but for node failure detection. In this application, the BFD session runs between the IP interfaces associated with the LAG or VSM interface, but there is only one session between the two nodes. There is no requirement for the message flow to cross a specific link, or VSM, to reach the remote node.

Figure 38. BFD over LAG and VSM interfaces

BFD on an unnumbered IPv4 interface

BFD sessions can be associated with an unnumbered IPv4 interface to monitor the liveness of the connection for IP and MPLS routing protocols, when routing protocol adjacencies and static routes are configured to use this function. When a BFD session is associated with an unnumbered interface as the local anchor point, the BFD parameters are taken from the BFD configuration under the unnumbered interface context. If the BFD parameters are not configured within the unnumbered interface context, then BFD sessions are not attempted. All BFD sessions associated with an unnumbered interface are automatically run on the FP complex associated with the CPM.

LSP BFD and VCCV BFD

BFD is supported over MPLS-TP, RSVP, and LDP LSPs, as well as over pseudowires that support Layer 2 services such as Epipe VPLS spoke-SDPs and mesh-SDPs using centralized BFD. See the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide and 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 2 Services and EVPN Guide for more information.

Seamless Bidirectional Forwarding Detection

Seamless BFD (S-BFD), RFC 7880, Seamless Bidirectional Forwarding Detection (S-BFD), is a form of BFD that avoids the negotiation and state establishment for the BFD sessions. This is done primarily by pre-determining the session discriminator and then using other mechanisms to distribute the discriminators to a remote network entity. This allows client applications or protocols to more quickly initiate and perform connectivity tests. Furthermore, a per-session state is maintained only at the head end of a session. The tail end simply reflects BFD control packets back to the head end.

A seamless BFD session is established between an initiator and a reflector. There is only one instance of a reflector per SR OS router. A discriminator is assigned to the reflector. Each of the initiators on a router is also assigned a discriminator.

By default, S-BFD operates in asynchronous mode where the reflector encapsulates and routes IP/UDP encapsulated S-BFD packets back to the initiator using the IGP shortest path. However, some applications also support a controlled return TE path for S-BFD reply packets, where S-BFD operates in echo mode and the reflector router forwards packets back toward the initiator on a specified labeled path using, for example, an SR policy. For more information, see the application-specific descriptions for S-BFD in the 7750 SR and 7950 XRS Segment Routing and PCE User Guide.

Seamless BFD sessions are created on the request of a client application, for example, MPLS. This user guide describes the base S-BFD configuration required on initiator and reflector routers. Application-specific configuration is required to create S-BFD sessions.

S-BFD reflector configuration and behavior

The S-BFD reflector is configured by using the following CLI:

configure 
   bfd 
      seamless-bfd 
        [no] reflector <name> 
                discriminator <value>
                description <string>
                local-state up | admin-down
                [no] shutdown

The discriminator value must be allocated from the S-BFD reflector pool, 524288 to 526335. When the router receives an S-BFD packet from the initiator with the local router's S-BFD discriminator in the ‟YourDiscriminator” field, the local node sends the S-BFD packet back to the initiator via a routed path. The state field in the reflected packet is populated with either the Up or AdminDown state value, based on the local-state configuration.

Note: Only a single reflector discriminator per node is supported, and the reflector cannot be no shutdown unless a discriminator is configured.

Seamless BFD control packets are discarded when the reflector is not configured, is shut down, or when the ‟YourDiscriminator” field does not match the discriminator of the reflector. Both IPv4 and IPv6 are supported, but in the case of IPv6, the reflector can only reflect BFD control packets with a global unicast destination address.
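The reflector behavior above can be sketched as follows (a hypothetical helper, not SR OS code): a packet is reflected back via a routed path only when the reflector is configured, enabled, and the YourDiscriminator field matches, and the reflected state is the configured local-state value.

```python
# Sketch: S-BFD reflector decision logic (illustrative names only).

REFLECTOR_POOL = range(524288, 526336)   # allowed reflector discriminators

def reflect(pkt, reflector):
    if reflector is None or reflector["shutdown"]:
        return None                                  # discard
    if pkt["your_discriminator"] != reflector["discriminator"]:
        return None                                  # discard
    return {"dst_ip": pkt["src_ip"],                 # routed back to initiator
            "my_discriminator": reflector["discriminator"],
            "your_discriminator": pkt["my_discriminator"],
            "state": reflector["local_state"]}       # Up or AdminDown

refl = {"discriminator": 525002, "local_state": "Up", "shutdown": False}
assert refl["discriminator"] in REFLECTOR_POOL
req = {"src_ip": "10.20.1.1", "my_discriminator": 16385,
       "your_discriminator": 525002}
print(reflect(req, refl)["state"])       # Up
```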

S-BFD initiator global configuration

Before an application can request the establishment of an S-BFD session, a mapping table of remote discriminators to far-end peer IP addresses must exist. The mapping can be accomplished in two ways:

  • statically configured

  • automatically learned using opaque OSPF/IS-IS routing extensions

See Static S-BFD discriminator configuration and Automated S-BFD discriminator distribution for more information about mapping remote discriminators to IP addresses and to the originating router ID.

Static S-BFD discriminator configuration

To statically map a Seamless BFD remote IP address with its discriminator, use the following CLI commands:

config>router>bfd
   seamless-bfd
      peer <ip-address> discriminator <remote-discriminator>
      peer <ip-address> discriminator <remote-discriminator>
      ...
      exit

The S-BFD initiator immediately starts sending S-BFD packets if the discriminator value of the far-end reflector is known; no session setup is required.

With S-BFD sessions, there is no INIT state. The initiator state changes from AdminDown to Up when it begins to send (initiate) S-BFD packets.

The S-BFD initiator sends the BFD packet to the reflector using the following fields:

Src IP
This field contains the local session IP address. For IPv6, this is a global unicast address belonging to the node.
Dst IP
This field contains the reflector's IP address (configured).
MyDiscriminator
This field contains the locally assigned discriminator.
YourDiscriminator
This field contains the reflector's discriminator value.

If the initiator receives a valid response from the reflector with an Up state, the initiator declares the S-BFD session state as Up.

If the initiator fails to receive a specific number of responses, as determined by the BFD multiplier in the BFD template for the session, the initiator declares the S-BFD session state as Failed.
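The initiator behavior above can be sketched as follows. This is a minimal illustration with hypothetical names; the peer-to-discriminator mapping corresponds to the static configuration described earlier:

```python
# Sketch: the S-BFD initiator takes the reflector discriminator from the
# peer mapping table, so it can transmit immediately with no session
# setup, and declares the session failed after `multiplier` consecutive
# missed responses. All names are illustrative.

peer_map = {"10.20.1.6": 525002}   # far-end peer IP -> reflector discriminator

def build_sbfd_packet(local_ip, peer_ip, local_disc):
    return {"src_ip": local_ip,
            "dst_ip": peer_ip,
            "my_discriminator": local_disc,
            "your_discriminator": peer_map[peer_ip]}

def session_state(consecutive_missed, multiplier=3):
    # no INIT state: the initiator is Up as soon as it begins sending
    return "Failed" if consecutive_missed >= multiplier else "Up"

pkt = build_sbfd_packet("10.20.1.1", "10.20.1.6", 16385)
print(pkt["your_discriminator"])   # 525002
print(session_state(3))            # Failed
```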

If any of the discriminators change, the session fails and the router attempts to restart it with the new values. If the reflector discriminator changes at the far-end peer, the session fails; when the local mapping table is subsequently updated with the new reflector discriminator, the session is bounced and brought back up with the new values. If any discriminator is deleted, the corresponding S-BFD sessions are deleted.

Automated S-BFD discriminator distribution

It is possible to automatically map an S-BFD remote IP address to its discriminator using IGP routing protocol extensions. The required protocol extensions are introduced by RFC 7883 for IS-IS and RFC 7884 for OSPF. These extensions provide the encodings to advertise the S-BFD discriminators as opaque information within the advertised IGP link state information. BGP-LS extensions allow the export of IS-IS and OSPF S-BFD discriminator information using encodings defined in draft-ietf-idr-bgp-ls-sbfd-extensions-01.

Two preconditions must be met before automated mapping of S-BFD discriminators can be enabled:

traffic-engineering
This enables the TE-DB infrastructure to create the mapping between an IP address and an S-BFD discriminator.
Note: traffic-engineering is not supported in VPRN or for OSPFv3.
advertise-router-capability
This encodes the S-BFD discriminator in OSPF or IS-IS opaque Router Information TLVs.

The following is an example of an OSPF configuration output:

A:Router-A>config>bfd# info detail
        seamless-bfd
            reflector "aaa"
                no description
                discriminator 525002
                local-state up
                no shutdown
            exit
        exit
A:Router-A>config>router>ospf# info 
----------------------------------------------
            router-id 10.20.1.1
            traffic-engineering
            advertise-router-capability area
            area 0.0.0.0
                interface "system"
                    no shutdown
                exit
                interface "to_Router-C"
                    hello-interval 2
                    dead-interval 10
                    metric 1000
                    no shutdown
                exit
                interface "to_Router-B"
                    hello-interval 2
                    dead-interval 10
                    metric 1000
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
*A:Dut-A>config>router>ospf#

Traceroute with ICMP tunneling in common applications

This section provides example output of the traceroute OAM tool when the ICMP tunneling feature is enabled in a few common applications.

The ICMP tunneling feature is described in Tunneling of ICMP reply packets over MPLS LSP and provides support for appending the MPLS label stack object defined in RFC 4950 to ICMP replies of type Time Exceeded. The MPLS Label Stack object allows an LSR to include label stack information, including label value, EXP, and TTL field values, from the encapsulation header of the packet that expired at the LSR node.
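The label stack information carried in these ICMP replies is encoded as 32-bit label stack entries, as defined in RFC 4950. The following sketch (illustrative, not SR OS code) shows the bit layout behind the "returned MPLS Label Stack Object" lines in the traceroute outputs in this section:

```python
# Sketch: decode one 32-bit MPLS label stack entry from the RFC 4950 ICMP
# extension: Label (20 bits), EXP (3 bits), S (1 bit), TTL (8 bits).

def decode_label_entry(word):
    return {"label": word >> 12,
            "exp": (word >> 9) & 0x7,
            "s": (word >> 8) & 0x1,
            "ttl": word & 0xFF}

# Encode the entry "MPLS Label = 262143, Exp = 7, TTL = 1, S = 0" and
# decode it back:
entry = (262143 << 12) | (7 << 9) | (0 << 8) | 1
print(decode_label_entry(entry))
# {'label': 262143, 'exp': 7, 's': 0, 'ttl': 1}
```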

BGP-LDP stitching and ASBR/ABR/datapath RR for BGP IPv4 labeled route

ASBR1     ASBR2
            -------- D ------- E -------- 
           |                             |
           |                             | 
A -------- C                             F -------- B 
DSLAM1    PE1                           PE2       DSLAM2
           |                             |
           |                             |
            -------- G ------- H --------
                   ASBR3     ASBR4

# lsp-trace ldp-bgp stitching
*A:Dut-A# oam lsp-trace prefix 10.20.1.6/32 detail downstream-map-tlv ddmap
lsp-trace to 10.20.1.6/32: 0 hops min, 0 hops max, 108 byte packets
1  10.20.1.1  rtt=2.89ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.10.1.2 ifaddr=10.10.1.2 iftype=ipv4Numbered MRU=1496 
           label[1]=262143 protocol=3(LDP)
           label[2]=262139 protocol=2(BGP)
           fecchange[1]=POP  fectype=LDP IPv4 prefix=10.20.1.6 remotepeer=0.0.0.0 (Unknown)
           fecchange[2]=PUSH fectype=BGP IPv4 prefix=10.20.1.6 remotepeer=10.20.1.2 
           fecchange[3]=PUSH fectype=LDP IPv4 prefix=10.20.1.2 remotepeer=10.10.1.2
2  10.20.1.2  rtt=5.19ms rc=3(EgressRtr) rsc=2
2  10.20.1.2  rtt=5.66ms rc=8(DSRtrMatchLabel) rsc=1 
     DS 1: ipaddr=10.10.4.4 ifaddr=10.10.4.4 iftype=ipv4Numbered MRU=0 
           label[1]=262138 protocol=2(BGP)
3  10.20.1.4  rtt=6.53ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.10.6.5 ifaddr=10.10.6.5 iftype=ipv4Numbered MRU=1496 
           label[1]=262143 protocol=3(LDP)
           label[2]=262138 protocol=2(BGP)
           fecchange[1]=PUSH fectype=LDP IPv4 prefix=10.20.1.5 remotepeer=10.10.6.5
4  10.20.1.5  rtt=8.51ms rc=3(EgressRtr) rsc=2
4  10.20.1.5  rtt=8.45ms rc=15(LabelSwitchedWithFecChange) rsc=1 
     DS 1: ipaddr=10.10.10.6 ifaddr=10.10.10.6 iftype=ipv4Numbered MRU=1496 
           label[1]=262143 protocol=3(LDP)
           fecchange[1]=POP  fectype=BGP IPv4 prefix=10.20.1.6 remotepeer=0.0.0.0 (Unknown)
           fecchange[2]=PUSH fectype=LDP IPv4 prefix=10.20.1.6 remotepeer=10.10.10.6
5  10.20.1.6  rtt=11.2ms rc=3(EgressRtr) rsc=1 
*A:Dut-A# configure router ldp-shortcut (to add ldp label on first hop but overall behavior is similar)

# 12.0R4 default behavior (we have routes back to the source)
*A:Dut-A# traceroute 10.20.1.6 detail wait 100                             
traceroute to 10.20.1.6, 30 hops max, 40 byte packets
  1   1  10.10.2.1  (10.10.2.1)  3.47 ms
  1   2  10.10.2.1  (10.10.2.1)  3.65 ms
  1   3  10.10.2.1  (10.10.2.1)  3.46 ms
  2   1  10.10.1.2  (10.10.1.2)  5.46 ms
  2   2  10.10.1.2  (10.10.1.2)  5.83 ms
  2   3  10.10.1.2  (10.10.1.2)  5.20 ms
  3   1  10.10.4.4  (10.10.4.4)  8.55 ms
  3   2  10.10.4.4  (10.10.4.4)  7.45 ms
  3   3  10.10.4.4  (10.10.4.4)  7.29 ms
  4   1  10.10.6.5  (10.10.6.5)  9.67 ms
  4   2  10.10.6.5  (10.10.6.5)  10.1 ms
  4   3  10.10.6.5  (10.10.6.5)  10.9 ms
  5   1  10.20.1.6  (10.20.1.6)  11.5 ms
  5   2  10.20.1.6  (10.20.1.6)  11.1 ms
  5   3  10.20.1.6  (10.20.1.6)  11.4 ms


# Enable ICMP tunneling on PE and ASBR nodes.
*A:Dut-D# configure router ttl-propagate label-route-local all
*A:Dut-C,D,E,F# configure router icmp-tunneling

*A:Dut-C# traceroute 10.20.1.6 detail wait 100                                
traceroute to 10.20.1.6, 30 hops max, 40 byte packets
  1   1  10.10.1.1  (10.10.1.1)  11.8 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  1   2  10.10.1.1  (10.10.1.1)  12.5 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  1   3  10.10.1.1  (10.10.1.1)  12.9 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  2   1  10.10.4.2  (10.10.4.2)  13.0 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262139, Exp = 7, TTL =   1, S = 1
  2   2  10.10.4.2  (10.10.4.2)  13.0 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262139, Exp = 7, TTL =   1, S = 1
  2   3  10.10.4.2  (10.10.4.2)  12.8 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262139, Exp = 7, TTL =   1, S = 1
  3   1  10.10.6.4  (10.10.6.4)  10.1 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  3   2  10.10.6.4  (10.10.6.4)  11.1 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  3   3  10.10.6.4  (10.10.6.4)  9.70 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  4   1  10.10.10.5  (10.10.10.5)  12.5 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL = 255, S = 0
             entry  2:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  4   2  10.10.10.5  (10.10.10.5)  11.9 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL = 255, S = 0
             entry  2:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  4   3  10.10.10.5  (10.10.10.5)  11.8 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL = 255, S = 0
             entry  2:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  5   1  10.20.1.6  (10.20.1.6)  12.2 ms
  5   2  10.20.1.6  (10.20.1.6)  12.5 ms
  5   3  10.20.1.6  (10.20.1.6)  13.2 ms


# With lsr-label-route all on all LSRs (only needed on Dut-E)
*A:Dut-E# configure router ttl-propagate lsr-label-route all

*A:Dut-A# traceroute 10.20.1.6 detail wait 100
traceroute to 10.20.1.6, 30 hops max, 40 byte packets
  1   1  10.10.1.1  (10.10.1.1)  12.4 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  1   2  10.10.1.1  (10.10.1.1)  11.9 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  1   3  10.10.1.1  (10.10.1.1)  12.7 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  2   1  10.10.4.2  (10.10.4.2)  11.6 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262139, Exp = 7, TTL =   1, S = 1
  2   2  10.10.4.2  (10.10.4.2)  13.5 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262139, Exp = 7, TTL =   1, S = 1
  2   3  10.10.4.2  (10.10.4.2)  11.9 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262139, Exp = 7, TTL =   1, S = 1
  3   1  10.10.6.4  (10.10.6.4)  9.21 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  3   2  10.10.6.4  (10.10.6.4)  9.58 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  3   3  10.10.6.4  (10.10.6.4)  9.38 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  4   1  10.10.10.5  (10.10.10.5)  12.2 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  4   2  10.10.10.5  (10.10.10.5)  11.5 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  4   3  10.10.10.5  (10.10.10.5)  11.5 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 1
  5   1  10.20.1.6  (10.20.1.6)  11.9 ms
  5   2  10.20.1.6  (10.20.1.6)  12.2 ms
  5   3  10.20.1.6  (10.20.1.6)  13.7 ms

VPRN inter-AS option B

      ASBR1     ASBR2
            -------- D ------- E -------- 
           |                             |
           |                             | 
A -------- C                             F -------- B 
CE1       PE1                           PE2        CE2
           |                             |
           |                             |
            -------- G ------- H --------
                   ASBR3     ASBR4

# 12.0R4 default behavior (vc-only)
*A:Dut-A# traceroute 3.3.3.4 source 3.3.4.2 wait 100 no-dns detail
traceroute to 3.3.3.4 from 3.3.4.2, 30 hops max, 40 byte packets
  1   1  3.3.4.1  1.97 ms
  1   2  3.3.4.1  1.74 ms
  1   3  3.3.4.1  1.71 ms
  2   1  *
  2   2  *
  2   3  *
  3   1  *
  3   2  *
  3   3  *
  4   1  3.3.3.6  6.76 ms
  4   2  3.3.3.6  7.37 ms
  4   3  3.3.3.6  8.36 ms
  5   1  3.3.3.4  11.1 ms
  5   2  3.3.3.4  9.46 ms
  5   3  3.3.3.4  8.28 ms


# Configure icmp-tunneling on C, D, E and F

*A:Dut-A# traceroute 3.3.3.4 source 3.3.4.2 wait 100 no-dns detail
traceroute to 3.3.3.4 from 3.3.4.2, 30 hops max, 40 byte packets
  1   1  3.3.4.1  1.95 ms
  1   2  3.3.4.1  1.85 ms
  1   3  3.3.4.1  1.62 ms
  2   1  10.0.7.3  6.76 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL = 255, S = 0
             entry  2:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  2   2  10.0.7.3  6.92 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL = 255, S = 0
             entry  2:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  2   3  10.0.7.3  7.58 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL = 255, S = 0
             entry  2:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  3   1  10.0.5.4  6.92 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  3   2  10.0.5.4  7.03 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  3   3  10.0.5.4  8.66 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  4   1  3.3.3.6  6.67 ms
  4   2  3.3.3.6  6.75 ms
  4   3  3.3.3.6  6.96 ms
  5   1  3.3.3.4  8.32 ms
  5   2  3.3.3.4  11.6 ms
  5   3  3.3.3.4  8.45 ms


# With ttl-propagate vprn-transit none on PE1
*A:Dut-C# configure router ttl-propagate vprn-transit none
*A:Dut-B# traceroute 3.3.3.4 source 3.3.4.2 wait 100 no-dns detail
traceroute to 3.3.3.4 from 3.3.4.2, 30 hops max, 40 byte packets
  1   1  3.3.4.1  1.76 ms
  1   2  3.3.4.1  1.75 ms
  1   3  3.3.4.1  1.76 ms
  2   1  3.3.3.6  6.50 ms
  2   2  3.3.3.6  6.70 ms
  2   3  3.3.3.6  6.36 ms
  3   1  3.3.3.4  8.34 ms
  3   2  3.3.3.4  7.64 ms
  3   3  3.3.3.4  8.73 ms


# With ttl-propagate vprn-transit all on PE1
*A:Dut-C# configure router ttl-propagate vprn-transit all
*A:Dut-B# traceroute 3.3.3.4 source 3.3.4.2 wait 100 no-dns detail
traceroute to 3.3.3.4 from 3.3.4.2, 30 hops max, 40 byte packets
  1   1  3.3.4.1  1.97 ms
  1   2  3.3.4.1  1.77 ms
  1   3  3.3.4.1  2.37 ms
  2   1  10.0.7.3  9.27 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  2   2  10.0.7.3  6.39 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  2   3  10.0.7.3  6.19 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  3   1  10.0.5.4  6.80 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  3   2  10.0.5.4  6.71 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  3   3  10.0.5.4  6.58 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262140, Exp = 0, TTL =   1, S = 1
  4   1  3.3.3.6  6.47 ms
  4   2  3.3.3.6  6.75 ms
  4   3  3.3.3.6  9.06 ms
  5   1  3.3.3.4  7.99 ms
  5   2  3.3.3.4  9.31 ms
  5   3  3.3.3.4  8.13 ms

VPRN inter-AS option C and ASBR/ABR/datapath RR for BGP IPv4 labeled route

      ASBR1     ASBR2
            -------- D ------- E -------- 
           |                             |
           |                             | 
A -------- C                             F -------- B 
CE1       PE1                           PE2        CE2
           |                             |
           |                             |
            -------- G ------- H --------
                   ASBR3     ASBR4

# 12.0R4 default behavior

*A:Dut-B# traceroute 16.1.1.1 source 26.1.1.2 detail no-dns wait 100
traceroute to 16.1.1.1 from 26.1.1.2, 30 hops max, 40 byte packets
  1   1  26.1.1.1  1.90 ms
  1   2  26.1.1.1  1.81 ms
  1   3  26.1.1.1  2.01 ms
  2   1  16.1.1.1  6.11 ms
  2   2  16.1.1.1  8.35 ms
  2   3  16.1.1.1  5.33 ms

*A:Dut-C# traceroute router 600 26.1.1.2 source 16.1.1.1 detail no-dns wait 100
traceroute to 26.1.1.2 from 16.1.1.1, 30 hops max, 40 byte packets
  1   1  26.1.1.1  5.03 ms
  1   2  26.1.1.1  4.60 ms
  1   3  26.1.1.1  4.60 ms
  2   1  26.1.1.2  6.54 ms
  2   2  26.1.1.2  5.99 ms
  2   3  26.1.1.2  5.74 ms


# With ttl-propagate vprn-transit all and icmp-tunneling

*A:Dut-B# traceroute 16.1.1.1 source 26.1.1.2 detail no-dns wait 100
traceroute to 16.1.1.1 from 26.1.1.2, 30 hops max, 40 byte packets
  1   1  26.1.1.1  2.05 ms
  1   2  26.1.1.1  1.87 ms
  1   3  26.1.1.1  1.85 ms
  2   1  10.10.4.4  8.42 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262137, Exp = 0, TTL =   1, S = 0
             entry  3:  MPLS Label =  262142, Exp = 0, TTL =   1, S = 1
  2   2  10.10.4.4  5.85 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262137, Exp = 0, TTL =   1, S = 0
             entry  3:  MPLS Label =  262142, Exp = 0, TTL =   1, S = 1
  2   3  10.10.4.4  5.75 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262137, Exp = 0, TTL =   1, S = 0
             entry  3:  MPLS Label =  262142, Exp = 0, TTL =   1, S = 1
  3   1  10.10.1.2  5.54 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262137, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262142, Exp = 0, TTL =   2, S = 1
  3   2  10.10.1.2  7.89 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262137, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262142, Exp = 0, TTL =   2, S = 1
  3   3  10.10.1.2  5.56 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262137, Exp = 0, TTL =   1, S = 0
             entry  2:  MPLS Label =  262142, Exp = 0, TTL =   2, S = 1
  4   1  16.1.1.1  9.50 ms
  4   2  16.1.1.1  5.91 ms
  4   3  16.1.1.1  5.85 ms


# With ttl-propagate vprn-local all
*A:Dut-C# traceroute router 600 26.1.1.2 source 16.1.1.1 detail no-dns wait 100
traceroute to 26.1.1.2 from 16.1.1.1, 30 hops max, 40 byte packets
  1   1  10.10.4.2  4.78 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262136, Exp = 7, TTL =   1, S = 0
             entry  3:  MPLS Label =  262142, Exp = 7, TTL =   1, S = 1
  1   2  10.10.4.2  4.56 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262136, Exp = 7, TTL =   1, S = 0
             entry  3:  MPLS Label =  262142, Exp = 7, TTL =   1, S = 1
  1   3  10.10.4.2  4.59 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262143, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262136, Exp = 7, TTL =   1, S = 0
             entry  3:  MPLS Label =  262142, Exp = 7, TTL =   1, S = 1
  2   1  10.10.6.4  4.55 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262142, Exp = 7, TTL =   2, S = 1
  2   2  10.10.6.4  4.47 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262142, Exp = 7, TTL =   2, S = 1
  2   3  10.10.6.4  4.20 ms
         returned MPLS Label Stack Object
             entry  1:  MPLS Label =  262138, Exp = 7, TTL =   1, S = 0
             entry  2:  MPLS Label =  262142, Exp = 7, TTL =   2, S = 1
  3   1  26.1.1.1  4.62 ms
  3   2  26.1.1.1  4.41 ms
  3   3  26.1.1.1  4.64 ms
  4   1  26.1.1.2  5.74 ms
  4   2  26.1.1.2  6.22 ms
  4   3  26.1.1.2  5.77 ms

Hashing visibility tool

The hashing visibility tool allows users to define a test packet and then inject that packet into a specified ingress port. The result of the test displays the egress port, routing context, egress interface name, and the IP next hop used to forward the packet.

There are three major steps when running this test:

  1. Configure header templates needed to build test packets.

  2. Configure parameter overrides and build packet header sequences.

  3. Execute the find-egress test specifying the ingress port.

Execute the test with the oam find-egress packet packet-id ingress-port port-id command. This causes the specified test frame or packet to be injected at the specified ingress port and the results to be reported.

The find-egress command is supported on IPv4 and IPv6 routing, Layer 3, Layer 2 VPLS, and Epipe services.

The following lists the supported packet header sequences.

  • Ethernet>Payload
  • Ethernet>IPv4>Payload
  • Ethernet>IPv4>IPSec>Payload
  • Ethernet>IPv4>UDP>Payload
  • Ethernet>IPv4>UDP>GTP-U>Payload
  • Ethernet>IPv4>UDP>IPSec>Payload
  • Ethernet>IPv4>UDP>L2TP>Payload
  • Ethernet>IPv4>TCP>Payload
  • Ethernet>IPv4>GRE>IPv4>Payload
  • Ethernet>IPv6>Payload
  • Ethernet>IPv6>UDP>Payload
  • Ethernet>IPv6>TCP>Payload
  • Ethernet>IPv6>IPSec>Payload
  • Ethernet>IPv6>UDP>IPSec>Payload
  • Ethernet>IPv6>UDP>GTP-U>Payload
  • Ethernet>IPv6>UDP>L2TP>Payload
  • Ethernet>MPLS>Payload
  • Ethernet>MPLS>Control-Word>Payload
  • Ethernet>MPLS>IPv4>GRE>IPv4>Payload
  • Ethernet>MPLS>IPv4>IPSec>Payload
  • Ethernet>MPLS>IPv4>UDP>IPSec>Payload
  • Ethernet>MPLS>IPv6>IPSec>Payload
  • Ethernet>MPLS>IPv6>UDP>IPSec>Payload
  • Ethernet>MPLS>Ethernet>IPv4>IPSec>Payload
  • Ethernet>MPLS>Ethernet>IPv4>UDP>IPSec>Payload
  • Ethernet>MPLS>Ethernet>IPv6>IPSec>Payload
  • Ethernet>MPLS>Ethernet>IPv6>UDP>IPSec>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv4>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv4>UDP>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv4>TCP>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv4>IPSec>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv6>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv6>UDP>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv6>TCP>Payload
  • Ethernet>MPLS>Control-Word>Ethernet>IPv6>IPSec>Payload

Configuring the header templates

Follow this procedure to configure the header templates:
  1. Configure the header in the configure test-oam build-packet context.
  2. Define the possible header types that are to be used in find-egress tests.
    Any header that needs to be used in the test packet must be created in this step.
    The operator can optionally specify default values for the associated header parameters; however, all parameters can also be set or overridden later, when configuring the parameter overrides.
    Only header parameters that are used in the hashing decision-making can be configured. All other parameters are set to valid default values internally.
    A:node-2>config>test-oam>build-packet
          header 1 create
             ethernet
                src-mac-address AA:BB:CC:DD:EE:FF
                dst-mac-address FF:EE:DD:CC:BB:AA
          header 2 create
             udp
                src-port 12345
                dest-port 54321
          header 3 create
             ipv4
                dscp 0
                src-ipv4-address 11.22.33.44
                dst-ipv4-address 55.66.77.88
    
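The template concept above can be sketched in ordinary Python. This is an illustrative model only, not the SR OS implementation: a numbered header template carries the hash-relevant fields the operator configured, and any parameter that is not user-configurable (the EtherType, in this sketch) is filled with a valid internal default.

```python
# Hypothetical sketch of an Ethernet header template (names are
# illustrative, not SR OS internals).
template = {
    "type": "ethernet",
    "src-mac-address": "AA:BB:CC:DD:EE:FF",
    "dst-mac-address": "FF:EE:DD:CC:BB:AA",
}

def mac_bytes(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' to its 6-byte representation."""
    return bytes(int(octet, 16) for octet in mac.split(":"))

def encode_ethernet(tpl, ethertype=0x0800):
    # Like the non-hash parameters described in the text, the EtherType
    # is not operator-configurable here; it is defaulted internally.
    return (mac_bytes(tpl["dst-mac-address"])
            + mac_bytes(tpl["src-mac-address"])
            + ethertype.to_bytes(2, "big"))

frame_header = encode_ethernet(template)  # 14-byte Ethernet header
```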

Configuring parameter overrides and header sequences

The steps to configure parameter overrides and header sequences are:
  1. Define the header parameter overrides in the debug oam build-packet packet field-override context.
    Any header value can be overridden.
  2. Define the test packet header sequence in the debug oam build-packet packet header-sequence context.
    The header-sequence command takes a string that specifies the sequence of headers used to build the test packet.
    Each header is referenced in the form h<header-number>, with a "/" separating the header identifiers.
    The header sequence is defined from the outermost header to the innermost header.
    A:node-2>debug>oam>build-packet>packet <pkt-id> 
    field-override 
          header 1  
             ethernet 
                src-mac-address 11:22:33:44:55:66 
                dst-mac-address 22:33:44:55:66:77 
     
    header-sequence "h1/h3/h2" 
    
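The override-and-sequence step can be sketched the same way. In this illustrative Python model (not the SR OS implementation), per-packet overrides are merged onto the template defaults, and a sequence string such as "h1/h3/h2" is split on "/" to order the headers from outermost to innermost.

```python
# Hypothetical templates and overrides, mirroring the examples above.
templates = {
    1: {"type": "ethernet", "src-mac-address": "AA:BB:CC:DD:EE:FF"},
    2: {"type": "udp", "src-port": 12345, "dest-port": 54321},
    3: {"type": "ipv4", "dscp": 0},
}
overrides = {1: {"src-mac-address": "11:22:33:44:55:66",
                 "dst-mac-address": "22:33:44:55:66:77"}}

def build_sequence(seq, templates, overrides):
    """Resolve a header-sequence string into headers, outermost first."""
    headers = []
    for token in seq.split("/"):          # "h1" -> template number 1
        num = int(token.lstrip("h"))
        # Override values take precedence over template defaults.
        merged = {**templates[num], **overrides.get(num, {})}
        headers.append(merged)
    return headers

pkt = build_sequence("h1/h3/h2", templates, overrides)
# pkt[0] is the outermost (Ethernet) header with the overridden MACs.
```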
Execute the test with the oam find-egress packet packet-id ingress-port port-id command. This causes the specified test frame or packet to be injected at the specified ingress port and the results to be reported, as in the following example.
A:bkvm14# oam find-egress packet 1 ingress-port 1/5/7  
------------------------------------------------------------------------------- 
Egress Information for Packet 1, Ingress Port 1/5/7 
------------------------------------------------------------------------------- 
Port        : 1/5/1 
Router Name : Base 
Interface Nm: toDUT-2917 
Next Hop    : 10.10.30.2 
------------------------------------------------------------------------------- 
Test completed.
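For scripted use, the colon-separated report lines are straightforward to scrape. A minimal sketch, using only the field labels visible in the sample output above:

```python
# Sample lines copied from the find-egress report above.
sample = """Port        : 1/5/1
Router Name : Base
Interface Nm: toDUT-2917
Next Hop    : 10.10.30.2"""

def parse_report(text):
    """Split each 'Label : value' line on the first colon."""
    result = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            result[key.strip()] = value.strip()
    return result

info = parse_report(sample)  # e.g. info["Next Hop"] -> "10.10.30.2"
```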