EVPN

Overview and EVPN applications

Ethernet Virtual Private Network (EVPN) is an IETF technology, defined in RFC 7432, BGP MPLS-Based Ethernet VPN, that uses a new BGP address family and allows VPLS services to be operated in the same way as IP-VPNs, where the MAC addresses and the information to set up the flooding trees are distributed by BGP.

EVPN is defined to fill the gaps left by other L2VPN technologies such as VPLS. The main objective of EVPN is to build E-LAN services in a way similar to RFC 4364 IP-VPNs, while supporting MAC learning in the control plane (distributed by MP-BGP), efficient multidestination traffic delivery, and active-active multihoming.

EVPN can be used as the control plane for different data plane encapsulations. The Nokia implementation supports the following data planes:

  • EVPN for VXLAN overlay tunnels (EVPN-VXLAN)

    The main application for EVPN-VXLAN is the Data Center Gateway (DGW) function. In this application, VXLAN is expected within the Data Center, while VPLS SDP bindings or SAPs provide the connectivity to the WAN. R-VPLS and VPRN connectivity to the WAN is also supported.

    The EVPN-VXLAN functionality is standardized in RFC 8365.

  • EVPN for MPLS tunnels (EVPN-MPLS)

    In this data plane, PEs are connected by any type of MPLS tunnel. EVPN-MPLS is generally used as an evolution of VPLS services in the WAN, with Data Center Interconnect being one of its main applications.

    The EVPN-MPLS functionality is standardized in RFC 7432.

  • EVPN for PBB over MPLS tunnels (PBB-EVPN)

    PEs are connected by PBB over MPLS tunnels in this data plane. It is usually used for large-scale E-LAN and E-Line services in the WAN.

    The PBB-EVPN functionality is standardized in RFC 7623.

The 7750 SR, 7450 ESS, or 7950 XRS EVPN VXLAN implementation is integrated in the Nuage Data Center architecture, where the router serves as the DGW.

For more information about the Nuage Networks architecture and products, see the Nuage Networks Virtualized Service Platform Guide. The following sections describe the applications supported by EVPN in the 7750 SR, 7450 ESS, or 7950 XRS implementation.

EVPN for VXLAN tunnels in a Layer 2 DGW (EVPN-VXLAN)

Layer 2 DC PE with VPLS to the WAN shows the use of EVPN for VXLAN overlay tunnels on the 7750 SR, 7450 ESS, or 7950 XRS when it is used as a Layer 2 DGW.

Figure 1. Layer 2 DC PE with VPLS to the WAN

DC providers require a DGW solution that can extend tenant subnets to the WAN. Customers can deploy the NVO3-based solutions in the DC, where EVPN is the standard control plane and VXLAN is a predominant data plane encapsulation. The Nokia DC architecture uses EVPN and VXLAN as the control and data plane solutions for Layer 2 connectivity within the DC and so does the SR OS.

While EVPN VXLAN is used within the DC, some service providers use VPLS and H-VPLS as the solution to extend Layer 2 VPN connectivity. Layer 2 DC PE with VPLS to the WAN shows the Layer 2 DGW function on the 7750 SR, 7450 ESS, and 7950 XRS routers, providing VXLAN connectivity to the DC and regular VPLS connectivity to the WAN.

The WAN connectivity is based on VPLS where SAPs (null, dot1q, and qinq), spoke SDPs (FEC type 128 and 129), and mesh-SDPs are supported.

The DC GWs can provide multihoming resiliency through the use of BGP multihoming.

EVPN-MPLS can also be used in the WAN. In this case, the Layer 2 DGW function provides translation between EVPN-VXLAN and EVPN-MPLS. EVPN multihoming can be used to provide DGW redundancy.

If point-to-point services are needed in the DC, SR OS supports the use of EVPN-VPWS for VXLAN tunnels, including multihoming, in accordance with RFC 8214.

EVPN for VXLAN tunnels in a Layer 2 DC with integrated routing bridging connectivity on the DGW

Gateway IRB on the DC PE for an L2 EVPN/VXLAN DC shows the use of EVPN for VXLAN overlay tunnels on the 7750 SR, 7450 ESS, or 7950 XRS when the DC provides Layer 2 connectivity and the DGW can route the traffic to the WAN through an R-VPLS and linked VPRN.

Figure 2. Gateway IRB on the DC PE for an L2 EVPN/VXLAN DC

In some cases, the DGW must provide a Layer 3 default gateway function to all the hosts in a specified tenant subnet. In this case, the VXLAN data plane is terminated in an R-VPLS on the DGW, and connectivity to the WAN is accomplished through regular VPRN connectivity. The 7750 SR, 7450 ESS, and 7950 XRS support IPv4 and IPv6 interfaces as default gateways in this scenario.
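This model can be sketched in configuration terms as follows. This is a minimal, hedged sketch assuming the classic CLI: the service IDs, interface name, and address are illustrative placeholders, and the exact hierarchy may vary by release. The R-VPLS terminates the VXLAN data plane (vxlan vni) and is bound to a VPRN IP interface (allow-ip-int-bind plus the vpls statement under the interface), which acts as the default gateway:

```
configure
    service
        vpls <r-vpls-service-id> create
            allow-ip-int-bind
            vxlan vni <vni-id> create
        vprn <vprn-service-id> create
            interface <ip-int-name> create
                address <ip-address/prefix>
                vpls <r-vpls-service-name>
```

WAN reachability is then provided by the VPRN through its regular IP-VPN connectivity, as described above.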

EVPN for VXLAN tunnels in a Layer 3 DC with integrated routing bridging connectivity among VPRNs

Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows the use of EVPN for VXLAN tunnels on the 7750 SR, 7450 ESS, or 7950 XRS when the DC provides distributed Layer 3 connectivity to the DC tenants.

Figure 3. Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC

Each tenant has several subnets for which each DC Network Virtualization Edge (NVE) provides intra-subnet forwarding. An NVE may be a Nuage VSG, VSC/VRS, or any other NVE in the market supporting the same constructs, and each subnet normally corresponds to an R-VPLS. For example, in Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC, subnet 10.20.0.0 corresponds to R-VPLS 2001 and subnet 10.10.0.0 corresponds to R-VPLS 2000.

In this example, the NVE provides inter-subnet forwarding too, by connecting all the local subnets to a VPRN instance. When the tenant requires Layer 3 connectivity to the IP-VPN in the WAN, a VPRN is defined in the DGWs, which connects the tenant to the WAN. That VPRN instance is connected to the VPRNs in the NVEs by means of an IRB (Integrated Routing and Bridging) backhaul R-VPLS. This IRB backhaul R-VPLS provides a scalable solution because it allows Layer 3 connectivity to the WAN without the need for defining all of the subnets in the DGW.

The 7750 SR, 7450 ESS, and 7950 XRS DGW support the IRB backhaul R-VPLS model, where the R-VPLS runs EVPN-VXLAN and the VPRN instances exchange IP prefixes (IPv4 and IPv6) through the use of EVPN. Interoperability between the EVPN and IP-VPN for IP prefixes is also fully supported.

EVPN for VXLAN tunnels in a Layer 3 DC with EVPN-tunnel connectivity among VPRNs

EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows the use of EVPN for VXLAN tunnels on the 7750 SR, 7450 ESS, or 7950 XRS, when the DC provides distributed Layer 3 connectivity to the DC tenants and the VPRN instances are connected through EVPN tunnels.

Figure 4. EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC

The solution described in section EVPN for VXLAN tunnels in a Layer 3 DC with integrated routing bridging connectivity among VPRNs provides a scalable IRB backhaul R-VPLS service where all the VPRN instances for a specified tenant can be connected by using IRB interfaces. When this IRB backhaul R-VPLS is exclusively used as a backhaul and does not have any SAPs or SDP bindings directly attached, the solution can be optimized by using EVPN tunnels.

EVPN tunnels are enabled using the evpn-tunnel command under the R-VPLS interface configured on the VPRN. EVPN tunnels provide the following benefits to EVPN-VXLAN IRB backhaul R-VPLS services:

  • easier provisioning of the tenant service

    If an EVPN tunnel is configured in an IRB backhaul R-VPLS, there is no need to provision the IRB IPv4 addresses on the VPRN. This makes the provisioning easier to automate and saves IP addresses from the tenant space.

    Note: IPv6 interfaces do not require the provisioning of an IPv6 Global Address; a Link Local Address is automatically assigned to the IRB interface.
  • higher scalability of the IRB backhaul R-VPLS

    If EVPN tunnels are enabled, multicast traffic is suppressed in the EVPN-VXLAN IRB backhaul R-VPLS service (it is not required). As a result, the number of VXLAN bindings in IRB backhaul R-VPLS services with EVPN tunnels can be much higher.

This optimization is fully supported by the 7750 SR, 7450 ESS, and 7950 XRS.
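As a hedged sketch (classic CLI assumed; service ID and names are placeholders), the evpn-tunnel command is enabled under the R-VPLS binding of the VPRN interface, and no IPv4 address needs to be provisioned on that interface:

```
configure
    service
        vprn <service-id>
            interface <ip-int-name>
                vpls <irb-backhaul-r-vpls-name>
                    evpn-tunnel
```

With this configuration, the IRB interface relies on the automatically assigned IPv6 Link Local Address, and inter-VPRN traffic is carried over the EVPN tunnel rather than resolved through IRB IPv4 next hops.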

EVPN for MPLS tunnels in E-LAN services

EVPN for MPLS in VPLS services shows the use of EVPN for MPLS tunnels on the 7750 SR, 7450 ESS, and 7950 XRS. In this case, EVPN is used as the control plane for E-LAN services in the WAN.

Figure 5. EVPN for MPLS in VPLS services

EVPN-MPLS is standardized in RFC 7432 as an L2VPN technology that can fill the gaps in VPLS for E-LAN services. A significant number of service providers offering E-LAN services today are requesting EVPN for their multihoming capabilities, as well as the optimization EVPN provides. EVPN supports all-active multihoming (per-flow load-balancing multihoming) as well as single-active multihoming (per-service load-balancing multihoming).

EVPN is a standards-based technology that supports all-active multihoming. Although VPLS already supports single-active multihoming, EVPN's single-active multihoming is perceived as a superior technology because of its mass-withdrawal capabilities, which speed up convergence in scaled environments.

EVPN technology provides a number of significant benefits, including:

  • superior multihoming capabilities

  • an IP-VPN-like operation and control for E-LAN services

  • reduction and (in some cases) suppression of BUM (Broadcast, Unknown unicast, and Multicast) traffic in the network

  • simple provisioning and management

  • a new set of tools to control the distribution of MAC addresses and ARP entries in the network

The SR OS EVPN-MPLS implementation is compliant with RFC 7432.

EVPN-MPLS can also be enabled in R-VPLS services with the same feature-set that is described for VXLAN tunnels in sections EVPN for VXLAN tunnels in a Layer 3 DC with integrated routing bridging connectivity among VPRNs and EVPN for VXLAN tunnels in a Layer 3 DC with EVPN-tunnel connectivity among VPRNs.

EVPN for MPLS tunnels in E-Line services

The MPLS network used by EVPN for E-LAN services can also be shared by E-Line services using EVPN in the control plane. EVPN for E-Line services (EVPN-VPWS) is a simplification of the RFC 7432 procedures, and it is supported in compliance with RFC 8214.

EVPN for MPLS tunnels in E-Tree services

The MPLS network used by E-LAN and E-Line services can also be shared by Ethernet-Tree (E-Tree) services using the EVPN control plane. EVPN E-Tree services use the EVPN control plane extensions described in IETF RFC 8317 and are supported on the 7750 SR, 7450 ESS, and 7950 XRS.

EVPN for PBB over MPLS tunnels (PBB-EVPN)

EVPN for PBB over MPLS shows the use of EVPN for PBB over MPLS tunnels on the 7750 SR, 7450 ESS, and 7950 XRS. In this case, EVPN is used as the control plane for E-LAN services in the WAN.

Figure 6. EVPN for PBB over MPLS

EVPN for PBB over MPLS (hereafter called PBB-EVPN) is specified in RFC 7623. It provides a simplified version of EVPN for cases where the network requires very high scalability and does not need all the advanced features supported by EVPN-MPLS (but still requires single-active and all-active multihoming capabilities).

PBB-EVPN is a combination of 802.1ah PBB and RFC 7432 EVPN and reuses the PBB-VPLS service model, where BGP-EVPN is enabled in the B-VPLS domain. EVPN is used as the control plane in the B-VPLS domain to control the distribution of B-MACs and to set up per-ISID flooding trees for I-VPLS services. The learning of C-MACs, whether on local SAPs/SDP bindings or associated with remote B-MACs, is still performed in the data plane. Only the learning of B-MACs in the B-VPLS is performed through BGP.

The SR OS PBB-EVPN implementation supports PBB-EVPN for I-VPLS and PBB-Epipe services, including single-active and all-active multihoming.

EVPN for VXLAN tunnels and cloud technologies

This section provides information about EVPN for VXLAN tunnels and cloud technologies.

VXLAN

The SR OS, SR Linux, and Nuage solution for the DC supports VXLAN (Virtual eXtensible Local Area Network) overlay tunnels as per RFC 7348.

VXLAN addresses the data plane needs for overlay networks within virtualized data centers accommodating multiple tenants. The main attributes of the VXLAN encapsulation are:

  • VXLAN is an overlay network encapsulation used to carry MAC traffic between VMs over a logical Layer 3 tunnel.

  • Avoids the Layer 2 MAC explosion, because VM MACs are learned only at the edge of the network. Core nodes simply route the traffic based on the destination IP, which is the system IP address of the remote PE or VTEP (VXLAN Tunnel End Point).

  • Supports multipath scalability through ECMP (to a remote VTEP address, based on source UDP port entropy) while preserving the Layer 2 connectivity between VMs. xSTP is no longer needed in the network.

  • Supports multiple tenants, each with their own isolated Layer 2 domain. The tenant identifier is encoded in the VNI field (VXLAN Network Identifier) and allows up to 16M values, as opposed to the 4k values provided by the 802.1q VLAN space.

VXLAN frame format shows an example of the VXLAN encapsulation supported by the Nokia implementation.

Figure 7. VXLAN frame format

As shown in VXLAN frame format, VXLAN encapsulates the inner Ethernet frames into VXLAN + UDP/IP packets. The main pieces of information encoded in this encapsulation are:

  • VXLAN header (8 bytes)

    • Flags (8 bits), where the I flag is set to 1 to indicate that the VNI is present and valid. The remaining flags ("Reserved" bits) are set to 0.

    • The VNI (VXLAN Network Identifier) field, a 24-bit value that identifies an isolated Layer 2 domain within the DC network.

    • The remaining fields are reserved for future use.

  • UDP header (8 bytes)

    • The destination port is the well-known UDP port assigned by IANA (4789).

    • The source port is derived from a hash of the inner source and destination MAC/IP addresses, performed by the 7750 SR, 7450 ESS, or 7950 XRS at ingress. This creates an "entropy" value that can be used by the core DC nodes for load balancing on ECMP paths.

    • The checksum is set to zero.

  • Outer IP and Ethernet headers (34 or 38 bytes)

    • The source IP and source MAC identify the source VTEP. That is, these fields are populated with the PE's system IP and chassis MAC address.

      Note: The source MAC address is changed on all the IP hops along the path, as is usual in regular IP routing.
    • The destination IP identifies the remote VTEP (remote system IP) and is the result of the destination MAC lookup in the service Forwarding Database (FDB).

      Note: All remote MACs are learned by the EVPN BGP and associated with a remote VTEP address and VNI.
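For reference, RFC 7348 (cited above) lays out the 8-byte VXLAN header as follows, with the I flag set to 1 and all other flag and reserved bits set to 0:

```
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|R|R|R|I|R|R|R|            Reserved                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                VXLAN Network Identifier (VNI) |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```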

Some considerations related to the support of VXLAN on the 7750 SR, 7450 ESS, and 7950 XRS are:

  • VXLAN is only supported on network or hybrid ports with null or dot1q encapsulation.

  • VXLAN is supported on Ethernet/LAG and POS/APS.

  • IPv4 and IPv6 unicast addresses are supported as VTEPs.

  • By default, system IP addresses are supported as VTEPs for originating and terminating VXLAN tunnels. Non-system IPv4 and IPv6 addresses are supported by using a Forwarding Path Extension (FPE).

VXLAN ECMP and LAG

The DGW supports ECMP load balancing to reach the destination VTEP. Also, any intermediate core node in the Data Center should be able to provide further load balancing across ECMP paths because the source UDP port of each tunneled packet is derived from a hash of the customer inner packet. The following must be considered:

  • ECMP for VXLAN is supported on VPLS services, but not for BUM traffic. Unicast spraying is based on the packet contents.

  • ECMP for VXLAN on R-VPLS services is supported for VXLAN IPv6 tunnels.

  • ECMP for VXLAN IPv4 tunnels on R-VPLS is only supported if the command configure service vpls allow-ip-int-bind vxlan-ipv4-tep-ecmp is enabled on the R-VPLS (as well as config>router>ecmp).

  • ECMP for Layer 3 multicast traffic on R-VPLS services with EVPN-VXLAN destinations is only supported if the vpls allow-ip-int-bind ip-multicast-ecmp command is enabled (as well as config>router>ecmp).

  • In the cases where ECMP is not supported (BUM traffic in VPLS and ECMP on R-VPLS if not enabled), each VXLAN binding is tied to a single (different) ECMP path, so that in a normal deployment with a reasonable number of remote VTEPs, there should be a fair distribution of the traffic across the paths. In other words, only per-VTEP load-balancing is supported, instead of per-flow load-balancing.

  • LAG spraying based on the packet hash is supported in all the cases (VPLS unicast, VPLS BUM, and R-VPLS).
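Tying these options together, the following is a hedged configuration sketch (classic CLI assumed; values are placeholders) for enabling per-flow ECMP toward VXLAN IPv4 tunnels and Layer 3 multicast ECMP on an R-VPLS, using the commands named above:

```
configure
    router
        ecmp <max-ecmp-routes>
    service
        vpls <r-vpls-service-id>
            allow-ip-int-bind
                vxlan-ipv4-tep-ecmp
                ip-multicast-ecmp
```

Without these commands, the R-VPLS falls back to the per-VTEP load balancing described above.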

VXLAN VPLS tag handling

The following describes the behavior on the 7750 SR, 7450 ESS, and 7950 XRS with respect to VLAN tag handling for VXLAN VPLS services:

  • Dot1q, QinQ, and null SAPs, as well as regular VLAN handling procedures at the WAN side, are supported on VXLAN VPLS services.

  • No "vc-type vlan"-like VXLAN VNI bindings are supported. Therefore, at the egress of the VXLAN network port, the router does not add any inner VLAN tag on top of the VXLAN encapsulation, and at the ingress network port, the router ignores any VLAN tag received and considers it part of the payload.

VXLAN MTU considerations

For VXLAN VPLS services, the network port MTU must be at least 50 bytes (54 bytes if dot1q) greater than the Service-MTU to allow enough room for the VXLAN encapsulation.
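The 50-byte figure follows from the encapsulation sizes described earlier (8-byte VXLAN header, 8-byte UDP header, and 34- or 38-byte outer IP and Ethernet headers). For example, with a 1514-byte Service-MTU:

```
VXLAN header              8 bytes
UDP header                8 bytes
Outer IPv4 header        20 bytes
Outer Ethernet header    14 bytes  (18 bytes with dot1q)
---------------------------------
Total overhead           50 bytes  (54 bytes with dot1q)

Service-MTU 1514 bytes  ->  network port MTU >= 1564 bytes (1568 with dot1q)
```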

The Service-MTU is only enforced on SAPs (any SAP ingress packet with an MTU greater than the service-mtu is discarded) and not on VXLAN termination (any VXLAN ingress packet makes it to the egress SAP regardless of the configured service-mtu).

If BGP-EVPN is enabled in a VXLAN VPLS service, the Service-MTU can be advertised in the Inclusive Multicast Ethernet Tag routes and used to enforce that all the routers attached to the same EVPN service have the same Service-MTU configured.

Note: The router never fragments or reassembles VXLAN packets. In addition, the router always sets the DF (Do Not Fragment) flag in the VXLAN outer IP header.

VXLAN QoS

VXLAN is a network port encapsulation; therefore, the QoS settings for VXLAN are controlled from the network QoS policies.

Ingress

The network ingress QoS policy can be applied either to the network interface over which the VXLAN traffic arrives or under vxlan/network/ingress within the EVPN service.

Regardless of where the network QoS policy is applied, the ingress network QoS policy is used to classify the VXLAN packets based on the outer dot1p (if present), then the outer DSCP, to yield an FC/profile.

If the ingress network QoS policy is applied to the network interface over which the VXLAN traffic arrives then the VXLAN unicast traffic uses the network ingress queues configured on FP where the network interface resides. QoS control of BUM traffic received on the VXLAN tunnels is possible by separately redirecting these traffic types to policers within an FP ingress network queue group. This QoS control uses the per forwarding class fp-redirect-group parameter together with broadcast-policer, unknown-policer, and mcast-policer within the ingress section of a network QoS policy. This QoS control applies to all BUM traffic received for that forwarding class on the network IP interface on which the network QoS policy is applied.

The ingress network QoS policy can also be applied within the EVPN service by referencing an FP queue group instance, as follows:

configure
    service
        vpls <service-id>
            vxlan vni <vni-id>
                network
                    ingress
                        qos <network-policy-id>
                            fp-redirect-group <queue-group-name>
                                instance <instance-id>

In this case, the redirection to a specific ingress FP queue group applies as a single entity (per forwarding class) to all VXLAN traffic received only by this service. This overrides the QoS applied to the related network interfaces for traffic arriving on VXLAN tunnels in that service but does not affect traffic received on a spoke SDP in the same service. It is also possible to redirect unicast traffic to a policer using the per forwarding class fp-redirect-group policer parameter, as well as the BUM traffic as described above, within the ingress section of a network QoS policy. The ler-use-dscp, ip-criteria, and ipv6-criteria statements are ignored if configured in the ingress section of the referenced network QoS policy. If the instance of the named queue group template referenced in the qos command is not configured on an FP receiving the VXLAN traffic, the traffic uses the ingress network queues or queue group related to the network interface.

Egress

On egress, there is no need to specify "remarking" in the policy to mark the DSCP. This is because the VXLAN encapsulation adds a new outer IP header, and the DSCP is always marked based on the egress network QoS policy.

VXLAN ping

A new VXLAN troubleshooting tool, VXLAN Ping, is available to verify VXLAN VTEP connectivity. The VXLAN Ping command is available from interactive CLI and SNMP.

This tool allows the user to specify a wide range of variables to influence how the packet is forwarded from the VTEP source to VTEP termination. The ping function requires the user to specify a different test-id (equates to originator handle) for each active and outstanding test. The required local service identifier from which the test is launched determines the source IP (the system IP address) to use in the outer IP header of the packet. This IP address is encoded into the VXLAN header Source IP TLV. The service identifier also encodes the local VNI. The outer-ip-destination must equal the VTEP termination point on the remote node, and the dest-vni must be a valid VNI within the associated service on the remote node. The outer source IP address is automatically detected and inserted in the IP header of the packet. The outer source IP address uses the IPv4 system address by default.

If the VTEP is created using a non-system source IP address through the vxlan-src-vtep command, the outer source IP address uses the address specified by vxlan-src-vtep. The remainder of the variables are optional.

The VXLAN PDU is encapsulated in the appropriate transport header and forwarded within the overlay to the appropriate VTEP termination. The VXLAN router alert (RA) bit is set to prevent forwarding of the OAM PDU beyond the terminating VTEP. Because handling of the router alert bit was not defined in some early releases of VXLAN implementations, the VNI Informational bit (I-bit) is set to 0 for OAM packets. This indicates that the VNI is invalid, and the packet should not be forwarded. This safeguard can be overridden by including the i-flag-on option, which sets the bit to 1 (valid VNI). Ensure that OAM frames meant to be contained to the terminating VTEP are not forwarded beyond it.

The supporting VXLAN OAM ping draft includes a requirement to encode a reserved IEEE MAC address as the inner destination value. However, at the time of implementation, that IEEE MAC address had not been assigned. The inner IEEE MAC address defaults to 00:00:00:00:00:00, but may be changed using the inner-l2 option. Inner IEEE MAC addresses that are included with OAM packets are not learned in the local Layer 2 forwarding databases.

The echo responder terminates the VXLAN OAM frame, takes the appropriate response action, and includes the relevant return codes. By default, the response is sent back using the IP network as an IPv4 UDP response. The user can override this default by changing the reply-mode to overlay. The overlay return mode forces the responder to use the VTEP connection representing the source IP and source VTEP. If a return overlay is not available, the echo response is dropped by the responder.

Support is included for:

  • IPv4 VTEP

  • Optional specification of the outer UDP source port, which helps downstream network elements along the path hash the flow to the same ECMP path

  • Optional configuration of the inner IP information, which helps the user test different equal paths where ECMP is deployed on the source. A test only validates a single path where ECMP functions are deployed. The inner IP information is processed by a hash function, and there is no guarantee that changing the IP information between tests selects different paths.

  • Optional end system validation for a single L2 IEEE MAC address per test. This function checks the remote FDB for the configured IEEE MAC Address. Only one end system IEEE MAC Address can be configured per test.

  • Reply mode UDP (default) or Overlay

  • Optional additional padding can be added to each packet. There is an option that indicates how the responder should handle the pad TLV. By default, the padding is not reflected to the source. The user can change this behavior by including the reflect-pad option. The reflect-pad option is not supported when the reply mode is set to UDP.

  • Configurable send counts, intervals, timeouts, and forwarding class

The VXLAN OAM PDU includes two timestamps, which are used to report the forward-direction delay. Unidirectional delay metrics require accurate time-of-day clock synchronization. Negative unidirectional delay values are reported as "0.000". The round-trip value includes the entire round-trip time, including the time the remote peer takes to process the packet. These reported values may not be representative of network delay.

The following example commands and outputs show how the VXLAN Ping function can be used to validate connectivity. The echo output includes a new header to better describe the VXLAN ping packet headers and the various levels.

oam vxlan-ping test-id 1 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 interval 0.1 send-count 10

TestID 1, Service 1, DestVNI 2, ReplyMode UDP, IFlag Off, PadSize 0, ReflectPad No, SendCount 10, Interval 0.1, Timeout 5
Outer: SourceIP 10.20.1.3, SourcePort Dynamic, DestIP 10.20.1.4, TTL 10, FC be, Profile In
Inner: DestMAC 00:00:00:00:00:00, SourceIP 10.20.1.3, DestIP 127.0.0.1

! ! ! ! ! ! ! ! ! ! 
---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
10 packets transmitted, 10 packets received, 0.00% packet loss
   10 non-errored responses(!), 0 out-of-order(*), 0 malformed echo responses(.)
   0 send errors(.), 0 time outs(.)
   0 overlay segment not found, 0 overlay segment not operational
forward-delay min = 1.097ms, avg = 2.195ms, max = 2.870ms, stddev = 0.735ms
round-trip-delay min = 1.468ms, avg = 1.693ms, max = 2.268ms, stddev = 0.210ms



oam vxlan-ping test-id 2 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 outer-ip-source-udp 65000 outer-ip-ttl 64 inner-l2 d0:0d:1e:00:00:01 inner-ip-source 192.168.1.2 inner-ip-destination 127.0.0.8 reply-mode overlay send-count 20 interval 1 timeout 3 padding 1000 reflect-pad fc nc profile out

TestID 2, Service 1, DestVNI 2, ReplyMode overlay, IFlag Off, PadSize 1000, ReflectPad Yes, SendCount 20, Interval 1, Timeout 3
Outer: SourceIP 10.20.1.3, SourcePort 65000, DestIP 10.20.1.4, TTL 64, FC nc, Profile out
Inner: DestMAC d0:0d:1e:00:00:01, SourceIP 192.168.1.2, DestIP 127.0.0.8

===================================================================================
rc=1 Malformed Echo Request Received, rc=2 Overlay Segment Not Present, rc=3 Overlay Segment Not Operational, rc=4 Ok
===================================================================================
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=1 ttl=255 rtt-time=1.733ms fwd-time=0.302ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=2 ttl=255 rtt-time=1.549ms fwd-time=1.386ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=3 ttl=255 rtt-time=3.243ms fwd-time=0.643ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=4 ttl=255 rtt-time=1.551ms fwd-time=2.350ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=5 ttl=255 rtt-time=1.644ms fwd-time=1.080ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=6 ttl=255 rtt-time=1.670ms fwd-time=1.307ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=7 ttl=255 rtt-time=1.636ms fwd-time=0.490ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=8 ttl=255 rtt-time=1.649ms fwd-time=0.005ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=9 ttl=255 rtt-time=1.401ms fwd-time=0.685ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=10 ttl=255 rtt-time=1.634ms fwd-time=0.373ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=11 ttl=255 rtt-time=1.559ms fwd-time=0.679ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=12 ttl=255 rtt-time=1.666ms fwd-time=0.880ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=13 ttl=255 rtt-time=1.629ms fwd-time=0.669ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=14 ttl=255 rtt-time=1.280ms fwd-time=1.029ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=15 ttl=255 rtt-time=1.458ms fwd-time=0.268ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=16 ttl=255 rtt-time=1.659ms fwd-time=0.786ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=17 ttl=255 rtt-time=1.636ms fwd-time=1.071ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=18 ttl=255 rtt-time=1.568ms fwd-time=2.129ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=19 ttl=255 rtt-time=1.657ms fwd-time=1.326ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=20 ttl=255 rtt-time=1.762ms fwd-time=1.335ms. rc=4

---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
20 packets transmitted, 20 packets received, 0.00% packet loss
   20 valid responses, 0 out-of-order, 0 malformed echo responses
   0 send errors, 0 time outs
   0 overlay segment not found, 0 overlay segment not operational
forward-delay min = 0.005ms, avg = 0.939ms, max = 2.350ms, stddev = 0.577ms
round-trip-delay min = 1.280ms, avg = 1.679ms, max = 3.243ms, stddev = 0.375ms



oam vxlan-ping test-id 1 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 send-count 10 end-system 00:00:00:00:00:01 interval 0.1

TestID 1, Service 1, DestVNI 2, ReplyMode UDP, IFlag Off, PadSize 0, ReflectPad No, EndSystemMAC 00:00:00:00:00:01, SendCount 10, Interval 0.1, Timeout 5
Outer: SourceIP 10.20.1.3, SourcePort Dynamic, DestIP 10.20.1.4, TTL 10, FC be, Profile In
Inner: DestMAC 00:00:00:00:00:00, SourceIP 10.20.1.3, DestIP 127.0.0.1

2 2 2 2 2 2 2 2 2 2 
---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
10 packets transmitted, 10 packets received, 0.00% packet loss
   10 non-errored responses(!), 0 out-of-order(*), 0 malformed echo responses(.)
   0 send errors(.), 0 time outs(.)
   0 overlay segment not found, 0 overlay segment not operational
   0 end-system present(1), 10 end-system not present(2)
forward-delay min = 0.467ms, avg = 0.979ms, max = 1.622ms, stddev = 0.504ms
round-trip-delay min = 1.501ms, avg = 1.597ms, max = 1.781ms, stddev = 0.088ms



oam vxlan-ping test-id 1 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 send-count 10 end-system 00:00:00:00:00:01

TestID 1, Service 1, DestVNI 2, ReplyMode UDP, IFlag Off, PadSize 0, ReflectPad No, EndSystemMAC 00:00:00:00:00:01, SendCount 10, Interval 1, Timeout 5
Outer: SourceIP 10.20.1.3, SourcePort Dynamic, DestIP 10.20.1.4, TTL 10, FC be, Profile In
Inner: DestMAC 00:00:00:00:00:00, SourceIP 10.20.1.3, DestIP 127.0.0.1

===================================================================================
rc=1 Malformed Echo Request Received, rc=2 Overlay Segment Not Present, rc=3 Overlay Segment Not Operational, rc=4 Ok
mac=1 End System Present, mac=2 End System Not Present
===================================================================================

92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=1 ttl=255 rtt-time=2.883ms fwd-time=4.196ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=2 ttl=255 rtt-time=1.596ms fwd-time=1.536ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=3 ttl=255 rtt-time=1.698ms fwd-time=0.000ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=4 ttl=255 rtt-time=1.687ms fwd-time=1.766ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=5 ttl=255 rtt-time=1.679ms fwd-time=0.799ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=6 ttl=255 rtt-time=1.678ms fwd-time=0.000ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=7 ttl=255 rtt-time=1.709ms fwd-time=0.031ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=8 ttl=255 rtt-time=1.757ms fwd-time=1.441ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=9 ttl=255 rtt-time=1.613ms fwd-time=2.570ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=10 ttl=255 rtt-time=1.631ms fwd-time=2.130ms. rc=4 mac=2

---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
10 packets transmitted, 10 packets received, 0.00% packet loss
   10 valid responses, 0 out-of-order, 0 malformed echo responses
   0 send errors, 0 time outs
   0 overlay segment not found, 0 overlay segment not operational
   0 end-system present, 10 end-system not present
forward-delay min = 0.000ms, avg = 1.396ms, max = 4.196ms, stddev = 1.328ms
round-trip-delay min = 1.596ms, avg = 1.793ms, max = 2.883ms, stddev = 0.366ms

EVPN-VXLAN routed VPLS multicast routing support

IPv4 and IPv6 multicast routing is supported in an EVPN-VXLAN VPRN and IES routed VPLS service through its IP interface when the source of the multicast stream is on one side of its IP interface and the receivers are on either side of the IP interface. For example, the source for multicast stream G1 could be on the IP side, sending to receivers on both other regular IP interfaces and the VPLS of the routed VPLS service, while the source for group G2 could be on the VPLS side sending to receivers on both the VPLS and IP side of the routed VPLS service. See IPv4 and IPv6 multicast routing support for more details.

IGMP and MLD snooping on VXLAN

The delivery of IP multicast in VXLAN services can be optimized with IGMP and MLD snooping. IGMP and MLD snooping are supported in EVPN-VXLAN VPLS services and in EVPN-VXLAN VPRN/IES R-VPLS services. When enabled, IGMP and MLD reports are snooped on SAPs or SDP bindings, but also on VXLAN bindings, to create or modify entries in the MFIB for the VPLS service.

When configuring IGMP and MLD snooping in EVPN-VXLAN VPLS services, consider the following:

  • To enable IGMP snooping in the VPLS service on VXLAN, use the configure service vpls igmp-snooping no shutdown command.

  • To enable MLD snooping in the VPLS service on VXLAN, use the configure service vpls mld-snooping no shutdown command.

  • The VXLAN bindings only support basic IGMP/MLD snooping functionality. Features configurable under SAPs or SDP bindings are not available for VXLAN (VXLAN bindings are configured with the default values used for SAPs and SDP bindings). By default, a specified VXLAN binding only becomes a dynamic Mrouter when it receives IGMP or MLD queries and adds a specified multicast group to the MFIB when it receives an IGMP or MLD report for that group.

    Alternatively, it is possible to configure all VXLAN bindings for a particular VXLAN instance to be Mrouter ports using the configure service vpls vxlan igmp-snooping mrouter-port and configure service vpls vxlan mld-snooping mrouter-port commands.

  • The show service id igmp-snooping, clear service id igmp-snooping, show service id mld-snooping, and clear service id mld-snooping commands are also available for VXLAN bindings.

    Note: MLD snooping uses MAC-based forwarding. See MAC-based IPv6 multicast forwarding for more details.

    The following CLI commands show how the system displays IGMP snooping information and statistics on VXLAN bindings (the equivalent MLD output is similar).

*A:PE1# show service id 1 igmp-snooping port-db vxlan vtep  192.0.2.72 vni 1 detail 
===============================================================================
IGMP Snooping VXLAN 192.0.2.72/1 Port-DB for service 1
===============================================================================
-------------------------------------------------------------------------------
IGMP Group 239.0.0.1
-------------------------------------------------------------------------------
Mode             : exclude              Type             : dynamic
Up Time          : 0d 19:07:05          Expires          : 137s
Compat Mode      : IGMP Version 3       
V1 Host Expires  : 0s                   V2 Host Expires  : 0s
-------------------------------------------------------
Source Address  Up Time      Expires  Type     Fwd/Blk
-------------------------------------------------------
No sources.
-------------------------------------------------------------------------------
IGMP Group 239.0.0.2
-------------------------------------------------------------------------------
Mode             : include              Type             : dynamic
Up Time          : 0d 19:06:39          Expires          : 0s
Compat Mode      : IGMP Version 3       
V1 Host Expires  : 0s                   V2 Host Expires  : 0s
-------------------------------------------------------
Source Address  Up Time      Expires  Type     Fwd/Blk
-------------------------------------------------------
10.0.0.232      0d 19:06:39  137s     dynamic  Fwd              
-------------------------------------------------------------------------------
Number of groups: 2
===============================================================================

*A:PE1# show service id 1 igmp-snooping statistics vxlan vtep 192.0.2.72 vni 1
 
===============================================================================
IGMP Snooping Statistics for VXLAN 192.0.2.72/1 (service 1)
===============================================================================
Message Type            Received      Transmitted   Forwarded
-------------------------------------------------------------------------------
General Queries         0             0             556
Group Queries           0             0             0
Group-Source Queries    0             0             0
V1 Reports              0             0             0
V2 Reports              0             0             0
V3 Reports              553           0             0
V2 Leaves               0             0             0
Unknown Type            0             N/A           0
-------------------------------------------------------------------------------
Drop Statistics
-------------------------------------------------------------------------------
Bad Length               : 0
Bad IP Checksum          : 0
Bad IGMP Checksum        : 0
Bad Encoding             : 0
No Router Alert          : 0
Zero Source IP           : 0
Wrong Version            : 0
Lcl-Scope Packets        : 0
Rsvd-Scope Packets       : 0
 
Send Query Cfg Drops     : 0
Import Policy Drops      : 0
Exceeded Max Num Groups  : 0
Exceeded Max Num Sources : 0
Exceeded Max Num Grp Srcs: 0
MCAC Policy Drops        : 0
===============================================================================
*A:PE1# show service id 1 mfib 
===============================================================================
Multicast FIB, Service 1
===============================================================================
Source Address  Group Address         SAP or SDP Id               Svc Id   Fwd/Blk
-------------------------------------------------------------------------------
*               *                     sap:1/1/1:1              Local    Fwd
*               239.0.0.1             sap:1/1/1:1              Local    Fwd
                                      vxlan:192.0.2.72/1       Local    Fwd
10.0.0.232      239.0.0.2             sap:1/1/1:1              Local    Fwd
                                      vxlan:192.0.2.72/1       Local    Fwd
-------------------------------------------------------------------------------
Number of entries: 3
===============================================================================

PIM snooping on VXLAN

PIM snooping for IPv4 and IPv6 is supported in an EVPN-VXLAN VPLS or R-VPLS service (with the R-VPLS attached to a VPRN or IES service). The snooping operation is similar to that within a VPLS service (see PIM snooping for VPLS) and supports both PIM snooping and PIM proxy modes.

PIM snooping for IPv4 is enabled using the configure service vpls pim-snooping command.

PIM snooping for IPv6 is enabled using the configure service vpls pim-snooping no ipv6-multicast-disable command.

When using PIM snooping for IPv6, the default forwarding is MAC-based with optional support for SG-based (see IPv6 multicast forwarding). SG-based forwarding requires FP3- or higher-based hardware.

It is not possible to configure max-num-groups for VXLAN bindings.

Static VXLAN termination in Epipe services

By default, the system IP address is used to terminate and generate VXLAN traffic. The following configuration example shows an Epipe service that supports static VXLAN termination:

config service epipe 1 name "epipe1" customer 1 create
  sap 1/1/1:1 create
  exit
  vxlan vni 100 create
    egr-vtep 192.0.2.1
      oper-group op-grp-1
    exit
  no shutdown

Where:

  • vxlan vni vni create specifies the ingress VNI the router uses to identify packets for the service. The following considerations apply:

    • In services that use EVPN, the configured VNI is only used as the ingress VNI to identify packets that belong to the service; egress VNIs are learned from BGP EVPN. In the case of static VXLAN, the configured VNI is also used as the egress VNI (because there is no BGP EVPN control plane).

    • The configured VNI is unique in the system, and as a result, it can only be configured in one service (VPLS or Epipe).

  • egr-vtep ip-address specifies the remote VTEP the router uses when encapsulating frames into VXLAN packets. The following considerations apply:

    • When the PE receives VXLAN packets, the source VTEP is not checked against the configured egress VTEP.

    • The ip-address must be present in the global routing table so that the VXLAN destination is operationally up.

  • The oper-group may be added under egr-vtep. The expected behavior for the operational group and service status is as follows:

    • If the egr-vtep entry is not present in the routing table, the VXLAN destination (in the show service id vxlan command) and the provisioned operational group under egr-vtep enter the operationally down state.

    • If the Epipe SAP goes down, the service goes down; however, the service is not affected if the VXLAN destination goes down.

    • If the service is admin shutdown, then in addition to the SAP, the VXLAN destination and the oper-group also enter the operationally down state.

    Note: The operational group configured under egr-vtep cannot be monitored on the SAP of the Epipe where it is configured.

The following features are not supported by Epipe services with VXLAN destinations:

  • per-service hashing

  • SDP-binds

  • PBB context

  • BGP-VPWS

  • spoke SDP-FEC

  • PW-port
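The role of the configured VNI described above can be illustrated with the VXLAN header layout from RFC 7348. The following is a minimal Python sketch (not SR OS code) that packs and parses the 8-byte VXLAN header; with static VXLAN the same configured VNI is used in both directions, so the VNI written on egress is the one matched on ingress.

```python
import struct

VXLAN_FLAG_I = 0x08  # I flag: the VNI field is valid (RFC 7348)

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header carrying a 24-bit VNI."""
    if not 0 <= vni < 1 << 24:
        raise ValueError("VNI must fit in 24 bits")
    # flags(1) + reserved(3) + vni(3) + reserved(1)
    return struct.pack("!B3s3sB", VXLAN_FLAG_I, b"\x00" * 3,
                       vni.to_bytes(3, "big"), 0)

def parse_vni(header: bytes) -> int:
    """Return the VNI used on ingress to map the packet to a service."""
    if not header[0] & VXLAN_FLAG_I:
        raise ValueError("I flag not set; VNI not valid")
    return int.from_bytes(header[4:7], "big")

# Static VXLAN: the configured VNI (100 in the example Epipe) is both
# the ingress and the egress VNI.
hdr = vxlan_header(100)
assert parse_vni(hdr) == 100
```

This only models the header encoding; the actual UDP/IP encapsulation (destination port 4789, outer VTEP addresses) is handled by the router data plane.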

Static VXLAN termination in VPLS/R-VPLS services

VXLAN instances in VPLS and R-VPLS can be configured with egress VTEPs. This is referred to as a static vxlan-instance. The following configuration example shows a VPLS service that supports a static vxlan-instance:

config service vpls 1 name "vpls-1" customer 1 create
  sap 1/1/1:1 create
  exit
  vxlan instance 1 vni 100 create
    source-vtep-security
    no disable-aging /* default: disable-aging    
    no disable-learning  /* default: disable-learning
    no discard-unknown-source
    no max-nbr-mac-addr <table-size>
    restrict-protected-src discard-frame
    egr-vtep 192.0.2.1 create
    exit
    egr-vtep 192.0.2.2 create
    exit
  vxlan instance 2 vni 101 create
    egr-vtep 192.0.2.3 create
    exit

  no shutdown

Specifically the following can be stated:

  • Each VPLS service can have up to two static VXLAN instances. Each instance is an implicit split-horizon group, and up to 255 static VXLAN bindings are supported in total, shared between the two VXLAN instances.

  • Single VXLAN instance VPLS services with static VXLAN are supported along with SAPs and SDP bindings. Therefore:

    • VNIs configured in static VXLAN instances are ‟symmetric”, that is, the same ingress and egress VNIs are used for VXLAN packets using that instance. Note that asymmetric VNIs are actually possible in EVPN VXLAN instances.

    • The addresses can be IPv4 or IPv6 (but not a mix within the same service).

    • A specified VXLAN instance can be configured with static egress VTEPs, or be associated with BGP EVPN, but the same instance cannot be configured to support both static and BGP-EVPN based VXLAN bindings.

  • Up to two VXLAN instances are supported per VPLS.

    • When two VXLAN instances are configured in the same VPLS service, any combination of static and BGP-EVPN enabled instances are supported. That is, the two VXLAN instances can be static, or BGP-EVPN enabled, or one of each type.

    • When a service is configured with EVPN and there is a static VXLAN instance in the same service, the user must configure restrict-protected-src discard-frame along with no disable-learning in the static VXLAN instance, under service>vpls>vxlan.

  • MAC addresses are also learned on the VXLAN bindings of the static VXLAN instance and are therefore shown in the FDB commands. Note that disable-learning and disable-aging are enabled by default in a static vxlan-instance.

    • The learned MAC addresses are subject to the remote-age, and not the local-age (only MACs learned on SAPs use the local-age setting).

    • MAC addresses are learned on a VTEP as long as no disable-learning is configured, and the VXLAN VTEP is present in the base route table. When the VTEP disappears from the route table, the associated MACs are flushed.

  • The vpls vxlan source-vtep-security command can be configured per VXLAN instance on VPLS services. When enabled, the router performs an IPv4 source-vtep lookup to discover if the VXLAN packet comes from a trusted VTEP. If not, the router discards the frame. If the lookup yields a trusted source VTEP, then the frame is accepted.

    • A trusted VTEP is an egress VTEP that has been statically configured, or dynamically learned (through EVPN), in any service, Epipe or VPLS.

    • The command show service vxlan shows the list of trusted VTEPs in the router.

    • The command source-vtep-security works for static VXLAN instances or BGP-EVPN enabled VXLAN instances, but only for IPv4 VTEPs.

    • The command is mutually exclusive with assisted-replication (replicator or leaf) in the VNI instance. AR can still be configured in a different instance.

Static VXLAN instances can use non-system IPv4/IPv6 termination.
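The source-vtep-security check described above amounts to a lookup of the outer source IP address against the set of trusted VTEPs. The following Python sketch is purely illustrative; the names trusted_vteps and accept_vxlan_packet are hypothetical and do not correspond to SR OS internals.

```python
# Hypothetical model of the source-vtep-security check: the router keeps
# a set of trusted VTEPs (egress VTEPs statically configured or learned
# through EVPN in any Epipe or VPLS service) and discards VXLAN packets
# whose outer source IP is not in that set.
trusted_vteps = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}  # per the example VPLS

def accept_vxlan_packet(source_vtep: str) -> bool:
    """Return True if the packet's outer source IP is a trusted VTEP."""
    return source_vtep in trusted_vteps

assert accept_vxlan_packet("192.0.2.1")        # statically configured VTEP
assert not accept_vxlan_packet("203.0.113.9")  # unknown source: discarded
```

As noted above, the real check applies only to IPv4 VTEPs and is performed per VXLAN instance.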

Non-system IPv4 and IPv6 VXLAN termination in VPLS, R-VPLS, and Epipe services

By default, only VXLAN packets with the same IP destination address as the system IPv4 address of the router can be terminated and processed for a subsequent MAC lookup. A router can simultaneously terminate VXLAN tunnels destined for its system IP address and three additional non-system IPv4 or IPv6 addresses, which can be on the base router or VPRN instances. This section describes the configuration requirements for services to terminate VXLAN packets destined for a non-system loopback IPv4 or IPv6 address on the base router or VPRN.

Perform the following steps to configure a service with non-system IPv4 or IPv6 VXLAN termination:
  1. Create the FPE (see FPE creation)
  2. Associate the FPE with VXLAN termination (see FPE association with VXLAN termination)
  3. Configure the router loopback interface (see VXLAN router loopback interface)
  4. Configure VXLAN termination (non-system) VTEP addresses (see VXLAN termination VTEP addresses)
  5. Add the service configuration (see VXLAN services)
The following considerations apply to each of these steps.
  • FPE creation

    A Forwarding Path Extension (FPE) is required to terminate non-system IPv4 or IPv6 VXLAN tunnels.

    In a non-system IPv4 VXLAN termination, the FPE function is used for additional processing required at ingress (VXLAN tunnel termination) only, and not at egress (VXLAN tunnel origination).

    If the IPv6 VXLAN terminates on a VPLS or Epipe service, the FPE function is used at ingress only, and not at egress.

    For R-VPLS services terminating IPv6 VXLAN tunnels and also for VPRN VTEPs, the FPE is used for the egress as well as the VXLAN termination function. In the case of R-VPLS, an internal static SDP is created to allow the required extra processing.

    For information about FPE configuration and functions, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Interface Configuration Guide, "Forwarding Path Extension".

  • FPE association with VXLAN termination

    The FPE must be associated with the VXLAN termination application. The following example configuration shows two FPEs and their corresponding association. FPE 1 uses the base router and FPE 2 is configured for VXLAN termination on VPRN 10.

    configure
       fwd-path-ext
           fpe 1 create
                 path pxc pxc-1
                 vxlan-termination
           fpe 2 create
                 path pxc pxc-2
                 vxlan-termination router 10
    
  • VXLAN router loopback interface

    Create the interface that terminates and originates the VXLAN packets. The interface is created as a router interface, which is added to the Interior Gateway Protocol (IGP) and used by the BGP as the EVPN NLRI next hop.

    Because the system cannot terminate the VXLAN on a local interface address, a subnet must be assigned to the loopback interface and not a host IP address that is /32 or /128. In the following example, all the addresses in subnet 10.11.11.0/24 (except 10.11.11.1, which is the interface IP address) and subnet 10.1.1.0/24 (except 10.1.1.1) can be used for tunnel termination. The subnet is advertised using the IGP and is configured on either the base router or a VPRN. In the example, two subnets are assigned, in the base router and VPRN 10 respectively.

    configure 
      router 
           interface "lo1"
                loopback
                address 10.11.11.1/24 
           isis
                interface "lo1"
                     passive
                     no shutdown
    
    configure
        service
            vprn 10 name "vprn10" customer 1 create
                interface "lo1"
                    loopback
                    address 10.1.1.1/24
                isis 
                    interface "lo1" 
                        passive 
                        no shutdown 
    

    A local interface address cannot be configured as a VXLAN tunnel-termination IP address in the CLI, as shown in the following example.

    *A:PE-3# configure service system vxlan tunnel-termination 192.0.2.3 fpe 1 create
    MINOR: SVCMGR #8353 VXLAN Tunnel termination IP address cannot be configured -
     IP address in use by another application or matches a local interface IP address
    

    The subnet can be up to 31 bits. For example, to use 10.11.11.1 as the VXLAN termination address, the subnet should be configured and advertised as shown in the following example configuration.

     interface "lo1"
                address 10.11.11.0/31
                loopback
                no shutdown
            exit
            isis 0
                interface "lo1"
                    passive
                    no shutdown
                exit
                no shutdown
            exit
    

    It is not a requirement for the remote PEs and NVEs to have the specific /32 or /128 IP address in their RTM to resolve the BGP EVPN NLRI next hop or forward the VXLAN packets. An RTM with a subnet that contains the remote VTEP can also perform these tasks.
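The subnet rules above can be sketched with Python's ipaddress module; the helper name is hypothetical and the addresses are taken from the examples.

```python
import ipaddress

# Loopback subnet from the example: interface IP 10.11.11.1 in /24.
subnet = ipaddress.ip_network("10.11.11.0/24")
interface_ip = ipaddress.ip_address("10.11.11.1")

def usable_for_termination(addr: str) -> bool:
    """An address can terminate VXLAN if it is inside the advertised
    subnet but is not the local interface address itself."""
    ip = ipaddress.ip_address(addr)
    return ip in subnet and ip != interface_ip

assert usable_for_termination("10.11.11.2")
assert not usable_for_termination("10.11.11.1")  # local interface IP
assert not usable_for_termination("192.0.2.10")  # outside the subnet

# With a /31 (the smallest allowed subnet), 10.11.11.1 itself becomes
# usable when the interface address is 10.11.11.0, as in the /31 example.
s31 = ipaddress.ip_network("10.11.11.0/31")
assert ipaddress.ip_address("10.11.11.1") in s31
```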

    Note: The system does not check for a pre-existing local base router loopback interface with a subnet corresponding to the VXLAN tunnel termination address. If a tunnel termination address is configured and the FPE is operationally up, the system starts terminating VXLAN traffic and responding to ICMP messages for that address. The following conditions are ignored in this scenario:
    • the presence of a loopback interface in the base router

    • the presence of an interface with the address contained in the configured subnet, and no loopback

    The following example output includes an IPv6 address in the base router. It could also be configured in a VPRN instance.

    configure 
      router 
          interface "lo1" 
               loopback
               address 10.11.11.1/24 
               ipv6
                   address 2001:db8::/127
               exit
          isis
               interface "lo1"
                   passive
                   no shutdown
    
  • VXLAN termination VTEP addresses

    The service>system>vxlan>tunnel-termination context allows the user to configure non-system IP addresses that can terminate the VXLAN and their corresponding FPEs.

    As shown in the following example, an IP address may be associated with a new or existing FPE already terminating the VXLAN. The list of addresses that can terminate the VXLAN can include IPv4 and IPv6 addresses.

    config service system vxlan# 
           tunnel-termination 10.11.11.1 fpe 1 create
           tunnel-termination 2001:db8:1000::1 fpe 1 create 
            
    config service vprn 10 vxlan# 
           tunnel-termination 10.1.1.2 fpe 2 create
    

    The tunnel-termination command creates internal loopback interfaces that can respond to ICMP requests. In the following sample output, an internal loopback is created when the tunnel termination address is added (for 10.11.11.1 and 2001:db8:1000::1). The internal FPE router interfaces created by the VXLAN termination function are also shown in the output. Similar loopback and interfaces are created for tunnel termination addresses in a VPRN (not shown).

    *A:PE1# show router interface 
    ===============================================================================
    Interface Table (Router: Base)
    ===============================================================================
    Interface-Name                   Adm       Opr(v4/v6)  Mode    Port/SapId
       IP-Address                                                  PfxState
    -------------------------------------------------------------------------------
    _tmnx_fpe_1.a                    Up        Up/Up       Network pxc-2.a:1
       fe80::100/64                                                PREFERRED
    _tmnx_fpe_1.b                    Up        Up/Up       Network pxc-2.b:1
       fe80::101/64                                                PREFERRED
    _tmnx_vli_vxlan_1_131075         Up        Up/Up       Network loopback
       10.11.11.1/32                                               n/a
       2001:db8:1000::1                                            PREFERRED
       fe80::6cfb:ffff:fe00:0/64                                   PREFERRED
    lo1                              Up        Up/Down     Network loopback
       10.11.11.0/31                                               n/a
    system                           Up        Up/Down     Network system
       1.1.1.1/32                                                  n/a
    <snip>
    
  • VXLAN services

    By default, the VXLAN services use the system IP address as the source VTEP of the VXLAN encapsulated frames. The vxlan-src-vtep command in the config>service>vpls or config>service>epipe context enables the system to use a non-system IPv4 or IPv6 address as the source VTEP for the VXLAN tunnels in that service.

    A different vxlan-src-vtep can be used for different services, as shown in the following example where two different services use different non-system IP addresses as source VTEPs.

    configure service vpls 1 
       vxlan-src-vtep 10.11.11.1
     
    configure service vpls 2
       vxlan-src-vtep 2001:db8:1000::1
    

    In addition, if a vxlan-src-vtep is configured and the service uses EVPN, the IP address is also used to set the BGP NLRI next hop in EVPN route advertisements for the service.

    Note: The BGP EVPN next hop can be overridden by the use of export policies based on the following rules:
    • A BGP peer policy can override a next hop pushed by the vxlan-src-vtep configuration.

    • If the VPLS service is IPv6 (that is, the vxlan-src-vtep is IPv6) and a BGP peer export policy is configured with next-hop-self, the BGP next-hop is overridden with an IPv6 address auto-derived from the IP address of the system. The auto-derivation is based on RFC 4291. For example, ::ffff:10.20.1.3 is auto-derived from system IP 10.20.1.3.

    • The policy checks the address type of the next hop provided by the vxlan-src-vtep command. If the command provides an IPv6 next hop, the policy cannot use an IPv4 address to override the IPv6 address provided by the vxlan-src-vtep command.
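The RFC 4291 auto-derivation mentioned above can be reproduced with Python's ipaddress module; the helper name is illustrative.

```python
import ipaddress

def mapped_v6_from_system_ip(system_ip: str) -> ipaddress.IPv6Address:
    """Derive the IPv4-mapped IPv6 next hop (RFC 4291, ::ffff:a.b.c.d)
    from an IPv4 system address, as done when next-hop-self is applied
    to an IPv6 VXLAN service."""
    return ipaddress.IPv6Address("::ffff:" + system_ip)

# For system IP 10.20.1.3 the derived next hop is ::ffff:10.20.1.3.
nh = mapped_v6_from_system_ip("10.20.1.3")
assert nh.ipv4_mapped == ipaddress.IPv4Address("10.20.1.3")
```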

    After the preceding steps are performed to configure a VXLAN termination, the VPLS, R-VPLS, or Epipe service can be used normally, except that the service terminates VXLAN tunnels with a non-system IPv4 or IPv6 destination address (in the base router or a VPRN instance) instead of the system IP address only.

    The FPE vxlan-termination function creates internal router interfaces and loopbacks that are displayed by the show commands. When configuring IPv6 VXLAN termination on an R-VPLS service, as well as the internal router interfaces and loopbacks, the system creates internal SDP bindings for the required egress processing. The following output shows an example of an internal FPE-type SDP binding created for IPv6 R-VPLS egress processing.

    *A:PE1# show service sdp-using  
    ===============================================================================
    SDP Using
    ===============================================================================
    SvcId      SdpId              Type   Far End              Opr   I.Label E.Label
                                                              State         
    -------------------------------------------------------------------------------
    2002       17407:2002         Fpe    fpe_1.b              Up    262138  262138
    -------------------------------------------------------------------------------
    Number of SDPs : 1
    -------------------------------------------------------------------------------
    ===============================================================================
    

    When BGP EVPN is used, the BGP peer over which the EVPN-VXLAN updates are received can be an IPv4 or IPv6 peer, regardless of whether the next-hop is an IPv4 or IPv6 address.

    The same VXLAN tunnel termination address cannot be configured on different router instances; that is, on two different VPRN instances or on a VPRN and the base router.

EVPN for overlay tunnels

This section describes the specifics of EVPN for non-MPLS overlay tunnels.

BGP-EVPN control plane for VXLAN overlay tunnels

RFC 8365 describes EVPN as the control plane for overlay-based networks. The 7750 SR, 7450 ESS, and 7950 XRS support all routes and features described in RFC 7432 that are required for the DGW function. EVPN multihoming and BGP multihoming based on the L2VPN BGP address family are both supported if redundancy is needed.

EVPN-VXLAN required routes and communities shows the EVPN MP-BGP NLRI, required attributes and extended communities, and two route types supported for the DGW Layer 2 applications:

  • route type 3: Inclusive Multicast Ethernet Tag route

  • route type 2: MAC/IP advertisement route

Figure 8. EVPN-VXLAN required routes and communities
EVPN route type 3 – inclusive multicast Ethernet tag route

Route type 3 is used to set up the flooding tree (BUM flooding) for a specified VPLS service in the data center. The received inclusive multicast routes add entries to the VPLS flood list in the 7750 SR, 7450 ESS, and 7950 XRS. The tunnel types supported in an EVPN route type 3 when BGP-EVPN MPLS is enabled are ingress replication, P2MP MLDP, and composite tunnels.

Ingress Replication (IR) and Assisted Replication (AR) are supported for VXLAN tunnels. See Layer 2 multicast optimization for VXLAN (Assisted-Replication) for more information about the AR.

If ingress-repl-inc-mcast-advertisement is enabled, a route type 3 is generated by the router per VPLS service as soon as the service is in an operationally up state. The following fields and values are used:

  • Route Distinguisher is taken from the RD of the VPLS service within the BGP context.

    Note: The RD can be configured or derived from the bgp-evpn evi value.
  • Ethernet Tag ID is 0.

  • IP address length is always 32.

  • Originating router’s IP address carries an IPv4 or IPv6 address.

    Note: By default, the IP address of the Originating router is derived from the system IP address. However, this can be overridden by the configure service vpls bgp-evpn incl-mcast-orig-ip ip-address command for the Ingress Replication (and mLDP if MPLS is used) tunnel type.
  • For PMSI Tunnel Attribute (PTA), tunnel type = Ingress replication (6) or Assisted Replication (10)

    • Leaf not required for Flags.

    • MPLS label carries the VNI configured in the VPLS service. Only one VNI can be configured per VPLS service.

    • Tunnel endpoint is equal to the system IP address.

    As shown in PMSI attribute flags field for AR, additional flags are used in the PTA when the service is configured for AR.

    Figure 9. PMSI attribute flags field for AR

    The Flags field is defined as a Type field (for AR) with two new flags that are defined as follows:

    • T is the AR Type field (2 bits):
      • 00 (decimal 0) = RNVE (non-AR support)

      • 01 (decimal 1) = AR REPLICATOR

      • 10 (decimal 2) = AR LEAF

    • The U and BM flags defined in IETF Draft draft-ietf-bess-evpn-optimized-ir are not used in the SR OS.
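A minimal sketch of the 2-bit T field encoding follows. The exact bit offset of T within the PTA flags octet is an assumption here (it is defined in draft-ietf-bess-evpn-optimized-ir); the function names are hypothetical.

```python
# AR Type (T) values from the text: 0=RNVE, 1=AR REPLICATOR, 2=AR LEAF.
AR_TYPES = {0: "RNVE", 1: "AR REPLICATOR", 2: "AR LEAF"}

T_SHIFT = 3  # assumed bit offset of the 2-bit T field in the flags octet

def encode_ar_type(flags: int, ar_type: int) -> int:
    """Set the 2-bit T field in a PTA flags octet, leaving other bits."""
    if ar_type not in AR_TYPES:
        raise ValueError("T must be 0, 1, or 2")
    return (flags & ~(0b11 << T_SHIFT)) | (ar_type << T_SHIFT)

def decode_ar_type(flags: int) -> str:
    """Extract and name the 2-bit T field from a PTA flags octet."""
    return AR_TYPES.get((flags >> T_SHIFT) & 0b11, "reserved")

flags = encode_ar_type(0, 1)  # advertise as AR REPLICATOR
assert decode_ar_type(flags) == "AR REPLICATOR"
assert decode_ar_type(encode_ar_type(0, 2)) == "AR LEAF"
```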

    AR-R and AR-L routes and usage describes the inclusive multicast route information sent per VPLS service when the router is configured as assisted-replication replicator (AR-R) or assisted-replication leaf (AR-L). A Regular Network Virtualization Edge device (RNVE) is defined as an EVPN-VXLAN router that does not support (or is not configured for) Assisted-Replication.

    Note: For AR-R, two inclusive multicast routes may be advertised if ingress-repl-inc-mcast-advertisement is enabled: a route with tunnel-type IR, tunnel-id = IR IP (generally system-ip) and a route with tunnel-type AR, tunnel-id = AR IP (the address configured in the assisted-replication-ip command).
    Table 1. AR-R and AR-L routes and usage

    AR-R (function: assists AR-LEAFs):
      • IR inclusive multicast route (uses IR IP) if ingress-repl-inc-mcast-advertisement is enabled
      • AR inclusive multicast route (uses AR IP, tunnel type=AR, T=1)

    AR-LEAF (function: sends BM traffic only to AR-Rs):
      • IR inclusive multicast route (IR IP, T=2) if ingress-repl-inc-mcast-advertisement is enabled

    RNVE (function: non-AR support):
      • IR inclusive multicast route (IR IP) if ingress-repl-inc-mcast-advertisement is enabled

EVPN route type 2 – MAC/IP advertisement route

The 7750 SR, 7450 ESS, and 7950 XRS generate this route type for advertising MAC addresses. If mac-advertisement is enabled, the router generates MAC advertisement routes for the following:

  • learned MACs on SAPs or SDP bindings

  • conditional static MACs

  • the Unknown MAC route, if unknown-mac-route is enabled and either there is no bgp-mh site in the service or there is a (single) DF site

The route type 2 generated by a router uses the following fields and values:

  • Route Distinguisher is taken from the RD of the VPLS service within the BGP context.

    Note: The RD can be configured or derived from the bgp-evpn evi value.
  • Ethernet Segment Identifier (ESI) value = 0:0:0:0:0:0:0:0:0:0 or non-zero, depending on whether the MAC addresses are learned on an Ethernet Segment.

  • Ethernet Tag ID is 0.

  • MAC address length is always 48.

  • MAC Address:

    • is 00:00:00:00:00:00 for the Unknown MAC route address.

    • is different from 00:…:00 for the rest of the advertised MACs.

  • IP address and IP address length:

    • The length of the IP address associated with the MAC being advertised is either 32 for IPv4 or 128 for IPv6.

    • If the MAC address is the Unknown MAC route, the IP address length is zero and the IP address is omitted.

    • In general, any MAC route without IP has IPL=0 (IP length) and the IP is omitted.

    • When received, a route with an IPL value other than zero, 32, or 128 is discarded.

  • MPLS Label 1 carries the VNI configured in the VPLS service. Only one VNI can be configured per VPLS.

  • MPLS Label 2 is 0.

  • MAC Mobility extended community is used for signaling the sequence number in case of MAC moves and the sticky bit in case of advertising conditional static MACs. If a MAC route is received with a MAC mobility ext-community, the sequence number and the sticky bit are considered for the route selection.
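The route selection driven by the MAC Mobility extended community can be sketched as follows. This is an illustrative model of the RFC 7432 MAC mobility rules referenced above, not the SR OS implementation: a route carrying the sticky bit takes precedence, and otherwise the higher sequence number wins.

```python
# Sketch of MAC route selection based on the MAC Mobility extended
# community (illustrative model of the RFC 7432 rules, not SR OS code).

def select_mac_route(route_a, route_b):
    """Each route is a dict with 'sticky' (bool) and 'seq' (int) taken
    from the MAC Mobility extended community. Returns the preferred route."""
    if route_a["sticky"] != route_b["sticky"]:
        # A route with the sticky bit set (conditional static MAC)
        # is always preferred over a non-sticky route
        return route_a if route_a["sticky"] else route_b
    # Otherwise the route with the higher sequence number (latest
    # MAC move) is preferred
    return route_a if route_a["seq"] >= route_b["seq"] else route_b
```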

When EVPN-VXLAN multihoming is enabled, type 1 routes (Auto-Discovery per-ES and per-EVI routes) and type 4 routes (ES routes) are also generated and processed. See BGP-EVPN control plane for MPLS tunnels for more information about route types 1 and 4.

EVPN route type 5 – IP prefix route

EVPN route-type 5 shows the IP prefix route or route-type 5.

Figure 10. EVPN route-type 5

The router generates this route type for advertising IP prefixes in EVPN. The router generates IP prefix advertisement routes for IP prefixes existing in a VPRN linked to the IRB backhaul R-VPLS service.

The route-type 5 generated by a router uses the following fields and values:

  • Route Distinguisher: taken from the RD configured in the IRB backhaul R-VPLS service within the BGP context

  • Ethernet Segment Identifier (ESI): value = 0:0:0:0:0:0:0:0:0:0

  • Ethernet Tag ID: 0

  • IP address length: any value in the 0 to 128 range

  • IP address: any valid IPv4 or IPv6 address

  • Gateway IP address: can carry two different values:

    • if different from zero, the route-type 5 carries the primary IP interface address of the VPRN behind which the IP prefix is known. This is the case for the regular IRB backhaul R-VPLS model.

    • if 0.0.0.0, the route-type 5 is sent with a MAC next-hop extended community that carries the VPRN interface MAC address. This is the case for the EVPN tunnel R-VPLS model.

  • MPLS Label: carries the VNI configured in the VPLS service. Only one VNI can be configured per VPLS service.

All the routes in EVPN-VXLAN are sent with the RFC 5512 tunnel encapsulation extended community, with the tunnel type value set to VXLAN.

EVPN for VXLAN in VPLS services

The EVPN-VXLAN service is designed around the current VPLS objects and the additional VXLAN construct.

Layer 2 DC PE with VPLS to the WAN shows a DC with a Layer 2 service that carries the traffic for a tenant who wants to extend a subnet beyond the DC. The DC PE function is carried out by the 7750 SR, 7450 ESS, and 7950 XRS where a VPLS instance exists for that particular tenant. Within the DC, the tenant has VPLS instances in all the Network Virtualization Edge (NVE) devices where they require connectivity (such VPLS instances can be instantiated in TORs, Nuage VRS, VSG, and so on). The VPLS instances in the redundant DGW and the DC NVEs are connected by VXLAN bindings. BGP-EVPN provides the required control plane for such VXLAN connectivity.

The DGW routers are configured with a VPLS per tenant that provides the VXLAN connectivity to the Nuage VPLS instances. On the router, each tenant VPLS instance is configured with:

  • The WAN-related parameters (SAPs, spoke SDPs, mesh-SDPs, BGP-AD, and so on).

  • The BGP-EVPN and VXLAN (VNI) parameters. The following CLI output shows an example for an EVPN-VXLAN VPLS service.

*A:DGW1>config>service>vpls# info 
----------------------------------------------
            description "vxlan-service"
            vxlan instance 1 vni 1 create
            exit
            bgp
                route-distinguisher 65001:1
                route-target export target:65000:1 import target:65000:1
            exit
            bgp-evpn
                unknown-mac-route
                mac-advertisement 
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            sap 1/1/1:1 create
            exit
            no shutdown
----------------------------------------------

The bgp-evpn context specifies the encapsulation type (only vxlan is supported) to be used by EVPN and other parameters like the unknown-mac-route and mac-advertisement commands. These commands are typically configured in three different ways:

  • If the operator configures no unknown-mac-route and mac-advertisement (default option), the router advertises new learned MACs (on the SAPs or SDP bindings) or new conditional static MACs.

  • If the operator configures unknown-mac-route and no mac-advertisement, the router only advertises an unknown-mac-route as long as the service is operationally up (if no BGP-MH site is configured in the service) or the router is the DF (if BGP-MH is configured in the service).

  • If the operator configures unknown-mac-route and mac-advertisement, the router advertises new learned MACs, conditional static MACs, and the unknown-mac-route. The unknown-mac-route is only advertised under the preceding described conditions.
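The three configuration combinations above can be sketched as follows. This is an illustrative model only (not SR OS code); the function arguments mirror the unknown-mac-route and mac-advertisement commands and the BGP-MH DF condition described above.

```python
# Sketch: what a PE advertises for a given combination of the
# unknown-mac-route and mac-advertisement settings (illustrative only).

def advertised_routes(unknown_mac_route, mac_advertisement,
                      service_up=True, bgp_mh_site=False, is_df=False):
    adv = set()
    if mac_advertisement:
        # New learned MACs and conditional static MACs are advertised
        adv |= {"learned-macs", "conditional-static-macs"}
    # The unknown-mac-route is advertised only while the service is
    # operationally up (no BGP-MH site) or while the PE is the DF
    if unknown_mac_route and ((not bgp_mh_site and service_up)
                              or (bgp_mh_site and is_df)):
        adv.add("unknown-mac-route")
    return adv
```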

Other parameters related to EVPN or VXLAN are:

  • MAC duplication parameters

  • VXLAN VNI (defines the VNI that the router uses in the EVPN routes generated for the VPLS service)

After the VPLS is configured and operationally up, the router sends or receives inclusive multicast Ethernet Tag routes, and a full mesh of VXLAN connections is automatically created. These VXLAN ‟auto-bindings” can be characterized as follows:

  • The VXLAN auto-binding model is based on an IP-VPN-like design, where no SDPs or SDP binding objects are created by or visible to the user. The VXLAN auto-binds are composed of remote VTEPs and egress VNIs, and can be displayed with the following command:

    • show service id 112 vxlan destinations

      Output example

      ==============================================================================
      Egress VTEP, VNI (Instance 1)
      ===============================================================================
      VTEP Address                                         Egress VNI Oper Mcast Num
                                                                      State      MACs
      -------------------------------------------------------------------------------
      192.0.2.2                                            112        Up   BUM   1
      192.0.2.3                                            112        Down BUM   0
      -------------------------------------------------------------------------------
      Number of Egress VTEP, VNI : 2
      ===============================================================================
      
    • show service id 112 vxlan destinations detail

      Output example

      ===============================================================================
      Egress VTEP, VNI (Instance 1)
      ===============================================================================
      VTEP Address                                         Egress VNI Oper Mcast Num
                                                                      State      MACs
      -------------------------------------------------------------------------------
      192.0.2.2                                            112        Up   BUM   1
      Oper Flags       : None
      Type             : evpn
      L2 PBR           : No
      Sup BCast Domain : No
      Last Update      : 02/03/2023 22:15:06
      192.0.2.3                                            112        Down BUM   0
      Oper Flags       : MTU-Mismatch
      Type             : evpn
      L2 PBR           : No
      Sup BCast Domain : No
      Last Update      : 01/31/2023 21:28:39
      -------------------------------------------------------------------------------
      Number of Egress VTEP, VNI : 2
      ===============================================================================
  • If the following command is configured on the PEs attached to the same service, the service MTU value is advertised in the EVPN Layer-2 Attributes extended community along with the Inclusive Multicast Ethernet Tag routes.
    • MD-CLI
      configure service vpls bgp-evpn routes incl-mcast advertise-l2-attributes
    • classic CLI
      configure service vpls bgp-evpn incl-mcast-l2-attributes-advertisement
    Upon receiving the signaled MTU from an egress PE, the ingress PE compares the MTU with the local one and, in case of mismatch, the EVPN VXLAN destination is brought operationally down. An operational flag MTU-Mismatch shows the reason why the VXLAN destination is operationally down in this case. The following command makes the router ignore the MTU signaled by the remote PE and bring up the VXLAN destination if there are no other reasons to keep it down.
    configure service vpls bgp-evpn ignore-mtu-mismatch
  • The VXLAN bindings observe the VPLS split-horizon rule. This is performed automatically without the need for any split-horizon configuration.

  • BGP Next-Hop Tracking for EVPN is fully supported. If the BGP next-hop for a specified received BGP EVPN route disappears from the routing table, the BGP route is not marked as ‟used” and the respective entry in show service id vxlan destinations is removed.
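The MTU check described above can be sketched as follows. This is an illustrative model only, not SR OS code: on receiving the service MTU signaled in the EVPN Layer-2 Attributes extended community, the ingress PE brings the VXLAN destination operationally down on mismatch unless ignore-mtu-mismatch is configured.

```python
# Sketch of the signaled-MTU check on an EVPN VXLAN destination
# (illustrative model only).

def vxlan_dest_oper_state(local_mtu, signaled_mtu, ignore_mtu_mismatch=False):
    """Returns (oper_state, oper_flag) for the EVPN VXLAN destination."""
    if (signaled_mtu is not None and signaled_mtu != local_mtu
            and not ignore_mtu_mismatch):
        # Mismatch: destination is brought down with the MTU-Mismatch flag
        return ("down", "MTU-Mismatch")
    return ("up", None)
```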

After the flooding domain is set up, the routers and DC NVEs start advertising MAC addresses, and the routers can learn MACs and install them in the FDB. Consider the following:

  • All the MAC addresses associated with remote VTEP/VNIs are always learned in the control plane by EVPN. Data plane learning on VXLAN auto-bindings is not supported.

  • When unknown-mac-route is configured, it is generated only when no (BGP-MH) site is configured, or when a site is configured and the PE is the DF for that site.

    Note: The unknown-mac-route is not installed in the FDB (therefore, does not show up in the show service id svc-id fdb detail command).
  • While the router can be configured with only one VNI (and signals a single VNI per VPLS), it can accept any VNI in the received EVPN routes as long as the route target is properly imported. The VTEPs and VNIs show up in the FDB associated with MAC addresses:

    A:PE65# show service id 1000 fdb detail 
    ===============================================================================
    Forwarding Database, Service 1000
    ===============================================================================
    ServId    MAC               Source-Identifier        Type     Last Change
                                                         Age      
    -------------------------------------------------------------------------------
    1000      00:00:00:00:00:01 vxlan-1:                 Evpn     10/05/13 23:25:57
                                192.0.2.63:1063                   
    1000      00:00:00:00:00:65 sap:1/1/1:1000           L/30     10/05/13 23:25:57
    1000      00:ca:ca:ca:ca:00 vxlan-1:                 EvpnS    10/04/13 17:35:43
                                192.0.2.63:1063                   
    -------------------------------------------------------------------------------
    No. of MAC Entries: 3
    -------------------------------------------------------------------------------
    Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
    ===============================================================================
    
Resiliency and BGP multihoming

The DC overlay infrastructure relies on IP tunneling, that is, VXLAN; therefore, failures in the DC core are resolved by the underlay IP layer. The IGP should be optimized to achieve the fastest convergence.

From a service perspective, resilient connectivity to the WAN may be provided by BGP multihoming.

Use of BGP-EVPN, BGP-AD, and sites in the same VPLS service

All BGP-EVPN (control plane for a VXLAN DC), BGP-AD (control plane for MPLS-based spoke SDPs connected to the WAN), and one site for BGP multihoming (control plane for the multihomed connection to the WAN) can be configured in one service in a specified system. If that is the case, the following considerations apply:

  • The configured BGP route-distinguisher and route-target are used by BGP for the two families, that is, evpn and l2vpn. If different import/export route targets are to be used per family, vsi-import/export policies must be used.

  • The pw-template-binding command under BGP does not have any effect on evpn or bgp-mh. It is only used for the instantiation of the BGP-AD spoke SDPs.

  • If the same import/export route-targets are used in the two redundant DGWs, a VXLAN binding as well as a fec129 spoke SDP binding would be established between the two DGWs, creating a loop. To avoid the loop, the router allows the establishment of an EVPN VXLAN binding and an SDP binding to the same far-end, but the SDP binding is kept operationally down. Only the VXLAN binding is operationally up.

Use of the unknown-mac-route

This section describes the behavior of the EVPN-VXLAN service in the router when the unknown-mac-route and BGP-MH are configured at the same time.

The use of EVPN, as the control plane of NVO networks in the DC, provides a significant number of benefits as described in IETF Draft draft-ietf-bess-evpn-overlay.

However, there is a potential issue that must be addressed when a VPLS DCI is used for an NVO3-based DC: all the MAC addresses learned from the WAN side of the VPLS must be advertised by BGP EVPN updates. Even if optimized BGP techniques like RT-constraint are used, the number of MAC addresses to advertise or withdraw (in case of failure) from the DC GWs can be difficult to control and overwhelming for the DC network, especially when the NVEs reside in the hypervisors.

The 7750 SR, 7450 ESS, and 7950 XRS solution to this issue is based on the use of an unknown-mac-route address that is advertised by the DC PEs. By using this unknown-mac-route advertisement, the DC tenant may decide to optionally turn off the advertisement of WAN MAC addresses in the DGW, therefore, reducing the control plane overhead and the size of the FDB tables in the NVEs.

The use of the unknown-mac-route is optional and helps to reduce the amount of unknown-unicast traffic within the data center. All the receiving NVEs supporting this concept send any unknown-unicast packet to the owner of the unknown-mac-route, as opposed to flooding the unknown-unicast traffic to all other NVEs that are part of the same VPLS.

Note: Although the router can be configured to generate and advertise the unknown-mac-route, the router never honors the unknown-mac-route and floods to the TLS-flood list when an unknown-unicast packet arrives at an ingress SAP or SDP binding.

The use of the unknown-mac-route assumes the following:

  • A fully virtualized DC where all the MACs are learned in the control plane before any communication takes place (no legacy TORs or VLAN-connected servers).

  • The only exception is MACs learned over the SAPs/SDP bindings that are part of the BGP-MH WAN site-id. Only one site-id is supported in this case.

  • No other SAPs/SDP bindings out of the WAN site-id are supported, unless only static MACs are used on those SAPs/SDP bindings.

Therefore, when unknown-mac-route is configured, it is only generated when one of the following applies:

  • No site is configured and the service is operationally up.

  • A BGP-MH site is configured AND the DGW is Designated Forwarder (DF) for the site. In case of BGP-MH failover, the unknown-mac-route is withdrawn by the former DF and advertised by the new DF.

EVPN for VXLAN in R-VPLS services

Gateway IRB on the DC PE for an L2 EVPN/VXLAN DC shows a DC with a Layer 2 service that carries the traffic for a tenant who extends a subnet within the DC, while the DGW is the default gateway for all the hosts in the subnet. The DGW function is carried out by the 7750 SR, 7450 ESS, and 7950 XRS where an R-VPLS instance exists for that particular tenant. Within the DC, the tenant has VPLS instances in all the NVE devices where they require connectivity (such VPLS instances can be instantiated in TORs, Nuage VRS, VSG, and so on). The WAN connectivity is based on existing IP-VPN features.

In this model, the DGW routers are configured with an R-VPLS per tenant (bound to the VPRN that provides the WAN connectivity) that provides the VXLAN connectivity to the Nuage VPLS instances. This model provides inter-subnet forwarding for L2-only TORs and other L2 DC NVEs.

On the router:

  • The VPRN is configured with an interface bound to the backhaul R-VPLS. That interface is a regular IP interface (IP address configured or possibly a Link Local Address if IPv6 is added).

  • The VPRN can support other numbered interfaces to the WAN or even to the DC.

  • The R-VPLS is configured with the BGP, BGP-EVPN and VXLAN (VNI) parameters.

The Nuage VSGs and NVEs use a regular VPLS service model with BGP EVPN and VXLAN parameters.

Consider the following:

  • Route-type 2 routes with MACs and IPs are advertised. Some considerations about MAC+IP and ARP/ND entries are:

    • The 7750 SR advertises its IRB MAC+IP in a route type 2 route and possibly the VRRP vMAC+vIP if it runs VRRP and the 7750 SR is the active router. In both cases, the MACs are advertised as static MACs, therefore, protected by the receiving PEs.

    • If the 7750 SR VPRN interface is configured with one or more additional secondary IP addresses, they are all advertised in routes type 2, as static MACs.

    • The 7750 SR processes route-type 2 routes as usual, populating the FDB with the received MACs and the VPRN ARP/ND table with the MAC and IPs, respectively.

      Note: ND entries received from the EVPN are installed as Router entries. The ARP/ND entries coming from the EVPN are tagged as evpn.
    • When a VPLS containing proxy-ARP/proxy-ND entries is bound to a VPRN (allow-ip-int-bind) all the proxy-ARP/proxy-ND entries are moved to the VPRN ARP/ND table. ARP/ND entries are also moved to proxy-ARP/proxy-ND entries if the VPLS is unbound.

    • EVPN does not program EVPN-received ARP/ND entries if the receiving VPRN has no IP addresses for the same subnet. The entries are added when the IP address for the same subnet is added.

    • Static ARP/ND entries have precedence over dynamic and EVPN ARP/ND entries.

  • Binding a VPRN interface to a VPLS service brings the VPRN interface operationally down if the VPRN interface MAC or the VRRP MAC matches a static-mac or OAM MAC configured in the associated VPLS service. In that case, a trap is generated.

  • Redundancy is handled by VRRP. The active 7750 SR advertises vMAC and vIP, as discussed, including the MAC mobility extended community and the sticky bit.

EVPN-enabled R-VPLS services are also supported on IES interfaces.

EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes

Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows a Layer 3 DC model, where a VPRN is defined in the DGWs, connecting the tenant to the WAN. That VPRN instance is connected to the VPRNs in the NVEs by means of an IRB backhaul R-VPLS. Because the IRB backhaul R-VPLS provides connectivity only to all the IRB interfaces and the DGW VPRN is not directly connected to all the tenant subnets, the WAN ip-prefixes in the VPRN routing table must be advertised in EVPN. In the same way, the NVEs send IP prefixes in EVPN that are received by the DGW and imported in the VPRN routing table.

Note: To generate or process IP prefixes sent or received in EVPN route type 5, support for IP route advertisement must be enabled in BGP-EVPN using the bgp-evpn ip-route-advertisement command. This command is disabled by default and must be explicitly enabled. The command is tied to the allow-ip-int-bind command required for R-VPLS, and it is not supported on an R-VPLS linked to IES services.

Local router interface host addresses are not advertised in EVPN by default. To advertise them, the ip-route-advertisement incl-host command must be enabled. For example:

===============================================================================
Route Table (Service: 2)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                         Active     Metric
-------------------------------------------------------------------------------
10.1.1.0/24                                   Local   Local     00h00m11s  0
       if                                              Y            0
10.1.1.100/32                                 Local   Host      00h00m11s  0
       if                                              Y            0
==============================================================================

For the case displayed by the output above, the behavior is the following:

  • ip-route-advertisement only local subnet (default) - 10.1.1.0/24 is advertised

  • ip-route-advertisement incl-host local subnet, host - 10.1.1.0/24 and 10.1.1.100/32 are advertised
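The filtering described above can be sketched as follows. This is an illustrative model only (not SR OS code); the route table entries mirror the Local/Host protocol values shown in the output above.

```python
# Sketch: which local routes are advertised in EVPN route type 5,
# with and without the incl-host option (illustrative model only).

def routes_to_advertise(route_table, incl_host=False):
    """route_table is a list of (prefix, proto) pairs as in the output above."""
    adv = []
    for prefix, proto in route_table:
        if proto == "Local":
            # Local interface subnets are always advertised
            adv.append(prefix)
        elif proto == "Host" and incl_host:
            # Interface host addresses only with incl-host
            adv.append(prefix)
    return adv
```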

Below is an example of VPRN (500) with two IRB interfaces connected to backhaul R-VPLS services 501 and 502 where EVPN-VXLAN runs:

vprn 500 customer 1 create            
            ecmp 4
            route-distinguisher 65072:500
            vrf-target target:65000:500
            interface "evi-502" create
                address 10.20.20.72/24
                vpls "evpn-vxlan-502"
                exit
            exit
            interface "evi-501" create
                address 10.10.10.72/24
                vpls "evpn-vxlan-501"
                exit
            exit
            no shutdown
vpls 501 name "evpn-vxlan-501" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 501 create
            exit
            bgp
                route-distinguisher 65072:501
                route-target export target:65000:501 import target:65000:501
            exit
            bgp-evpn
                ip-route-advertisement incl-host
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit                  
            exit
            no shutdown
        exit
vpls 502 name "evpn-vxlan-502" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 502 create
            exit
            bgp
                route-distinguisher 65072:502
                route-target export target:65000:502 import target:65000:502
            exit
            bgp-evpn
                ip-route-advertisement incl-host
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            no shutdown
        exit

When the above commands are enabled, the router behaves as follows:

  • Receive route-type 5 routes and import the IP prefixes and associated IP next-hops into the VPRN routing table.

    • If the route-type 5 is successfully imported by the router, the prefix included in the route-type 5 (for example, 10.0.0.0/24) is added to the VPRN routing table with a next-hop equal to the gateway IP included in the route (for example, 192.0.0.1, which refers to the IRB IP address of the remote VPRN behind which the IP prefix sits).

    • When the router receives a packet from the WAN to the 10.0.0.0/24 subnet, the IP lookup on the VPRN routing table yields 192.0.0.1 as the next-hop. That next-hop is resolved to a MAC in the ARP table, and the MAC is resolved to a VXLAN tunnel in the FDB table.

      Note: IRB MAC and IP addresses are advertised in the IRB backhaul R-VPLS in routes type 2.
  • Generate route-type 5 routes for the IP prefixes in the associated VPRN routing table.

    For example, if VPRN-1 is attached to EVPN R-VPLS 1 and EVPN R-VPLS 2, and R-VPLS 2 has bgp-evpn ip-route-advertisement configured, the 7750 SR advertises the R-VPLS 1 interface subnet in one route-type 5.

  • Routing policies can filter the imported and exported IP prefix routes accordingly.
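The two-stage resolution described above can be sketched as follows. This is an illustrative model with hypothetical data structures, not SR OS code: the VPRN route lookup yields the gateway IP from the imported route-type 5, the ARP table resolves it to a MAC, and the FDB resolves the MAC to a VXLAN binding.

```python
# Sketch of next-hop resolution in the IRB backhaul R-VPLS model
# (illustrative model with hypothetical data structures).

def resolve_next_hop(prefix_table, arp_table, fdb, dest_prefix):
    gw_ip = prefix_table[dest_prefix]   # gateway IP from the route-type 5
    gw_mac = arp_table[gw_ip]           # ARP entry (possibly EVPN-learned)
    return fdb[gw_mac]                  # VXLAN VTEP/VNI binding
```

For example, a packet to 10.0.0.0/24 is resolved via gateway IP 192.0.0.1 to the MAC of the remote IRB interface, and then to the remote VTEP/VNI.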

The VPRN routing table can receive routes from all the supported protocols (BGP-VPN, OSPF, IS-IS, RIP, static routing) as well as from IP prefixes from EVPN, as shown below:

*A:PE72# show router 500 route-table                      
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
10.20.20.0/24                                 Local   Local     01d11h10m  0
       evi-502                                                      0
10.20.20.71/32                                Remote  BGP EVPN  00h02m26s  169
       10.10.10.71                                                  0
10.10.10.0/24                                Remote  Static    00h00m05s  5
       10.10.10.71                                                  1
10.16.0.1/32                                 Remote  BGP EVPN  00h02m26s  169
       10.10.10.71                                                  0
-------------------------------------------------------------------------------
No. of Routes: 4

The following considerations apply:

  • The route Preference for EVPN IP prefixes is 169.

    BGP IP-VPN routes have a preference of 170 by default, therefore, if the same route is received from the WAN over BGP-VPRN and from BGP-EVPN, then the EVPN route is preferred.

  • When the same route-type 5 prefix is received from different gateway IPs, ECMP is supported if configured in the VPRN.

  • All routes in the VPRN routing table (as long as they do not point back to the EVPN R-VPLS interface) are advertised via EVPN.
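The preference comparison described above can be sketched as follows. This is an illustrative model only (not SR OS code): the lower preference value wins, so an EVPN IP prefix route (169) is preferred over the same route received as BGP IP-VPN (170).

```python
# Sketch of route selection by protocol preference (illustrative only;
# lower preference value wins, as in the SR OS route table).

ROUTE_PREFERENCE = {"BGP EVPN": 169, "BGP VPN": 170}

def preferred_route(candidates):
    """candidates: list of (proto, next_hop) pairs; returns the best one."""
    return min(candidates, key=lambda c: ROUTE_PREFERENCE[c[0]])
```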

Although the description above is focused on IPv4 interfaces and prefixes, it applies to IPv6 interfaces too. The following considerations are specific to IPv6 VPRN R-VPLS interfaces:

  • IPv4 and IPv6 interfaces can be defined on R-VPLS IP interfaces at the same time (dual-stack).

  • The user may configure specific IPv6 Global Addresses on the VPRN R-VPLS interfaces. If a specific Global IPv6 Address is not configured on the interface, the Link Local Address interface MAC/IP is advertised in a route type 2 as soon as IPv6 is enabled on the VPRN R-VPLS interface.

  • Routes type 5 for IPv6 prefixes are advertised using either the configured Global Address or the implicit Link Local Address (if no Global Address is configured).

    If more than one Global Address is configured, normally the first IPv6 address is used as gateway IP. The ‟first IPv6 address” refers to the first one on the list of IPv6 addresses shown through show router id interface interface ipv6 or through SNMP.

    The rest of the addresses are advertised only in MAC-IP routes (Route Type 2) but not used as gateway IP for IPv6 prefix routes.
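The gateway IP selection described above can be sketched as follows. This is an illustrative model only, not SR OS code: the first configured global IPv6 address (in the order shown by the router) is used as the gateway IP; if none is configured, the link-local address is used.

```python
# Sketch of the gateway IP selection for IPv6 route-type 5 routes
# (illustrative model only).

def ipv6_gateway_ip(global_addresses, link_local):
    """global_addresses is the ordered address list shown by the router."""
    # First global address wins; fall back to the link-local address
    return global_addresses[0] if global_addresses else link_local
```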

EVPN for VXLAN in EVPN tunnel R-VPLS services

EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows an L3 connectivity model that optimizes the solution described in EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes. Instead of regular IRB backhaul R-VPLS services for the connectivity of all the VPRN IRB interfaces, EVPN tunnels can be configured. The main advantage of using EVPN tunnels is that they do not need the configuration of IP addresses, as regular IRB R-VPLS interfaces do.

In addition to the ip-route-advertisement command, this model requires the configuration of the config>service>vprn>if>vpls <name> evpn-tunnel command.

Note: EVPN tunnels can be enabled independently of the ip-route-advertisement command; however, no route-type 5 advertisements are sent or processed in that case. Neither the evpn-tunnel nor the ip-route-advertisement command is supported on R-VPLS services linked to IES interfaces.

The example below shows a VPRN (500) with an EVPN-tunnel R-VPLS (504):

vprn 500 name "vprn500" customer 1 create
            ecmp 4
            route-distinguisher 65071:500
            vrf-target target:65000:500
            interface "evi-504" create
                vpls "evpn-vxlan-504"
                    evpn-tunnel
                exit
            exit
            no shutdown
        exit
        vpls 504 name "evpn-vxlan-504" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 504 create
            exit
            bgp
                route-distinguisher 65071:504
                route-target export target:65000:504 import target:65000:504
            exit
            bgp-evpn
                ip-route-advertisement
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            no shutdown
        exit

A specified VPRN supports regular IRB backhaul R-VPLS services as well as EVPN tunnel R-VPLS services.

Note: EVPN tunnel R-VPLS services do not support SAPs or SDP-binds.

The process followed upon receiving a route-type 5 on a regular IRB R-VPLS interface differs from the one for an EVPN-tunnel type:

  • IRB backhaul R-VPLS VPRN interface:

    • When a route-type 2 that includes an IP prefix is received and it becomes active, the MAC/IP information is added to the FDB and ARP tables. This can be checked with the show router arp command and the show service id fdb detail command.

    • When route-type 5 is received and becomes active for the R-VPLS service, the IP prefix is added to the VPRN routing table, regardless of the existence of a route-type 2 that can resolve the gateway IP address. If a packet is received from the WAN side and the IP lookup hits an entry for which the gateway IP (IP next-hop) does not have an active ARP entry, the system uses ARP to get a MAC. If ARP is resolved but the MAC is unknown in the FDB table, the system floods into the TLS multicast list. Routes type 5 can be checked in the routing table with the show router route-table and show router fib commands.

  • EVPN tunnel R-VPLS VPRN interface:

    • When route-type 2 is received and becomes active, the MAC address is added to the FDB (only).

    • When a route-type 5 is received and active, the IP prefix is added to the VPRN routing table with next-hop equal to EVPN tunnel: GW-MAC.

      For example, ET-d8:45:ff:00:01:35, where the GW-MAC is added from the GW-MAC extended community sent along with the route-type 5.

      If a packet is received from the WAN side, and the IP lookup hits an entry for which the next-hop is an EVPN tunnel: GW-MAC, the system looks up the GW-MAC in the FDB. Usually a route-type 2 with the GW-MAC has been received previously, so the GW-MAC is already in the FDB. If the GW-MAC is not present in the FDB, the packet is dropped.

    • IP prefixes with GW-MACs as next-hops are displayed by the show router command, as shown below:

*A:PE71# show router 500 route-table 
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
10.20.20.72/32                                Remote  BGP EVPN  00h23m50s  169
       10.10.10.72                                                  0
10.30.30.0/24                                 Remote  BGP EVPN  01d11h30m  169
       evi-504 (ET-d8:45:ff:00:01:35)                               0
10.10.10.0/24                                Remote  BGP VPN   00h20m52s  170
       192.0.2.69 (tunneled)                                        0
10.1.0.0/16                                  Remote  BGP EVPN  00h22m33s  169
       evi-504 (ET-d8:45:ff:00:01:35)                               0
-------------------------------------------------------------------------------
No. of Routes: 4

The GW-MAC, as well as the rest of the BGP attributes of the IP prefix route, is displayed by the show router bgp routes evpn ip-prefix command.

*A:Dut-A# show router bgp routes evpn ip-prefix prefix 3.0.1.6/32 detail
===============================================================================
BGP Router ID:10.20.1.1        AS:100         Local AS:100       
===============================================================================
Legend -
Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
Origin codes  : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
 
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
Original Attributes
 
Network        : N/A
Nexthop        : 10.20.1.2
From           : 10.20.1.2
Res. Nexthop   : 192.168.19.1
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                  
Connector      : None
Community      : target:100:1 mac-nh:00:00:01:00:01:02
                 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 10.20.1.2
Flags          : Used  Valid  Best  IGP 
Route Source   : Internal              
AS-Path        : No As-Path
EVPN type      : IP-PREFIX             
ESI            : N/A                    Tag            : 1
Gateway Address: 00:00:01:00:01:02     
Prefix         : 3.0.1.6/32             Route Dist.    : 10.20.1.2:1
MPLS Label    : 262140
Route Tag      : 0xb
Neighbor-AS    : N/A
Orig Validation: N/A                   
Source Class   : 0                      Dest Class     : 0
 
Modified Attributes
 
Network        : N/A                 
Nexthop        : 10.20.1.2
From           : 10.20.1.2
Res. Nexthop   : 192.168.19.1
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                  
Connector      : None
Community      : target:100:1 mac-nh:00:00:01:00:01:02
                 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 10.20.1.2
Flags          : Used  Valid  Best  IGP 
Route Source   : Internal              
AS-Path        : 111
EVPN type      : IP-PREFIX             
ESI            : N/A                    Tag            : 1
Gateway Address: 00:00:01:00:01:02     
Prefix         : 3.0.1.6/32             Route Dist.    : 10.20.1.2:1
MPLS Label    : 262140
Route Tag      : 0xb
Neighbor-AS    : 111
Orig Validation: N/A                   
Source Class   : 0                      Dest Class     : 0
 
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
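The EVPN-tunnel forwarding behavior described above (an IP lookup that yields an ET next-hop, followed by a GW-MAC lookup in the FDB) can be sketched as follows. This is an illustrative model only, not SR OS code; the table structures and names are assumptions.

```python
# Illustrative model of the EVPN-tunnel lookup chain (not SR OS code).
# route_table maps a prefix to the GW-MAC received in the route-type 5
# GW-MAC extended community; fdb maps a MAC to a VXLAN destination
# learned from a route-type 2.
def forward_via_evpn_tunnel(route_table, fdb, prefix):
    entry = route_table.get(prefix)           # longest-prefix match elided
    if entry is None:
        return "drop: no route"
    vxlan_dest = fdb.get(entry["gw_mac"])
    if vxlan_dest is None:
        return "drop: GW-MAC not in FDB"      # the packet is dropped
    return "forward via " + vxlan_dest
```

Note that, unlike the IRB backhaul case, there is no ARP step and no flooding: a missing GW-MAC in the FDB causes the packet to be dropped.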

EVPN tunneling is also supported on IPv6 VPRN interfaces. When sending IPv6 prefixes from IPv6 interfaces, the GW-MAC in the route-type 5 (IP-prefix route) is always zero. If no specific Global Address is configured on the IPv6 interface, the route-type 5 routes for IPv6 prefixes are always sent using the Link Local Address as the GW-IP. The following example output shows an IPv6 prefix received through BGP EVPN.

*A:PE71# show router 30 route-table ipv6 
 
===============================================================================
IPv6 Route Table (Service: 30)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
2001:db8:1000::/64                            Local   Local     00h01m19s  0
       int-PE-71-CE-1                                               0
2001:db8:2000::1/128                          Remote  BGP EVPN  00h01m20s  169
       fe80::da45:ffff:fe00:6a-"int-evi-301"                        0
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
 
*A:PE71# show router bgp routes evpn ipv6-prefix prefix 2001:db8:2000::1/128 hunt 
===============================================================================
 BGP Router ID:192.0.2.71       AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked
 Origin codes  : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
 
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network        : N/A
Nexthop        : 192.0.2.69
From           : 192.0.2.69
Res. Nexthop   : 192.168.19.2
Local Pref.    : 100                    Interface Name : int-71-69
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                   
Connector      : None
Community      : target:64500:301 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.69
Flags          : Used  Valid  Best  IGP  
Route Source   : Internal
AS-Path        : No As-Path
EVPN type      : IP-PREFIX              
ESI            : N/A                    Tag            : 301
Gateway Address: fe80::da45:ffff:fe00:* 
Prefix         : 2001:db8:2000::1/128   Route Dist.    : 192.0.2.69:301
MPLS Label     : 0                      
Route Tag      : 0                      
Neighbor-AS    : N/A
Orig Validation: N/A                    
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
Last Modified  : 00h41m17s              
 
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 1
=============================================================================== 

EVPN-VPWS for VXLAN tunnels

BGP-EVPN control plane for EVPN-VPWS

EVPN-VPWS uses route-type 1 and route-type 4; it does not use route-types 2, 3 or 5. EVPN-VPWS BGP extensions shows the encoding of the required extensions for the Ethernet A-D per-EVI routes. The encoding follows the guidelines described in RFC 8214.

Figure 11. EVPN-VPWS BGP extensions

If the advertising PE has an access SAP-SDP or spoke SDP that is not part of an Ethernet Segment (ES), the PE populates the fields of the AD per-EVI route with the following values:

  • Ethernet Tag ID field is encoded with the value configured by the user in the service bgp-evpn local-attachment-circuit eth-tag value command.

  • RD and MPLS label values are encoded as specified in RFC 7432. For VXLAN, the MPLS field encodes the VXLAN VNI.

  • ESI is 0.

  • The route is sent with an EVPN L2 attributes extended community, as specified in RFC 8214, where:

    • type and subtype are 0x06 and 0x04 as allocated by IANA

    • flag C is set if a control word is configured in the service; C is always zero for VXLAN tunnels

    • P and B flags are zero

    • L2 MTU is encoded with the service MTU configured in the Epipe service

If the advertising PE has an access SAP-SDP or spoke SDP that is part of an ES, the AD per-EVI route is sent with the information described above, with the following minor differences:

  • The ESI encodes the corresponding non-zero value.

  • The P and B flags are set in the following cases:

    • All-active multihoming

      • All PEs that are part of the ES always set the P flag.

      • The B flag is never set in the all-active multihoming ES case.

    • Single-active multihoming

      • Only the DF PE sets the P bit for an EVI and the remaining PEs send it as P=0.

      • Only the backup DF PE sets the B bit.

        If more than two PEs are present in the same single-active ES, the backup PE is the winner of a second DF election (excluding the DF). The remaining non-DF PEs send B=0.

Also, ES and AD per-ES routes are advertised and processed for the Ethernet Segment, as described in RFC 7432. The ESI label sent with the AD per-ES route is used by BUM traffic in VPLS services; it is not used for Epipe traffic.
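The EVPN L2 attributes extended community described above can be sketched with the following hedged encoding example, based on RFC 8214 (type 0x06, subtype 0x04, flag bits B=0x01, P=0x02, C=0x04); it is an illustration of the wire format, not an extract of the router implementation.

```python
import struct

# Hedged sketch of the 8-byte EVPN L2 attributes extended community
# per RFC 8214: type (0x06), subtype (0x04), 2-byte control flags,
# 2-byte L2 MTU, and 2 reserved bytes.
def l2_attr_ext_community(primary=False, backup=False,
                          control_word=False, l2_mtu=0):
    flags = 0
    if control_word:
        flags |= 0x04   # C: always zero for VXLAN tunnels
    if primary:
        flags |= 0x02   # P: set by the PE that forwards traffic
    if backup:
        flags |= 0x01   # B: set by the backup DF PE (single-active only)
    return struct.pack("!BBHHH", 0x06, 0x04, flags, l2_mtu, 0)
```

For example, a DF PE advertising a 1500-byte service MTU would encode flags=0x0002 and MTU=0x05DC in the community.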

EVPN-VPWS for VXLAN tunnels in Epipe services

BGP-EVPN can be enabled in Epipe services with either SAPs or spoke SDPs at the access, as shown in EVPN-MPLS VPWS.

Figure 12. EVPN-MPLS VPWS

EVPN-VPWS is supported in VXLAN networks that also run EVPN-VXLAN in VPLS services. From a control plane perspective, EVPN-VPWS is a simplified point-to-point version of RFC 7432 for E-Line services for the following reasons:

  • EVPN-VPWS does not use inclusive multicast, MAC/IP routes or IP-prefix routes.

  • AD Ethernet per-EVI routes are used to advertise the local attachment circuit identifiers at each side of the VPWS instance. The attachment circuit identifiers are configured as local and remote Ethernet tags. When an AD per-EVI route is imported and the Ethernet tag matches the configured remote Ethernet tag, an EVPN destination is created for the Epipe.

In the following configuration example, Epipe 2 is an EVPN-VPWS service between PE2 and PE4 (as shown in EVPN-MPLS VPWS).

PE2>config>service>epipe(2)#
-----------------------
vxlan vni 2 instance 1 create
exit
bgp
exit
bgp-evpn
  evi 2
  local-attachment-circuit "AC-1" 
    eth-tag 100
  remote-attachment-circuit "AC-2" 
    eth-tag 200
  vxlan bgp 1 vxlan-instance 1
    ecmp 2
    no shutdown
sap 1/1/1:1 create
PE4>config>service>epipe(2)#
-----------------------
vxlan vni 2 instance 1 create
exit
bgp
exit
bgp-evpn
  evi 2
  local-attachment-circuit "AC-2" 
    eth-tag 200
  remote-attachment-circuit "AC-1" 
    eth-tag 100
  vxlan bgp 1 vxlan-instance 1
    ecmp 2
    no shutdown
spoke-sdp 1:1

The following considerations apply to the preceding example configuration:

  • When the EVI value is lower than 65535, the EVI is used to automatically derive the route-target and route-distinguisher of the service. For EVI values greater than 65535, the route-distinguisher is not automatically derived, and the route-target is automatically derived only if evi-three-byte-auto-rt is configured. EVI values must be unique in the system, regardless of the type of service to which they are assigned (Epipe or VPLS).

  • Support for the following BGP-EVPN commands in Epipe services is the same as in VPLS services:

    • vxlan bgp 1 vxlan-instance 1

    • vxlan send-tunnel-encap

    • vxlan shutdown

    • vxlan ecmp

  • The following BGP-EVPN commands identify the local and remote attachment circuits, with the configured Ethernet tags encoded in the advertised and received AD Ethernet per-EVI routes:

    • local-attachment-circuit name

    • local-attachment-circuit name eth-tag tag-value; where tag-value is 1 to 16777215

    • remote-attachment-circuit name

    • remote-attachment-circuit name eth-tag tag-value; where tag-value is 1 to 16777215

      Changes to remote Ethernet tags are allowed without shutting down BGP-EVPN VXLAN or the Epipe service. The local AC Ethernet tag value cannot be changed without BGP-EVPN VXLAN shutdown.

      Both local and remote Ethernet tags are mandatory to bring up the Epipe service.
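As a hedged illustration of the EVI-based auto-derivation mentioned earlier in this list, the following sketch derives a route-target from a 2-byte AS number and an EVI in the style of RFC 7432; the exact SR OS derivation rules (including the evi-three-byte-auto-rt case) are not modeled here.

```python
# Hypothetical sketch: RFC 7432-style route-target auto-derivation from
# the EVI, assuming a 2-byte AS number. Not the exact SR OS logic.
def auto_derive_rt(asn, evi):
    if not 1 <= evi <= 16777215:
        raise ValueError("EVI out of range")
    if evi > 65535:
        # only derived when evi-three-byte-auto-rt is configured
        # (that case is not modeled in this sketch)
        raise ValueError("EVI > 65535: RT not auto-derived by default")
    return "target:%d:%d" % (asn, evi)
```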

EVPN-VPWS Epipes can also be configured with the following characteristics:

  • Access attachment circuits can be SAPs or spoke SDPs. Only manually configured spoke SDPs are supported; BGP-VPWS and endpoints are not supported. The VC switching configuration is not supported on BGP-EVPN-enabled Epipes.

  • EVPN-VPWS Epipes can advertise the Layer 2 (service) MTU and check its consistency as follows:

    1. The advertised MTU value is taken from the configured service MTU in the Epipe service.

    2. The received L2 MTU is compared to the local value. In case of a mismatch between the received MTU and the configured service MTU, the system does not set up the EVPN destination; as a result, the service does not come up.

      Consider the following:

      • The system does not check the network port MTU value.

      • If the received L2 MTU value is 0, the MTU is ignored.
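The MTU consistency check above can be summarized with a small sketch (illustrative only, not router code):

```python
# Sketch of the L2 MTU consistency check for an EVPN-VPWS destination.
def evpn_destination_allowed(local_service_mtu, received_l2_mtu):
    if received_l2_mtu == 0:
        return True          # an L2 MTU of 0 in the received route is ignored
    # on a mismatch the EVPN destination is not set up and the service
    # does not come up; the network port MTU is not checked
    return received_l2_mtu == local_service_mtu
```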

Using A/S PW and MC-LAG with EVPN-VPWS Epipes

The use of A/S PW (for access spoke SDPs) and MC-LAG (for access SAPs) provides an alternative redundant solution for EVPN-VPWS that does not use the EVPN multihoming procedures described in RFC 8214. A/S PW and MC-LAG support on EVPN-VPWS shows the use of both mechanisms in a single Epipe.

Figure 13. A/S PW and MC-LAG support on EVPN-VPWS

In A/S PW and MC-LAG support on EVPN-VPWS, an A/S PW connects the CE to PE1 and PE2 (left side of the diagram), and an MC-LAG connects the CE to PE3 and PE4 (right side of the diagram). Because EVPN multihoming is not used, there are no AD per-ES routes or ES routes. The redundancy is handled as follows:

  • PE1 and PE2 are configured with Epipe-1, where a spoke SDP connects the service in each PE to the access CE. The local AC Ethernet tag is 1 and the remote AC Ethernet tag is 2 (in PE1/PE2).

  • PE3 and PE4 are configured with Epipe-1, where each PE has a lag SAP that belongs to a previously-configured MC-LAG construct. The local AC Ethernet tag is 2 and the remote AC Ethernet tag is 1.

  • An endpoint and A/S PW is configured on the CE on the left side of the diagram. PE1/PE2 are able to advertise Ethernet tag 1 based on the operating status or the forwarding status of the spoke SDP.

    For example, if PE1 receives a standby PW status indication from the CE and the previous status was forward, it withdraws the AD EVI route for Ethernet tag 1. If PE2 receives a forward PW status indication and the previous status was standby or down, it advertises the AD EVI route for Ethernet tag 1.

  • The user can configure MC-LAG for access SAPs using the example configuration of PE3 and PE4, as shown in A/S PW and MC-LAG support on EVPN-VPWS. In this case, the MC-LAG determines which chassis is active and which is standby.

    If PE4 becomes the standby chassis, the entire LAG port is brought down. As a result, the SAP goes operationally down and PE4 withdraws any previous AD EVI routes for Ethernet tag 2.

    If PE3 becomes the active chassis, the LAG port becomes operationally up. As a result, the SAP comes up and PE3 advertises the AD per-EVI route for Ethernet tag 2.

EVPN multihoming for EVPN-VPWS services

EVPN multihoming is supported for EVPN-VPWS Epipe services with the following considerations:

  • Single-active and all-active multihoming is supported for SAPs and spoke SDPs.

  • ESs can be shared between the Epipe (MPLS and VXLAN) and VPLS (MPLS) services for LAGs, ports, and SDPs.

  • No split-horizon function is required, because no traffic exists between the Designated Forwarder (DF) and the non-DF for Epipe services. As a result, the ESI label is never used, and ESI label-related settings do not affect Epipe services. Additionally, the single-active-no-esi-label or all-active-no-esi-label modes can be configured to increase the scale of Ethernet Segments for EVPN-VPWS services.

    • MD-CLI
      configure service system bgp evpn ethernet-segment multi-homing-mode single-active-no-esi-label
      configure service system bgp evpn ethernet-segment multi-homing-mode all-active-no-esi-label
      configure service system bgp evpn ethernet-segment pbb source-bmac-lsb
    • classic CLI
      configure service system bgp-evpn ethernet-segment multi-homing single-active no-esi-label
      configure service system bgp-evpn ethernet-segment multi-homing all-active no-esi-label
      configure service system bgp-evpn ethernet-segment source-bmac-lsb
  • The local Ethernet tag values must match on all PEs that are part of the same ES, regardless of the multihoming mode. The PEs in the ES use the AD per-EVI routes from the peer PEs to validate the PEs as DF election candidates for a specific EVI.

The DF election for Epipes defined in an all-active multihoming ES is not relevant, because all PEs in the ES behave in the same way:

  • All PEs send P=1 on the AD per-EVI routes.

  • All PEs can send upstream and downstream traffic, regardless of whether the traffic is unicast, multicast, or broadcast (all traffic is treated as unicast in the Epipe services).

    Therefore, the following tools command shows N/A when all-active multihoming is configured.

    *A:PE-2# tools dump service system bgp-evpn ethernet-segment "ESI-12" evi 6000 df
    [03/18/2016 20:31:35] All Active VPWS - DF N/A
    

Aliasing is supported for traffic sent to an ES destination. If ECMP is enabled on the ingress PE, per-flow load balancing is performed to all PEs that advertise P=1. The PEs that advertise P=0 are not considered as next hops for an ES destination.

Note: The ingress PE load balances the traffic if shared queuing or ingress policing is enabled on the access SAPs.

Although DF election is not relevant for Epipes in an all-active multihoming ES, it is essential for the following forwarding and backup functions in a single-active multihoming ES:

  • The PE elected as DF is the primary PE for the ES in the Epipe. The primary PE unblocks the SAP or spoke SDP for upstream and downstream traffic; the remaining PEs in the ES bring their ES SAPs or spoke SDPs operationally down.

  • The DF candidate list is built from the PEs sending ES routes for the same ES and is pruned for a specific service, depending on the availability of the AD per-ES and per-EVI routes.

  • When the SAPs or spoke SDPs that are part of the ES come up, the AD per-EVI routes are sent with P=0 and B=0. The remote PEs do not start sending traffic until the DF election process is complete, the ES activation timer has expired, and the PEs advertise AD per-EVI routes with P and B bits other than zero.

  • The backup PE function is supported as defined in RFC 8214. The primary, backup, or none status is signaled by the PEs that are part of the same single-active multihoming ES in the P and B flags of the EVPN L2 attributes extended community. EVPN-VPWS single-active multihoming shows the advertisement and use of the primary, backup, or none indication by the PEs in the ES.

    Figure 14. EVPN-VPWS single-active multihoming

    As specified in RFC 7432, the remote PEs in VPLS services have knowledge of the primary PE in the remote single-active ES, based on the advertisement of the MAC/IP routes because only the DF learns and advertises MAC/IP routes.

    Because there are no MAC/IP routes in EVPN-VPWS, the remote PEs can forward the traffic based on the P/B bits. The process is described in the following list:

    1. The DF PE for an EVI (PE1) sends P=1 and B=0.

    2. For each ES or EVI, a second DF election is run among the PEs in the backup candidate list to elect the backup PE. The backup PE sends P=0 and B=1 (PE2).

    3. All remaining multi homing PEs send P=0 and B=0 (PE3 and PE4).

    4. At the remote PEs (PE5), the P and B flags are used to identify the primary and backup PEs within the ES destination. The traffic is then sent to the primary PE, provided that it is active.

  • When a remote PE receives the withdrawal of an Ethernet AD per-ES (or per-EVI) route from the primary PE, the remote PE immediately switches the traffic to the backup PE for the affected EVIs. The backup PE takes over immediately without waiting for the ES activation timer to bring up its SAP or spoke SDP.

  • The BGP-EVPN MPLS ECMP setting also governs the forwarding in single-active multihoming, regardless of the single-active multihoming bit in the AD per-ES route received at the remote PE (PE5).

    • PE5 always sends the traffic to the primary remote PE (the owner of the P=1 bit). In case of multiple primary PEs and ECMP>1, PE5 load balances the traffic to all primary PEs, regardless of the multi homing mode.

    • If the last primary PE withdraws its AD per-EVI or per-ES route, PE5 sends the traffic to the backup PE or PEs. In case of multiple backup PEs and ECMP>1, PE5 load balances the traffic to the backup PEs.
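The remote-PE selection logic described in this section (prefer the PEs advertising P=1, fall back to the PEs advertising B=1, and load balance when ECMP > 1) can be sketched as follows; this is an illustrative model, not router code.

```python
# Illustrative next-hop selection for an ES destination, based on the
# P/B flags received in the AD per-EVI routes (RFC 8214).
def select_es_next_hops(ad_routes, ecmp=1):
    # ad_routes: list of (pe, p_flag, b_flag) tuples
    primaries = [pe for pe, p, b in ad_routes if p]
    if primaries:
        return primaries[:max(1, ecmp)]   # load balance across primaries
    backups = [pe for pe, p, b in ad_routes if b]
    return backups[:max(1, ecmp)]         # immediate switch to the backup
```

For example, with PE1 advertising P=1, PE2 advertising B=1, and PE3/PE4 advertising P=0 and B=0, traffic is sent to PE1; if PE1 withdraws its route, traffic switches to PE2.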

Non-system IPv4/IPv6 VXLAN termination for EVPN-VPWS services

EVPN-VPWS services support non-system IPv4/IPv6 VXLAN termination. For system configuration information, see Non-system IPv4 and IPv6 VXLAN termination in VPLS, R-VPLS, and Epipe services.

EVPN multihoming is supported when the PEs use non-system IP termination; however, additional configuration steps are needed in this case:

  • The configure service system bgp-evpn eth-seg es-orig-ip ip-address command must be configured with the non-system IPv4/IPv6 address used for the EVPN-VPWS VXLAN service. This command modifies the originating-ip field in the ES routes advertised for the Ethernet Segment and makes the system use this IP address when adding the local PE as a DF candidate.

  • The configure service system bgp-evpn eth-seg route-next-hop ip-address command must be configured with the non-system IP address, too. The command changes the next-hop of the ES and AD per-ES routes to the configured address.

  • The non-system IP address (in each of the PEs in the ES) must match in these three commands for the local PE to be considered suitable for DF election:

    • es-orig-ip ip-address

    • route-next-hop ip-address

    • vxlan-src-vtep ip-address

EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes

Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows a Layer 3 DC model, where a VPRN is defined in the DGWs, connecting the tenant to the WAN. That VPRN instance is connected to the VPRNs in the NVEs by means of an IRB backhaul R-VPLS. Because the IRB backhaul R-VPLS provides connectivity only between the IRB interfaces, and the DGW VPRN is not directly connected to all the tenant subnets, the WAN IP prefixes in the VPRN routing table must be advertised in EVPN. In the same way, the NVEs send IP prefixes in EVPN that are received by the DGW and imported into the VPRN routing table.

Note: To generate or process IP prefixes sent or received in EVPN route-type 5, the support for IP route advertisement must be enabled in BGP-EVPN. This is performed through the bgp-evpn ip-route-advertisement command. This command is disabled by default and must be explicitly enabled. The command is tied to the allow-ip-int-bind command required for R-VPLS, and it is not supported on R-VPLS services linked to IES services.

Local router interface host addresses are not advertised in EVPN by default. To advertise them, the ip-route-advertisement incl-host command must be enabled. For example:

===============================================================================
Route Table (Service: 2)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                         Active     Metric
-------------------------------------------------------------------------------
10.1.1.0/24                                   Local   Local     00h00m11s  0
       if                                              Y            0
10.1.1.100/32                                 Local   Host      00h00m11s  0
       if                                              Y            0
==============================================================================

For the case displayed by the output above, the behavior is the following:

  • ip-route-advertisement (default, local subnet only): 10.1.1.0/24 is advertised

  • ip-route-advertisement incl-host (local subnet and host routes): 10.1.1.0/24 and 10.1.1.100/32 are advertised

Below is an example of VPRN (500) with two IRB interfaces connected to backhaul R-VPLS services 501 and 502 where EVPN-VXLAN runs:

vprn 500 customer 1 create            
            ecmp 4
            route-distinguisher 65072:500
            vrf-target target:65000:500
            interface "evi-502" create
                address 10.20.20.72/24
                vpls "evpn-vxlan-502"
                exit
            exit
            interface "evi-501" create
                address 10.10.10.72/24
                vpls "evpn-vxlan-501"
                exit
            exit
            no shutdown
vpls 501 name "evpn-vxlan-501" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 501 create
            exit
            bgp
                route-distinguisher 65072:501
                route-target export target:65000:501 import target:65000:501
            exit
            bgp-evpn
                ip-route-advertisement incl-host
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit                  
            exit
            no shutdown
        exit
vpls 502 name "evpn-vxlan-502" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 502 create
            exit
            bgp
                route-distinguisher 65072:502
                route-target export target:65000:502 import target:65000:502
            exit
            bgp-evpn
                ip-route-advertisement incl-host
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            no shutdown
        exit

When the above commands are enabled, the router behaves as follows:

  • Receive route-type 5 routes and import the IP prefixes and associated IP next-hops into the VPRN routing table.

    • If the route-type 5 is successfully imported by the router, the prefix included in the route-type 5 (for example, 10.0.0.0/24) is added to the VPRN routing table with a next-hop equal to the gateway IP included in the route (for example, 192.0.0.1, which refers to the IRB IP address of the remote VPRN behind which the IP prefix sits).

    • When the router receives a packet from the WAN to the 10.0.0.0/24 subnet, the IP lookup on the VPRN routing table yields 192.0.0.1 as the next-hop. That next-hop is resolved to a MAC in the ARP table, and the MAC is resolved to a VXLAN tunnel in the FDB table.

      Note: IRB MAC and IP addresses are advertised in the IRB backhaul R-VPLS in routes type 2.
  • Generate route-type 5 routes for the IP prefixes in the associated VPRN routing table.

    For example, if VPRN-1 is attached to EVPN R-VPLS 1 and EVPN R-VPLS 2, and R-VPLS 2 has bgp-evpn ip-route-advertisement configured, the 7750 SR advertises the R-VPLS 1 interface subnet in one route-type 5.

  • Routing policies can filter the imported and exported IP prefix routes accordingly.
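The import and forwarding chain described above (route-type 5 prefix, gateway IP next-hop, ARP, FDB) can be sketched as follows; this is an illustrative model with assumed table structures, not SR OS code.

```python
# Illustrative model of the IRB backhaul lookup chain for a packet
# received from the WAN (not SR OS code).
def forward_via_irb(route_table, arp_table, fdb, prefix):
    gw_ip = route_table.get(prefix)       # gateway IP from the route-type 5
    if gw_ip is None:
        return "drop: no route"
    mac = arp_table.get(gw_ip)
    if mac is None:
        return "ARP for " + gw_ip         # resolve the gateway IP to a MAC
    vtep = fdb.get(mac)
    if vtep is None:
        return "flood in TLS multicast list"
    return "forward via " + vtep
```

Contrast this with the EVPN-tunnel model, where the route carries a GW-MAC directly and no ARP resolution is needed.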

The VPRN routing table can receive routes from all the supported protocols (BGP-VPN, OSPF, IS-IS, RIP, static routing), as well as IP prefixes from EVPN, as shown below:

*A:PE72# show router 500 route-table                      
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
10.20.20.0/24                                 Local   Local     01d11h10m  0
       evi-502                                                      0
10.20.20.71/32                                Remote  BGP EVPN  00h02m26s  169
       10.10.10.71                                                  0
10.10.10.0/24                                Remote  Static    00h00m05s  5
       10.10.10.71                                                  1
10.16.0.1/32                                 Remote  BGP EVPN  00h02m26s  169
       10.10.10.71                                                  0
-------------------------------------------------------------------------------
No. of Routes: 4

The following considerations apply:

  • The route Preference for EVPN IP prefixes is 169.

    BGP IP-VPN routes have a preference of 170 by default; therefore, if the same route is received from the WAN over BGP IP-VPN and from BGP-EVPN, the EVPN route is preferred.

  • When the same route-type 5 prefix is received from different gateway IPs, ECMP is supported if configured in the VPRN.

  • All routes in the VPRN routing table (as long as they do not point back to the EVPN R-VPLS interface) are advertised via EVPN.

Although the description above is focused on IPv4 interfaces and prefixes, it applies to IPv6 interfaces too. The following considerations are specific to IPv6 VPRN R-VPLS interfaces:

  • IPv4 and IPv6 interfaces can be defined on R-VPLS IP interfaces at the same time (dual-stack).

  • The user may configure specific IPv6 Global Addresses on the VPRN R-VPLS interfaces. If a specific Global IPv6 Address is not configured on the interface, the interface MAC and Link Local Address are advertised in a route-type 2 as soon as IPv6 is enabled on the VPRN R-VPLS interface.

  • Routes type 5 for IPv6 prefixes are advertised using either the configured Global Address or the implicit Link Local Address (if no Global Address is configured).

    If more than one Global Address is configured, normally the first IPv6 address is used as the gateway IP. The "first IPv6 address" refers to the first one on the list of IPv6 addresses shown through the show router <id> interface <interface-name> ipv6 command or through SNMP.

    The rest of the addresses are advertised only in MAC/IP routes (route-type 2) and are not used as the gateway IP for IPv6 prefix routes.

EVPN for VXLAN in EVPN tunnel R-VPLS services

EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows an L3 connectivity model that optimizes the solution described in EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes. Instead of regular IRB backhaul R-VPLS services for the connectivity of all the VPRN IRB interfaces, EVPN tunnels can be configured. The main advantage of using EVPN tunnels is that they do not need the configuration of IP addresses, as regular IRB R-VPLS interfaces do.

In addition to the ip-route-advertisement command, this model requires the configuration of the config>service>vprn>if>vpls <name> evpn-tunnel command.

Note: The evpn-tunnel command can be enabled independently of ip-route-advertisement; however, no route-type 5 advertisements are sent or processed in that case. Neither the evpn-tunnel nor the ip-route-advertisement command is supported on R-VPLS services linked to IES interfaces.

The example below shows a VPRN (500) with an EVPN-tunnel R-VPLS (504):

vprn 500 customer 1 create
            ecmp 4
            route-distinguisher 65071:500
            vrf-target target:65000:500
            interface "evi-504" create
                vpls "evpn-vxlan-504"
                    evpn-tunnel
                exit
            exit
            no shutdown
        exit
        vpls 504 name "evpn-vxlan-504" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 504 create
            exit
            bgp
                route-distinguisher 65071:504
                route-target export target:65000:504 import target:65000:504
            exit
            bgp-evpn
                ip-route-advertisement
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            no shutdown
        exit

A specified VPRN supports regular IRB backhaul R-VPLS services as well as EVPN tunnel R-VPLS services.

Note: EVPN tunnel R-VPLS services do not support SAPs or SDP-binds.

The process followed upon receiving routes type 2 and 5 on a regular IRB R-VPLS interface differs from the one for an EVPN-tunnel type:

  • IRB backhaul R-VPLS VPRN interface:

    • When a route-type 2 that includes an IP prefix is received and it becomes active, the MAC/IP information is added to the FDB and ARP tables. This can be checked with the show router arp command and the show service id fdb detail command.

    • When route-type 5 is received and becomes active for the R-VPLS service, the IP prefix is added to the VPRN routing table, regardless of the existence of a route-type 2 that can resolve the gateway IP address. If a packet is received from the WAN side and the IP lookup hits an entry for which the gateway IP (IP next-hop) does not have an active ARP entry, the system uses ARP to get a MAC. If ARP is resolved but the MAC is unknown in the FDB table, the system floods into the TLS multicast list. Routes type 5 can be checked in the routing table with the show router route-table and show router fib commands.

  • EVPN tunnel R-VPLS VPRN interface:

    • When route-type 2 is received and becomes active, the MAC address is added to the FDB (only).

    • When a route-type 5 is received and active, the IP prefix is added to the VPRN routing table with next-hop equal to EVPN tunnel: GW-MAC.

      For example, ET-d8:45:ff:00:01:35, where the GW-MAC is added from the GW-MAC extended community sent along with the route-type 5.

      If a packet is received from the WAN side, and the IP lookup hits an entry for which the next-hop is an EVPN tunnel: GW-MAC, the system looks up the GW-MAC in the FDB. Usually, a route type 2 with the GW-MAC has previously been received so that the GW-MAC is present in the FDB. If the GW-MAC is not present in the FDB, the packet is dropped.

    • IP prefixes with GW-MACs as next-hops are displayed by the show router command, as shown below:

*A:PE71# show router 500 route-table 
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
10.20.20.72/32                                Remote  BGP EVPN  00h23m50s  169
       10.10.10.72                                                  0
10.30.30.0/24                                 Remote  BGP EVPN  01d11h30m  169
       evi-504 (ET-d8:45:ff:00:01:35)                               0
10.10.10.0/24                                Remote  BGP VPN   00h20m52s  170
       192.0.2.69 (tunneled)                                        0
10.1.0.0/16                                  Remote  BGP EVPN  00h22m33s  169
       evi-504 (ET-d8:45:ff:00:01:35)                               0
-------------------------------------------------------------------------------
No. of Routes: 4

The GW-MAC as well as the rest of the IP prefix BGP attributes are displayed by the show router bgp routes evpn ip-prefix command.

*A:Dut-A# show router bgp routes evpn ip-prefix prefix 3.0.1.6/32 detail
===============================================================================
BGP Router ID:10.20.1.1        AS:100         Local AS:100       
===============================================================================
Legend -
Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
Origin codes  : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
 
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
Original Attributes
 
Network        : N/A
Nexthop        : 10.20.1.2
From           : 10.20.1.2
Res. Nexthop   : 192.168.19.1
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                  
Connector      : None
Community      : target:100:1 mac-nh:00:00:01:00:01:02
                 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 10.20.1.2
Flags          : Used  Valid  Best  IGP 
Route Source   : Internal              
AS-Path        : No As-Path
EVPN type      : IP-PREFIX             
ESI            : N/A                    Tag            : 1
Gateway Address: 00:00:01:00:01:02     
Prefix         : 3.0.1.6/32             Route Dist.    : 10.20.1.2:1
MPLS Label    : 262140
Route Tag      : 0xb
Neighbor-AS    : N/A
Orig Validation: N/A                   
Source Class   : 0                      Dest Class     : 0
 
Modified Attributes
 
Network        : N/A                 
Nexthop        : 10.20.1.2
From           : 10.20.1.2
Res. Nexthop   : 192.168.19.1
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                  
Connector      : None
Community      : target:100:1 mac-nh:00:00:01:00:01:02
                 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 10.20.1.2
Flags          : Used  Valid  Best  IGP 
Route Source   : Internal              
AS-Path        : 111
EVPN type      : IP-PREFIX             
ESI            : N/A                    Tag            : 1
Gateway Address: 00:00:01:00:01:02     
Prefix         : 3.0.1.6/32             Route Dist.    : 10.20.1.2:1
MPLS Label    : 262140
Route Tag      : 0xb
Neighbor-AS    : 111
Orig Validation: N/A                   
Source Class   : 0                      Dest Class     : 0
 
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
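The forwarding resolution for EVPN-tunnel next-hops described above can be sketched as follows (a minimal illustrative model; the dictionary-based FDB and function names are assumptions, not SR OS internals):

```python
def forward_via_evpn_tunnel(gw_mac, fdb):
    # gw_mac: the GW-MAC taken from the route's next-hop (learned from
    # the GW-MAC extended community carried with the route type 5).
    # fdb: MAC address -> (VTEP, VNI), normally populated when a route
    # type 2 containing the GW-MAC is received.
    dest = fdb.get(gw_mac)
    if dest is None:
        # GW-MAC not present in the FDB: the packet is dropped.
        return None
    return dest  # forward on the VXLAN tunnel bound to the GW-MAC


fdb = {"d8:45:ff:00:01:35": ("192.0.2.72", 504)}
forward_via_evpn_tunnel("d8:45:ff:00:01:35", fdb)  # -> ("192.0.2.72", 504)
forward_via_evpn_tunnel("d8:45:ff:00:99:99", fdb)  # -> None (dropped)
```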

EVPN tunneling is also supported on IPv6 VPRN interfaces. When sending IPv6 prefixes from IPv6 interfaces, the GW-MAC in the route type 5 (IP-prefix route) is always zero. If no specific Global Address is configured on the IPv6 interface, the routes type 5 for IPv6 prefixes are always sent using the Link Local Address as GW-IP. The following example output shows an IPv6 prefix received through BGP EVPN.

*A:PE71# show router 30 route-table ipv6 
 
===============================================================================
IPv6 Route Table (Service: 30)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
2001:db8:1000::/64                            Local   Local     00h01m19s  0
       int-PE-71-CE-1                                               0
2001:db8:2000::1/128                          Remote  BGP EVPN  00h01m20s  169
       fe80::da45:ffff:fe00:6a-"int-evi-301"                        0
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
 
*A:PE71# show router bgp routes evpn ipv6-prefix prefix 2001:db8:2000::1/128 hunt 
===============================================================================
 BGP Router ID:192.0.2.71       AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked
 Origin codes  : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
 
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network        : N/A
Nexthop        : 192.0.2.69
From           : 192.0.2.69
Res. Nexthop   : 192.168.19.2
Local Pref.    : 100                    Interface Name : int-71-69
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                   
Connector      : None
Community      : target:64500:301 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.69
Flags          : Used  Valid  Best  IGP  
Route Source   : Internal
AS-Path        : No As-Path
EVPN type      : IP-PREFIX              
ESI            : N/A                    Tag            : 301
Gateway Address: fe80::da45:ffff:fe00:* 
Prefix         : 2001:db8:2000::1/128   Route Dist.    : 192.0.2.69:301
MPLS Label     : 0                      
Route Tag      : 0                      
Neighbor-AS    : N/A
Orig Validation: N/A                    
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
Last Modified  : 00h41m17s              
 
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 1
=============================================================================== 

Layer 2 multicast optimization for VXLAN (Assisted-Replication)

The Assisted-Replication feature for IPv4 VXLAN tunnels (both Leaf and Replicator functions) is supported in compliance with the non-selective mode described in IETF Draft draft-ietf-bess-evpn-optimized-ir.

The Assisted-Replication feature is a Layer 2 multicast optimization that helps software-based PEs and NVEs with limited replication performance deliver broadcast and multicast Layer 2 traffic to remote VTEPs in the VPLS service.

The EVPN and proxy-ARP/ND capabilities can reduce the amount of broadcast and unknown unicast in the VPLS service; ingress replication is sufficient for most use cases in this scenario. However, when multicast applications require a significant amount of replication at the ingress node, software-based nodes struggle because of their limited replication performance. By enabling the Assisted-Replication Leaf function on the software-based SR-series router, all the broadcast and multicast packets are sent to a 7x50 router configured as a Replicator, which replicates the traffic to all the VTEPs in the VPLS service on behalf of the Leaf. This guarantees that the broadcast or multicast traffic is delivered to all the VPLS participants without any packet loss caused by performance issues.

The Leaf or Replicator function is enabled per VPLS service by the configure service vpls vxlan assisted-replication {replicator | leaf} command. In addition, the Replicator requires the configuration of an Assisted-Replication IP (AR-IP) address. VXLAN packets received on the AR-IP loopback address must be replicated to the remote VTEPs. The AR-IP address is configured using the configure service system vxlan assisted-replication-ip <ip-address> command.

Based on the assisted-replication {replicator | leaf} configuration, the SR-series router can behave as a Replicator (AR-R), Leaf (AR-L), or Regular Network Virtualization Edge (RNVE) router. An RNVE router does not support the Assisted-Replication feature. Because it is configured with no assisted replication, the RNVE router ignores the AR-R and AR-L information and replicates to its flooding list where VTEPs are added based on the regular ingress replication routes.

Replicator (AR-R) procedures

An AR-R configuration is shown in the following example.

*A:PE-2>config>service>system>vxlan# info 
----------------------------------------------
            assisted-replication-ip 10.2.2.2
----------------------------------------------
*A:PE-2>config>service>vpls# info 
----------------------------------------------
            vxlan instance 1 vni 4000 create
                assisted-replication replicator
            exit
            bgp
            exit
            bgp-evpn
                evi 4000
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
<snip>
            no shutdown
----------------------------------------------

In this example configuration, BGP advertises a new inclusive multicast route with tunnel-type = AR, type (T) = AR-R, and tunnel-id = originating-ip = next-hop = assisted-replication-ip (IP address 10.2.2.2 in the preceding example). In addition to the AR route, the AR-R sends a regular IR route if ingress-repl-inc-mcast-advertisement is enabled.

Note: You should disable the ingress-repl-inc-mcast-advertisement command if the AR-R does not have any SAP or SDP bindings and is used solely for Assisted-Replication functions.

The AR-R builds a flooding list composed of ACs (SAPs and SDP bindings) and VXLAN tunnels to remote nodes in the VPLS. All objects in the flooding list are broadcast/multicast (BM) and unknown unicast (U) capable. The following example output of the show service id vxlan command shows that the VXLAN destinations in the flooding list are tagged as ‟BUM”.

*A:PE-2# show service id 4000 vxlan 
===============================================================================
Vxlan Src Vtep IP: N/A
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 4000
Creation Origin: manual
Assisted-Replication: replicator
RestProtSrcMacAct: none
===============================================================================
VPLS VXLAN service Network Specifics
===============================================================================
Ing Net QoS Policy : none                            Vxlan VNI Id     : 4000
Ingress FP QGrp    : (none)                          Ing FP QGrp Inst : (none)
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address                            Egress VNI  Num. MACs   Mcast Oper  L2
                                                                      State PBR
-------------------------------------------------------------------------------
192.0.2.3                               4000        0           BUM   Up    No
192.0.2.5                               4000        0           BUM   Up    No
192.0.2.6                               4000        0           BUM   Up    No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================

When the AR-R receives a BUM packet on an AC, the AR-R forwards the packet to its flooding list (including the local ACs and remote VTEPs).

When the AR-R receives a BM packet on a VXLAN tunnel, it checks the IP DA of the underlay IP header and performs the following BM packet processing.

Note: The AR-R function is only relevant to BM packets; it does not apply to unknown unicast packets. If the AR-R receives unknown unicast packets, it sends them to the flooding list, skipping the VXLAN tunnels.
  • If the destination IP matches its AR-IP, the AR-R forwards the BM packet to its flooding list (ACs and VXLAN tunnels). The AR-R performs source suppression to ensure that the traffic is not sent back to the originating Leaf.

  • If the destination IP matches its regular VXLAN termination IP (IR-IP), the AR-R skips all the VXLAN tunnels from the flooding list and only replicates to the local ACs. This is the default Ingress Replication (IR) behavior.
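The two cases above can be summarized in a short sketch (illustrative only; the flood-list representation and function name are assumptions, not SR OS internals):

```python
def ar_r_replicate(dest_ip, ar_ip, src_vtep, flood_list):
    # flood_list entries: ("ac", id) for local SAP/SDP bindings and
    # ("vtep", ip) for remote VXLAN tunnels.
    if dest_ip == ar_ip:
        # BM packet addressed to the AR-IP: replicate to ACs and VXLAN
        # tunnels, suppressing the originating Leaf (source suppression).
        return [d for d in flood_list if d != ("vtep", src_vtep)]
    # Packet addressed to the regular IR-IP: skip all VXLAN tunnels and
    # replicate to local ACs only (default ingress-replication behavior).
    return [d for d in flood_list if d[0] == "ac"]
```

For example, a BM packet sent by a Leaf at 192.0.2.3 to the AR-IP is replicated to all ACs and to the other VTEPs, but never back to 192.0.2.3.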

Leaf (AR-L) procedures

An AR-L is configured as shown in the following example.

A:PE-3>config>service>vpls# info 
----------------------------------------------
            vxlan instance 1 vni 4000 create
                assisted-replication leaf replicator-activation-time 30
            exit
            bgp
            exit
            bgp-evpn
                evi 4000
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
                mpls
                    shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap 1/1/1:4000 create
                no shutdown
            exit
            no shutdown
----------------------------------------------

In this example configuration, BGP advertises a new inclusive multicast route with tunnel-type = IR, type (T) = AR-L, and tunnel-id = originating-ip = next-hop = IR-IP (the IP address that normally terminates VXLAN: either the system-ip or the vxlan-src-vtep address).

The AR-L builds a single flooding list per service, controlled by the BM and U flags. These flags are displayed in the following show service id vxlan command example output.

A:PE-3# show service id 4000 vxlan 
===============================================================================
Vxlan Src Vtep IP: N/A
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 4000
Creation Origin: manual
Assisted-Replication: leaf      Replicator-Activation-Time: 30
RestProtSrcMacAct: none
===============================================================================
VPLS VXLAN service Network Specifics
===============================================================================
Ing Net QoS Policy : none                            Vxlan VNI Id     : 4000
Ingress FP QGrp    : (none)                          Ing FP QGrp Inst : (none)
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address                            Egress VNI  Num. MACs   Mcast Oper  L2
                                                                      State PBR
-------------------------------------------------------------------------------
10.2.2.2                                 4000        0           BM    Up    No
10.4.4.4                                 4000        0           -     Up    No
192.0.2.2                               4000        0           U     Up    No
192.0.2.5                               4000        0           U     Up    No
192.0.2.6                               4000        0           U     Up    No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 5
-------------------------------------------------------------------------------
===============================================================================

The AR-L creates the following VXLAN destinations when it receives and selects Replicator-AR or Regular-IR routes:

  • A VXLAN destination to each remote PE that sent an IR route. These bindings have the U flag set.

  • A VXLAN destination to the selected AR-R. These bindings have only the BM flag set; the U flag is not set.

  • A VXLAN destination to each non-selected AR-R, created with flag ‟-” (in the CPM) and displayed by the show service id vxlan command. Although the VXLAN destinations to non-selected AR-Rs do not carry any traffic, the destinations count against the total limit and must be considered when accounting for consumed VXLAN destinations in the router.

The BM traffic is only sent to the selected AR-R, whereas the U (unknown unicast) traffic is sent to all the destinations with the U flag.
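The flag-driven flooding behavior can be sketched as follows (an illustrative model; the flag-set representation mirrors the show output above and is an assumption, not SR OS code):

```python
def flood_targets(traffic_type, vxlan_dests):
    # vxlan_dests: list of (vtep, flags) where flags is a set such as
    # {"BM"} for the selected AR-R, {"U"} for regular IR tunnels, or
    # set() for non-selected AR-Rs (which carry no traffic).
    needed = "BM" if traffic_type in ("broadcast", "multicast") else "U"
    return [vtep for vtep, flags in vxlan_dests if needed in flags]


dests = [("10.2.2.2", {"BM"}), ("10.4.4.4", set()), ("192.0.2.2", {"U"})]
flood_targets("broadcast", dests)        # -> ["10.2.2.2"]
flood_targets("unknown-unicast", dests)  # -> ["192.0.2.2"]
```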

The AR-L performs per-service load-balancing of the BM traffic when two or more AR-Rs exist in the same service. The AR-L builds an ordered list of candidate AR-Rs (ordered by IP and VNI, with candidate 0 having the lowest IP and VNI). The replicator is selected using a modulo function of the service ID and the number of replicators, as shown in the following example output.

A:PE-3# show service id 4000 vxlan assisted-replication replicator 
===============================================================================
Vxlan AR Replicator Candidates
===============================================================================
VTEP Address           Egress VNI     In Use  In Candidate List Pending Time
-------------------------------------------------------------------------------
10.2.2.2                4000           yes     yes               0
10.4.4.4                4000           no      yes               0
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================
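The selection rule can be sketched as follows (an illustrative sketch of the modulo function described above, not the exact SR OS implementation; ordering by numeric IP and VNI is assumed):

```python
import ipaddress

def select_ar_r(service_id, candidates):
    # candidates: list of (ip, vni) tuples for the AR-Rs in the service.
    # Candidate 0 is the one with the lowest IP and VNI; the index into
    # the ordered list is service-id modulo the number of replicators.
    ordered = sorted(candidates,
                     key=lambda c: (ipaddress.ip_address(c[0]), c[1]))
    return ordered[service_id % len(ordered)]


# With the two candidates from the show output above and service 4000:
select_ar_r(4000, [("10.4.4.4", 4000), ("10.2.2.2", 4000)])
# -> ("10.2.2.2", 4000), matching the "In Use" candidate
```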

A change in the number of Replicator-AR routes (for example, if a route is withdrawn or a new route appears) affects the result of the hashing, which may cause a different AR-R to be selected.

Note: An AR-L waits for the configured replicator-activation-time before sending the BM packets to the AR-R. In the interim, the AR-L uses regular ingress replication procedures. This activation time allows the AR-R to program the Leaf VTEP. If the timer is zero, the AR-R may receive packets from a not-yet-programmed source VTEP, in which case it discards the packets.

The following list summarizes other aspects of the AR-L behavior:

  • When a Leaf receives a BM packet on an AC, it sends the packet to its flood list, which includes access SAP or SDP bindings and VXLAN destinations with the BM or BUM flags. Because a single AR-R is selected, only one VXLAN destination has the BM flag set.

  • Control plane-generated BM packets, such as ARP/ND (when proxy-ARP/ND is enabled) or Eth-CFM, follow the behavior of regular data plane BM packets.

  • When a Leaf receives an unknown unicast packet on an AC, it sends the packet to the flood-list, skipping the AR destination because the U flag is set to 0. To avoid packet re-ordering, the unknown unicast packets do not go through the AR-R.

  • When a Leaf receives a BUM packet on an overlay tunnel, it forwards the packet to the flood list, skipping the VXLAN tunnels (that is, the packet is sent to the local ACs and never to a VXLAN tunnel). This is the default IR behavior.

  • When the last Replicator-AR route is withdrawn, the AR-L removes the AR destination from the flood list and falls back to ingress replication.

AR BM replication behavior for a BM packet shows the expected replication behavior for BM traffic when received at the access on an AR-R, AR-L, or RNVE router. Unknown unicast follows regular ingress replication behavior regardless of the role of the ingress node for the specific service.

Figure 15. AR BM replication behavior for a BM packet

Assisted-Replication interaction with other VPLS features

The Assisted-Replication feature has the following limitations:

  • The following features are not supported on the same service where the Assisted-Replication feature is enabled.

    • Aggregate QoS per VNI

    • VXLAN IPv6 transport

    • IGMP/MLD/PIM-snooping

  • Assisted-Replication Leaf and Replicator functions are mutually exclusive within the same VPLS service.

  • The Assisted-Replication feature is supported with IPv4 non-system-ip VXLAN termination. However, the configured assisted-replication-ip (AR-IP) must be different from the tunnel termination IP address.

  • The AR-IP address must be a /32 loopback interface on the base router.

  • The Assisted-Replication feature is only supported in EVPN-VXLAN services (VPLS with BGP-EVPN vxlan enabled). Although services with a combination of EVPN-MPLS and EVPN-VXLAN are supported, the Assisted-Replication configuration is only relevant to the VXLAN.

DGW policy based forwarding/routing to an EVPN ESI

The Nuage Virtual Services Platform (VSP) supports a service chaining function that allows the operator to steer traffic through a number of services (also known as Service Functions), such as firewalls (FW), load balancers (LB), NAT, or IPS/IDS, between application hosts. In the DC, tenants want the ability to specify these functions and their sequence, so that services can be added or removed without requiring changes to the underlying application.

This service chaining function is built on a series of policy-based routing/forwarding redirect rules that are automatically coordinated and abstracted by the Nuage Virtual Services Directory (VSD). From a networking perspective, the packets are redirected hop by hop based on the location of the corresponding Service Function (SF) in the DC fabric. The location of the SF is specified by its VTEP and VNI, and is advertised by BGP-EVPN along with an Ethernet Segment Identifier (ESI) that is uniquely associated with the SF.

For more information about the Nuage Service Chaining solution, see the Nuage VSP documentation.

The 7750 SR, 7450 ESS, or 7950 XRS can be integrated as the first hop in the chain in a Nuage DC. This service chaining integration is intended to be used as described in the following three use cases.

Policy based forwarding in VPLS services for Nuage Service Chaining integration in L2-domains

PBF to ESI function shows the 7750 SR, 7450 ESS, and 7950 XRS Service Chaining integration with the Nuage VSP on VPLS services. In this example, the DC gateway, PE1, is connected to an L2-DOMAIN that exists in the DC and must redirect the traffic to the Service Function SF-1. The regular Layer 2 forwarding procedures would have taken the packets to PE2, as opposed to SF-1.

Figure 16. PBF to ESI function

An operator must configure a PBF match/action filter policy entry in an IPv4 or MAC ingress access or network filter deployed on a VPLS interface using CLI/SNMP/NETCONF management interfaces. The PBF target is the first service function in the chain (SF-1) that is identified by an ESI.

In the example shown in PBF to ESI function, the PBF filter redirects the matching packets to ESI 0x01 in VPLS-1.

Note: PBF to ESI function represents the ESI as ‟0x01” for simplicity; in reality, the ESI is a 10-byte number.

As soon as the redirection target is configured and associated with the vport connected to SF-1, the Nuage VSC (Virtual Services Controller, or the remote PE3 in the example) advertises the location of SF-1 via an Auto-Discovery Ethernet Tag route (route type 1) per-EVI. In this AD route, the ESI associated with SF-1 (ESI 0x01) is advertised along with the VTEP (PE3's IP) and VNI (VNI-1) identifying the vport where SF-1 is connected. PE1 sends all the frames matching the ingress filter to PE3's VTEP and VNI-1.

Note: When packets get to PE3, VNI-1 (the VNI advertised in the AD route) indicates that a cut-through switching operation is needed to deliver the packets straight to the SF-1 vport, without the need for a regular MAC lookup.

The following filter configuration shows an example of PBF rule redirecting all the frames to an ESI.

A:PE1>config>filter>mac-filter# info 
----------------------------------------------
            default-action forward
            entry 10 create
                action
                    forward esi ff:00:00:00:00:00:00:00:00:01 service-id 301
                exit
            exit

When the filter is properly applied to the VPLS service (VPLS-301 in this example), it shows 'Active' in the following show commands as long as the Auto-Discovery route for the ESI is received and imported.

A:PE1# show filter mac 1 
===============================================================================
Mac Filter
===============================================================================
Filter Id   : 1                                Applied         : Yes
Scope       : Template                         Def. Action     : Forward
Entries     : 1                                Type            : normal
Description : (Not Specified)
-------------------------------------------------------------------------------
Filter Match Criteria : Mac
-------------------------------------------------------------------------------
Entry       : 10                               FrameType       : Ethernet
Description : (Not Specified)
Log Id      : n/a                              
Src Mac     : Undefined
Dest Mac    : Undefined
Dot1p       : Undefined                        Ethertype       : Undefined
DSAP        : Undefined                        SSAP            : Undefined
Snap-pid    : Undefined                        ESnap-oui-zero  : Undefined
Match action: Forward (ESI) Active             
  ESI       : ff:00:00:00:00:00:00:00:00:01
  Svc Id    : 301                              
PBR Down Act: Forward (entry-default)          
Ing. Matches: 3 pkts
Egr. Matches: 0 pkts
===============================================================================

A:PE1# show service id 301 es-pbr 
===============================================================================
L2 ES PBR
===============================================================================
ESI                           Users      Status
                                         VTEP:VNI
-------------------------------------------------------------------------------
ff:00:00:00:00:00:00:00:00:01 1          Active
                                         192.0.2.72:7272
-------------------------------------------------------------------------------
Number of entries : 1
-------------------------------------------------------------------------------
===============================================================================

Details of the received AD route that resolves the filter forwarding are shown in the following show router bgp routes command.

A:PE1# show router bgp routes evpn auto-disc esi ff:00:00:00:00:00:00:00:00:01
===============================================================================
 BGP Router ID:192.0.2.71       AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup
 Origin codes  : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP EVPN Auto-Disc Routes
===============================================================================
Flag  Route Dist.         ESI                           NextHop
      Tag                                               Label
-------------------------------------------------------------------------------

u*>i  192.0.2.72:100      ff:00:00:00:00:00:00:00:00:01 192.0.2.72
      0                                                 VNI 7272

-------------------------------------------------------------------------------
Routes : 1
===============================================================================

This AD route, when used for PBF redirection, is added to the list of EVPN-VXLAN bindings for the VPLS service and shown as 'L2 PBR' type:

A:PE1# show service id 301 vxlan 
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 301
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address           Egress VNI     Num. MACs    Mcast   Oper State   L2 PBR
-------------------------------------------------------------------------------
192.0.2.69             301            1            Yes     Up           No
192.0.2.72             301            1            Yes     Up           No
192.0.2.72             7272           0            No      Up           Yes
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================

If the AD route is withdrawn, the binding disappears and the filter is inactive again. The user can control whether the matching packets are dropped or forwarded if the PBF target cannot be resolved by BGP.

Note: ES-based PBF filters can be applied only on services with the default bgp (vxlan) instance (instance 1).

Policy-based routing in VPRN services for Nuage Service Chaining integration in L2-DOMAIN-IRB domains

PBR to ESI function shows the 7750 SR, 7450 ESS, and 7950 XRS Service Chaining integration with the Nuage VSP on L2-DOMAIN-IRB domains. In this example, the DC gateway, PE1, is connected to an L2-DOMAIN-IRB that exists in the DC and must redirect the traffic to the Service Function SF-1 with IP address 10.10.10.1. The regular Layer 3 forwarding procedures would have taken the packets to PE2, as opposed to SF-1.

Figure 17. PBR to ESI function

In this case, an operator must configure a PBR match/action filter policy entry in an IPv4 ingress access or network filter deployed on an IES/VPRN interface using the CLI, SNMP, or NETCONF management interfaces. The PBR target identifies the first service function in the chain (ESI 0x01 in PBR to ESI function, identifying where the Service Function is connected and the IPv4 address of the SF) and the EVPN VXLAN egress interface on the PE (VPRN routing instance and R-VPLS interface name). The BGP control plane, together with the ESI PBR configuration, is used to forward the matching packets to the next hop in the EVPN-VXLAN data center chain (through resolution to a VNI and VTEP). If the BGP control plane information is not available, the packets matching the ESI PBR entry are, by default, forwarded using regular routing. Optionally, an operator can choose to drop the packets when the ESI PBR target is not reachable.

The following filter configuration shows an example of a PBR rule redirecting all the matching packets to an ESI.

*A:PE1>config>filter>ip-filter# info 
----------------------------------------------
            default-action forward
            entry 10 create
                match 
                    dst-ip 10.10.10.253/32
                exit 
                action
                    forward esi ff:00:00:00:00:21:5f:00:df:e5 sf-ip 10.10.10.1 vas-interface "evi-301" router 300
                exit
                pbr-down-action-override filter-default-action
            exit 
----------------------------------------------

In this use case, the following are required in addition to the ESI: the sf-ip (10.10.10.1 in the example above), router instance (300), and vas-interface.

The sf-ip is used by the system to determine the inner MAC DA to use when sending the redirected packets to the SF. The SF IP address is resolved to the SF MAC following regular ARP procedures in EVPN-VXLAN.

The router instance may be the same as the one where the ingress filter is configured or may be different; for instance, the ingress PBR filter can be applied on an IES interface pointing at a VPRN router instance that is connected to the DC fabric.

The vas-interface refers to the R-VPLS interface name through which the SF can be found. The VPRN instance may have more than one R-VPLS interface, therefore, it is required to specify which R-VPLS interface to use.

When the filter is properly applied to the VPRN or IES service (VPRN-300 in this example), it shows 'Active' in the following show commands as long as the Auto-Discovery route for the ESI is received and imported and the SF-IP resolved to a MAC address.

*A:PE1# show filter ip 1 

===============================================================================
IP Filter
===============================================================================
Filter Id    : 1                                Applied        : Yes
Scope        : Template                         Def. Action    : Forward
System filter: Unchained                        
Radius Ins Pt: n/a                              
CrCtl. Ins Pt: n/a                              
RadSh. Ins Pt: n/a                              
PccRl. Ins Pt: n/a                              
Entries      : 1                                
Description  : (Not Specified)
-------------------------------------------------------------------------------
Filter Match Criteria : IP
-------------------------------------------------------------------------------
Entry        : 10
Description  : (Not Specified)
Log Id       : n/a                              
Src. IP      : 0.0.0.0/0
Src. Port    : n/a
Dest. IP     : 10.10.10.253/32
Dest. Port   : n/a
Protocol     : Undefined                        Dscp           : Undefined
ICMP Type    : Undefined                        ICMP Code      : Undefined
Fragment     : Off                              Src Route Opt  : Off
Sampling     : Off                              Int. Sampling  : On
IP-Option    : 0/0                              Multiple Option: Off
TCP-syn      : Off                              TCP-ack        : Off
Option-pres  : Off                              
Egress PBR   : Undefined                        
Match action : Forward (ESI) Active
  ESI        : ff:00:00:00:00:21:5f:00:df:e5    
  SF IP      : 10.10.10.1
  VAS If name: evi-301                          
  Router     : 300                              
PBR Down Act : Forward (filter-default-action) Ing. Matches : 3 pkts (318 bytes)
Egr. Matches : 0 pkts
===============================================================================

*A:PE1# show service id 300 es-pbr 
===============================================================================
L3 ES PBR
===============================================================================
SF IP              ESI                                 Users Status
                   Interface                                 MAC
                                                             VTEP:VNI
-------------------------------------------------------------------------------
10.10.10.1         ff:00:00:00:00:21:5f:00:df:e5       1     Active
                   evi-301                                   d8:47:01:01:00:0a
                                                             192.0.2.71:7171
-------------------------------------------------------------------------------
Number of entries : 1
-------------------------------------------------------------------------------
=================================================================================

In the FDB for the R-VPLS 301, the MAC address is associated with the VTEP and VNI specified by the AD route, and no longer by the MAC/IP route. When a PBR filter with a forward action to an ESI and SF-IP (Service Function IP) exists, a MAC route is auto-created by the system and this route has higher priority than the remote MAC or IP routes for the MAC (see BGP and EVPN route selection for EVPN routes).

The following shows that the AD route creates a new EVPN-VXLAN binding and the MAC address associated with the SF-IP uses that 'binding':

*A:PE1# show service id 301 vxlan 
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 301
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address           Egress VNI     Num. MACs    Mcast   Oper State   L2 PBR
-------------------------------------------------------------------------------
192.0.2.69             301            1            Yes     Up           No
192.0.2.71             301            0            Yes     Up           No
192.0.2.71             7171           1            No      Up           No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================
*A:PE1# show service id 301 fdb detail 
===============================================================================
Forwarding Database, Service 301
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
301       d8:45:ff:00:00:6a vxlan-1:                 EvpnS    06/15/15 21:55:27
                            192.0.2.69:301
301       d8:47:01:01:00:0a vxlan-1:                 EvpnS    06/15/15 22:32:56
                            192.0.2.71:7171
301       d8:48:ff:00:00:6a cpm                      Intf     06/15/15 21:54:12
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

For Layer 2, if the AD route is withdrawn or the SF-IP ARP is not resolved, the filter becomes inactive again. The user can control whether the matching packets are dropped or forwarded if the PBF target cannot be resolved by BGP.

EVPN VXLAN multihoming

SR OS supports EVPN VXLAN multihoming as specified in RFC 8365. Similar to EVPN-MPLS, as described in EVPN for MPLS tunnels, ESs and virtual ESs can be associated with VPLS and R-VPLS services where BGP-EVPN VXLAN is enabled. EVPN multihoming for EVPN-VXLAN illustrates the use of ESs in EVPN VXLAN networks.

Figure 18. EVPN multihoming for EVPN-VXLAN

As described in EVPN multihoming in VPLS services, the multihoming procedures consist of three components:

  • Designated Forwarder (DF) election

  • split-horizon

  • aliasing

DF election is the mechanism by which the PEs attached to the same ES elect a single PE to forward all traffic (in case of single-active mode) or all BUM traffic (in case of all-active mode) to the multihomed CE. The same DF Election mechanisms described in EVPN for MPLS tunnels are supported for VXLAN services.
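As an illustration of one of the supported mechanisms, the default service-carving algorithm defined in RFC 7432 can be sketched as follows. This is illustrative Python, not SR OS code, and the PE addresses are hypothetical: the PEs attached to the ES are ordered by originator IP address, and the DF for a given EVI is the PE at index (EVI mod number of PEs).

```python
# Illustrative sketch (not SR OS code) of RFC 7432 default service-carving
# DF election: order the candidate PEs by IP address, then select the PE
# at index (evi mod number_of_candidates) as the DF for that EVI.
import ipaddress

def elect_df(candidate_ips, evi):
    """Return the originator IP of the DF for this EVI on a shared ES."""
    # Sort candidates in increasing numerical order of their IP addresses.
    ordered = sorted(candidate_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[evi % len(ordered)]

# Hypothetical example: three PEs attached to the same ES.
pes = ["192.0.2.4", "192.0.2.3", "192.0.2.5"]
print(elect_df(pes, 1))   # EVI 1 -> index 1 of the ordered list -> 192.0.2.4
print(elect_df(pes, 3))   # EVI 3 -> index 0 of the ordered list -> 192.0.2.3
```

Note that the ES configuration example later in this section uses the preference-based manual mode instead, which elects the DF from the advertised preference values rather than from this modulo computation.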

Split-horizon is the mechanism by which BUM traffic received from a peer ES PE is filtered so that it is not looped back to the CE that first transmitted the frame. It is applicable to all-active multihoming. This is illustrated in EVPN multihoming for EVPN-VXLAN, where PE4 receives BUM traffic from PE3 but, in spite of being the DF for ES-2, PE4 filters the traffic and does not send it back to host-1. While split-horizon filtering uses ESI-labels in EVPN MPLS services, an alternative procedure called ‟Local Bias” is applied in VXLAN services, as described in RFC 8365. In MPLS services, split-horizon filtering may be used in single-active mode to avoid in-flight BUM packets from being looped back to the CE during transient times. In VXLAN services, split-horizon filtering is only used with all-active mode.

Aliasing is the procedure by which PEs that are not attached to the ES can process non-zero MAC/IP and AD routes and create ES destinations to which per-flow ecmp can be applied. Aliasing only applies to all-active mode.

As an example, the configuration of an ES that is used for VXLAN services follows. Note that this ES can be used for VXLAN services and MPLS services (in both cases VPLS and Epipes).

A:PE-3# configure service system bgp-evpn ethernet-segment "ES-2" 
A:PE-3>config>service>system>bgp-evpn>eth-seg# info 
----------------------------------------------
                esi 01:02:00:00:00:00:00:00:00:00
                service-carving
                    mode manual
                    manual
                        preference non-revertive create
                            value 10
                        exit
                    exit
                exit
                multi-homing all-active
                lag 1
                no shutdown
----------------------------------------------

An example configuration of a VXLAN service using this ES follows:

A:PE-3# configure service vpls 1 
A:PE-3>config>service>vpls# info 
----------------------------------------------
            vxlan instance 1 vni 1 create
            exit
            bgp
            exit
            bgp-evpn
                evi 1
                vxlan bgp 1 vxlan-instance 1
                    ecmp 2
                    auto-disc-route-advertisement
                    mh-mode network
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap lag-1:30 create
                no shutdown
            exit
            no shutdown
----------------------------------------------

The auto-disc-route-advertisement and mh-mode network commands are required in all services that are attached to at least one ES, and they must be configured both in the PEs attached to the ES locally and in the remote PEs in the same service. The former enables the advertising of multihoming routes in the service, whereas the latter activates the multihoming procedures for the service, including the local bias mode for split-horizon.

In addition, the configuration of vpls>bgp-evpn>vxlan>ecmp 2 (or greater) is required so that VXLAN ES destinations with two or more next hops can be used for per-flow load balancing. The following command shows how PE1, as shown in EVPN multihoming for EVPN-VXLAN, creates an ES destination composed of two VXLAN next hops.

A:PE-1# show service id 1 vxlan destinations 
===============================================================================
Egress VTEP, VNI
===============================================================================
Instance    VTEP Address                            Egress VNI  Evpn/   Num.
 Mcast       Oper State                              L2 PBR     Static  MACs
-------------------------------------------------------------------------------
1           192.0.2.3                               1           evpn    0
 BUM         Up                                      No                 
1           192.0.2.4                               1           evpn    0
 BUM         Up                                      No                 
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 2
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-VXLAN Ethernet Segment Dest
===============================================================================
Instance  Eth SegId                       Num. Macs     Last Change
-------------------------------------------------------------------------------
1         01:02:00:00:00:00:00:00:00:00   1             04/01/2019 08:54:54
-------------------------------------------------------------------------------
Number of entries: 1
-------------------------------------------------------------------------------
===============================================================================

A:PE-1# show service id 1 vxlan esi 01:02:00:00:00:00:00:00:00:00 
===============================================================================
BGP EVPN-VXLAN Ethernet Segment Dest
===============================================================================
Instance  Eth SegId                       Num. Macs     Last Change
-------------------------------------------------------------------------------
1         01:02:00:00:00:00:00:00:00:00   1             04/01/2019 08:54:54
-------------------------------------------------------------------------------
Number of entries: 1
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-VXLAN Dest TEP Info
===============================================================================
Instance  TEP Address                   Egr VNI             Last Change
-------------------------------------------------------------------------------
1         192.0.2.3                     1                   04/01/2019 08:54:54
1         192.0.2.4                     1                   04/01/2019 08:54:54
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================
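The per-flow selection over the two next hops shown above can be sketched as follows. The hash inputs and hash function are illustrative assumptions, not the actual SR OS load-balancing hash; the point is that a stable hash over flow fields pins each flow to one VTEP of the ES destination.

```python
# Hedged sketch (not the actual SR OS hash) of per-flow ECMP over the VXLAN
# next hops of an ES destination: a deterministic hash over flow fields
# selects one VTEP, so all packets of a flow use the same next hop.
import zlib

def pick_vtep(vteps, src_mac, dst_mac):
    """Select one VTEP for a flow; the same flow always maps to the same VTEP."""
    flow_key = (src_mac + dst_mac).encode()
    return vteps[zlib.crc32(flow_key) % len(vteps)]

es_next_hops = ["192.0.2.3", "192.0.2.4"]   # the two VTEPs of the ES destination
choice = pick_vtep(es_next_hops, "d8:45:ff:00:00:6a", "d8:47:01:01:00:0a")
assert choice in es_next_hops
# Determinism: repeating the lookup for the same flow returns the same VTEP.
assert choice == pick_vtep(es_next_hops, "d8:45:ff:00:00:6a", "d8:47:01:01:00:0a")
```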

Local bias for EVPN VXLAN multihoming

EVPN MPLS, as described in EVPN for MPLS tunnels, uses ESI-labels to identify the BUM traffic sourced from a specified ES. The egress PE performs a label lookup to find the ESI label below the EVI label and to determine if a frame can be forwarded to a local ES. Because VXLAN does not support ESI-labels, or any MPLS label for that matter, the split-horizon filtering must be based on the tunnel source IP address. This also implies that the SAP-to-SAP forwarding rules must be changed when the SAPs belong to local ESs, irrespective of the DF state. This new forwarding is what RFC 8365 refers to as local bias. EVPN-VXLAN multihoming with local bias illustrates the local bias forwarding behavior.

Figure 19. EVPN-VXLAN multihoming with local bias

Local bias is based on the following principles:

  • Every PE knows the IP addresses associated with the other PEs with which it has shared multihomed ESs.

  • When the PE receives a BUM frame from a VXLAN binding, it looks up the source IP address in the tunnel header and filters out the frame on all local interfaces connected to ESs that are shared with the ingress PE.

With this approach, the ingress PE must perform replication locally to all directly-attached ESs (regardless of the DF Election state) for all flooded traffic coming from the access interfaces. BUM frames received on any SAP are flooded to:

  • local non-ES SAPs and non-ES SDP-binds

  • local all-active ES SAPs (DF and NDF)

  • local single-active ES SDP-binds and SAPs (DF only)

  • EVPN-VXLAN destinations

As an example, in EVPN-VXLAN multihoming with local bias, PE2 receives BUM traffic from Host-3 and it forwards it to the remote PEs and the local ES SAP, even though the SAP is in NDF state.

The following rules apply to egress PE forwarding for EVPN-VXLAN services:

  • The source VTEP is looked up for BUM frames received on EVPN-VXLAN.

  • If the source VTEP matches one of the PEs with which the local PE shares both an ES and a VXLAN service:

    • the local PE does not forward the frames to the local SAPs on the shared ES

    • the local PE forwards normally to non-shared ES SAPs, unless they are in NDF state

  • Because there is no multicast label or multicast B-MAC in VXLAN, the egress PE identifies BUM traffic solely by the customer MAC DA; as a result, broadcast, multicast, or unknown unicast MAC DAs identify BUM traffic.

For example, in EVPN-VXLAN multihoming with local bias, PE3 receives BUM traffic on VXLAN. PE3 identifies the source VTEP as a PE with which two ESs are shared, therefore it does not forward the BUM frames to the two shared ESs. It forwards to the non-shared ES (Host-5) because it is in DF state. PE4 receives BUM traffic and forwards it based on normal rules because it does not share any ESs with PE2.
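The egress check in this example can be sketched as follows. This is illustrative Python, not SR OS code; the peer address 192.0.2.2 and the ES names are hypothetical placeholders for the figure's topology.

```python
# Illustrative sketch (not SR OS code) of the RFC 8365 egress local bias
# check: a BUM frame is dropped on a local ES SAP when the frame's source
# VTEP belongs to a PE that shares that ES with the local PE.

def forward_bum_to_sap(sap_es, src_vtep, shared_es_by_peer, df_es):
    """Decide whether a BUM frame received over VXLAN may go out a local SAP.

    sap_es            -- ESI of the ES the SAP belongs to (None for non-ES SAPs)
    src_vtep          -- source IP of the VXLAN tunnel the frame arrived on
    shared_es_by_peer -- dict: peer VTEP IP -> set of ESIs shared with that peer
    df_es             -- set of ESIs for which the local PE is the DF
    """
    if sap_es is None:
        return True                                  # non-ES SAPs always flood
    if sap_es in shared_es_by_peer.get(src_vtep, set()):
        return False                                 # local bias: ingress PE did it
    return sap_es in df_es                           # non-shared ES: DF only

# Hypothetical view of PE3: two ESs shared with the ingress PE, one not shared.
shared = {"192.0.2.2": {"ES-1", "ES-2"}}
df = {"ES-1", "Host-5-ES"}
assert forward_bum_to_sap("ES-1", "192.0.2.2", shared, df) is False   # filtered
assert forward_bum_to_sap("Host-5-ES", "192.0.2.2", shared, df) is True
```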

The following command can be used to check whether the local PE has enabled the local bias procedures for a specific ES:

A:PE-2# tools dump service system bgp-evpn ethernet-segment "ES-1" local-bias 
-------------------------------------------------------------------------------
[04/01/2019 08:45:08] Vxlan Local Bias Information 
----------------------------------------------------------------------+--------
Peer                                                                  | Enabled   
----------------------------------------------------------------------+--------
192.0.2.3                                                             | Yes       
-------------------------------------------------------------------------------

Known limitations for local bias

In EVPN MPLS networks, an ingress PE that uses ingress replication to flood unknown unicast traffic pushes a BUM MPLS label that is different from a unicast label. The egress PEs use this BUM label to identify such BUM traffic to apply DF filtering for All-Active multihomed sites. In PBB-EVPN, in addition to the multicast label, the egress PE can also rely on the multicast B-MAC DA to identify customer BUM traffic.

In VXLAN there are no BUM labels or any tunnel indication that can assist the egress PE in identifying the BUM traffic. As such, the egress PE must solely rely on the C-MAC destination address, which may create some transient issues that are depicted in EVPN-VXLAN multihoming and unknown unicast issues.

Figure 20. EVPN-VXLAN multihoming and unknown unicast issues

As shown in EVPN-VXLAN multihoming and unknown unicast issues, top diagram, in the absence of the mentioned unknown unicast traffic indication there can be transient duplicate traffic to all-active multihomed sites under the following condition: CE1’s MAC address is learned by the egress PEs (PE1 and PE2) and advertised to the ingress PE3; however, the MAC advertisement has not yet been received or processed by the ingress PE, so the host MAC address is unknown on the ingress PE3 but known on the egress PEs. Therefore, when a packet destined for CE1’s address arrives on PE3, it is flooded through ingress replication to PE1 and PE2 and, because CE1’s MAC is known to PE1 and PE2, multiple copies are sent to CE1.

Another issue is shown at the bottom of EVPN-VXLAN multihoming and unknown unicast issues. In this case, CE1’s MAC address is known on the ingress PE3 but unknown on PE1 and PE2. If PE3’s aliasing hashing picks up the path to the ES’ NDF, a black-hole occurs.

The above two issues are solved in MPLS because known and unknown unicast frames are identified with different labels.

Finally, another issue is described in Blackhole created by a remote SAP shutdown. Under normal circumstances, when CE3 sends BUM traffic to PE3, the traffic is ‟local-biased” to PE3’s SAP3 even though it is NDF for the ES. The flooded traffic to PE2 is forwarded to CE2, but not to SAP2 because the local bias split-horizon filtering takes place.

Figure 21. Blackhole created by a remote SAP shutdown

The right side of the diagram in Blackhole created by a remote SAP shutdown shows an issue when SAP3 is manually shut down. In this case, PE3 withdraws the AD per-EVI route corresponding to SAP3; however, this does not change the local bias filtering for SAP2 in PE2. Therefore, when CE3 sends BUM traffic, it can neither be forwarded to CE2 via local SAP3 nor be forwarded by PE2.

Non-system IPv4 and IPv6 VXLAN termination for EVPN VXLAN multihoming

EVPN VXLAN multihoming is supported on VPLS and R-VPLS services when the PEs use non-system IPv4 or IPv6 termination; however, as with EVPN VPWS services, additional configuration steps are required.

  • The configure service system bgp-evpn eth-seg es-orig-ip ip-address command must be configured with the non-system IPv4 or IPv6 address used for the EVPN-VXLAN service. This command modifies the originating-ip field in the ES routes advertised for the Ethernet Segment, and makes the system use this IP address when adding the local PE as DF candidate.

  • The configure service system bgp-evpn eth-seg route-next-hop ip-address command must also be configured with the non-system IP address. This command changes the next-hop of the ES and AD per-ES routes to the configured address.

  • Finally, the non-system IP address (in each of the PEs in the ES) must match in these three commands for the local PE to be considered suitable for DF election:

    • es-orig-ip ip-address

    • route-next-hop ip-address

    • vxlan-src-vtep ip-address
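As a hedged configuration sketch (the ES name "ES-1", service ID 1, and address 10.1.1.1 are hypothetical; exact CLI context may differ by release), the same non-system address appears in all three commands:

```
configure service system bgp-evpn eth-seg "ES-1" es-orig-ip 10.1.1.1
configure service system bgp-evpn eth-seg "ES-1" route-next-hop 10.1.1.1
configure service vpls 1 vxlan-src-vtep 10.1.1.1
```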

EVPN for MPLS tunnels

This section provides information about EVPN for MPLS tunnels.

BGP-EVPN control plane for MPLS tunnels

EVPN routes and usage lists all the EVPN routes supported in 7750 SR, 7450 ESS, or 7950 XRS SR OS and their usage in EVPN-VXLAN, EVPN-MPLS, and PBB-EVPN.

Note: Route type 1 is not required in PBB-EVPN as per RFC 7623.
Table 2. EVPN routes and usage
EVPN route                                        Usage                                              EVPN-VXLAN  EVPN-MPLS  PBB-EVPN
-------------------------------------------------------------------------------------------------------------------------------------
Type 1 - Ethernet Auto-Discovery route (A-D)      Mass-withdraw, ESI labels, Aliasing                Y           Y          -
Type 2 - MAC/IP Advertisement route               MAC/IP advertisement, IP advertisement for         Y           Y          Y
                                                  ARP resolution
Type 3 - Inclusive Multicast Ethernet Tag route   Flooding tree setup (BUM flooding)                 Y           Y          Y
Type 4 - ES route                                 ES discovery and DF election                       Y           Y          Y
Type 5 - IP Prefix advertisement route            IP routing                                         Y           Y          -
Type 6 - Selective Multicast Ethernet Tag route   Signal interest on a multicast group               Y           Y          -
Type 7 - Multicast Join Synch route               Join a multicast group on a multihomed ES          Y           Y          -
Type 8 - Multicast Leave Synch route              Leave a multicast group on a multihomed ES         Y           Y          -
Type 10 - Selective Provider Multicast Service    Signal and set up Selective Provider Tunnels       -           Y          -
Interface Auto-Discovery route                    for IP Multicast

RFC 7432 describes the BGP-EVPN control plane for MPLS tunnels. If EVPN multihoming is not required, two route types are needed to set up a basic EVI (EVPN Instance): MAC/IP Advertisement and the Inclusive Multicast Ethernet Tag routes. If multihoming is required, the ES and the Auto-Discovery routes are also needed.

The route fields and extended communities for route types 2 and 3 are shown in EVPN-VXLAN required routes and communities in BGP-EVPN control plane for VXLAN overlay tunnels. The changes compared to their use in EVPN-VXLAN are described below.

EVPN route type 3 - inclusive multicast Ethernet tag route

As in EVPN-VXLAN, route type 3 is used for setting up the flooding tree (BUM flooding) for a specified VPLS service. The received inclusive multicast routes add entries to the VPLS flood list in the 7750 SR, 7450 ESS, and 7950 XRS. Ingress replication, p2mp mLDP, and composite tunnels are supported as tunnel types in route type 3 when BGP-EVPN MPLS is enabled.

The following route values are used for EVPN-MPLS services:

  • Route Distinguisher is taken from the RD of the VPLS service within the BGP context. The RD can be configured or derived from the bgp-evpn evi value.

  • Ethernet Tag ID is 0.

  • IP address length is always 32.

  • Originating router's IP address carries an IPv4 or IPv6 address.

  • The PMSI attribute can have different formats depending on the tunnel type enabled in the service.

    • Tunnel type = Ingress replication (6)

      The route is referred to as an Inclusive Multicast Ethernet Tag IR (IMET-IR) route and the PMSI Tunnel Attribute (PTA) fields are populated as follows:

      • Leaf not required for Flags.

      • MPLS label carries the MPLS label allocated for the service in the high-order 20 bits of the label field.

        Unless bgp-evpn mpls ingress-replication-bum-label is configured in the service, the MPLS label used is the same as that used in the MAC/IP routes for the service.

      • Tunnel endpoint is equal to the originating IP address.

    • Tunnel type=p2mp mLDP (2)

      The route is referred to as an IMET-P2MP route and its PTA fields are populated as follows:

      • Leaf not required for Flags.

      • MPLS label is 0.

      • Tunnel endpoint includes the root node address and an opaque number. This is the tunnel identifier that the leaf-nodes use to join the mLDP P2MP tree.

    • Tunnel type=Composite tunnel (130)

      The route is referred to as an IMET-P2MP-IR route and its PTA fields are populated as follows:

      • Leaf not required for Flags.

      • MPLS label 1 is 0.

      • Tunnel endpoint identifier includes the following:

        • MPLS label 2 is a non-zero, downstream allocated label (like any other IR label). The leaf-nodes use the label to set up an EVPN-MPLS destination to the root and add it to the default-multicast list.

        • mLDP tunnel identifier is the root node address and an opaque number. This is the tunnel identifier that the leaf-nodes use to join the mLDP P2MP tree.

IMET-P2MP-IR routes are used in EVIs with a few root nodes and a significant number of leaf-only PEs. In this scenario, a combination of P2MP and IR tunnels can be used in the network, such that the root nodes use P2MP tunnels to send broadcast, unknown unicast, and multicast traffic but the leaf-PE nodes use IR to send traffic to the roots. This use case is documented in IETF RFC 8317 and its main advantage is the significant savings in P2MP tunnels that the PE/P routers in the EVI need to handle (as opposed to a full mesh of P2MP tunnels among all the PEs in an EVI).

In this case, the root PEs signal a special tunnel type in the PTA, indicating that they intend to transmit BUM traffic using an mLDP P2MP tunnel but can also receive traffic over an IR evpn-mpls binding. An IMET route with this special ‟composite” tunnel type in the PTA is called an IMET-P2MP-IR route and the encoding of its PTA is shown in Composite p2mp mLDP and IR tunnels—PTA.

Figure 22. Composite p2mp mLDP and IR tunnels—PTA

EVPN route type 2 - MAC/IP advertisement route

The 7750 SR, 7450 ESS, or 7950 XRS router generates this route type for advertising MAC addresses (and IP addresses if proxy-ARP/proxy-ND is enabled). If mac-advertisement is enabled, the router generates MAC advertisement routes for the following:

  • learned MACs on SAPs or SDP bindings

  • conditional static MACs

    Note: The unknown-mac-route is not supported for EVPN-MPLS services.

The route type 2 generated by a router uses the following fields and values:

  • Route Distinguisher is taken from the RD of the VPLS service within the BGP context. The RD can be configured or derived from the bgp-evpn evi value.

  • Ethernet Segment Identifier (ESI) is zero for MACs learned from single-homed CEs and different from zero for MACs learned from multihomed CEs.

  • Ethernet Tag ID is 0.

  • MAC address length is always 48.

  • MAC address can be learned or statically configured.

  • IP address and IP address length:

    • It is the IP address associated with the MAC being advertised with a length of 32 (or 128 for IPv6).

    • In general, any MAC route without IP has IPL=0 (IP length) and the IP is omitted.

    • When received, any IPL value other than zero, 32, or 128 causes the route to be discarded.

  • MPLS Label 1 carries the MPLS label allocated by the system to the VPLS service. The label value is encoded in the high-order 20 bits of the field and is the same label used in the route type 3 advertisements for the same service, unless bgp-evpn mpls ingress-replication-bum-label is configured in the service.

  • MPLS Label 2 is 0.

  • The MAC mobility extended community is used for signaling the sequence number in case of MAC moves and the sticky bit in case of advertising conditional static MACs. If a MAC route is received with a MAC mobility ext-community, the sequence number and the 'sticky' bit are considered for the route selection.
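The field checks described above for received MAC/IP Advertisement routes can be sketched as follows. This is a minimal illustration in Python, not the router's actual validation code:

```python
def validate_mac_ip_route(mac_len, ip_len):
    """Sketch of the route type 2 field checks described above.

    A MAC/IP Advertisement route must carry a 48-bit MAC address;
    the IP length (IPL) must be 0 (IP omitted), 32 (IPv4), or
    128 (IPv6). Any other IPL causes the route to be discarded.
    """
    if mac_len != 48:
        return False
    return ip_len in (0, 32, 128)
```

For example, a route with IPL=24 would be discarded, while a MAC-only route (IPL=0) is accepted.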

When EVPN multihoming is enabled in the system, two more routes are required. EVPN routes type 1 and 4 shows the fields in routes type 1 and 4 and their associated extended communities.

Figure 23. EVPN routes type 1 and 4

EVPN route type 1 - Ethernet auto-discovery route (AD route)

The 7750 SR, 7450 ESS, or 7950 XRS router generates this route type for multihoming functions. The system can generate two types of AD routes:

  • Ethernet AD route per-ESI (Ethernet Segment ID)

  • Ethernet AD route per-EVI (EVPN Instance)

The Ethernet AD per-ESI route generated by a router uses the following fields and values:

  • Route Distinguisher is taken from the system level RD or service level RD.

  • Ethernet Segment Identifier (ESI) contains a 10-byte identifier as configured in the system for a specified ethernet-segment.

  • Ethernet Tag ID is MAX-ET (0xFFFFFFFF). This value is reserved and used only for AD routes per ESI.

  • MPLS label is 0.

  • ESI Label Extended community includes the single-active bit (0 for all-active and 1 for single-active) and ESI label for all-active multihoming split-horizon.

  • Route target extended community is taken from the service level RT or an RT-set for the services defined on the Ethernet segment.

The system can either send a separate Ethernet AD per-ESI route per service, or a few Ethernet AD per-ESI routes aggregating the route-targets for multiple services. While both alternatives interoperate, RFC 7432 states that the EVPN Auto-Discovery per-ES route must be sent with a set of route-targets corresponding to all the EVIs defined on the Ethernet Segment (ES). Either option can be enabled using the following command: config>service>system>bgp-evpn# ad-per-es-route-target [evi-rt | evi-rt-set] route-distinguisher ip-address [extended-evi-range]

The default option, ad-per-es-route-target evi-rt, configures the system to send a separate AD per-ES route per service. When enabled, the evi-rt-set option supports route aggregation: a single AD per-ES route is advertised with the associated RD (ip-address:1) and a set of up to 128 EVI route-targets. When the number of EVIs defined in the Ethernet Segment (and therefore the number of route-targets) is significant, the system sends more than one route. For example:

  • AD per-ES route for evi-rt-set 1 is sent with RD ip-address:1

  • AD per-ES route for evi-rt-set 2 is sent with RD ip-address:2

  • and so on, up to an AD per-ES route sent with RD ip-address:512

The extended-evi-range option extends the comm-val range used with evi-rt-set to 1 through 65535. This option is recommended when EVIs greater than 65535 are configured in some services, because there are then more EVIs whose route-targets must be packed in the AD per-ES routes. The option extends the maximum number of AD per-ES routes that can be sent (the RD now supports up to ip-address:65535) and allows many more route-targets to be included in each set.

Note: When evi-rt-set is configured, no vsi-export policies are possible on the services defined on the Ethernet Segment. If vsi-export policies are configured for a service, the system sends an individual AD per-ES route for that service. The maximum standard BGP update size is 4KB, with a maximum of 2KB for the route-target extended community attribute.
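With evi-rt-set packing up to 128 EVI route-targets per route, the number of aggregated AD per-ES routes (and the RDs they use) can be estimated as follows. This is an illustrative sketch, not SR OS code:

```python
import math

def ad_per_es_routes(num_evis, rts_per_route=128):
    """Number of aggregated AD per-ES routes needed when the
    evi-rt-set option packs up to 128 EVI route-targets per route;
    route i is sent with RD ip-address:i (i starting at 1)."""
    count = math.ceil(num_evis / rts_per_route)
    return [f"ip-address:{i}" for i in range(1, count + 1)]
```

For example, an ES with 300 EVIs needs three AD per-ES routes, sent with RDs ip-address:1 through ip-address:3.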

The Ethernet AD per-EVI route generated by a router uses the following fields and values:

  • Route Distinguisher is taken from the service level RD.

  • Ethernet Segment Identifier (ESI) contains a 10-byte identifier as configured in the system for a specified Ethernet Segment.

  • Ethernet Tag ID is 0.

  • MPLS label encodes the unicast label allocated for the service (high-order 20 bits).

  • Route-target extended community is taken from the service level RT.

Note: The AD per-EVI route is not sent with the ESI label Extended Community.

EVPN route type 4 - ES route

The router generates this route type for multihoming ES discovery and DF (Designated Forwarder) election.

  • Route Distinguisher is taken from the service level RD.

  • Ethernet Segment Identifier (ESI) contains a 10-byte identifier as configured in the system for a specified ethernet-segment.

  • The value of ES-import route-target community is automatically derived from the MAC address portion of the ESI. This extended community is treated as a route-target and is supported by RT-constraint (route-target BGP family).
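The automatic derivation of the ES-import value from the MAC portion of the ESI can be sketched as follows, assuming a type 0/1/3 ESI where the six octets following the type octet carry the MAC address (a minimal illustration, not the router's implementation):

```python
def es_import_rt(esi):
    """Derive the 6-byte ES-import route-target value from a
    10-byte ESI. Per RFC 7432, the value is taken from the
    high-order six octets of the 9-octet ESI value (the bytes
    that follow the ESI type octet)."""
    if len(esi) != 10:
        raise ValueError("ESI must be 10 bytes")
    mac = esi[1:7]  # MAC address portion of the ESI
    return ":".join(f"{b:02x}" for b in mac)
```

All PEs attached to the same ES derive the same ES-import RT, so ES routes are only imported by the PEs that share that Ethernet Segment.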

EVPN route type 5 - IP prefix route

IP Prefix routes are also supported for MPLS tunnels. The route fields for route type 5 are shown in EVPN route-type 5. The 7750 SR, 7450 ESS, or 7950 XRS router generates this route type for advertising IP prefixes in EVPN, using the same fields described in section BGP-EVPN control plane for VXLAN overlay tunnels, with the following exceptions:

  • MPLS label carries the MPLS label allocated for the service.

  • This route is sent with the RFC 5512 tunnel encapsulation extended community, with the tunnel type value set to MPLS.

RFC 5512 - BGP tunnel encapsulation extended community

The following routes are sent with the RFC 5512 BGP Encapsulation Extended Community: MAC/IP, Inclusive Multicast Ethernet Tag, and AD per-EVI routes. ES and AD per-ESI routes are not sent with this Extended Community.

The router processes the following BGP Tunnel Encapsulation tunnel values registered by IANA for RFC 5512:

  • VXLAN encapsulation is 8.

  • MPLS encapsulation is 10.

Any other tunnel value causes the route to be handled as 'treat-as-withdraw'.

If the encapsulation value is MPLS, BGP validates the high-order 20 bits of the label field and ignores the low-order 4 bits. If the encapsulation is VXLAN, BGP takes the entire 24-bit value encoded in the MPLS label field as the VNI.

If the encapsulation extended community (as defined in RFC 5512) is not present in a received route, BGP treats the route as MPLS or VXLAN based on the configuration of the config>router>bgp>neighbor# def-recv-evpn-encap [mpls | vxlan] command. The command is also available at the bgp and group levels.
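The 20-bit versus 24-bit interpretation of the label field described above can be illustrated with the following sketch (illustrative only, using the IANA tunnel type values given earlier):

```python
# BGP Tunnel Encapsulation tunnel type values registered by IANA for RFC 5512
VXLAN, MPLS = 8, 10

def decode_label_field(tunnel_type, label_field):
    """Interpret the 3-byte (24-bit) MPLS label field of an EVPN route.

    For MPLS, the label is the high-order 20 bits (the low-order
    4 bits are ignored); for VXLAN, the whole 24-bit value is taken
    as the VNI. Any other tunnel type is treat-as-withdraw.
    """
    if tunnel_type == MPLS:
        return label_field >> 4        # high-order 20 bits
    if tunnel_type == VXLAN:
        return label_field & 0xFFFFFF  # full 24-bit VNI
    return None  # treat-as-withdraw
```

For example, an MPLS label of 524118 is carried as 524118 << 4 in the field, while a VXLAN VNI of 100000 occupies the field directly.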

EVPN for MPLS tunnels in VPLS services (EVPN-MPLS)

EVPN can be used in MPLS networks where PEs are interconnected through any type of tunnel, including RSVP-TE, Segment-Routing TE, LDP, BGP, Segment Routing IS-IS, Segment Routing OSPF, RIB-API, MPLS-forwarding-policy, SR-Policy, or MPLSoUDP. As with VPRN services, tunnel selection for a VPLS service (with BGP-EVPN MPLS enabled) is based on the auto-bind-tunnel command. The BGP EVPN routes next-hops can be IPv4 or IPv6 addresses and can be resolved to a tunnel in the IPv4 tunnel-table or IPv6 tunnel-table.

EVPN-MPLS is modeled similarly to EVPN-VXLAN, that is, using a VPLS service where EVPN-MPLS ‟bindings” can coexist with SAPs and SDP bindings. The following shows an example of a VPLS service with EVPN-MPLS.

*A:PE-1>config>service>vpls# info 
----------------------------------------------
  description "evpn-mpls-service"
  bgp 
  exit
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  exit
  sap 1/1/1:1 create 
  exit
  spoke-sdp 1:1 create

First, configure a bgp-evpn context in which VXLAN is disabled and MPLS is enabled. In addition to enabling MPLS, the minimum set of commands needed to set up the EVPN-MPLS instance are the evi and auto-bind-tunnel resolution commands. The relevant configuration options are the following.

evi {1..16777215} — This EVPN identifier is unique in the system. It is used for the service-carving algorithm for multihoming (if configured) and for auto-deriving the route target and route distinguisher (if lower than 65535) in the service. It can be used for EVPN-MPLS and EVPN-VXLAN services.

The following options are supported:

  • If this EVPN identifier is not specified, the value is zero and no route distinguisher or route target is automatically derived from it.
  • If the specified EVPN identifier is lower than 65535 and no other route distinguisher or route target is configured in the service, the following applies:
    • The route distinguisher is derived from <system_ip>:evi.
    • The route target is derived from <autonomous-system>:evi.
  • If the specified EVPN identifier is higher than 65535 and no other route distinguisher or route target is configured in the service, the following applies:
    • The route distinguisher cannot be automatically derived. An error is generated if enabling EVPN is attempted without a route distinguisher. A manual or an auto-rd route distinguisher must be configured.
    • The route target can only be automatically derived if the evi-three-byte-auto-rt command is configured. If configured, the route target is automatically derived in accordance with the rules described in RFC 8365:
      • The route target is composed of ASN(2-octets):A/type/D-ID/EVI.
      • The ASN is a 2-octet value configured in the system. For AS numbers exceeding the 2-byte limit, the low-order 16-bit value is used.
      • The A=0 value is used for auto-derivation.
      • The type=4 (EVI-based) is used.
      • The BGP instance is encoded using D-ID= [1..2]. This allows the automatic derivation of different RTs in multi-instance services. The value is inherited from the corresponding BGP instance.
      • EVI indicates the configured EVI in the service.

For example, consider a service with the following characteristics:

  • ASN=64500
  • VPLS with two BGP instances, bgp 1 for VXLAN-instance 1 and bgp 2 for EVPN-MPLS
  • EVI=100000

The automatically derived route targets for this service are:

  • bgp 1 — 64500:1090619040 (ASN:0x410186A0)
  • bgp 2 — 64500:1107396256 (ASN:0x420186A0)
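The auto-derivation above can be reproduced with the following sketch of the RFC 8365 bit packing (an illustration, not Nokia's implementation):

```python
def auto_derived_rt(asn, bgp_instance, evi):
    """RFC 8365 EVI-based route-target auto-derivation sketch.

    The 4-byte local value packs A (1 bit, 0 = auto-derived),
    type (3 bits, 4 = EVI-based), D-ID (4 bits, the BGP instance,
    1 or 2) and the 24-bit EVI. Only the low-order 16 bits of the
    ASN are used for AS numbers exceeding the 2-byte limit.
    """
    a, rt_type = 0, 4
    local = (a << 31) | (rt_type << 28) | (bgp_instance << 24) | (evi & 0xFFFFFF)
    return f"{asn & 0xFFFF}:{local}"
```

Running this for ASN=64500 and EVI=100000 reproduces the two route targets above: 64500:1090619040 (0x410186A0) for bgp 1 and 64500:1107396256 (0x420186A0) for bgp 2.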


Note: When vsi-import/export policies are configured, the route target must be configured in the policies, and those values take precedence over the automatically derived route targets. The operational route target for a service is displayed by the show service id svc-id bgp command. If bgp-ad vpls-id is configured in the service, the vpls-id-derived route target takes precedence over the evi-derived route target.

When the evi is configured, a configure service vpls bgp node (even if empty) is required so that the user sees the correct information in the output of the show service id 1 bgp and show service system bgp-route-distinguisher commands.

The configuration of an evi is enforced for EVPN services with SAPs/SDP bindings in an ethernet-segment. See EVPN multihoming in VPLS services for more information about ESs.

The following options are specific to EVPN-MPLS (and defined in configure service vpls bgp-evpn mpls):

  • control word

    Enables or disables the control word capability to guarantee interoperability with other vendors. When enabled along with the following command, the control word capability is signaled in the C flag of the EVPN Layer 2 Attributes extended community, as per draft-ietf-bess-rfc7432bis:
    • MD-CLI
      configure service vpls bgp-evpn routes incl-mcast advertise-l2-attributes
    • classic CLI
      configure service vpls bgp-evpn incl-mcast-l2-attributes-advertisement
    On reception, the router compares the C flag with the local setting for control-word. In case of a mismatch, the EVPN destination goes operationally down with the corresponding operational flag indicating the reason.
    Note: The control-word is required as per RFC 7432 to avoid frame disordering.
  • hash-label
    Enables or disables the use of the hash label (also known as the Flow Aware Transport label) in the EVPN unicast destinations. Similar to the control-word command, when the hash-label command is enabled along with the incl-mcast-l2-attributes-advertisement (advertise-l2-attributes in MD-CLI) command, the F flag capability is signaled in the EVPN Layer 2 Attributes extended community, as per draft-ietf-bess-rfc7432bis. In addition:
    • When the hash-label is enabled and advertise-l2-attributes false is configured, the hash-label is always pushed to a unicast EVPN destination. The hash label is never used for BUM packets, as per draft-ietf-bess-rfc7432bis.
    • When hash-label is enabled and advertise-l2-attributes true is configured, the F bit is set in the Layer-2 Attributes extended community of the EVPN Inclusive Multicast Ethernet Tag (IMET) route for the service. The hash-label towards a specific remote PE is pushed in the datapath only if the remote PE previously signaled support for hash-label (F=1). Otherwise, the unicast EVPN destination is brought operationally down, with the corresponding operational flag indicating the reason.
  • auto bind tunnel

    Select which type of MPLS transport tunnel to use for a particular instance; this command is used in the same way as in VPRN services.

    For BGP-EVPN MPLS, you must explicitly add BGP to the resolution filter in EVPN (BGP is implicit in VPRNs).

  • force VLAN VC forwarding

    This option allows the system to preserve the VLAN ID and pbits of the service-delimiting qtag in a new tag added in the customer frame before sending it to the EVPN core.

    Note: You can use this option in conjunction with the sap ingress vlan-translation command. If so, the configured translated VLAN ID is sent to the EVPN binds as opposed to the service-delimiting tag VLAN ID. If the ingress SAP/binding is null-encapsulated, the output VLAN ID and pbits are zero.
  • force QinQ VC forwarding with c-tag-c-tag or s-tag-c-tag

    This command allows the system to preserve the VLAN ID and pbits of the service-delimiting Q-tags (up to two tags) in customer frames before sending them to the EVPN core.

    Note:

    You can use this option in conjunction with the sap ingress qinq-vlan-translation s-tag.c-tag command. If so, the configured translated S-tag and C-tag VLAN IDs are the VLAN IDs sent to the EVPN binds as opposed to the service-delimiting tags VLAN IDs. If the ingress SAP or binding is null-encapsulated, the output VLAN ID and pbits are zero.

  • split horizon group

    This command allows the association of a user-created split horizon group to all the EVPN-MPLS destinations. See EVPN and VPLS integration for more information.

  • ecmp

    Set this option to a value greater than 1 to activate aliasing to the remote PEs that are defined in the same all-active multihoming ES. See EVPN all-active multihoming for more information.

  • ingress replication bum label

    You can use this option when you want the PE to advertise a label for BUM traffic (Inclusive Multicast routes) that is different from the label advertised for unicast traffic (with the MAC/IP routes). This is useful to avoid potential transient packet duplication in all-active multihoming.

In addition to these options, the following BGP EVPN options are also available for EVPN-MPLS services:

  • mac-advertisement

  • mac-duplication and settings

  • incl-mcast advertise-l2-attributes (MD-CLI)

    incl-mcast-l2-attributes-advertisement (classic CLI)

    This function enables the advertisement and processing of the EVPN Layer 2 Attributes extended community. The control-word, hash-label configuration, and the Service-MTU value are advertised in the extended community. On reception, the received MTU, hash-label and control-word flags are compared with the local MTU and hash-label or control-word configuration. In case of a mismatch in any of the three settings, the EVPN destination goes operationally down with the corresponding operational flag indicating what the mismatch is. The absence of an IMET route from an egress PE or the absence of the EVPN L2 Attributes extended community on a received IMET route from the PE, causes the route to bring down the EVPN destinations to that PE.

  • ignore-mtu-mismatch

    This command makes the router ignore the received Layer 2 MTU in the EVPN L2 Attributes extended community of the IMET route for a peer. If disabled, the local service MTU is compared against the received Layer 2 MTU. If there is a mismatch, the EVPN destinations to the peer stay oper-state down.

When EVPN-MPLS is established among some PEs in the network, EVPN unicast and multicast 'bindings' are created on each PE to the remote EVPN destinations. A specified ingress PE creates:

  • A unicast EVPN-MPLS destination binding to a remote egress PE as soon as a MAC/IP route is received from that egress PE.

  • A multicast EVPN-MPLS destination binding to a remote egress PE, if and only if the egress PE advertises an Inclusive Multicast Ethernet Tag Route with a BUM label. That is only possible if the egress PE is configured with ingress-replication-bum-label.

Those bindings, as well as the MACs learned on them, can be checked through the following show commands. In the following example, the remote PE(192.0.2.69) is configured with no ingress-replication-bum-label and PE(192.0.2.70) is configured with ingress-replication-bum-label. Therefore, Dut has a single EVPN-MPLS destination binding to PE(192.0.2.69) and two bindings (unicast and multicast) to PE(192.0.2.70).

show service id 1 evpn-mpls
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address           Transport:Tnl     Egr Label     Oper     Mcast     Num
                                                      State              MACs                            
-------------------------------------------------------------------------------
192.0.2.69            ldp:65537         524118        Up       bum       0             
192.0.2.70            ldp:65538         524160        Up       none      1   
192.0.2.70            ldp:65538         524164        Up       bum       0         
192.0.2.72            ldp:65547         524144        Up       bum       0   
192.0.2.72            ldp:65547         524138        Up       none      2   
192.0.2.73            ldp:65548         524148        Up       bum       1   
192.0.2.254           ldp:65550         524150        Up       bum       0   
-------------------------------------------------------------------------------
Number of entries : 7
-------------------------------------------------------------------------------
===============================================================================
show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
1         00:ca:fe:ca:fe:69 eMpls:                   EvpnS    06/11/15 21:53:48
                            192.0.2.69:262118
1         00:ca:fe:ca:fe:70 eMpls:                   EvpnS    06/11/15 19:59:57
                            192.0.2.70:262140
1         00:ca:fe:ca:fe:72 eMpls:                   EvpnS    06/11/15 19:59:57
                            192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

EVPN and VPLS integration

The SR OS EVPN implementation on the 7750 SR, 7450 ESS, and 7950 XRS routers supports RFC 8560, so that EVPN-MPLS and VPLS can be integrated into the same network and within the same service. Because EVPN is typically not deployed in greenfield networks, this feature is useful for the integration of both technologies and even for the migration of VPLS services to EVPN-MPLS.

The following behavior enables the integration of EVPN and SDP bindings in the same VPLS network:
  1. Systems with EVPN endpoints and SDP bindings to the same far-end bring down the SDP bindings.

    • The router allows the establishment of an EVPN endpoint and an SDP binding to the same far-end but the SDP binding is kept operationally down. Only the EVPN endpoint remains operationally up. This is true for spoke SDPs (manual, BGP-AD, and BGP-VPLS) and mesh SDPs. It is also possible between VXLAN and SDP bindings.

    • If there is an existing EVPN endpoint to a specified far-end and a spoke SDP establishment is attempted, the spoke SDP is set up but kept down, with an operational flag indicating that there is an EVPN route to the same far-end.

    • If there is an existing spoke SDP and a valid/used EVPN route arrives, the EVPN endpoint is set up and the spoke SDP is brought down, with an operational flag indicating that there is an EVPN route to the same far-end.

    • In the case of an SDP binding and EVPN endpoint to different far-end IPs on the same remote PE, both links are up. This can happen if the SDP binding is terminated in an IPv6 address or IPv4 address different from the system address where the EVPN endpoint is terminated.

  2. The user can add spoke SDPs and all the EVPN-MPLS endpoints in the same split horizon group (SHG).

    • A CLI command is added under the bgp-evpn>mpls> context so that the EVPN-MPLS endpoints can be added to a split horizon group: bgp-evpn>mpls> [no] split-horizon-group group-name

    • The bgp-evpn mpls split-horizon-group must reference a user-configured split horizon group. User-configured split horizon groups can be configured within the service context. The same group-name can be associated with SAPs, spoke SDPs, pw-templates, pw-template-bindings, and EVPN-MPLS endpoints.

    • If the split-horizon-group command in bgp-evpn>mpls> is not used, the default split horizon group (that contains all the EVPN endpoints) is still used, but it is not possible to refer to it on SAPs/spoke SDPs.

    • SAPs and SDP bindings that share the same split horizon group of the EVPN-MPLS provider-tunnel are brought operationally down if the point-to-multipoint tunnel is operationally up.

  3. The system disables the advertisement of MACs learned on spoke SDPs and SAPs that are part of an EVPN split horizon group.

    • When the SAPs and spoke SDPs (manual or BGP-AD/VPLS-discovered) are configured within the same split horizon group as the EVPN endpoints, MAC addresses are still learned on them, but they are not advertised in EVPN.

    • The preceding statement is also true if proxy-ARP/proxy-ND is enabled and an IP-MAC pair is learned on a SAP or SDP binding that belongs to the EVPN split horizon group.

    • The SAPs or spoke SDPs, or both, added to an EVPN split horizon group should not be part of any EVPN multihomed ES. If that happened, the PE would still advertise the AD per-EVI route for the SAP or spoke SDP, or both, attracting EVPN traffic that could not possibly be forwarded to that SAP or SDP binding, or both.

    • Similar to the preceding statement, a split horizon group composed of SAPs/SDP bindings used in a BGP-MH site should not be configured under bgp-evpn>mpls>split-horizon-group. This misconfiguration would prevent traffic being forwarded from the EVPN to the BGP-MH site, regardless of the DF/NDF state.

      EVPN-VPLS integration shows an example of EVPN-VPLS integration.

      Figure 24. EVPN-VPLS integration

      An example CLI configuration for PE1, PE5, and PE2 is provided below.

      
      *A:PE1>config>service# info 
      ----------------------------------------------
      pw-template 1 create
      vpls 1 name "vpls-1" customer 1 create
        split-horizon-group "SHG-1" create 
        bgp
          route-target target:65000:1
          pw-template-binding 1 split-horizon-group SHG-1 
        exit
        bgp-ad
          no shutdown
          vpls-id 65000:1
        exit
        bgp-evpn
          evi 1
          mpls bgp 1
            no shutdown
            split-horizon-group SHG-1
        exit
        spoke-sdp 12:1 create
        exit
        sap 1/1/1:1 create
        exit
      
      *A:PE5>config>service# info 
      ----------------------------------------------
      pw-template 1 create
        exit
      vpls 1 customer 1 create
        bgp
          route-target target:65000:1
          pw-template-binding 1 split-horizon-group SHG-1 # auto-created SHG
        exit
        bgp-ad
          no shutdown
          vpls-id 65000:1
        exit
        spoke-sdp 52:1 create
        exit
      
      *A:PE2>config>service# info 
      ----------------------------------------------
      vpls 1 name "vpls-1" customer 1 create
        end-point CORE create
          no suppress-standby-signaling
        exit
        spoke-sdp 21:1 end-point CORE
          precedence primary
        exit
        spoke-sdp 25:1 end-point CORE
      
    • PE1, PE3, and PE4 have BGP-EVPN and BGP-AD enabled in VPLS-1. PE5 has BGP-AD enabled and PE2 has active/standby spoke SDPs to PE1 and PE5.

      In this configuration:

      • PE1, PE3, and PE4 attempt to establish BGP-AD spoke SDPs, but they are kept operationally down as long as there are EVPN endpoints active among them.

      • BGP-AD spoke SDPs and EVPN endpoints are instantiated within the same split horizon group, for example, SHG-1.

      • Manual spoke SDPs from PE1 and PE5 to PE2 are not part of SHG-1.

    • EVPN MAC advertisements:

      • MACs learned on FEC128 spoke SDPs are advertised normally in EVPN.

      • MACs learned on FEC129 spoke SDPs are not advertised in EVPN (because they are part of SHG-1, which is the split horizon group used for bgp-evpn>mpls). This prevents any data plane MACs learned on the SHG from being advertised in EVPN.

    • BUM operation on PE1:

      • When CE1 sends BUM, PE1 floods to all the active bindings.

      • When CE2 sends BUM, PE2 sends it to PE1 (active spoke SDP) and PE1 floods to all the bindings and SAPs.

      • When CE5 sends BUM, PE5 floods to the three EVPN PEs. PE1 floods to the active spoke SDP and SAPs, never to the EVPN PEs because they are part of the same SHG.

The operation in services with BGP-VPLS and BGP-EVPN is equivalent to the one described above for BGP-AD and BGP-EVPN.

EVPN single-active multihoming and BGP-VPLS integration

In a VPLS service to which multiple EVPN PEs and BGP-VPLS PEs are attached, single-active multihoming is supported on two or more of the EVPN PEs with no special considerations. All-active multihoming is not supported, because the traffic from the all-active multihomed CE could cause a MAC flip-flop effect on remote BGP-VPLS PEs, asymmetric flows, or other issues.

BGP-VPLS to EVPN integration and single-active MH illustrates a scenario with a single-active Ethernet-segment used in a service where EVPN PEs and BGP-VPLS are integrated.

Figure 25. BGP-VPLS to EVPN integration and single-active MH

Although other single-active examples are supported, in BGP-VPLS to EVPN integration and single-active MH, CE1 is connected to the EVPN PEs through a single LAG (lag-1). The LAG is associated with the Ethernet-segment 1 on PE1 and PE2, which is configured as single-active and with oper-group 1. PE1 and PE2 make use of lag>monitor-oper-group 1 so that the non-DF PE can signal the non-DF state to CE1 (in the form of LACP out-of-synch or power-off).

In addition to the BGP-VPLS routes sent for the service ve-id, the multihoming PEs in this case need to generate additional BGP-VPLS routes per Ethernet Segment (per VPLS service) for the purpose of MAC flush on the remote BGP-VPLS PEs in case of failure.

The sap>bgp-vpls-mh-veid number command should be configured on the SAPs that are part of an EVPN single-active Ethernet Segment, and allows the advertisement of L2VPN routes that indicate the state of the multihomed SAPs to the remote BGP-VPLS PEs. Upon a Designated Forwarder (DF) switchover, the F and D bits of the generated L2VPN routes for the SAP ve-id are updated so that the remote BGP-VPLS PEs can perform a mac-flush operation on the service and avoid blackholes.

As an example, in case of a failure on the Ethernet Segment SAP on PE1, PE1 must indicate to PE3 and PE4 the need to flush the MAC addresses learned from PE1 (a flush-all-from-me message). Otherwise, PE3, for example, continues sending traffic with MAC DA = CE1 to PE1, and PE1 blackholes the traffic.

In the BGP-VPLS to EVPN integration and single-active MH example:

  • Both ES peers (PE1 and PE2) should be configured with the same ve-id for the ES SAP, although this is not mandatory.

  • In addition to the regular service ve-id L2VPN route, based on the sap>bgp-vpls-mh-veid configuration and with BGP VPLS enabled, the PE advertises an L2VPN route with the following fields:

    • ve-id = sap>bgp-vpls-mh-veid identifier

    • RD, RT, next hop and other attributes same as the service BGP VPLS route

    • L2VPN information extended community with the following flags:

      • D=0 if the SAP is oper-up or oper-down with a flag MHStandby (for example, the PE is non-DF in single-active MH)

      • D=0 also if there is an ES oper-group and the port is down because of the oper-group

      • D=1 if the SAP is oper-down with a different flag (for example, port-down or admin-down)

      • F (DF bit) =1 if the SAP is oper-up, F=0 otherwise

  • Upon a failure on the access SAP, mac-flush messages are only triggered if the bgp-vpls-mh-veid command is configured on the failing SAP. Assuming it is configured with ve-id 1:

    • If the non-DF PE has a failure on the access SAP, PE2 sends an update with ve-id=1/D=1/F=0. This is an indication for PE3/PE4 that PE2's SAP is oper-down but it should not trigger a mac-flush on PE3/PE4.

    • If the DF PE has a failure on the SAP, PE1 advertises ve-id=1/D=1/F=0. Upon receiving this update, PE3 and PE4 flush all the MACs associated with PE1's spoke SDP. Note that the failure on PE1 triggers an EVPN DF election on PE2, which becomes DF and advertises ve-id=1/D=0/F=1. This message does not trigger any mac-flush procedures.
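The D and F bit settings listed above can be summarized with the following sketch; the down_reason names are illustrative labels, not SR OS flag names:

```python
def l2vpn_flags(sap_oper_up, down_reason=None):
    """Sketch of the D and F bits advertised in the L2VPN information
    extended community for the multihomed SAP ve-id route.

    'mh-standby' and 'oper-group' (illustrative names) cover the
    non-DF / MHStandby and ES oper-group cases, which keep D=0;
    any other down reason (for example port-down or admin-down)
    sets D=1. F (the DF bit) is 1 only while the SAP is oper-up.
    """
    if sap_oper_up:
        return {"D": 0, "F": 1}
    d = 0 if down_reason in ("mh-standby", "oper-group") else 1
    return {"D": d, "F": 0}
```

With this encoding, a DF-PE SAP failure produces D=1/F=0 (triggering mac-flush on the remote BGP-VPLS PEs), while a non-DF SAP that is down only because of MHStandby produces D=0/F=0.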

Other considerations:

  • PE3/PE4 are SR OS or any third-party PEs that support the procedures in draft-ietf-bess-vpls-multihoming, so that BGP-VPLS mac-flush signaling is understood.

  • PE1 and PE2 are expected to run an SR OS version that supports the sap>bgp-vpls-mh-veid number configuration on the multihomed SAPs. Otherwise, the mac-flush behavior would not work as expected.

  • The procedures described above are also supported if the EVPN PEs use MC-LAG instead of an ES for the CE1 redundancy. In this case, the SAP ve-id route for the standby PE is sent as ve-id=1/D=1/F=0, whereas the active chassis advertises ve-id=1/D=0/F=1. A switchover triggers mac-flush on the remote PEs as described earlier.

  • The L2VPN routes generated for the ES SAPs with the sap>bgp-vpls-mh-ve-id number command are decoded by the remote nodes as bgp-mh routes (because they do not have label information) in the show router bgp routes l2-vpn command and debug output.

Auto-derived RD in services with multiple BGP families

In a VPLS service, multiple BGP families and protocols can be enabled at the same time. When bgp-evpn is enabled, bgp-ad and bgp-mh are also supported. A single RD is used per service and not per BGP family or protocol.

The following rules apply:

  • The VPLS RD is selected based on the following precedence:

    • A manually configured or automatic RD always takes precedence when configured.

    • If no manual or automatic RD is configured, the RD is derived from the bgp-ad>vpls-id.

    • If no manual RD, automatic RD, or VPLS ID is configured, the RD is derived from the bgp-evpn>evi, except for bgp-mh and except when the EVI is greater than 65535. In these two cases, no EVI-derived RD is possible.

    • If no manual RD, automatic RD, VPLS ID, or EVI is configured, there is no RD and the service fails.

  • The selected RD (see preceding rules) is displayed by the Oper Route Dist field of the show service id bgp command.

  • The service supports dynamic RD changes. For example, the CLI allows the VPLS ID to be dynamically updated, even if it is used to automatically derive the service RD for bgp-ad, bgp-vpls, or bgp-mh.

    Note: When the RD changes, the active routes for that VPLS are withdrawn and readvertised with the new RD.
  • If one of the mechanisms to derive the RD for a specified service is removed from the configuration, the system selects a new RD based on the preceding rules. For example, if the VPLS ID is removed from the configuration, the routes are withdrawn, the new RD selected from the EVI, and the routes readvertised with the new RD.

    Note: This reconfiguration fails if the new RD already exists in a different VPLS or Epipe.
  • Because the vpls-id takes precedence over the EVI when deriving the RD automatically, adding EVPN to an existing bgp-ad service does not impact the existing RD. This is important to support bgp-ad to EVPN migration.
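The precedence rules above can be condensed into a short sketch. The helper below is illustrative only; the returned (source, value) pair simply labels which rule selected the RD:

```python
# Illustrative sketch of the RD selection precedence for a VPLS service.

def select_rd(manual_rd=None, auto_rd=None, vpls_id=None, evi=None,
              bgp_mh=False):
    """Return (source, value) for the operational RD, or None when no
    RD can be derived (the service fails)."""
    if manual_rd is not None:
        return ("manual", manual_rd)     # manual RD always wins
    if auto_rd is not None:
        return ("auto", auto_rd)         # automatic RD next
    if vpls_id is not None:
        return ("vpls-id", vpls_id)      # derived from bgp-ad>vpls-id
    if evi is not None and not bgp_mh and evi <= 65535:
        return ("evi", evi)              # EVI-derived RD
    return None                          # no RD -> service fails

assert select_rd(vpls_id="65000:10", evi=10) == ("vpls-id", "65000:10")
assert select_rd(evi=10) == ("evi", 10)
assert select_rd(evi=70000) is None      # EVI > 65535: no EVI-derived RD
```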

EVPN multihoming in VPLS services

EVPN multihoming implementation is based on the concept of the ethernet-segment. An ethernet-segment is a logical structure that can be defined in one or more PEs and identifies the CE (or access network) multihomed to the EVPN PEs. An ethernet-segment is associated with port, LAG, PW port, or SDP objects and is shared by all the services defined on those objects. In the case of virtual ESs, individual VID or VC-ID ranges can be associated with the port, LAG, PW port, or SDP objects defined in the ethernet-segment.

Each ethernet-segment has a unique Ethernet Segment Identifier (ESI) that is 10 bytes long and is manually configured in the router.

Note: The ESI is advertised in the control plane to all the PEs in an EVPN network; therefore, it is very important to ensure that the 10-byte ESI value is unique throughout the entire network. Single-homed CEs are assumed to be connected to an Ethernet-Segment with esi = 0 (single-homed Ethernet-Segments are not explicitly configured).

This section describes the behavior of the EVPN multihoming implementation in an EVPN-MPLS service.

EVPN all-active multihoming

As described in RFC 7432, all-active multihoming is only supported on access LAG SAPs, and it is mandatory that the CE is configured with a LAG to avoid duplicate packets to the network. Configuring LACP is optional. SR OS also supports the association of a PW port or a regular port with an all-active multihoming ES. When the ES is associated with a physical port and not a LAG, the CE must be configured with a single LAG without LACP.

Three different procedures are implemented in 7750 SR, 7450 ESS, and 7950 XRS SR OS to provide all-active multihoming for a specified Ethernet-Segment:

  • DF (Designated Forwarder) election

  • Split-horizon

  • Aliasing

DF election shows the need for DF election in all-active multihoming.

Figure 26. DF election

The DF election in EVPN all-active multihoming avoids duplicate packets on the multihomed CE. The DF election procedure is responsible for electing one DF PE per ESI per service; the rest of the PEs being non-DF for the ESI and service. Only the DF forwards BUM traffic from the EVPN network toward the ES SAPs (the multihomed CE). The non-DF PEs do not forward BUM traffic to the local Ethernet-Segment SAPs.

Note: The BUM traffic from the CE to the network and known unicast traffic in any direction is allowed on both the DF and non-DF PEs.

Split-horizon shows the EVPN split-horizon concept for all-active multihoming.

Figure 27. Split-horizon

The EVPN split-horizon procedure ensures that BUM traffic originated by the multihomed CE and sent from the non-DF to the DF is not replicated back to the CE (echoed packets on the CE). To avoid these echoed packets, the non-DF (PE1) sends all the BUM packets to the DF (PE2) with an indication of the source Ethernet-Segment. That indication is the ESI label (ESI2 in the example), previously signaled by PE2 in the AD per-ESI route for the Ethernet-Segment. When PE2 receives an EVPN packet (after the EVPN label lookup), it finds the ESI label that identifies its local Ethernet-Segment ESI2. The BUM packet is replicated to other local CEs but not to the ESI2 SAP.
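The split-horizon replication decision can be sketched as follows. The label value and SAP names are illustrative; the function models only the filtering on the receiving DF:

```python
# Illustrative sketch of ESI-label split-horizon filtering on the DF.
# local_saps: list of (sap, es_name) pairs; es_name is None when the SAP
# is not part of an ethernet-segment.
# esi_label_map: locally advertised ESI label -> local ES name.

def replicate_bum(local_saps, esi_label_map, rx_esi_label):
    """Return the SAPs a received BUM packet is replicated to."""
    src_es = esi_label_map.get(rx_esi_label)    # ES the packet came from
    return [sap for (sap, es) in local_saps
            if src_es is None or es != src_es]  # skip the source ES SAPs

saps = [("lag-1:1", "ESI2"), ("1/1/1:1", None)]
labels = {262142: "ESI2"}                       # hypothetical ES SHG label
assert replicate_bum(saps, labels, 262142) == ["1/1/1:1"]
assert replicate_bum(saps, labels, None) == ["lag-1:1", "1/1/1:1"]
```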

Aliasing shows the EVPN aliasing concept for all-active multihoming.

Figure 28. Aliasing

Because CE2 is multihomed to PE1 and PE2 using an all-active Ethernet-Segment, 'aliasing' is the procedure by which PE3 can load-balance the known unicast traffic between PE1 and PE2, even if the destination MAC address was only advertised by PE1 as in the example. When PE3 installs MAC1 in the FDB, it associates MAC1 not only with the advertising PE (PE1) but also with all the PEs advertising the same esi (ESI2) for the service. In this example, PE1 and PE2 advertise an AD per-EVI route for ESI2; therefore, PE3 installs the two next-hops associated with MAC1.

Aliasing is enabled by configuring ECMP greater than 1 in the bgp-evpn>mpls context.
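Aliasing can be sketched as resolving a MAC through its ESI to the set of PEs that advertised AD per-EVI routes for it, then hashing flows across up to ecmp paths. The structures below are illustrative only:

```python
# Illustrative sketch of aliasing: MAC -> ESI -> set of candidate PEs.

def aliased_next_hops(mac_fdb, ad_per_evi, mac):
    """Return the sorted list of PEs that can reach this MAC's ES."""
    esi = mac_fdb[mac]                   # ESI the MAC was advertised with
    return sorted(ad_per_evi[esi])       # PEs with active AD per-EVI routes

def pick_next_hop(next_hops, flow_hash, ecmp):
    """Hash a flow over min(ecmp, available) paths."""
    paths = next_hops[:min(ecmp, len(next_hops))]
    return paths[flow_hash % len(paths)]

fdb = {"00:ca:ca:ba:ce:03": "ESI2"}      # MAC advertised by PE1 with ESI2
ad = {"ESI2": {"192.0.2.69", "192.0.2.72"}}
nh = aliased_next_hops(fdb, ad, "00:ca:ca:ba:ce:03")
assert nh == ["192.0.2.69", "192.0.2.72"]       # both PE1 and PE2 installed
assert pick_next_hop(nh, 7, ecmp=4) in nh       # flows spread over both PEs
```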

All-active multihoming service model

The following shows an example PE1 configuration that provides all-active multihoming to the CE2 shown in Aliasing.

*A:PE1>config>lag(1)# info 
----------------------------------------------
  mode access
  encap-type dot1q
  port 1/1/2
  lacp active administrative-key 1 system-id 00:00:00:00:00:22
  no shutdown

*A:PE1>config>service>system>bgp-evpn# info 
----------------------------------------------
  route-distinguisher 10.1.1.1:0
  ethernet-segment "ESI2" create
    esi 01:12:12:12:12:12:12:12:12:12
    multi-homing all-active
    service-carving
    lag 1 
    no shutdown

*A:PE1>config>redundancy>evpn-multi-homing# info 
----------------------------------------------
    boot-timer 120
    es-activation-timer 10

*A:PE1>config>service>vpls# info 
----------------------------------------------
  description "evpn-mpls-service with all-active multihoming"
  bgp
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  sap lag-1:1 create 
  exit

In the same way, PE2 is configured as follows:

*A:PE2>config>lag(1)# info 
----------------------------------------------
  mode access
  encap-type dot1q
  port 1/1/1
  lacp active administrative-key 1 system-id 00:00:00:00:00:22
  no shutdown

*A:PE2>config>service>system>bgp-evpn# info 
----------------------------------------------
  route-distinguisher 10.1.1.1:0
  ethernet-segment "ESI12" create
    esi 01:12:12:12:12:12:12:12:12:12
    multi-homing all-active
    service-carving
    lag 1 
    no shutdown

*A:PE2>config>redundancy>evpn-multi-homing# info 
----------------------------------------------
    boot-timer 120
    es-activation-timer 10

*A:PE2>config>service>vpls# info 
----------------------------------------------
  description "evpn-mpls-service with all-active multihoming"
  bgp
    route-distinguisher 65001:60
    route-target target:65000:60
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  sap lag-1:1 create 
  exit

The preceding configuration enables the all-active multihoming procedures. The following must be considered:

  • The ethernet-segment must be configured with a name and a 10-byte esi:

    • config>service>system>bgp-evpn# ethernet-segment <es_name> create

    • config>service>system>bgp-evpn>ethernet-segment# esi <value>

  • When configuring the esi, the system enforces that the 6 high-order octets after the type byte are different from zero (so that the auto-derived route-target for the ES route is different from zero). In addition, the entire esi value must be unique in the system.

  • Only a LAG or a PW port can be associated with the all-active ethernet-segment. This LAG is used exclusively for EVPN multihoming. Other LAG ports in the system can still be used for MC-LAG and other services.

  • When the LAG is configured on PE1 and PE2, the same admin-key, system-priority, and system-id must be configured on both PEs, so that CE2 responds as though it is connected to the same system.

  • The same ethernet-segment may be used for EVPN-MPLS, EVPN-VXLAN and PBB-EVPN services.

    Note: The source-bmac-lsb attribute must be defined for PBB-EVPN (so that it is only used in PBB-EVPN, and ignored by EVPN). Other than EVPN-MPLS, EVPN-VXLAN and PBB-EVPN I-VPLS/Epipe services, no other Layer 2 services are allowed in the same ethernet-segment (regular VPLS defined on the ethernet-segment is kept operationally down).
  • Only one SAP per service can be part of the same ethernet-segment.

ES discovery and DF election procedures

The ES discovery and DF election is implemented in three logical steps, as shown in ES discovery and DF election.

Figure 29. ES discovery and DF election
Step 1 - ES advertisement and discovery

The ethernet-segment ESI-1 is configured as per the previous section, with all the required parameters. When ethernet-segment no shutdown is executed, PE1 and PE2 advertise an ES route for ESI-1. They both include the route-target auto-derived from the MAC portion of the configured ESI. If the route-target address family is configured in the network, this allows the RR to keep the dissemination of the ES routes under control.
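The route-target auto-derivation can be sketched from the ESI bytes: the six octets that follow the 1-byte type form the route-target, which is also why those octets must not be all-zero. The helper below is illustrative; the ESI value is the one used throughout this section:

```python
# Illustrative derivation of the ES route auto-RT from the ESI.

def es_route_target(esi):
    octets = esi.split(":")
    assert len(octets) == 10, "ESI is 10 bytes: 1-byte type + 9 octets"
    mac_portion = octets[1:7]            # 6 high-order octets after the type
    assert any(int(o, 16) for o in mac_portion), "octets must not be all-zero"
    return "target:" + ":".join(mac_portion)

# ESI configured on PE1/PE2 in this section -> the Exp/Imp Route-Target
# displayed by "show service system bgp-evpn ethernet-segment"
assert es_route_target("01:00:00:00:00:71:00:00:00:01") == \
    "target:00:00:00:00:71:00"
```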

In addition to the ES route, PE1 and PE2 advertise AD per-ESI routes and AD per-EVI routes.

  • AD per-ESI routes announce the Ethernet-Segment capabilities, including the mode (single-active or all-active) as well as the ESI label for split-horizon.

  • AD per-EVI routes are advertised so that PE3 knows what services (EVIs) are associated with the ESI. These routes are used by PE3 for its aliasing and backup procedures.

Step 2 - DF election

When the exchange of ES routes between PE1 and PE2 is complete, both PEs run the DF election for all the services in the ethernet-segment.

PE1 and PE2 elect a Designated Forwarder (DF) per <ESI, service>. The default DF election mechanism in 7750 SR, 7450 ESS, and 7950 XRS SR OS is service-carving (as per RFC 7432). The following applies when enabled on a specified PE:

  • An ordered list of the PE IPs where ESI-1 resides is built. The IPs are obtained from the Origin IP fields of all the ES routes received for ESI-1, as well as from the local system address. The lowest IP is considered ordinal '0' in the list.

  • The local IP can only be considered a ‟candidate” after successful ethernet-segment no shutdown for a specified service.

    Note: The remote PE IPs must be present in the local PE's RTM so that they can participate in the DF election.
  • A PE only considers a specified remote IP address as candidate for the DF election algorithm for a specified service if, as well as the ES route, the corresponding AD routes per-ESI and per-EVI for that PE have been received and properly activated.

  • All the remote PEs receiving the AD per-ES routes (for example, PE3), interpret that ESI-1 is all-active if all the PEs send their AD per-ES routes with the single-active bit = 0. Otherwise, if at least one PE sends an AD route per-ESI with the single-active flag set or the local ESI configuration is single-active, the ESI behaves as single-active.

  • An es-activation-timer can be configured at the redundancy>bgp-evpn-multi-homing>es-activation-timer level or at the service>system>bgp-evpn>eth-seg>es-activation-timer level. This timer, which is 3 seconds by default, delays the transition from non-DF to DF for a specified service, after the DF election has run.

    • The use of an es-activation-timer different from zero minimizes the risk of loops and packet duplication caused by ‟transient” multiple DFs.

    • The same es-activation-timer should be configured on all the PEs that are part of the same ESI. It is up to the user to configure either a long timer to minimize the risk of loops/duplication, or even es-activation-timer=0 to speed up the convergence for non-DF to DF transitions. When a specific value is configured at the ES level, it supersedes the configured global value.

  • The DF election is triggered by the following events:

    • config>service>system>bgp-evpn>eth-seg# no shutdown triggers the DF election for all the services in the ESI.

    • Reception of a new update/withdrawal of an ES route (containing an ESI configured locally) triggers the DF election for all the services in the ESI.

    • Reception of a new update/withdrawal of an AD per-ES route (containing an ESI configured locally) triggers the DF election for all the services associated with the list of route-targets received along with the route.

    • Reception of a new update of an AD per-ES route with a change in the ESI-label extended community (single-active bit or MPLS label) triggers the DF election for all the services associated with the list of route-targets received along with the route.

    • Reception of a new update/withdrawal of an AD route per-EVI (containing an ESI configured locally) triggers the DF election for that service.

  • When the PE boots up, the boot-timer allows the necessary time for the control plane protocols to come up before bringing up the Ethernet-Segment and running the DF algorithm. The boot-timer is configured at system level - config>redundancy>bgp-evpn-multi-homing# boot-timer - and should use a value long enough to allow the IOMs and BGP sessions to come up before exchanging ES routes and running the DF election for each EVI/ISID.

    • The system does not advertise ES routes until the boot timer expires. This guarantees that the peer ES PEs do not run the DF election either until the PE is ready to become the DF if it needs to.

    • The following show command displays the configured boot-timer as well as the remaining timer if the system is still in boot-stage.

      A:PE1# show redundancy bgp-evpn-multi-homing 
      ===============================================================================
      Redundancy BGP EVPN Multi-homing Information
      ===============================================================================
      Boot-Timer              : 10 secs                 
      Boot-Timer Remaining    : 0 secs                  
      ES Activation Timer     : 3 secs                  
      ===============================================================================
      
  • When service-carving mode auto is configured (default mode), the DF election algorithm runs the function [V(evi) mod N(peers) = i(ordinal)] to identify the DF for a specified service and ESI, as described in the following example.

    As shown in ES discovery and DF election, PE1 and PE2 are configured with ESI-1. Given that V(10) mod N(2) = 0, PE1 is elected DF for VPLS-10 (because its IP address is lower than PE2's and it is the first PE in the candidate list).

    Note: The algorithm takes the configured evi in the service as opposed to the service-id itself. The evi for a service must match in all the PEs that are part of the ESI. This guarantees that the election algorithm is consistent across all the PEs of the ESI. The evi must always be configured in a service with SAPs/SDP bindings that are created in an ES.
  • A manual service-carving option is allowed so that the user can manually configure for which evi identifiers the PE is primary: service-carving mode manual / manual evi <start-evi> to <end-evi>

    • The system is the PE forwarding/multicasting traffic for the evi identifiers included in the configuration. The PE is secondary (non-DF) for the non-specified evi identifiers.

    • If a range is configured but the service-carving is not mode manual, then the range has no effect.

    • Only two PEs are supported when service-carving mode manual is configured. If a third PE is configured with service-carving mode manual for an ESI, the two non-primary PEs remain non-DF regardless of the primary status.

    • For example, as shown in ES discovery and DF election: if PE1 is configured with service-carving manual evi 1 to 100 and PE2 with service-carving manual evi 101 to 200, then PE1 is the primary PE for service VPLS 10 and PE2 the secondary PE.

  • When service-carving is disabled, the lowest originator IP wins the election for a specified service and ESI:

    config>service>system>bgp-evpn>eth-seg>service-carving> mode off

    The following show command displays the ethernet-segment configuration and DF status for all the EVIs and ISIDs (if PBB-EVPN is enabled) configured in the ethernet-segment.

    *A:PE1# show service system bgp-evpn ethernet-segment name "ESI-1" all 
    ===============================================================================
    Service Ethernet Segment
    ===============================================================================
    Name                    : ESI-1
    Admin State             : Up                 Oper State         : Up
    ESI                     : 01:00:00:00:00:71:00:00:00:01
    Multi-homing            : allActive          Oper Multi-homing  : allActive
    Source BMAC LSB         : 71-71              
    ES BMac Tbl Size        : 8                  ES BMac Entries    : 1
    Lag Id                  : 1                  
    ES Activation Timer     : 0 secs             
    Exp/Imp Route-Target    : target:00:00:00:00:71:00
    
    Svc Carving             : auto               
    ES SHG Label            : 262142             
    ===============================================================================
    ===============================================================================
    EVI Information 
    ===============================================================================
    EVI                 SvcId               Actv Timer Rem      DF
    -------------------------------------------------------------------------------
    1                   1                   0                   no
    -------------------------------------------------------------------------------
    Number of entries: 1
    ===============================================================================
    -------------------------------------------------------------------------------
    DF Candidate list
    -------------------------------------------------------------------------------
    EVI                                     DF Address
    -------------------------------------------------------------------------------
    1                                       192.0.2.69
    1                                       192.0.2.72
    -------------------------------------------------------------------------------
    Number of entries: 2
    -------------------------------------------------------------------------------
    -------------------------------------------------------------------------------
    ===============================================================================
    ISID Information 
    ===============================================================================
    ISID                SvcId               Actv Timer Rem      DF
    -------------------------------------------------------------------------------
    20001               20001               0                   no
    -------------------------------------------------------------------------------
    Number of entries: 1
    ===============================================================================
                                     
    -------------------------------------------------------------------------------
    DF Candidate list
    -------------------------------------------------------------------------------
    ISID                                    DF Address
    -------------------------------------------------------------------------------
    20001                                   192.0.2.69
    20001                                   192.0.2.72
    -------------------------------------------------------------------------------
    Number of entries: 2
    -------------------------------------------------------------------------------
    -------------------------------------------------------------------------------
    ===============================================================================
    BMAC Information 
    ===============================================================================
    SvcId                                   BMacAddress
    -------------------------------------------------------------------------------
    20000                                   00:00:00:00:71:71
    -------------------------------------------------------------------------------
    Number of entries: 1
    ===============================================================================
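The default service-carving election can be sketched using the DF candidate list from the show output above; the following is a minimal model, assuming all candidates have already passed the ES and AD route checks:

```python
# Illustrative sketch of service-carving mode auto:
# DF = ordered_candidates[evi mod N].

import ipaddress

def elect_df(candidate_ips, evi):
    """Order candidates by IP (lowest = ordinal 0) and apply evi mod N."""
    ordered = sorted(candidate_ips, key=ipaddress.ip_address)
    return ordered[evi % len(ordered)]

peers = ["192.0.2.72", "192.0.2.69"]         # DF candidate list for ESI-1
assert elect_df(peers, 10) == "192.0.2.69"   # 10 mod 2 = 0 -> lowest IP (PE1)
assert elect_df(peers, 1) == "192.0.2.72"    # 1 mod 2 = 1 -> PE2 is DF
```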
    
Step 3 - DF and non-DF service behavior

Based on the result of the DF election or the manual service-carving, the control plane on the non-DF (PE1) instructs the datapath to remove the LAG SAP (associated with the ESI) from the default flooding list for BM traffic (unknown unicast traffic may still be sent if the EVI label is a unicast label and the source MAC address is not associated with the ESI).

On PE1 and PE2, both LAG SAPs learn the same MAC address (coming from the CE). For instance, in the following show commands, 00:ca:ca:ba:ce:03 is learned on both the PE1 and PE2 access LAGs (on ESI-1). However, PE1 learns the MAC as 'Learned' whereas PE2 learns it as 'Evpn'. This is because CE2 hashes the traffic for that source MAC to PE1. PE2 learns the MAC through EVPN but associates it with the ESI SAP, because the MAC belongs to the ESI.

*A:PE1# show service id 1 fdb detail 
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
1         00:ca:ca:ba:ce:03 sap:lag-1:1              L/0      06/11/15 00:14:47
1         00:ca:fe:ca:fe:70 eMpls:                   EvpnS    06/11/15 00:09:06
                            192.0.2.70:262140
1         00:ca:fe:ca:fe:72 eMpls:                   EvpnS    06/11/15 00:09:39
                            192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

*A:PE2# show service id 1 fdb detail 
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
1         00:ca:ca:ba:ce:03 sap:lag-1:1              Evpn     06/11/15 00:14:47
1         00:ca:fe:ca:fe:69 eMpls:                   EvpnS    06/11/15 00:09:40
                            192.0.2.69:262141
1         00:ca:fe:ca:fe:70 eMpls:                   EvpnS    06/11/15 00:09:40
                            192.0.2.70:262140
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

When PE1 (non-DF) and PE2 (DF) exchange BUM packets for evi 1, all those packets are sent with the ESI label at the bottom of the stack (in both directions). The ESI label advertised by each PE for ESI-1 can be displayed using the following command:

*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-1" 
===============================================================================
Service Ethernet Segment
===============================================================================
Name                    : ESI-1
Admin State             : Up                 Oper State         : Up
ESI                     : 01:00:00:00:00:71:00:00:00:01
Multi-homing            : allActive          Oper Multi-homing  : allActive
Source BMAC LSB         : 71-71              
ES BMac Tbl Size        : 8                  ES BMac Entries    : 1
Lag Id                  : 1                  
ES Activation Timer     : 0 secs             
Exp/Imp Route-Target    : target:00:00:00:00:71:00

Svc Carving             : auto               
ES SHG Label            : 262142             
===============================================================================
*A:PE2# show service system bgp-evpn ethernet-segment name "ESI-1" 

===============================================================================
Service Ethernet Segment
===============================================================================
Name                    : ESI-1
Admin State             : Up                 Oper State         : Up
ESI                     : 01:00:00:00:00:71:00:00:00:01
Multi-homing            : allActive          Oper Multi-homing  : allActive
Source BMAC LSB         : 71-71              
ES BMac Tbl Size        : 8                  ES BMac Entries    : 0
Lag Id                  : 1                  
ES Activation Timer     : 20 secs            
Exp/Imp Route-Target    : target:00:00:00:00:71:00

Svc Carving             : auto               
ES SHG Label            : 262142             
===============================================================================
Aliasing

Following the example in ES discovery and DF election, if the service configuration on PE3 has ECMP > 1, PE3 adds PE1 and PE2 to the list of next-hops for ESI-1. As soon as PE3 receives a MAC for ESI-1, it starts load-balancing the flows to the remote ESI CE between PE1 and PE2. The following command shows the FDB in PE3.

Note: MAC 00:ca:ca:ba:ce:03 is associated with the Ethernet-Segment eES:01:00:00:00:00:71:00:00:00:01 (esi configured on PE1 and PE2 for ESI-1).
*A:PE3# show service id 1 fdb detail 
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
1         00:ca:ca:ba:ce:03 eES:                     Evpn     06/11/15 00:14:47
                            01:00:00:00:00:71:00:00:00:01
1         00:ca:fe:ca:fe:69 eMpls:                   EvpnS    06/11/15 00:09:18
                            192.0.2.69:262141
1         00:ca:fe:ca:fe:70 eMpls:                   EvpnS    06/11/15 00:09:18
                            192.0.2.70:262140
1         00:ca:fe:ca:fe:72 eMpls:                   EvpnS    06/11/15 00:09:39
                            192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 4
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

The following command shows all the EVPN-MPLS destination bindings on PE3, including the ES destination bindings.

The Ethernet-Segment eES:01:00:00:00:00:71:00:00:00:01 is resolved to PE1 and PE2 addresses:

*A:PE3# show service id 1 evpn-mpls 
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address     Egr Label     Num. MACs   Mcast           Last Change
                 Transport                                
-------------------------------------------------------------------------------
192.0.2.69      262140        0           Yes             06/10/2015 14:33:30
                ldp                                        
192.0.2.69      262141        1           No              06/10/2015 14:33:30
                ldp                                        
192.0.2.70      262139        0           Yes             06/10/2015 14:33:30
                ldp                                        
192.0.2.70      262140        1           No              06/10/2015 14:33:30
                ldp                                        
192.0.2.72      262140        0           Yes             06/10/2015 14:33:30
                ldp                                        
192.0.2.72      262141        1           No              06/10/2015 14:33:30
                ldp                                        
192.0.2.73      262139        0           Yes             06/10/2015 14:33:30
                ldp                                        
192.0.2.254     262142        0           Yes             06/10/2015 14:33:30
                bgp                                        
-------------------------------------------------------------------------------
Number of entries : 8
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-MPLS Ethernet Segment Dest
===============================================================================
Eth SegId                     TEP Address     Egr Label   Last Change
                                               Transport  
-------------------------------------------------------------------------------
01:00:00:00:00:71:00:00:00:01 192.0.2.69      262141      06/10/2015 14:33:30
                                              ldp          
01:00:00:00:00:71:00:00:00:01 192.0.2.72      262141      06/10/2015 14:33:30
                                              ldp          
01:74:13:00:74:13:00:00:74:13 192.0.2.73      262140      06/10/2015 14:33:30
                                              ldp          
-------------------------------------------------------------------------------
Number of entries : 3
-------------------------------------------------------------------------------
===============================================================================

PE3 performs aliasing for all the MACs associated with that ESI. This is possible because PE1 is configured with ECMP parameter >1:

*A:PE3>config>service>vpls# info 
----------------------------------------------
            bgp
            exit
            bgp-evpn
                evi 1
                mpls bgp 1
                    ecmp 4
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            proxy-arp
                shutdown
            exit
            stp
                shutdown
            exit
            sap 1/1/1:2 create
            exit
            no shutdown
Network failures and convergence for all-active multihoming

All-active multihoming ES failure shows the behavior on the remote PE (PE3) when there is an Ethernet-Segment failure.

Figure 30. All-active multihoming ES failure

The unicast traffic behavior on PE3 is as follows:

  1. PE3 forwards MAC DA = CE2 to both PE1 and PE2 when the MAC advertisement route came from PE1 (or PE2) and the set of Ethernet AD per-ES routes and Ethernet AD per-EVI routes from PE1 and PE2 are active at PE3.

  2. If there was a failure between CE2 and PE2, PE2 would withdraw its set of Ethernet AD and ES routes, then PE3 would forward traffic destined for CE2 to PE1 only. PE3 does not need to wait for the withdrawal of the individual MAC.

    The same behavior would be followed if the failure had been at PE1.

  3. If after step 2, PE2 withdraws its MAC advertisement route, then PE3 treats traffic to MAC DA = CE2 as unknown unicast, unless the MAC had been previously advertised by PE1.
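
The aliasing and withdrawal behavior in the steps above can be sketched as follows. This is a minimal Python illustration with hypothetical names, not SR OS code:

```python
# Minimal sketch of how a remote PE such as PE3 resolves next-hops for a
# MAC bound to an all-active Ethernet-Segment (hypothetical names).

def resolve_nexthops(es_pes_with_ad_routes, mac_known):
    """Return the set of PEs used for known unicast toward a MAC on an ES.

    es_pes_with_ad_routes: PEs whose AD per-ES and AD per-EVI routes are
    active at the remote PE. mac_known: True while at least one ES PE
    still advertises the MAC route. None means 'unknown unicast (flood)'.
    """
    if not mac_known or not es_pes_with_ad_routes:
        return None
    # Aliasing: ECMP across every PE with active AD routes for the ES,
    # regardless of which PE advertised the MAC route.
    return set(es_pes_with_ad_routes)

# Step 1: MAC advertised by PE1 (or PE2), AD routes active from both.
assert resolve_nexthops({"PE1", "PE2"}, True) == {"PE1", "PE2"}
# Step 2: PE2 withdraws its AD routes; traffic goes to PE1 only.
assert resolve_nexthops({"PE1"}, True) == {"PE1"}
# Step 3: the MAC route is also withdrawn; unknown unicast flooding.
assert resolve_nexthops({"PE1"}, False) is None
```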

For BUM traffic, the following events would trigger a DF election on a PE and only the DF would forward BUM traffic after the esi-activation-timer expiration (if there was a transition from non-DF to DF).

  • Reception of ES route update (local ES shutdown/no shutdown or remote route)

  • New AD-ES route update/withdraw

  • New AD-EVI route update/withdraw

  • Local ES port/SAP/service shutdown

  • Service carving range change (affecting the EVI)

  • Multihoming mode change (from single-active to all-active, or from all-active to single-active)
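
When a DF election runs, the default (auto) service carving selects the DF per EVI. The following is a minimal sketch of the RFC 7432 modulo-based default algorithm, with illustrative addresses, not SR OS code:

```python
import ipaddress

def elect_df(candidate_ips, evi):
    """Default service carving per RFC 7432: order the candidate PEs'
    originator IP addresses in ascending numeric order; the DF for a
    service is the candidate at index (evi mod number_of_candidates)."""
    ordered = sorted(candidate_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[evi % len(ordered)]

# Two candidate PEs on the ES (illustrative addresses):
assert elect_df(["192.0.2.73", "192.0.2.70"], 1) == "192.0.2.73"
assert elect_df(["192.0.2.73", "192.0.2.70"], 2) == "192.0.2.70"
```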

Logical failures on ESs and blackholes

Be aware of the effects triggered by specific failure scenarios; some of these scenarios are shown in Blackhole caused by SAP/SVC shutdown:

Figure 31. Blackhole caused by SAP/SVC shutdown

If an individual VPLS service is shut down in PE1 (the example is also valid for PE2), the corresponding LAG SAP goes operationally down. This event triggers the withdrawal of the AD per-EVI route for that particular SAP. PE3 removes PE1 from its list of aliased next-hops, and PE2 takes over as DF (if it was not the DF already). However, this does not prevent the network from black-holing the traffic that CE2 'hashes' to the link to PE1. Traffic sent from CE2 to PE2, or traffic from the rest of the CEs to CE2, is not affected, so this situation is not easily detected on the CE.

The same result occurs if the ES SAP is administratively shut down instead of the service.

Note: When bgp-evpn mpls shutdown is executed, the SAP associated with the ES is brought operationally down (StandbyForMHProtocol), as is the entire service if there are no other SAPs or SDP bindings in the service. However, if there are other SAPs/SDP bindings, the service remains operationally up.
Transient issues caused by MAC route delays

Some situations may cause potential transient issues to occur. These are shown in Transient issues caused by ‟slow” MAC learning and described below.

Figure 32. Transient issues caused by ‟slow” MAC learning

Transient packet duplication caused by delay in PE3 to learn MAC1:

This scenario is illustrated by the diagram on the left in Transient issues caused by ‟slow” MAC learning. In an all-active multihoming scenario, if a specified MAC address (for example, MAC1) is not yet learned in a remote PE (for example, PE3), but is known in the two PEs of the ES (for example, PE1 and PE2), the latter PEs may send duplicated packets to the CE.

This issue is solved by the use of ingress-replication-bum-label in PE1 and PE2. If configured, PE1/PE2 know that the received packet is an unknown unicast packet; therefore, the NDF (PE1) does not send the packet to the CE and there is no duplication.

Note: Even without the ingress-replication-bum-label, this is only a transient situation that would be solved as soon as MAC1 is learned in PE3.

Transient blackhole caused by delay in PE1 to learn MAC1:

This case is illustrated by the diagram on the right in Transient issues caused by ‟slow” MAC learning. In an all-active multihoming scenario, MAC1 is known in PE3 and aliasing is applied to MAC1. However, MAC1 is not known yet in PE1, the NDF for the ES. If PE3 hashing picks up PE1 as the destination of the aliased MAC1, the packets are blackholed. This case is solved on the NDF by not blocking unknown unicast traffic that arrives with a unicast label. If PE1 and PE2 are configured using ingress-replication-bum-label, PE3 sends unknown unicast with a BUM label and known unicast with a unicast label. In the latter case, PE1 considers it is safe to forward the frame to the CE, even if it is unknown unicast. It is important to note that this is a transient issue and as soon as PE1 learns MAC1 the frames are forwarded as known unicast.
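
Both transient cases above reduce to one forwarding rule on the ES PEs. The following sketch, with hypothetical names, assumes ingress-replication-bum-label is configured on the ES peer PEs so that BUM-labeled and unicast-labeled frames can be distinguished:

```python
# Sketch of the forwarding decision toward the CE on an all-active ES:
# the non-DF blocks BUM-labeled frames (avoids duplication) but forwards
# unicast-labeled frames even if the MAC is still unknown locally
# (avoids the transient blackhole).

def forwards_to_ce(is_df, label_type):
    if is_df:
        return True                      # DF forwards unicast and BUM
    return label_type == "unicast"       # NDF blocks only BUM-labeled frames

assert forwards_to_ce(False, "bum") is False      # no duplication at the CE
assert forwards_to_ce(False, "unicast") is True   # no transient blackhole
assert forwards_to_ce(True, "bum") is True
```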

EVPN single-active multihoming

The 7750 SR, 7450 ESS, and 7950 XRS SR OS supports single-active multihoming on access LAG SAPs, regular SAPs, and spoke SDPs for a specified VPLS service.

The following SR OS procedures support EVPN single-active multihoming for a specified Ethernet-Segment:

  • DF (Designated Forwarder) election

    As in all-active multihoming, DF election in single-active multihoming determines the forwarding of BUM traffic from the EVPN network to the Ethernet-Segment CE. In addition, in single-active multihoming, DF election determines the forwarding of all traffic (unicast and BUM) in both directions (to and from the CE).

  • backup PE

    In single-active multihoming, the remote PEs do not perform aliasing to the PEs in the Ethernet-Segment. The remote PEs identify the DF based on the MAC routes, send the unicast flows for the Ethernet-Segment to the DF, and program a backup PE as an alternative next-hop for the remote ESI in case of failure.

    This RFC 7432 procedure is known as 'Backup PE' and is shown for PE3 in Backup PE.

    Figure 33. Backup PE
Single-active multihoming service model

The following shows an example of PE1 configuration that provides single-active multihoming to CE2, as shown in Backup PE.

*A:PE1>config>service>system>bgp-evpn# info 
----------------------------------------------
  route-distinguisher 10.1.1.1:0
  ethernet-segment "ESI2" create
    esi 01:12:12:12:12:12:12:12:12:12
    multi-homing single-active
    service-carving
    sdp 1 
    no shutdown

*A:PE1>config>redundancy>evpn-multi-homing# info 
----------------------------------------------
    boot-timer 120
    es-activation-timer 10

*A:PE1>config>service>vpls# info 
----------------------------------------------
  description "evpn-mpls-service with single-active multihoming"
  bgp
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  spoke-sdp 1:1 create 
  exit

The PE2 example configuration for this scenario is as follows:

*A:PE2>config>service>system>bgp-evpn# info 
----------------------------------------------
  route-distinguisher 10.1.1.1:0
  ethernet-segment "ESI2" create
    esi 01:12:12:12:12:12:12:12:12:12
    multi-homing single-active
    service-carving
    sdp 2 
    no shutdown

*A:PE2>config>redundancy>evpn-multi-homing# info 
----------------------------------------------
    boot-timer 120
    es-activation-timer 10

*A:PE2>config>service>vpls# info 
----------------------------------------------
  description "evpn-mpls-service with single-active multihoming"
  bgp
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  spoke-sdp 2:1 create 
  exit

In single-active multihoming, the non-DF PEs for a specified ESI block unicast and BUM traffic in both directions (upstream and downstream) on the object associated with the ESI. Other than that, single-active multihoming is similar to all-active multihoming with the following differences:

  • The ethernet-segment is configured for single-active: service>system>bgp-evpn>eth-seg>multi-homing single-active.

  • The advertisement of the ESI label in an AD per-ESI route is optional for single-active Ethernet-Segments. The user can disable the advertisement of the ESI label by using the service system bgp-evpn eth-seg multi-homing single-active no-esi-label command. By default, the ESI label is used for single-active ESs too.

  • For single-active multihoming, the Ethernet-Segment can be associated with a port or an sdp, as well as a lag-id, as shown in Backup PE, where:

    • port is used for single-active SAP redundancy without the need for lag.

    • sdp is used for single-active spoke SDP redundancy.

    • lag is used for single-active LAG redundancy

      Note: In this case, the key, system-id, and system-priority must be different on the PEs that are part of the Ethernet-Segment.
  • For single-active multihoming, when the PE is non-DF for the service, the SAPs/spoke SDPs on the Ethernet-Segment are down and show StandbyForMHProtocol as the reason.

  • From a service perspective, single-active multihoming can provide redundancy to CEs (MHD, Multihomed Devices) or networks (MHN, Multihomed Networks) with the following setup:

    • LAG with or without LACP

      In this case, the multihomed ports on the CE are part of the different LAGs (a LAG per multihomed PE is used in the CE). The non-DF PE for each service can signal that the SAP is operationally down if eth-cfm fault-propagation-enable {use-if-tlv | suspend-ccm} is configured.

    • regular Ethernet 802.1q/ad ports

      In this case, the multihomed ports on the CE/network are not part of any LAG. Eth-cfm can also be used for non-DF indication to the multihomed device/network.

    • active-standby PWs

      In this case, the multihomed CE/network is connected to the PEs through an MPLS network and an active/standby spoke SDP per service. The non-DF PE for each service makes use of the LDP PW status bits to signal that the spoke SDP is operationally down on the PE side.

ES and DF election procedures

In all-active multihoming, the non-DF keeps the SAP up, although it removes it from the default flooding list. In the single-active multihoming implementation the non-DF brings the SAP or SDP binding operationally down. See ES discovery and DF election procedures.
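
The contrast above can be sketched as follows; the returned strings are illustrative (only StandbyForMHProtocol appears in actual show output):

```python
# Non-DF behavior per multihoming mode, as described above.

def non_df_state(multi_homing_mode):
    if multi_homing_mode == "all-active":
        # The SAP stays up; it is only removed from the default flooding list.
        return "up (removed from flooding list)"
    # single-active: the SAP or SDP binding is brought operationally down.
    return "down (StandbyForMHProtocol)"

assert non_df_state("all-active") == "up (removed from flooding list)"
assert non_df_state("single-active") == "down (StandbyForMHProtocol)"
```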

The following show commands display the status of the single-active ESI-7413 in the non-DF. The associated spoke SDP is operationally down and it signals PW Status standby to the multihomed CE:

*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-7413"       

===============================================================================
Service Ethernet Segment
===============================================================================
Name                    : ESI-7413
Admin State             : Up                 Oper State         : Up
ESI                     : 01:74:13:00:74:13:00:00:74:13
Multi-homing            : singleActive       Oper Multi-homing  : singleActive
Source BMAC LSB         : <none>             
Sdp Id                  : 4                  
ES Activation Timer     : 0 secs             
Exp/Imp Route-Target    : target:74:13:00:74:13:00

Svc Carving             : auto               
ES SHG Label            : 262141             
===============================================================================
*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-7413" evi 1 
===============================================================================
EVI DF and Candidate List
===============================================================================
EVI           SvcId         Actv Timer Rem      DF  DF Last Change
-------------------------------------------------------------------------------
1             1             0                   no  06/11/2015 20:05:32
===============================================================================
===============================================================================
DF Candidates                           Time Added
-------------------------------------------------------------------------------
192.0.2.70                              06/11/2015 20:05:20
192.0.2.73                              06/11/2015 20:05:32
-------------------------------------------------------------------------------
Number of entries: 2
===============================================================================
*A:PE1# show service id 1 base 
===============================================================================
Service Basic Information
===============================================================================
Service Id        : 1                   Vpn Id            : 0
Service Type      : VPLS                
Name              : (Not Specified)
Description       : (Not Specified)

<snip>
-------------------------------------------------------------------------------
Service Access & Destination Points
-------------------------------------------------------------------------------
Identifier                               Type         AdmMTU  OprMTU  Adm  Opr
-------------------------------------------------------------------------------
sap:1/1/1:1                              q-tag        9000    9000    Up   Up
sdp:4:13 S(192.0.2.74)                   Spok         0       8978    Up   Down
===============================================================================
* indicates that the corresponding row element may have been truncated.


*A:PE1# show service id 1 all | match Pw 
Local Pw Bits      : pwFwdingStandby
Peer Pw Bits       : None

*A:PE1# show service id 1 all | match Flag 
Flags              : StandbyForMHProtocol
Flags              : None

Backup PE function

A remote PE (PE3 in Backup PE) imports the AD routes per ESI, where the single-active flag is set. PE3 interprets that the Ethernet-Segment is single-active if at least one PE sends an AD route per-ESI with the single-active flag set. MACs for a specified service and ESI are learned from a single PE, that is, the DF for that <ESI, EVI>.

The remote PE installs a single EVPN-MPLS destination (TEP, label) for a received MAC address and a backup next-hop to the PE for which the AD routes per-ESI and per-EVI are received. For instance, in the following command, 00:ca:ca:ba:ca:06 is associated with the remote ethernet-segment eES 01:74:13:00:74:13:00:00:74:13. That eES is resolved to PE (192.0.2.73), which is the DF on the ES.

*A:PE3# show service id 1 fdb detail 
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
1         00:ca:ca:ba:ca:02 sap:1/1/1:2              L/0      06/12/15 00:33:39
1         00:ca:ca:ba:ca:06 eES:                     Evpn     06/12/15 00:33:39
                            01:74:13:00:74:13:00:00:74:13
1         00:ca:fe:ca:fe:69 eMpls:                   EvpnS    06/11/15 21:53:47
                            192.0.2.69:262118
1         00:ca:fe:ca:fe:70 eMpls:                   EvpnS    06/11/15 19:59:57
                            192.0.2.70:262140
1         00:ca:fe:ca:fe:72 eMpls:                   EvpnS    06/11/15 19:59:57
                            192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 5
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

*A:PE3# show service id 1 evpn-mpls 
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address     Egr Label     Num. MACs   Mcast           Last Change
                 Transport                                
-------------------------------------------------------------------------------
192.0.2.69      262118        1           Yes             06/11/2015 19:59:03
                ldp                                        
192.0.2.70      262139        0           Yes             06/11/2015 19:59:03
                ldp                                        
192.0.2.70      262140        1           No              06/11/2015 19:59:03
                ldp                                        
192.0.2.72      262140        0           Yes             06/11/2015 19:59:03
                ldp                                        
192.0.2.72      262141        1           No              06/11/2015 19:59:03
                ldp                                        
192.0.2.73      262139        0           Yes             06/11/2015 19:59:03
                ldp                                        
192.0.2.254     262142        0           Yes             06/11/2015 19:59:03
                bgp                                        
-------------------------------------------------------------------------------
Number of entries : 7
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-MPLS Ethernet Segment Dest
===============================================================================
Eth SegId                     TEP Address     Egr Label   Last Change
                                               Transport  
-------------------------------------------------------------------------------
01:74:13:00:74:13:00:00:74:13 192.0.2.73      262140      06/11/2015 19:59:03
                                              ldp          
-------------------------------------------------------------------------------
Number of entries : 1
-------------------------------------------------------------------------------
===============================================================================

If PE3 sees only two single-active PEs in the same ESI, the second PE is the backup PE. Upon receiving an AD per-ES/per-EVI route withdrawal for the ESI from the primary PE, PE3 immediately starts sending the unicast traffic to the backup PE.

If PE3 receives AD routes for the same ESI and EVI from more than two PEs, the PE does not install any backup route in the datapath. Upon receiving an AD per-ES/per-EVI route withdrawal for the ESI, it flushes the MACs associated with the ESI.
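
The two remote-PE rules above can be sketched as follows (hypothetical names, not SR OS code):

```python
# With exactly two single-active PEs on the ESI, the non-primary PE is
# programmed as backup next-hop; with more than two, no backup is
# programmed and a primary withdrawal triggers a MAC flush instead.

def on_es_pes(es_pes, primary):
    others = sorted(pe for pe in es_pes if pe != primary)
    if len(es_pes) == 2:
        return {"backup": others[0], "on_primary_withdraw": "switch-to-backup"}
    return {"backup": None, "on_primary_withdraw": "flush-macs"}

assert on_es_pes({"PE1", "PE2"}, "PE2") == {
    "backup": "PE1", "on_primary_withdraw": "switch-to-backup"}
assert on_es_pes({"PE1", "PE2", "PE4"}, "PE2")["on_primary_withdraw"] == "flush-macs"
```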

Network failures and convergence for single-active multihoming

Single-active multihoming ES failure shows the remote PE (PE3) behavior when there is an Ethernet-Segment failure.

Figure 34. Single-active multihoming ES failure

The PE3 behavior for unicast traffic is as follows:

  1. PE3 forwards MAC DA = CE2 to PE2 when the MAC advertisement route came from PE2 and the set of Ethernet AD per-ES routes and Ethernet AD per-EVI routes from PE1 and PE2 are active at PE3.

  2. If there was a failure between CE2 and PE2, PE2 would withdraw its set of Ethernet AD and ES routes, then PE3 would immediately forward the traffic destined for CE2 to PE1 only (the backup PE). PE3 does not need to wait for the withdrawal of the individual MAC.

  3. If after step 2, PE2 withdraws its MAC advertisement route, PE3 treats traffic to MAC DA = CE2 as unknown unicast, unless the MAC has been previously advertised by PE1.

Also, a DF election on PE1 is triggered. In general, a DF election is triggered by the same events as for all-active multihoming. In this case, the DF forwards traffic to CE2 after the esi-activation-timer expires (the timer starts when there is a transition from non-DF to DF).

EVPN ESI type 1 support

According to RFC 7432, specific Ethernet Segment Identifier (ESI) types support auto-derivation and the 10-byte ESI value does not need to be configured. SR OS supports the manual configuration of 10-byte ESI for the Ethernet segment, or alternatively, the auto-derivation of EVPN type 1 ESIs.

The auto-esi {none|type-1} command is supported in the Ethernet segment configuration. The default mode is none and it forces the user to configure a manual ESI. When type-1 is configured, a manual ESI cannot be configured in the ES and the ESI is auto-derived in accordance with the RFC 7432 ESI type 1 definition. An ESI type 1 encodes 0x01 in the ESI type octet (T=0x01) and indicates that IEEE 802.1AX LACP is used between the PEs and CEs.

The ESI is auto-derived from the CE's LACP PDUs by concatenating the following parameters:

  • CE LACP system MAC address (6 octets)

    The CE LACP system MAC address is encoded in the high-order 6 octets of the ESI value field.

  • CE LACP port key (2 octets)

    The CE LACP port key is encoded in the 2 octets next to the system MAC address.

The remaining octet is set to 0x00.
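
The derivation above can be sketched as follows; the CE system MAC and port key values are illustrative:

```python
# Build a 10-byte type-1 ESI per RFC 7432: type octet 0x01, the CE LACP
# system MAC (6 octets), the LACP port key (2 octets), and a final 0x00.

def esi_type1(ce_system_mac, ce_port_key):
    mac = bytes(int(octet, 16) for octet in ce_system_mac.split(":"))
    assert len(mac) == 6 and 0 <= ce_port_key <= 0xFFFF
    esi = bytes([0x01]) + mac + ce_port_key.to_bytes(2, "big") + bytes([0x00])
    return ":".join(f"{b:02x}" for b in esi)

assert esi_type1("00:aa:bb:cc:dd:ee", 0x0001) == "01:00:aa:bb:cc:dd:ee:00:01:00"
```

Because both multihomed PEs read the same system MAC and port key from the CE's LACP PDUs, they auto-derive the same ESI without coordination.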

The following usage considerations apply to auto-ESI type 1:

  • ESI type 1 is only supported on non-virtual Ethernet segments associated with LAGs when LACP is enabled.

  • Single-active or all-active modes are supported. When used in single-active mode, the CE must be attached to the PEs by a single LAG, which allows the multihomed PEs to auto-derive the same ESI.

  • Changing the auto-esi command requires an ES shutdown.

  • When the ES is enabled but the ESI has not yet been auto-derived, no multihoming routes are advertised for the ES. ES and AD routes are advertised only after ESI type 1 is auto-derived and the ES is enabled.

  • When the ES LAG is operationally down as a result of the ports or LACP going down, the previously auto-derived ESI is retained. Consequently, convergence is not impacted when the LAG comes back up; if the CE's LACP information is changed, the ES goes down and a new auto-derived type 1 ESI is generated.

P2MP mLDP tunnels for BUM traffic in EVPN-MPLS services

P2MP mLDP tunnels for BUM traffic in EVPN-MPLS services are supported and enabled through the provider-tunnel context. If EVPN-MPLS takes ownership of the provider-tunnel, bgp-ad is still supported in the service but it does not generate BGP updates, including the PMSI Tunnel Attribute. The following CLI example shows an EVPN-MPLS service that uses P2MP mLDP LSPs for BUM traffic.

*A:PE-1>config>service>vpls(vpls or b-vpls)# info 
----------------------------------------------
  description "evpn-mpls-service with p2mp mLDP"
  bgp-evpn
    evi 10
    no ingress-repl-inc-mcast-advertisement 
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  exit
  provider-tunnel
    inclusive
      owner bgp-evpn-mpls
      root-and-leaf
      mldp
      no shutdown
      exit
    exit
  sap 1/1/1:1 create 
  exit
  spoke-sdp 1:1 create
 exit

When provider-tunnel inclusive is used in EVPN-MPLS services, the following commands can be used in the same way as for BGP-AD or BGP-VPLS services:

  • data-delay-interval

  • root-and-leaf

  • mldp

  • shutdown

The following commands are used by provider-tunnel in BGP-EVPN MPLS services:

  • [no] ingress-repl-inc-mcast-advertisement

    This command allows you to control the advertisement of IMET-IR and IMET-P2MP-IR routes for the service. See BGP-EVPN control plane for MPLS tunnels for a description of the IMET routes. The following considerations apply:

    • If configured as no ingress-repl-inc-mcast-advertisement, the system does not send the IMET-IR or IMET-P2MP-IR routes, regardless of the service being enabled for BGP-EVPN MPLS or BGP-EVPN VXLAN.

    • If configured as ingress-repl-inc-mcast-advertisement and the PE is root-and-leaf, the system sends an IMET-P2MP-IR route.

    • If configured as ingress-repl-inc-mcast-advertisement and the PE is no root-and-leaf, the system sends an IMET-IR route.

    • Default value is ingress-repl-inc-mcast-advertisement.

  • [no] owner {bgp-ad | bgp-vpls | bgp-evpn-mpls}

    The owner of the provider tunnel must be configured. The default value is no owner. The following considerations apply:

    • Only one of the protocols supports a provider tunnel in the service and it must be explicitly configured.

    • bgp-vpls and bgp-evpn are mutually exclusive.

    • While bgp-ad and bgp-evpn can coexist in the same service, only bgp-evpn can be the provider-tunnel owner in such cases.
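
The IMET advertisement rules listed above can be sketched as a single decision, using illustrative names:

```python
# Which IMET route a PE advertises for the service, based on the
# ingress-repl-inc-mcast-advertisement and root-and-leaf settings.

def imet_route(imcast_advertisement, root_and_leaf):
    if not imcast_advertisement:
        return None                  # neither IMET-IR nor IMET-P2MP-IR sent
    return "IMET-P2MP-IR" if root_and_leaf else "IMET-IR"

assert imet_route(False, True) is None
assert imet_route(True, True) == "IMET-P2MP-IR"
assert imet_route(True, False) == "IMET-IR"
```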

EVPN services with p2mp mLDP—control plane shows the use of P2MP mLDP tunnels in an EVI with a root node and a few leaf-only nodes.

Figure 35. EVPN services with p2mp mLDP—control plane

Consider the use case of a root-and-leaf PE4 where the other nodes are configured as leaf-only nodes (no root-and-leaf). This scenario is handled as follows:

  1. If ingress-repl-inc-mcast-advertisement is configured, then as soon as the bgp-evpn mpls option is enabled, PE4 sends an IMET-P2MP route (tunnel type mLDP), or optionally, an IMET-P2MP-IR route (tunnel type composite). IMET-P2MP-IR routes allow leaf-only nodes to create EVPN-MPLS multicast destinations and send BUM traffic to the root.

  2. If ingress-repl-inc-mcast-advertisement is configured, PE1/2/3 do not send IMET-P2MP routes; only IMET-IR routes are sent.

    • The root-and-leaf node imports the IMET-IR routes from the leaf nodes but it only sends BUM traffic to the P2MP tunnel as long as it is active.

    • If the P2MP tunnel goes operationally down, the root-and-leaf node starts sending BUM traffic to the evpn-mpls multicast destinations.

  3. When PE1/2/3 receive and import the IMET-P2MP or IMET-P2MP-IR from PE4, they join the mLDP P2MP tree signaled by PE4. They issue an LDP label-mapping message including the corresponding P2MP FEC.

As described in IETF Draft draft-ietf-bess-evpn-etree, mLDP and Ingress Replication (IR) can work in the same network for the same service; that is, EVI1 can have some nodes using mLDP (for example, PE1) and others using IR (for example, PE2). This is particularly important for scaling in services that consist of a pair of root nodes sending BUM traffic in P2MP tunnels and hundreds of leaf nodes that only need to send BUM traffic to the roots. By using IMET-P2MP-IR routes from the roots, the operator ensures that the leaf-only nodes can send BUM traffic to the root nodes without needing to set up P2MP tunnels from the leaf nodes.

When both static and dynamic P2MP mLDP tunnels are used on the same router, Nokia recommends that the static tunnels use a tunnel ID lower than 8193. If a tunnel ID is statically configured with a value equal to or greater than 8193, BGP-EVPN may attempt to use the same tunnel ID for services with provider-tunnel enabled, and fail to set up an mLDP tunnel.

Inter-AS option C or seamless-MPLS models for non-segmented mLDP trees are supported with EVPN for BUM traffic. The leaf PE that joins an mLDP EVPN root PE supports Recursive and Basic Opaque FEC elements (types 7 and 1, respectively). Therefore, packet forwarding is handled as follows:

  • The ABR or ASBR may leak the root IP address into the leaf PE IGP, which allows the leaf PE to issue a Basic opaque FEC to join the root.

  • The ABR or ASBR may distribute the root IP using BGP label-ipv4, which results in the leaf PE issuing a Recursive opaque FEC to join the root.

For more information about mLDP opaque FECs, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 3 Services Guide: IES and VPRN and the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide.

All-active multihoming and single-active with an ESI label multihoming are supported in EVPN-MPLS services together with P2MP mLDP tunnels. Both use an upstream-allocated ESI label, as described in RFC 7432 section 8.3.1.2, which is popped at the leaf PEs, resulting in the requirement that, in addition to the root PE, all EVPN-MPLS P2MP leaf PEs must support this capability (including the PEs not connected to the multihoming ES).

PBB-EVPN

This section contains information about PBB-EVPN.

BGP-EVPN control plane for PBB-EVPN

PBB-EVPN uses a reduced subset of the routes and procedures described in RFC 7432. The supported routes are:

  • ES routes

  • MAC/IP routes

  • Inclusive Multicast Ethernet Tag routes

EVPN route type 3 - inclusive multicast Ethernet tag route

This route is used to advertise the ISIDs that belong to I-VPLS services as well as the default multicast tree. PBB-Epipe ISIDs are not advertised in Inclusive Multicast routes. The following fields are used:

  • Route Distinguisher is taken from the RD of the B-VPLS service within the BGP context. The RD can be configured or derived from the value of the bgp-evpn evi.

  • Ethernet Tag ID encodes the ISID for a specified I-VPLS.

  • IP address length is always 32.

  • Originating router's IP address carries an IPv4 or IPv6 address.

  • PMSI attribute:

    • Tunnel type = Ingress replication (6).

    • Flags = Leaf not required.

    • MPLS label carries the MPLS label allocated for the service in the high-order 20 bits of the label field.

      Note: This label is the same label used in the B-MAC routes for the same B-VPLS service unless bgp-evpn mpls ingress-replication-bum-label is configured in the B-VPLS service.
    • Tunnel endpoint = equal to the originating IP address.

      Note: The mLDP P2MP tunnel type is supported on PBB-EVPN services, but it can be used in the default multicast tree only.
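
The label placement described above (a 20-bit label in the high-order 20 bits of the 3-octet MPLS Label field) can be sketched as follows; the low-order 4 bits are simply left as zero in this illustration:

```python
# Encode a 20-bit MPLS label into the high-order 20 bits of the 3-octet
# MPLS Label field of the PMSI attribute.

def encode_pmsi_label(label):
    assert 0 <= label < 2 ** 20
    return (label << 4).to_bytes(3, "big")

# 262141 (0x3FFFD), a label value seen in the show outputs, encodes as:
assert encode_pmsi_label(262141) == bytes([0x3F, 0xFF, 0xD0])
```
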
EVPN route type 2 - MAC/IP advertisement route (or B-MAC routes)

The 7750 SR, 7450 ESS, or 7950 XRS generates this route type for advertising B-MAC addresses for the following:

  • Learned MACs on B-SAPs or B-SDP bindings (if mac-advertisement is enabled)

  • Conditional static MACs (if mac-advertisement is enabled)

  • B-VPLS shared B-MACs (source-bmacs) and dedicated B-MACs (es-bmacs).

The route type 2 generated by the router uses the following fields and values:

  • Route Distinguisher is taken from the RD of the VPLS service within the BGP context. The RD can be configured or derived from the bgp-evpn evi value.

  • Ethernet Segment Identifier (ESI):

    • ESI = 0 for the advertisement of source-bmac, es-bmacs, sap-bmacs, or sdp-bmacs if no multihoming or single-active multihoming is used.

    • ESI=MAX-ESI (0xFF..FF) in the advertisement of es-bmacs used for all-active multihoming.

    • ESI different from zero or MAX-ESI for learned B-MACs on B-SAPs/SDP bindings if EVPN multihoming is used on B-VPLS SAPs and SDP bindings.

  • Ethernet Tag ID is 0.

    Note: A different Ethernet Tag value may be used only when send-bvpls-evpn-flush is enabled.
  • MAC address length is always 48.

  • B-MAC address (learned, configured, or system-generated).

  • IP address length zero and IP address omitted.

  • MPLS Label 1 carries the MPLS label allocated by the system to the B-VPLS service. The label value is encoded in the high-order 20 bits of the field and is the same label used in the routes type 3 for the same service unless BGP-EVPN MPLS ingress-replication-bum-label is configured in the service.

  • The MAC Mobility extended community:

    • The MAC mobility extended community is used in PBB-EVPN for C-MAC flush purposes if per-ISID load balancing (single-active multihoming) is used and a source-bmac is used for traffic coming from the ESI.

      If there is a failure in one of the ES links, C-MAC flush through the withdrawal of the B-MAC cannot be done (other ESIs are still working); therefore, the MAC mobility extended community is used to signal C-MAC flush to the remote PEs.

    • When a dedicated es-bmac per ESI is used, the MAC flush can be based on the withdrawal of the B-MAC from the failing node.

    • es-bmacs are advertised as static (sticky bit set).

    • Source-bmacs are advertised as static MACs (sticky bit set). In the case of an update, if advertised to indicate that C-MAC flush is needed, the MAC mobility extended community is added to the B-MAC route including a higher sequence number (than the one previously advertised) in addition to the sticky bit.
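As an illustration of the community format described above, the following Python sketch (not SR OS code) encodes the 8-octet MAC Mobility extended community as defined in RFC 7432 section 7.7, carrying the sticky bit and the sequence number used for C-MAC flush signaling; the function name is hypothetical:

```python
import struct

STICKY = 0x01  # static/sticky bit in the Flags octet

def mac_mobility_ext_community(sticky: bool, seq: int) -> bytes:
    """Encode the 8-octet MAC Mobility extended community
    (RFC 7432 section 7.7): Type 0x06, Sub-Type 0x00, Flags,
    Reserved, then a 4-octet sequence number."""
    flags = STICKY if sticky else 0x00
    return struct.pack("!BBBBI", 0x06, 0x00, flags, 0x00, seq)

# A source-bmac is first advertised as static (sticky, sequence 0);
# a later update with a higher sequence number signals C-MAC flush.
initial = mac_mobility_ext_community(sticky=True, seq=0)
flush_update = mac_mobility_ext_community(sticky=True, seq=1)
assert initial != flush_update
```

A receiving PE compares the sequence number against the previously stored value; a higher value triggers the flush of the C-MACs associated with that B-MAC.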

EVPN route type 4 - ES route

This route type is used for DF election as described in section BGP-EVPN control plane for MPLS tunnels.

Note: The EVPN route type 1—Ethernet Auto Discovery route is not used in PBB-EVPN.

PBB-EVPN for I-VPLS and PBB Epipe services

The 7750 SR, 7450 ESS, and 7950 XRS SR OS implementation of PBB-EVPN reuses the existing PBB-VPLS model, where N I-VPLS (or Epipe) services can be linked to a B-VPLS service. BGP-EVPN is enabled in the B-VPLS and the B-VPLS becomes an EVI (EVPN Instance). PBB-EVPN for I-VPLS and PBB Epipe services shows the PBB-EVPN model in the SR OS.

Figure 36. PBB-EVPN for I-VPLS and PBB Epipe services

Each PE in the B-VPLS domain advertises its source-bmac as either configured in vpls>pbb>source-bmac or auto-derived from the chassis MAC. The remote PEs install the advertised B-MACs in the B-VPLS FDB. If a specified PE is configured with an ethernet-segment associated with an I-VPLS or PBB Epipe, it may also advertise an es-bmac for the Ethernet-Segment.

In the example shown in PBB-EVPN for I-VPLS and PBB Epipe services, when a frame with MAC DA = AA gets to PE1, a MAC lookup is performed on the I-VPLS FDB and B-MAC-34 is found. A B-MAC lookup on the B-VPLS FDB yields the next-hop (or next-hops if the destination is in an all-active Ethernet-Segment) to which the frame is sent. As in PBB-VPLS, the frame is encapsulated with the corresponding PBB header. A label specified by EVPN for the B-VPLS and the MPLS transport label are also added.

If the lookup on the I-VPLS FDB fails, the system sends the frame encapsulated into a PBB packet with B-MAC DA = Group B-MAC for the ISID. That packet is distributed to all the PEs where the ISID is defined and contains the EVPN label distributed by the Inclusive Multicast routes for that ISID, as well as the transport label.
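The two-stage lookup described above can be sketched as follows; the data structures and names are hypothetical simplifications that model the behavior, not the SR OS forwarding path:

```python
FLOOD = "flood"

def forward_frame(dst_cmac, isid, ivpls_fdb, bvpls_fdb, group_bmac):
    """Two-stage PBB-EVPN lookup: C-MAC -> B-MAC in the I-VPLS FDB,
    then B-MAC -> next hop(s) in the B-VPLS FDB; on an I-VPLS FDB
    miss, flood to the group B-MAC for the ISID."""
    bmac = ivpls_fdb.get(dst_cmac)
    if bmac is None:
        # Unknown C-MAC: PBB-encapsulate with B-MAC DA = group B-MAC
        # for the ISID; the packet carries the EVPN label distributed
        # by the Inclusive Multicast routes for that ISID.
        return (group_bmac(isid), FLOOD)
    # Known C-MAC: unicast to the next hop(s) resolved for the B-MAC
    # (several next hops if the B-MAC is an all-active es-bmac).
    return (bmac, bvpls_fdb[bmac])

ivpls_fdb = {"AA": "B-MAC-34"}
bvpls_fdb = {"B-MAC-34": ["PE3", "PE4"]}
group = lambda isid: f"01:1e:83:{isid:06x}"  # illustrative group B-MAC
assert forward_frame("AA", 101, ivpls_fdb, bvpls_fdb, group) == \
    ("B-MAC-34", ["PE3", "PE4"])
```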

For PBB-Epipes, all the traffic is sent in a unicast PBB packet to the B-MAC configured in the pbb-tunnel.

The following CLI output shows an example of the configuration of an I-VPLS, PBB-Epipe, and their corresponding B-VPLS.

*A:PE-1>config# 

service vpls 1 name "b-vpls1" b-vpls create
  description "pbb-evpn-service"
  service-mtu 2000
  pbb   
    source-bmac 00:00:00:00:00:03 
  bgp
  bgp-evpn
    evi 1
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  sap 1/1/1:1 create 
  exit
  spoke-sdp 1:1 create


*A:PE-1>config# 

service vpls 101 name "vpls101" i-vpls create
  pbb
    backbone-vpls 1
  sap 1/2/1:101 create
  spoke-sdp 1:102 create

*A:PE-1>config# 

service epipe 102 name "epipe102" create
  pbb
    tunnel 1 backbone-dest-mac 00:00:00:00:00:01 isid 102
  sap 1/2/1:102 create

Configure the bgp-evpn context as described in section EVPN for MPLS tunnels in VPLS services (EVPN-MPLS).

Some EVPN configuration options are not relevant to PBB-EVPN and are not supported when BGP-EVPN is configured in a B-VPLS; these are as follows:

  • bgp-evpn> [no] ip-route-advertisement

  • bgp-evpn> [no] unknown-mac-route

  • bgp-evpn> vxlan [no] shutdown

  • bgp-evpn>mpls>force-vlan-vc-forwarding

When bgp-evpn>mpls no shutdown is added to a specified B-VPLS instance, the following considerations apply:

  • BGP-AD is supported along with EVPN in the same B-VPLS instance.

  • The following B-VPLS and BGP-EVPN commands are fully supported:

    • vpls>backbone-vpls

    • vpls>backbone-vpls>send-flush-on-bvpls-failure

    • vpls>backbone-vpls>source-bmac

    • vpls>backbone-vpls>use-sap-bmac

    • vpls>backbone-vpls>use-es-bmac (For more information, see PBB-EVPN multihoming in I-VPLS and PBB Epipe services)

    • vpls>isid-policies

    • vpls>static-mac

    • vpls>SAP or SDP-binding>static-isid

    • bgp-evpn>mac-advertisement - this command affects the learned B-MACs on SAPs or SDP bindings, and not the system B-MAC or the SAP/es-bmacs being advertised.

    • bgp-evpn>mac-duplication and settings.

    • bgp-evpn>mpls>auto-bind-tunnel and options.

    • bgp-evpn>mpls>ecmp

    • bgp-evpn>mpls>control-word

    • bgp-evpn>evi

    • bgp-evpn>mpls>ingress-replication-bum-label

Flood containment for I-VPLS services

In general, PBB technologies in the 7750 SR, 7450 ESS, or 7950 XRS SR OS support a way to contain the flooding for a specified I-VPLS ISID, so that BUM traffic for that ISID only reaches the PEs where the ISID is locally defined. Each PE creates an MFIB per I-VPLS ISID on the B-VPLS instance. That MFIB supports SAP or SDP bindings endpoints that can be populated by:

  • MMRP in regular PBB-VPLS

  • IS-IS in SPBM

In PBB-EVPN, B-VPLS EVPN endpoints can be added to the MFIBs using EVPN Inclusive Multicast Ethernet Tag routes.

The example in PBB-EVPN and I-VPLS flooding containment shows how the MFIBs are populated in PBB-EVPN.

Figure 37. PBB-EVPN and I-VPLS flooding containment

When the B-VPLS 10 is enabled, PE1 advertises as follows:

  • A B-MAC route containing PE1's system B-MAC (00:01 as configured in pbb>source-bmac) along with an MPLS label.

  • An Inclusive Multicast Ethernet Tag route (IMET route) with Ethernet-tag = 0 that allows the remote B-VPLS 10 instances to add an entry for PE1 in the default multicast list.

Note: The MPLS label that is advertised for the MAC routes and the inclusive multicast routes for a specified B-VPLS can be the same label or a different label. As in regular EVPN-MPLS, this depends on the [no] ingress-replication-bum-label command.

When I-VPLS 2001 (ISID 2001) is enabled as per the CLI in the preceding section, PE1 advertises as follows:

  • An additional Inclusive Multicast route with Ethernet-tag = 2001. This allows the remote PEs to create an MFIB for the corresponding ISID 2001 and add the corresponding EVPN binding entry to the MFIB.

This default behavior can be modified by the configured isid-policy. For instance, for ISIDs 1-2000, configure as follows:

isid-policy
    entry 10 create
      no advertise-local 
      range 1 to 2000
      use-def-mcast

This configuration has the following effect for the ISID range:

  • no advertise-local instructs the system not to advertise the local active ISIDs contained in the 1 to 2000 range.

  • use-def-mcast instructs the system to use the default flooding list as opposed to the MFIB.
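Assuming the isid-policy semantics above, the following sketch models how an ISID resolves to a flooding list and an advertisement decision; the function and parameter names are illustrative, not SR OS internals:

```python
def isid_forwarding(isid, policy_entries):
    """Return (advertise_imet_ir, flooding_list) for an ISID, given
    ordered isid-policy entries of the form
    (lo, hi, advertise_local, use_def_mcast)."""
    for lo, hi, advertise_local, use_def_mcast in policy_entries:
        if lo <= isid <= hi:
            return (advertise_local,
                    "default-multicast-list" if use_def_mcast else "mfib")
    # No matching entry: default behavior, a per-ISID IMET route is
    # advertised and the MFIB is used for flooding.
    return (True, "mfib")

# entry 10: range 1 to 2000, no advertise-local, use-def-mcast
policy = [(1, 2000, False, True)]
assert isid_forwarding(1500, policy) == (False, "default-multicast-list")
assert isid_forwarding(2001, policy) == (True, "mfib")
```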

The ISID flooding behavior on B-VPLS SAPs and SDP bindings is as follows:

  • B-VPLS SAPs and SDP bindings are only added to the TLS-multicast list and not to the MFIB list (unless static-isids are configured, which is only possible for SAPs/SDP bindings and not BGP-AD spoke SDPs).

    As a result, if the system needs to flood ISID BUM traffic and the ISID is also defined in remote PEs connected through SAPs or spoke SDPs without static-isids, then an isid-policy must be configured for the ISID so that the ISID uses the default multicast list.

  • When an isid-policy is configured and a range of ISIDs use the default multicast list, the remote PBB-EVPN PEs are added to the default multicast list as long as they advertise an IMET route with an ISID included in the policy's ISID range. PEs advertising IMET routes with Ethernet-tag = 0 are also added to the default multicast list (7750 SR, 7450 ESS, or 7950 XRS SR OS behavior).

  • The B-VPLS 10 also allows the ISID flooding to legacy PBB networks via B-SAPs or B-SDPs. The legacy PBB network B-MACs are dynamically learned on those SAPs/binds or statically configured through the use of conditional static-macs. The use of static-isids is required so that non-local ISIDs are advertised.

The following example shows a B-SAP and a B-SDP binding with a conditional static-mac and static-isids:

sap 1/1/1:1 create 
exit
spoke-sdp 1:1 create
  static-mac
      mac 00:fe:ca:fe:ca:fe create sap 1/1/1:1 monitor fwd-status
  static-isid
      range 1 isid 3000 to 5000 create
Note: The configuration of PBB-Epipes does not trigger any IMET advertisement.
PBB-EVPN and PBB-VPLS integration

The 7750 SR, 7450 ESS, and 7950 XRS SR OS EVPN implementation supports RFC 8560 so that PBB-EVPN and PBB-VPLS can be integrated into the same network and within the same B-VPLS service.

All the concepts described in section EVPN and VPLS integration are also supported in B-VPLS services so that B-VPLS SAP or SDP bindings can be integrated with PBB-EVPN destination bindings. The features described in that section also facilitate a smooth migration from B-VPLS SDP bindings to PBB-EVPN destination bindings.

PBB-EVPN multihoming in I-VPLS and PBB Epipe services

The 7750 SR, 7450 ESS, and 7950 XRS SR OS PBB-EVPN implementation supports all-active and single-active multihoming for I-VPLS and PBB Epipe services.

PBB-EVPN multihoming reuses the ethernet-segment concept described in section EVPN multihoming in VPLS services. However, unlike EVPN-MPLS, PBB-EVPN does not use AD routes; it uses B-MACs for split-horizon checks and aliasing.

System B-MAC assignment in PBB-EVPN

RFC 7623 describes two types of B-MAC assignments that a PE can implement:

  • shared B-MAC addresses that can be used for single-homed CEs and a number of multihomed CEs connected to Ethernet-Segments

  • dedicated B-MAC addresses per Ethernet-Segment

In this document and in 7750 SR, 7450 ESS, and 7950 XRS terminology:

  • A shared-bmac (in IETF) is a source-bmac as configured in service>(b)vpls>pbb>source-bmac

  • A dedicated-bmac per ES (in IETF) is an es-bmac as configured in service>pbb>use-es-bmac

B-MAC selection and use depend on the multihoming model; for single-active mode, the type of B-MAC affects the flooding in the network as follows:

  • All-active multihoming requires es-bmacs.

  • Single-active multihoming can use es-bmacs or source-bmacs.

    • The use of source-bmacs minimizes the number of B-MACs being advertised but has a larger impact on C-MAC flush upon ES failures.

    • The use of es-bmacs optimizes the C-MAC flush upon ES failures at the expense of advertising more B-MACs.

PBB-EVPN all-active multihoming service model

PBB-EVPN all-active multihoming shows the use of all-active multihoming in the 7750 SR, 7450 ESS, and 7950 XRS SR OS PBB-EVPN implementation.

Figure 38. PBB-EVPN all-active multihoming

For example, the following shows the ESI-1 and all-active configuration in PE3 and PE4. As in EVPN-MPLS, all-active multihoming is only possible if a LAG is used at the CE. All-active multihoming uses es-bmacs; that is, each ESI is assigned a dedicated B-MAC, and all the PEs that are part of the ES source traffic using the same es-bmac.

In PBB-EVPN all-active multihoming and the following configuration, the es-bmac used by PE3 and PE4 is B-MAC-34 (for example, 00:00:00:00:00:34). The es-bmac for a specified ethernet-segment is configured by the source-bmac-lsb along with the (b-)vpls>pbb>use-es-bmac command.

Configuration in PE3:

*A:PE3>config>lag(1)# info 
----------------------------------------------
  mode access
  encap-type dot1q
  port 1/1/1
  lacp active administrative-key 32768
  no shutdown

*A:PE3>config>service>system>bgp-evpn# info 
----------------------------------------------
  route-distinguisher 10.3.3.3:0
  ethernet-segment ESI-1 create
    esi 00:34:34:34:34:34:34:34:34:34
    multi-homing all-active
    service-carving auto
    lag 1
    source-bmac-lsb 00:34 es-bmac-table-size 8
    no shutdown 

*A:PE3>config>service>vpls 1(b-vpls)# info 
----------------------------------------------
  bgp    
  exit    
  bgp-evpn
    evi 1
    mpls bgp 1
      no shutdown
      ecmp 2
      auto-bind-tunnel resolution any
  exit    
  pbb   
    source-bmac 00:00:00:00:00:03
    use-es-bmac

*A:PE3>config>service>vpls (i-vpls)# info 
----------------------------------------------
  pbb
    backbone-vpls 1
  sap lag-1:101 create

*A:PE3>config>service>epipe (pbb)# info 
----------------------------------------------
  pbb
    tunnel 1 backbone-dest-mac 00:00:00:00:00:01 isid 102
  sap lag-1:102 create

Configuration in PE4:

*A:PE4>config>lag(1)# info 
----------------------------------------------
  mode access
  encap-type dot1q
  port 1/1/1
  lacp active administrative-key 32768
  no shutdown

*A:PE4>config>service>system>bgp-evpn# info 
----------------------------------------------
  route-distinguisher 10.4.4.4:0
  ethernet-segment ESI-1 create
    esi 00:34:34:34:34:34:34:34:34:34
    multi-homing all-active
    service-carving auto
    lag 1
    source-bmac-lsb 00:34 es-bmac-table-size 8
    no shutdown 


*A:PE4>config>service>vpls 1(b-vpls)# info 
----------------------------------------------
  bgp    
  exit    
  bgp-evpn
    evi 1
    mpls bgp 1
      no shutdown
      ecmp 2
      auto-bind-tunnel resolution any
  exit    
  pbb   
    source-bmac 00:00:00:00:00:04
    use-es-bmac

*A:PE4>config>service>vpls (i-vpls)# info 
----------------------------------------------
  pbb
    backbone-vpls 1
  sap lag-1:101 create

*A:PE4>config>service>epipe (pbb)# info 
----------------------------------------------
  pbb
    tunnel 1 backbone-dest-mac 00:00:00:00:00:01 isid 102
  sap lag-1:102 create

The above configuration enables the all-active multihoming procedures for PBB-EVPN.

Note: The ethernet-segment ESI-1 can also be used for regular VPLS services.

The following considerations apply when the ESI is used for PBB-EVPN:

  • ESI association

    Only LAG is supported for all-active multihoming. The following commands are used for the LAG to ESI association:

    • config>service>system>bgp-evpn>ethernet-segment# lag <id>

    • config>service>system>bgp-evpn>ethernet-segment# source-bmac-lsb <MAC-lsb> [es-bmac-table-size <size>]

    • Where:

      • The same ESI may be used for EVPN and PBB-EVPN services.

      • For PBB-EVPN services, the source-bmac-lsb attribute is mandatory; it is ignored for EVPN-MPLS services.

      • The source-bmac-lsb attribute must be set to a specific 2-byte value. The value must match on all the PEs part of the same ESI, for example, PE3 and PE4 for ESI-1. This means that the configured pbb>source-bmac on the two PEs for B-VPLS 1 must have the same 4 most significant bytes.

      • The es-bmac-table-size parameter modifies the default value (8) for the maximum number of virtual B-MACs (that is, es-bmacs) that can be associated with the ethernet-segment. When the source-bmac-lsb is configured, the associated es-bmac-table-size is reserved out of the total FDB space.

      • When multi-homing all-active is configured within the ethernet-segment, only a LAG can be associated with it. The association of a port or an sdp is restricted by the CLI.

  • service-carving

    If service-carving is configured in the ESI, the DF election algorithm is a modulo function of the ISID and the number of PEs part of the ESI, as opposed to a modulo function of evi and number of PEs (used for EVPN-MPLS).

  • service-carving mode manual

    A service-carving mode manual option is added so that the user can control what PE is DF for a specified ISID. The PE is DF for the configured ISIDs and non-DF for the non-configured ISIDs.

  • DF election

    An all-active Designated Forwarder (DF) election is also carried out for PBB-EVPN. In this case, the DF election defines which of the PEs of the ESI for a specified I-VPLS is the one able to send the downstream BUM traffic to the CE. Only one DF per ESI is allowed in the I-VPLS service, and the non-DF blocks BUM traffic only in the downstream direction.

  • split-horizon function

    In PBB-EVPN, the split-horizon function to avoid echoed packets on the CE is based on an ingress lookup of the ES B-MAC (as opposed to the ESI label in EVPN-MPLS). In PBB-EVPN all-active multihoming, PE3 sends packets using B-MAC SA = BMAC-34. PE4 does not send those packets back to the CE because BMAC-34 is identified as the es-bmac for ESI-1.

  • aliasing

    In PBB-EVPN, aliasing is based on the ES B-MAC sent by all the PEs that are part of the same ESI. See the following section for more information. In PBB-EVPN all-active multihoming, PE1 performs load balancing between PE3 and PE4 when sending unicast flows to BMAC-34 (es-bmac for ESI-1).
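Under the rules above, the es-bmac derivation and the per-ISID DF election can be sketched as follows; this is an illustrative model, not SR OS internals, and the helper names are hypothetical:

```python
def es_bmac(source_bmac: str, lsb: str) -> str:
    """Build the es-bmac from the 4 high-order bytes of the configured
    pbb>source-bmac plus the 2-byte source-bmac-lsb; all PEs in the ES
    must share the same 4 high-order bytes."""
    return ":".join(source_bmac.split(":")[:4] + lsb.split(":"))

def df_for_isid(isid: int, candidate_pes: list) -> str:
    """service-carving auto for PBB-EVPN: a modulo function of the
    ISID and the number of PEs that are part of the ES (candidates
    ordered consistently on all PEs, for example by IP address)."""
    return sorted(candidate_pes)[isid % len(candidate_pes)]

# PE3 (source-bmac 00:00:00:00:00:03) and PE4 (00:00:00:00:00:04)
# both configure source-bmac-lsb 00:34 and derive the same es-bmac:
assert es_bmac("00:00:00:00:00:03", "00:34") == "00:00:00:00:00:34"
assert es_bmac("00:00:00:00:00:04", "00:34") == "00:00:00:00:00:34"

# Consecutive ISIDs carve to different DFs across the two PEs:
pes = ["10.3.3.3", "10.4.4.4"]
assert df_for_isid(101, pes) != df_for_isid(102, pes)
```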

In the configuration above, a PBB-Epipe is configured in PE3 and PE4, both pointing at the same remote pbb tunnel backbone-dest-mac. On the remote PE, for example PE1, the configuration of the PBB-Epipe points at the es-bmac:

*A:PE1>config>service>epipe (pbb)# info 
----------------------------------------------
  pbb
    tunnel 1 backbone-dest-mac 00:00:00:00:00:34 isid 102
  sap 1/1/1:102 create

When PBB-Epipes are used in combination with all-active multihoming, Nokia recommends using bgp-evpn mpls ingress-replication-bum-label in the PEs where the ethernet-segment is created, that is, in PE3 and PE4. This guarantees that, in case of flooding in the B-VPLS service for the PBB Epipe, only the DF forwards the traffic to the CE.

Note: The PBB-Epipe traffic always uses B-MAC DA = unicast; therefore, the DF cannot check whether the inner frame is unknown unicast or not based on the group B-MAC. Therefore, the use of an EVPN BUM label is highly recommended.

Aliasing for PBB-Epipes with all-active multihoming only works if shared-queuing or ingress policing is enabled on the ingress PE Epipe. In any other case, the IOM sends the traffic to a single destination (no ECMP is used, despite the bgp-evpn mpls ecmp setting).

All-active multihomed es-bmacs are treated by the remote PEs as eES:MAX-ESI BMACs. The following example shows the FDB in B-VPLS 1 in PE1 as shown in PBB-EVPN all-active multihoming:

*A:PE1# show service id 1 fdb detail 

===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age      
-------------------------------------------------------------------------------
1         00:00:00:00:00:03 eMpls:                   EvpnS    06/12/15 15:35:39
                            192.0.2.3:262138
1         00:00:00:00:00:04 eMpls:                   EvpnS    06/12/15 15:42:52
                            192.0.2.4:262130
1         00:00:00:00:00:34 eES:                     EvpnS    06/12/15 15:35:57
                            MAX-ESI
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

The show service id evpn-mpls on PE1 shows that the remote es-bmac (that is, 00:00:00:00:00:34) has two associated next-hops (for example, PE3 and PE4):

*A:PE1# show service id 1 evpn-mpls 

===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address     Egr Label     Num. MACs   Mcast           Last Change
                 Transport                                
-------------------------------------------------------------------------------
192.0.2.3       262138        1           Yes             06/12/2015 15:34:48
                ldp                                        
192.0.2.4       262130        1           Yes             06/12/2015 15:34:48
                ldp                                        
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================

===============================================================================
BGP EVPN-MPLS Ethernet Segment Dest
===============================================================================
Eth SegId                     TEP Address     Egr Label   Last Change
                                               Transport  
-------------------------------------------------------------------------------
No Matching Entries
===============================================================================

===============================================================================
BGP EVPN-MPLS ES BMAC Dest
===============================================================================
VBMacAddr                   TEP Address     Egr Label     Last Change
                                             Transport    
-------------------------------------------------------------------------------
00:00:00:00:00:34           192.0.2.3       262138        06/12/2015 15:34:48
                                            ldp            
00:00:00:00:00:34           192.0.2.4       262130        06/12/2015 15:34:48
                                            ldp            
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
=============================================================================== 
Network failures and convergence for all-active multihoming

ES failures are resolved by the PEs withdrawing the es-bmac. The remote PEs withdraw the route and update their list of next-hops for a specified es-bmac.

No mac-flush of the I-VPLS FDB tables is required as long as the es-bmac is still in the FDB.

When the route corresponding to the last next-hop for a specified es-bmac is withdrawn, the es-bmac is flushed from the B-VPLS FDB and all the C-MACs associated with it are flushed too.
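The convergence behavior above can be modeled as follows; the FDB structures and function name are hypothetical simplifications:

```python
def withdraw_es_bmac_nexthop(bvpls_fdb, ivpls_fdb, es_bmac, next_hop):
    """Remove a withdrawn next hop for an es-bmac; flush the es-bmac
    and every C-MAC pointing at it only when the last next hop for
    that es-bmac is gone."""
    hops = bvpls_fdb.get(es_bmac, [])
    if next_hop in hops:
        hops.remove(next_hop)
    if not hops:
        bvpls_fdb.pop(es_bmac, None)
        # Flush all C-MACs resolved through this es-bmac.
        for cmac in [c for c, b in ivpls_fdb.items() if b == es_bmac]:
            del ivpls_fdb[cmac]

bvpls = {"00:00:00:00:00:34": ["PE3", "PE4"]}
ivpls = {"AA": "00:00:00:00:00:34"}
withdraw_es_bmac_nexthop(bvpls, ivpls, "00:00:00:00:00:34", "PE3")
assert ivpls == {"AA": "00:00:00:00:00:34"}  # es-bmac still in FDB: no flush
withdraw_es_bmac_nexthop(bvpls, ivpls, "00:00:00:00:00:34", "PE4")
assert ivpls == {} and bvpls == {}           # last next hop gone: flush
```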

The following events trigger a withdrawal of the es-bmac and the corresponding next-hop update in the remote PEs:

  • B-VPLS transition to operationally down status.

  • Change of pbb>source-bmac.

  • Change of es-bmac (or removal of pbb use-es-bmac).

  • Ethernet-segment transition to operationally down status.

Note: Individual SAPs going operationally down in an ES do not generate any BGP withdrawal or indication so that the remote nodes can flush their C-MACs. This is solved in EVPN-MPLS by the use of AD routes per EVI; however, there is nothing similar in PBB-EVPN for indicating a partial failure in an ESI.
PBB-EVPN single-active multihoming service model

In single-active multihoming, the non-DF PEs for a specified ESI block unicast and BUM traffic in both directions (upstream and downstream) on the object associated with the ESI. Other than that, single-active multihoming follows the same service model defined in the PBB-EVPN all-active multihoming service model section with the following differences:

  • The ethernet-segment is configured for single-active: service>system>bgp-evpn>eth-seg>multi-homing single-active.

  • For single-active multihoming, the ethernet-segment can be associated with a port or an sdp, as well as a lag.

  • From a service perspective, single-active multihoming can provide redundancy to the following services and access types:

    • I-VPLS LAG and regular SAPs

    • I-VPLS active/standby spoke SDPs

    • EVPN single-active multihoming is supported for PBB-Epipes only in two-node scenarios with local switching.

  • While all-active multihoming only uses es-bmac assignment to the ES, single-active multihoming can use source-bmac or es-bmac assignment. The system allows the following user choices per B-VPLS and ES:

    • A dedicated es-bmac per ES can be used. In that case, the pbb use-es-bmac command is configured in the B-VPLS and the same procedures described in PBB-EVPN all-active multihoming service model apply, with one difference: while in all-active multihoming all the PEs that are part of the ESI source the PBB packets with the same source es-bmac, single-active multihoming requires the use of a different es-bmac per PE.

    • A non-dedicated source-bmac can be used. In this case, the user does not configure pbb>use-es-bmac and the regular source-bmac is used for the traffic. A different source-bmac has to be advertised per PE.

    • The use of source-bmacs or es-bmacs for single-active multihomed ESIs has a different impact on C-MAC flushing, as shown in Source-bmac versus es-bmac C-MAC flushing.

    Figure 39. Source-bmac versus es-bmac C-MAC flushing
    • If es-bmacs are used, as shown in the representation on the right in Source-bmac versus es-bmac C-MAC flushing, a less-impacting C-MAC flush is achieved, minimizing the flooding after ESI failures. In case of ESI failure, PE1 withdraws the es-bmac 00:12 and the remote PE3 only flushes the C-MACs associated with that es-bmac (only the C-MACs behind the CE are flushed).

    • If source-bmacs are used, as shown on the left side of Source-bmac versus es-bmac C-MAC flushing , in case of ES failure, a BGP update with higher sequence number is issued by PE1 and the remote PE3 flushes all the C-MACs associated with the source-bmac. Therefore, all the C-MACs behind the PE's B-VPLS are flushed, as opposed to only the C-MACs behind the ESI's CE.

  • As in EVPN-MPLS, the non-DF status can be notified to the access CE or network:

    • LAG with or without LACP

      In this case, the multihomed ports on the CE are not part of the same LAG. The non-DF PE for each service may signal that the LAG SAP is operationally down by using eth-cfm fault-propagation-enable {use-if-tlv|suspend-ccm}.

    • regular Ethernet 802.1q/ad ports

      In this case, the multihomed ports on the CE/network are not part of any LAG. The non-DF PE for each service signals that the SAP is operationally down by using eth-cfm fault-propagation-enable {use-if-tlv|suspend-ccm}.

    • active-standby PWs

      In this case, the multihomed CE/network is connected to the PEs through an MPLS network and an active/standby spoke SDP per service. The non-DF PE for each service makes use of the LDP PW status bits to signal that the spoke SDP is standby at the PE side. Nokia recommends that the CE suppresses the signaling of PW status standby.

Network failures and convergence for single-active multihoming

ESI failures are resolved depending on the B-MAC address assignment chosen by the user:

  • If the B-MAC address assignment is based on the use of es-bmacs, both the DF and the non-DF PEs advertise the es-bmac with ESI = 0 for a specified ESI. Each PE has a different es-bmac for the same ESI (as opposed to the same es-bmac on all the PEs for all-active multihoming).

    In case of an ESI failure, the PE withdraws its es-bmac route triggering a mac-flush of all the C-MACs associated with it in the remote PEs.

  • If the B-MAC address assignment is based on the use of source-bmac, DF and non-DFs advertise their respective source-bmacs. In case of an ES failure:

    • The PE that detects the failure re-advertises its source-bmac with a higher sequence number (the new DF does not re-advertise its source-bmac).

    • The far-end PEs interpret a source-bmac advertisement with a different sequence number as a flush-all-from-me message from the PE detecting the failure. They flush all the C-MACs associated with that B-MAC in all the ISID services.
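The flush-all-from-me processing on a receiving PE can be sketched as follows; this is an illustrative model with hypothetical names, not SR OS code:

```python
def on_bmac_update(bmac_seq, ivpls_fdbs, bmac, new_seq):
    """On a source-bmac re-advertisement, compare sequence numbers;
    a higher value is a flush-all-from-me message, flushing the
    C-MACs tied to that B-MAC in all the ISID services."""
    if new_seq > bmac_seq.get(bmac, 0):
        bmac_seq[bmac] = new_seq
        for fdb in ivpls_fdbs.values():
            for cmac in [c for c, b in fdb.items() if b == bmac]:
                del fdb[cmac]

seqs = {"00:00:00:00:00:01": 0}
fdbs = {101: {"AA": "00:00:00:00:00:01", "BB": "00:00:00:00:00:02"},
        102: {"CC": "00:00:00:00:00:01"}}
on_bmac_update(seqs, fdbs, "00:00:00:00:00:01", 1)
# Only the C-MACs behind B-MAC 00:...:01 are flushed, in every ISID:
assert fdbs == {101: {"BB": "00:00:00:00:00:02"}, 102: {}}
```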

The following events trigger a C-MAC flush notification. A 'C-MAC flush notification' means the withdrawal of a specified B-MAC or the update of B-MAC with a higher sequence number (SQN). Both BGP messages make the remote PEs flush all the C-MACs associated with the indicated B-MAC:

  • B-VPLS transition to operationally down status. This triggers the withdrawal of the associated B-MACs, regardless of the use-es-bmac setting.

  • Change of pbb>source-bmac. This triggers the withdrawal and re-advertisement of the source-bmac, causing the corresponding C-MAC flush in the remote PEs.

  • Change of es-bmac (removal of pbb use-es-bmac). This triggers the withdrawal of the es-bmac and re-advertisement of the new es-bmac.

  • Ethernet-Segment (ES) transition to operationally down or admin-down status. This triggers an es-bmac withdrawal (if use-es-bmac is used) or an update of the source-bmac with a higher SQN (if no use-es-bmac is used).

  • Service Carving Range change for the ES. This triggers an es-bmac update with higher SQN (if use-es-bmac is used) or an update of the source-bmac with a higher SQN (if no use-es-bmac is used).

  • Change in the number of candidate PEs for the ES. This triggers an es-bmac update with higher SQN (if use-es-bmac is used) or an update of the source-bmac with a higher SQN (if no use-es-bmac is used).

Note: In an ESI, individual SAPs/SDP bindings or individual I-VPLS services going operationally down do not generate any BGP withdrawal or indication so that the remote nodes can flush their C-MACs. This is solved in EVPN-MPLS by the use of AD routes per EVI; however, there is nothing similar in PBB-EVPN for indicating a partial failure in an ESI.

PBB-Epipes and EVPN multihoming

EVPN multihoming is supported with PBB-EVPN Epipes, but only in a limited number of scenarios. In general, the following applies to PBB-EVPN Epipes:

  • PBB-EVPN Epipes do not support spoke SDPs that are associated with EVPN ESs.

  • PBB-EVPN Epipes support all-active EVPN multihoming as long as no local-switching is required in the Epipe instance where the ES is defined.

  • PBB-EVPN Epipes support single-active EVPN multihoming only in a two-node case scenario.

PBB-EVPN MH in a three-node scenario shows the EVPN MH support in a three-node scenario.

Figure 40. PBB-EVPN MH in a three-node scenario

EVPN MH support in a three-node scenario has the following characteristics:

  • All-active EVPN multihoming is fully supported (diagram on the left in PBB-EVPN MH in a three-node scenario). CE1 may also be multihomed to other PEs, as long as those PEs are not PE2 or PE3. In this case, PE1 Epipe's pbb-tunnel would be configured with the remote ES B-MAC.

  • Single-active EVPN multihoming is not supported in a three (or more)-node scenario (diagram on the right in PBB-EVPN MH in a three-node scenario). Because PE1's Epipe pbb-tunnel can only point at a single remote B-MAC and single-active multihoming requires the use of separate B-MACs on PE2 and PE3, the scenario is not possible and not supported regardless of the ES association to port/LAG/sdps.

  • Regardless of the EVPN multihoming type, the CLI prevents the user from adding a spoke SDP to an Epipe, if the corresponding SDP is part of an ES.

PBB-EVPN MH in a two-node scenario shows the EVPN MH support in a two-node scenario.

Figure 41. PBB-EVPN MH in a two-node scenario

EVPN MH support in a two-node scenario has the following characteristics, as shown in PBB-EVPN MH in a two-node scenario:

  • All-active multihoming is not supported for redundancy in this scenario because PE1's pbb-tunnel cannot point at a locally defined ES B-MAC. This is represented in the left-most scenario in PBB-EVPN MH in a two-node scenario.

  • Single-active multihoming is supported for redundancy in a two-node, three or four SAP scenario, as displayed by the two right-most scenarios in PBB-EVPN MH in a two-node scenario.

    In these two cases, the Epipe pbb-tunnel is configured with the source B-MAC of the remote PE node.

    When two SAPs are active in the same Epipe, local-switching is used to exchange frames between the CEs.

PBB-EVPN and use of P2MP mLDP tunnels for default multicast list

P2MP mLDP tunnels can also be used in PBB-EVPN services. The use of provider-tunnel inclusive MLDP is only supported in the B-VPLS default multicast list; that is, no per-ISID IMET-P2MP routes are supported. IMET-P2MP routes in a B-VPLS are always advertised with Ethernet tag zero. All-active EVPN multihoming is supported in PBB-EVPN services together with P2MP mLDP tunnels; however, single-active multihoming is not supported. This capability is only required on the P2MP root PEs within PBB-EVPN services using all-active multihoming.

B-VPLS supports the use of MFIBs for ISIDs using ingress replication. The following considerations apply when provider-tunnel is enabled in a B-VPLS service:

  • Local I-VPLS or static-ISIDs configured on the B-VPLS generate IMET-IR routes and MFIBs are created per ISID by default.

  • The default IMET-P2MP or IMET-P2MP-IR route sent with Ethernet-tag = 0 is issued depending on the ingress-repl-inc-mcast-advertisement command.

  • The following considerations apply if an isid-policy is configured in the B-VPLS.

    • A range of ISIDs configured with use-def-mcast make use of the P2MP tree, assuming the node is configured as root-and-leaf.

    • A range of ISIDs configured with advertise-local make the system advertise IMET-IR routes for the local ISIDs included in the range.

The following example CLI output shows a range of ISIDs (1000 to 2000) that uses the P2MP tree; the system does not advertise the IMET-IR routes for those ISIDs. Other local ISIDs advertise the IMET-IR routes and use the MFIB to forward BUM packets to the EVPN-MPLS destinations created by the IMET-IR routes.

*A:PE-1>config>service>vpls(b-vpls)# info 
----------------------------------------------
  service-mtu 2000
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      auto-bind-tunnel resolution any
  isid-policy
    entry 10 create
      use-def-mcast
      no advertise-local
      range 1000 to 2000
    exit
  exit
  provider-tunnel
    inclusive
      owner bgp-evpn-mpls
      root-and-leaf
      mldp
      no shutdown
    exit
  exit
  sap 1/1/1:1 create
  exit
  spoke-sdp 1:1 create
  exit
PBB-EVPN ISID-based C-MAC flush

SR OS supports ISID-based C-MAC flush procedures for PBB-EVPN I-VPLS services where no single-active ESs are used, and where other redundancy mechanisms, such as BGP-MH, need these procedures to avoid blackholes caused by a SAP or spoke-SDP failure.

The C-MAC flush procedures are enabled on the I-VPLS service using the config>service>vpls>pbb>send-bvpls-evpn-flush CLI command. The feature can be disabled on a per-SAP or per-spoke SDP basis by using the disable-send-bvpls-evpn-flush command in the config>service>vpls>sap or config>service>vpls>spoke-sdp context.

With the feature enabled on an I-VPLS service and a SAP or spoke SDP, if there is a SAP or spoke SDP failure, the router sends a C-MAC flush notification for the corresponding B-MAC and ISID. The router receiving the notification flushes all the C-MACs associated with the indicated B-MAC and ISID when the config>service>vpls>bgp-evpn>accept-ivpls-evpn-flush command is enabled for the B-VPLS service.

The C-MAC flush notification consists of an EVPN B-MAC route that is encoded as follows: the ISID to be flushed is encoded in the Ethernet Tag field and the sequence number is incremented with respect to the previously advertised route.
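The encoding described above can be modeled with the following illustrative sketch (field names are hypothetical; on the wire this is an RFC 7432 MAC/IP Advertisement route carrying the B-MAC, with the ISID in the Ethernet Tag field and the sequence number in the MAC Mobility extended community):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BMacIsidRoute:
    """Illustrative model of the EVPN B-MAC route used for C-MAC flush."""
    bmac: str
    ethernet_tag: int   # ISID to flush (0 in regular B-MAC routes)
    sequence: int = 0   # first advertisement uses sequence number zero

    def flush_update(self) -> "BMacIsidRoute":
        # A flush is signaled by re-advertising the same B-MAC/ISID
        # with an incremented sequence number.
        return BMacIsidRoute(self.bmac, self.ethernet_tag, self.sequence + 1)

route = BMacIsidRoute(bmac="00:00:00:00:00:01", ethernet_tag=3000)
flush = route.flush_update()
print(flush.sequence)   # 1
```

A receiver that has accept-ivpls-evpn-flush enabled would, on seeing the higher sequence number, flush all C-MACs associated with that B-MAC and ISID.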

If send-bvpls-evpn-flush is configured on an I-VPLS with SAPs or spoke SDPs, one of the following rules must be observed:

  • The disable-send-bvpls-evpn-flush option is configured on the SAPs or spoke SDPs.

  • The SAPs or spoke SDPs are not on an ES.

  • The SAPs or spoke SDPs are on an ES or vES with no src-bmac-lsb enabled.

  • The no use-es-bmac is enabled on the B-VPLS.

ISID-based C-MAC flush can be enabled in I-VPLS services with ES or vES. If enabled, the expected interaction between the RFC 7623-based C-MAC flush and the ISID-based C-MAC flush is as follows.

  • If send-bvpls-evpn-flush is enabled in an I-VPLS service, the ISID-based C-MAC flush overrides (replaces) the RFC 7623-based C-MAC flushing.

  • For each ES, vES, or B-VPLS, the system checks for at least one I-VPLS service that does not have send-bvpls-evpn-flush enabled.

    • If ISID-based C-MAC flush is enabled for all I-VPLS services, RFC 7623-based C-MAC flushing is not triggered; only ISID-based C-MAC flush notifications are generated.

    • If at least one I-VPLS service is found with no ISID-based C-MAC flush enabled, then RFC 7623-based C-MAC flushing notifications are triggered based on ES events.

      ISID-based C-MAC flush notifications are also generated for I-VPLS services that have send-bvpls-evpn-flush enabled.

Per-ISID C-MAC flush following a SAP failure shows an example where the ISID-based C-MAC flush prevents blackhole situations for a CE that is using BGP-MH as the redundancy mechanism in the I-VPLS with an ISID of 3000.

Figure 42. Per-ISID C-MAC flush following a SAP failure

When send-bvpls-evpn-flush is enabled, the I-VPLS service is ready to send per-ISID C-MAC flush messages in the form of B-MAC/ISID routes. The first B-MAC/ISID route for an I-VPLS service is sent with sequence number zero; subsequent updates for the same route increment the sequence number. A B-MAC/ISID route for the I-VPLS is advertised or withdrawn during the following cases:

  • I-VPLS send-bvpls-evpn-flush configuration and deconfiguration

  • I-VPLS association and disassociation from the B-VPLS service

  • I-VPLS operational status change (up/down)

  • B-VPLS operational status change (up/down)

  • B-VPLS bgp-evpn mpls status change (no shutdown/shutdown)

  • B-VPLS operational source B-MAC change

If no disable-send-bvpls-evpn-flush is configured for a SAP or spoke SDP, upon a failure on that SAP or spoke SDP, the system sends a per-ISID C-MAC flush message; that is, a B-MAC/ISID route update with an incremented sequence number.

If the user explicitly configures disable-send-bvpls-evpn-flush for a SAP or spoke SDP, the system does not send per-ISID C-MAC flush messages for failures on that SAP or spoke SDP.

The B-VPLS on the receiving node must be configured with bgp-evpn>accept-ivpls-evpn-flush to accept and process C-MAC flush non-zero Ethernet-tag MAC routes. If the accept-ivpls-evpn-flush command is enabled (the command is disabled by default), the node accepts non-zero Ethernet-tag MAC routes (B-MAC/ISID routes) and processes them. When a new B-MAC/ISID update (with an incremented sequence number) for an existing route is received, the router flushes all the C-MACs associated with that B-MAC and ISID. The B-MAC/ISID route withdrawals also cause a C-MAC flush.

Note: Only B-MAC routes with the Ethernet Tag field set to zero are considered for B-MAC installation in the FDB.

The following CLI example shows the commands that enable the C-MAC flush feature on PE1 and PE3.

*A:PE-1>config>service>vpls(i-vpls)# info 
----------------------------------------------
  pbb
    backbone-vpls 10
    send-bvpls-evpn-flush
    exit
  exit
  bgp
    route-distinguisher 65000:1
    vsi-export "vsi_export"
    vsi-import "vsi_import"
  exit
  site "CE-1" create
    site-id 1
    sap lag-1:1
    site-activation-timer 3
    no shutdown
  exit
  sap lag-1:1 create
    no disable-send-bvpls-evpn-flush
    no shutdown
    exit
<snip>
*A:PE-3>config>service>vpls(b-vpls 10)# info 
----------------------------------------------
<snip>
  bgp-evpn
    accept-ivpls-evpn-flush

In the preceding example, with send-bvpls-evpn-flush enabled on the I-VPLS service of PE1, a B-MAC/ISID route (for the PBB source B-MAC address 00:..:01 and ISID 3000) is advertised. If the SAP goes operationally down, PE1 sends an update of the source B-MAC address (00:..:01) for ISID 3000 with a higher sequence number.

With accept-ivpls-evpn-flush enabled on PE3's B-VPLS service, PE3 flushes all C-MACs associated with B-MAC 00:..:01 and ISID 3000. The C-MACs associated with other B-MACs or ISIDs are retained in PE3's FDB.

PBB-EVPN ISID-based route targets

Routers with PBB-EVPN services use the following route types to advertise the ISID of a specific service:

  • Inclusive Multicast Ethernet Tag routes (IMET-ISID routes) are used to auto-discover ISIDs in the PBB-EVPN network. The routes encode the service ISID in the Ethernet Tag field.

  • BMAC-ISID routes are only used when ISID-based C-MAC flush is configured. The routes encode the ISID in the Ethernet Tag field.

Although the preceding routes are only relevant for routers where the advertised ISID is configured, they are sent with the B-VPLS route-target by default. As a result, the routes are unnecessarily disseminated to all the routers in the B-VPLS network.

SR OS supports the use of per-ISID or group of ISID route-targets, which limits the dissemination of IMET-ISID or BMAC-ISID routes for a specific ISID to the PEs where the ISID is configured.

The config>service>(b-)vpls>isid-route-target>isid-range from [to to] {auto-rt | route-target rt} command allows the user to choose whether the IMET-ISID and BMAC-ISID routes are sent with the B-VPLS route-target (the default, when no command is configured) or with a route-target specific to the ISID or range of ISIDs.

The following configuration example shows how to configure ISID ranges as auto-rt or with a specific route-target.

*A:PE-3>config>service>(b-)vpls>bgp-evpn# 
isid-route-target 
  [no] isid-range <from> [to <to>] {auto-rt|route-target <rt>}

# For example:
*A:PE-3>config>service>(b-)vpls>bgp-evpn# 
isid-route-target
  isid-range 1000 to 1999 auto-rt
  isid-range 2000 route-target target:65000:2000

The auto-rt option auto-derives a route-target per ISID in the following format:

<2-byte-as-number>:<4-byte-value>

where the 4-byte value carries 0x30 in the high-order byte followed by the 24-bit ISID (that is, 0x30000000 + ISID), as described in RFC 7623. PBB-EVPN auto-rt ISID-based route target format shows the format of the auto-rt option.

Figure 43. PBB-EVPN auto-rt ISID-based route target format

Where:

  • The 2-byte AS number is obtained from the config>router>autonomous-system command. If the configured AS number exceeds the 2-byte limit, the low-order 16-bit value is used.

  • A = 0 for auto-derivation

  • Type = 3, which corresponds to an ISID-type route-target

  • ISID is the 24-bit ISID

  • The type and sub-type are 0x00 and 0x02.
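The auto-derivation above can be sketched in Python (illustrative only; the router performs this derivation internally):

```python
def auto_rt(asn: int, isid: int) -> str:
    """Auto-derive a per-ISID route target.

    The extended community is type 0x00, sub-type 0x02 (two-octet-AS
    route target). The 2-byte AS keeps only the low-order 16 bits of
    the configured AS number; the 4-byte value places 0x30 (A=0,
    Type=3) in the high-order byte and the 24-bit ISID in the lower
    3 bytes.
    """
    if not 0 <= isid <= 0xFFFFFF:
        raise ValueError("ISID must fit in 24 bits")
    as2 = asn & 0xFFFF               # low-order 16 bits if AS > 2 bytes
    value = (0x30 << 24) | isid      # 0x30000000 + ISID
    return f"target:{as2}:{value}"

# ISID 3000 under AS 65000 -> value 0x30000000 + 3000 = 805309368
print(auto_rt(65000, 3000))   # target:65000:805309368
```

For example, an I-VPLS with ISID 3000 in AS 65000 would auto-derive the route target target:65000:805309368.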

If isid-route-target is enabled, the export and import directions for IMET-ISID and BMAC-ISID route processing are modified as follows:

  • Exported IMET-ISID and BMAC-ISID routes

    • For local I-VPLS ISIDs and static ISIDs, IMET-ISID routes are sent individually with an ISID-based route-target (and without a B-VPLS route-target) unless the ISID is contained in an ISID policy for which no advertise-local is configured.

    • If both isid-route-target and send-bvpls-evpn-flush options are enabled for an I-VPLS, the BMAC-ISID route is also sent with the ISID-based route-target and no B-VPLS route-target.

    • The isid-route-target command affects the IMET-ISID and BMAC-ISID routes only. The BMAC-0, IMET-0 (B-MAC and IMET routes with Ethernet Tag == 0), and ES routes are not impacted by the command.

  • Imported IMET-ISID and BMAC-ISID routes

    • Upon enabling isid-route-target for a specific I-VPLS, the BGP starts importing IMET-ISID routes with ISID-based route-targets, and (assuming the bgp-evpn accept-ivpls-evpn-flush option is enabled) BMAC-ISID routes with ISID-based route-targets.

    • The new ISID-based RTs are added for import operations when the I-VPLS is associated with the B-VPLS service (and not based on the I-VPLS operational status), or when the static-isid is added.

    • The system does not maintain a mapping of the route-targets and ISIDs for the imported routes. For example, if I-VPLS 1 and 2 are configured with the isid-route-target option and IMET-ISID=2 route is received with a route-target corresponding to ISID=1, then BGP imports the route and the router processes it.

    • The router does not check the format of the received auto-derived route-targets. The route is imported as long as the route-target is on the list of RTs for the B-VPLS.

  • If the isid-route-target option is configured for one or more I-VPLS services, the vsi-import and vsi-export policies are blocked in the B-VPLS. BGP peer import and export policies are still allowed. Matching on the export ISID-based route-target is supported.

EVPN-VPWS for MPLS tunnels

This section contains information about EVPN-VPWS for MPLS tunnels.

BGP-EVPN control plane for EVPN-VPWS

EVPN-VPWS for MPLS tunnels uses the RFC 8214 BGP extensions described in EVPN-VPWS for VXLAN tunnels, with the following differences for the Ethernet AD per-EVI routes:

  • The MPLS field encodes an MPLS label as opposed to a VXLAN VNI.

  • The C flag is set if the control word is configured in the service.

  • The F flag is set if the hash label is configured in the service.

EVPN for MPLS tunnels in Epipe services (EVPN-VPWS)

The use and configuration of EVPN-VPWS services is described in EVPN-VPWS for VXLAN tunnels with the following differences when the EVPN-VPWS services use MPLS tunnels instead of VXLAN.

When MPLS tunnels are used, the bgp-evpn>mpls context must be configured in the Epipe. For example, if Epipe 2 is an EVPN-VPWS service that uses MPLS tunnels between PE2 and PE4, the configuration of the two PEs is as follows:

PE2>config>service>epipe(2)#
-----------------------
bgp
exit
bgp-evpn
  evi 2
  local-attachment-circuit "AC-1" 
    eth-tag 100
  exit
  remote-attachment-circuit "AC-2" 
    eth-tag 200
  exit
  mpls bgp 1
    ecmp 2
    no shutdown
  exit
sap 1/1/1:1 create

PE4>config>service>epipe(2)#
-----------------------
bgp
exit
bgp-evpn
  evi 2
  local-attachment-circuit "AC-2" 
    eth-tag 200
  exit
  remote-attachment-circuit "AC-1" 
    eth-tag 100
  exit
  mpls bgp 1
    ecmp 2
    no shutdown
  exit
spoke-sdp 1:1

The following BGP-EVPN commands, specific to MPLS tunnels, are supported in the same way as in VPLS services:

  • mpls auto-bind-tunnel

  • mpls control-word

  • mpls hash-label

  • mpls entropy-label

  • mpls force-vlan-vc-forwarding

  • mpls shutdown

EVPN-VPWS Epipes with MPLS tunnels can also be configured with the following characteristics:

  • Access attachment circuits can be SAPs or spoke SDPs. Manually configured and BGP-VPWS spoke SDPs are supported. The VC switching configuration is not supported on BGP-EVPN-enabled Epipes.

  • EVPN-VPWS Epipes using null SAPs can be configured with sap>ethernet>llf. When enabled, upon removal of the EVPN destination, the port is brought operationally down with the LinkLossFwd flag; however, the AD per-EVI route for the SAP is still advertised (the SAP is kept operationally up). When the EVPN destination is created, the port is brought operationally up and the flag is cleared.

  • EVPN-VPWS Epipes for MPLS tunnels support endpoints. The endpoint endpoint-name parameter is configurable along with bgp-evpn>local-attachment-circuit and bgp-evpn>remote-attachment-circuit. The following conditions apply to endpoints on EVPN-VPWS Epipes with MPLS tunnels:

    • Up to two explicit endpoints are allowed per Epipe service with BGP-EVPN configured.

    • A limited endpoint configuration is allowed in Epipes with BGP-EVPN. Specifically, neither active-hold-delay nor revert-time are configurable.

    • When bgp-evpn>remote-attachment-circuit is added to an explicit endpoint with a spoke SDP, the spoke-sdp>precedence command is not allowed. The spoke SDP always has a precedence of four, which is always higher than the EVPN precedence. Therefore, the EVPN-MPLS destination is used for transmission if it is created, and the spoke SDP is only used when the EVPN-MPLS destination is removed.

  • EVPN-VPWS Epipes for MPLS tunnels support control word, hash label, and entropy labels.
    • When the control word is configured, the PE sets the C bit in its AD per-EVI advertisement and sends the control word in the datapath. In this case, the PE expects to receive the control word. If there is a mismatch between the received and configured control word, the system does not set up the EVPN destination and the service does not come up.
    • When the hash-label command is configured, the PE sets the F bit in its AD per-EVI routes and sends the hash label in the datapath. The PE also expects to receive the hash label. If there is a mismatch between the received F flag and the locally configured hash-label, the router does not create the EVPN destination and the service does not come up. In a service, the use of the hash label and the entropy label is mutually exclusive.
  • EVPN-VPWS Epipes support the force-qinq-vc-forwarding [c-tag-c-tag | s-tag-c-tag] command under bgp-evpn mpls and the qinq-vlan-translation s-tag.c-tag command on ingress QinQ SAPs.

    When QinQ VLAN translation is configured at the ingress QinQ or dot1q SAP, the service-delimiting outer and inner VLAN values can be translated to the configured values. The force-qinq-vc-forwarding s-tag-c-tag command must be configured to preserve the translated QinQ tags in the payload when sending EVPN packets. This translation and preservation behavior is aligned with the "normalization" concept described in draft-ietf-bess-evpn-vpws-fxc. The VLAN tag processing described in Epipe service pseudowire VLAN tag processing also applies to EVPN destinations in EVPN-VPWS services.

The following features, described in EVPN-VPWS for VXLAN tunnels, are also supported for MPLS tunnels:

  • Advertisement of the Layer-2 MTU and consistency checking of the MTU of the AD per-EVI routes.

  • Use of A/S PW and MC-LAG at access.

  • EVPN multihoming, including:

    • Single-active and all-active

    • Regular or virtual ESs

    • All existing DF election modes

EVPN-VPWS services with local-switching support

Epipes with BGP-EVPN MPLS support the following configurations:

  • up to two endpoints

  • up to two SAPs associated with a different configured endpoint each

  • two pairs of local/remote attachment circuit Ethernet tags, also associated with different configured endpoints

  • EVPN destinations that can be used as Inter-Chassis Backup (ICB) links

The support of endpoints and up to two SAPs with local-switching allows two and three-node topologies for EVPN-VPWS. EVPN-VPWS endpoints example 1, EVPN-VPWS endpoints example 2, and EVPN-VPWS endpoints example 3 show examples of these topologies.

Example 1

The following figure shows an example of EVPN-VPWS endpoints.

Figure 44. EVPN-VPWS endpoints example 1

In EVPN-VPWS endpoints example 1, PE1 is configured with the following Epipe services:

endpoint X create
exit
endpoint Y create
exit
bgp-evpn
  evi 350 
  local-attachment-circuit "CE-1" endpoint "Y" create
    eth-tag 1
  exit
  remote-attachment-circuit "ICB-1" endpoint "Y" create
    eth-tag 2
  exit
  local-attachment-circuit "CE-2" endpoint "X" create
    eth-tag 2
  exit
  remote-attachment-circuit "ICB-2" endpoint "X" create
    eth-tag 1
  exit  
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 endpoint X create
exit
sap 1/1/1:1 endpoint Y create
exit

In EVPN-VPWS endpoints example 1, PE2 is configured with the following Epipe services:

bgp-evpn
  evi 350 
  local-attachment-circuit "CE-1" create
    eth-tag 1
  exit
  remote-attachment-circuit "ICB-1" create
    eth-tag 2
  exit
// implicit endpoint "Y"
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 create
exit
// implicit endpoint "X"

In this example, assuming CE1 is multihomed, the following applies:

  • PE1 advertises two AD per-EVI routes, for tags 1 and 2, respectively. PE2 advertises only the route for tag 1.

    • AD per-EVI routes for tag 1 are advertised based on CE1 SAPs' states

    • AD per-EVI route for tag 2 is advertised based on CE2 SAP state

  • PE1 creates endpoint X with sap lag-1:1 and ES-destination to tag 1 in PE2

  • PE2 creates the usual destination to tag 2 in PE1

  • In case of all-active MH:

    • traffic from CE1 to PE1 is forwarded to CE2 directly

    • traffic from CE1 to PE2 is forwarded to PE1 with the label that identifies CE2's SAP

    • traffic from CE2 is forwarded to CE1 directly because CE1's SAP is the endpoint Tx; in case of failure on CE1's SAP, PE1 changes the Tx object to the ES-destination to PE2

  • In case of single-active MH, traffic flows in the same way, except that a non-DF SAP is operationally down and therefore cannot be an endpoint Tx object.

Example 2

The following figure shows an example of EVPN-VPWS endpoints.

Figure 45. EVPN-VPWS endpoints example 2

In EVPN-VPWS endpoints example 2, PE1 is configured with the following Epipe services.

endpoint X create
exit
endpoint Y create
exit
bgp-evpn
  evi 350 
  local-attachment-circuit "CE-1" endpoint "Y" create
    eth-tag 1
  exit
  remote-attachment-circuit "ICB-1" endpoint "Y" create
    eth-tag 2
  exit
  local-attachment-circuit "CE-2" endpoint "X" create
    eth-tag 2
  exit
  remote-attachment-circuit "ICB-2" endpoint "X" create
    eth-tag 1
  exit  
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 endpoint X create
exit
sap lag-2:1 endpoint Y create
exit

In EVPN-VPWS endpoints example 2, PE2 is configured with the following Epipe services.

endpoint X create
exit
endpoint Y create
exit
bgp-evpn
  evi 350 
  local-attachment-circuit "CE-1" endpoint "Y" create
    eth-tag 1
  exit
  remote-attachment-circuit "ICB-1" endpoint "Y" create
    eth-tag 2
  exit
  local-attachment-circuit "CE-2" endpoint "X" create
    eth-tag 2
  exit
  remote-attachment-circuit "ICB-2" endpoint "X" create
    eth-tag 1
  exit  
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 endpoint X create
exit
sap lag-2:1 endpoint Y create
exit

This example is similar to EVPN-VPWS endpoints example 1, except that the two PEs are multihomed to both CEs. In EVPN-VPWS endpoints example 1, if CE2 goes down, no traffic flows between the PEs because the two PEs lose all the objects in the endpoint connected to CE2. Traffic that arrives on EVPN is only forwarded to a SAP on a different endpoint.

Example 3

The following figure shows an example of EVPN-VPWS endpoints.

Figure 46. EVPN-VPWS endpoints example 3

In EVPN-VPWS endpoints example 3, PE1 is configured with the following Epipe services.

bgp-evpn
  evi 350 
  local-attachment-circuit "CE-1" 
    eth-tag 1
  exit
  remote-attachment-circuit "ICB-1" 
    eth-tag 2
  exit  
// implicit endpoint "Y"
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 create
// implicit endpoint "X" 
exit

In EVPN-VPWS endpoints example 3, PE2 is configured with the following Epipe services.

endpoint X create
exit
endpoint Y create
exit
bgp-evpn
  evi 350 
  local-attachment-circuit "CE-1" endpoint "Y"   
    eth-tag 1
  exit
  remote-attachment-circuit "ICB-1" endpoint "Y"
    eth-tag 2
  exit
  local-attachment-circuit "CE-2" endpoint "X"
    eth-tag 2
  exit
  remote-attachment-circuit "ICB-2" endpoint "X"
    eth-tag 1
  exit  
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 endpoint X create
exit
sap lag-2:1 endpoint Y create
exit

In EVPN-VPWS endpoints example 3, PE3 is configured with the following Epipe services.

bgp-evpn
  evi 350 
  local-attachment-circuit "CE-2" 
    eth-tag 2
  exit
  remote-attachment-circuit "ICB-2"
    eth-tag 1
  exit  
// implicit endpoint "X"
  mpls bgp 1 
    auto-bind-tunnel
      resolution any
      exit
    no shutdown
    exit
  exit
sap lag-1:1 create
// implicit endpoint "Y"
exit

This example is similar to EVPN-VPWS endpoints example 2, except that a third node is added. Nodes PE1 and PE3 have implicit endpoints. Only node PE2 requires the configuration of endpoints.

EVPN for MPLS tunnels in routed VPLS services

EVPN-MPLS and IP-prefix advertisement (enabled by the ip-route-advertisement command) are fully supported in routed VPLS services and provide the same feature-set as EVPN-VXLAN. The following capabilities are supported in a service where bgp-evpn mpls is enabled:

  • R-VPLS with VRRP support on the VPRN or IES interfaces

  • R-VPLS support including ip-route-advertisement with regular interfaces

    This includes the advertisement and process of ip-prefix routes defined in IETF Draft draft-ietf-bess-evpn-prefix-advertisement with the appropriate encoding for EVPN-MPLS.

  • R-VPLS support including ip-route-advertisement with evpn-tunnel interfaces

  • R-VPLS with IPv6 support on the VPRN or IES IP interface

IES interfaces do not support either ip-route-advertisement or evpn-tunnel.

EVPN-MPLS multihoming and passive VRRP

SAP and spoke SDP based ESs are supported on R-VPLS services where bgp-evpn mpls is enabled.

EVPN-MPLS multihoming in R-VPLS services shows an example of EVPN-MPLS multihoming in R-VPLS services, with the following assumptions:

  • There are two subnets for a specific customer (for example, EVI1 and EVI2 in EVPN-MPLS multihoming in R-VPLS services), and a VPRN is instantiated in all the PEs for efficient inter-subnet forwarding.

  • A "backhaul" R-VPLS with evpn-tunnel mode enabled is used in the core to interconnect all the VPRNs. EVPN IP-prefix routes are used to exchange the prefixes corresponding to the two subnets.

  • An all-active ES is configured for EVI1 on PE1 and PE2.

  • A single-active ES is configured for EVI2 on PE3 and PE4.

Figure 47. EVPN-MPLS multihoming in R-VPLS services

In the example in EVPN-MPLS multihoming in R-VPLS services, the hosts connected to CE1 and CE4 could use regular VRRP for default gateway redundancy; however, this may not be the most efficient way to provide upstream routing.

For example, if PE1 and PE2 use regular VRRP, the upstream traffic from CE1 may be hashed to the backup VRRP IRB interface instead of the active interface. The same issue may occur with single-active multihoming and regular VRRP on PE3 and PE4: the traffic from CE4 is sent to PE3, while PE4 may be the active VRRP router. In that case, PE3 has to send the traffic to PE4 instead of routing it directly.

In both cases, unnecessary bandwidth between the PEs is used to get to the active IRB interface. In addition, VRRP scaling is limited if aggressive keepalive timers are used.

Because of these issues, passive VRRP is the recommended method when EVPN-MPLS multihoming is used in combination with R-VPLS redundant interfaces.

Passive VRRP is a VRRP setting in which the transmission and reception of keepalive messages is completely suppressed, and therefore the VPRN interface always behaves as the active router. Passive VRRP is enabled by adding the passive keyword to the VRRP instance at creation, as shown in the following examples:

  • config service vprn 1 interface int-1 vrrp 1 passive

  • config service vprn 1 interface int-1 ipv6 vrrp 1 passive

For example, if PE1, PE2, and PE5 in EVPN-MPLS multihoming in R-VPLS services use passive VRRP, even though each individual R-VPLS interface has a different MAC/IP address, because they share the same VRRP instance 1 and the same backup IP, the three PEs own the same virtual MAC and virtual IP address (for example, 00-00-5E-00-01-01 and 10.0.0.254). The virtual MAC is auto-derived as 00-00-5E-00-01-VRID, per RFC 3768. The following is the expected behavior when passive VRRP is used in this example:

  • All R-VPLS IRB interfaces for EVI1 have their own physical MAC/IP address; they also own the same default gateway virtual MAC and IP address.

  • All EVI1 hosts have a unique configured default gateway; for example, 10.0.0.254.

  • When CE1 or CE2 send upstream traffic to a remote subnet, the packets are routed by the closest PE because the virtual MAC is always local to the PE.

    For example, the packets from CE1 hashed to PE1 are routed at PE1. The packets from CE1 hashed to PE2 are routed directly at PE2.

  • Downstream packets (for example, packets from CE3 to CE1), are routed directly by the PE to CE1, regardless of the PE to which PE5 routed the packets.

    For example, the packets from CE3 sent to PE1 are routed at PE1. The packets from CE3 sent to PE2 are routed at PE2.

  • In case of ES failure in one of the PEs, the traffic is forwarded by the available PE.

    For example, if the packets routed by PE5 arrive at PE1 and the link to CE1 is down, then PE1 sends the packets to PE2. PE2 forwards the packets to CE1 even if the MAC source address of the packets matches PE2's virtual MAC address. Virtual MACs bypass the R-VPLS interface MAC protection.
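The virtual MAC derivation from the VRID can be sketched as follows (illustrative; the router derives this automatically per RFC 3768 for IPv4 and RFC 5798 for IPv6 VRRP instances):

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Derive the VRRP virtual MAC address from the VRID.

    RFC 3768 defines 00-00-5E-00-01-{VRID} for IPv4 virtual routers;
    RFC 5798 defines 00-00-5E-00-02-{VRID} for IPv6 virtual routers.
    """
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    scope = 0x02 if ipv6 else 0x01   # address-family-specific byte
    return f"00-00-5E-00-{scope:02X}-{vrid:02X}"

print(vrrp_virtual_mac(1))   # 00-00-5E-00-01-01
```

Because all PEs in the passive VRRP instance share the same VRID, they all own this same virtual MAC, which is why routing happens at whichever PE receives the upstream packet.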

The following list summarizes the advantages of using passive VRRP mode versus regular VRRP for EVPN-MPLS multihoming in R-VPLS services:

  • Passive VRRP does not require multiple VRRP instances to achieve default gateway load-balancing. Only one instance per R-VPLS, therefore only one default gateway, is needed for all the hosts.

  • The convergence time for link/node failures is not impacted by the VRRP convergence, as all the nodes in the VRRP instance are acting as active routers.

  • Passive VRRP scales better than regular VRRP because it does not use keepalive or BFD messages to detect failures and trigger a backup takeover.

In EVPN all-active multihoming scenarios with R-VPLS services where the advertisement of the ARP/ND entries is enabled, use the following command to avoid issues with MAC mobility caused by the MAC/IP Advertisement route for the ARP/ND entry being sent with ESI=0:

  • MD-CLI
    configure service vpls bgp-evpn routes mac-ip arp-nd-only-with-fdb-advertisement true
  • classic CLI
    configure service vpls bgp-evpn arp-nd-only-with-fdb-advertisement

When this command is enabled, the local ARP/ND entries of VPRN interfaces using this VPLS are advertised in this BGP-EVPN service only when the corresponding local MAC is programmed in the FDB.

In an EVPN multihoming scenario, this command prevents the router from advertising a MAC/IP Advertisement route with the MAC and IP binding but without the correct ESI value (which is taken only when the MAC is properly programmed in the FDB against the ESI).

In addition, if an Ethernet segment SAP receives a frame, the MAC address can be re-programmed as type learned, even if the MAC was previously programmed as type EVPN.

Virtual Ethernet Segment for EVPN multihoming

SR OS supports virtual Ethernet Segments (vES) for EVPN multihoming in accordance with draft-ietf-bess-evpn-virtual-eth-segment.

Regular ESs can only be associated with ports, LAGs, and SDPs, which satisfies the redundancy requirements for CEs that are directly connected to the ES PEs by a port, LAG, or SDP. However, this implementation does not work when an aggregation network exists between the CEs and the ES PEs, because multiple ESs would then have to be defined on the same port, LAG, or SDP.

All-active multihoming on vES shows an example of how CE1 and CE2 use all-active multihoming to the EVPN-MPLS network despite the third-party Ethernet aggregation network to which they are connected.

Figure 48. All-active multihoming on vES

The ES association can be made in a more granular way by creating a vES. A vES can be associated with the following:

  • Q-tag ranges on dot1q ports or LAGs

  • S-tag ranges on QinQ ports or LAGs

  • C-tag ranges per S-tag on QinQ ports or LAGs

  • VC-ID ranges on SDPs

The following example displays the vES configuration options.

MD-CLI

[ex:/configure service system bgp evpn]
A:admin@node-2# info
    ethernet-segment "vES-1" {
        type virtual
        association {
            lag "lag-1" {
                virtual-ranges {
                    dot1q {
                        q-tag 100 {
                            end 200
                        }
                    }
                }
            }
        }
    }
    ethernet-segment "vES-2" {
        type virtual
        association {
            port 1/1/1 {
                virtual-ranges {
                    qinq {
                        s-tag-c-tag 1 c-tag-start 100 {
                            c-tag-end 200
                        }
                        s-tag 2 {
                            end 10
                        }
                    }
                }
            }
        }
    }
    ethernet-segment "vES-3" {
        type virtual
        association {
            sdp 1 {
                virtual-ranges {
                    vc-id 1000 {
                        end 2000
                    }
                }
            }
        }
    }

classic CLI

A:node-2>config>service>system>bgp-evpn# info
----------------------------------------------
                ethernet-segment "vES-1" virtual create
                    service-carving
                        mode auto
                    exit
                    lag 1
                    dot1q
                        q-tag-range 100 to 200
                    exit
                    shutdown
                exit
                ethernet-segment "vES-2" virtual create
                    service-carving
                        mode auto
                    exit
                    port 1/1/1
                    qinq
                        s-tag-range 2 to 10
                        s-tag 1 c-tag-range 100 to 200
                    exit
                    shutdown
                exit
                ethernet-segment "vES-3" virtual create
                    service-carving
                        mode auto
                    exit
                    sdp 1
                    vc-id-range 1000 to 2000
                    shutdown
                exit
----------------------------------------------

Where:

  • The virtual keyword creates an ES as defined in draft-ietf-bess-evpn-virtual-eth-segment. The configuration of the dot1q or QinQ nodes is allowed only when the ES is created as virtual.

  • On the vES, the user must first create a port, LAG, or SDP before configuring a VLAN or VC-ID association. When added, the port/LAG type and encap-type are checked as follows:

    • If the encap-type is dot1q, only the dot1q context configuration is allowed; the qinq context cannot be configured.

    • If the encap-type is qinq, only the qinq context configuration is allowed; the dot1q context cannot be configured.

    • A dot1q, qinq, or VC-ID range is required for the vES to become operationally active.

  • The dot1q Q-tag range determines which VIDs are associated with the vES on a specific dot1q port or LAG. The group of SAPs that match the configured port/LAG and VIDs is part of the vES.

  • The QinQ S-tag range determines which outer VIDs are associated with the vES on the QinQ port or LAG.

  • The QinQ S-tag C-tag range determines which inner C-tags per S-tag are associated with the vES on the QinQ port or LAG.

  • The VC-ID range determines which VC IDs are associated with the vES on the configured SDP.
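The range rules above can be sketched as a simple membership test. The helper functions below are hypothetical, not SR OS code; inclusive (start, end) tuples mirror the CLI ‟x to y” syntax.

```python
# Hypothetical sketch of the vES range-matching rules (not SR OS code).
def dot1q_sap_in_ves(vid, q_tag_ranges):
    # dot1q: the SAP's Q-tag must fall inside one of the configured ranges
    return any(lo <= vid <= hi for lo, hi in q_tag_ranges)

def qinq_sap_in_ves(s_tag, c_tag, s_tag_ranges, s_c_tag_ranges):
    # qinq: either the outer S-tag falls in an S-tag range (any C-tag), or
    # the (S-tag, C-tag) pair matches an s-tag/c-tag-range entry
    if any(lo <= s_tag <= hi for lo, hi in s_tag_ranges):
        return True
    return any(s_tag == s and lo <= c_tag <= hi
               for s, lo, hi in s_c_tag_ranges)
```

For example, with q-tag 100 to 200 configured (as on "vES-1" above), SAP lag-1:150 belongs to the vES; with s-tag 2 to 10 (as on "vES-2"), SAP 1/1/1:5.x belongs to it for any C-tag x.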

Although Q-tag values 0, *, and 1 to 4094 are allowed, the following considerations must be taken into account when configuring a dot1q or QinQ vES:

  • Up to 8 dot1q or QinQ ranges may be configured in the same vES.

  • When configuring a QinQ vES, a Q-tag included in an S-tag range cannot also be used as the S-tag value of the s-tag qtag1 c-tag-range qtag2 [to qtag2] command. For example, the following combination is not supported in the same vES.

    s-tag-range 350 to 500
    s-tag 500 c-tag-range 100 to 200

    The following example shows a supported combination:

    • MD-CLI
      [ex:/configure service system bgp evpn]
      A:admin@node-2# info
          ethernet-segment "qinq" {
      ...
          ethernet-segment "vES-4" {
              type virtual
              association {
                  port 1/1/1 {
                      virtual-ranges {
                          qinq {
                              s-tag-c-tag 500 c-tag-start 100 {
                                  c-tag-end 200
                              }
                              s-tag-c-tag 600 c-tag-start 100 {
                                  c-tag-end 200
                              }
                              s-tag-c-tag 600 c-tag-start 150 {
                                  c-tag-end 200
                              }
                              s-tag 100 {
                                  end 200
                              }
                              s-tag 300 {
                                  end 400
                              }
                          }
                      }
                  }
              }
          }
    • classic CLI
      A:node-2>config>service>system>bgp-evpn>eth-seg>qinq# info
      ----------------------------------------------
                      s-tag-range 100 to 200
                      s-tag-range 300 to 400
                      s-tag 500 c-tag-range 100 to 200
                      s-tag 600 c-tag-range 100 to 200
                      s-tag 600 c-tag-range 150 to 200
    Note: For more information about the contexts for this command, see:
    • 7450 ESS, 7750 SR, 7950 XRS, and VSR MD-CLI Command Reference Guide
    • 7450 ESS, 7750 SR, 7950 XRS, and VSR Classic CLI Command Reference Guide
  • vES associations that contain Q-tags <0, *, null> are special and treated as follows:

    • When a special Q-tag value is configured in the from value of the range, the to value must be the same.

    • Q-tag values <0, *> are only supported for the Q-tag range and C-tag range; they are not supported in the S-tag range.

    • The Q-tag ‟null” value is only supported in the C-tag range if the s-tag is configured as ‟*”.
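The special-value rules above can be summarized as a small validity check. This is illustrative only (not SR OS code); the special values 0, *, and null are modeled as strings, and "s-tag" refers to the s-tag-range form (S-tag 0 with a C-tag range remains valid, as shown in the tables that follow).

```python
# Illustrative check of the special Q-tag rules (not SR OS code).
SPECIAL = {"0", "*", "null"}

def valid_range(start, end, kind, s_tag=None):
    # kind: "q-tag" (dot1q range), "s-tag" (s-tag-range), "c-tag" (c-tag-range)
    if start in SPECIAL or end in SPECIAL:
        if start != end:
            return False  # a special from value requires the same to value
        if start in {"0", "*"} and kind == "s-tag":
            return False  # 0 and * are not supported in an S-tag range
        if start == "null" and not (kind == "c-tag" and s_tag == "*"):
            return False  # null is valid only as S-tag * C-tag null
    return True
```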

Examples of supported Q-tag values lists examples of the supported Q-tag values between 1 and 4094.

Table 3. Examples of supported Q-tag values
vES configuration for port 1/1/1        SAP association

dot1q Q-tag range 100                   1/1/1:100

dot1q Q-tag range 100 to 102            1/1/1:100, 1/1/1:101, 1/1/1:102

QinQ S-tag 100 C-tag-range 200          1/1/1:100.200

QinQ S-tag-range 100                    all SAPs 1/1/1:100.x, where x is a value between 1 and 4094, 0, or *

QinQ S-tag range 100 to 102             all SAPs 1/1/1:100.x, 1/1/1:101.x, 1/1/1:102.x, where x is a value between 1 and 4094, 0, or *

Examples of supported combinations lists all the supported combinations that include Q-tag values <0, *, null>. Any other combination of these special values is not supported.

Table 4. Examples of supported combinations
vES configuration for port 1/1/1        SAP association

dot1q Q-tag range 0                     1/1/1:0

dot1q Q-tag range *                     1/1/1:*

QinQ S-tag 0 C-tag range *              1/1/1:0.*

QinQ S-tag * C-tag range *              1/1/1:*.*

QinQ S-tag * C-tag range null           1/1/1:*.null

QinQ S-tag x C-tag range 0              1/1/1:x.0, where x is a value between 1 and 4094

QinQ S-tag x C-tag range *              1/1/1:x.*, where x is a value between 1 and 4094

On vESs, the single-active and all-active modes are supported for EVPN-MPLS VPLS, Epipe, and PBB-EVPN services. Single-active multihoming is supported on port and SDP-based vESs, and all-active multihoming is only supported on LAG-based vESs.

The following considerations apply if the vES is used with PBB-EVPN services:

  • B-MAC allocation procedures are the same as the regular ES procedures.

    Note: Two all-active vESs must use different ES B-MACs, even if they are defined in the same LAG.
  • The vES implements C-MAC flush procedures described in RFC 7623. Optionally, the ISID-based C‑MAC flush can be used for cases where the single-active vES does not use ES B-MAC allocation.

Preference-based and non-revertive DF election

In addition to the ES service-carving modes auto and off, the manual mode also supports the preference-based algorithm with the non-revertive option, as described in draft-rabadan-bess-evpn-pref-df.

When an ES is configured to use the preference-based algorithm, the ES route is advertised with the Designated Forwarder (DF) election extended community (sub-type 0x06). DF election extended community shows the DF election extended community.

Figure 49. DF election extended community

In the extended community, a DF type 2 preference algorithm is advertised with a 2-byte preference value (32767 by default) if the preference-based manual mode is configured. The Don't Preempt Me (DP) bit is set if the non-revertive option is enabled.

The following CLI excerpt shows the relevant commands to enable the preference-based DF election on a specific ES (regular or virtual):

config>service>system>bgp-evpn>ethernet-segment#
...
service-carving mode {manual|auto|off}
  service-carving manual
    [no] preference [create] [non-revertive]
      value <value>
    exit
  [no] evi <evi> [to <evi>] 
  [no] isid <isid> [to <isid>]
# value 0..65535; default 32767
...

Where:

  • The preference value can be changed on an active ES without shutting down the ES, and therefore, a new DF can be forced for maintenance or other reasons.

  • The service-carving mode must be changed to manual mode to create the preference context.

  • The preference command is supported on regular or virtual ES, regardless of the multihoming mode (single-active or all-active) or the service type (VPLS, I-VPLS, or Epipe).

  • By default, the highest-preference PE in the ES becomes the DF for an EVI or ISID, using the DP bit as the tiebreaker first (DP=1 wins over DP=0) and the lowest PE-IP as the last-resort tiebreaker. All the explicitly configured EVI or ISID ranges select the lowest preference PE as the DF (whereas the non-configured EVI or ISID values select the highest preference PE).

    This selection is displayed as Cfg Range Type: lowest-pref in the following show command example.

    *A:PE-2# show service system bgp-evpn ethernet-segment name "vES-23"      
    ===============================================================================
    Service Ethernet Segment
    ===============================================================================
    Name                    : vES-23
    Eth Seg Type            : Virtual            
    Admin State             : Enabled            Oper State         : Up
    ESI                     : 01:23:23:23:23:23:23:23:23:23
    Multi-homing            : allActive          Oper Multi-homing  : allActive
    ES SHG Label            : 262141             
    Source BMAC LSB         : 00-23              
    ES BMac Tbl Size        : 8                  ES BMac Entries    : 0
    Lag Id                  : 1                  
    ES Activation Timer     : 3 secs (default)   
    Svc Carving             : manual             Oper Svc Carving   : manual
    Cfg Range Type          : lowest-pref        
    -------------------------------------------------------------------------------
    DF Pref Election Information
    -------------------------------------------------------------------------------
    Preference     Preference     Last Admin Change        Oper Pref      Do No
    Mode           Value                                   Value          Preempt
    -------------------------------------------------------------------------------
    non-revertive  100            12/21/2016 15:16:38      100            Enabled
    -------------------------------------------------------------------------------
    EVI Ranges: <none>
    ISID Ranges: <none>
    ===============================================================================
    
  • The EVI and ISID ranges configured on the service-carving context are not required to be consistent with any ranges configured for vESs.

  • If the non-revertive option is configured, when the former DF comes back up after a failure and checks existing ES routes, it advertises an operational preference and DP bit, which does not cause a DF switchover for the ES EVI/ISID values.

  • The non-revertive option prevents an ES DF switchover in the following events:

    • ES port recovery after a failure

    • node recovery after a reboot or power-up event

  • The non-revertive option does not prevent an ES DF switchover when the ES is administratively enabled or when any other event attempts to recover the ES when the ES routes from the ES peers are not present yet. An example of this is when the user executes the clear card command on all of the line cards in the router. When the ES is brought up, the BGP session is still recovering and therefore, there are no remote ES routes from the ES peers. Use the following command to prevent this situation for reboots or node power-up events (but not for any other events).

    configure redundancy bgp-evpn-multi-homing boot-timer
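The selection rules in the preceding bullets can be modeled as follows. This is an illustrative sketch, not SR OS code: candidates are (preference, DP, PE-IP) tuples, and explicitly configured EVI/ISID ranges flip the algorithm to lowest-preference.

```python
import ipaddress

# Illustrative model of the preference-based DF election described above
# (draft-rabadan-bess-evpn-pref-df); not SR OS code.
def elect_df(candidates, lowest_pref=False):
    # candidates: list of (pref, dp, pe_ip). Highest preference wins (lowest
    # for explicitly configured EVI/ISID ranges); on a tie, DP=1 beats DP=0;
    # the lowest PE-IP is the last-resort tiebreaker.
    def key(cand):
        pref, dp, pe_ip = cand
        pref_key = pref if lowest_pref else -pref
        return (pref_key, -dp, int(ipaddress.ip_address(pe_ip)))
    return min(candidates, key=key)
```

For instance, with equal preferences the PE advertising DP=1 is elected DF regardless of whether the highest or lowest preference algorithm applies.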

The following configuration example shows the use of the preference-based algorithm and non-revertive option in an ES defined in PE1 and PE2.

*A:PE-1>config>service>system>bgp-evpn# info 
----------------------------------------------
 ethernet-segment "ES1" create
   esi 01:00:00:00:00:12:00:00:00:01
   service-carving manual
       preference non-revertive create
         value 10000
       exit
       evi 2001 to 4000 
   exit
   multi-homing single-active
   port 1/1/1
   no shutdown
   
/* example of vpls 1 - similar config exists for evis 2-4000 */
*A:PE-1>config>service>vpls# info 
----------------------------------------------
 vxlan vni 1 create
 exit
 bgp-evpn
   evi 1
   mpls bgp 1
     ecmp 2
     auto-bind-tunnel
       resolution any
     exit
 sap 1/1/1:1 create
 no shutdown
----------------------------------------------
*A:PE-2>config>service>system>bgp-evpn# info 
----------------------------------------------
 ethernet-segment "ES1" create
   esi 01:00:00:00:00:12:00:00:00:01
   service-carving manual
       preference non-revertive create
         value 5000
       exit
       evi 2001 to 4000 
   exit
   multi-homing single-active
   port 1/1/1
   no shutdown
   
*A:PE-2>config>service>vpls# info 
----------------------------------------------
 vxlan vni 1 create
 exit
 bgp-evpn
   evi 1
   mpls bgp 1
     ecmp 2
     auto-bind-tunnel
       resolution any
     exit
 sap 1/1/1:1 create
 no shutdown
----------------------------------------------

Based on the configuration in the preceding example, the PE behavior is as follows:

  1. Assuming the ES is no shutdown on both PE1 and PE2, the PEs exchange ES routes, including the [Pref, DP-bit] in the DF election extended community.

  2. For EVIs 1 to 2000, PE2 is immediately promoted to NDF, whereas PE1 becomes the DF, and (following the es-activation-timer) brings up its SAP in EVIs 1 to 2000.

    For EVIs 2001 to 4000, the result is the opposite and PE2 becomes the DF.

  3. If port 1/1/1 on PE1 goes down, PE1 withdraws its ES route and PE2 becomes the DF for EVIs 1 to 2000.

  4. When port 1/1/1 on PE1 comes back up, PE1 compares its ES1 preference with the preferences in the remote PEs in ES1. PE1 advertises the ES route with an ‟in-use operational” Pref = 5000 and DP=0. Because PE2's Pref is the same as PE1's operational value, but PE2's DP=1, PE2 continues to be the DF for EVIs 1 to 4000.

    Note: The DP bit is the tiebreaker in case of equal Pref and regardless of the choice of highest or lowest preference algorithm.

  5. PE1's ‟in-use” Pref and DP continue to be [5000,0] until one of the following conditions is true:

    • PE2 withdraws its ES route, in which case PE1 re-advertises its admin Pref and DP [10000,DP=1]

    • The user changes PE1's Pref configuration
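The ‟in-use operational” values in step 4 can be sketched as follows. This is a simplified, illustrative model under the assumptions of the example above (highest preference wins), not SR OS code.

```python
# Simplified model of the non-revertive recovery behavior in step 4 above
# (illustrative only). A recovering PE that would win on its administrative
# preference instead advertises the current best remote preference ("in-use")
# with DP=0, so the existing DF keeps the role via the DP tiebreaker.
def recovery_advertisement(admin_pref, remote_prefs):
    if remote_prefs and admin_pref > max(remote_prefs):
        return max(remote_prefs), 0   # in-use Pref, DP=0: do not preempt
    return admin_pref, 1              # no preemption risk: admin Pref, DP=1
```

In the example, PE1 (admin Pref 10000) recovers against PE2's 5000 and advertises (5000, DP=0), so PE2 remains DF for all EVIs; per step 5, PE1 reverts to (10000, DP=1) only if PE2 withdraws its ES route or the configuration changes.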

EVPN-MPLS routed VPLS multicast routing support

IPv4 multicast routing is supported in an EVPN-MPLS VPRN routed VPLS service through its IP interface, when the source of the multicast stream is on one side of its IP interface and the receivers are on either side of the IP interface. For example, the source for multicast stream G1 could be on the IP side sending to receivers on both other regular IP interfaces and the VPLS of the routed VPLS service, while the source for group G2 could be on the VPLS side sending to receivers on both the VPLS and IP side of the routed VPLS service. See IPv4 and IPv6 multicast routing support for more details.

IGMP snooping in EVPN-MPLS and PBB EVPN services

IGMP snooping is supported in EVPN-MPLS VPLS and PBB-EVPN I-VPLS (where BGP EVPN is running in the associated B-VPLS service) services. It is also supported in EVPN-MPLS VPRN and IES R-VPLS services. It is required in scenarios where the operator does not want to flood all of the IP multicast traffic to the access nodes or CEs, and only wants to deliver IP multicast traffic for which IGMP reports have been received.

The following points apply when IGMP snooping is configured in EVPN-MPLS VPLS or PBB-EVPN I-VPLS services:

  • IGMP snooping is enabled using the configure service vpls igmp-snooping no shutdown command.

  • Queries and reports received on SAP or SDP bindings are snooped and properly handled; they are sent to SAP or SDP bindings as expected.

  • Queries and reports on EVPN-MPLS or PBB-EVPN B-VPLS destinations are handled as follows.

    • If received from SAP or SDP bindings, the queries and reports are sent to all EVPN-MPLS and PBB-EVPN B-VPLS destinations, regardless of whether the service is using an ingress replication or mLDP provider tunnel.

    • If received on an EVPN-MPLS or PBB-EVPN B-VPLS destination, the queries and reports are processed and propagated to access SAP or SDP bindings, regardless of whether the service is using an ingress replication or mLDP provider tunnel.

    • EVPN-MPLS and PBB-EVPN B-VPLS destinations are treated as a single IGMP snooping interface, which is always added as an mrouter.

    • The debug trace output displays one copy of messages being sent to all EVPN-MPLS and PBB-EVPN B-VPLS destinations (the trace does not show a copy for each destination) and displays messages received from all EVPN-MPLS and PBB-EVPN B-VPLS destinations as coming from a single EVPN-MPLS interface.

Note: When IGMP snooping is enabled with P2MP LSPs, at least one EVPN-MPLS multicast destination must be established to enable the processing of IGMP messages by the system. The use of P2MP LSPs is not supported when sending IPv4 multicast into an EVPN-MPLS R-VPLS service from its IP interface.

In the following show command output, the EVPN-MPLS destinations are shown as part of the MFIB (when igmp-snooping is in a no shutdown state), and the EVPN-MPLS logical interface is shown as an mrouter.

*A:PE-2# show service id 2000 mfib 
===============================================================================
Multicast FIB, Service 2000
===============================================================================
Source Address  Group Address         SAP or SDP Id                   Svc Id   Fwd
                                                                            Blk
-------------------------------------------------------------------------------
*               *                     eMpls:192.0.2.3:262132       Local    Fwd
                                      eMpls:192.0.2.4:262136       Local    Fwd
                                      eMpls:192.0.2.5:262131       Local    Fwd
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
*A:PE-2# show service id 2000 igmp-snooping base 
===============================================================================
IGMP Snooping Base info for service 2000
===============================================================================
Admin State : Up
Querier     : 10.0.0.3 on evpn-mpls
-------------------------------------------------------------------------------
SAP or SDP                   Oper MRtr Pim  Send Max   Max  Max   MVR       Num
Id                        Stat Port Port Qrys Grps  Srcs Grp   From-VPLS Grps
                                                         Srcs            
-------------------------------------------------------------------------------
sap:1/1/1:2000            Up   No   No   No   None  None None  Local     0
evpn-mpls                 Up   Yes  N/A  N/A  N/A   N/A  N/A   N/A       N/A
===============================================================================


*A:PE-4# show service id 2000 igmp-snooping mrouters 
 
===============================================================================
IGMP Snooping Multicast Routers for service 2000
===============================================================================
MRouter          SAP or SDP Id                 Up Time        Expires   Version
-------------------------------------------------------------------------------
10.0.0.3         evpn-mpls                  0d 00:38:49    175s      3
-------------------------------------------------------------------------------
Number of mrouters: 1
===============================================================================

The equivalent output for PBB-EVPN services is similar to the output above for EVPN-MPLS services, with the exception that the EVPN destinations are named "b-EVPN-MPLS".

Data-driven IGMP snooping synchronization with EVPN multihoming

When single-active multihoming is used, the IGMP snooping state is learned on the active multihoming object. If a failover occurs, the system with the newly active multihoming object must wait for IGMP messages to be received to instantiate the IGMP snooping state after the ES activation timer expires; this could result in an increased outage.

The outage can be reduced by using MCS synchronization, which is supported for IGMP snooping in both EVPN-MPLS and PBB-EVPN services (see Multichassis synchronization for Layer 2 snooping states). However, MCS only supports synchronization between two PEs, whereas EVPN multihoming is supported between a maximum of four PEs. Also, IGMP snooping state can be synchronized only on a SAP.

An increased outage would also occur when using all-active EVPN multihoming. The IGMP snooping state on an ES LAG SAP or virtual ES to the attached CE must be synchronized between all the ES PEs, as the LAG link used by the DF PE may not be the same as that used by the attached CE. MCS synchronization is not applicable to all-active multihoming as MCS only supports active/standby synchronization.

To eliminate any additional outage on a multihoming failover, IGMP snooping messages can be synchronized between the PEs on an ES using data-driven IGMP snooping state synchronization, which is supported in EVPN-MPLS services, PBB-EVPN services, EVPN-MPLS VPRN and IES R-VPLS services. The IGMP messages received on an ES SAP or spoke SDP are sent to the peer ES PEs with an ESI label (for EVPN-MPLS) or ES B-MAC (for PBB-EVPN) and these are used to synchronize the IGMP snooping state on the ES SAP or spoke SDP on the receiving PE.

Data-driven IGMP snooping state synchronization is supported for both all-active multihoming and single-active with an ESI label multihoming in EVPN-MPLS, EVPN-MPLS VPRN and IES R-VPLS services, and for all-active multihoming in PBB-EVPN services. All PEs participating in a multihomed ES must be running an SR OS version supporting this capability. PBB-EVPN with IGMP snooping using single-active multihoming is not supported.

Data-driven IGMP snooping state synchronization is also supported with P2MP mLDP LSPs in both EVPN-MPLS and PBB-EVPN services. When P2MP mLDP LSPs are used in EVPN-MPLS services, all PEs (including the PEs not connected to a multihomed ES) in the EVPN-MPLS service must be running an SR OS version supporting this capability with IGMP snooping enabled and all network interfaces must be configured on FP3 or higher-based line cards.

Data-driven IGMP snooping synchronization with EVPN multihoming shows the processing of an IGMP message for EVPN-MPLS. In PBB-EVPN services, the ES B-MAC is used instead of the ESI label to synchronize the state.

Figure 50. Data-driven IGMP snooping synchronization with EVPN multihoming

Data-driven synchronization is enabled by default when IGMP snooping is enabled within an EVPN-MPLS service using all-active multihoming or single-active with an ESI label multihoming, or in a PBB-EVPN service using all-active multihoming. If IGMP snooping MCS synchronization is enabled on an EVPN-MPLS or PBB-EVPN (I-VPLS) multihoming SAP then MCS synchronization takes precedence over the data-driven synchronization and the MCS information is used. Mixing data-driven and MCS IGMP synchronization within the same ES is not supported.

When using EVPN-MPLS, the ES should be configured as non-revertive to avoid an outage when a PE takes over the DF role. The Ethernet A-D per ESI route update is withdrawn when the ES is down which prevents state synchronization to the PE with the ES down, as it does not advertise an ESI label. The lack of state synchronization means that if the ES comes up and that PE becomes DF after the ES activation timer expires, it may not have any IGMP snooping state until the next IGMP messages are received, potentially resulting in an additional outage. Configuring the ES as non-revertive can avoid this potential outage. Configuring the ES to be non-revertive would also avoid an outage when PBB-EVPN is used, but there is no outage related to the lack of the ESI label as it is not used in PBB-EVPN.

The following steps can be used when enabling IGMP snooping in EVPN-MPLS and PBB-EVPN services:

  1. Upgrade SR OS on all ES PEs to a version supporting data-driven IGMP snooping synchronization with EVPN multihoming.

  2. Enable IGMP snooping in the required services on all ES PEs. Traffic loss occurs until all ES PEs have IGMP snooping enabled and the first set of join/query messages are processed by the ES PEs.

    Note: There is no action required on the non-ES PEs.

If P2MP mLDP LSPs are also configured, the following steps can be used when enabling IGMP snooping in EVPN-MPLS and PBB-EVPN services:

  1. Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven IGMP snooping synchronization with EVPN multihoming.

  2. Enable IGMP snooping in EVPN-MPLS and PBB-EVPN services.
    • Perform the following steps for EVPN-MPLS:

      • Enable IGMP snooping on all non-ES PEs. Traffic loss occurs until the first set of join/query messages are processed by the non-ES PEs.

      • Then enable IGMP snooping on all ES PEs. Traffic loss occurs until all PEs have IGMP snooping enabled and the first set of join/query messages are processed by the ES PEs.

    • Perform the following steps for PBB-EVPN:

      • Enable IGMP snooping on all ES PEs. Traffic loss occurs until all PEs have IGMP snooping enabled and the first set of join/query messages are processed by the ES PEs.

      • There is no action required on the non-ES PEs.

To aid with troubleshooting, the debug packet output displays the IGMP packets used for the snooping state synchronization. An example of a join sent on ES esi-1 from one ES PE and the same join received on another ES PE follows.

6 2017/06/16 18:00:07.819 PDT MINOR: DEBUG #2001 Base IGMP
"IGMP: TX packet on svc 1
  from chaddr 5e:00:00:16:d8:2e
  send towards ES:esi-1
  Port  : evpn-mpls
  SrcIp : 0.0.0.0
  DstIp : 239.0.0.22
  Type  : V3 REPORT
    Num Group Records: 1
        Group Record Type: MODE_IS_EXCL (2), AuxDataLen 0, Num Sources 0
          Group Addr: 239.0.0.1


4 2017/06/16 18:00:07.820 PDT MINOR: DEBUG #2001 Base IGMP
"IGMP: RX packet on svc 1
  from chaddr d8:2e:ff:00:01:41
  received via evpn-mpls on ES:esi-1
  Port  : sap lag-1:1
  SrcIp : 0.0.0.0
  DstIp : 239.0.0.22
  Type  : V3 REPORT
    Num Group Records: 1
        Group Record Type: MODE_IS_EXCL (2), AuxDataLen 0, Num Sources 0
          Group Addr: 239.0.0.1

PIM snooping for IPv4 in EVPN-MPLS and PBB-EVPN services

PIM snooping for VPLS allows a VPLS PE router to build multicast states by snooping PIM protocol packets that are sent over the VPLS. The VPLS PE then forwards multicast traffic based on the multicast states. When all receivers in a VPLS are IP multicast routers running PIM, multicast forwarding in the VPLS is efficient when PIM snooping for VPLS is enabled.

PIM snooping for IPv4 is supported in EVPN-MPLS (for VPLS and R-VPLS) and PBB-EVPN I-VPLS (where BGP EVPN is running in the associated B-VPLS service) services. It is enabled using the following command (as IPv4 multicast is enabled by default):

configure service vpls <service-id> pim-snooping 

PIM snooping on SAPs and spoke SDPs operates in the same way as in a plain VPLS service. However, EVPN-MPLS/PBB-EVPN B-VPLS destinations are treated as a single PIM interface, specifically:

  • Hellos and join/prune messages from SAPs or SDPs are always sent to all EVPN-MPLS or PBB-EVPN B-VPLS destinations.

  • As soon as a hello message is received from a PIM neighbor on an EVPN-MPLS or PBB-EVPN I-VPLS destination, the single interface representing all EVPN-MPLS or PBB-EVPN I-VPLS destinations has that PIM neighbor.

  • The EVPN-MPLS or PBB-EVPN B-VPLS destination split horizon logic ensures that IP multicast traffic and PIM messages received on an EVPN-MPLS or PBB-EVPN B-VPLS destination are not forwarded back to other EVPN-MPLS or PBB-EVPN B-VPLS destinations.

  • The debug trace output displays one copy of messages being sent to all EVPN-MPLS or PBB-EVPN B-VPLS destinations (the trace does not show a copy for each destination) and displays messages received from all EVPN-MPLS or PBB-EVPN B-VPLS destinations as coming from a single EVPN-MPLS interface.

PIM snooping for IPv4 is supported in EVPN-MPLS services using P2MP LSPs and PBB-EVPN I-VPLS services with P2MP LSPs in the associated B-VPLS service. When PIM snooping is enabled with P2MP LSPs, at least one EVPN-MPLS multicast destination is required to be established to enable the processing of PIM messages by the system.

Multichassis synchronization (MCS) of PIM snooping for IPv4 state is supported for both SAPs and spoke SDPs, which can be used with single-active multihoming. Care should be taken when using *.null to define the range for a QinQ virtual ES if the associated SAPs are also being synchronized by MCS, as there is no MCS sync-tag support equivalent to the *.null range.

PBB-EVPN services operate in a similar way to regular PBB services, specifically:

  • The multicast flooding between the I-VPLS and the B-VPLS works in a similar way as for PIM snooping for IPv4 with an I-VPLS using a regular B-VPLS. The first PIM join message received over the local B-VPLS from a B-VPLS SAP or SDP or EVPN destination adds all of the B-VPLS SAP or SDP or EVPN components into the related multicast forwarding table associated with that I-VPLS context. The multicast packets are forwarded throughout the B-VPLS on the per ISID single tree.

  • When a PIM router is connected to a remote I-VPLS instance over the B-VPLS infrastructure, its location is identified by the B-VPLS SAP, SDP or by the set of all EVPN destinations on which its PIM hellos are received. The location is also identified by the source B-MAC address used in the PBB header for the PIM hello message (this is the B-MAC associated with the B-VPLS instance on the remote PBB PE).

In EVPN-MPLS services, the individual EVPN-MPLS destinations appear in the MFIB but the information for each EVPN-MPLS destination entry is always identical, as shown below:

*A:PE# show service id 1 mfib
===============================================================================
Multicast FIB, Service 1
===============================================================================
Source Address  Group Address         Port Id                      Svc Id   Fwd
                                                                            Blk
-------------------------------------------------------------------------------
*               239.252.0.1           sap:1/1/9:1                  Local    Fwd
                                      eMpls:1.1.1.2:262141         Local    Fwd
                                      eMpls:1.1.1.3:262141         Local    Fwd
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
*A:PE#

Similarly for the PIM neighbors:

*A:PE# show service id 1 pim-snooping neighbor
===============================================================================
PIM Snooping Neighbors ipv4
===============================================================================
Port Id                 Nbr DR Prty     Up Time       Expiry Time     Hold Time
  Nbr Address
-------------------------------------------------------------------------------
SAP:1/1/9:1             1               0d 00:08:17   0d 00:01:29     105
  10.0.0.1
EVPN-MPLS               1               0d 00:27:26   0d 00:01:19     105
  10.0.0.2
EVPN-MPLS               1               0d 00:27:26   0d 00:01:19     105
  10.0.0.3
-------------------------------------------------------------------------------
Neighbors : 3
===============================================================================
*A:PE#

A single EVPN-MPLS interface is shown in the outgoing interface list, as can be seen in the following output:

*A:PE# show service id 1 pim-snooping group detail
===============================================================================
PIM Snooping Source Group ipv4
===============================================================================
Group Address      : 239.252.0.1
Source Address     : *
Up Time            : 0d 00:07:07
Up JP State        : Joined             Up JP Expiry       : 0d 00:00:37
Up JP Rpt          : Not Joined StarG   Up JP Rpt Override : 0d 00:00:00
RPF Neighbor       : 10.0.0.1
Incoming Intf      : SAP:1/1/9:1
Outgoing Intf List : EVPN-MPLS, SAP:1/1/9:1
Forwarded Packets  : 0                  Forwarded Octets   : 0
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
*A:PE#

An example of the debug trace output for a join received on an EVPN-MPLS destination is shown below:

A:PE1# debug service id 1 pim-snooping packet jp
A:PE1#
32 2016/12/20 14:21:22.68 CET MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: Join/Prune
[000 02:16:02.460] PIM-RX ifId 1071394 ifName EVPN-MPLS 10.0.0.3 -> 224.0.0.13 
Length: 34
PIM Version: 2 Msg Type: Join/Prune Checksum: 0xd3eb
Upstream Nbr IP : 10.0.0.1 Resvd: 0x0, Num Groups 1, HoldTime 210
        Group: 239.252.0.1/32 Num Joined Srcs: 1, Num Pruned Srcs: 0
        Joined Srcs:
                10.0.0.1/32 Flag SWR  <*,G>

The equivalent output for PBB-EVPN services is similar to that above for EVPN-MPLS services, with the exception that the EVPN destinations are named "b-EVPN-MPLS".

Data-driven PIM snooping for IPv4 synchronization with EVPN multihoming

When single-active multihoming is used, PIM snooping for IPv4 state is learned on the active multihoming object. If a failover occurs, the system with the newly active multihoming object must wait for IPv4 PIM messages to be received to instantiate the PIM snooping for IPv4 state after the ES activation timer expires, which could result in an increased outage.

This outage can be reduced by using MCS synchronization, which is supported for PIM snooping for IPv4 in both EVPN-MPLS and PBB-EVPN services (see Multichassis synchronization for Layer 2 snooping states). However, MCS only supports synchronization between two PEs, whereas EVPN multihoming is supported between a maximum of four PEs.

An increased outage would also occur when using all-active EVPN multihoming. The PIM snooping for IPv4 state on an all-active ES LAG SAP or virtual ES to the attached CE must be synchronized between all the ES PEs, as the LAG link used by the DF PE may not be the same as that used by the attached CE. MCS synchronization is not applicable to all-active multihoming as MCS only supports active/standby synchronization.

To eliminate any additional outage on a multihoming failover, snooped IPv4 PIM messages should be synchronized between the PEs on an ES using data-driven PIM snooping for IPv4 state synchronization, which is supported in both EVPN-MPLS and PBB-EVPN services. The IPv4 PIM messages received on an ES SAP or spoke SDP are sent to the peer ES PEs with an ESI label (for EVPN-MPLS) or ES B-MAC (for PBB-EVPN) and are used to synchronize the PIM snooping for IPv4 state on the ES SAP or spoke SDP on the receiving PE.

Data-driven PIM snooping state synchronization is supported in EVPN-MPLS services for all-active multihoming and for single-active multihoming with an ESI label. All PEs participating in a multihomed ES must be running an SR OS version supporting this capability with PIM snooping for IPv4 enabled. It is also supported with P2MP mLDP LSPs in EVPN-MPLS services, in which case all PEs (including the PEs not connected to a multihomed ES) must have PIM snooping for IPv4 enabled and all network interfaces must be configured on FP3- or higher-based line cards.

In addition, data-driven PIM snooping state synchronization is supported for all-active multihoming in PBB-EVPN services and with P2MP mLDP LSPs in PBB-EVPN services. All PEs participating in a multihomed ES, and all PEs using PIM proxy mode (including the PEs not connected to a multihomed ES) in the PBB-EVPN service must be running an SR OS version supporting this capability and must have PIM snooping for IPv4 enabled. PBB-EVPN with PIM snooping for IPv4 using single-active multihoming is not supported.

Data-driven PIM snooping for IPv4 synchronization with EVPN multihoming shows the processing of an IPv4 PIM message for EVPN-MPLS. In PBB-EVPN services, the ES B-MAC is used instead of the ESI label to synchronize the state.

Figure 51. Data-driven PIM snooping for IPv4 synchronization with EVPN multihoming

Data-driven synchronization is enabled by default when PIM snooping for IPv4 is enabled within an EVPN-MPLS service using all-active multihoming or single-active multihoming with an ESI label, or within a PBB-EVPN service using all-active multihoming. If PIM snooping for IPv4 MCS synchronization is enabled on an EVPN-MPLS or PBB-EVPN (I-VPLS) multihoming SAP or spoke SDP, MCS synchronization takes preference over data-driven synchronization and the MCS information is used. Mixing data-driven and MCS PIM synchronization within the same ES is not supported.

When using EVPN-MPLS, the ES should be configured as non-revertive to avoid an outage when a PE takes over the DF role. The Ethernet A-D per-ESI route update is withdrawn when the ES is down, which prevents state synchronization to the PE with the ES down because that PE does not advertise an ESI label. Consequently, if that ES comes up and the PE becomes DF after the ES activation timer expires, it may not have any PIM snooping for IPv4 state until the next PIM messages are received, potentially resulting in an additional outage. Configuring the ES as non-revertive avoids this potential outage. A non-revertive configuration also avoids an outage when PBB-EVPN is used, although in that case there is no outage related to the lack of an ESI label, because the ESI label is not used in PBB-EVPN.
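The non-revertive behavior is part of the ES DF election configuration. The following is a minimal sketch assuming preference-based DF election; the ES name, ESI, preference value, and LAG are hypothetical, and the exact syntax may vary by SR OS release:

*A:PE# configure service system bgp-evpn
    ethernet-segment "ES-1" create
        esi 01:00:00:00:00:12:00:00:00:01
        multi-homing single-active
        service-carving
            mode manual
            manual
                preference non-revertive create
                    value 150
                exit
            exit
        exit
        lag 1
        no shutdown
    exit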

The following steps can be used when enabling PIM snooping for IPv4 (using PIM snooping and PIM proxy modes) in EVPN-MPLS and PBB-EVPN services:

  • PIM snooping mode

    1. Upgrade SR OS on all ES PEs to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.

    2. Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.

      Note: There is no action required on the non-ES PEs.
  • PIM proxy mode

    • EVPN-MPLS

      1. Upgrade SR OS on all ES PEs to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.

      2. Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.

        Note: There is no action required on the non-ES PEs.
    • PBB-EVPN

      1. Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.

      2. Enable PIM snooping for IPv4 on all non-ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by each non-ES PE.

      3. Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.

If P2MP mLDP LSPs are also configured, the following steps can be used when enabling PIM snooping for IPv4 (using PIM snooping and PIM proxy modes) in EVPN-MPLS and PBB-EVPN services.

  • PIM snooping mode

    1. Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.

    2. Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.

      Note: There is no action required on the non-ES PEs.
  • PIM proxy mode

    1. Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.

    2. Enable PIM snooping for IPv4 on all non-ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by each non-ES PE.

    3. Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.

In the above steps, when PIM snooping for IPv4 is enabled, the traffic loss can be reduced or eliminated by configuring a larger hold-time (up to 300 seconds), during which multicast traffic is flooded.
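For example, the hold-time could be set to its maximum value within the service before enabling PIM snooping; the service ID below is hypothetical:

*A:PE# configure service vpls 1
    pim-snooping
        hold-time 300
    exit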

To aid with troubleshooting, the debug packet output displays the PIM packets used for the snooping state synchronization. An example of a join sent on ES esi-1 from one ES PE and the same join received on another ES PE follows:

6 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: pimVplsFwdJPToEvpn
Forwarding to remote peer on bgp-evpn ethernet-segment esi-1"
7 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: Join/Prune
[000 00:19:37.040] PIM-TX ifId 1071394 ifName EVPN-MPLS-ES:esi-1 10.0.0.10 -> 224.0.0.13 Length: 34
PIM Version: 2 Msg Type: Join/Prune Checksum: 0xd2de
Upstream Nbr IP : 10.0.0.1 Resvd: 0x0, Num Groups 1, HoldTime 210
        Group: 239.0.0.10/32 Num Joined Srcs: 1, Num Pruned Srcs: 0
        Joined Srcs:
                10.0.0.1/32 Flag SWR  <*,G>


4 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: pimProcessPdu
Received from remote peer on bgp-evpn ethernet-segment esi-1, will be applied on
 lag-1:1
"
5 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: Join/Prune
[000 00:19:30.740] PIM-RX ifId 1071394 ifName EVPN-MPLS-ES:esi-1 10.0.0.10 -> 224.0.0.13 Length: 34
PIM Version: 2 Msg Type: Join/Prune Checksum: 0xd2de
Upstream Nbr IP : 10.0.0.1 Resvd: 0x0, Num Groups 1, HoldTime 210
        Group: 239.0.0.10/32 Num Joined Srcs: 1, Num Pruned Srcs: 0
        Joined Srcs:
                10.0.0.1/32 Flag SWR  <*,G>

EVPN E-Tree

This section contains information about EVPN E-Tree.

BGP EVPN control plane for EVPN E-Tree

The BGP EVPN control plane is extended and aligned with IETF RFC 8317 to support EVPN E-Tree services. EVPN E-Tree BGP routes shows the main EVPN extensions for the EVPN E-Tree information model.

Figure 52. EVPN E-Tree BGP routes

The following BGP extensions are implemented for EVPN E-Tree services:

  • An EVPN E-Tree extended community (EC) with sub-type 0x5 is defined, which includes the following information:

    • The low-order bit of the Flags field contains the L bit (where L=1 indicates a leaf AC).

    • The leaf label contains a 20-bit MPLS label in the high-order 20 bits of the label field. This leaf label is automatically allocated by the system or statically assigned by the evpn-etree-leaf-label <value> command.

  • The new E-Tree EC is sent with the following routes:

    • AD per-ES per PE route for BUM egress filtering:

      Each EVPN E-Tree capable PE advertises an AD per-ES route with the E-Tree EC, and the following information:

      • Service RD and route-target. If ad-per-es-route-target evi-rt-set is configured, non-zero ESI AD per-ES routes (used for multihoming) are sent per the evi-rt-set configuration, but E-Tree zero-ESI routes (used for E-Tree) are sent based on the default evi-rt configuration.

      • ESI = 0

      • Eth Tag = MAX-ET

      • MPLS label = zero

    • AD per-EVI route for root or leaf configuration consistency check as follows:

      • The E-Tree EC is sent with the AD per-EVI routes for a specific ES. In this case, no validation is performed by the implementation, and the leaf indication is only used for troubleshooting on the remote PEs.

      • The MPLS label value is zero.

      • All attachment circuits (ACs) in each EVI for a specific ES must be configured as either a root or leaf AC, but not a combination. In case of a configuration error, for example, where the AC in PE1 is configured as root and in PE2 as leaf AC, the remote PE3 receives the AD per-EVI routes with inconsistent leaf indication. However, the unicast filtering remains unaffected and is still handled by the FDB lookup information.

    • MAC/IP routes for known unicast ingress filtering as follows:

      • An egress PE sends all MAC/IP routes learned over a leaf AC SAP or spoke SDP with this E-Tree EC indicating that the MAC/IP belongs to a leaf AC.

      • The MPLS label value in the EC is 0.

      • Upon receiving a route with E-Tree EC, the ingress PE imports the route and installs the MAC in the FDB with a leaf flag (if there is a leaf indication in the route). Any frame coming from a leaf AC for which the MAC destination address (DA) matches a leaf AC MAC is discarded at the ingress.

      • If two PEs send the same MAC with the same ESI but inconsistent root or leaf AC indication, the MAC is installed in the FDB as root.

EVPN for MPLS tunnels in E-Tree services

EVPN E-Tree services are modeled as VPLS services configured as E-Trees with the bgp-evpn mpls context enabled.

The following example shows a CLI configuration of a VPLS E-Tree service with EVPN E-Tree service enabled.

*A:PE1>config>service>system>bgp-evpn#
  evpn-etree-leaf-label
*A:PE1>config>service# vpls 1 customer 1 etree create 
*A:PE1>config>service>vpls(etree)# info 
----------------------------------------------
  description "ETREE-enabled evpn-mpls-service"
  bgp-evpn
    evi 10
    mpls bgp 1
      no shutdown
      ecmp 2
      auto-bind-tunnel resolution any
      ingress-replication-bum-label
  sap lag-1:1 leaf-ac create 
  exit
  sap 2/1/1:1 leaf-ac create 
  exit
  sap 2/2/2:1 create 
  exit
  spoke-sdp 3:1 leaf-ac create 
  exit

The following configuration guidelines apply to the EVPN E-Tree service:

  • Before configuring an EVPN E-Tree service, the user must first run the evpn-etree-leaf-label <value> command. This is relevant for EVPN E-Tree services only. The command allocates an E-Tree leaf label on the system and, when a specific value is configured, the leaf label must match on all other PE nodes attached to the same EVPN service.

    Optionally, the evpn-etree-leaf-label <value> command may be configured with a static label value (within the static label range configured in the system using the config>router>mpls>mpls-label>static-label-range context). The static label is used when global leaf labels are needed in the network; for example, when at least one 7250 IXR Gen 1 router is attached to the EVPN E-Tree service.

  • The configure service vpls etree create command is compatible with the bgp-evpn mpls context.

  • As in VPLS E-Tree services, an AC that is not configured as a leaf AC is treated as a root AC.

  • MAC addresses learned over a leaf AC SAP or SDP binding are advertised as leaf MAC addresses.

  • Any PE with one or more bgp-evpn enabled VPLS E-Tree services advertises an AD per-ES per-PE route with the leaf indication and the leaf label used for BUM egress filtering.

  • Any leaf AC SAP or SDP binding defined in an ES triggers the advertisement of an AD per-EVI route with the leaf indication.

  • EVPN E-Tree services use the following CLI commands:

    • sap sap-id leaf-ac create command using the configure service vpls context

    • mesh-sdp sdp-id:vc-id leaf-ac create command using the configure service vpls context

    • spoke-sdp sdp-id:vc-id leaf-ac create command using the configure service vpls context

  • The root-leaf-tag command is blocked in VPLS E-Tree services where the bgp-evpn mpls context is enabled.
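As a hedged sketch of the first guideline above, a static leaf label could be assigned as follows; the label value 10000 is hypothetical and must lie within the static label range configured on the system:

*A:PE1# configure service system bgp-evpn
*A:PE1>config>service>system>bgp-evpn# evpn-etree-leaf-label 10000

The same value must then be configured on all other PE nodes attached to the same EVPN E-Tree service.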

EVPN E-Tree operation

EVPN E-Tree supports all operations related to flows among local root AC and leaf AC objects in accordance with IETF RFC 8317. This section describes the extensions required to forward traffic from (or to) root AC and leaf AC objects to (or from) BGP EVPN destinations.

EVPN E-Tree known unicast ingress filtering

Known unicast traffic forwarding is based on ingress PE filtering. EVPN E-Tree known unicast ingress filtering shows an example of EVPN E-Tree forwarding behavior for known unicast.

Figure 53. EVPN E-Tree known unicast ingress filtering

MAC addresses learned on leaf-ac objects are advertised in EVPN with their corresponding leaf indication.

In EVPN E-Tree known unicast ingress filtering, PE1 advertises MAC1 using the E-Tree EC and leaf indication, and PE2 installs MAC1 with a leaf flag in the FDB.

Assuming the MAC DA is present in the local FDB (MAC1 in the FDB of PE2), when PE2 receives a frame, the frame is handled as follows:

  • If the unicast frame enters a root-ac, the frame follows regular data plane procedures; that is, it is sent to the owner of the MAC DA (local SAP or SDP binding or remote BGP EVPN PE) without any filtering.

  • If the unicast frame enters a leaf-ac, it is handled as follows:

    1. A MAC DA lookup is performed on the FDB.

    2. If there is a hit and the MAC was learned as an EVPN leaf (or from a leaf-ac), then the frame is dropped at ingress.

    3. The source MAC (MAC2) is learned and marked as a leaf-learned MAC. It is advertised by the EVPN with the corresponding leaf indication.

  • A MAC received with a root and leaf indication from different PEs in the same ES is installed as root.

The ingress filtering for E-Tree leaf-to-leaf traffic requires the implementation of an extra leaf EVPN MPLS destination per remote PE (containing leaf objects) per E-Tree service. The ingress filtering for E-Tree leaf-to-leaf traffic is as follows:

  • A separate EVPN MPLS destination is created for unicast leaf traffic in the service. This internal EVPN MPLS destination is created for each remote PE that contains a leaf and advertises at least one leaf MAC.

  • The creation of the internal EVPN MPLS destination is triggered when a MAC route with L=1 in the E-Tree EC is received. Any EVPN E-Tree service can potentially use one extra EVPN MPLS destination for leaf unicast traffic per remote PE.

    The extra destination in the EVPN E-Tree service is for unicast only and it is not part of the flooding list. It is resource-accounted and displayed in the tools dump service evpn usage command, as shown in the following example output.

    A:PE-4# tools dump service evpn usage 
    vxlan-evpn-mpls usage statistics at 01/23/2017 00:53:14:
    MPLS-TEP                                        :             3
    VXLAN-TEP                                       :             0
    Total-TEP                                       :      3/ 16383
    Mpls Dests (TEP, Egress Label + ES + ES-BMAC)   :            10
    Mpls Etree Leaf Dests                           :             1
    Vxlan Dests (TEP, Egress VNI)                   :             0
    Total-Dest                                      :     10/196607
    Sdp Bind +  Evpn Dests                          :     13/245759
    ES L2/L3 PBR                                    :      0/ 32767
    Evpn Etree Remote BUM Leaf Labels               :             3
    
  • MACs received with L=1 point to the leaf EVPN MPLS destination, whereas root MACs point to the "root" destination.

EVPN E-Tree BUM egress filtering

BUM traffic forwarding is based on egress PE filtering. EVPN E-Tree BUM egress filtering shows an example of EVPN E-Tree forwarding behavior for BUM traffic.

Figure 54. EVPN E-Tree BUM egress filtering

In EVPN E-Tree BUM egress filtering, BUM frames are handled as follows when they enter the ingress PE (PE2):

  • If the BUM frame enters a root-ac, the frame follows regular EVPN data plane procedures.

  • If the BUM frame enters a leaf-ac, the frame handling is as follows:

    1. The frame is marked as leaf and forwarded or replicated to the egress IOM.

    2. At the egress IOM, the frame is flooded in the default multicast list subject to the following:

      • Leaf entries are skipped when BUM traffic is forwarded. This prevents leaf-to-leaf BUM traffic forwarding.

      • Traffic to remote BGP EVPN PEs is encapsulated with the EVPN label stack. If a leaf ESI label is present for the far-end PE (L1 in EVPN E-Tree BUM egress filtering), the leaf ESI label is added at the bottom of the stack; the remaining stack follows (including the EVI label). If there is no leaf ESI label for the far-end egress PE, no additional label is added to the stack. In this case, the egress PE does not have any E-Tree enabled service, but it can still work with the VPLS E-Tree service available in PE2.

The BUM-encapsulated packet is received on the network ingress interface at the egress PE (PE1). The packet is processed as follows.

  1. A normal ILM lookup is performed for each label (including the EVI label) in the stack.

  2. Further label lookups are performed after the EVI label ILM lookup is complete. If the lookup yields a leaf label, all the leaf-acs are skipped when flooding to the default multicast list at the egress PE.

EVPN E-Tree egress filtering based on MAC source address

The egress PE checks the MAC Source Address (SA) for traffic received without the leaf MPLS label. This check covers corner cases where the ingress PE sends traffic originating from a leaf-ac but without a leaf indication.

In EVPN E-Tree BUM egress filtering, PE2 receives a frame with MAC DA = MAC3 and MAC SA = MAC2. Because MAC3 is a root MAC, the MAC lookup at PE2 allows the system to unicast the packet to PE1 without the leaf label. If MAC3 were no longer in PE1's FDB, PE1 would flood the frame to all the root and leaf-acs, despite the frame having originated from a leaf-ac.

To prevent leaf traffic from leaking to other leaf-acs (as described in the preceding case), the egress PE always performs a MAC SA check for all types of traffic. The datapath performs MAC SA-based egress filtering as follows:

  1. An Ethernet frame may be treated as originating from a leaf-ac for several reasons, which require the system to set a flag indicating leaf traffic. The flag is set if one of the following conditions is true:

    • The frames arrive on a leaf SAP.
    • EVPN traffic arrives with a leaf label.
    • A MAC SA is flagged as a leaf SA.
  2. After the flag is set, the action taken depends on the type of traffic:

    • unicast traffic

      An FDB lookup is performed, and if the MAC DA FDB entry is marked as a leaf type, the frame is dropped to prevent leaf-to-leaf forwarding.

    • BUM traffic

      The flag is considered at the egress IOM and leaf-to-leaf forwarding is suppressed.

EVPN E-Tree and EVPN multihoming

EVPN E-Tree procedures support all-active and single-active EVPN multihoming. Ingress filtering can handle MACs learned on ES leaf-ac SAP or SDP bindings. If a MAC associated with an ES leaf-ac is advertised with a different E-Tree indication or if the AD per-EVI routes have inconsistent leaf indications, then the remote PEs performing the aliasing treat the MAC as root.

EVPN E-Tree BUM egress filtering and multihoming shows the expected behavior for multihoming and egress BUM filtering.

Figure 55. EVPN E-Tree BUM egress filtering and multihoming

Multihoming and egress BUM filtering in EVPN E-Tree BUM egress filtering and multihoming are handled as follows:

  • BUM frames received on an ES leaf-ac are flooded to the EVPN based on EVPN E-Tree procedures. The leaf ESI label is sent when flooding to other PEs in the same ES, and additional labels are not added to the stack.

    When flooding in the default multicast list, the egress PE skips all the leaf-acs (including the ES leaf-acs) on the assumption that all ACs in a specific ES for a specified EVI have a consistent E-Tree configuration, and they send an AD per-EVI route with a consistent E-Tree indication.

  • BUM frames received on an ES root-ac are flooded to the EVPN based on regular EVPN procedures. The regular ES label is sent for split-horizon when packets are sent to the DF or NDF PEs in the same ES. When flooding in the default multicast list, the egress PE skips the ES SAPs based on the ES label lookup.

If the PE receives an ES MAC from a peer that shares the ES and decides to install it against the local ES SAP that is oper-up, it checks the E-Tree configuration (root or leaf) of the local ES SAP against the received MAC route. The MAC route is processed as follows:

  • If the E-Tree configuration does not match, then the MAC is not installed against any destination until the misconfiguration is resolved.

  • If the SAP is oper-down, the MAC is installed against the EVPN destination to the peer.

PBB-EVPN E-Tree services

SR OS supports PBB-EVPN E-Tree services in accordance with IETF RFC 8317. PBB-EVPN E-Tree services are modeled as PBB-EVPN services in which some I-VPLS services are configured as etree and some of their SAPs or spoke SDPs are configured as leaf-acs.

The procedures for the PBB-EVPN E-Tree are similar to those for the EVPN E-Tree, except that the egress leaf-to-leaf filtering for BUM traffic is based on the B-MAC source address. Also, the leaf label and the EVPN AD routes are not used.

The PBB-EVPN E-Tree operation is as follows:

  • When one or more I-VPLS E-Tree services are linked to a B-VPLS, the leaf backbone source MAC address (leaf-source-bmac parameter) is used for leaf-originated traffic in addition to the source B-VPLS MAC address (source-bmac parameter) that is used for sourcing root traffic.

  • The leaf backbone source MAC address for PBB must be configured using the command config>service>pbb>leaf-source-bmac ieee-address before the configuration of any I-VPLS E-Tree service.

  • The leaf-source-bmac address is advertised in a B-MAC route with a leaf indication.

  • Known unicast filtering occurs at the ingress PE. When a frame enters an I-VPLS leaf-ac, a MAC lookup is performed. If the C-MAC DA is associated with a leaf B-MAC, the frame is dropped.

  • Leaf-to-leaf BUM traffic filtering occurs at the egress PE. When flooding BUM traffic with the B-MAC SA matching a leaf B-MAC, the egress PE skips the I-VPLS leaf-acs.

The following CLI example shows an I-VPLS E-Tree service that uses PBB-EVPN E-Tree. The leaf-source-bmac address must be configured before the configuration of the I-VPLS E-Tree service. As in regular E-Tree services, SAPs and spoke SDPs that are not explicitly configured as leaf-acs are considered root-ac objects.

A:PE-2>config>service# info
----------------------------------------------
        pbb
            leaf-source-bmac 00:00:00:00:00:22
        exit
        vpls 1000 customer 1 name "vpls1000" b-vpls create
            service-mtu 2000
            bgp
            exit
            bgp-evpn
                evi 1000
                exit
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp                      
                shutdown
            exit
            no shutdown
        exit
        vpls 1001 customer 1 i-vpls etree create
            pbb
                backbone-vpls 1000
                exit
            exit
            stp
                shutdown
            exit
            sap 1/1/1:1001 leaf-ac create
                no shutdown
            exit
            sap 1/1/1:1002 create
                no shutdown
            exit
            no shutdown
        exit

The following considerations apply to PBB-EVPN E-Trees and multihoming:

  • All-active multihoming is not supported on leaf-ac I-VPLS SAPs.

  • Single-active multihoming is supported on leaf-ac I-VPLS SAPs and spoke SDPs.

  • ISID-based and RFC 7623-based C-MAC flush are supported with PBB-EVPN E-Tree services and single-active multihoming.

MPLS entropy label and hash label

The router supports the MPLS entropy label (RFC 6790) and the Flow-Aware Transport (FAT) label, known as the hash label (RFC 6391), for EVPN VPLS and Epipe services. The hash label is supported on spoke SDPs bound to a VPLS EVPN service, and on EVPN unicast destinations (in Epipe and VPLS services) if enabled by the hash-label command. These labels allow LSR nodes in a network to load-balance labeled packets in a much more granular fashion than hashing on the standard label stack alone. The entropy label can be enabled on BGP-EVPN services (VPLS and Epipe).

To configure insertion of the entropy label on a BGP-EVPN VPLS or Epipe service, use the entropy-label command in the bgp-evpn>mpls context. To configure insertion of the entropy label on spoke SDPs bound to a BGP-EVPN VPLS, use the entropy-label command in the spoke-sdp context. The entropy label is only inserted if the far end of the MPLS tunnel is also entropy-label-capable. For more information, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide.

The hash label is configured using the hash-label command in the spoke-sdp context. Either the hash label or the entropy label can be configured on one object, but not both.
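As a hedged sketch of the configuration described above, both labels could be enabled in the same service on different objects (the service and spoke-SDP IDs are hypothetical):

*A:PE# configure service vpls 1 bgp-evpn mpls entropy-label
*A:PE# configure service vpls 1 spoke-sdp 3:1 hash-label

Because the two commands apply to different objects (the EVPN destinations and the spoke SDP, respectively), this does not violate the restriction that a single object cannot have both labels configured.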

Inter-AS Option B and Next-Hop-Self Route-Reflector for EVPN-MPLS

Inter-AS Option B and Next-Hop-Self Route-Reflector (VPN-NH-RR) functions are supported for the BGP-EVPN family in the same way both functions are supported for IP-VPN families.

A typical use case for EVPN Inter-AS Option B or EVPN VPN-NH-RR is Data Center Interconnect (DCI) networks, where cloud and service providers are looking for efficient ways to extend their Layer 2 and Layer 3 tenant services beyond the data center and provide a tighter DC-WAN integration. While the instantiation of EVPN services in the DGW to provide this DCI connectivity is a common model, some operators use Inter-AS Option B or VPN-NH-RR connectivity to allow the DGW to function as an ASBR or ABR respectively, and the services are only instantiated on the edge devices.

EVPN inter-AS Option B or VPN-NH-RR model shows a DCI example where the EVPN services in two DCs are interconnected without the need for instantiating services on the DC GWs.

Figure 56. EVPN inter-AS Option B or VPN-NH-RR model

The ASBRs or ABRs connect the DC to the WAN at the control plane and data plane levels where the following considerations apply:

  • From a control plane perspective, the ASBRs or ABRs perform the following tasks:

    1. accept EVPN-MPLS routes from a BGP peer

      EVPN-VXLAN routes are not supported.

    2. extract the MPLS label from the EVPN NLRI or attribute and program a label swap operation on the IOM

    3. re-advertise the EVPN-MPLS route to the BGP peer in the other Autonomous Systems (ASs) or IGP domains

      The re-advertised route has a Next-Hop-Self and a new label encoded for those routes that came with a label.

  • From a data plane perspective, the ASBRs and ABRs terminate the ingress transport tunnel, perform an EVPN label swap operation, and send the packets on to an interface (if E-BGP is used) or a new tunnel (if IBGP is used).

  • The ASBR or ABR resolves the EVPN routes based on the existing bgp next-hop-resolution command for family vpn, where vpn refers to EVPN, VPN-IPv4, and VPN-IPv6 families.

*A:ABR-1# configure router bgp next-hop-resolution labeled-routes transport-tunnel 
family vpn resolution-filter 
  - resolution-filter
 [no] bgp             - Use BGP tunnelling for next hop resolution
 [no] ldp             - Use LDP tunnelling for next hop resolution
 [no] rsvp            - Use RSVP tunnelling for next hop resolution
 [no] sr-isis         - Use sr-isis tunnelling for next hop resolution
 [no] sr-ospf         - Use sr-ospf for next hop resolution
 [no] sr-te           - Use sr-te for next hop resolution
 [no] udp             - Use udp for next hop resolution

For more information about the next-hop resolution of BGP-labeled routes, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Unicast Routing Protocols Guide.

Inter-AS Option B for EVPN services on ASBRs and VPN-NH-RR on ABRs re-use the existing commands enable-inter-as-vpn and enable-rr-vpn-forwarding respectively. The two commands enable the ASBR or ABR function for both EVPN and IP-VPN routes. These two features can be used with the following EVPN services:

  • EVPN-MPLS Epipe services (EVPN-VPWS)

  • EVPN-MPLS VPLS services

  • EVPN-MPLS R-VPLS services

  • PBB-EVPN and PBB-EVPN E-Tree services

  • EVPN-MPLS E-Tree services

  • PE and ABR functions (EVPN services and enable-rr-vpn-forwarding), which are both supported on the same router

  • PE and ASBR functions (EVPN services and enable-inter-as-vpn), which are both supported on the same router
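Because both features reuse the existing IP-VPN commands, enabling them is a single step per node. A minimal sketch, assuming the classic CLI places both commands in the config>router>bgp context:

```
# On an ABR acting as Next-Hop-Self Route-Reflector for EVPN and IP-VPN
*A:ABR-1# configure router bgp enable-rr-vpn-forwarding

# On an ASBR providing Inter-AS Option B
*A:ASBR-1# configure router bgp enable-inter-as-vpn
```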

The following sub-sections clarify some aspects of EVPN when used in an Inter-AS Option B or VPN-NH-RR network.

Inter-AS Option B and VPN-NH-RR procedures on EVPN routes

When enable-rr-vpn-forwarding or enable-inter-as-vpn is configured, only EVPN-MPLS routes are processed for label swap and the next hop is changed. EVPN-VXLAN routes are re-advertised without a change in the next hop.

The following shows how the router processes and re-advertises the different EVPN route types. For more information about the route fields, see BGP-EVPN control plane for MPLS tunnels.

  • Auto-discovery (AD) routes (type 1)

    For AD per EVI routes, the MPLS label is extracted from the route NLRI. The route is re-advertised with Next-Hop-Self (NHS) and a new label. No modifications are made for the remaining attributes.

    For AD per ES routes, the MPLS label in the NLRI is zero. The route is re-advertised with NHS and the MPLS label remains zero. No modifications are made for the remaining attributes.

  • MAC/IP routes (type 2)

    The MPLS label (Label-1) is extracted from the NLRI. The route is re-advertised with NHS and a new Label-1. No modifications are made for the remaining attributes.

  • Inclusive Multicast Ethernet Tag (IMET) routes (type 3)

    Because there is no MPLS label present in the NLRI, the MPLS label is extracted from the PMSI Tunnel Attribute (PTA) if needed, and the route is then re-advertised with NHS, with the following considerations:

    • For IMET routes with tunnel-type Ingress Replication, the router extracts the IR label from the PTA. The router programs the label swap and re-advertises the route with a new label in the PTA.

    • For tunnel-type P2MP mLDP, the router re-advertises the route with NHS. No label is extracted; therefore, no swap operation occurs.

    • For tunnel-type Composite, the IR label is extracted from the PTA, the swap operation is programmed and the route re-advertised with NHS. A new label is encoded in the PTA’s IR label with no other changes in the remaining fields.

    • For tunnel-type AR, the routes are always considered VXLAN routes and are re-advertised with the next-hop unchanged.

  • Ethernet-Segment (ES) routes (type 4)

    Because ES routes do not contain an MPLS label, the route is re-advertised with NHS and no modifications to the remaining attributes. Although an ASBR or ABR re-advertises ES routes, EVPN multihoming for ES PEs located in different ASs or IGP domains is not supported.

  • IP-Prefix routes (type 5)

    The MPLS label is extracted from the NLRI and the route is re-advertised with NHS and a new label. No modifications are made to the remaining attributes.

BUM traffic in inter-AS Option B and VPN-NH-RR networks

Inter-AS Option B and VPN-NH-RR support the use of non-segmented trees for forwarding BUM traffic in EVPN.

For ingress replication and non-segmented trees, the ASBR or ABR performs an EVPN BUM label swap without any aggregation or further replication. This concept is shown in VPN-NH-RR and ingress replication for BUM traffic.

Figure 57. VPN-NH-RR and ingress replication for BUM traffic

In VPN-NH-RR and ingress replication for BUM traffic, when PE2, PE3, and PE4 advertise their IMET routes, the ABRs re-advertise the routes with NHS and a different label. However, IMET routes are not aggregated; therefore, PE1 sets up three different EVPN multicast destinations and sends three copies of every BUM packet, even if they are sent to the same ABR. This example is also applicable to ASBRs and Inter-AS Option B.

P2MP mLDP may also be used with VPN-NH-RR, but not with Inter-AS Option B. The ABRs, however, do not aggregate or change the mLDP root IP addresses in the IMET routes. The root IP addresses must be leaked across IGP domains. For example, if PE2 advertises an IMET route with mLDP or composite tunnel type, PE1 is able to join the mLDP tree if the root IP is leaked into PE1’s IGP domain.

EVPN multihoming in inter-AS Option B and VPN-NH-RR networks

In general, EVPN multihoming is supported in Inter-AS Option B or VPN-NH-RR networks with the following limitations:

  • An ES PE can only process a remote ES route correctly if the received next hop and origination IP address match. EVPN multihoming is not supported when the ES PEs are in different ASs or IGP domains, or if there is an NH-RR peering with the ES PEs and overriding the ES route next hops.

  • EVPN multihoming ESs are not supported on EVPN PEs that are also ABRs or ASBRs.

  • Mass-withdraw based on the AD per-ES routes is not supported for a PE that is in a different AS or IGP domain than the ES PEs. EVPN multihoming with inter-AS Option B or VPN-NH-RR shows an EVPN multihoming scenario where the ES PEs, PE2 and PE3, and the remote PE, PE1, are in different ASs or IGP domains.

Figure 58. EVPN multihoming with inter-AS Option B or VPN-NH-RR

In EVPN multihoming with inter-AS Option B or VPN-NH-RR, PE1’s aliasing and backup functions to the remote ES-1 are supported. However, PE1 cannot identify the originating PE for the received AD per-ES routes because they are both arriving with the same next hop (ASBR/ABR4) and RDs may not help to correlate each AD per-ES route to a specified PE. Therefore, if there is a failure on PE2’s ES link, PE1 cannot remove PE2 from the destinations list for ES-1 based on the AD per-ES route. PE1 must wait for the AD per-EVI route withdrawals to remove PE2 from the list. In summary, when the ES PEs and the remote PE are in different ASs or IGP domains, per-service withdrawal based on AD per-EVI routes is supported, but mass-withdrawal based on AD per-ES routes is not supported.

EVPN E-Tree in inter-AS Option B and VPN-NH-RR networks

Unicast procedures known to EVPN-MPLS E-Tree are supported in Inter-AS Option B or VPN-NH-RR scenarios; however, the BUM filtering procedures are affected.

As described in EVPN E-Tree, leaf-to-leaf BUM filtering is based on the Leaf Label identification at the egress PE. In a non-Inter-AS or non-VPN-NH-RR scenario, EVPN E-tree AD per-ES (ESI-0) routes carrying the Leaf Label are distinguished by the advertised next hop. In Inter-AS or VPN-NH-RR scenarios, all the AD per-ES routes are received with the ABR or ASBR next hop. Therefore, AD per-ES routes originating from different PEs would all have the same next hop, and the ingress PE would not be able to determine which leaf label to use for a specific EVPN multicast destination.

A simplified EVPN E-Tree solution is supported, where an E-Tree Leaf Label is not installed in the IOM if the PE receives more than one E-Tree AD per-ES route, with different RDs, for the same next hop. In this case, leaf BUM traffic is transmitted without a Leaf Label and the leaf-to-leaf traffic filtering depends on the egress source MAC filtering on the egress PE. See EVPN E-Tree egress filtering based on MAC source address.

PBB-EVPN E-tree services are not affected by Inter-AS or VPN-NH-RR scenarios, as AD per-ES routes are not used.

ECMP for EVPN-MPLS destinations

ECMP is supported for EVPN route next hops that are resolved to EVPN-MPLS destinations as follows:

  • ECMP for Layer 2 unicast traffic on Epipe and VPLS services for EVPN-MPLS destinations

    This is enabled by the configure service epipe bgp-evpn mpls auto-bind-tunnel ecmp number and configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp commands and allows the resolution of an EVPN-MPLS next hop to a group of ECMP tunnels of type RSVP-TE, SR-TE or BGP.

  • ECMP for Layer 3 unicast traffic on R-VPLS services with EVPN-MPLS destinations

    This is enabled by the configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp and configure service vpls allow-ip-int-bind evpn-mpls-ecmp commands.

    The VPRN unicast traffic (IPv4 and IPv6) is sprayed among "m" paths, with "m" being the lowest value of (16,n), where "n" is the number of ECMP paths configured in the configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp command.

    CPM originated traffic is not sprayed and picks up the first tunnel in the set.

    This feature is limited to FP3 and above systems.

  • ECMP for Layer 3 multicast traffic on R-VPLS services with EVPN-MPLS destinations

    This is enabled by the configure service vpls allow-ip-int-bind ip-multicast-ecmp and configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp commands. The VPRN multicast traffic (IPv4 and IPv6) is sprayed among up to "m" paths, with "m" being the lowest value of (16,n), and "n" being the number of ECMP paths configured in the configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp command.

In all of these cases, the configure service epipe bgp-evpn mpls auto-bind-tunnel ecmp number and configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp number commands determine the number of Traffic Engineering (TE) tunnels that an EVPN next hop can be resolved to. TE tunnels refer to RSVP-TE or SR-TE types. For shortest path tunnels, such as ldp, sr-isis, sr-ospf, udp, and so on, the number of tunnels in the ECMP group is determined by the configure router ecmp command.

Weighted ECMP for Layer 2 unicast traffic on Epipe and VPLS services for EVPN-MPLS destinations is supported. Packets are sprayed across the LSPs according to the outcome of the hash algorithm and the configured load balancing weight of each LSP when both:

  • the Epipe or VPLS service directly uses an ECMP set of RSVP or SR-TE LSPs with the configure router mpls lsp load-balancing-weight command configured

  • the configure service epipe bgp-evpn mpls auto-bind-tunnel weighted-ecmp or configure service vpls bgp-evpn mpls auto-bind-tunnel weighted-ecmp commands are configured

If the service uses a BGP tunnel which uses an ECMP set of RSVP or SR-TE LSPs with a load-balancing-weight configured, the router performs weighted ECMP regardless of the setting of weighted-ecmp under the auto-bind-tunnel context.
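The ECMP and weighted ECMP behavior described above could be configured as in the following sketch (the LSP names, weights, and the ECMP width of 4 are hypothetical; prompts are abbreviated):

```
# Allow an EVPN next hop to resolve to up to 4 TE tunnels and spray
# traffic according to the LSP load-balancing weights
*A:PE-1# configure service vpls 600 bgp-evpn mpls auto-bind-tunnel
*A:PE-1>...>auto-bind-tunnel# resolution-filter
*A:PE-1>...>auto-bind-tunnel>res-filter# rsvp
*A:PE-1>...>auto-bind-tunnel>res-filter# sr-te
*A:PE-1>...>auto-bind-tunnel# resolution filter
*A:PE-1>...>auto-bind-tunnel# ecmp 4
*A:PE-1>...>auto-bind-tunnel# weighted-ecmp

# Per-LSP weights used by weighted ECMP
*A:PE-1# configure router mpls lsp "to-PE2-a" load-balancing-weight 200
*A:PE-1# configure router mpls lsp "to-PE2-b" load-balancing-weight 100
```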

IPv6 tunnel resolution for EVPN MPLS services

EVPN MPLS services can be deployed in a pure IPv6 network infrastructure, where IPv6 addresses are used as next-hops of the advertised EVPN routes, and EVPN routes received with IPv6 next-hops are resolved to tunnels in the IPv6 tunnel-table.

By default, the system IPv4 address is advertised as the next-hop for a local EVPN MPLS service. To change this, configure the config>service>vpls>bgp-evpn>mpls>route-next-hop {system-ipv4 | system-ipv6 | ip-address} command or the config>service>epipe>bgp-evpn>mpls>route-next-hop {system-ipv4 | system-ipv6 | ip-address} command.

The configured IP address is used as a next-hop for the MAC/IP, IMET, and AD per-EVI routes advertised for the service. Note that this configured next-hop can be overridden by a policy with the next-hop-self command.
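For example, a service deployed over an IPv6-only core could advertise its routes with an IPv6 next-hop as follows (the service IDs and address are hypothetical):

```
# VPLS: advertise the system IPv6 address as the EVPN next-hop
*A:PE-1# configure service vpls 600 bgp-evpn mpls route-next-hop system-ipv6

# Epipe: advertise a specific non-system address instead
*A:PE-1# configure service epipe 700 bgp-evpn mpls route-next-hop 2001:db8::71
```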

In the case of Inter-AS model B or next-hop-self route-reflector scenarios, at the ASBR/ABR:

  • A route received with an IPv4 next-hop can be re-advertised to a neighbor with an IPv6 next-hop. The neighbor must be configured with the advertise-ipv6-next-hops evpn command.

  • A route received with an IPv6 next-hop can be re-advertised to a neighbor with an IPv4 next-hop. The no advertise-ipv6-next-hops evpn command must be configured on that neighbor.
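At the ASBR/ABR, the next-hop address family used toward each peer is therefore controlled per neighbor; a sketch assuming a hypothetical group and neighbor:

```
# Re-advertise EVPN routes to this IPv6 peer with an IPv6 next-hop;
# "no advertise-ipv6-next-hops evpn" would revert to IPv4 next-hops
*A:ASBR-1# configure router bgp group "dc-peers" neighbor 2001:db8::2
*A:ASBR-1>config>router>bgp>group>neighbor# advertise-ipv6-next-hops evpn
```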

EVPN multihoming support for MPLS tunnels resolved to non-system IPv4/IPv6 addresses

EVPN MPLS multihoming is supported on PEs that use non-system IPv4 or IPv6 addresses for tunnel resolution. Similar to multihoming in EVPN VXLAN networks (see Non-system IPv4 and IPv6 VXLAN termination for EVPN VXLAN multihoming), additional configuration steps are required.

  • The configure service system bgp-evpn eth-seg es-orig-ip ip-address command must be configured with the non-system IPv4 or IPv6 address used for the EVPN-MPLS service. This command modifies the originating IP field in the ES routes advertised for the Ethernet Segment, and makes the system use this IP address when adding the local PE as DF candidate.

  • The configure service system bgp-evpn eth-seg route-next-hop ip-address command must also be configured with the non-system IP address. This command changes the next-hop of the ES and AD per-ES routes to the configured address.

  • All the EVPN MPLS services that make use of the Ethernet Segment must be configured with the configure service vpls|epipe bgp-evpn mpls route-next-hop ip-address command.

When multihoming is used in the service, the same IP address should be configured in all three of the commands detailed above, so the DF Election candidate list is built correctly.
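The three commands above should all reference the same non-system address; a minimal sketch (the ES name, service ID, and address 10.0.0.71 are hypothetical; prompts are abbreviated):

```
# Ethernet Segment: originating IP and next-hop for ES and AD per-ES routes
*A:PE-1# configure service system bgp-evpn eth-seg "ES-1"
*A:PE-1>...>eth-seg# es-orig-ip 10.0.0.71
*A:PE-1>...>eth-seg# route-next-hop 10.0.0.71

# Every EVPN MPLS service using the ES advertises the same next-hop
*A:PE-1# configure service vpls 600 bgp-evpn mpls route-next-hop 10.0.0.71
```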

EVPN for SRv6 tunnels

EVPN-VPWS, EVPN on VPLS services, and EVPN on VPRN services (EVPN-IFL) are supported with SRv6 tunnels. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for more information about EVPN for SRv6 tunnels.

General EVPN topics

This section provides information about general topics related to EVPN.

ARP/ND snooping and proxy support

VPLS services support proxy-ARP (Address Resolution Protocol) and proxy-ND (Neighbor Discovery) functions that can be enabled or disabled independently per service. When enabled (proxy-ARP/proxy-ND no shutdown), the system populates the corresponding proxy-ARP/proxy-ND table with IP-MAC entries learned from the following sources:

  • EVPN-received IP-MAC entries

  • User-configured static IP-MAC entries

  • Snooped dynamic IP-MAC entries (learned from ARP/GARP/NA messages received on local SAPs/SDP bindings)

In addition, any ingress ARP or ND frame on a SAP or SDP binding is intercepted and processed. ARP requests and Neighbor Solicitations are answered by the system if the requested IP address is present in the proxy table.

Proxy-ARP example usage in an EVPN network shows an example of how proxy-ARP is used in an EVPN network. Proxy-ND would work in a similar way. The MAC address notation in the diagram is shortened for readability.

Figure 59. Proxy-ARP example usage in an EVPN network

PE1 is configured as follows:

*A:PE1>config>service>vpls# info 
----------------------------------------------
    vxlan instance 1 vni 600 create
    exit
    bgp
        route-distinguisher 192.0.2.71:600
        route-target export target:64500:600 import target:64500:600
    exit
    bgp-evpn
        vxlan bgp 1 vxlan-instance 1
            no shutdown
        exit
    exit
    proxy-arp
        age-time 600
        send-refresh 200
        dup-detect window 3 num-moves 3 hold-down max anti-spoof-mac 00:ca:ca:ca:ca:ca
        dynamic-arp-populate
        no shutdown
    exit
    sap 1/1/1:600 create
    exit
    no shutdown
----------------------------------------------

Proxy-ARP example usage in an EVPN network shows the following steps, assuming proxy-ARP is no shutdown on PE1 and PE2, and the tables are empty:

  1. ISP-A sends ARP-request for (10.10.)10.3.

  2. PE1 learns the MAC 00:01 in the FDB as usual and advertises it in EVPN without any IP. Optionally, the MAC can be configured as a CStatic mac, in which case it is advertised as protected. If the MAC is learned on a SAP or SDP binding where auto-learn-mac-protect is enabled, the MAC is also advertised as protected.

  3. The ARP-request is sent to the CPM where:

    • An ARP entry (IP 10.1 → MAC 00:01) is populated into the proxy-ARP table.

    • EVPN advertises MAC 00:01 and IP 10.1 in EVPN with the same SEQ number and Protected bit as the previous route-type 2 for MAC 00:01.

    • A GARP is also issued to other SAPs/SDP bindings (assuming they are not in the same split horizon group as the source). If garp-flood-evpn is enabled, the GARP message is also sent to the EVPN network.

    • The original ARP-request can still be flooded to the EVPN or not based on the unknown-arp-request-flood-evpn command.

  4. Assuming PE1 was configured with unknown-arp-request-flood-evpn, the ARP-request is flooded to PE2 and delivered to ISP-B. ISP-B replies with its MAC in the ARP-reply. The ARP-reply is finally delivered to ISP-A.

  5. PE2 learns MAC 00:01 in the FDB and the entry 10.1 → 00:01 in the proxy-ARP table, based on the EVPN advertisements.

  6. When ISP-B replies with its MAC in the ARP-reply:

    • MAC 00:03 is learned in FDB at PE2 and advertised in EVPN.

    • MAC 00:03 and IP 10.3 are learned in the proxy-ARP table and advertised in EVPN with the same SEQ number as the previous MAC route.

    • ARP-reply is unicasted to MAC 00:01.

  7. EVPN advertisements are used to populate PE1's FDB (MAC 00:03) and proxy-ARP (IP 10.3 → MAC 00:03) tables as described in step 5.

From this point onward, the PEs reply to any ARP-request for 00:01 or 00:03, without the need for flooding the message in the EVPN network. By replying to known ARP-requests / Neighbor Solicitations, the PEs help to significantly reduce the flooding in the network.

Use the following commands to customize proxy-ARP/proxy-ND behavior:

  • dynamic-arp-populate and dynamic-nd-populate

    Enables the addition of dynamic entries to the proxy-ARP or proxy-ND table (disabled by default). When executed, the system populates proxy-ARP/proxy-ND entries from snooped GARP/ARP/NA messages on SAPs/SDP bindings in addition to the entries coming from EVPN (if EVPN is enabled). These entries are shown as dynamic.

  • static <IPv4-address> <mac-address> and static <ipv6-address> <mac-address> {host | router}

    Configures static entries to be added to the table.

    Note: A static IP-MAC entry requires the addition of the MAC address to the FDB as either learned or CStatic (conditional static mac) to become active (Status → active).

  • age-time <60 to 86400> (seconds)

    Specifies the aging timer per proxy-ARP/proxy-ND entry. When the aging expires, the entry is flushed. The age is reset when a new ARP/GARP/NA for the same IP-MAC is received.

  • send-refresh <120 to 86400> (seconds)

    If enabled, the system sends ARP-request/Neighbor Solicitation messages at the configured time, so that the owner of the IP can reply and therefore refresh its IP-MAC (proxy-ARP entry) and MAC (FDB entry).

  • table-size [1 to 16384]

    Enables the user to limit the number of entries learned on a specified service. By default, the table-size limit is 250.

  • flooding commands for unknown and unsolicited messages

    The unknown ARP-requests, NS, or the unsolicited GARPs and NA messages can be configured to be flooded or not in an EVPN network with the following commands:

    • proxy-arp [no] unknown-arp-request-flood-evpn

    • proxy-arp [no] garp-flood-evpn

    • proxy-nd [no] unknown-ns-flood-evpn

    • proxy-nd [no] host-unsolicited-na-flood-evpn

    • proxy-nd [no] router-unsolicited-na-flood-evpn

  • dup-detect [anti-spoof-mac <mac-address>] window <minutes> num-moves <count> hold-down <minutes | max>

    Enables a mechanism that detects duplicate IPs and ARP/ND spoofing attacks. The working of the dup-detect command can be summarized as follows:

    • Attempts (relevant to dynamic and EVPN entry types) to add the same IP (different MAC) are monitored for <window> minutes and when <count> is reached within that window, the proxy-ARP/proxy-ND entry for the IP is suspected and marked as duplicate. An alarm is also triggered.

    • The condition is cleared when hold-down time expires (max does not expire) or a clear command is issued.

    • If the anti-spoof-mac is configured, the proxy-ARP/proxy-ND offending entry's MAC is replaced by this <mac-address> and advertised in an unsolicited GARP/NA for local SAP or SDP bindings and in EVPN to remote PEs.

    • This mechanism assumes that the same anti-spoof-mac is configured in all the PEs for the same service and that traffic with destination anti-spoof-mac received on SAPs/SDP bindings is dropped. An ingress MAC filter has to be configured to drop traffic to the anti-spoof-mac.
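The required ingress MAC filter is not created automatically. A hedged classic CLI sketch (the filter ID, entry ID, and SAP are hypothetical; the match syntax should be verified against the filter CLI reference):

```
*A:PE-1# configure filter mac-filter 10 create
*A:PE-1>config>filter>mac-filter# default-action forward
*A:PE-1>config>filter>mac-filter# entry 10 create
*A:PE-1>config>filter>mac-filter>entry# match
*A:PE-1>config>filter>mac-filter>entry>match# dst-mac 00:ca:ca:ca:ca:ca ff:ff:ff:ff:ff:ff
*A:PE-1>config>filter>mac-filter>entry>match# exit
*A:PE-1>config>filter>mac-filter>entry# action drop

# Apply the filter on the service SAPs
*A:PE-1# configure service vpls 600 sap 1/1/1:600 ingress filter mac 10
```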

Proxy-arp entry combinations shows the combinations that produce a Status = Active proxy-arp entry in the table. The system replies to proxy-ARP requests for active entries. Any other combination results in a Status = inActv entry. If the service is not active, the proxy-arp entries are not active either, regardless of the FDB entries.

Note: A static entry is active in the FDB even when the service is down.
Table 5. Proxy-arp entry combinations

  Proxy-arp entry type   FDB entry type (for the same MAC)
  --------------------   ----------------------------------------------
  Dynamic                learned
  Static                 learned
  Dynamic                CStatic/Static
  Static                 CStatic/Static
  EVPN                   EVPN, learned/CStatic/Static with matching ESI
  Duplicate

When proxy-ARP/proxy-ND is enabled on services with all-active multihomed Ethernet Segments, a proxy-arp entry type evpn may be associated with learned/CStatic/Static FDB entries (because, for example, the CE can send traffic for the same MAC to all the multihomed PEs in the ES). If this is the case, the entry is active if the ESI of the EVPN route and the FDB entry match, or inactive otherwise, as per Proxy-arp entry combinations.

Proxy-ARP/ND periodic refresh, unsolicited refresh and confirm-messages

When proxy-ARP/proxy-ND is enabled, the system starts populating the proxy table and responding to ARP-requests/NS messages. To keep the active IP-MAC entries alive and ensure that all the host/routers in the service update their ARP/ND caches, the system may generate the following three types of ARP/ND messages for a specified IP-MAC entry:

  • periodic refresh messages (ARP-requests or NS for a specified IP)

    These messages are activated by the send-refresh command and their objective is to keep the existing FDB and Proxy-ARP/ND entries alive to minimize EVPN withdrawals and re-advertisements.

  • unsolicited refresh messages (unsolicited GARP or NA messages)

    These messages are sent by the system when a new entry is learned or updated. Their objective is to update the attached host/router caches.

  • confirm messages (unicast ARP-requests or unicast NS messages)

    These messages are sent by the system when a new MAC is learned for an existing IP. The objective of the confirm messages is to verify that a specified IP has really moved to a different part of the network and is associated with the new MAC. If the IP has not moved, it forces the owners of the duplicate IP to reply and cause dup-detect to kick in.

Advertisement of Proxy-ARP/ND flags in EVPN

When a dynamic or static Proxy-ARP/ND entry is learned (or configured), the following property flags are created with it:
  • The Router flag (R) is used in IPv6 Neighbor Advertisement messages to indicate if the proxy-ND entry belongs to an IPv6 router or an IPv6 host.
  • The Override flag (O) is used in IPv6 Neighbor Advertisement messages to indicate whether the resolved entry should override a potential ND entry that the solicitor may already have for the same IPv6 address.
  • The Immutable flag (I) indicates that the proxy-ARP or proxy-ND entry cannot change its binding to a different MAC address. This flag is always set for static proxy-ARP/ND entries or configured dynamic IP addresses that are associated with a mac-list.

RFC 9047 describes how to convey the flags (R, O, and I) in the EVPN ARP/ND extended community that is advertised with the EVPN MAC/IP Advertisement routes. This enables the ingress and egress EVPN PEs to install the proxy-ARP/ND entries with the same property flags. The following figure shows the format of the EVPN ARP/ND extended community.

Figure 60. Format of EVPN ARP/ND extended community

By default, the router does not advertise the ARP/ND extended community. Use the following command to configure the router to advertise all the proxy ARP/ND MAC/IP Advertisement routes with the extended community:

configure service vpls bgp-evpn arp-nd-extended-community

Proxy-ARP/ND and flag processing

Proxy-ND and the Router Flag

RFC 4861 describes the use of the (R) or Router flag in NA messages as follows:

  • A node capable of routing IPv6 packets must reply to NS messages with NA messages where the R flag is set (R=1).

  • Hosts must reply with NA messages where R=0.

The R flag in NA messages impacts how the hosts select their default gateways when sending packets off-link. The proxy-ND function on the router does one of the following, depending on whether it can provide the appropriate R flag information:

  • provides the appropriate R flag information in the proxy-ND NA replies, if possible

  • floods the received NA messages, if it cannot provide the appropriate R flag when replying

The use of the R flag (only present in NA messages and not in NS messages) makes the procedure for learning proxy-ND entries and replying to NS messages different from the procedures for proxy-ARP in IPv4. NA message snooping determines the router or host flag to add to each entry, and that determines the flag to use when responding to an NS message.

The procedure to add the R flag to a specified entry is as follows:

  • Dynamic entries are learned based on received NA messages. The R flag is also learned and added to the proxy-ND entry so that the appropriate R flag is used in response to NS requests for a specified IP.

  • Static entries are configured as host or router using the following command.

    • MD-CLI
      configure service vpls proxy-nd static-neighbor ip-address type
    • classic CLI
      configure service vpls proxy-nd static
  • EVPN entries are learned from BGP. The R flag added to them is determined by the following command:
    • MD-CLI
      configure service vpls proxy-nd evpn advertise-neighbor-type
    • classic CLI
      configure service vpls proxy-nd evpn-nd-advertise

    This applies only if the following command is not configured (if it is configured, the signaled flag value determines the flag of the entry):
    • MD-CLI
      configure service vpls bgp-evpn routes mac-ip arp-nd-extended-community
    • classic CLI
      configure service vpls bgp-evpn arp-nd-extended-community-advertisement
  • In addition, the EVPN ND advertisement indicates what static and dynamic IP → MAC entries the system advertises in EVPN.

    • If you specify the router option for EVPN ND advertisement, the system should flood the received unsolicited NA messages for hosts. This is controlled by the following command:
      • MD-CLI
        configure service vpls proxy-nd evpn flood unknown-neighbor-advertise-host
      • classic CLI
        configure service vpls proxy-nd host-unsolicited-na-flood-evpn 
    • The opposite is also true: if you specify the host option for EVPN ND advertisement, the system should flood the received unsolicited NA messages for routers. This is controlled by the following command:
      • MD-CLI
        configure service vpls proxy-nd evpn flood unknown-neighbor-advertise-router 
      • classic CLI
        configure service vpls proxy-nd router-unsolicited-na-flood-evpn
    • The router-host option for EVPN ND advertisement allows the router to advertise both types of entries in EVPN at the same time. That is, static and dynamic entries with the router or host flag are advertised in EVPN with the corresponding flag in the ARP/ND extended community. This option can be enabled only if the ARP/ND extended community is configured.

EVPN proxy-ND MAC/IP Advertisement routes received without the EVPN ARP/ND extended communities create an entry with type Router (which is the default value). Entries created as duplicate are advertised in EVPN with an R flag value that depends on the configuration of the EVPN ND advertisement command. If the host option is configured for the EVPN ND advertisement, the duplicate entry is treated as a host. If the router or router-host option is configured for the EVPN ND advertisement, the duplicate entry behaves as a router.
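Combining the commands named in this section, a PE advertising both router and host entries with explicit flags might be configured as in the following classic CLI sketch (the service context is hypothetical; as noted above, router-host requires the ARP/ND extended community advertisement):

```
*A:PE-1>config>service>vpls# info
----------------------------------------------
    bgp-evpn
        arp-nd-extended-community-advertisement
    exit
    proxy-nd
        evpn-nd-advertise router-host
        no shutdown
    exit
----------------------------------------------
```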

Proxy-ARP/ND and the Immutable Flag

The I bit or Immutable flag in the ARP/ND extended community is advertised and used as follows:

  • Any static proxy-ARP/ND entry is advertised with I=1 if you enable ARP/ND extended community advertisement.
  • Any configured dynamic IP address (associated with a mac-list) proxy-ARP/ND entry is advertised with I=1 if you enable ARP/ND extended community advertisement.
  • Duplicate entries are advertised with I=1 as well (in addition to O=1 and R=0 or 1 based on the configuration).
  • The setting of the I bit is independent of the static bit associated with the FDB entry, and it is only used with proxy-ARP/ND advertisements.

The I bit in the ARP/ND extended community is processed on reception as follows:

  • A PE receiving an EVPN MAC/IP Advertisement route containing an IP-MAC and the I flag set, installs the IP-MAC entry in the ARP/ND or proxy-ARP/ND table as an immutable binding.
  • This immutable binding entry overrides an existing non-immutable binding for the same IP-MAC. In general, the ARP/ND extended community command changes the selection of ARP/ND entries when multiple routes with the same IP address exist. The preferred order of ARP/ND entry selection is as follows:
    1. Local immutable ARP/ND entries (static and dynamic)
    2. EVPN immutable ARP/ND entries
    3. Remaining ARP/ND entries
  • The absence of the EVPN ARP/ND Extended Community in a MAC/IP Advertisement route indicates that the IP→MAC entry is not an immutable binding.
  • Receiving multiple EVPN MAC/IP Advertisement routes with the I flag set to 1 for the same IP but a different MAC address is considered a misconfiguration or a transient error condition. If this happens in the network, a PE receiving multiple routes (with the I flag set to 1 for the same IP and a different MAC address) selects one of them based on the previously described selection rules.

Proxy-ND and the Override Flag

The O bit or Override flag in the ARP/ND extended community is advertised and used as follows:

  • The O flag is learned for dynamic entries (whether 0 or 1) and added to the proxy-ND table. If the ARP/ND extended community is configured, the O flag associated with the entry is advertised along with the EVPN MAC/IP Advertisement route. Static and duplicate entries are always advertised with O=1.
  • Upon receiving an EVPN MAC/IP Advertisement route, the received O flag is stored in the entry created in the proxy-ND table, and used when replying to local NS messages for the IP address.

Proxy-ARP/ND MAC lists for dynamic entries

SR OS supports the association of configured MAC lists with a configured dynamic proxy-ARP or proxy-ND IP address. The actual proxy-ARP or proxy-ND entry is not created until an ARP or Neighbor Advertisement message is received for the IP and one of the MACs in the associated MAC-list. This is in accordance with IETF RFC 9161, which states that a proxy-ARP or proxy-ND IP entry can be associated with one MAC among a list of allowed MACs.

The following example shows the use of MAC lists for dynamic entries.

A:PE-2>config>service#
  proxy-arp-nd
    mac-list ISP-1 create 
      mac 00:de:ad:be:ef:01 
      mac 00:de:ad:be:ef:02 
      mac 00:de:ad:be:ef:03
 
A:PE-2>config>service>vpls>proxy-arp#
  dynamic 1.1.1.1 create
    mac-list ISP-1
    resolve 30
 
A:PE-2>config>service>vpls>proxy-nd#
  dynamic 2001:db8:1000::1 create
    mac-list ISP-1 
    resolve 30

where:

  • A dynamic IP (dynamic ip create) is configured and associated with a MAC list (mac-list name).

  • The MAC list is created in the config>service context and can be reused by multiple configured dynamic IPs as follows:

    • in different services

    • in the same service, for proxy-ARP and proxy-ND entries

  • If the MAC list is empty, the proxy-ARP or proxy-ND entry is not created for the configured IP.

  • The same MAC list can be applied to multiple configured dynamic entries even within the same service.

  • The new proxy-ARP and proxy-ND entries behave as dynamic entries and are displayed as type dyn in the show commands.

    The following output example displays the entry corresponding to the configured dynamic IP.

show service id 1 proxy-arp detail
-------------------------------------------------------------------------------
Proxy Arp
-------------------------------------------------------------------------------
Admin State       : enabled             
Dyn Populate      : enabled             
Age Time          : 900 secs            Send Refresh      : 300 secs
Table Size        : 250                 Total             : 1
Static Count      : 0                   EVPN Count        : 0
Dynamic Count     : 1                   Duplicate Count   : 0
Dup Detect
-------------------------------------------------------------------------------
Detect Window     : 3 mins              Num Moves         : 5
Hold down         : 9 mins              
Anti Spoof MAC    : None
EVPN
-------------------------------------------------------------------------------
Garp Flood        : enabled             Req Flood         : enabled
Static Black Hole : disabled            
-------------------------------------------------------------------------------
===============================================================================
VPLS Proxy Arp Entries
===============================================================================
IP Address          Mac Address       Type Status    Last Update
-------------------------------------------------------------------------------
1.1.1.1            00:de:ad:be:ef:01  dyn active    02/23/2016 09:05:49
-------------------------------------------------------------------------------
Number of entries : 1
===============================================================================
show service proxy-arp-nd mac-list "ISP-1" associations
===============================================================================
MAC List Associations
===============================================================================
Service Id                    IP Addr
-------------------------------------------------------------------------------
1                             1.1.1.1
1                             2001:db8:1000::1
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================

Although no new proxy-ARP or proxy-ND entries are created when a dynamic IP is configured, the router triggers the following resolve procedure:

  1. The router sends a resolve message with a configurable frequency of 1 to 60 minutes; the default value is five minutes.

    Note: The resolve message is an ARP-request or NS message flooded to all the non-EVPN endpoints in the service.
  2. The router sends resolve messages at the configured frequency until a dynamic entry for the IP is created.

    Note: The dynamic entry is created only if an ARP, GARP, or NA message is received for the configured IP, and the associated MAC belongs to the configured MAC list of the IP. If the MAC list is empty, the proxy-ARP or proxy-ND entry is not created for the configured IP.
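The creation check in step 2 can be sketched as follows. This is hypothetical Python, not SR OS code; the function and parameter names are illustrative.

```python
# Hypothetical sketch of the entry-creation check for a configured
# dynamic proxy-ARP/ND IP address; names are illustrative, not SR OS code.

def try_create_entry(configured_ip, mac_list, rcvd_ip, rcvd_mac):
    """Return the new (ip, mac) entry, or None to keep resolving."""
    if not mac_list:              # empty MAC list: entry is never created
        return None
    if rcvd_ip != configured_ip:  # ARP/GARP/NA is for another IP
        return None
    if rcvd_mac not in mac_list:  # MAC is not in the allowed list for this IP
        return None
    return (configured_ip, rcvd_mac)

isp1 = {"00:de:ad:be:ef:01", "00:de:ad:be:ef:02", "00:de:ad:be:ef:03"}
print(try_create_entry("1.1.1.1", isp1, "1.1.1.1", "00:de:ad:be:ef:02"))
print(try_create_entry("1.1.1.1", isp1, "1.1.1.1", "00:aa:aa:aa:aa:aa"))  # None
```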

After a dynamic entry (with a MAC address included in the list) is successfully created, its behavior (for send-refresh, age-time, and other activities) is the same as a configured dynamic entry with the following exceptions.

  • Regular dynamic entries may override configured dynamic entries, but static or EVPN entries cannot.

  • If the corresponding MAC is flushed from the FDB after the entry is successfully created, the entry becomes inactive in the proxy-ARP or proxy-ND table and the resolve process is restarted.

  • If the MAC list is changed, all the IPs that point to the list delete the proxy entries and the resolve process is restarted.

  • If there is an existing configured dynamic entry and the router receives a GARP, ARP, or NA for the IP with a MAC that is not contained in the MAC list, the message is discarded and the proxy-ARP or proxy-ND entry is deleted. The resolve process is restarted.

  • If there is an existing configured dynamic entry and the router receives a GARP, ARP, or NA for the IP with a MAC contained in the MAC list, the existing entry is overridden by the IP and new MAC, assuming the confirm procedure passes.

  • The dup-detect and confirm procedures work for the configured dynamic entries when the MAC changes are between MACs in the MAC list. Changes to an off-list MAC cause the entry to be deleted and the resolve process is restarted.

Configured dynamic entries are advertised as immutable if you enable advertisement of the ARP/ND extended community. The following considerations about IP duplication and immutable configured dynamic entries apply:
  • The CPM drops received dynamic ARP/ND messages without learning them if they match a configured dynamic (immutable) entry.
  • If there is a locally configured dynamic address (whether or not an entry exists for it), a received EVPN immutable entry for the same IP address is not installed. Therefore, the IP duplication mechanisms do not apply to immutable entries.

BGP-EVPN MAC-mobility

EVPN defines a mechanism to allow the smooth mobility of MAC addresses from one NVE to another. The 7750 SR, 7450 ESS, and 7950 XRS support this procedure, as well as the MAC mobility extended community in MAC advertisement routes, as follows:

  • The router honors and generates the SEQ (Sequence) number in the MAC mobility extended community for MAC moves.

  • When an EVPN-learned MAC is subsequently learned locally, a BGP update is sent with the SEQ number set to the previous SEQ + 1 (unless the MAC duplication num-moves value has been reached).

  • A SEQ number of zero, or the absence of the MAC mobility extended community, is interpreted as sequence zero.

  • In case of mobility, the following MAC selection procedure is followed:

    • If a PE has two or more active remote EVPN routes for the same MAC (VNI can be the same or different), the highest SEQ number is selected. The tie-breaker is the lowest IP (BGP NH IP).

    • If a PE has two or more active EVPN routes and it is the originator of one of them, the highest SEQ number is selected. The tie-breaker is the lowest IP (BGP NH IP of the remote route is compared to the local system address).

Note: When EVPN multihoming is used in EVPN-MPLS, the ESI is compared to determine whether a MAC received from two different PEs has to be processed within the context of MAC mobility or multihoming. Two MAC routes that are associated with the same remote or local ESI but different PEs are considered reachable through all those PEs. Mobility procedures are not triggered as long as the MAC route still belongs to the same ESI.
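The selection procedure above can be sketched as follows. This is hypothetical Python, not SR OS code; it assumes each route is summarized by its sequence number and originating IP (the BGP next hop, or the local system address for a locally originated route).

```python
# Hypothetical sketch of the MAC mobility selection rule: the highest
# SEQ number wins, and the lowest originating IP breaks a tie.
import ipaddress

def select_mac_route(routes):
    """routes: list of dicts with 'seq' (int) and 'nh' (IP address string)."""
    return max(
        routes,
        key=lambda r: (r["seq"], -int(ipaddress.ip_address(r["nh"]))),
    )

routes = [
    {"nh": "192.0.2.1", "seq": 3},
    {"nh": "192.0.2.2", "seq": 4},
    {"nh": "192.0.2.9", "seq": 4},
]
print(select_mac_route(routes)["nh"])  # highest SEQ, then lowest IP: 192.0.2.2
```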

BGP-EVPN MAC-duplication

EVPN defines a mechanism to protect the EVPN service from control plane churn as a result of loops or accidental duplicated MAC addresses. The 7750 SR, 7450 ESS, and 7950 XRS support an enhanced version of this procedure as described in this section.

A situation may arise where the same MAC address is learned by different PEs in the same VPLS because two (or more) hosts are misconfigured with the same (duplicate) MAC address. In such a situation, the traffic originating from these hosts triggers continuous MAC moves among the PEs attached to these hosts. It is important to recognize this situation and avoid incrementing the sequence number (in the MAC Mobility attribute) to infinity.

To remedy this situation, a router that detects a MAC mobility event by way of local learning starts a window <in-minutes> timer (default value of window = 3); if it detects num-moves <num> moves before the timer expires (default value of num-moves = 5), it concludes that a duplicate MAC situation has occurred. The window and number of moves are configured with the following commands.
configure service vpls bgp-evpn mac-duplication detect window
configure service vpls bgp-evpn mac-duplication detect num-moves
The router then alerts the user with a trap message when a duplicate MAC situation occurs.
10 2014/01/14 01:00:22.91 UTC MINOR: SVCMGR #2331 Base 
"VPLS Service 1 has MAC(s) detected as duplicates by EVPN mac-
duplication detection." 
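The detection logic (num-moves within the window) can be sketched as follows. This is hypothetical Python, not SR OS code, and it approximates the window as a sliding interval over recorded move timestamps.

```python
# Hypothetical sketch of MAC duplication detection: a MAC is declared
# duplicate when num-moves mobility events fall within the window.

class MacDupDetector:
    def __init__(self, window_minutes=3, num_moves=5):
        self.window = window_minutes * 60.0   # window in seconds
        self.num_moves = num_moves
        self.moves = {}                       # mac -> list of move timestamps

    def record_move(self, mac, now):
        """Record a mobility event; return True if mac is now duplicate."""
        times = [t for t in self.moves.get(mac, []) if now - t < self.window]
        times.append(now)
        self.moves[mac] = times
        return len(times) >= self.num_moves

det = MacDupDetector(window_minutes=3, num_moves=5)
for i in range(5):
    dup = det.record_move("00:de:fe:ca:da:04", i * 10.0)
print(dup)  # five moves within 40 seconds => duplicate detected (True)
```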
Use the following command in the BGP EVPN Table section to display the offending MAC address:
show service id svc-id bgp-evpn
===============================================================================
BGP EVPN Table
===============================================================================
EVI : 1000 
Creation Origin : manual 

Adv L2 Attributes : Disabled 
Ignore Mtu Mismatch: Disabled 

MAC/IP Routes
MAC Advertisement : Enabled                Unknown MAC Route : Disabled
CFM MAC Advertise : Disabled 
ARP/ND Ext Comm Adv: Disabled 

Multicast Routes
Sel Mcast Advert : Disabled 
Ing Rep Inc McastAd: Enabled 

IP Prefix Routes
IP Route Advert : Disabled 

MAC Duplication Detection
Num. Moves : 5                             Window : 3
Retry : 9                                  Number of Dup MACs : 1
Black Hole : Enabled 
Local Learned Trusted MAC
MAC time : 1                               MAC move factor : 3


-------------------------------------------------------------------------------
Detected Duplicate MAC Addresses           Time Detected
-------------------------------------------------------------------------------
00:de:fe:ca:da:04                          05/18/2023 09:55:22
-------------------------------------------------------------------------------
===============================================================================


-------------------------------------------------------------------------------
Local Learned Trusted MAC
-------------------------------------------------------------------------------
MAC Address                                Time Detected
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

After detecting the duplicate, the router stops sending and processing any BGP MAC advertisement routes for that MAC address until one of the following occurs:

  • The MAC is flushed because of a local event (SAP or SDP binding associated with the MAC fails) or the reception of a remote update with better SEQ number (because of a MAC flush at the remote router).

  • The retry <in-minutes> timer expires, which flushes the MAC and restarts the process. The retry timer is configured using the following command.
    configure service vpls bgp-evpn mac-duplication retry
Note: The other routers in the VPLS instance forward the traffic for the duplicate MAC address to the router advertising the best route for the MAC.
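The conditions that release a duplicate MAC from this state can be sketched as follows (hypothetical Python, not SR OS code; the event names are illustrative):

```python
# Hypothetical sketch of the events that end the duplicate-MAC state:
# a local flush, a remote update with a better (higher) SEQ number,
# or expiry of the retry timer.

def exits_duplicate_state(event, held_seq=None, rcvd_seq=None):
    """event: 'local-flush', 'remote-update', or 'retry-expired'."""
    if event == "local-flush":    # SAP/SDP binding for the MAC failed
        return True
    if event == "remote-update":  # better SEQ => MAC was flushed remotely
        return held_seq is not None and rcvd_seq is not None and rcvd_seq > held_seq
    if event == "retry-expired":  # retry timer flushes the MAC
        return True
    return False

print(exits_duplicate_state("remote-update", held_seq=5, rcvd_seq=6))  # True
print(exits_duplicate_state("remote-update", held_seq=5, rcvd_seq=5))  # False
```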

The values of num-moves and window are configurable to allow for the required flexibility in different environments. In scenarios where the configure router bgp rapid-update evpn command is configured, the user may want to configure a shorter window timer than in scenarios where BGP updates are sent at every (default) min-route-advertisement interval.

MAC duplication is always enabled in EVPN VPLS services. The preceding output shows the BGP EVPN MAC duplication detection settings, which are configured per VPLS service under the following context.
configure service vpls bgp-evpn mac-duplication

The following example shows a MAC duplication detection configuration.

MD-CLI

[ex:/configure service vpls "bd-1000-mac-dup-mpls" bgp-evpn mac-duplication]
A:admin@node-2# info detail
retry 9
detect {
    num-moves 5
    window 3
}

classic CLI

A:node-2>config>service>vpls>bgp-evpn>mac-duplication# info detail 
----------------------------------------------
detect num-moves 5 window 3 trusted-mac-move-factor 3
retry 9

Conditional static MAC and protection

RFC 7432 defines the use of the sticky bit in the MAC mobility extended community to signal static MAC addresses. These addresses must be protected in case there is an attempt to dynamically learn them in a different place in the EVPN-VXLAN VPLS service.

In the 7750 SR, 7450 ESS, and 7950 XRS, any conditional static MAC defined in an EVPN-VXLAN VPLS service is advertised by BGP-EVPN as a static address, that is, with the sticky bit set. The following example shows the configuration of a conditional static MAC.

A:node-2>config>service>vpls# info 
----------------------------------------------
            description "vxlan-service"
...     
            sap 1/1/1:1000 create
            exit
            static-mac                
                mac 00:ca:ca:ca:ca:00 create sap 1/1/1:1000 monitor fwd-status
            exit
            no shutdown

A:node-2# show router bgp routes evpn mac hunt mac-address 00:ca:ca:ca:ca:00 
...
===============================================================================
BGP EVPN Mac Routes
===============================================================================
Network        : 0.0.0.0/0
Nexthop        : 192.0.2.63
From           : 192.0.2.63
Res. Nexthop   : 192.168.19.1
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : 0
AIGP Metric    : None                   
Connector      : None
Community      : target:65000:1000     mac-mobility:Seq: 0/Static
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.63
Flags          : Used  Valid  Best  IGP  
Route Source   : Internal               
AS-Path        : No As-Path
EVPN type      : MAC                    
ESI            : 0:0:0:0:0:0:0:0:0:0    Tag            : 1063
IP Address     : ::                     RD             : 65063:1000
Mac Address    : 00:ca:ca:ca:ca:00      Mac Mobility   : Seq:0
Neighbor-AS    : N/A
Source Class   : 0                      Dest Class     : 0
-------------------------------------------------------------------------------
Routes : 1                            
===============================================================================

Local static MACs, and remote MACs received with the sticky bit set, are considered protected. A packet entering a SAP or SDP binding is discarded if its source MAC address matches one of these protected MACs.

Auto-learn MAC protect and restricting protected source MACs

Auto-learn MAC protect, together with the ability to restrict where the protected source MACs are allowed to enter the service, can be enabled in EVPN-MPLS and EVPN-VXLAN VPLS and routed VPLS services, but not in PBB-EVPN services. The protection, using the auto-learn-mac-protect command (described in Auto-learn MAC protect), and the restrictions, using the restrict-protected-src [discard-frame] command, operate in the same way as in a non-EVPN VPLS service.

  • When auto-learn-mac-protect is enabled on an object, source MAC addresses learned on that object are marked as protected within the FDB.

  • When restrict-protected-src is enabled on an object and a protected source MAC is received on that object, the object is automatically shutdown (requiring the user to shutdown then no shutdown the object to make it operational again).

  • When restrict-protected-src discard-frame is enabled on an object and a frame with a protected source MAC is received on that object, that frame is discarded.

In addition, the following behavioral differences are specific to EVPN services:

  • An implicit restrict-protected-src discard-frame command is enabled by default on SAPs, mesh-SDPs and spoke SDPs. As this is the default, it is not possible to configure this command in an EVPN service. This default state can be seen in the show output for these objects, for example on a SAP:

    *A:PE# show service id 1 sap 1/1/9:1 detail
    ===============================================================================
    Service Access Points(SAP)
    ===============================================================================
    Service Id         : 1
    SAP                : 1/1/9:1                  Encap             : q-tag
    ...
    RestMacProtSrc Act : none (oper: Discard-frame)
    
  • A restrict-protected-src discard-frame can be optionally enabled on EVPN-MPLS/VXLAN destinations within EVPN services. When enabled, frames that have a protected source MAC address are discarded if received on any EVPN-MPLS/VXLAN destination in this service, unless the MAC address is learned and protected on an EVPN-MPLS/VXLAN destination in this service. This is enabled as follows:

            
            configure
               service
                  vpls <service id>
                       bgp-evpn
                           mpls bgp <instance>
                               [no] restrict-protected-src discard-frame
                       vxlan instance <instance> vni <vni-id>
                           [no] restrict-protected-src discard-frame
    
  • Auto-learned protected MACs are advertised to remote PEs in an EVPN MAC/IP advertisement route with the sticky bit set.

  • The source MAC protection action relating to the restrict-protected-src [discard-frame] commands also applies to MAC addresses learned by receiving an EVPN MAC/IP advertisement route with the sticky bit set from remote PEs. This causes remotely configured conditional static MACs and auto-learned protected MACs to be protected locally.

  • In all-active multihoming scenarios, if auto-learn-mac-protect is configured on all-active SAPs and restrict-protected-src discard-frame is enabled on EVPN-MPLS/VXLAN destinations, traffic from the CE that enters one multihoming PE and needs to be switched through the other multihoming PE is discarded on the second multihoming PE. Each multihoming PE protects the CE's MAC on its local all-active SAP, which results in any frames with the CE's MAC address as the source MAC being discarded as they are received on the EVPN-MPLS/VXLAN destination from the other multihoming PE.

Conditional static MACs, EVPN static MACs and locally protected MACs are marked as protected within the FDB, as shown in the example output.

*A:PE# show service fdb-mac
===============================================================================
Service Forwarding Database
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age
-------------------------------------------------------------------------------
1         00:00:00:00:00:01 sap:1/1/9:1              LP/30    01/05/16 11:58:22
1         00:00:00:00:00:02 vxlan-1:                 EvpnS:P  01/05/16 11:58:23
                            10.1.1.2:1
1         00:00:00:00:01:01 sap:1/1/9:1              CStatic: 01/04/16 20:05:02
                                                     P
1         00:00:00:00:01:02 vxlan-1:                 EvpnS:P  01/04/16 20:18:02
                            10.1.1.2:1
-------------------------------------------------------------------------------
No. of Entries: 4
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================

In this output:

  • the first MAC is locally protected using the auto-learn-mac-protect command

  • the second MAC has been protected using the auto-learn-mac-protect command on a remote PE

  • the third MAC is a locally configured conditional static MAC

  • the fourth MAC is a remotely configured conditional static MAC

The command auto-learn-mac-protect can be optionally extended with an exclude-list by using the following command:

auto-learn-mac-protect [exclude-list name]

This list refers to a mac-list <name> created under the config>service context, containing a list of MACs and associated masks.

When auto-learn-mac-protect [exclude-list name] is configured on a service object, dynamically learned MACs are excluded from being learned as protected if they match a MAC entry in the MAC list. Dynamically learned MAC SAs are protected only if they are learned on an object with ALMP configured and one of the following conditions is true:

  • there is no exclude list associated with the same object

  • there is an exclude-list but the MAC does not match any entry

The MAC lists can be used in multiple objects of the same or different service. When empty, ALMP does not exclude any learned MAC from protection on the object. This extension allows the mobility of specific MACs in objects where MACs are learned as protected.
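The exclude-list check can be sketched as follows. This is hypothetical Python, not SR OS code; the (MAC, mask) matching mirrors the list entries of MACs and associated masks.

```python
# Hypothetical sketch of the auto-learn-mac-protect exclude-list check:
# a dynamically learned MAC is protected only if ALMP is enabled on the
# object and the MAC matches no (mac, mask) entry in the exclude list.

def mac_to_int(mac):
    return int(mac.replace(":", ""), 16)

def matches(mac, entry_mac, mask):
    m = mac_to_int(mask)
    return mac_to_int(mac) & m == mac_to_int(entry_mac) & m

def is_protected(learned_mac, almp_enabled, exclude_list):
    """exclude_list: list of (mac, mask) tuples; may be empty."""
    if not almp_enabled:
        return False
    return not any(matches(learned_mac, m, k) for m, k in exclude_list)

excl = [("00:de:ad:00:00:00", "ff:ff:ff:00:00:00")]
print(is_protected("00:de:ad:be:ef:01", True, excl))  # in excluded range: False
print(is_protected("00:ca:fe:00:00:01", True, excl))  # off-list: True
```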

Blackhole MAC and its application to proxy-ARP/proxy-ND duplicate detection

A blackhole MAC is a local FDB record. It is similar to a conditional static MAC, but it is associated with a black-hole (similar to a blackhole static-route in VPRNs) instead of a SAP or SDP binding. A blackhole MAC can be added by using the following command:

config>service>vpls>static-mac#
mac <ieee-address> [create] black-hole

The static blackhole MAC can have security applications (for example, replacement of MAC filters) for specific MACs. When used in combination with restrict-protected-src, the static blackhole MAC provides a simple and scalable way to filter MAC DA or SA in the data plane, regardless of how the frame arrived at the system (using SAP or SDP bindings or EVPN endpoints).

For example, when static-mac mac 00:ca:fe:ca:fe:00 create black-hole is added to a service, the following behavior occurs:

  1. The configured MAC is created as a static MAC with a black-hole source identifier.

    *A:PE1# show service id 1 fdb detail                  
    ===============================================================================
    Forwarding Database, Service 1
    ===============================================================================
    ServId    MAC               Source-Identifier        Type     Last Change
                                                         Age      
    -------------------------------------------------------------------------------
    1         00:ca:ca:ba:ca:01 eES:                     Evpn     06/29/15 23:21:34
                                01:00:00:00:00:71:00:00:00:01
    1         00:ca:ca:ba:ca:06 eES:                     Evpn     06/29/15 23:21:34
                                01:74:13:00:74:13:00:00:74:13
    1         00:ca:00:00:00:00 sap:1/1/1:2              CStatic:P 06/29/15 23:20:58
    1         00:ca:fe:ca:fe:00 black-hole               CStatic:P  06/29/15 23:20:00
    1         00:ca:fe:ca:fe:69 eMpls:                   EvpnS:P    06/29/15 20:40:13
                                192.0.2.69:262133
    -------------------------------------------------------------------------------
    No. of MAC Entries: 5
    -------------------------------------------------------------------------------
    Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
    ===============================================================================
    
  2. After it has been successfully added to the FDB, the blackhole MAC is treated like any other protected MAC, as follows:

    • The blackhole MAC is added as protected (CStatic:P) and advertised in EVPN as static.

    • SAP or SDP bindings or EVPN endpoints where restrict-protected-src discard-frame is enabled discard frames whose MAC SA is equal to the blackhole MAC.

    • SAP or SDP bindings where restrict-protected-src (no discard-frame) is enabled go operationally down if a frame with MAC SA equal to the blackhole MAC is received.

  3. After the blackhole MAC has been successfully added to the FDB, any frame arriving at any SAP or SDP binding or EVPN endpoint with MAC DA equal to blackhole MAC is discarded.
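The frame handling described in the steps above can be sketched as follows (hypothetical Python, not SR OS code; the FDB record and action names are illustrative):

```python
# Hypothetical sketch of frame handling once a blackhole MAC is in the
# FDB, combined with the restrict-protected-src mode of the ingress object.

FDB = {"00:ca:fe:ca:fe:00": {"dest": "black-hole", "protected": True}}

def process_frame(src_mac, dst_mac, restrict_protected_src=None):
    """restrict_protected_src: None, 'discard-frame', or 'shutdown'."""
    dst = FDB.get(dst_mac)
    if dst and dst["dest"] == "black-hole":
        return "discard"              # MAC DA lookup hits the blackhole
    src = FDB.get(src_mac)
    if src and src["protected"]:
        if restrict_protected_src == "discard-frame":
            return "discard"          # frame dropped at ingress
        if restrict_protected_src == "shutdown":
            return "object-down"      # SAP/SDP goes operationally down
    return "forward"

print(process_frame("00:00:00:00:00:01", "00:ca:fe:ca:fe:00"))  # discard
print(process_frame("00:ca:fe:ca:fe:00", "00:00:00:00:00:02",
                    restrict_protected_src="discard-frame"))    # discard
```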

Blackhole MACs can also be used in services with proxy-ARP/proxy-ND enabled to filter traffic destined to anti-spoof-macs. The anti-spoof-mac provides a way to attract traffic to a specified IP address when a duplicate condition is detected for that IP address (see ARP/ND snooping and proxy support for more information); however, the system still needs to drop the traffic addressed to the anti-spoof-mac by using either a MAC filter or a blackhole MAC.

The user does not need to configure MAC filters when configuring a static-black-hole MAC address for the anti-spoof-mac function. To use a blackhole MAC entry for the anti-spoof-mac function in a proxy-ARP/proxy-ND service, the user needs to configure:

  • the static-black-hole option for the anti-spoof-mac

    *A:PE1# config>service>vpls>proxy-arp# 
    dup-detect window 3 num-moves 5 hold-down max anti-spoof-mac 00:66:66:66:66:00 static-black-hole
    
  • a static blackhole MAC using the same MAC address used for the anti-spoof-mac

    *A:PE1# config>service>vpls# 
    static-mac mac 00:66:66:66:66:00 create black-hole
    

When this configuration is complete, the behavior of the anti-spoof-mac function changes as follows:

  • In the EVPN, the MAC is advertised as static. Locally, the MAC is shown in the FDB as ‟CStatic” and associated with a black-hole.

  • The combination of the anti-spoof-mac and the static-black-hole ensures that any frame that arrives at the system with MAC DA = anti-spoof-mac is discarded, regardless of the ingress endpoint type (SAP or SDP binding or EVPN) and without the need for a filter.

  • If, instead of discarding traffic, the user wants to redirect it using MAC DA as the anti-spoof-mac, then redirect filters should be configured on SAPs or SDP bindings (instead of the static-black-hole option).

When the static-black-hole option is not configured with the anti-spoof-mac, the behavior of the anti-spoof-mac function, as described in ARP/ND snooping and proxy support, remains unchanged. In particular:

  • the anti-spoof-mac is not programmed in the FDB

  • any attempt to add a static MAC (or any other MAC) with the anti-spoof-mac value is rejected by the system

  • a MAC filter is needed to discard traffic with MAC DA = anti-spoof-mac.

Blackhole MAC for EVPN loop detection

SR OS can combine the blackhole MAC concept with the EVPN MAC duplication procedures to provide loop protection in EVPN networks. The feature is compliant with the MAC mobility and multihoming functionality in RFC 7432, and with the Loop Protection section in draft-ietf-bess-rfc7432bis. Use the following command to enable the feature:

  • MD-CLI
    configure service vpls bgp-evpn mac-duplication blackhole enable
  • classic CLI
    configure service vpls bgp-evpn mac-duplication black-hole-dup-mac

If enabled, there are no apparent changes in the MAC duplication procedures; however, when a duplicated MAC is detected (for example, M1), the router performs the following:

  1. adds M1 to the duplicate MAC list

  2. programs M1 in the FDB as a Protected MAC associated with a blackhole endpoint (where type is set to EvpnD:P and Source-Identifier is black-hole)

While the MAC type value remains EvpnD:P, the following additional operational details apply.

  • Incoming frames with MAC DA = M1 are discarded by the ingress IOM, regardless of the ingress endpoint type (SAP, SDP, or EVPN), based on an FDB MAC lookup.

  • Incoming frames with MAC SA = M1 are discarded by the ingress IOM or cause the router to bring down the SAP or SDP binding, depending on the restrict-protected-src setting on the SAP, SDP, or EVPN endpoint.

The following example shows an EVPN-MPLS service where blackhole is enabled and MAC duplication programs the duplicate MAC as a blackhole.

19 2016/12/20 19:45:59.69 UTC MINOR: SVCMGR #2331 Base 
"VPLS Service 1000 has MAC(s) detected as duplicates by EVPN mac-duplication 
detection."

MD-CLI

[ex:/configure service vpls "bd-1000"]
A:admin@node-2# info
  admin-state enable
  service-id 1000
  customer "1"
  bgp 1 {
  }
  bgp-evpn {
    evi 1000
    mac-duplication {
      blackhole true
      detect {
        num-moves 5
        window 3
      }
    }
    mpls 1 {
      admin-state enable
      ingress-replication-bum-label true
      auto-bind-tunnel {
        resolution any
      }
    }
  }
  sap 1/1/1:1000 {
  }
  spoke-sdp 56:1000 {
  }

classic CLI

A:node-2# configure service vpls 1000 
A:node-2>config>service>vpls# info 
----------------------------------------------
            bgp
            exit
            bgp-evpn
                evi 1000
                mac-duplication
                    detect num-moves 5 window 3
                    retry 6
                    black-hole-dup-mac
                exit
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap 1/1/1:1000 create
                no shutdown
            exit
            spoke-sdp 56:1000 create
                no shutdown
            exit
            no shutdown
----------------------------------------------

The following command displays BGP EVPN table values.

show service id 1000 bgp-evpn 
===============================================================================
BGP EVPN Table
===============================================================================
EVI : 1000 
Creation Origin : manual 

Adv L2 Attributes : Disabled 
Ignore Mtu Mismatch: Disabled 

MAC/IP Routes
MAC Advertisement : Enabled                Unknown MAC Route : Disabled
CFM MAC Advertise : Disabled 
ARP/ND Ext Comm Adv: Disabled 

Multicast Routes
Sel Mcast Advert : Disabled 
Ing Rep Inc McastAd: Enabled 

IP Prefix Routes
IP Route Advert : Disabled 

MAC Duplication Detection
Num. Moves : 5                             Window : 3
Retry : 9                                  Number of Dup MACs : 1
Black Hole : Enabled 
Local Learned Trusted MAC
MAC time : 1                               MAC move factor : 3


-------------------------------------------------------------------------------
Detected Duplicate MAC Addresses           Time Detected
-------------------------------------------------------------------------------
00:de:fe:ca:da:04                          05/18/2023 09:55:22
-------------------------------------------------------------------------------
===============================================================================


-------------------------------------------------------------------------------
Local Learned Trusted MAC
-------------------------------------------------------------------------------
MAC Address Time Detected
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------


===============================================================================
BGP EVPN MPLS Information
===============================================================================
Admin Status : Enabled                   Bgp Instance : 1
Force Vlan Fwding : Disabled 
Force Qinq Fwding : none 
Route NextHop Type : system-ipv4 
Control Word : Disabled 
Max Ecmp Routes : 1 
Entropy Label : Disabled 
Default Route Tag : none 
Split Horizon Group: (Not Specified)
Ingress Rep BUM Lbl: Enabled 
Ingress Ucast Lbl : 524262               Ingress Mcast Lbl : 524261
RestProtSrcMacAct : none 
Evpn Mpls Encap : Enabled                Evpn MplsoUdp : Disabled
Oper Group : 
MH Mode : network 
Evi 3-byte Auto-RT : Disabled 
Dyn Egr Lbl Limit : Disabled 
Hash Label : Disabled 
-------------------------------------------------------------------------------
===============================================================================


===============================================================================
BGP EVPN MPLS Auto Bind Tunnel Information
===============================================================================
Allow-Flex-Algo-Fallback : false 
Resolution : any                         Strict Tnl Tag : false
Max Ecmp Routes : 1 
Bgp Instance : 1 
Filter Tunnel Types : (Not Specified)
Weighted Ecmp : false 
-------------------------------------------------------------------------------
===============================================================================  

The following command displays Forwarding Database details.

show service id 1000 fdb detail
===============================================================================
Forwarding Database, Service 1000
===============================================================================
ServId MAC              Source-Identifier Type     Last Change
        Transport:Tnl-Id                  Age 
-------------------------------------------------------------------------------
1000  00:de:fe:da:da:04 black-hole        EvpnD:P  05/18/23 10:04:49
-------------------------------------------------------------------------------
No. of MAC Entries: 1
-------------------------------------------------------------------------------
Legend:L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf T=Trusted
===============================================================================

If the retry time expires, the MAC is flushed from the FDB and the process starts again. The following command clears the duplicate blackhole MAC address.

clear service id 1000 evpn mac-dup-detect
Note: The clear service id 1000 fdb command clears learned MAC addresses; blackhole MAC addresses are not cleared.
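The detect, blackhole, and retry behavior described above can be modeled abstractly. The following is a minimal Python sketch (a hypothetical model, not SR OS code; class and method names are invented for illustration): a MAC that moves num-moves times within the sliding window is declared duplicate and, with blackhole enabled, installed as a blackhole FDB entry until the retry timer flushes it.

```python
class MacDupDetector:
    """Hypothetical model of EVPN mac-duplication (not SR OS code)."""

    def __init__(self, num_moves=5, window=3, blackhole=True):
        self.num_moves = num_moves
        self.window = window            # sliding detection window
        self.blackhole = blackhole
        self.moves = {}                 # mac -> timestamps of recent moves
        self.duplicates = set()         # MACs currently held as duplicate

    def on_move(self, mac, now):
        """Record a MAC move; return True if the MAC is (or becomes) duplicate."""
        if mac in self.duplicates:
            return True                 # already a blackhole/duplicate entry
        times = [t for t in self.moves.get(mac, []) if now - t < self.window]
        times.append(now)
        self.moves[mac] = times
        if len(times) >= self.num_moves:
            self.duplicates.add(mac)    # installed as blackhole if enabled
            return True
        return False

    def on_retry_expiry(self, mac):
        """Retry timer expired: flush the MAC and restart detection."""
        self.duplicates.discard(mac)
        self.moves.pop(mac, None)
```

With num-moves 5 and window 3, the fifth move of the same MAC inside the window marks it duplicate; after on_retry_expiry() the detection cycle starts again, matching the flush-and-restart behavior described above.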

Support for the blackhole (MD-CLI) and black-hole-dup-mac (classic CLI) commands and the associated loop detection procedures described above is as follows:

  • not supported on B-VPLS, I-VPLS, or M-VPLS services

  • fully supported on EVPN-VXLAN VPLS/R-VPLS services, EVPN-MPLS VPLS/R-VPLS services (including EVPN E-Tree) and EVPN-SRv6 VPLS services

  • fully supported with EVPN MAC mobility and EVPN multihoming

Deterministic EVPN loop detection with trusted MACs

The EVPN loop detection procedure described in the preceding section is compliant with draft-ietf-bess-rfc7432bis and is an efficient way of detecting and blocking loops in EVPN networks. Contrary to intrusive methods that inject Ethernet beacons into the customer network and detect loops depending on whether the beacon messages return to the PEs, the EVPN loop detection mechanism is non-intrusive because it relies entirely on the same MAC being learned on different nodes. However, the mechanism lacks determinism, as shown in EVPN non-intrusive loop detection mechanism.

Figure 61. EVPN non-intrusive loop detection mechanism

Suppose PE1, PE2, and PE3 are attached to the same EVPN VPLS service, and there is an accidental backdoor link between the Base Stations connected to PE2 and PE3. When the Controller with MAC M1 issues a broadcast frame, PE1 forwards it to PE2 and PE3, and the frame is looped back through the backdoor link. The mac-duplication procedure detects M1 as duplicate and turns it into a blackhole MAC in the FDB, effectively stopping the loop. However, the user does not know beforehand whether M1 is blackholed in PE1, PE2, PE3, or several of them at the same time. If M1 is blackholed in PE1, this is an issue for the hosts connected to other PEs (not shown) attached to the same service. Therefore, in the example, the mac-duplication procedure should be influenced so that M1 is blackholed in PE2, PE3, or both, but not in PE1. To make the procedure more deterministic, the trusted MAC concept is used.

A trusted MAC, on a specific PE and VPLS service, is a MAC that is dynamically learned and remains in the FDB as type learned, without being flushed or changing its type, for a configurable number of minutes specified by the following command.
configure service vpls bgp-evpn mac-duplication trusted-mac-time
If the MAC moves from one SAP to another SAP in the same service and PE, the trusted MAC timer is not reset.
Trusted MACs are affected by the mac-duplication procedures differently from non-trusted MACs. Trusted MACs require a higher number of moves (during the mac-duplication window) to be declared duplicate, specified by num-moves <number> * trusted-mac-move-factor <number> in the following context.
configure service vpls bgp-evpn mac-duplication detect
While non-trusted MACs are detected as duplicate after num-moves moves, trusted MACs need num-moves * trusted-mac-move-factor moves to be declared duplicate.

The following example shows the configuration of three PEs as shown in EVPN non-intrusive loop detection mechanism.

MD-CLI
// Applicable to PE1, PE2 and PE3

[ex:/configure service vpls "bd-1000" bgp-evpn mac-duplication]
A:admin@node-2# info
  blackhole true
  trusted-mac-time 5 // value 1..15, default: 5
  detect {
    num-moves 5
    window 3
    trusted-mac-move-factor 3 // value 1..10, default: 1
  }
classic CLI
// Applicable to PE1, PE2 and PE3

A:node-2>config>service>vpls>bgp-evpn>mac-duplication# info 
----------------------------------------------
  detect num-moves 5 window 3 trusted-mac-move-factor 3 // value 1..10, default: 1
  black-hole-dup-mac
  trusted-mac-time 5 // value 1..15, default: 5

Based on the preceding configuration, recall the example described at the beginning of this section and assume that M1 is a trusted MAC in PE1 (it has been dynamically learned for 5 minutes). M1 then requires 15 moves to be declared duplicate (and therefore a blackhole MAC) in PE1, whereas M1 only needs 5 moves to be declared duplicate in PE2 and PE3. This procedure guarantees that M1 does not get blackholed in the PE where it is located (PE1).
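The arithmetic of the trusted-MAC rule can be expressed as a one-line function (an illustrative sketch; the function name is invented):

```python
def duplicate_threshold(num_moves, trusted_mac_move_factor, trusted):
    """Moves within the detection window needed to declare a MAC duplicate,
    per the rule described above (sketch, not SR OS code)."""
    return num_moves * trusted_mac_move_factor if trusted else num_moves

# With detect num-moves 5 and trusted-mac-move-factor 3:
#   PE1 (M1 trusted)        -> 5 * 3 = 15 moves to declare duplicate
#   PE2/PE3 (M1 not trusted)-> 5 moves
```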

The trusted MACs are displayed in the following FDB show command with a "T" in the Type field:

show service id 1000 fdb detail
===============================================================================
Forwarding Database, Service 1000
===============================================================================
ServId   MAC               Source-Identifier   Type     Last Change
          Transport:Tnl-Id                     Age 
-------------------------------------------------------------------------------
1000     00:de:fe:da:da:04 sap:1/1/1:1000      LT/0     05/18/23 10:54:54
-------------------------------------------------------------------------------
No. of MAC Entries: 1
-------------------------------------------------------------------------------
Legend:L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf T=Trusted
===============================================================================

CFM interaction with EVPN services

Ethernet Connectivity and Fault Management (ETH-CFM) allows the user to validate and measure Ethernet Layer 2 services using standard IEEE 802.1ag and ITU-T Y.1731 protocols. Each tool performs a unique function and adheres to that tool's specific PDU and frame format and the associated rules governing the transmission, interception, and processing of the PDU. Detailed information describing the ETH-CFM architecture, the tools, and various functions is located in the various OAM and Diagnostics guides and is not repeated here.

EVPN provides powerful solution architectures. ETH-CFM is supported in the various Layer 2 EVPN architectures. Because the destination Layer 2 MAC address, unicast or multicast, is ETH-CFM tool dependent (for example, ETH-CC is sent as an L2 multicast and ETH-DM is sent as an L2 unicast), the ETH-CFM function is allowed to multicast and broadcast to the virtual EVPN connections. The Maintenance Endpoint (MEP) and Maintenance Intermediate Point (MIP) do not populate the local Layer 2 MAC address forwarding database (FDB) with the MAC related to the MEP and MIP. This means that the 48-bit IEEE MAC address is not exchanged with peers and all ETH-CFM frames are broadcast across all virtual connections. To prevent the flooding of unicast packets and allow the remote forwarding databases to learn the remote MEP and MIP Layer 2 MAC addresses, the command cfm-mac-advertisement must be configured under the config>service>vpls>bgp-evpn context. This allows the MEP and MIP Layer 2 IEEE MAC addresses to be exchanged with peers. This command tracks configuration changes and sends the required updates through the EVPN notification process when a change occurs.
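The effect of cfm-mac-advertisement on unicast CFM delivery can be illustrated with a minimal FDB lookup sketch (a hypothetical model; the function and data shapes are invented for illustration):

```python
def forward(fdb, dest_mac, destinations):
    """Return the EVPN destinations a frame to dest_mac is sent to.

    Sketch only: without cfm-mac-advertisement, the MEP/MIP MAC is absent
    from the remote FDBs, so unicast CFM frames are treated as unknown
    unicast and flooded to every virtual connection; once the MAC is
    advertised, the frame is sent to a single destination.
    """
    if dest_mac in fdb:
        return [fdb[dest_mac]]      # known unicast: single destination
    return list(destinations)       # unknown unicast: flood everywhere

peers = ["vxlan:192.0.2.2", "mpls:192.0.2.6"]
# MEP MAC not advertised -> flooded to all peers.
# After cfm-mac-advertisement exchanges the MEP MAC -> known unicast.
```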

Up MEP, Down MEP, and MIP creation is supported on the SAP, spoke, and mesh connections within the EVPN service. There is no support for the creation of ETH-CFM Management Points (MPs) on the virtual connection. VirtualMEP (vMEP) is supported with a VPLS context and the applicable EVPN Layer 2 VPLS solution architectures. The vMEP follows the same rules as the general MPs. When a vMEP is configured within the supported EVPN service, the ETH-CFM extraction routines are installed on the SAP, Binding, and EVPN connections within an EVPN VPLS Service. The vMEP extraction within the EVPN-PBB context requires the vmep-extensions parameter to install the extraction on the EVPN connections.

When MPs are used in combination with EVPN multihoming, the following must be considered:

  • Behavior of operationally down MEPs on SAPs/SDP bindings with EVPN multihoming:

    • all-active multihoming

      No ETH-CFM is expected to be used in this case, because the two (or more) SAPs/SDP bindings on the PEs are oper-up and active; however, the CE has a single LAG and responds as though it is connected to a single system. In addition to that, cfm-mac-advertisement can lead to traffic loops in all-active multihoming.

    • single-active multihoming

      Operationally down MEPs defined on single-active Ethernet-Segment SAPs/SDP bindings do not send any CCMs when the PE is non-DF for the ES and fault-propagation is configured. For single-active multihoming, the behavior is equivalent to MEPs defined on BGP-MH SAPs/binds.

  • Behavior for operationally up MEPs on ES SAPs/SDP bindings with EVPN multihoming:

    • all-active multihoming

      Operationally up MEPs defined on non-DF ES SAPs can send CFM packets. However, they cannot receive CCMs (the SAP is removed from the default multicast list) or unicast CFM packets (because the MEP MAC is not installed locally in the FDB; unicast CFM packets are treated as unknown, and not sent to the non-DF SAP MEP).

    • single-active multihoming

      Operationally up MEPs should be able to send or receive CFM packets normally.

    • operationally up MEPs defined on LAG SAPs

      Operationally up MEPs defined on LAG SAPs require the command process_cpm_traffic_on_sap_down so that they can process CFM when the LAG is down and act as regular Ethernet ports.

Because of the above considerations, the use of ETH-CFM in EVPN multihomed SAPs/SDP bindings is only recommended on operationally down MEPs and single-active multihoming. ETH-CFM is used in this case to notify the CE of the DF or non-DF status.

Multi-instance EVPN: Two instances of different encapsulation in the same VPLS/R-VPLS/Epipe service

SR OS supports a maximum of two BGP instances in the same VPLS or R-VPLS, where the two instances can be:

  • one EVPN-VXLAN instance and one EVPN-MPLS instance in the same VPLS or R-VPLS service
  • two EVPN-VXLAN instances in the same VPLS or R-VPLS service
  • two EVPN-MPLS instances in the same VPLS or R-VPLS service
  • one EVPN-MPLS instance and one EVPN-SRv6 instance in the same VPLS service
  • one EVPN-VXLAN instance and one EVPN-SRv6 instance in the same VPLS service

In all the preceding cases, the procedures are compliant with RFC 9014.

SR OS also supports up to two BGP instances in the same Epipe. These two instances can be an EVPN-MPLS instance and an EVPN-SRv6 instance in the same Epipe service.

The procedures to support two BGP instances in the same Epipe adhere to draft-sr-bess-evpn-vpws-gateway.

EVPN-VXLAN to EVPN-MPLS interworking

This section describes the configuration aspects of a VPLS/R-VPLS with EVPN-VXLAN and EVPN-MPLS.

In a service where EVPN-VXLAN and EVPN-MPLS are configured together, the configure service vpls bgp-evpn vxlan bgp 1 and configure service vpls bgp-evpn mpls bgp 2 commands allow the user to associate EVPN-MPLS to a different instance from that associated with EVPN-VXLAN, and have both encapsulations simultaneously enabled in the same service. At the control plane level, EVPN MAC/IP advertisement routes received in one instance are consumed and readvertised in the other instance as long as the route is the best route for a specific MAC. Inclusive multicast routes are independently generated for each BGP instance. In the data plane, the EVPN-MPLS and EVPN-VXLAN destinations are instantiated in different implicit Split Horizon Groups (SHGs) so that traffic can be forwarded between them.
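The control-plane redistribution described above can be sketched as follows (a hypothetical model of the RFC 9014 gateway behavior, not SR OS code; the function name and route representation are invented): the best MAC/IP route received in one BGP instance is readvertised into the other instance.

```python
def gateway_readvertisements(routes):
    """For each MAC, select the best received route and return the BGP
    instance into which it is readvertised (sketch, not SR OS code).

    routes: list of dicts {"mac": str, "instance": 1 or 2, "pref": int},
    where a higher "pref" stands in for BGP best-path selection.
    """
    best = {}
    for r in routes:
        cur = best.get(r["mac"])
        if cur is None or r["pref"] > cur["pref"]:
            best[r["mac"]] = r
    other = {1: 2, 2: 1}
    # Readvertise each best route into the opposite instance.
    return {r["mac"]: other[r["instance"]] for r in best.values()}
```

For example, a MAC whose best route arrives in the EVPN-VXLAN instance (1) is readvertised in the EVPN-MPLS instance (2), and vice versa; Inclusive Multicast routes are not redistributed this way because each instance generates its own.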

The following example shows a VPLS service with two BGP instances and both VXLAN and MPLS encapsulations configured for the same BGP-EVPN service.

*A:PE-1>config>service>vpls# info 
----------------------------------------------
  description "evpn-mpls and evpn-vxlan in the same service"
  vxlan instance 1 vni 7000 create
  exit
  bgp 
    route-distinguisher 10:2
    route-target target:64500:1
  exit
  bgp 2 
    route-distinguisher 10:1
    route-target target:64500:1
  exit
  bgp-evpn
    evi 7000
    incl-mcast-orig-ip 10.12.12.12
    vxlan bgp 1 vxlan-instance 1
      no shutdown 
    mpls bgp 2
      control-word
      auto-bind-tunnel 
        resolution any
      exit
      force-vlan-vc-forwarding
      no shutdown
    exit
  exit
  no shutdown

The following list describes the preceding example:

  • bgp 1 or bgp is the default BGP instance

  • bgp 2 is the additional instance required when both bgp-evpn vxlan and bgp-evpn mpls are enabled in the service

  • The commands supported in instance 1 are also available in instance 2 with the following considerations:

    • pw-template-binding

      The pw-template-binding can only exist in instance 1; it is not supported in instance 2.

    • route-distinguisher

      The operating route-distinguisher in both BGP instances must be different.

    • route-target

      The route target in both instances can be the same or different.

    • vsi-import and vsi-export

      Import and export policies can also be defined for either BGP instance.

  • MPLS and VXLAN can use either BGP instance, and the instance is associated when bgp-evpn mpls or bgp-evpn vxlan is created. The bgp-evpn vxlan command must include not only the association to a BGP instance, but also to a vxlan-instance (because the VPLS services support two VXLAN instances).

    Note: The bgp-evpn vxlan no shutdown command is only allowed if bgp-evpn mpls shutdown is configured, or if the BGP instance associated with the MPLS has a different route distinguisher than the VXLAN instance.
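The instance rules above (the operating route distinguishers must differ, while route targets may match) can be captured in a small validation sketch (an invented helper, not an SR OS check):

```python
def validate_two_instance_service(inst1, inst2):
    """Sketch of the multi-instance rules described above: the operating
    route-distinguisher of the two BGP instances must differ; the
    route-target may be the same or different (not checked)."""
    if inst1["rd"] == inst2["rd"]:
        raise ValueError("both BGP instances use the same route-distinguisher")
    return True

# Matches the earlier example: bgp 1 uses RD 10:2, bgp 2 uses RD 10:1,
# and both share route-target target:64500:1.
```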

The following features are not supported when two BGP instances are enabled on the same VPLS/R-VPLS service:

  • SDP bindings

  • M-VPLS, I-VPLS, B-VPLS, or E-Tree VPLS

  • Proxy-ARP and proxy-ND

  • BGP Multihoming

  • IGMP, MLD, and PIM snooping

  • BGP-VPLS or BGP-AD (SDP bindings are not created)

The service>vpls>bgp-evpn>ip-route-advertisement command is not supported on R-VPLS services with two BGP instances.

EVPN-SRv6 to EVPN-MPLS or EVPN-VXLAN interworking

EVPN-SRv6 and EVPN-MPLS or EVPN-VXLAN can be simultaneously configured in the same VPLS service (but not R-VPLS), in different instances. In addition, EVPN-SRv6 and EVPN-MPLS can be simultaneously configured in the same Epipe service, so that border routers can stitch SRv6 and MPLS domains for point-to-point services.

VPLS services

EVPN-SRv6 and EVPN-VXLAN instances in the same VPLS service follow the same configuration rules as described in EVPN-VXLAN to EVPN-MPLS interworking, and the same processing of MAC/IP Advertisement routes and Inclusive Multicast Ethernet Tag routes is applied.

The following example shows a VPLS service with two BGP instances, with both VXLAN and SRv6 encapsulations configured under BGP-EVPN.

MD-CLI
[ex:/configure service vpls "evpn-srv6-vxlan-1"]
A:admin@node-2# info
    admin-state enable
    description "evpn-srv6 and evpn-vxlan in the same service"
    vxlan {
        instance 1 {
            vni 12340
        }
    }
    segment-routing-v6 1 {
        locator "loc-1" {
            function {
                end-dt2u {
                }
                end-dt2m {
                }
            }
        }
    }
    bgp 1 {
        route-distinguisher "12340:1"
        route-target {
            export "target:64500:12340"
            import "target:64500:12340"
        }
    }
    bgp 2 {
        route-distinguisher "12340:2"
        route-target {
            export "target:64500:12341"
            import "target:64500:12341"
        }
    }
    bgp-evpn {
        evi 12340
        incl-mcast-orig-ip 10.12.12.12
        segment-routing-v6 2 {
            admin-state enable
            ecmp 4
            force-vc-forwarding vlan
            srv6 {
                default-locator "loc-1"
            }
        }
        vxlan 1 {
            admin-state enable
            vxlan-instance 1
        }
    }
classic CLI
A:node-2>config>service>vpls# info
----------------------------------------------
            description "evpn-srv6 and evpn-vxlan in the same service"
            vxlan instance 1 vni 12340 create
            exit
            segment-routing-v6 1 create
                locator "loc-1"
                    function
                        end-dt2u
                        end-dt2m
                    exit
                exit
            exit
            bgp
                route-distinguisher 12340:1
                route-target export target:64500:12340 import target:64500:12340
            exit
            bgp 2
                route-distinguisher 12340:2
                route-target export target:64500:12341 import target:64500:12341
            exit
            bgp-evpn
                incl-mcast-orig-ip 10.12.12.12
                evi 12340
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
                segment-routing-v6 bgp 2 srv6-instance 1 default-locator "loc-1" create
                    ecmp 4
                    force-vlan-vc-forwarding
                    route-next-hop 2001:db8::76
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------

When an EVPN-SRv6 instance and an EVPN-MPLS instance are both configured in the same VPLS service, each instance can be configured in a different or the same split horizon group. The former option allows the interconnection of domains of different encapsulation, and the rules of configuration and route processing described in EVPN-VXLAN to EVPN-MPLS interworking apply. The latter option is used for domains where MPLS and SRv6 PEs are attached to the same service, typically for migration purposes.

When the EVPN-SRv6 and the EVPN-MPLS instances are configured in the same split horizon group:

  • MAC/IP Advertisement routes are not redistributed between the two instances
  • Two BUM EVPN destinations to the same far-end PE (identified by the originator IP of the Inclusive Multicast Ethernet Tag routes) cannot be created. An EVPN-MPLS BUM destination is removed if there is another BUM destination to the same far end with an SRv6 encapsulation. This is to prevent BUM traffic duplication between multi-instance nodes
  • SAPs are supported, but SDP binds are not supported
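The BUM destination selection in the same-split-horizon-group case can be sketched as follows (a hypothetical model, not SR OS code; the function name and route representation are invented): at most one BUM destination per originator IP is kept, and an MPLS destination is removed when an SRv6 destination to the same far end exists.

```python
def active_bum_destinations(imet_routes):
    """Select one BUM destination per far-end originator IP (sketch).

    imet_routes: list of dicts {"orig_ip": str, "encap": "mpls"|"srv6"}
    derived from received Inclusive Multicast Ethernet Tag routes.
    Prefer SRv6 over MPLS to the same far end, preventing BUM traffic
    duplication between multi-instance nodes.
    """
    chosen = {}
    for r in imet_routes:
        cur = chosen.get(r["orig_ip"])
        if cur is None or (cur == "mpls" and r["encap"] == "srv6"):
            chosen[r["orig_ip"]] = r["encap"]
    return chosen
```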

The following example shows a VPLS service with two BGP instances, with both MPLS and SRv6 encapsulations configured under BGP-EVPN, with the same split horizon group.

MD-CLI
[ex:/configure service vpls "evpn-srv6-mpls-1"]
A:admin@node-2# info
    admin-state enable
    description "evpn-srv6 and evpn-mpls in the same service"
    segment-routing-v6 1 {
        locator "loc-1" {
            function {
                end-dt2u {
                }
                end-dt2m {
                }
            }
        }
    }
    bgp 1 {
        route-distinguisher "12341:1"
        route-target {
            export "target:64500:12342"
            import "target:64500:12342"
        }
    }
    bgp 2 {
        route-distinguisher "12341:2"
        route-target {
            export "target:64500:12343"
            import "target:64500:12343"
        }
    }
    bgp-evpn {
        evi 12340
        incl-mcast-orig-ip 10.12.12.12
        segment-routing-v6 2 {
            admin-state enable
            ecmp 4
            force-vc-forwarding vlan
            srv6 {
                default-locator "loc-1"
            }
            route-next-hop {
                ip-address 2001:db8::76
            }
        }
        mpls 1 {
            admin-state enable
            force-vc-forwarding vlan
            split-horizon-group "SHG-1"
            ingress-replication-bum-label true
            ecmp 4
            mh-mode access
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    split-horizon-group "SHG-1" {
    }
classic CLI
A:node-2>config>service>vpls# info
----------------------------------------------
            description "evpn-srv6 and evpn-mpls in the same service"
            split-horizon-group "SHG-1" create
            exit
            segment-routing-v6 1 create
                locator "loc-1"
                    function
                        end-dt2u
                        end-dt2m
                    exit
                exit
            exit
            bgp
                route-distinguisher 12341:1
                route-target export target:64500:12342 import target:64500:12342
            exit
            bgp 2
                route-distinguisher 12341:2
                route-target export target:64500:12343 import target:64500:12343
            exit
            bgp-evpn
                incl-mcast-orig-ip 10.12.12.12
                evi 12341
                mpls bgp 1
                    mh-mode access
                    force-vlan-vc-forwarding
                    split-horizon-group "SHG-1"
                    ingress-replication-bum-label
                    ecmp 4
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
                segment-routing-v6 bgp 2 srv6-instance 1 default-locator "loc-1" create
                    ecmp 4
                    force-vlan-vc-forwarding
                    route-next-hop 2001:db8::76
                    split-horizon-group "SHG-1"
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
Epipe services

The following example shows an Epipe service with two BGP instances, with both MPLS and SRv6 encapsulations configured under BGP-EVPN. This is the gateway configuration when it is stitching MPLS and SRv6 domains for E-Line services:

MD-CLI
[ex:/configure service epipe "multi-inst-evpn-vpws-100"]
A:admin@node-2# info
    admin-state enable
    service-id 100
    customer "1"
    segment-routing-v6 1 {
        locator "LOC-2-16bits" {
            function {
                end-dx2 {
                }
            }
        }
    }
    bgp 1 {
        route-distinguisher "23.23.23.1:100"
    }
    bgp 2 {
        route-distinguisher "23.23.23.2:100"
    }
    endpoint "mpls" {
    }
    endpoint "srv6" {
    }
    bgp-evpn {
        evi 100
        local-attachment-circuit "mpls" {
            endpoint "mpls"
            eth-tag 1
        }
        local-attachment-circuit "srv6" {
            endpoint "srv6"
            eth-tag 1
            bgp 2
        }
        remote-attachment-circuit "mpls" {
            endpoint "mpls"
            eth-tag 1
        }
        remote-attachment-circuit "srv6" {
            endpoint "srv6"
            eth-tag 1
            bgp 2
        }
        mpls 1 {
            admin-state enable
            ecmp 2
            domain-id "64500:1"
            auto-bind-tunnel {
                resolution any
            }
        }
        segment-routing-v6 2 {
            admin-state enable
            source-address 2001:db8::2
            mh-mode access
            domain-id "64500:2"
            srv6 {
                instance 1
                default-locator "LOC-2-16bits"
            }
        }
    }
classic CLI
A:node-2# configure service epipe 100 
A:node-2>config>service>epipe# info 
----------------------------------------------
            endpoint "mpls" create
            exit
            endpoint "srv6" create
            exit
            segment-routing-v6 1 create
                locator "LOC-2-16bits"
                    function
                        end-dx2
                    exit
                exit
            exit
            bgp 1
                route-distinguisher 23.23.23.1:100
            exit
            bgp 2
                route-distinguisher 23.23.23.2:100
            exit
            bgp-evpn
                local-attachment-circuit mpls bgp 1 endpoint mpls create
                    eth-tag 1
                exit
                local-attachment-circuit srv6 bgp 2 endpoint srv6 create
                    eth-tag 1
                exit
                remote-attachment-circuit mpls bgp 1 endpoint mpls create
                    eth-tag 1
                exit
                remote-attachment-circuit srv6 bgp 2 endpoint srv6 create
                    eth-tag 1
                exit
                evi 100
                mpls bgp 1
                    domain-id 64500:1
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
                segment-routing-v6 bgp 2 srv6-instance 1 default-locator "LOC-2-16bits" create
                    domain-id 64500:2
                    mh-mode access
                    source-address 2001:db8::2
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
Where:
  • The epipe bgp command supports up to two instances, where the default value is 1, and the accepted values are now in the range 1..2.
  • The bgp-instance of 1 or 2 can be matched under the following contexts:
    configure service epipe bgp-evpn mpls
    configure service epipe bgp-evpn segment-routing-v6
    MPLS and SRv6 can be configured in Epipes with one or two instances, and either can use instance “1” or “2”. The preceding example shows an Epipe service with MPLS configured in bgp-instance 1 and segment-routing-v6 configured in bgp-instance 2.
  • The bgp-instance 2 requires an explicit route distinguisher (RD) configuration because the EVI-based autoderivation of the RD only applies to bgp-instance 1. The route-target EVI-based autoderivation applies to both instances.
  • The following command can also be associated with a BGP instance.
    configure service epipe bgp-evpn local-attachment-circuit  
    By default, all local attachment circuits in the service are associated with bgp-instance 1. When the local attachment circuits are associated with different BGP instances, no local SAPs or spoke-SDPs are supported in the service (this is blocked by the CLI).
  • The BGP instances are configured with a D-PATH domain-id. The D-PATH attribute is described in section BGP D-PATH attribute for Layer 3 loop protection and can also be used in multi-instance Epipe services. D-PATH is used in the EVPN-VPWS AD per-EVI routes for best-path selection and loop avoidance in case of redundant gateways. In the preceding example, configuring segment-routing-v6 bgp 2 domain-id 64500:2 means that the received EVPN AD per-EVI route in the SRv6 instance is redistributed to the MPLS instance with a D-PATH attribute where domain 64500:2 is prepended.
  • When configuring an SRv6 and an MPLS instance in an Epipe service, one of the two instances must be configured as mh-mode access, with the other one configured as mh-mode network (default value for SRv6 and MPLS instances). The command is added under the MPLS and SRv6 instances (not in VXLAN instances).
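The D-PATH loop-avoidance behavior described in the bullets above can be sketched as follows (a hypothetical model, not SR OS code; the function name is invented): a route whose D-PATH already contains one of the gateway's own domains is looped and must not be imported; otherwise the receiving instance's domain-id is prepended when the route is redistributed into the other instance.

```python
def redistribute_dpath(dpath, receiving_domain, local_domains):
    """Sketch of D-PATH handling on a multi-instance Epipe gateway.

    dpath: list of domain-ids carried by the received EVPN AD per-EVI
    route (most recent first). Returns the D-PATH to advertise in the
    other instance, or None if the route is looped.
    """
    if any(d in dpath for d in local_domains):
        return None                       # own domain present: loop, reject
    return [receiving_domain] + dpath     # prepend on redistribution
```

For example, with segment-routing-v6 bgp 2 domain-id 64500:2, a route received in the SRv6 instance is redistributed into the MPLS instance with 64500:2 prepended to its D-PATH.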

As in the case of any Epipe service, two explicit or implicit endpoints exist, and traffic always flows from one endpoint to the other. The preceding example uses two explicit endpoints; however, one implicit endpoint plus one explicit endpoint can also be configured, and the behavior is identical. In other words, the preceding configuration is also valid if the endpoint "mpls" is not configured; in that case, the local-attachment-circuit and remote-attachment-circuit associated with bgp 1 are part of an implicit endpoint.

BGP-EVPN routes in services configured with two BGP instances

The following sections describe BGP-EVPN routes in EVPN and VPLS services configured with two BGP instances.

VPLS services

From a BGP perspective, the two BGP instances configured in the service are independent of each other. The redistribution of routes between the BGP instances is resolved at the EVPN application layer.

By default, if EVPN-VXLAN and EVPN-MPLS are both enabled in the same service, BGP sends the generated EVPN routes twice: with the RFC 9012 BGP encapsulation extended community set to VXLAN and a second time with the encapsulation type set to MPLS.

Usually, a DCGW peers with a pair of Route Reflectors (RRs) in the DC and a pair of RRs in the WAN. For this reason, the user needs to add router policies so that EVPN-MPLS routes are only sent to the WAN RRs and EVPN-VXLAN routes are only sent to the DC RRs. The following examples show how to configure these policies.

MD-CLI
[ex:/configure router "Base" bgp]
A:admin@node-2# info
    vpn-apply-export true
    vpn-apply-import true
    group "WAN" {
        type internal
        family {
            evpn true
        }
        export {
            policy ["allow only mpls"]
        }
    }
    group "DC" {
        type internal
        family {
            evpn true
        }
        export {
            policy ["allow only vxlan"]
        }
    }
    neighbor "192.0.2.2" {
        group "WAN"
    }
    neighbor "192.0.2.75" {
        group "DC"
    }

*[ex:/configure policy-options]
A:admin@PE-76# info
    community "mpls" {
        member "bgp-tunnel-encap:MPLS" { }
    }
    community "vxlan" {
        member "bgp-tunnel-encap:VXLAN" { }
    }
    policy-statement "allow only mpls" {
        entry 10 {
            from {
                family [evpn]
                community {
                    name "vxlan"
                }
            }
            action {
                action-type reject
            }
        }
    }
    policy-statement "allow only vxlan" {
        entry 10 {
            from {
                family [evpn]
                community {
                    name "mpls"
                }
            }
            action {
                action-type reject
            }
        }
    }
classic CLI
config>router>bgp# 
 vpn-apply-import
 vpn-apply-export
 group "WAN"                
  family evpn                
  type internal                
  export "allow only mpls"                 
  neighbor 192.0.2.6           
 group "DC"               
  family evpn                
  type internal                
  export "allow only vxlan"                 
  neighbor 192.0.2.2
A:node-2>config>router>policy-options# info 
----------------------------------------------
            community "vxlan" members "bgp-tunnel-encap:VXLAN"
            community "mpls" members "bgp-tunnel-encap:MPLS"
            policy-statement "allow only mpls"
                entry 10
                    from
                        family evpn
                        community vxlan
                    exit
                    action drop
                    exit
                exit
            exit
            policy-statement "allow only vxlan"
                entry 10
                    from
                        family evpn
                        community mpls
                    exit
                    action drop
                    exit
                exit
            exit

In a BGP instance, the EVPN routes are imported based on the route-targets and regular BGP selection procedures, regardless of their encapsulation.

The BGP-EVPN routes are generated and redistributed between BGP instances based on the following rules:

  • Auto-discovery (AD) routes (type 1) are not generated by services with two BGP EVPN instances, unless a local Ethernet segment is present on the service. However, AD routes received from the EVPN-MPLS peers are processed for aliasing and backup functions as usual.

  • MAC/IP routes (type 2) received in one of the two BGP instances are imported and the MACs added to the FDB according to the existing selection rules. If the MAC is installed in the FDB, it is readvertised in the other BGP instance with the new BGP attributes corresponding to the BGP instance (route target, route distinguisher, and so on). The following considerations apply to these routes:

    • The mac-advertisement command governs the advertisement of any MACs (even those learned from BGP).

    • A MAC route is redistributed only if it is the best route based on the EVPN selection rules.

    • If a MAC route is the best route and has to be redistributed, the MAC/IP information, along with the MAC mobility extended community, is propagated in the redistribution.

    • The router redistributes any MAC route update for which any attribute has changed. For example, a change in the SEQ or sticky bit in one instance is updated in the other instance for a route that is selected as the best MAC route.

  • EVPN inclusive multicast routes are generated independently for each BGP instance with the corresponding BGP encapsulation extended community (VXLAN or MPLS). Also, the following considerations apply to these routes:

    • Ingress Replication (IR) and Assisted Replication (AR) routes are supported in the EVPN-VXLAN instance. If AR is configured, the AR IP address must be a loopback address different from the system-ip and the configured originating-ip address.

    • IR, P2MP mLDP, and composite inclusive multicast routes are supported in the EVPN-MPLS instance.

    • The modification of the incl-mcast-orig-ip command is supported, subject to the following considerations:

      • The configured IP in the incl-mcast-orig-ip command is encoded in the originating-ip field of the inclusive multicast routes for IR, P2MP, and composite tunnels.

      • The originating-ip field of the AR routes is still derived from the service>system>vxlan>assisted-replication-ip configured value.

    • EVPN handles the inclusive multicast routes in a service based on the following rules:

      • For IR routes, the EVPN destination is set up based on the NLRI next hop.

      • For P2MP mLDP routes, the PMSI Tunnel Attribute tunnel-id is used to join the mLDP tree.

      • For composite P2MP-IR routes, the PMSI Tunnel Attribute tunnel-id is used to join the tree and create the P2MP bind. The NLRI next-hop is used to build the IR destination.

      • For AR routes, the NLRI next-hop is used to build the destination.

      • The following applies if a router receives two inclusive multicast routes in the same instance:
        • If the routes have the same originating-ip but different route distinguishers and next-hops, the router processes both routes. In the case of IR routes, it sets up two destinations: one to each next-hop.

        • If the routes have the same originating-ip, different route distinguishers, but same next hops, the router sets up only one binding for IR routes.

        • The router ignores inclusive multicast routes received with its own originating-ip, regardless of the route distinguisher.

  • IP-Prefix routes (type 5) are not generated or imported by a service with two BGP instances.
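As a hedged illustration, the duplicate-route handling rules above can be sketched in Python (the route model and field names are illustrative assumptions, not SR OS internals):

```python
# Hypothetical sketch of the inclusive multicast route rules above.
# The InclMcastRoute model and field names are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class InclMcastRoute:
    rd: str        # route distinguisher
    orig_ip: str   # originating-ip field of the NLRI
    next_hop: str  # BGP next hop

def ir_destinations(own_orig_ip, routes):
    """Return the set of IR next hops a router binds to."""
    dests = set()
    for route in routes:
        # A route carrying the router's own originating-ip is ignored,
        # regardless of its route distinguisher.
        if route.orig_ip == own_orig_ip:
            continue
        # Same originating-ip but different RDs and next hops => two
        # destinations; same next hop => a single binding (the set
        # de-duplicates it).
        dests.add(route.next_hop)
    return dests
```

For example, two routes with the same originating-ip but different next hops yield two IR destinations, while two routes that share a next hop yield a single binding.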

The rules in this section can be extrapolated to VPLS services where SRv6 and MPLS or VXLAN are configured in different instances of the same VPLS with different split horizon groups.

Epipe services
Single-instance EVPN-VPWS services do not generate AD per-EVI routes unless a local SAP or spoke SDP is configured and in the oper-up state. Multi-instance EVPN-VPWS services do not allow local SAPs or spoke SDPs; therefore, they do not generate AD per-EVI routes for the configured local-attachment-circuit eth-tag. Instead, these services redistribute AD per-EVI routes received in one instance into the other. The redistribution rules, which follow draft-sr-bess-evpn-vpws-gateway, are as follows:
  • Upon receiving an AD per-EVI route in bgp-instance 1, if the route is selected to be installed and the route does not contain a local domain-id in its D-PATH attribute (local means the domain-id is configured in the Epipe), an AD per-EVI route is triggered in bgp-instance 2, using the eth-tag, RD, RT and properties of bgp-instance 2.
  • The EVPN Layer 2 attributes extended community is regenerated for the redistributed route. The value of the P and B flags is set to 0 when redistributing routes.
  • The encapsulation-specific attributes of the redistributed route are regenerated based on the encapsulation of the BGP instance in which the route is advertised.
  • The redistributed route carries the Communities, Extended Communities, and Large Communities of the source route when the following command is configured:
MD-CLI
configure service system bgp evpn ad-per-evi-routes attribute-propagation true
Classic-CLI
configure service system bgp-evpn ad-per-evi-routes attribute-propagation

The exceptions are RTs (which are re-originated), EVPN extended communities, and BGP encapsulation extended communities (RFC 9012). EVPN extended communities and BGP encapsulation extended communities are not propagated across domains.

  • The redistributed AD per-EVI route updates the D-PATH attribute of the received route or adds the D-PATH attribute if the received route did not contain a D-PATH.
  • The ESI of the redistributed AD per-EVI route is always zero.
  • AD per-ES and ES routes are never redistributed.
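The redistribution and loop-avoidance logic above can be sketched as follows (a minimal Python illustration with hypothetical names; this is not SR OS code):

```python
# Hedged sketch of the AD per-EVI redistribution rules above.
# Domain IDs are modeled as plain strings such as "64500:2".

def redistribute_ad_per_evi(route_d_path, src_domain_id, local_domain_ids):
    """Return the D-PATH of the redistributed route, or None if the
    route must not be redistributed (a local domain-id was found in
    the received D-PATH, indicating a potential loop)."""
    # A route that already carries one of the Epipe's own domain-ids
    # in its D-PATH is not redistributed into the other instance.
    if any(d in local_domain_ids for d in route_d_path):
        return None
    # Otherwise, the source instance's domain-id is prepended and the
    # route is re-advertised with the other instance's RD/RT.
    return [src_domain_id] + list(route_d_path)
```

A route received without a D-PATH is redistributed with a one-element D-PATH containing the source instance's domain-id.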
Route selection of AD per-EVI routes

The redistribution of attributes, as well as the BGP best-path selection for AD per-EVI routes is controlled by the following commands:

MD-CLI
[ex:/configure service system bgp evpn ad-per-evi-routes]
A:admin@PE-2# tree detail 
+-- attribute-propagation <boolean>
+-- bgp-path-selection <boolean>
+-- d-path-ignore <boolean>
Classic-CLI

*A:PE-2>config>service>system>bgp-evpn>ad-per-evi-routes# tree detail 
ad-per-evi-routes
|
+---attribute-propagation
| no attribute-propagation
|
+---bgp-path-selection
| no bgp-path-selection
|
+---d-path-ignore
| no d-path-ignore

Both bgp-path-selection and attribute-propagation are disabled by default, and the router enforces that bgp-path-selection can only be enabled if attribute-propagation is already enabled.

If bgp-path-selection false (the default) is configured and multiple AD per-EVI routes for the same Ethernet tag are received in the same Epipe BGP instance, the route with the lowest IP address is selected. Those routes may have a zero ESI or different non-zero ESIs.

When multiple non-zero ESI AD per-EVI routes are received and the ESI matches on all of them, the bgp-path-selection command impacts the following procedures:

  • The command influences the selection of AD per-EVI routes to create the ES destination. If disabled, the lowest IP address routes are selected, up to the number of configured ECMP paths. If enabled, the routes are selected based on BGP best-path selection.
  • The command influences the selection of the best AD per-EVI route of the ES for the purpose of attribute propagation. If enabled, the attributes of the best-path route are propagated.
The best-path selection tie-breaking rules are included below for reference:
  1. High Local Pref wins
  2. Shortest D-PATH wins (if d-path-ignore false)
  3. Lowest left-most D-PATH domain-id wins (if d-path-ignore false)
  4. Shortest AS_PATH wins
  5. Lowest Origin wins
  6. Lowest MED wins
  7. EBGP wins
  8. Lowest tunnel-table cost to the next-hop
  9. Lowest next-hop type wins (resolution in TTM wins vs RTM)
  10. Lowest next-hop type wins
  11. Lowest router ID wins (applicable to ibgp peers only)
  12. Shortest cluster_list length wins (applicable to ibgp peers only)
  13. Lowest IP address
  14. Next-hop check (IPv4 NH wins, then lowest NH wins)
  15. RD check (lowest RD wins)
  16. Path-Id (add path)
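As a rough illustration, the first four tie-breakers can be modeled as a sort key (a simplified sketch with illustrative field names; it does not implement the full 16-step algorithm):

```python
# Simplified sketch of tie-breaking steps 1-4 above (Local Preference,
# D-PATH length, left-most domain-id, AS_PATH length). Field names are
# illustrative, not SR OS internals.

def best_ad_per_evi(routes, d_path_ignore=False):
    """Pick the best route from a list of dicts with keys
    'local_pref', 'd_path' (list of domain-id strings), 'as_path'."""
    def key(r):
        d_len = 0 if d_path_ignore else len(r["d_path"])
        left_most = "" if d_path_ignore or not r["d_path"] else r["d_path"][0]
        return (
            -r["local_pref"],   # 1. highest Local Preference wins
            d_len,              # 2. shortest D-PATH wins
            left_most,          # 3. lowest left-most domain-id wins
            len(r["as_path"]),  # 4. shortest AS_PATH wins
        )
    return min(routes, key=key)
```

With d-path-ignore enabled, steps 2 and 3 are skipped and selection falls through to the AS_PATH comparison and later tie-breakers.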

Anycast redundant solution for dual BGP-instance services

The following sections describe the anycast redundant solution for dual BGP instances in VPLS and Epipe services.

VPLS services

The following figure shows the anycast mechanism used to support gateway redundancy for dual BGP-instance services. The example shows two redundant DC gateways (DCGWs) where the VPLS services contain two BGP instances: one each for EVPN-VXLAN and EVPN-MPLS.

Figure 62. Multihomed anycast solution

The example shown in Multihomed anycast solution depends on the ability of the two DCGWs to send the same inclusive multicast route to the remote PE or NVEs, such that:

  • The remote PE or NVEs create a single BUM destination to one of the DCGWs (because the BGP selects only the best route to the DCGWs).

  • The DCGWs do not create a destination between each other.

This solution avoids loops for BUM traffic, and known unicast traffic can use either DCGW router, depending on the BGP selection. The following CLI example output shows the configuration of each DCGW.

MD-CLI
/* bgp configuration on DCGW1 and DCGW2 */

[ex:/configure router "Base" bgp]
A:admin@DCGW# info
    vpn-apply-export true
    vpn-apply-import true
    group "DC" {
        type internal
        family {
            evpn true
        }
    }
    group "WAN" {
        type internal
        family {
            evpn true
        }
    }
    neighbor "192.0.2.2" {
        group "DC"
    }
    neighbor "192.0.2.6" {
        group "WAN"
    }

/* vpls service configuration in DCGW1 */

[ex:/configure service vpls "1"]
A:admin@DCGW1# info
    admin-state enable
    customer "1"
    vxlan {
        instance 1 {
            vni 1
        }
    }
    bgp 1 {
        route-distinguisher "64501:12"
        route-target {
            export "target:64500:1"
            import "target:64500:1"
        }
    }
    bgp 2 {
        route-distinguisher "64502:12"
        route-target {
            export "target:64500:1"
            import "target:64500:1"
        }
    }
    bgp-evpn {
        evi 1
        incl-mcast-orig-ip 10.12.12.12
        vxlan 1 {
            admin-state enable
            vxlan-instance 1
        }
        mpls 2 {
            admin-state enable
            auto-bind-tunnel {
                resolution any
            }
        }
    }

/* vpls service configuration in DCGW2 */

[ex:/configure service vpls "1"]
A:admin@DCGW2# info
    admin-state enable
    customer "1"
    vxlan {
        instance 1 {
            vni 1
        }
    }
    bgp 1 {
        route-distinguisher "64501:12"
        route-target {
            export "target:64500:1"
            import "target:64500:1"
        }
    }
    bgp 2 {
        route-distinguisher "64502:12"
        route-target {
            export "target:64500:1"
            import "target:64500:1"
        }
    }
    bgp-evpn {
        evi 1
        incl-mcast-orig-ip 10.12.12.12
        vxlan 1 {
            admin-state enable
            vxlan-instance 1
        }
        mpls 2 {
            admin-state enable
            auto-bind-tunnel {
                resolution any
            }
        }
    }
classic CLI
/* bgp configuration on DCGW1 and DCGW2 */
config>router>bgp# 
 group "WAN"
  family evpn
  type internal
  neighbor 192.0.2.6
 group "DC"
  family evpn
  type internal
  neighbor 192.0.2.2
/* vpls service configuration */
DCGW-1# config>service>vpls(1)#
-----------------------
bgp 
  route-distinguisher 64501:12
  route-target target:64500:1
exit
bgp 2
  route-distinguisher 64502:12
  route-target target:64500:1
exit
vxlan instance 1 vni 1 create
exit
bgp-evpn 
  evi 1
  incl-mcast-orig-ip 10.12.12.12
  vxlan bgp 1 vxlan-instance 1
   no shutdown 
  mpls bgp 2
   no shutdown
   auto-bind-tunnel 
     resolution any
   exit
<snip>
DCGW-2# config>service>vpls(1)#
-----------------------
bgp 
  route-distinguisher 64501:12
  route-target target:64500:1
exit
bgp 2
  route-distinguisher 64502:12
  route-target target:64500:1
exit
vxlan instance 1 vni 1 create
exit
bgp-evpn 
  evi 1
  incl-mcast-orig-ip 10.12.12.12
  vxlan bgp 1 vxlan-instance 1
   no shutdown 
  mpls bgp 2
   no shutdown
   auto-bind-tunnel 
     resolution any
<snip>

Based on the preceding configuration example, the behavior of the DCGWs in this scenario is as follows:

  • DCGW-1 and DCGW-2 send inclusive multicast routes to the DC RR and WAN RR with the same route key. For example:

    • DCGW-1 and DCGW-2 both send an IR route to DC RR with RD=64501:12, orig-ip=10.12.12.12, and a different next hop and tunnel ID

    • DCGW-1 and DCGW-2 both send an IR route to WAN RR with RD=64502:12, orig-ip=10.12.12.12, and different next hop and tunnel ID

  • DCGW-1 and DCGW-2 both receive MAC/IP routes from DC and WAN that are redistributed to the other BGP instances, assuming that the route is selected as best route and the MAC is installed in the FDB.

    As described in section BGP-EVPN routes in services configured with two BGP instances, router peer policies are required so that only VXLAN or MPLS routes are sent or received for a specific peer.

  • Configuration of the same incl-mcast-orig-ip address in both DCGWs enables the anycast solution for BUM traffic for all the following reasons:

    • The configured originating-ip is not required to be a reachable IP address. Because both DCGWs advertise the same originating-ip (and the same route key), the remote PE or NVEs select only one of the two DCGWs.

    • The BGP next hops are allowed to be the system-ip or even a loopback address. In both cases, the BGP next hops are not required to be reachable in their respective networks.

In the example shown in Multihomed anycast solution, PE-1 picks up DCGW-1's inclusive multicast route (because of its lower BGP next hop) and creates a BUM destination to 192.0.2.4. When sending BUM traffic for VPLS-1, it only sends the traffic to DCGW-1. In the same way, the DCGWs do not set up BUM destinations between each other as they use the same originating-ip in their inclusive multicast routes.

The remote PE or NVEs perform a similar BGP selection for MAC/IP routes, as a specific MAC is sent by the two DCGWs with the same route key. A PE or NVE sends known unicast traffic for a specific MAC to only one DCGW.

Epipe services

The Anycast redundancy solution can also be used for gateways that stitch SRv6 to MPLS domains for EVPN-VPWS services. The principle is similar to the one explained in VPLS services. The following figure shows an example.

Figure 63. Multihomed Anycast solution for Epipe services

The configuration on the two gateways (BR-2 and BR-3 in the preceding example) must generate AD per-EVI routes with the same route key (including the same RD) from both gateways so that the ingress PEs select one of the two gateways based on BGP best-path selection.

Note: The D-PATH configuration (domain-id) is also needed, and mh-mode must be configured on both gateways.

The following is an example of the (identical) configuration in BR-2 and BR-3.

MD-CLI
[ex:/configure service epipe "1"]
A:admin@BR-2/BR-3# info
    admin-state enable
    service-id 1
    customer "1"
    segment-routing-v6 1 {
        locator "LOC-1" {
            function {
                end-dx2 {
                }
            }
        }
    }
    bgp 1 {
        route-distinguisher "2323:1"
    }
    bgp 2 {
        route-distinguisher "2323:2"
    }
    endpoint "MPLS" {
    }
    endpoint "SRv6" {
    }
    bgp-evpn {
        evi 1
        local-attachment-circuit "gw-mpls" { // implicitly associated with bgp 1
            eth-tag 1
            endpoint "MPLS"
        }
        remote-attachment-circuit "ac-1-mpls" {
            eth-tag 1
            endpoint "MPLS"
        }
        local-attachment-circuit "gw-srv6" { // associated with bgp 2
            eth-tag 1
            endpoint "SRv6"
            bgp 2
        }
        remote-attachment-circuit "ac-2-srv6" {
            eth-tag 1
            endpoint "SRv6"
            bgp 2
        }
        mpls 1 {
            admin-state enable
            ecmp 2
            domain-id 64500:1
            mh-mode access
            auto-bind-tunnel {
                resolution any
            }
            route-next-hop {
                ip-address 2001:db8::2
            }
        }
        segment-routing-v6 2 {
            admin-state enable
            source-address 2001:db8::2
            ecmp 2
            domain-id 64500:2
            mh-mode network // default
            srv6 {
                instance 1
                default-locator "LOC-1"
            }
            route-next-hop {
                system-ipv6
            }
        }
    }
classic CLI
*A:BR-2/BR-3# configure service epipe 1
*A:BR-2/BR-3>config>service>epipe# info 
----------------------------------------------
            endpoint "MPLS" create
            exit
            endpoint "SRv6" create
            exit
            segment-routing-v6 1 create
                locator "LOC-1"
                    function
                        end-dx2
                    exit
                exit
            exit
            bgp 1
                route-distinguisher 2323:1
            exit
            bgp 2
                route-distinguisher 2323:2
            exit
            bgp-evpn
                local-attachment-circuit "gw-mpls" bgp 1 endpoint "MPLS" create
                    eth-tag 1
                exit
                local-attachment-circuit "gw-srv6" bgp 2 endpoint "SRv6" create
                    eth-tag 1
                exit
                remote-attachment-circuit "ac-1-mpls" bgp 1 endpoint "MPLS" create
                    eth-tag 1
                exit
                remote-attachment-circuit "ac-2-srv6" bgp 2 endpoint "SRv6" create
                    eth-tag 1
                exit
                evi 1
                mpls bgp 1
                    domain-id 64500:1
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
                segment-routing-v6 bgp 2 srv6-instance 1 default-locator "LOC-1" create
                    domain-id 64500:2
                    mh-mode access
                    source-address 2001:db8::3
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------

In this example:

  • The Anycast gateways attached to the same two domains redistribute the EVPN AD per-EVI routes between domains, where ESI is always reset to zero.
  • The redundant gateways may set the same Ethernet Tag ID in the redistributed A-D per-EVI route (the example shows the same eth-tag values, but the gateways could also use other values).
  • The Anycast gateways process the received D-PATH attribute and update the D-PATH (with the source domain-id) when redistributing the AD per-EVI route to the next domain. The D-PATH attribute avoids control plane loops.

The following considerations related to the use of D-PATH in this configuration apply:

  • Based on the domain configuration, when an AD per-EVI route is imported in domain X and redistributed into domain Y, the domain ID of X is prepended to the D-PATH in the redistributed AD per-EVI route.
  • When two AD per-EVI routes with the same RD and Ethernet tag (same route key) are received in the same service from different next hops, D-PATH is considered in the BGP best-path selection, unless d-path-ignore true is configured.
  • When two AD per-EVI routes for the same service are received with different route distinguishers and the same Ethernet tag, from different next hops, D-PATH is considered in the BGP best-path selection, unless d-path-ignore true is configured, and assuming bgp-path-selection true is configured.
  • If d-path-ignore false is configured, the router compares the D-PATH attribute received in VPWS AD per-EVI routes with the same key (same or different RDs) as follows:
    • The routes with the shortest D-PATH are preferred; routes with a longer D-PATH are removed from consideration. Routes without a D-PATH are considered to have a zero-length D-PATH.
    • Among the remaining routes, the routes with the numerically lowest left-most domain ID are preferred; other routes are removed from consideration.
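These two comparison steps can be sketched as a pruning function (an illustrative model in which a D-PATH is a list of "AS:value" domain IDs and a missing D-PATH is a zero-length list; not SR OS code):

```python
# Sketch of the two D-PATH comparison steps above. Each element of
# `d_paths` is one route's D-PATH, modeled as a list of "AS:value"
# domain-id strings; a route without a D-PATH is an empty list.

def prune_by_d_path(d_paths):
    # Step 1: keep only the routes with the shortest D-PATH
    # (zero-length for routes without a D-PATH).
    shortest = min(len(d) for d in d_paths)
    d_paths = [d for d in d_paths if len(d) == shortest]
    # Step 2: among those, keep only the routes whose left-most
    # domain-id is numerically lowest. Routes with a zero-length
    # D-PATH have no left-most domain-id to compare.
    if shortest == 0:
        return d_paths
    def as_numbers(domain_id):
        return tuple(int(x) for x in domain_id.split(":"))
    lowest = min(as_numbers(d[0]) for d in d_paths)
    return [d for d in d_paths if as_numbers(d[0]) == lowest]
```

A route with a shorter D-PATH always wins step 1; only routes that tie on length reach the left-most domain-id comparison.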

Using P2MP mLDP in redundant anycast DCGWs

Anycast multihoming and mLDP shows an example of a common BGP-EVPN service configured in redundant anycast DCGWs, with mLDP used in the MPLS instance.

Note: Packet duplication may occur if the service configuration is not performed carefully.
Figure 64. Anycast multihoming and mLDP

When mLDP is used with multiple anycast multihoming DCGWs, the same originating IP address must be used by all the DCGWs. Failure to do so may result in packet duplication.

In the example shown in Anycast multihoming and mLDP, each pair of DCGWs (DCGW1/DCGW2 and DCGW3/DCGW4) is configured with a different originating IP address, which causes the following behavior:

  1. DCGW3 and DCGW4 receive the inclusive multicast routes with the same route key from DCGW1 and DCGW2.

  2. Both DCGWs (DCGW3 and DCGW4) select only one route, which is generally the same, for example, DCGW1's inclusive multicast route.

  3. As a result, DCGW3 and DCGW4 join the mLDP tree with root in DCGW1, creating packet duplication when DCGW1 sends BUM traffic.

  4. Remote PE nodes with a single BGP-EVPN instance join the mLDP tree without any problem.

To avoid the packet duplication shown in Anycast multihoming and mLDP, Nokia recommends configuring the same originating IP address in all four DCGWs (DCGW1/DCGW2 and DCGW3/DCGW4). However, the route distinguishers can be different per pair.

The following behavior occurs if the same originating IP address is configured on the DCGW pairs shown in Anycast multihoming and mLDP.

Note: This configuration allows the use of mLDP as long as BUM traffic is not required between the two DCs. Ingress replication must be used if BUM traffic between the DCs is required.
  • DCGW3 and DCGW4 do not join any mLDP tree sourced from DCGW1 or DCGW2, which prevents any packet duplication. This is because a router ignores inclusive multicast routes received with its own originating-ip, regardless of the route distinguisher.

  • PE1 joins the mLDP trees from the two DCs.

I-ES solution for dual BGP instance services

SR OS supports Interconnect ESs (I-ESs) for VXLAN, as described in RFC 9014. An I-ES is a virtual ES that allows DCGWs with two BGP instances to handle VXLAN access networks like any other type of ES. I-ESs support the RFC 7432 multihoming functions, including single-active and all-active modes, ESI-based split-horizon filtering, DF election, and aliasing and backup on remote EVPN-MPLS PEs.

In addition to the EVPN multihoming features, the main advantages of the I-ES redundant solution compared to the redundant solution described in Anycast redundant solution for dual BGP-instance services are as follows:

  • The use of I-ES for redundancy in dual BGP-instance services allows local SAPs on the DCGWs.

  • P2MP mLDP can be used to transport BUM traffic between DCs that use I-ES without any risk of packet duplication. As described in Using P2MP mLDP in redundant anycast DCGWs, packet duplication may occur in the anycast DCGW solution when mLDP is used in the WAN.

Where EVPN-MPLS networks are interconnected to EVPN-VXLAN networks, the I-ES concept applies only to the access VXLAN network; the EVPN-MPLS network does not modify its existing behavior.

The Interconnect ES concept shows the use of I-ES for Layer 2 EVPN DCI between VXLAN and MPLS networks.

Figure 65. The Interconnect ES concept

The following example shows how I-ES-1 would be provisioned on DCGW1 and the association of the I-ES with a specified VPLS service. A similar configuration would occur on DCGW2 for the same I-ES.

New I-ES configuration:

DCGW1#config>service>system>bgp-evpn# 
ethernet-segment I-ES-1 virtual create
 esi 01:00:00:00:12:12:12:12:12:00
 service-carving
  mode auto
 multi-homing all-active
 network-interconnect-vxlan 1
 service-id
   service-range 1 to 1000
 no shutdown

Service configuration:

DCGW1#config>service>vpls(1)# 
 vxlan instance 1 vni 1 create
 exit
bgp
 route-distinguisher 1:1 
bgp 2
 route-distinguisher 2:2               
bgp-evpn        
  evi 1
  vxlan bgp 1 vxlan-instance 1
   no shutdown
  exit
  mpls bgp 2
   auto-bind-tunnel resolution any                     
   no shutdown

...

DCGW1#config>service>vpls(2)#  
 vxlan instance 1 vni 2 create
 exit
bgp
 route-distinguisher 3:3 
bgp 2
 route-distinguisher 4:4               
 bgp-evpn        
  evi 2
  vxlan bgp 1 vxlan-instance 1
   no shutdown
  exit
  mpls bgp 2
   auto-bind-tunnel resolution any   
   no shutdown
 sap 1/1/1:1 create
 exit 

The preceding configuration associates I-ES-1 with the VXLAN instance in services VPLS 1 and VPLS 2. The I-ES is modeled as a virtual ES, with the following considerations:

  • The commands network-interconnect-vxlan and service-id service-range svc-id [to svc-id] are required within the ES.

    • The network-interconnect-vxlan parameter identifies the VXLAN instance associated with the virtual ES. The value of the parameter must be set to 1. This command is rejected in a non-virtual ES.

    • The service-range parameter associates the specific service range to the ES. The ES must be configured as network-interconnect-vxlan before any service range can be added.

    • The ES parameters port, lag, sdp, vc-id-range, dot1q, and qinq cannot be configured in the ES when a network-interconnect-vxlan instance is configured. The source-bmac-lsb option is blocked, as the I-ES cannot be associated with an I-VPLS or PBB-Epipe service. The remaining ES configuration options are supported.

    • All services with two BGP instances associate the VXLAN destinations and ingress VXLAN instances to the ES.

  • Multiple services can be associated with the same ES, with the following considerations:

    • In a DC with two DCGWs (as in The Interconnect ES concept), only two I-ESs are needed to load-balance, where one half of the dual BGP-instance services would be associated with one I-ES (for example, I-ES-1, in the above configuration) and one half to another I-ES.

    • Up to eight service ranges per VXLAN instance can be configured. Ranges may overlap within the same ES, but not between different ESs.

    • The service range can be configured before the service.

  • After the I-ES is configured using network-interconnect-vxlan, the ES operational state depends exclusively on the ES administrative state. Because the I-ES is not associated with a physical port or SDP, when testing the non-revertive service carving manual mode, an ES shutdown and no shutdown event results in the node sending its own administrative preference and DP bit and taking control if the preference and DP bit are higher than the current DF. This is because the peer ES routes are not present at the EVPN application layer when the ES is configured for no shutdown; therefore, the PE sends its own administrative preference and DP values. For I-ESs, the non-revertive mode works only for node failures.

  • A VXLAN instance may be placed in MhStandby under any of the following situations:

    • if the PE is single-active NDF for that I-ES

    • if the VXLAN service is added to the I-ES and either the ES or BGP-EVPN MPLS is shut down in all the services included in the ES

      The following example shows the change of the MhStandby flag from false to true when BGP-EVPN MPLS is shut down for all the services in the I-ES.

      A:PE-4# show service id 500 vxlan instance 1 oper-flags 
      ===============================================================================
      VPLS VXLAN oper flags
      ===============================================================================
      MhStandby                              : false
      ===============================================================================
      A:PE-4# configure service vpls 500 bgp-evpn vxlan shutdown 
      *A:PE-4# show service id 500 vxlan instance 1 oper-flags    
      ===============================================================================
      VPLS VXLAN oper flags
      ===============================================================================
      MhStandby                              : true
      ===============================================================================
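The service-range rule above (ranges may overlap within the same ES, but not between different ESs) can be illustrated with a hedged sketch (hypothetical data model, not SR OS code):

```python
# Illustrative check of the I-ES service-range rule: up to eight ranges
# per VXLAN instance, overlap allowed within one ES, but not between
# different ESs. The dict-of-ranges model is an assumption for clarity.

def ranges_valid(es_ranges):
    """es_ranges: dict mapping ES name -> list of (lo, hi) service ranges.
    Returns True if no range overlaps a range of a *different* ES."""
    for es_a, ranges_a in es_ranges.items():
        for es_b, ranges_b in es_ranges.items():
            if es_a >= es_b:  # check each unordered ES pair once
                continue
            for lo_a, hi_a in ranges_a:
                for lo_b, hi_b in ranges_b:
                    if lo_a <= hi_b and lo_b <= hi_a:  # intervals intersect
                        return False
    return True
```

For example, ranges (1, 1000) and (500, 600) are acceptable on one I-ES, but (1, 1000) on I-ES-1 and (900, 1100) on I-ES-2 would conflict.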
      
BGP-EVPN routes on dual BGP-instance services with I-ES

The configuration of an I-ES on DCGWs with two BGP instances has the following impact on the advertisement and processing of BGP-EVPN routes.

  • For EVPN MAC/IP routes, the following considerations apply:

    If bgp-evpn>vxlan>no auto-disc-route-advertisement and mh-mode access are configured on the access instance:

    • MAC/IP routes received in the EVPN-MPLS BGP instance are readvertised in the EVPN-VXLAN BGP instance with the ESI set to zero.

    • EVPN-VXLAN PEs and NVEs in the DC receive the same MAC from two or more different MAC/IP routes from the DCGWs, which perform regular EVPN MAC/IP route selection.

    • MAC/IP routes received in the EVPN-VXLAN BGP instance are readvertised in the EVPN-MPLS BGP instance with the configured non-zero I-ESI value, assuming the VXLAN instance is not in an MhStandby operational state; otherwise the MAC/IP routes are dropped.

    • EVPN-MPLS PEs in the WAN receive the same MAC from two or more DCGWs, advertised with the same ESI. In this case, regular aliasing and backup functions occur as usual.

  • If bgp-evpn>vxlan>auto-disc-route-advertisement and mh-mode access are configured, the following differences apply to the above:

    • MAC/IP routes received in the EVPN-MPLS BGP instance are readvertised in the EVPN-VXLAN BGP instance with the ESI set to the I-ESI.

    • In this case, EVPN-VXLAN PEs and NVEs in the DC receive the same MAC in two or more different MAC/IP routes from the DCGWs with the same ESI; therefore, they can perform aliasing.

  • ES routes are exchanged for the I-ES. The routes should be sent only to the MPLS network and not to the VXLAN network. This can be achieved by using router policies.

  • AD per-ES and AD per-EVI routes are also advertised for the I-ES, and are sent only to the MPLS network and not to the VXLAN network if bgp-evpn>vxlan>no auto-disc-route-advertisement is configured. For ES routes, router policies can be used to prevent these routes from being sent to VXLAN peers. If bgp-evpn>vxlan>auto-disc-route-advertisement is configured, AD routes must be sent to the VXLAN peers so that they can apply backup or aliasing functions.
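The ESI handling on readvertised MAC/IP routes described above can be summarized in a short sketch. This is an illustrative Python model of the documented behavior, not SR OS code; the route fields and I-ESI value are hypothetical.

```python
# Illustrative model of the ESI rewrite applied by a DCGW when it
# readvertises MAC/IP routes between the two BGP instances.
I_ESI = "00:23:23:23:23:23:23:00:00:01"  # configured non-zero I-ESI (example)

def readvertise(route, auto_disc_adv, mh_standby):
    """Return the readvertised route, or None if it is dropped."""
    if route["instance"] == "mpls":
        # MPLS -> VXLAN: the ESI is zeroed unless AD routes are also
        # advertised to the VXLAN peers (auto-disc-route-advertisement)
        return {"instance": "vxlan", "esi": I_ESI if auto_disc_adv else "0"}
    # VXLAN -> MPLS: dropped while the VXLAN instance is in MhStandby,
    # otherwise tagged with the configured non-zero I-ESI
    if mh_standby:
        return None
    return {"instance": "mpls", "esi": I_ESI}
```

For example, a MAC/IP route received in the EVPN-MPLS instance is readvertised to VXLAN with a zero ESI when auto-disc-route-advertisement is disabled, and with the I-ESI when it is enabled.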

In general, when I-ESs are used for redundancy, the use of router policies is needed to avoid control plane loops with MAC/IP routes. Consider the following to avoid control plane loops:

  • loops created by remote MACs

    Remote EVPN-MPLS MAC/IP routes are readvertised into EVPN-VXLAN routes with a Site of Origin (SOO) extended community, added by a BGP peer or VSI export policy, that identifies the DCGW pair. The other DCGW in the pair drops EVPN-VXLAN MAC/IP routes tagged with the pair's SOO. Router policies are needed to add the SOO and to drop routes received with the self SOO.

    When remote EVPN-VXLAN MAC/IP routes are readvertised into EVPN-MPLS, the DCGWs automatically drop EVPN-MPLS MAC/IP routes received with their own non-zero I-ESI.

  • loops created by local SAP MACs

    Local SAP MACs are learned and MAC/IP routes are advertised into both BGP instances. The MAC/IP routes advertised in the EVPN-VXLAN instance are dropped by the peer based on the SOO router policies as described above for loops created by remote MACs. The DCGW local MACs are always learned over the EVPN-MPLS destinations between the DCGWs.
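The two loop-prevention checks above can be modeled as follows. This is an illustrative Python sketch, not SR OS code; the field names and the SOO/I-ESI values are hypothetical examples.

```python
# Illustrative model of the two control-plane loop checks on a DCGW:
# the SOO check (applied by router policy) and the self-I-ESI check
# (applied automatically, without policies).
PAIR_SOO = "origin:64500:23"                  # SOO identifying this DCGW pair
SELF_I_ESI = "00:23:23:23:23:23:23:00:00:01"  # locally configured I-ESI

def accept_mac_route(instance, communities, esi):
    # Drop EVPN-VXLAN MAC/IP routes tagged with the pair's own SOO,
    # that is, routes readvertised by the other DCGW of the same pair.
    if instance == "vxlan" and PAIR_SOO in communities:
        return False
    # Drop EVPN-MPLS MAC/IP routes received with our own non-zero I-ESI.
    if instance == "mpls" and esi == SELF_I_ESI:
        return False
    return True
```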

The following describes the considerations for BGP peer policies on DCGW1 to avoid control plane loops. Similar policies would be configured on DCGW2.

  • Avoid sending service VXLAN routes to MPLS peers and service MPLS routes to VXLAN peers.

  • Avoid sending AD and ES routes to VXLAN peers. If bgp-evpn>vxlan>auto-disc-route-advertisement is configured, AD routes must be sent to the VXLAN peers.

  • Add SOO to VXLAN routes sent to the ES peer.

  • Drop VXLAN routes received from the ES peer.

The following shows the CLI configuration:

A:DCGW1# configure router bgp 
A:DCGW1>config>router>bgp# info 
----------------------------------------------
            family vpn-ipv4 evpn
            vpn-apply-import
            vpn-apply-export
            rapid-withdrawal
            rapid-update vpn-ipv4 evpn
            group "wan"
                type internal
                export "allow only mpls" 
                neighbor 192.0.2.4
                exit
                neighbor 192.0.2.5
                exit
            exit
            group "internal"
                type internal
                neighbor 192.0.2.1
                    export "allow only vxlan" 
                exit
                neighbor 192.0.2.3
                    import "drop SOO-DCGW-23" 
                    export "add SOO to vxlan routes" 
                exit                  
            exit
            no shutdown
----------------------------------------------
A:DCGW1>config>router>bgp# /configure router policy-options    
A:DCGW1>config>router>policy-options# info 
----------------------------------------------
            community "mpls" members "bgp-tunnel-encap:MPLS"
            community "vxlan" members "bgp-tunnel-encap:VXLAN"
            community "SOO-DCGW-23" members "origin:64500:23"

This policy prevents the router from sending service VXLAN routes to MPLS peers.

            policy-statement "allow only mpls"
                entry 10
                    from
                        community "vxlan"
                        family evpn
                    exit
                    action drop
                    exit
                exit
            exit

This policy ensures the router only exports routes that include the VXLAN encapsulation.

            policy-statement "allow only vxlan"
                entry 10
                    from
                        community "vxlan"
                        family evpn
                    exit
                    action accept
                    exit
                exit                  
                default-action drop
                exit
            exit

This import policy avoids importing routes with a self SOO.

            policy-statement "drop SOO-DCGW-23"
                entry 10
                    from
                        community "SOO-DCGW-23"
                        family evpn
                    exit
                    action drop
                    exit
                exit
            exit

This export policy adds the SOO only to VXLAN routes. This allows the peer to drop routes based on the SOO, without affecting the MPLS routes.

            policy-statement "add SOO to vxlan routes"
                entry 10
                    from
                        community "vxlan"
                        family evpn
                    exit
                    action accept
                        community add "SOO-DCGW-23"
                    exit
                exit                  
                default-action accept
                exit
            exit
----------------------------------------------
Single-active multihoming on I-ES

When an I-ES is configured as single-active and no shutdown, with at least one associated service, the DCGWs send ES and AD routes as for any ES. They also run DF election as normal, based on the ES routes, with the candidate list pruned by the AD routes.
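The DF election input described above can be sketched as follows. This is an illustrative Python model only; the final tie-break (here, a simple deterministic pick of the first sorted candidate) stands in for whatever DF election algorithm the ES is configured with, and the PE names are hypothetical.

```python
# Sketch of the DF candidate list: PEs advertising ES routes for the
# I-ES, pruned to those that also advertised AD routes.
def df_candidates(es_route_pes, ad_route_pes):
    return sorted(set(es_route_pes) & set(ad_route_pes))

def elect_df(es_route_pes, ad_route_pes):
    cands = df_candidates(es_route_pes, ad_route_pes)
    # Placeholder tie-break for illustration; the real election depends
    # on the configured service-carving mode (e.g. preference-based).
    return cands[0] if cands else None
```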

I-ES — single-active shows the expected behavior for a single-active I-ES.

Figure 66. I-ES — single-active

As shown in I-ES — single-active, the Non-Designated Forwarder (NDF) for a specified service carries out the following tasks:

  • From a datapath perspective, the VXLAN instance on the NDF goes into an MhStandby operational state and blocks ingress and egress traffic on the VXLAN destinations associated with the I-ES.

  • The MAC/IP routes and the FDB process

    • MAC/IP routes associated with the VXLAN instance and readvertised to EVPN-MPLS peers are withdrawn.

    • MAC/IP routes corresponding to local SAP MACs or EVPN-MPLS binding MACs are withdrawn if they were advertised to the EVPN-VXLAN instance.

    • Received MAC/IP routes associated with the VXLAN instance are not installed in the FDB. MAC/IP routes show as ‟used” in BGP; however, only the MAC/IP route received from MPLS (from the ES peer) is programmed.

  • The Inclusive Multicast Ethernet Tag (IMET) routes process

    • IMET-AR-R routes (IMET-AR with replicator role) must be withdrawn if the VXLAN instance goes into an MhStandby operational state. Only the DF advertises the IMET-AR-R routes.

    • IMET-IR advertisements in the case of the NDF (or MhStandby) are controlled by the command config>service>vpls>bgp-evpn>vxlan [no] send-imet-ir-on-ndf.

      By default, the command is enabled and the router advertises IMET-IR routes, even if the PE is NDF (MhStandby). This attracts BUM traffic, but also speeds up convergence in the case of a DF switchover. The command is supported for single-active and all-active.

      If the command is disabled, the router withdraws the IMET-IR routes when the PE is NDF and does not attract BUM traffic.

The I-ES DF PE for the service continues advertising IMET and MAC/IP routes for the associated VXLAN instance as usual, as well as forwarding on the DF VXLAN bindings. When the DF DCGW receives BUM traffic, it sends the traffic with the egress ESI label if needed.

All-active multihoming on I-ES

The same considerations for ES and AD routes, and DF election apply for all-active multihoming as for single-active multihoming; the difference is in the behavior on the NDF DCGW. The NDF for a specified service performs the following tasks:

  • From a datapath perspective, the NDF blocks the ingress and egress paths for broadcast and multicast traffic on the VXLAN instance bindings associated with the I-ES, while unknown and known unicast traffic is still allowed. Unknown unicast traffic is transmitted on the NDF if there is no risk of duplication; for example, unknown unicast packets are transmitted on the NDF if they do not have an ESI label, do not have an EVPN BUM label, and pass MAC SA suppression. In the example in All-active multihoming and unknown unicast on the NDF, the NDF transmits unknown unicast traffic. Regardless of whether DCGW1 is the DF or NDF, it accepts the unknown unicast packets and floods them to local SAPs and EVPN destinations. When sending to DCGW2, the router sends the ESI label identifying the I-ES. Because of the ESI-label suppression, DCGW2 does not send unknown traffic back to the DC.

    Figure 67. All-active multihoming and unknown unicast on the NDF
  • The MAC/IP routes and the FDB process

    • MAC/IP routes associated with the VXLAN instance are advertised normally.

    • MACs are installed as normal in the FDB for received MAC/IP routes associated with the VXLAN instance.

  • The IMET routes process

    • As with single-active multihoming, IMET-AR-R routes must be withdrawn on the NDF (MhStandby state). Only the DF advertises the IMET-AR-R routes.

    • The IMET-IR advertisements in the case of the NDF (or MhStandby) are controlled by the command config>service>vpls>bgp-evpn>vxlan [no] send-imet-ir-on-ndf, as in single-active multihoming.

The behavior on the non-DF for BUM traffic can also be controlled by the command config>service>vpls>vxlan>rx-discard-on-ndf {bm | bum | none}, where the default option is bm. However, the user can change this option to discard all BUM traffic, or forward all BUM traffic (none).
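The rx-discard-on-ndf options described above can be summarized in a short sketch. This is an illustrative Python model of the documented option semantics, not SR OS code, and the frame classification is simplified.

```python
# Sketch of the rx-discard-on-ndf {bm | bum | none} filter applied to
# BUM traffic received on the NDF's ES VXLAN bindings.
def discard_on_ndf(frame_type, mode="bm"):
    # frame_type: "broadcast", "multicast", "unknown-unicast", "known-unicast"
    if mode == "none":
        return False  # forward all BUM traffic
    if mode == "bum":
        return frame_type in ("broadcast", "multicast", "unknown-unicast")
    # default "bm": discard broadcast and multicast, allow all unicast
    return frame_type in ("broadcast", "multicast")
```

With the default bm option, unknown unicast is still accepted on the NDF; the bum option discards it as well.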

The I-ES DF PE for the service continues advertising IMET and MAC/IP routes for the associated VXLAN instance as usual. When the DF DCGW receives BUM traffic, it sends the traffic with the egress ESI label if needed.

Multi-instance EVPN: Two instances of the same encapsulation in the same VPLS/R-VPLS service

As described in Multi-instance EVPN: Two instances of different encapsulation in the same VPLS/R-VPLS/Epipe service, two BGP instances are supported in VPLS services, where one instance can be associated with the EVPN-VXLAN and the other instance with the EVPN-MPLS. In addition, both BGP instances in a VPLS/R-VPLS service can also be associated with EVPN-VXLAN, or both instances can be associated with EVPN-MPLS.

For example, a VPLS service can be configured with two VXLAN instances that use VNI 500 and 501 respectively, and those instances can be associated with different BGP instances:

*A:PE-2# configure service vpls 500
*A:PE-2>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 500 create
exit
vxlan instance 2 vni 501 create
exit
bgp
 route-distinguisher 192.0.2.2:500
 vsi-export "vsi-500-export"
 vsi-import "vsi-500-import"
exit
bgp 2
 route-distinguisher 192.0.2.2:501
 vsi-export "vsi-501-export"
 vsi-import "vsi-501-import"
exit
bgp-evpn
 incl-mcast-orig-ip 23.23.23.23
 evi 500
 vxlan bgp 1 vxlan-instance 1
 no shutdown
 exit
 vxlan bgp 2 vxlan-instance 2
 no shutdown
 exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------

From a data plane perspective, each VXLAN instance is instantiated in a different implicit SHG, so that traffic can be forwarded between them.

In addition, multi-instance EVPN-VXLAN services support:

  • assisted-replication for IPv4 VTEPs in both VXLAN instances, where a single assisted-replication IPv4 address can be used for both instances

  • non-system IP and IPv6 termination, where a single vxlan-src-vtep ip-address can be configured for each service, and therefore used for the two instances

For example, a VPLS service can be configured with two EVPN-MPLS instances that are associated with two BGP instances as follows.
*A:PE-2# configure service vpls 700
*A:PE-2>config>service>vpls# info
----------------------------------------------
description "two bgp-evpn mpls instances"
bgp
  route-distinguisher auto-rd
  vsi-export "vsi-700-export"
  vsi-import "vsi-700-import"
exit
bgp 2
  route-distinguisher auto-rd
  vsi-export "vsi-701-export"
  vsi-import "vsi-701-import"
exit
bgp-evpn
  evi 700
  mpls bgp 1
    mh-mode access
    ingress-replication-bum-label
    auto-bind-tunnel
      resolution any
    exit
    no shutdown
  exit
  mpls bgp 2
    mh-mode network
    ingress-replication-bum-label
    auto-bind-tunnel
      resolution any
    exit
    no shutdown
  exit
exit
stp
  shutdown
exit
no shutdown
----------------------------------------------

Multi-instance EVPN-MPLS VPLS/R-VPLS services have the same limitations as any multi-instance service, as described in Multi-Instance EVPN: EVPN-VXLAN and EVPN-MPLS in the same VPLS/R-VPLS service. In addition, services with two EVPN-MPLS instances do not support SAPs.

The mh-mode {network|access} command in the vpls>bgp-evpn>mpls context determines which instance is considered access and which instance is considered network.

  • The default form of the bgp-evpn>mpls command is mh-mode network, and only one instance can be configured as network. The other instance must be configured as mh-mode access.
  • The use of provider-tunnel is supported if there is one instance configured as network, and the P2MP tunnel is implicitly associated with the network instance.

Multi-instance EVPN-MPLS VPLS/R-VPLS services support:

  • all of the auto-bind-tunnel resolution options in each of the two instances. This includes resolution of IPv4 next-hops to TTMv4 entries and resolution of IPv6 next-hops to TTMv6 entries.

  • different address families in different instances. For instance, mpls bgp 1 may resolve routes to TTMv4 and mpls bgp 2 to TTMv6, or the reverse. In a VPLS service with two EVPN-VXLAN instances, it is not possible to have an instance with routes resolved to IPv4 VXLAN tunnels and the other instance with routes resolved to IPv6 VXLAN tunnels.

  • an explicit split-horizon-group in each instance; however, the same split-horizon-group cannot be configured on the two instances of the same VPLS service.

  • a restrict-protected-src discard-frame per instance. If a MAC is protected in one instance and a frame arrives at the other instance with the protected MAC as source MAC, the frame is discarded if restrict-protected-src discard-frame is configured.

At the control plane level for two EVPN-VXLAN or two EVPN-MPLS instances, the processing of MAC/IP routes and inclusive multicast routes is described in BGP-EVPN routes in services configured with two BGP instances with the differences between the two scenarios described in BGP-EVPN routes in multi-instance EVPN services with the same encapsulation.

BGP-EVPN routes in multi-instance EVPN services with the same encapsulation

If two BGP instances with the same encapsulation (VXLAN or MPLS) are configured in the same VPLS/R-VPLS service, different import route targets must be used in each BGP instance (although this is not enforced).

BGP-EVPN routes in services configured with two BGP instances describes the use of policies to avoid sending WAN routes (routes meant to be redistributed from DC to WAN) to the DC again and DC routes (routes meant to be redistributed from WAN to DC) to the WAN again. Those policies are based on export policy statements that match on the RFC 9012 BGP encapsulation extended community (MPLS and VXLAN respectively).

When the two BGP instances are of the same encapsulation (VXLAN or MPLS), the policies matching on different BGP encapsulation extended community are not feasible because both instances advertise routes with the same encapsulation value. Because the export route targets in the two BGP instances must be different, the policies, to avoid sending WAN routes back to the WAN and DC routes back to the DC, can be based on export policies that prevent routes with a DC route target from being sent to the WAN peers (and opposite for routes with a WAN route target).

In scaled scenarios, matching on route targets does not scale well. An alternative and preferred solution is to configure a default-route-tag that identifies all the EVPN instances connected to the DC (or one domain), and a different default-route-tag in all the EVPN instances connected to the WAN (or the other domain). Anycast redundant solution for multi-instance EVPN services with the same encapsulation shows an example that demonstrates the use of default-route-tags.
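The default-route-tag based export filtering can be sketched as follows. This is an illustrative Python model of the policy intent only; the tag values 500 and 501 and the domain names are hypothetical examples.

```python
# Sketch of default-route-tag based export filtering between domains:
# routes tagged with the DC-side instance tag are not exported to WAN
# peers, and WAN-tagged routes are not exported to DC peers.
DC_TAG, WAN_TAG = 500, 501  # example default-route-tag values

def export_route(route_tag, peer_domain):
    if peer_domain == "wan" and route_tag == DC_TAG:
        return False  # do not send DC routes back to the WAN
    if peer_domain == "dc" and route_tag == WAN_TAG:
        return False  # do not send WAN routes back to the DC
    return True
```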

Other than the specifications described in this section, the processing of MAC/IP routes and inclusive multicast Ethernet tag routes in multi-instance EVPN services of the same encapsulation follow the rules described in BGP-EVPN routes in services configured with two BGP instances.

Anycast redundant solution for multi-instance EVPN services with the same encapsulation

The solution described in Anycast redundant solution for dual BGP-instance services is also supported in multi-instance EVPN VPLS/R-VPLS services with the same encapsulation.

The following CLI example output shows the configuration of DCGW-1 and DCGW-2 in Multihomed anycast solution where VPLS 500 is a multi-instance EVPN-VXLAN service and BGP instance 2 is associated with VXLAN instead of MPLS.

Different default-route-tags are used in BGP instance 1 and instance 2, so that in the export route policies, DC routes are not advertised to the WAN, and WAN routes are not advertised to the DC, respectively.

*A:DCGW-1(and DCGW-2)>config>service>vpls(500)# info
----------------------------------------------
vxlan instance 1 vni 500 create
exit
vxlan instance 2 vni 501 create
exit
bgp
 route-distinguisher 192.0.2.2:500
 route-target target:64500:500
exit
bgp 2
 route-distinguisher 192.0.2.2:501
 route-target target:64500:501
exit
bgp-evpn
 incl-mcast-orig-ip 23.23.23.23
 evi 500
 vxlan bgp 1 vxlan-instance 1
 default-route-tag 500
 no shutdown
 exit
 vxlan bgp 2 vxlan-instance 2
   default-route-tag 501
   no shutdown
 exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
config>router>bgp#
 vpn-apply-import
 vpn-apply-export
 group "WAN"
  family evpn
  type internal
  export "allow only mpls"
  neighbor 192.0.2.6
 group "DC"
  family evpn
  type internal
  export "allow only vxlan"
  neighbor 192.0.2.2

config>router>policy-options# info
----------------------------------------------
            policy-statement "allow only mpls"
                entry 10
                    from
                        family evpn
                        tag 500
                    exit
                    action drop
                    exit
                exit
            exit
            policy-statement "allow only vxlan"
                entry 10
                    from
                        family evpn
                        tag 501
                    exit
                    action drop
                    exit
                exit
            exit

The same Anycast redundant solution can be applied to VPLS/R-VPLS with two instances of EVPN-MPLS encapsulation. The configuration would be identical, other than replacing the VXLAN aspects with the EVPN-MPLS-specific parameters.

For a full description of this solution, see Anycast redundant solution for dual BGP-instance services.

I-ES solution for dual BGP EVPN instance services with the same encapsulation

The network-interconnect VXLAN Ethernet Segment (I-ES) is described in I-ES solution for dual BGP instance services. I-ESs are also supported on VPLS and R-VPLS services with two EVPN-VXLAN instances.

I-ES in dual EVPN-VXLAN services shows the use of an I-ES in a dual EVPN-VXLAN instance service.

Figure 68. I-ES in dual EVPN-VXLAN services

Similar to (single-instance) EVPN-VXLAN all-active multihoming, the BUM forwarding procedures follow the ‟Local Bias” behavior.

At the ingress PE, the forwarding rules for EVPN-VXLAN services are as follows:

  • The no send-imet-ir-on-ndf or rx-discard-on-ndf bum command must be enabled so that the NDF does not forward any BUM traffic.

  • BUM frames received on any SAP or I-ES VXLAN binding are flooded to:

    • local non-ES and single-active DF ES SAPs

    • local all-active ES SAPs (DF and NDF)

    • EVPN-VXLAN destinations

      BUM received on an I-ES VXLAN binding follows SHG rules; that is, it can only be forwarded to EVPN-VXLAN destinations that belong to the other VXLAN instance (instance 2), which is a different SHG.

  • As an example, in I-ES in dual EVPN-VXLAN services:

    • GW1 and GW2 are configured with no send-imet-ir-on-ndf.

    • TOR1 generates BUM traffic that only reaches GW1 (DF).

    • GW1 forwards to CE1 and EVPN-VXLAN destinations.

The forwarding rules at the egress PE are as follows:

  • The source VTEP is looked up for BUM frames received on EVPN-VXLAN.

  • If the source VTEP matches one of the PEs with which the local PE shares both an ES and a VXLAN service:

    • Then the local PE does not forward to the shared local ESs (this includes port, lag, or network-interconnect-vxlan ESs). However, it does forward to non-shared ES SAPs unless they are in an NDF state.

    • Else, the local PE forwards normally to local ESs unless they are in an NDF state.

  • Because there is no multicast label or multicast B-MAC in VXLAN, the only way the egress PE can identify BUM traffic is by looking at the customer MAC DA. Therefore, broadcast, multicast, or unknown MAC DAs identify BUM traffic.

  • As an example, in I-ES in dual EVPN-VXLAN services:

    • GW2 receives BUM on EVPN-VXLAN. GW2 identifies the source VTEP as a PE with which I-ES-1 is shared; therefore, it does not forward the BUM frames to the local I-ES. However, it does forward to the non-shared ES and local SAPs (CE2).

    • GW3 receives BUM on EVPN-VXLAN; however, the source VTEP does not match any PE with which GW3 shares an ES. Hence, GW3 forwards to all local ESs that are DF, that is, CE3.
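The egress-PE decision above can be summarized in a short sketch. This is an illustrative Python model of the "local bias" rules, not SR OS code; the VTEP addresses and ES names in the usage are hypothetical.

```python
# Sketch of the egress-PE "local bias" check for BUM frames received
# over EVPN-VXLAN.
def forward_bum_to_es(src_vtep, es, shared_peers, es_is_ndf):
    """shared_peers maps each local ES to the VTEPs of the PEs that
    share both that ES and the VXLAN service with the local PE."""
    if src_vtep in shared_peers.get(es, set()):
        return False          # shared ES: the ingress PE already delivered
    return not es_is_ndf      # non-shared ES: forward only on the DF
```

For example, GW2 (sharing I-ES-1 with the source VTEP) suppresses forwarding to the I-ES, while GW3 (no shared ES with that VTEP) forwards to its local DF ESs.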

The following configuration example shows how I-ES-1 would be provisioned on DCGW1 and the association between I-ES to a specified VPLS service. A similar configuration would occur on DCGW2 in the I-ES.

I-ES configuration:

*A:GW1>config>service>system>bgp-evpn>eth-seg# info 
----------------------------------------------
                esi 00:23:23:23:23:23:23:00:00:01
                service-carving
                    mode manual
                    manual
                        preference non-revertive create
                            value 150
                        exit
                        evi 101 to 200
                    exit
                exit
                multi-homing all-active
                network-interconnect-vxlan 1
                service-id
                    service-range 1
                    service-range 1000 to 1002
                    service-range 2000
                exit
                no shutdown

Service configuration:

*A:GW1>config>service>vpls# info 
----------------------------------------------
            vxlan instance 1 vni 1000 create
                rx-discard-on-ndf bum
            exit
            vxlan instance 2 vni 1002 create
            exit
            bgp
                route-target export target:64500:1000 import target:64500:1000
            exit
            bgp 2
                route-distinguisher auto-rd
                route-target export target:64500:1002 import target:64500:1002
            exit
            bgp-evpn
                evi 1000
                vxlan bgp 1 vxlan-instance 1
                    ecmp 2
                    default-route-tag 100
                    auto-disc-route-advertisement
                    no shutdown
                exit
                vxlan bgp 2 vxlan-instance 2
                    ecmp 2
                    default-route-tag 200
                    auto-disc-route-advertisement
                    mh-mode network
                    no shutdown
                exit
            exit
            no shutdown 

Multi-instance EVPN VPLS/R-VPLS services with two EVPN-MPLS instances do not support I-ESs.

For information about how the EVPN routes are processed and advertised in an I-ES, see the I-ES solution for dual BGP instance services.

Configuring static VXLAN and EVPN in the same VPLS/R-VPLS service

In some DCGW use cases, static VXLAN must be used to connect DC switches that do not support EVPN to the WAN so that a tenant subnet can be extended to the WAN. For those cases, the DC Gateway is configured with VPLS services that include a static VXLAN instance and a BGP-EVPN instance in the same service. The following combinations are supported in the same VPLS/R-VPLS service:

  • two static VXLAN instances

  • one static VXLAN instance and one EVPN-VXLAN instance

  • one static VXLAN instance and one EVPN-MPLS instance

When a static VXLAN instance coexists with EVPN-MPLS in the same VPLS/R-VPLS service, the VXLAN instance can be associated with a network-interconnect-vxlan ES if VXLAN uses instance 1. Both single-active and all-active multihoming modes are supported as follows:

  • In single-active mode, the following behavior applies to a VXLAN binding associated with the ES on the NDF:

    • TX (transmission to VXLAN)

      No MACs are learned against the binding, and the binding is removed from the default multicast list.

    • RX (reception from VXLAN)

      The RX state is down for the binding.

  • In all-active mode, the following behavior applies to the NDF:

    • on TX

      The binding is kept in the default multicast list, but only forwards the unknown-unicast traffic.

    • on RX

      The behavior is determined by the command rx-discard-on-ndf {bm | bum | none} where:

      • The bm option, which is the default, discards broadcast and multicast traffic and allows unicast traffic (known and unknown).

      • The bum option discards any BUM frame received on the NDF.

      • The none option does not discard any BUM frame received on the NDF.

The use of the rx-discard-on-ndf options is shown in the following cases.

Use case 1: Static VXLAN with anycast VTEPs and all-active ES

This use case, which is illustrated in I-ES multihoming – static VXLAN with anycast VTEPs, works only for all-active I-ESs.

Figure 69. I-ES multihoming – static VXLAN with anycast VTEPs

In this use case, the DCGWs use anycast VTEPs, that is, PE1 has a single egress VTEP configured to the DCGWs, for example, 12.12.12.12. Normally, PE1 finds ECMP paths to send the traffic to both DCGWs. However, because a specified BUM flow can be sent to either the DF or the NDF (but not to both at the same time), the DCGWs must be configured with the following option so that BUM is not discarded on the NDF:

rx-discard-on-ndf none

Similar to any LAG-like scenario at the access, the access CE load balances the traffic to the multihomed PEs, but a specific flow is only sent to one of these PEs. With the option none, the BUM traffic on RX is accepted, and there are no duplicate packets or black-holed packets.

Use case 2: Static VXLAN with non-anycast VTEPs

This use case, which is shown in the following figure, works with single or all-active multihoming.

Figure 70. I-ES multihoming - static VXLAN with non-anycast VTEPs

In this case, the DCGWs use different VTEPs, for example 1.1.1.1 and 2.2.2.2 respectively. PE1 has two separate egress VTEPs to the DCGWs and therefore sends BUM flows to both DCGWs at the same time. For all-active multihoming, if the default option for rx-discard-on-ndf is configured, PE2 and PE3 receive duplicate unknown unicast packets from PE1 (because the default option accepts unknown unicast on the RX of the NDF). Therefore, the DCGWs must be configured with rx-discard-on-ndf bum.

Any use case in which the access PE sends BUM flows to all multihomed PEs, including the NDF, is similar to I-ES multihoming - static VXLAN with non-anycast VTEPs. BUM traffic must be blocked on the NDF’s RX to avoid duplicate unicast packets.

For single-active multihoming, the rx-discard-on-ndf command is irrelevant because BUM and known unicast traffic are always discarded on the NDF.

Also, when non-anycast VTEPs are used on DCGWs, the following can be stated:

  • MAC addresses learned on one DCGW and advertised in EVPN are not learned on the redundant DCGW through EVPN, based on the presence of a local ES in the route. I-ES multihoming - static VXLAN with non-anycast VTEPs shows a scenario in which the MAC of the VM can be advertised by DCGW1 but not learned by DCGW2.

  • As a result of this behavior, and because known unicast traffic from PE2 to M1 can be aliased to DCGW2, traffic to M1 that reaches DCGW2 is flooded because M1 is unknown. DCGW2 floods to all the static bindings, as well as to local SAPs.

  • ESI-label filtering and the absence of a VXLAN binding between the DCGWs avoid loops for BUM traffic sent from the DF.

When a static VXLAN instance coexists with EVPN-VXLAN in the same VPLS or R-VPLS service, no VXLAN instance should be associated with an all-active network-interconnect-vxlan ES. This is because when multihoming is used with an EVPN-VXLAN core, the non-DF PE always discards unknown unicast traffic to the static VXLAN instance (this is not the case with EVPN-MPLS if the unknown traffic has a BUM label) and traffic blackholes may occur. This is discussed in the following example:

  • Consider the example in I-ES multihoming – static VXLAN with non-anycast VTEPs, only replacing EVPN-MPLS with EVPN-VXLAN in the WAN network.

  • Consider that PE2 has learned the VM’s MAC through the ES-1 EVPN destination. Because of the regular aliasing procedures, PE2 may send unicast traffic with destination VM to DCGW1, which is the non-DF for I-ES 1.

  • Because EVPN-VXLAN is used in the WAN instead of EVPN-MPLS, when the traffic reaches DCGW1, it is dropped if the VM’s MAC is not learned on DCGW1, creating a blackhole for the flow. If the I-ES had used EVPN-MPLS in the WAN, DCGW1 would have flooded to the static VXLAN bindings and no blackhole would have occurred.

Because of the behavior illustrated above, when a static VXLAN instance coexists with an EVPN-VXLAN instance in the same VPLS/R-VPLS service, redundancy based on an all-active I-ES is not recommended; single-active multihoming or an anycast solution without an I-ES should be used instead. Anycast solutions are discussed in Anycast redundant solution for multi-instance EVPN services with the same encapsulation, only with a static VXLAN instance in instance 1 instead of EVPN-VXLAN in this case.

EVPN IP-prefix route interoperability

SR OS supports the three IP-VRF-to-IP-VRF models defined in draft-ietf-bess-evpn-prefix-advertisement for EVPN-VXLAN and EVPN-MPLS R-VPLS services. Those three models are known as:

  • interface-less IP-VRF-to-IP-VRF

  • interface-ful IP-VRF-to-IP-VRF with SBD IRB (Supplementary Bridge Domain Integrated Routing Bridging)

  • interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB

SR OS supports all three models for IPv4 and IPv6 prefixes. The three models have pros and cons, and different vendors have chosen different models depending on the use cases that they intend to address. When a third-party vendor is connected to an SR OS router, it is important to know which of the three models the third-party vendor implements. The following sections describe the models and the required configuration in each of them.

Interface-ful IP-VRF-to-IP-VRF with SBD IRB model

The SBD is equivalent to an R-VPLS that connects all the PEs that are attached to the same tenant VPRN. Interface-ful refers to the fact that there is a full IRB interface between the VPRN and the SBD (an interface object with MAC and IP addresses, over which interface parameters can be configured).

Interface-ful IP-VRF-to-IP-VRF with SBD IRB model illustrates this model.

Figure 71. Interface-ful IP-VRF-to-IP-VRF with SBD IRB model

Interface-ful IP-VRF-to-IP-VRF with SBD IRB model shows a 7750 SR and a third-party router using interface-ful IP-VRF-to-IP-VRF with SBD IRB model. The two routers are attached to a VPRN for the same tenant, and those VPRNs are connected by R-VPLS-2, or SBD. Both routers exchange IP prefix routes with a non-zero gateway IP (this is the IP address of the SBD IRB). The SBD IRB MAC and IP are advertised in a MAC/IP route. On reception, the IP prefix route creates a route-table entry in the VPRN, where the gateway IP must be recursively resolved to the information provided by the MAC/IP route and installed in the ARP and FDB tables.
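The recursive resolution described above can be illustrated with a short Python sketch. The route and table structures are hypothetical simplifications for illustration, not SR OS internals: an IP prefix route (RT-5) with a non-zero gateway IP is resolved through the MAC/IP route (RT-2) that advertised the SBD IRB MAC and IP.

```python
# Sketch of interface-ful SBD IRB recursive resolution (hypothetical
# structures): an RT-5 with a non-zero gateway IP is resolved through
# the RT-2 that carries the SBD IRB MAC/IP of the advertising PE.

def resolve_rt5(rt5, mac_ip_routes):
    """Return a route-table entry for the prefix, or None if the
    gateway IP cannot (yet) be resolved to a MAC/IP route."""
    if rt5["gw_ip"] == "0.0.0.0":
        return None  # this model requires a non-zero gateway IP
    for rt2 in mac_ip_routes:
        if rt2["ip"] == rt5["gw_ip"]:
            return {
                "prefix": rt5["prefix"],
                "next_hop": rt5["gw_ip"],
                # ARP entry: gateway IP -> SBD IRB MAC
                "arp": (rt5["gw_ip"], rt2["mac"]),
                # FDB entry: SBD IRB MAC -> VTEP/VNI of the remote PE
                "fdb": (rt2["mac"], rt2["vtep"], rt2["vni"]),
            }
    return None

rt5 = {"prefix": "10.0.0.0/24", "gw_ip": "192.168.0.2"}
rt2s = [{"ip": "192.168.0.2", "mac": "00:aa:bb:cc:dd:02",
         "vtep": "192.0.2.2", "vni": 2}]
entry = resolve_rt5(rt5, rt2s)
```

If the MAC/IP route for the gateway IP has not been received yet, the prefix cannot be installed, which is why both routes are exchanged in this model.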

This model is described in detail in EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes. As an example, and based on Interface-ful IP-VRF-to-IP-VRF with SBD IRB model above, the following CLI output shows the configuration of a 7750 SR SBD and VPRN using this interface-ful with SBD IRB model:

7750SR#config>service#

vpls 2 customer 1 name "sbd" create
    allow-ip-int-bind
    exit
    bgp
    exit
    bgp-evpn
        evi 2
        ip-route-advertisement
        mpls bgp 1
            auto-bind-tunnel resolution any
            no shutdown

vprn 1 customer 1 name "vprn1" create
    route-distinguisher auto-rd
    interface "sbd" create
        address 192.168.0.1/16
        ipv6
            address 30::3/64
        exit
        vpls "sbd"

The model is also supported for IPv6 prefixes. There are no configuration differences except the ability to configure an IPv6 address on the interface.

Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model

Interface-ful refers to the fact that there is a full IRB interface between the VPRN and the SBD. However, the SBD IRB is unnumbered in this model, which means no IP address is configured on it. In SR OS, an unnumbered SBD IRB is equivalent to an R-VPLS linked to a VPRN interface through an EVPN tunnel. See EVPN for VXLAN in EVPN tunnel R-VPLS services for more information.

Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model illustrates this model.

Figure 72. Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model

Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model shows a 7750 SR and a third-party router running the interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model. The IP prefix routes are now expected to have a zero gateway IP, with the MAC in the Router’s MAC extended community used for the recursive resolution to a MAC/IP route.

The corresponding configuration of the 7750 SR VPRN and SBD in the example could be:

7750SR#config>service#

vpls 2 customer 1 name "sbd" create
    allow-ip-int-bind
    exit
    bgp
    exit
    bgp-evpn
        evi 2
        ip-route-advertisement
        mpls bgp 1
            auto-bind-tunnel resolution any
            no shutdown

vprn 1 customer 1 create
    route-distinguisher auto-rd
    interface "sbd" create
        ipv6
        exit
        vpls "sbd"
            evpn-tunnel ipv6-gateway-address mac

The evpn-tunnel command controls the use of the Router’s MAC extended community and the zero gateway IP in the IPv4-prefix route. For IPv6, the ipv6-gateway-address mac option makes the router advertise the IPv6-prefix routes with a Router’s MAC extended community and a zero gateway IP.

Interoperable interface-less IP-VRF-to-IP-VRF model (Ethernet encapsulation)

This model is interface-less because no Supplementary Broadcast Domain (SBD) is required to connect the VPRNs of the tenant, and no recursive resolution is required upon receiving an IP prefix route. In other words, the next hop of the IP prefix route is directly resolved to an EVPN tunnel, without the need for any other route. The standard specification, draft-ietf-bess-evpn-prefix-advertisement, supports two variants of this model that are not interoperable with each other:

  • EVPN IFL for Ethernet NVO (Network Virtualization Overlay) tunnels

    Ethernet NVO indicates that the EVPN packets contain an inner Ethernet header. This is the case for tunnels such as VXLAN.

    In the Ethernet NVO option, the ingress PE uses the MAC address in the received Router’s MAC extended community (received along with the route type 5) as the inner destination MAC address for the EVPN packets sent to the prefix.

  • EVPN IFL for IP NVO tunnels

    IP NVO indicates that the EVPN packets contain an inner IP packet without an Ethernet header. This is similar to the IP-VPN packets exchanged between PEs.

Interface-less IP-VRF-to-IP-VRF model illustrates the Interface-less IP-VRF-to-IP-VRF model.

Figure 73. Interface-less IP-VRF-to-IP-VRF model

SR OS supports the interoperable interface-less IP-VRF-to-IP-VRF model for Ethernet NVO tunnels. In Interface-less IP-VRF-to-IP-VRF model, this interoperable model is shown on the left-side PE router. The model is implemented as follows:

  • There is no datapath difference between this model and the existing R-VPLS EVPN tunnel model or the model described in Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model.

  • This model is enabled by configuring config>service>vprn>if>vpls>evpn-tunnel (with ipv6-gateway-address mac for IPv6), and bgp-evpn>ip-route-advertisement. In addition, because the SBD IRB MAC/IP route is no longer needed, the bgp-evpn no mac-advertisement command prevents the advertisement of the MAC/IP route.

  • The IP prefix routes are processed as follows:

    • On transmission, there is no change in the IP prefix route processing compared to the configuration of the Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB Model.

      • IPv4/IPv6 prefix routes are advertised based on the information in the route-table for IPv4 and IPv6, with GW-IP=0 and the corresponding MAC extended community.

      • If bgp-evpn no mac-advertisement is configured, no MAC/IP route is sent for the R-VPLS.

    • The received IPv4/IPv6 prefix routes are processed as follows:

      1. Upon receiving an IPv4/IPv6 prefix route with a Router’s MAC extended community, an internal MAC/IP route is generated with the encoded MAC, and with the RD, Ethernet tag, ESI, label/VNI, and next hop derived from the IP prefix route itself.

      2. If no competing received MAC/IP routes exist for the same MAC, this IP prefix-derived MAC/IP route is selected and the MAC is installed in the R-VPLS FDB with type ‟Evpn”.

      3. After the MAC is installed in FDB, there are no differences between this interoperable interface-less model and the interface-ful with unnumbered SBD IRB model. Therefore, SR OS is compatible with the received IP prefix routes for both models.
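The derivation in step 1 above can be sketched as follows. This is a hypothetical Python model of the processing, with illustrative field names rather than actual SR OS data structures:

```python
# Sketch: deriving an internal MAC/IP route from a received RT-5 that
# carries a Router's MAC extended community (interoperable
# interface-less model). Field names are illustrative only.

def derive_mac_route(rt5):
    if rt5.get("router_mac") is None:
        return None  # no extended community; regular RT-5 processing
    return {
        "mac": rt5["router_mac"],  # MAC from the extended community
        "rd": rt5["rd"],           # inherited from the RT-5 itself
        "eth_tag": rt5["eth_tag"],
        "esi": rt5["esi"],
        "label": rt5["label"],     # label/VNI from the RT-5
        "next_hop": rt5["next_hop"],
        "fdb_type": "evpn",        # installed in the R-VPLS FDB
    }

rt5 = {"router_mac": "00:de:ad:be:ef:01", "rd": "192.0.2.2:2",
       "eth_tag": 0, "esi": "0", "label": 1024, "next_hop": "192.0.2.2"}
mac_route = derive_mac_route(rt5)
```

If no competing MAC/IP route exists for the same MAC, this derived route is selected and the MAC is installed in the FDB, after which forwarding is identical to the unnumbered SBD IRB model.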

The following is an example of a typical configuration of a PE's SBD and VPRN that work in interface-less model for IPv4 and IPv6:

7750SR#config>service#

vpls 2 customer 1 name "sbd" create
    allow-ip-int-bind
    exit
    bgp
    exit
    bgp-evpn
        evi 2
        no mac-advertisement
        ip-route-advertisement
        mpls bgp 1
            auto-bind-tunnel resolution any
            no shutdown

vprn 1 customer 1 create
    route-distinguisher auto-rd
    interface "sbd" create
        ipv6
        exit
        vpls "sbd"
            evpn-tunnel ipv6-gateway-address mac

Interface-less IP-VRF-to-IP-VRF model (IP encapsulation) for MPLS tunnels

In addition to the interface-ful and interoperable interface-less models described in the previous sections, SR OS also supports the interface-less model (EVPN IFL) with IP encapsulation for MPLS tunnels. In the standard specification, draft-ietf-bess-evpn-prefix-advertisement, this corresponds to the EVPN IFL model for IP NVO tunnels.

Compared to the Ethernet NVO option, the ingress PE no longer pushes an inner Ethernet header, but the IP packet is directly encapsulated with an EVPN service label and the transport labels.

Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels illustrates the Interface-less Model (EVPN IFL) with IP encapsulation for MPLS tunnels.

Figure 74. Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels

EVPN IFL uses EVPN IP Prefix routes to exchange prefixes between PEs without the need for an R-VPLS service (termed Supplementary Broadcast Domain, or SBD, in the standards) or any destination MAC lookup. The datapath used in EVPN IFL is the same as that used for IP-VPN services in the VPRN.

In the example of Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels:

  1. PE2 advertises IP prefix 20.0/24 (shorthand for 20.0.0.0/24) in an EVPN IP Prefix route that no longer contains a Router’s MAC extended community. As usual, and as depicted in step 1, frames arriving on PE1’s R-VPLS-1 with an IP destination of 20.0.0.1 are processed for a route lookup on VPRN-1.

  2. However, in step 2, and as opposed to the previous models, the lookup yields a route-table entry that does not point at an SBD R-VPLS, but rather at an MPLS tunnel terminated on PE2. PE1 then pushes the EVPN service label received in the IP Prefix route onto the IP packet, and the packet is sent on the wire without any inner Ethernet header.

  3. In step 3, the MPLS tunnel is terminated on PE2 and the EVPN label identifies the VPRN-1 service for a route lookup.

  4. Step 4 corresponds to the regular R-VPLS forwarding that happens in the other EVPN L3 models.
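The difference between the two interface-less encapsulations can be sketched as follows. This is a purely illustrative Python model (the label and header names are placeholders, not wire formats):

```python
# Sketch contrasting the two interface-less encapsulations: with EVPN
# IFL for IP NVO (MPLS), the IP packet carries the EVPN service label
# and transport labels only; with Ethernet NVO, an inner Ethernet
# header (destination MAC = received Router's MAC) is added first.

def encapsulate(ip_packet, model, service_label, router_mac=None):
    if model == "ip_nvo":
        # no inner Ethernet header: label stack directly over IP
        return ["transport_label", service_label, ip_packet]
    if model == "ethernet_nvo":
        inner = {"dst_mac": router_mac, "payload": ip_packet}
        return ["transport_label", service_label, inner]
    raise ValueError("unknown model")

ip_nvo_pkt = encapsulate("ip:20.0.0.1", "ip_nvo", 2048)
eth_nvo_pkt = encapsulate("ip:20.0.0.1", "ethernet_nvo", 2048,
                          router_mac="00:de:ad:be:ef:01")
```

The absence of the inner Ethernet header in the IP NVO case is what makes the EVPN IFL datapath identical to the IP-VPN datapath in the VPRN.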

The vprn>bgp-evpn>mpls context configures a VPRN service for EVPN IFL. This context is similar to the one in VPLS and Epipe services and enables the use of EVPN IFL in the VPRN service. When it is configured, no R-VPLS with evpn-tunnel should be added to the VPRN; that is, no SBD is configured. As an example, in Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels, the PE1 and PE2 VPRN-1 services are configured as follows:

[ex:configure service vprn "vprn-1"]
A:admin@PE1# info
    admin-state enable
    ecmp 2
    bgp-evpn {
        mpls 1 {
            admin-state enable
            route-distinguisher "192.0.2.1:12"
            vrf-target {
                community "target:64500:2"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    interface "irb-1" {
        ipv4 {
            primary {
                address 10.0.0.254
                prefix-length 24
            }
        }
        vpls "r-vpls-1" {
        }
    }
[ex:configure service vprn "vprn-1"]
A:admin@PE2# info
    admin-state enable
    ecmp 2
    bgp-evpn {
        mpls 1 {
            admin-state enable
            route-distinguisher "192.0.2.2:21"
            vrf-target {
                community "target:64500:2"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    interface "irb-2" {
        ipv4 {
            primary {
                address 20.0.0.254
                prefix-length 24
            }
        }
        vpls "r-vpls-1" {
        }
    }

ARP-ND host routes for extended Layer 2 Data Centers

SR OS supports the creation of host routes for IP addresses that are present in the ARP or neighbor tables of a routing context. These host routes are referred to as ARP-ND routes and can be advertised using EVPN or IP-VPN families. A typical use case where ARP-ND routes are needed is the extension of Layer 2 Data Centers (DCs). Extended Layer-2 Data Centers illustrates this use case.

Figure 75. Extended Layer-2 Data Centers

Subnet 10.0.0.0/16 in Extended Layer-2 Data Centers is extended throughout two DCs. The DC gateways are connected to the users of subnet 20.0.0.0/24 on PE1 using IP-VPN (or EVPN). If the virtual machine VM 10.0.0.1 is connected to DC1, when PE1 needs to send traffic to host 10.0.0.1, it performs a Longest Prefix Match (LPM) lookup on the VPRN’s route table. If the only IP prefix advertised by the four DC GWs was 10.0.0.0/16, PE1 could send the packets to the DC where the VM is not present.

To provide efficient downstream routing to the DC where the VM is located, DGW1 and DGW2 must generate host routes for the VMs to which they connect. When the VM moves to the other DC, DGW3 and DGW4 must be able to learn the VM’s host route and advertise it to PE1. DGW1 and DGW2 must withdraw the route for 10.0.0.1, because the VM is no longer in the local DC.
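The longest-prefix-match behavior that motivates the host routes can be sketched in a few lines of Python. The route table below is an illustrative stand-in for PE1's VPRN route table, not an SR OS structure:

```python
# Sketch: an LPM lookup prefers the /32 ARP-ND host route over the
# 10.0.0.0/16 subnet route advertised by all DC gateways, steering
# traffic to the DC where the VM actually resides.
import ipaddress

def lpm(route_table, dst):
    matches = [(p, nh) for p, nh in route_table
               if ipaddress.ip_address(dst) in ipaddress.ip_network(p)]
    # most specific (longest) matching prefix wins
    return max(matches,
               key=lambda m: ipaddress.ip_network(m[0]).prefixlen)[1]

routes = [("10.0.0.0/16", "any-DCGW"),  # subnet known via all DC GWs
          ("10.0.0.1/32", "DGW1")]      # host route from the local DC
next_hop = lpm(routes, "10.0.0.1")
```

Without the /32 route, the lookup for 10.0.0.1 could resolve to any of the four gateways, including one in the DC where the VM is not present.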

In this case, SR OS can learn the VM’s host route from the ARP or ND messages generated when the VM boots or moves.

A route owner type called ‟ARP-ND” is supported in the base or VPRN route table. The ARP-ND host routes have a preference of 1 in the route table and are automatically created out of the ARP or ND neighbor entries in the router instance.

The following commands enable ARP-ND host routes to be created in the applicable route tables:

  • configure service vprn/ies interface arp-host-route populate {evpn | dynamic | static}

  • configure service vprn/ies interface ipv6 nd-host-route populate {evpn | dynamic | static}

When the command is enabled, the EVPN, dynamic and static ARP entries of the routing context create ARP-ND host routes in the route table. Similarly, ARP-ND host routes are created in the IPv6 route table out of static, dynamic, and EVPN neighbor entries if the command is enabled.
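The filtering performed by the populate option can be sketched as follows. The entry and route structures are hypothetical, used only to illustrate which ARP/ND entry types produce ARP-ND host routes:

```python
# Sketch: ARP/ND entries of the configured types ("evpn", "dynamic",
# "static") produce host routes of owner type ARP-ND with route
# preference 1 in the route table.

def arp_nd_host_routes(arp_entries, populate_types):
    routes = []
    for ip, entry_type in arp_entries:
        if entry_type in populate_types:
            routes.append({"prefix": ip + "/32",
                           "owner": "ARP-ND",
                           "preference": 1})
    return routes

entries = [("10.0.0.1", "dynamic"), ("10.0.0.2", "evpn")]
# populate dynamic: only dynamically learned entries create host routes
dynamic_only = arp_nd_host_routes(entries, {"dynamic"})
```

For IPv6, the same logic would apply to neighbor entries, producing /128 routes instead of /32 routes.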

The arp and nd-host-route populate commands are used with the following features:

  • adding ARP-ND hosts

    A route tag can be added to ARP-ND hosts using the route-tag command. This tag can be matched on BGP VRF export and peer export policies.

  • keeping entries active

    The ARP-ND host routes are kept in the route table as long as the corresponding ARP or neighbor entry is active. The arp-proactive-refresh and nd-proactive-refresh commands configure the node to keep the entries active, even if there is no traffic destined for them, by sending an ARP refresh message 30 seconds before the arp-timeout expires or by starting NUD when the stale time expires.

  • speeding up learning

    To speed up the learning of the ARP-ND host routes, the arp-learn-unsolicited and nd-learn-unsolicited commands can be configured. When arp-learn-unsolicited is enabled, received unsolicited ARP messages (typically GARPs) create an ARP entry, and consequently, an ARP-ND route if arp-populate-host-route is enabled. Similarly, unsolicited Neighbor Advertisement messages create a stale neighbor. If nd-populate-host-route is enabled, a confirmation message (NUD) is sent for all the neighbor entries created as stale, and if confirmed, the corresponding ARP-ND routes are added to the route table.

Note: The ARP-ND host routes are created in the route table but not in the routing context FIB. This helps preserve the FIB scale in the router.

In Extended Layer-2 Data Centers, enabling arp-host-route-populate on the DCGWs allows them to learn or advertise the ARP-ND host route 10.0.0.1/32 when the VM is locally connected and to remove or withdraw the host routes when the VM is no longer present in the local DC.

ARP-ND host routes installed in the route table can be exported to VPN IPv4, VPN IPv6, or EVPN routes. No other BGP families or routing protocols are supported.

EVPN host mobility procedures within the same R-VPLS service

EVPN host mobility is supported in SR OS as in Section 4 of draft-ietf-bess-evpn-inter-subnet-forwarding. When a host moves from a source PE to a target PE, it can behave in one of the following ways:

  • The host initiates an ARP request or GARP upon moving to the target PE.

  • The host sends a data packet without first initiating an ARP request or GARP.

  • The host is silent.

SR OS supports these scenarios as described in the following sections.

EVPN host mobility configuration

Host mobility within the same R-VPLS – initial phase shows an example of a host connected to a source PE, PE1, that moved to the target, PE2. The figure shows the expected configuration on the VPRN interface, where R-VPLS 1 is attached (for both PE1 and PE2). PE1 and PE2 are configured with an ‟anycast gateway”, that is, a VRRP passive instance with the same backup MAC and IP in both PEs.

Figure 76. Host mobility within the same R-VPLS – initial phase

In this initial phase:

  1. PE1 learns Host-1 IP to MAC (10.1-M1) in the ARP table and generates a host route (RT5) for 10.1/32, because Host-1 is locally connected to PE1. In particular:

    • arp-learn-unsolicited triggers the learning of 10.1-M1 upon receiving a GARP from Host-1 or any other ARP

    • arp-proactive-refresh triggers the refresh of host-1’s ARP entry 30 seconds before the entry ages out

    • local-proxy-arp makes sure PE1 replies to any received ARP request on behalf of other hosts in the R-VPLS

    • arp-host-route populate dynamic ensures that only the dynamically learned ARP entries create a host route, for example, 10.1

    • no flood-garp-and-unknown-req suppresses ARP flooding (from the CPM) within the R-VPLS1 context and significantly reduces unnecessary ARP flooding, because the ARP entries are synchronized through EVPN

    • advertise dynamic triggers the advertisement of MAC/IP routes for the dynamic ARP entries, including the IP and MAC addresses, for example, 10.1-M1; this is in addition to the MAC/IP route for M1 only that has previously been advertised when M1 was learned in the FDB as local or dynamic

  2. PE2 learns Host-1 10.1-M1 in the ARP and FDB tables as EVPN type. PE2 must not learn 10.1-M1 as dynamic, so that PE2 is prevented from advertising an RT5 for 10.1/32. If PE2 advertises 10.1/32, then PE3 could select PE2 as the next-hop to reach Host-1, creating an unwanted hair-pinning forwarding behavior. PE2 is expected to have the same configuration as PE1, including the following commands, as well as those described for PE1:

    • no learn-dynamic prevents PE2 from learning ARP entries from ARP traffic received on an EVPN tunnel.

    • populate dynamic, as in PE1, makes sure PE2 only creates route-table ARP-ND host routes for dynamic entries. Hence, 10.1-M1 does not create a host route as long as it is learned via EVPN only.

The configuration described in this section and the cases in the following sections are for IPv4 hosts; however, the functionality is also supported for IPv6 hosts. The IPv6 configuration requires the equivalent commands, which use the prefix ‟nd-” instead of ‟arp-”. The only exception is the flood-garp-and-unknown-req command, which does not have an equivalent command for ND.

Host initiates an ARP/GARP upon moving to the target PE

An example is illustrated in Host mobility within the same R-VPLS – move with GARP. This is the expected behavior based on the configuration described in EVPN host mobility configuration.

  1. Host-1 moves from PE1 to PE2 and issues a GARP with 10.1-M1.

  2. Upon receiving the GARP, PE2 updates its FDB and ARP table.

  3. The route-table entry for 10.1/32 changes from type evpn to type arp-nd (based on populate dynamic); therefore, PE2 advertises an RT5 with 10.1/32. Also, M1 is now learned in the FDB and ARP table as local; therefore, MAC/IP routes with a higher sequence number are advertised (one MAC/IP route with M1 only and another one with 10.1-M1).

  4. Upon receiving the routes, PE1:

    1. Updates its FDB and withdraws its RT2(M1) based on the higher SEQ number.

    2. Updates its ARP entry 10.1-M1 from dynamic to type evpn.

    3. Removes its arp-nd host from the route-table and withdraws its RT5 for 10.1/32 (based on populate dynamic).

  5. The move of 10.1-M1 from dynamic to evpn triggers an ARP request from PE1 asking for 10.1. The no flood-garp-and-unknown-req command prevents PE1 from flooding the ARP request to PE2.

Figure 77. Host mobility within the same R-VPLS – move with GARP

After step 5, no host replies to PE1’s ARP request and the procedure ends. If a host replied to the ARP request for 10.1, the process would start again.
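The MAC mobility handling underlying these steps (per the RFC 7432 MAC Mobility procedures) can be sketched as follows. The FDB and route structures are hypothetical simplifications:

```python
# Sketch of MAC mobility sequence numbers: when a PE locally learns a
# MAC it previously knew from EVPN, it re-advertises the MAC with an
# incremented sequence number; receivers prefer the highest sequence.

def advertise_after_local_learn(mac, fdb):
    entry = fdb.get(mac)
    seq = entry["seq"] + 1 if entry and entry["type"] == "evpn" else 0
    fdb[mac] = {"type": "local", "seq": seq}
    return {"mac": mac, "seq": seq}  # RT-2 to advertise

def receive_rt2(rt2, fdb):
    entry = fdb.get(rt2["mac"])
    if entry is None or rt2["seq"] > entry["seq"]:
        fdb[rt2["mac"]] = {"type": "evpn", "seq": rt2["seq"]}
        return True   # FDB updated; own RT-2s for the MAC withdrawn
    return False

# PE2 knew M1 from EVPN (seq 0); Host-1 then appears locally on PE2
pe2_fdb = {"M1": {"type": "evpn", "seq": 0}}
rt2 = advertise_after_local_learn("M1", pe2_fdb)

# PE1 had M1 as local; the higher-sequence RT-2 from PE2 wins
pe1_fdb = {"M1": {"type": "local", "seq": 0}}
updated = receive_rt2(rt2, pe1_fdb)
```

The sequence-number comparison is what lets PE1 in the figure update its FDB and withdraw its own routes for M1 instead of fighting over the MAC.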

Host sends a data packet upon a move to target PE

In this case, the host does not send a GARP/ARP packet when moving to the target PE. Only regular data packets are sent. The steps are illustrated in Host mobility within the same R-VPLS – move with data packet.

  1. Host-1 moves from PE1 to PE2 and issues a (non-ARP) frame with MAC SA=M1.

  2. When receiving the frame, PE2 updates its FDB and starts the mobility procedures for M1 (because it was previously learned from EVPN). At the same time, PE2 also creates a short-lived dynamic ARP entry for the host, and triggers an ARP request for it.

  3. PE2 advertises an RT2 with M1 only and a higher sequence number.

  4. PE1 receives the RT2, updates its FDB and withdraws its RT2s for M1 (this includes the RT2 with M1-only and the RT2 with 10.1-M1).

  5. PE1 issues an ARP request for 10.1, triggered by the update on M1.

    In this case, the PEs are configured with flood-garp-and-unknown-req and therefore, the generated ARP request is flooded to local SAP and SDP-binds and EVPN destinations. When the ARP request gets to PE2, it is flooded to PE2’s SAP and SDP-binds and received by Host-1.

  6. Host-1 sends an ARP reply that is snooped by PE2 and triggers a process similar to the one described in Host initiates an ARP/GARP upon moving to the target PE (illustrated in the following steps).

    Because passive VRRP is used in this scenario, the ARP reply uses the anycast backup MAC that is consumed by PE2.

  7. Upon receiving the ARP reply, PE2 updates its ARP table to dynamic.

  8. Because the route-table entry for 10.1/32 now changes from type evpn to type arp-nd (based on populate dynamic), PE2 advertises an RT5 with 10.1/32. Also, M1 is now learned in the ARP table as local; therefore, an RT2 for 10.1-M1 is sent (its sequence number follows the RT2 with M1 only).

  9. Upon receiving the route, PE1:

    1. Updates the ARP entry 10.1-M1, from type local to type evpn.

    2. Removes its arp-nd host from the route-table and withdraws its RT5 for 10.1/32 (based on populate dynamic).

Figure 78. Host mobility within the same R-VPLS – move with data packet

Silent host upon a move to the target PE

This case assumes the host moves but it stays silent after the move. The steps are illustrated in Host mobility within the same R-VPLS – silent host.

  1. Host-1 moves from PE1 to PE2 but remains silent.

  2. Eventually M1 ages out in PE1’s FDB and the RT2s for M1 are withdrawn. This update on M1 triggers PE1 to issue an ARP request for 10.1.

    The flood-garp-and-unknown-req is configured. The ARP request makes it to PE2 and Host-1.

  3. Host-1 sends an ARP reply that is consumed by PE2. FDB and ARP tables are updated.

  4. The FDB and ARP updates trigger RT2s with M1-only and with 10.1-M1. Because an arp-nd dynamic host route is also created in the route-table, an RT5 with 10.1/32 is triggered.

  5. Upon receiving the routes, PE1 updates FDB and ARP tables. The update on the ARP table from dynamic to evpn removes the host route from the route-table and withdraws the RT5 route.

Figure 79. Host mobility within the same R-VPLS – silent host

BGP and EVPN route selection for EVPN routes

When two or more EVPN routes are received at a PE, BGP route selection typically takes place when the route keys of the routes are equal. When the route keys are different but the PE has to make a selection (for instance, when the same MAC is advertised in two routes with different RDs), BGP hands the routes over to EVPN, and the EVPN application performs the selection.

EVPN and BGP selection criteria are described below:

  • EVPN route selection for MAC routes

    When two or more routes are received with the same mac-length/mac but different route key, BGP hands the routes over to EVPN. EVPN selects the route based on the following tiebreaking order:

    1. Conditional static MACs (local protected MACs)

    2. Auto-learned protected MACs (locally learned MACs on SAPs or mesh or spoke SDPs because of the configuration of auto-learn-mac-protect)

    3. EVPN ES PBR MACs (see ES PBR MAC routes below)

    4. EVPN static MACs (remote protected MACs)

    5. Data plane learned MACs (regular learning on SAPs or SDP bindings) and EVPN MACs with higher SEQ numbers. Learned MACs and EVPN MACs are considered equal if they have the same SEQ number.

    6. EVPN MACs with higher SEQ number

    7. EVPN E-tree root MACs

    8. EVPN non-RT-5 MACs (this tie-breaking rule is only observed if the selection algorithm is comparing received MAC routes and internal MAC routes derived from the MACs in IP-Prefix routes, for example, RT-5 MACs)

    9. Lowest IP (next-hop IP of the EVPN NLRI)

    10. Lowest Ethernet tag (which is zero for MPLS and may be non-zero for VXLAN)

    11. Lowest RD

    12. Lowest BGP instance (this tie-breaking rule is only considered if the above rules fail to select a unique MAC and the service has two BGP instances of the same encapsulation)

  • ES PBR MAC routes

    When a PBR filter with a forward action to an ESI and SF-IP (Service Function IP) exists, a MAC route is created by the system. This MAC route is compared to other MAC routes received from BGP.

    • When ARP resolves (it can be static, EVPN, or dynamic) for an SF-IP and the system has an AD EVI route for the ESI, a ‟MAC route” is created by ES PBR with the <MAC Address = ARPed MAC Address, VTEP = AD EVI VTEP, VNI = AD EVI VNI, RD = ES PBR RD (special RD), Static = 1> and installed in EVPN.

    • This MAC route does not add anything (back) to ARP; however, it goes through the MAC route selection in EVPN and triggers the FDB addition if it is the best route.

    • In terms of priority, this route's priority is lower than local static but higher than remote EVPN static (number 3 in the tiebreaking order above).

    • If there are two competing ES PBR MAC routes, the selection goes through the rest of the checks (lowest IP, then lowest RD).

  • EVPN route selection for IP-prefix and IPv6-prefix routes

    See Route selection across EVPN-IFL and other owners in the VPRN service.

  • EVPN route selection for EVPN AD per-EVI routes

    See Route selection of AD per-EVI routes.

  • BGP route selection

    The BGP route selection for MAC routes with the same route-key follows the following priority order:

    1. EVPN static MACs (remote protected MACs).

    2. EVPN MACs with higher sequence number.

    3. Regular BGP selection (local-pref, aigp metric, shortest as-path, lowest IP).

    Regular BGP selection is followed for the rest of the EVPN routes.
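The EVPN tiebreaking order above can be sketched as a ranking function. This is an illustrative Python model, with hypothetical owner names covering the first tiebreak levels and the lowest-IP/Ethernet-tag/RD steps (the E-tree, RT-5, and BGP-instance rules are omitted for brevity):

```python
# Sketch of EVPN MAC route selection: candidates are ranked by
# (owner priority, -sequence, next-hop IP, Ethernet tag, RD), so the
# smallest key wins. Owner priorities follow the numbered list above.

OWNER_PRIO = {
    "conditional_static": 0,  # local protected MACs
    "auto_learn_protect": 1,  # auto-learned protected MACs
    "es_pbr": 2,              # EVPN ES PBR MACs
    "evpn_static": 3,         # remote protected MACs
    "learned_or_evpn": 4,     # data plane learned / regular EVPN MACs
}

def select_mac_route(candidates):
    def key(r):
        return (OWNER_PRIO[r["owner"]], -r["seq"],
                r["next_hop"], r["eth_tag"], r["rd"])
    return min(candidates, key=key)

routes = [
    {"owner": "learned_or_evpn", "seq": 2, "next_hop": "192.0.2.3",
     "eth_tag": 0, "rd": "192.0.2.3:1"},
    {"owner": "learned_or_evpn", "seq": 1, "next_hop": "192.0.2.2",
     "eth_tag": 0, "rd": "192.0.2.2:1"},
    {"owner": "evpn_static", "seq": 0, "next_hop": "192.0.2.4",
     "eth_tag": 0, "rd": "192.0.2.4:1"},
]
best = select_mac_route(routes)
```

A protected (static) MAC wins regardless of sequence numbers; among equal-priority EVPN MACs, the higher sequence number wins before the lowest-IP comparison is reached.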

Note: When BGP has to run an actual selection and a specific (otherwise valid) EVPN route loses to another EVPN route, the non-selected route is displayed by the show router bgp routes evpn x detail command with a tie-breaker reason.
Note: Protected MACs do not overwrite EVPN static MACs. In other words, if a MAC is in the FDB and protected because it was received with the sticky/static bit set in a BGP EVPN update, and a frame is received with that source MAC on an object configured with auto-learn-mac-protect, the frame is dropped because of the implicit restrict-protected-src discard-frame behavior. The reverse is not true: when a MAC is learned and protected using auto-learn-mac-protect, its information is not overwritten by the contents of a BGP update containing the same MAC address.

LSP tagging for BGP next-hops or prefixes and BGP-LU

It is possible to constrain the tunnels used by the system for resolution of BGP next-hops or prefixes and BGP labeled unicast routes using LSP administrative tags. For more information, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide, "LSP Tagging and Auto-Bind Using Tag Information".

Oper-groups interaction with EVPN services

Operational groups, also referred to as oper-groups, are supported in EVPN services. In addition to supporting SAP and SDP-binds, oper-groups can also be configured under the following objects:

  • EVPN-VXLAN instances (except on Epipe services)

  • EVPN-MPLS instances

  • Ethernet segments

These oper-groups can be monitored in LAGs or service objects. Oper-groups are particularly useful for the following applications:

  • Link Loss Forwarding (LLF) for EVPN VPWS services

  • core isolation blackhole avoidance

  • LAG standby signaling to CE on non-DF EVPN PEs (single-active)

LAG-based LLF for EVPN-VPWS services

SR OS uses Eth-CFM fault-propagation to support CE-to-CE fault propagation in EVPN-VPWS services. That is, upon detecting a CE failure, an EVPN-VPWS PE withdraws the corresponding Auto-Discovery per-EVI route, which then triggers a down MEP on the remote PE that signals the fault to the connected CE. In cases where the CE connected to EVPN-VPWS services does not support Eth-CFM, the fault can be propagated to the remote CE by using LAG standby-signaling, which can be LACP-based or simply power-off.

Link loss forwarding for EVPN-VPWS shows an example of link loss forwarding for EVPN-VPWS.

Figure 80. Link loss forwarding for EVPN-VPWS

In this example, PE1 is configured as follows:

A:PE1>config>lag(1)# info 
----------------------------------------------
mode access
encap-type null 
port 1/1/1
port 1/1/2
standby-signaling power-off
monitor-oper-group "llf-1"
no shutdown
----------------------------------------------
*A:PE1>config>service>epipe# info
----------------------------------------------
bgp
exit
bgp-evpn
    evi 1
    local-attachment-circuit ac-1  
        eth-tag 1
        exit
    remote-attachment-circuit ac-2 
        eth-tag 2
        exit
    mpls bgp 1
        oper-group "llf-1"
        auto-bind-tunnel
            resolution any
        exit
        no shutdown
    exit
sap lag-1 create
no shutdown
exit
no shutdown

The following applies to the PE1 configuration:

  • The EVPN Epipe service is configured on PE1 with a null LAG SAP and the oper-group ‟llf-1” under bgp-evpn>mpls. This is the only member of oper-group ‟llf-1”.

    Note: Do not configure the oper-group under config>service>epipe, because circular dependencies are created when the access SAP goes down as a result of the LAG monitor-oper-group command.
  • The operational group monitors the status of the BGP-EVPN instance in the Epipe service. The status of the BGP-EVPN instance is determined by the existence of an EVPN destination at the Epipe.

  • The LAG, in access mode and with encap-type null, is configured with the monitor-oper-group ‟llf-1” command.

    Note: The configure>lag>monitor-oper-group name command is only supported in access mode. Any encap-type can be used.

As shown in Link loss forwarding for EVPN-VPWS, upon failure on CE2, the following events occur:

  1. PE2 withdraws the EVPN route.

  2. The EVPN destination is removed in PE1 and oper-group ‟llf-1” also goes down.

  3. Because lag-1 is monitoring ‟llf-1”, the oper-group becoming inactive triggers standby signaling on the LAG; that is, power-off or LACP out-of-sync signaling to CE1.

    When the SAP or port is down because of the LAG monitoring of the oper-group, PE1 does not trigger an AD per-EVI route withdrawal, even if the SAP is brought operationally down.

  4. After CE2 recovers and PE2 re-advertises the AD per-EVI route, PE1 creates the EVPN destination and oper-group ‟llf-1” comes up. As a result, the monitoring LAG stops signaling standby and the LAG is brought up.

Core isolation blackhole avoidance

Core isolation blackhole avoidance shows how blackholes can be avoided when a PE becomes isolated from the core.

Figure 81. Core isolation blackhole avoidance

In this example, consider that PE2 and PE1 are single-active multihomed to CE1. If PE2 loses all its core links, PE2 must somehow notify CE1 so that PE2 does not continue attracting traffic and so that PE1 can take over. This notification is achieved by using oper-groups under the BGP-EVPN instance in the service. The following is an example output of the PE2 configuration.

*[ex:configure service vpls "evi1"]
A:admin@PE-2# info
    admin-state enable 
    bgp-evpn { 
        evi 1 
        mpls 1 { 
            admin-state enable 
            oper-group "evpn-mesh"
            auto-bind-tunnel { 
                resolution any 
            } 
        } 
    } 
    sap lag-1:351 {
        monitor-oper-group "evpn-mesh"
    }
*[ex:configure service oper-group "evpn-mesh"]
A:admin@PE-2# info detail 
    hold-time { 
        up 4 
    } 

With the PE2 configuration and Core isolation blackhole avoidance example, the following steps occur:

  1. PE2 loses all its core links and therefore removes its EVPN-MPLS destinations. This causes oper-group ‟evpn-mesh” to go down.

  2. Because PE2 is the DF in the Ethernet Segment (ES) ES-1 and sap lag-1:351 is monitoring the oper-group, the SAP becomes operationally down. If ETH-CFM fault propagation is enabled on a down MEP configured on the SAP, CE1 is notified of the failure.

  3. PE1 takes over as the DF based on the withdrawal of the ES (and AD) routes from PE2, and CE1 immediately begins sending traffic to PE1 only, thereby avoiding a traffic blackhole.

Generally, when oper-groups are associated with EVPN instances:

  • The oper-group state is determined by the existence of at least one EVPN destination in the EVPN instance.

  • The oper-group that is configured under a BGP EVPN instance cannot be configured under any other object (for example, SAP, SDP binding, and so on) of the same or different service.

  • The status of an oper-group associated with an EVPN instance does not go down if all the EVPN destinations are operationally down because of a control-word or MTU mismatch.

  • The status of an oper-group associated with an EVPN instance goes down in the following cases:

    • the service admin-state is disabled (only for VPLS services, not for Epipes)

    • the BGP EVPN VXLAN or MPLS admin-state is disabled

    • there are no EVPN destinations associated with the instance
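The rules above can be condensed into a short sketch with a hypothetical helper name (evpn_oper_group_is_up is illustrative, not an SR OS API); note that the group tracks destination existence, not destination operational state.

```python
def evpn_oper_group_is_up(service_admin_up: bool,
                          instance_admin_up: bool,
                          destinations: list,
                          service_is_vpls: bool = True) -> bool:
    """Oper-group status for a BGP-EVPN instance, per the rules above."""
    if service_is_vpls and not service_admin_up:  # service admin-state disabled (VPLS only)
        return False
    if not instance_admin_up:                     # bgp-evpn vxlan/mpls admin-state disabled
        return False
    return len(destinations) > 0                  # at least one EVPN destination exists

# Destinations that are oper-down due to a control-word or MTU mismatch
# still exist, so the oper-group stays up:
assert evpn_oper_group_is_up(True, True, [{"oper": "down", "flag": "mtu-mismatch"}])
assert not evpn_oper_group_is_up(True, True, [])
```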

LAG or port standby signaling to the CE on non-DF EVPN PEs (single-active)

As described in EVPN for MPLS tunnels, EVPN single-active multihoming PEs that are elected as non-DF must notify their attached CEs so that the CEs do not send traffic to the non-DF PE. This can be done on a per-service basis using ETH-CFM and fault propagation. However, ETH-CFM is sometimes not supported on multihomed CEs and other notification mechanisms are needed, such as LACP standby or power-off. This scenario is shown in the following figure.

Figure 82. LACP standby signaling from the non-DF

As shown in the preceding figure, the multihomed PEs are configured with multiple EVPN services that use ES-1. ES-1 and its associated LAG are configured as follows:

*[ex:configure lag 1]
A:admin@PE-2# info
    admin-state enable 
    standby-signaling {power-off|lacp} 
    monitor-oper-group "DF-signal-1"
    mode access 
    port 1/1/c2/1 { 
    }
<snip>
*[ex:configure service system bgp-evpn]
A:admin@PE-2# info 
    ethernet-segment "ES-1" { 
        admin-state enable 
        esi 0x01010000000000000000 
        multi-homing-mode single-active 
        oper-group "DF-signal-1"
        association { 
            lag 1 { 
            } 
<snip>

When the operational group is configured on the ES and monitored on the associated LAG:

  • The operational group status is driven by the ES DF status (defined by the number of DF SAPs or oper-up SAPs owned by the ES).

  • The operational group goes down if all the SAPs in the ES go down (this happens in PE2 in LACP standby signaling from the non-DF). The ES operational group goes up when at least one SAP in the ES goes up.

    As a result, if PE2 becomes non-DF on all the SAPs in the ES, they all go operationally down, including the ES-1 operational group.

  • Because LAG-1 is monitoring the operational group, when its status goes down, LAG-1 signals LAG standby state to the CE. The standby signaling can be configured as LACP or power-off.

  • The ES and AD routes for the ES are not withdrawn because the router recognizes that the LAG went standby because of the ES operational group.

If the single-active ES is associated with a port instead of a LAG, the config>port>monitor-oper-group DF-signal-1 command can be configured. In this case, the port monitors the ES operational group and the following rules apply:

  • As in the case of the LAG, if the ES goes non-DF, its operational group also goes down.
  • The port that is monitoring the ES operational group signals standby state by powering off the port itself.
  • As in the case of the LAG, the ES and AD routes for the ES are not withdrawn because the router recognizes that the port is in standby state because of the ES operational group.

Operational groups cannot be assigned to ESs that are configured as virtual, as all-active, or with service-carving mode auto.

AC-Influenced DF Election Capability on an ES with oper-group

The Attachment Circuit Influenced (AC-Influenced) Designated Forwarder Election Capability (AC-DF), as described in RFC 8584, is supported in SR OS. By default, the ac-df-capability command is set to the include option, which means that the EVPN Auto-discovery per EVI/ES (AD per EVI/ES) routes from a specific PE are considered when building the candidate DF list; the PE is included on the candidate DF list only if those routes are present.

Configuring ac-df-capability to exclude disables the AC-DF capability. When ac-df-capability exclude is configured on a specific ES, the presence or absence of the AD per EVI/ES routes from the ES peers does not modify the DF election candidate list for the ES. The exclude option is recommended in ESs that use an oper-group that is monitored by the access LAG to signal lacp standby or power-off, as described in LAG or port standby signaling to the CE on non-DF EVPN PEs (single-active). All PE routers attached to the same ES must be configured with the same ac-df-capability setting.

EVPN Layer 3 OISM

Optimized Inter-Subnet Multicast (OISM) is an EVPN-based solution that optimizes the forwarding of IP multicast across R-VPLS of the same or a different subnet. EVPN OISM is supported for EVPN-MPLS and EVPN-VXLAN services, IPv4 and IPv6 multicast groups, and is described in this section.

Introduction and terminology

EVPN OISM is similar to Multicast VPN (MVPN) in some aspects: it performs IP multicast routing in VPNs, uses MP-BGP to signal the interest of a PE in a specified multicast group, and uses Provider Multicast Service Interface (PMSI) trees among the PEs to send and receive the IP multicast traffic.

However, OISM is simpler than MVPN and allows efficient multicast in networks that integrate Layer 2 and Layer 3; that is, networks where PEs may be attached to different subnets, but could also be attached to the same subnet.

OISM is simpler than MVPN in some aspects:

  • it does not need to set up shared trees (which need to switch over to shortest path trees)

  • it does not require the complex MVPN Any Source Multicast (ASM) procedures or the Rendezvous Point (RP) function

  • it does not require Upstream Multicast Hop (UMH) selection, and therefore avoids the potential UMH issues and limitations described in RFC 6513 and RFC 6514

  • multiple PEs can be attached to the same Receiver subnet or Source subnet, which provides full flexibility when designing the multicast network

EVPN OISM is defined by draft-ietf-bess-evpn-irb-mcast and uses the following terminology that is also used in the rest of this section:

BD with IRB
Broadcast Domain with an Integrated Routing and Bridging interface. It is an R-VPLS service in SR OS.
Ordinary BD
refers to an R-VPLS where sources or receivers, or both, are connected
SBD
Supplementary Broadcast Domain. It is a backhaul R-VPLS that connects the PEs' VPRN services and is configured as an evpn-tunnel interface in the VPRN services. The SBD is mandatory in OISM and is needed to receive multicast traffic on the PEs that are not attached to the source ordinary BD.
EVPN Tenant Domain
refers to the group of BDs and IP-VRFs (VPRNs) of the same tenant
SMET route or EVPN route type 6
the EVPN route that the PEs use to signal interest for a specific multicast group (S,G) or (*,G)
IIF and OIF
refers to Incoming Interface and Outgoing Interface. A multicast-enabled VPRN has Layer 3 IIFs and OIFs. A multicast-enabled R-VPLS has Layer 2 OIFs.
Upstream and Downstream PEs
refers to the PEs that are connected to sources and receivers respectively
I-PMSIs and S-PMSIs
refers to Inclusive and Selective (Provider Multicast Service Interface) trees. The inclusive trees are signaled via IMET routes and include all the PEs attached to the service. The selective trees are signaled via S-PMSI A-D routes, and only the downstream PEs with receivers for the group signaled by the S-PMSI A-D route join the tree.
S-PMSI A-D route or EVPN route type 10
Selective Provider Multicast Service Interface (S-PMSI) Auto-Discovery route, the EVPN route that the root PEs use to signal S-PMSI trees when the root PE decides that setting up a specific tree for a specific (S,G) or (*,G) is needed.

OISM forwarding plane

In an EVPN OISM network, it is assumed that the sources and receivers are connected to ordinary BDs and EVPN is the only multicast control plane protocol used among the PEs. Also, the subnets (and optionally hosts) are advertised normally by the EVPN IP Prefix routes. The IP-Prefix routes are installed in the PEs' VPRN route tables and are used for multicast RPF checks when routing multicast packets. EVPN OISM forwarding plane illustrates a simple EVPN OISM network.

Figure 83. EVPN OISM forwarding plane

In EVPN OISM forwarding plane, and from the perspective of the multicast flow (S1,G1), PE1 is considered an upstream PE, whereas PE2 and PE3 are downstream PEs. The OISM forwarding rules are as follows.

  • On the upstream PE (PE1), the multicast traffic is sent to local receivers irrespective of the receivers being attached to the source BD (BD1) or not (BD2).

    Note: OISM does not use any multicast Designated Router (DR) concept, therefore the upstream PE always routes locally as long as it has local receivers.
  • On downstream PEs that are attached to the source BD (PE2), the multicast traffic is always received on the source BD (BD1) and forwarded locally to receivers in the same or different ordinary BD (as in the case of Receiver-22 or Receiver-21). Multicast traffic received on this PE is never sent back to the SBD or remote EVPN PEs.

  • On downstream PEs that are not attached to the source BD (PE3), the multicast traffic is always received on the SBD and sent to local receivers. Multicast received on this PE is never sent to remote EVPN PEs.

    Note: In order for PE3 to receive the multicast traffic on the SBD, the source PE, PE1, forms an EVPN destination from BD1 to PE3's SBD. This EVPN destination on PE1 is referred to as an SBD destination.
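The three forwarding rules above can be condensed into a small decision sketch (hypothetical names, not SR OS code; "local-ac" stands for traffic received on a local attachment circuit of the upstream PE):

```python
def oism_forward(traffic_arrived_on: str) -> dict:
    """Where a PE forwards routed IP multicast, per the OISM rules.

    traffic_arrived_on: 'local-ac' (upstream PE), 'source-bd' (downstream
    PE attached to the source BD), or 'sbd' (downstream PE not attached).
    """
    if traffic_arrived_on == "local-ac":
        # Upstream PE: forward to local receivers in any BD and to remote
        # PEs (via source-BD or SBD destinations).
        return {"local_receivers": True, "remote_pes": True}
    # Downstream PE: traffic arrives either on the source BD (if attached)
    # or on the SBD (if not); in both cases it is forwarded locally only,
    # never back toward remote EVPN PEs.
    return {"local_receivers": True, "remote_pes": False}

assert oism_forward("local-ac") == {"local_receivers": True, "remote_pes": True}
assert oism_forward("source-bd")["remote_pes"] is False
assert oism_forward("sbd")["remote_pes"] is False
```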

OISM control plane

OISM uses the Selective Multicast Ethernet Tag (SMET) route or route type 6 to signal interest on a specific (S,G) or (*,G). Use of the SMET route provides an example.

Figure 84. Use of the SMET route

As shown in Use of the SMET route, a PE with local receivers interested in a multicast group G1 issues an SMET route encoding the source and group information (upon receiving local IGMP join messages for that group). EVPN OISM uses the SMET route in the following way:

  • A route type 6 (SMET) can carry information for IPv4 or IPv6 multicast groups, for (S,G), (*,G), or even wildcard groups (*,*).

    Note: MVPN uses different route types or even families to address the different multicast group types.
  • The SMET routes are advertised with the route-target of the SBD, which guarantees that the SMET routes are imported by all the PEs of the tenant.

  • The SMET routes also help minimize the control plane overhead because they aggregate the multicast state created on the downstream PEs. This is illustrated in Use of the SMET route, where PE2 sends the minimum number of SMET routes to pull multicast traffic for G1. That is, if PE2 has state for (S1,G1) and (*,G1), the SMET route for (*,G1) is enough to attract the multicast traffic required by the local receivers. There is no need to send one SMET route for (S1,G1) and a different route for (*,G1); only the (*,G1) SMET route is advertised.

  • The SMET routes also provide an implicit S-PMSI (Selective Provider Multicast Service Interface) tree when Ingress Replication is used to transport IP multicast. That is, PE1 sends the multicast traffic only to the PEs requesting it (for example, PE2) and not to PE3. In MVPN, even for Ingress Replication, a separate S-PMSI tree is set up to prevent PE1 from sending multicast to PE3.
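The SMET aggregation behavior can be sketched as follows, assuming a hypothetical helper where '*' denotes the wildcard source: a (*,G) state subsumes the (S,G) states for the same group, so a single SMET route per group suffices.

```python
def smet_routes_to_advertise(states):
    """states: set of (source, group) tuples, where '*' is the wildcard
    source. Returns the minimal set of SMET routes to advertise."""
    groups_with_wildcard = {g for s, g in states if s == "*"}
    # A (S,G) state is subsumed by a (*,G) state for the same group.
    return {(s, g) for s, g in states
            if s == "*" or g not in groups_with_wildcard}

# PE2 has state for (S1,G1) and (*,G1): one SMET route, (*,G1), suffices.
assert smet_routes_to_advertise({("S1", "G1"), ("*", "G1")}) == {("*", "G1")}
# Without a wildcard state, the specific (S,G) route is advertised.
assert smet_routes_to_advertise({("S1", "G1")}) == {("S1", "G1")}
```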

EVPN OISM and multihoming

EVPN OISM supports multihomed multicast sources and receivers.

While MVPN requires complex UMH (Upstream Multicast Hop) selection procedures to provide multihoming for sources, EVPN simply reuses the existing EVPN multihoming procedures. EVPN OISM and multihomed sources illustrates an example of a multihomed source that makes use of EVPN all-active multihoming.

Figure 85. EVPN OISM and multihomed sources

The source S1 is attached to a switch, SW1, that is connected via a single LAG to PE1 and PE3, a pair of EVPN OISM PEs. PE1 and PE3 define Ethernet Segment ES-1 for SW1, where ES-1 is all-active in this case (single-active multihoming is also supported). Even in the all-active case, the multicast flow for (S1,G1) is sent to only one OISM PE, and the regular all-active multihoming procedures (split-horizon) ensure that PE3 does not send the multicast traffic back to SW1. This is true for both EVPN-MPLS and EVPN-VXLAN BDs.

Convergence in case of failure is fast because the downstream PEs (for example, PE2) advertise the SMET route for (*,G1) with the SBD route target, and the route is imported by both PE1 and PE3. In case of a failure on PE1, PE3 already has state for (*,G1) and can forward the multicast traffic immediately.

EVPN OISM also supports multihomed receivers. EVPN OISM and multihomed receivers illustrates an example of multihomed receivers.

Figure 86. EVPN OISM and multihomed receivers

Multihomed receivers, as depicted in EVPN OISM and multihomed receivers, require the support of multicast state synchronization on the multihoming PEs to avoid blackholes. As an example, consider that SW1 hashes an IGMP join (*,G1) to PE2, and PE2 adds the ES-1 SAP to the OIF list for (*,G1). Consider that PE1 is the ES-1 DF. Unless the (*,G1) state is synchronized on PE1, the multicast traffic is pulled to PE2 only and then discarded. The state synchronization pulls the multicast traffic to PE1 too, and PE1 forwards it to the receiver using its DF SAP.

In SR OS, the IGMP/MLD-snooping state is synchronized across ES peers using EVPN Multicast Synch routes, as specified in RFC 9251.

The same mechanism must be used on all the PEs attached to the same Ethernet Segment. When both mechanisms are used simultaneously, Multichassis Synchronization (MCS) takes precedence.

Note: The use of Multichassis Synchronization (MCS) protocol is not supported in VPLS services in OISM mode or evpn-proxy mode.

EVPN Multicast Synch routes are supported as specified in RFC 9251 for OISM services too. They use EVPN route types 7 and 8, and are known as the Multicast Join Synch and Multicast Leave Synch routes, respectively.

When a PE that is attached to an EVPN Ethernet Segment receives an IGMP or MLD join, it creates multicast state and advertises a Multicast Join Synch route so that the peer ES PEs can synchronize the state. Similarly, when a PE in the Ethernet Segment receives a leave message, it advertises a Multicast Leave Synch route so that all the PEs in the Ethernet Segment can synchronize the Last Member Query procedures.

The Multicast Join Synch route or EVPN route type 7 is similar to the SMET route, but also includes the ESI. The Multicast Join Synch route indicates the multicast group that must be synchronized in all objects of the Ethernet Segment. Multicast join synch route depicts the format of the Multicast Join Synch route.

Figure 87. Multicast join synch route

In accordance with RFC 9251, the following rules pertain:

  • All fields except for the Flags are part of the route key for BGP processing purposes.

  • Synch routes are resolved by BGP auto-bind resolution, as any other service route.

  • The Flags are advertised and processed based on the received IGMP or MLD report that triggered the advertisement of the route (this includes the versions for IGMP or MLD and Include/Exclude bit for IGMPv3).

  • The Route Distinguisher (RD) is the service RD.

  • This route is distributed only to the ES peers; it is advertised with the ES-import route target, which limits its distribution to the ES peers.

  • In addition, the route is sent with one EVI-RT extended community (EC). The EVI-RT EC does not use a route target type/sub-type; therefore, it does not affect the distribution of the route (for example, it is not considered for route target constraint filtering; only the ES-import route target is). However, its value is still taken from the configured service route target or the EVI auto-derived route target.
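The route-key rule can be illustrated with a short sketch using hypothetical field names: two copies of a Multicast Join Synch route that differ only in the Flags compare as the same BGP route.

```python
def join_synch_route_key(route: dict) -> tuple:
    """All fields except Flags are part of the BGP route key."""
    return tuple(v for k, v in sorted(route.items()) if k != "flags")

# Example field values (illustrative only):
r1 = {"rd": "192.0.2.2:2002", "esi": "01:00:00:00:00:00:01:00:00:00",
      "tag": 0, "source": "0.0.0.0", "group": "239.0.0.4",
      "originator": "192.0.2.2", "flags": "v2"}
r2 = dict(r1, flags="v3")          # same route, different Flags
r3 = dict(r1, group="239.0.0.5")   # different group -> different route

assert join_synch_route_key(r1) == join_synch_route_key(r2)
assert join_synch_route_key(r1) != join_synch_route_key(r3)
```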

The Multicast Leave Synch route or EVPN route type 8 indicates the multicast group Leave states that must be synchronized in all objects of the Ethernet Segment. Multicast leave synch route depicts the format of the Multicast Leave Synch route.

Figure 88. Multicast leave synch route

In accordance with RFC 9251, the following rules pertain:

  • All fields except for the Flags, the Maximum Response Time, and the ‟reserved” field are part of the route key for BGP processing purposes.

  • Synch routes are resolved by BGP auto-bind resolution, as any other service route.

  • The Flags are generated based on the version of the leave message that triggered the advertisement of the route.

  • As with the Multicast Join Synch route, this is a service-level route sent with one ES-import route target and one EVI-RT EC. The RD, Flags, ES-import route target, and EVI-RT EC are advertised and processed in the same way as for the Multicast Join Synch route.

The EVI-RT EC is automatically added to route types 7 and 8, depending on the type of route target configured on the service.

  • If the service is configured with target:2byte-asnumber:ext-comm-val as the route target, an EVI-RT type 0 is automatically added to route types 7 and 8. No route target (other than the ES-import route target) is added to the route.

  • If the service is configured with target:ip-addr:comm-val as the route target, an EVI-RT type 1 is automatically added to route types 7 and 8. No route target (other than the ES-import route target) is added to the route.

  • If the service is configured with target:4byte-asnumber:comm-val as the route target, an EVI-RT type 2 is automatically added to route types 7 and 8. No route target (other than the ES-import route target) is added to the route.

  • If auto-derived service RTs are used in the service, the corresponding operating route target is used as the EVI-RT.

  • EVI-RT type 3 is not supported (type 3 is specified in RFC 9251).

  • In general, vsi-import and vsi-export must not be used in OISM mode services or when the Multicast Synch routes are used. Using vsi-import or vsi-export policies instead of the route target command or the EVI-derived route target leads to issues when advertising and processing the Multicast Synch routes.
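The route-target-to-EVI-RT mapping above can be sketched as follows; the helper is hypothetical and the parsing is a simplification (it distinguishes 2-byte from 4-byte AS numbers purely by numeric value, ignoring CLI notation details):

```python
def evi_rt_type(route_target: str) -> int:
    """Illustrative mapping from a configured route target to the
    EVI-RT extended-community type added to route types 7 and 8."""
    admin = route_target.split(":")[1]   # administrator field
    if "." in admin:
        return 1          # target:ip-addr:comm-val        -> EVI-RT type 1
    if int(admin) <= 0xFFFF:
        return 0          # target:2byte-asnumber:ext-comm -> EVI-RT type 0
    return 2              # target:4byte-asnumber:comm-val -> EVI-RT type 2

assert evi_rt_type("target:64500:1") == 0
assert evi_rt_type("target:192.0.2.1:1") == 1
assert evi_rt_type("target:100000:1") == 2
```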

The following are additional considerations about the Multicast Synch routes:

  • The routes are advertised without the need to configure any command, as long as igmp-snooping or mld-snooping is enabled on an R-VPLS in OISM mode attached to a regular or virtual Ethernet Segment.

  • The reception of Multicast Join or Leave Synch routes triggers the synchronization of states and the associated procedures in RFC 9251.

  • Upon receiving a Leave message, the triggered Multicast Synch route encodes the configured Last Member Query interval times the robust-count (LMQ ✕ robust-count) in the Maximum Response Time field. The local PE expires the multicast state after the usual time plus an additional time that accounts for the BGP propagation to the remote ES peers, which can be configured with the following command.
    configure service system bgp-evpn multicast-leave-sync-propagation
    This timer value should be configured identically on all the PEs attached to the same ES.
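As a worked example with assumed values (the LMQ interval and robust-count come from the IGMP configuration, and the propagation time from multicast-leave-sync-propagation; the numbers here are illustrative only):

```python
# Example IGMP timer values (illustrative, not defaults):
lmq_interval_s = 1            # Last Member Query interval
robust_count = 2              # IGMP robustness variable
leave_sync_propagation_s = 5  # multicast-leave-sync-propagation

# Maximum Response Time encoded in the Multicast Leave Synch route:
max_response_time_s = lmq_interval_s * robust_count

# Time after which the local PE expires the multicast state
# (usual time plus the BGP propagation allowance):
local_expiry_s = max_response_time_s + leave_sync_propagation_s

assert max_response_time_s == 2
assert local_expiry_s == 7
```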

EVPN OISM configuration guidelines

This section shows a configuration example for the network illustrated in EVPN OISM example.

Figure 89. EVPN OISM example

The following CLI excerpt shows the configuration required on PE4 for services 2000 (VPRN), BD-2003 and BD-2004 (ordinary BDs) and BD-2002 (SBD).

vprn 2000 name "tenant-2k" customer 1 create
    route-distinguisher auto-rd
    interface "bd-2003" create
        address 10.41.0.1/24
        vpls "bd-2003"
        exit                  
    exit
    interface "bd-2004" create
        address 10.42.0.1/24
        vpls "bd-2004"
        exit
    exit
    interface "bd-2002" create
        vpls "bd-2002"
            evpn-tunnel supplementary-broadcast-domain  <------
        exit
    exit
    igmp                      <------
        interface "bd-2003"   <------
            no shutdown
        exit
        interface "bd-2004"   <------
            no shutdown
        exit
        no shutdown
    exit
    pim                       <------
        rpf-table both           <------
        interface "bd-2002"      <------
            multicast-senders always     <------
        exit
        apply-to all             <------
        no shutdown
    exit
    no shutdown
exit

As shown in the previous configuration commands, the VPRN must be configured as follows:

  • The SBD interface in the VPRN must be configured using the following command so that the OISM forwarding mode is enabled.
    configure service vprn interface vpls evpn-tunnel supplementary-broadcast-domain
  • IGMP must be enabled on the ordinary BD (R-VPLS) interfaces so that the PEs can process the received IGMP messages from the receivers.

  • Even though the protocol itself is not used, PIM is enabled in the VPRN on all the IRB interfaces so that the multicast source addresses can be resolved. Also, the following command must be enabled on the SBD interface.
    configure service vprn pim interface multicast-senders always
    This command is needed because the SBD interface is unnumbered (it has no associated IP address) and the multicast source RPF check would otherwise discard the multicast traffic arriving at the SBD interface; multicast-senders always informs the system that legitimate multicast traffic may be expected on the unnumbered SBD interface. The following command is needed in case sources are added to the VPRN route table as ARP-ND host routes (which is typical in Data Centers).
    • MD-CLI
      configure service vprn pim ipv4 rpf-table both
      configure service vprn pim ipv6 rpf-table both
    • classic CLI
      configure service vprn pim rpf-table both

Besides the VPRN, BD-2003, BD-2004 and BD-2002 (SBD) must be configured as follows.

vpls 2003 name "bd-2003" customer 1 create
    allow-ip-int-bind
        forward-ipv4-multicast-to-ip-int    <------
    exit
    bgp
    exit
    bgp-evpn
        evi 2003
        mpls bgp 1
            ingress-replication-bum-label
            auto-bind-tunnel
                resolution any
            exit
            no shutdown
        exit
    exit
    igmp-snooping                            <------
        no shutdown                          <------
    exit
    sap 1/1/1:2003 create
        igmp-snooping
            mrouter-port
        exit                  
        no shutdown
    exit
    no shutdown
exit
vpls 2004 name "bd-2004" customer 1 create
    allow-ip-int-bind
        forward-ipv4-multicast-to-ip-int     <------
    exit
    bgp
    exit
    bgp-evpn
        evi 2004
        mpls bgp 1
            ingress-replication-bum-label
            auto-bind-tunnel
                resolution any
            exit
            no shutdown
        exit
    exit
    igmp-snooping                            <------
        no shutdown                          <------
    exit
    sap 1/1/1:2004 create
        igmp-snooping
            fast-leave
        exit
        no shutdown
    exit
    no shutdown
exit
vpls 2002 name "bd-2002" customer 1 create
    allow-ip-int-bind
        forward-ipv4-multicast-to-ip-int     <------
    exit
    bgp
    exit
    bgp-evpn
        no mac-advertisement
        ip-route-advertisement
        sel-mcast-advertisement              <------
        evi 2002
        mpls bgp 1
            auto-bind-tunnel
                resolution any
            exit
            no shutdown
        exit
    exit
    igmp-snooping                            <------
        no shutdown                          <------
    exit
    no shutdown
exit
As shown in the previous configuration commands, the following command must be configured in the ordinary and SBD R-VPLS services so that IGMP messages or SMET routes are processed by the IGMP module.
  • MD-CLI
    configure service vpls igmp-snooping admin-state enable
  • classic CLI
    configure service vpls igmp-snooping no shutdown
Also, the following command allows the system to forward multicast traffic from the R-VPLS to the VPRN interface.
  • MD-CLI
    configure service vpls routed-vpls multicast ipv4 forward-to-ip-interface 
  • classic CLI
    configure service vpls allow-ip-int-bind forward-ipv4-multicast-to-ip-int
Finally, the following command must be enabled on the SBD R-VPLS, so that the SBD aggregates the multicast state for all the ordinary BDs and advertises the corresponding SMET routes with the SBD route-target.
  • MD-CLI
    configure service vpls bgp-evpn routes sel-mcast advertise true
  • classic CLI
    configure service vpls bgp-evpn sel-mcast-advertisement
In OISM mode, the SMET route is only advertised from the SBD R-VPLS, not from the ordinary BD R-VPLS services.

PE2 and PE3 are configured with the VPRN (2000), the ordinary BD (BD-2001), and the SBD (BD-2002) as above. In addition, PE2 and PE3 are attached to ES-1, where a receiver is connected. Multicast state synchronization through BGP Multicast Synch routes is automatically enabled in R-VPLS services in OISM mode, and no additional configuration is needed:

/* Example of ES-1 configuration on PE3. A similar configuration is needed on PE2. */
bgp-evpn
    ethernet-segment "ES-1" virtual create
        esi 01:00:00:00:00:00:01:00:00:00
        service-carving
            mode manual
            manual
                preference non-revertive create
                    value 30
                exit
            exit
        exit
        multi-homing single-active
        lag 1
        dot1q
            q-tag-range 2001
        exit
        no shutdown
    exit

When the previous configuration is executed on the three nodes, the EVPN routes are exchanged. BD-2003 in PE4 receives IMET routes from the remote SBD PEs and creates "SBD" destinations to PE2 and PE3. Those SBD destinations are used to forward multicast traffic to PE2 and PE3, following the OISM forwarding procedures described in OISM forwarding plane. The following commands show an example of an IMET route (flagged as an SBD route operating in OISM mode) and an SMET route received on PE4 from PE2.

IMET route received from PE2 on PE4.

show router bgp routes evpn incl-mcast community target:64500:2002 hunt

<snip>
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Nexthop        : 192.0.2.2
From           : 192.0.2.2
Res. Nexthop   : 192.168.24.1
Local Pref.    : 100                    Interface Name : int-PE-4-PE-2
<snip>
Community      : target:64500:2002
                 mcast-flags:SBD/NO-MEG/NO-PEG/OISM/NO-MLD-Proxy/NO-IGMP-Proxy <---
                 bgp-tunnel-encap:MPLS
<snip>
EVPN type      : INCL-MCAST             
Tag            : 0                      
Originator IP  : 192.0.2.2                   <------
Route Dist.    : 192.0.2.2:2002
<snip>
-------------------------------------------------------------------------------
PMSI Tunnel Attributes : 
Tunnel-type    : Ingress Replication    
Flags          : Type: RNVE(0) BM: 0 U: 0 Leaf: not required
MPLS Label     : LABEL 524241           
Tunnel-Endpoint: 192.0.2.2
-------------------------------------------------------------------------------
 

SMET route from PE2 received on PE4.

show router bgp routes evpn smet community target:64500:2002 hunt
<snip>
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Nexthop        : 192.0.2.2
From           : 192.0.2.2
Res. Nexthop   : 192.168.24.1
Local Pref.    : 100                    Interface Name : int-PE-4-PE-2
<snip>
Community      : target:64500:2002 bgp-tunnel-encap:MPLS
<snip>
EVPN type      : SMET                   
Tag            : 0                      
Src IP         : 0.0.0.0                 <------
Grp IP         : 239.0.0.4               <------
Originator IP  : 192.0.2.2               <------
Route Dist.    : 192.0.2.2:2002         
<snip>

When PE4 receives the IMET routes from the PE2 and PE3 SBDs, it identifies the routes as SBD routes in OISM mode and creates special EVPN destinations on the BD-2003 service that are used to forward the multicast traffic. The SBD destinations are shown as Sup BCast Domain in the show command output.

show service id 2003 evpn-mpls

===============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address             Transport:Tnl   Egr Label     Oper     Mcast      Num
                                                      State               MACs
-------------------------------------------------------------------------------
192.0.2.2               ldp:65551       524266        Up       m          0
192.0.2.3               ldp:65537       524266        Up       m          0
-------------------------------------------------------------------------------
Number of entries : 2
===============================================================================

*A:PE-4#  

show service id 2003 evpn-mpls detail
===============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address             Transport:Tnl   Egr Label     Oper     Mcast     Num
                                                      State              MACs
-------------------------------------------------------------------------------
192.0.2.2               ldp:65551       524266        Up       m         0
  Oper Flags       : None
  Sup BCast Domain : Yes
  Last Update      : 02/07/2023 14:59:03
192.0.2.3               ldp:65537       524266        Up       m         0
  Oper Flags       : None
  Sup BCast Domain : Yes
  Last Update      : 02/07/2023 13:21:09
-------------------------------------------------------------------------------
Number of entries : 2
===============================================================================

Based on the reception of the SMET routes from PE2 and PE3, PE4 adds the SBD EVPN destinations to its MFIB on BD-2003.

show service id 2003 igmp-snooping base 
===============================================================================
IGMP Snooping Base info for service 2003
===============================================================================
Admin State : Up
Querier     : 10.41.0.1 on rvpls bd-2003
SBD service : 2002
-------------------------------------------------------------------------------
Port                      Oper MRtr Pim  Send Max   Max  Max   MVR       Num
Id                        Stat Port Port Qrys Grps  Srcs Grp   From-VPLS Grps
                                                         Srcs            
-------------------------------------------------------------------------------
sap:1/1/1:2003            Up   Yes  No   No   None  None None  Local     0
rvpls                     Up   Yes  No   N/A  N/A   N/A  N/A   N/A       N/A
sbd-mpls:192.0.2.2:524241 Up   No   No   N/A  N/A   N/A  N/A   N/A       1 <------
sbd-mpls:192.0.2.3:524253 Up   No   No   N/A  N/A   N/A  N/A   N/A       1 <------
===============================================================================
*A:PE-4#  

show service id 2003 igmp-snooping statistics
===============================================================================
IGMP Snooping Statistics for service 2003
===============================================================================
Message Type            Received      Transmitted   Forwarded
-------------------------------------------------------------------------------
<snip>
EVPN SMET Routes        2             0             N/A    <------
-------------------------------------------------------------------------------
*A:PE-4# show service id 2003 mfib                                               
<snip>
-------------------------------------------------------------------------------
*               *                     sap:1/1/1:2003               Local    Fwd
*               239.0.0.4             sap:1/1/1:2003               Local    Fwd
                                      sbd-eMpls:192.0.2.2:524241   Local    Fwd
                                      sbd-eMpls:192.0.2.3:524253   Local    Fwd

PE2 and PE3 also create regular destinations and SBD destinations based on the reception of IMET routes. As an example, the following command shows the destinations created by PE3 in the ordinary BD-2001.

show service id 2001 evpn-mpls
==============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address               Transport:Tnl     Egr Label   Oper   Mcast   Num
                                                        State          MACs
-------------------------------------------------------------------------------
192.0.2.2                 ldp:65551         524266      Up     m       0
192.0.2.2                 ldp:65551         524267      Up     bum     0
192.0.2.2                 ldp:65551         524268      Up     none    1
192.0.2.4                 ldp:65539         524269      Up     m       0
-------------------------------------------------------------------------------
Number of entries : 4
===============================================================================
show service id 2001 evpn-mpls detail 
===============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address               Transport:Tnl     Egr Label   Oper   Mcast   Num
                                                        State          MACs
-------------------------------------------------------------------------------
192.0.2.2                 ldp:65551         524266       Up     m       0
  Oper Flags       : None
  Sup BCast Domain : Yes
  Last Update      : 02/07/2023 14:59:04
192.0.2.2                 ldp:65551         524267       Up     bum     0
  Oper Flags       : None
  Sup BCast Domain : No
  Last Update      : 02/07/2023 14:59:04
192.0.2.2                 ldp:65551         524268       Up     none     1
  Oper Flags       : None
  Sup BCast Domain : No
  Last Update      : 02/07/2023 14:59:04
192.0.2.4                 ldp:65539         524269       Up     m        0
  Oper Flags       : None
  Sup BCast Domain : Yes
  Last Update      : 02/07/2023 13:21:10
-------------------------------------------------------------------------------
Number of entries : 4
===============================================================================

When both an SBD destination and a non-SBD destination exist to the same PE (PE2), IGMP uses only the non-SBD destination in the MFIB; the non-SBD destination always has priority over the SBD destination. This can be seen in the following command output on PE3, where the SBD destination to PE2 stays down as long as the non-SBD destination is up.

show service id 2001 igmp-snooping base
==============================================================================
IGMP Snooping Base info for service 2001
===============================================================================
Admin State : Up
Querier     : 10.0.0.3 on rvpls bd-2001
SBD service : 2002
-------------------------------------------------------------------------------
Port                      Oper MRtr Pim  Send Max   Max  Max   MVR       Num
Id                        Stat Port Port Qrys Grps  Srcs Grp   From-VPLS Grps
                                                         Srcs            
-------------------------------------------------------------------------------
sap:lag-1:2001            Down No   No   No   None  None None  Local     1
rvpls                     Up   Yes  No   N/A  N/A   N/A  N/A   N/A       N/A
sbd-mpls:192.0.2.2:524241 Down No   No   N/A  N/A   N/A  N/A   N/A       0 <------
mpls:192.0.2.2:524242     Up   No   No   N/A  N/A   N/A  N/A   N/A       1 <------
sbd-mpls:192.0.2.4:524245 Up   No   No   N/A  N/A   N/A  N/A   N/A       0 
===============================================================================  
show service id 2001 mfib
==============================================================================
Multicast FIB, Service 2001
===============================================================================
Source Address  Group Address         Port Id                      Svc Id   Fwd
                                                                            Blk
-------------------------------------------------------------------------------
*               239.0.0.4             sap:lag-1:2001               Local    Fwd
                                      eMpls:192.0.2.2:524242       Local    Fwd <---
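
The SBD-versus-non-SBD priority shown in these outputs can be sketched as follows. This is an illustrative model only, not SR OS code; the tuple data structure is an assumption for the example.

```python
# Illustrative sketch (not SR OS code) of the destination-priority rule:
# per remote TEP, IGMP snooping uses the non-SBD (ordinary BD) EVPN
# destination when one exists; the SBD destination to that TEP stays down.

def select_mfib_destinations(destinations):
    """destinations: list of (tep, label, is_sbd) tuples.
    Returns the destinations IGMP installs in the MFIB."""
    non_sbd_teps = {tep for tep, _, is_sbd in destinations if not is_sbd}
    selected = []
    for tep, label, is_sbd in destinations:
        if is_sbd and tep in non_sbd_teps:
            continue  # shadowed by a non-SBD destination to the same TEP
        selected.append((tep, label, is_sbd))
    return selected

dests = [
    ("192.0.2.2", 524241, True),   # sbd-mpls to PE2: suppressed
    ("192.0.2.2", 524242, False),  # mpls to PE2: used
    ("192.0.2.4", 524245, True),   # sbd-mpls to PE4: used (no non-SBD dest)
]
print(select_mfib_destinations(dests))
# -> [('192.0.2.2', 524242, False), ('192.0.2.4', 524245, True)]
```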

Finally, to check the Layer 3 IIF and OIF entries on the VPRN services, use the following command. As an example, the command is executed on PE2:

show router 2000 pim group detail
==============================================================================
PIM Source Group ipv4
===============================================================================
Group Address      : 239.0.0.4
Source Address     : *
<snip>
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address      : 239.0.0.4
Source Address     : 10.41.0.41
<snip>
Up Time            : 0d 00:13:20        Resolved By        : rtable-u 
Up JP State        : Joined             Up JP Expiry       : 0d 00:00:00
Up JP Rpt          : Pruned             Up JP Rpt Override : 0d 00:00:00
 
Rpf Neighbor       : 10.41.0.41
Incoming Intf      : bd-2002
Outgoing Intf List : bd-2001
 
Curr Fwding Rate   : 0.000 kbps         
Forwarded Packets  : 1000               Discarded Packets  : 0
Forwarded Octets   : 84000              RPF Mismatches     : 0
Spt threshold      : 0 kbps             ECMP opt threshold : 7
Admin bandwidth    : 1 kbps             
-------------------------------------------------------------------------------
Groups : 2
===============================================================================

Inclusive Provider mLDP Tunnels in OISM

Inclusive provider tunnels of type mLDP are supported in OISM PEs. These tunnels can be used to transport multicast flows from root PEs to leaf PEs while preventing multiple copies of the same multicast packet on the same link.

OISM with IR versus inclusive mLDP illustrates the difference between using Ingress Replication (IR) and inclusive mLDP provider tunnels in OISM. With a source S1 connected to BD1 and sending a flow to G1, if IR is used, the multicast traffic is only sent to PEs with receivers for (S1,G1). However, if an inclusive mLDP tunnel on PE1 is used (right side of OISM with IR versus inclusive mLDP) the multicast flow is sent to all the PEs in the tenant domain. For example, PE3 receives the flow only to drop it because there are no local receivers.

Figure 90. OISM with IR versus inclusive mLDP

mLDP tunnels are referred to as inclusive BUM tunnels because, in addition to IP multicast traffic, any BUM frame is also distributed to all PEs in the tenant domain. For example, in OISM with IR versus inclusive mLDP (right hand side), any BUM frame generated by any host connected to BD1 in PE1 uses the mLDP tunnel and is also sent to PE3.

The use of mLDP-inclusive provider tunnels in OISM requires the following configuration and procedures to be enabled on the PEs:

  • All the PEs in the OISM tenant domain that need to transmit or receive multicast traffic on an mLDP tree in a BD are configured with the following commands:
    configure service vpls provider-tunnel inclusive owner bgp-evpn-mpls
    configure service vpls provider-tunnel inclusive mldp
  • The PEs attached to the sources (root PEs) must be configured with the following command on the ordinary BDs; the PEs attached to the receivers can be configured as root-and-leaf or leaf-only.
    configure service vpls provider-tunnel inclusive root-and-leaf
  • The PEs attached to the receivers (leaf PEs) need to be configured using the following command on the BDs or SBDs.
    • MD-CLI
      configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
    • classic CLI
      configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
    This ensures the leaf PEs advertise a label in the IMET routes so that the root PEs can create EVPN-MPLS destinations to the leaf PEs and add them to their MFIB. Having EVPN-MPLS destinations in the MFIB is required on the root PE to use the mLDP tunnel for the multicast traffic.
  • The SBD must always be configured as leaf-only in all PEs, because the SBD mLDP tree is not used to transmit IP multicast.

  • For the IMET and SMET routes to be exported and imported with the correct route targets, no vsi-import or vsi-export policies should be configured on the ordinary BDs and the SBDs.

Assuming the preceding guidelines are followed, and as illustrated in OISM with IR versus inclusive mLDP (right side), the root PE (PE1) that is attached to the source in BD1 sends the multicast traffic on an mLDP tree that is joined by leaf PEs either on BD1 (if BD1 exists on the leaf PE) or on the SBD (if BD1 does not exist on the leaf PE).

Example of Inclusive Provider Tunnels in OISM

OISM with inclusive mLDP example illustrates an example of the OISM procedures with mLDP trees.

Figure 91. OISM with inclusive mLDP example

Consider three PEs, PE1, PE2, and PE3, attached to BD1/BD2, BD1, and BD3 respectively, as in OISM with inclusive mLDP example. Assume that the source S1 is connected to BD1 in PE1. PE2 and PE3 are leaf PEs, because they have receivers but no sources. In this example:

  • BD and SBD services must be configured for provider tunnel as follows:

    • To have PE1 sending multicast traffic in P2MP mLDP tunnels on BD1 and BD2, both BDs are configured using the following command.
      configure service vpls provider-tunnel inclusive root-and-leaf
      They are also configured with the following command.
      • MD-CLI
        configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
      • classic CLI
        configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
      The following is an example configuration of BD1 in PE1.
      *A:PE-1>config>service>vpls# info 
      ----------------------------------------------
                  allow-ip-int-bind
                  exit
                  bgp
                  exit
                  bgp-evpn
                      evi 1
                      ingress-repl-inc-mcast-advertisement // default value
                      mpls bgp 1
                          auto-bind-tunnel
                              resolution any
                          exit
                          no shutdown       
                      exit
                  exit
                  provider-tunnel
                      inclusive
                          owner bgp-evpn-mpls
                          root-and-leaf
                          data-delay-interval 10
                          mldp
                          no shutdown
                      exit
                  exit
                  igmp-snooping / mld-snooping
                    no shutdown
                  exit
      <snip>
      
    • PE2 and PE3 BDs are configured as leaf-only, because they must be able to join mLDP trees but not set up an mLDP tree themselves.
      • MD-CLI

        Do not configure root-and-leaf. If root-and-leaf is not configured, the node functions as a leaf-only node. If it is configured, use the following command to delete the configuration.

        configure groups group service vpls provider-tunnel inclusive delete root-and-leaf 
      • classic CLI
        configure service vpls provider-tunnel inclusive no root-and-leaf
      It is important that these BDs are configured with the following command, which allows upstream PEs to create EVPN destinations to them.
      • MD-CLI
        configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
      • classic CLI
        configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement

      Multicast traffic cannot use the mLDP tree unless there is an EVPN-MPLS destination in the MFIB for the multicast stream.

    • The SBDs in all PEs must be configured as leaf-only, that is, without the root-and-leaf option (consistent with the SBD mLDP tree not being used to transmit IP multicast), and with the following command.
      • MD-CLI
        configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
      • classic CLI
        configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
  • When the configuration is added, the PEs create EVPN-MPLS destinations as follows, where a destination is represented as {pe, label} with ‟pe” being the IP address of the remote PE and ‟label” being the EVPN label advertised by the remote PE.

    • PE1 creates the following EVPN-MPLS destinations:

      • On BD1: {pe2,bd1-L21}, {pe2,sbd-L22}, {pe3,sbd-L32}

      • On BD2: {pe2,sbd-L22}, {pe3,sbd-L32}

      • On SBD: {pe2,sbd-L22}, {pe3,sbd-L32}

    • PE2 creates destinations as follows:

      • On BD1: {pe1,bd1-L11}, {pe1,sbd-L13}, {pe3,sbd-L32}

      • On SBD: {pe1,sbd-L13}, {pe3,sbd-L32}

    • PE3 creates destinations as follows:

      • On BD3: {pe1,sbd-L13}, {pe2,sbd-L22}

      • On SBD: {pe1,sbd-L13}, {pe2,sbd-L22}

    • PE2's BD1 and PE3's BD3 do not create an EVPN-MPLS destination to PE1's BD2. Also, PE3's BD3 does not create a destination to PE1's BD1. This occurs despite receiving IMET-Composite routes for those BDs with the SBD-RT, which is imported in the PE2 and PE3 ordinary BDs.

  • As an example, on BD1, PE1's IGMP process adds the EVPN-MPLS destinations {pe2,bd1-L21}, {pe3,sbd-L32} to the MFIB. The third destination {pe2,sbd-L22} is kept down because the EVPN-MPLS destination in BD1 has higher priority.

    1. Upon receiving the SMET route from PE2, PE1 adds {pe2,bd1-L21} as OIF for the MFIB (*,G1).

    2. In the meantime, PE2 and PE3 have joined the mLDP tree with tunnel-id 1.

    3. When multicast to G1 is received from S1, because there is an MFIB EVPN OIF entry, the multicast traffic is forwarded. At the IOM level, PE1 replaces the MFIB EVPN destination with the P2MP tunnel with tunnel-id 1, as long as the P2MP tree is operationally up.

    4. The multicast traffic is sent along the mLDP tree and arrives at PE2/BD1 and PE3/SBD. Local forwarding or routing is then performed in PE2 and PE3, as is normal in OISM.
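
The destination-creation rules in the example above can be modeled with a short sketch. This is a hypothetical model for illustration only: the (pe, bd, label) data structure is an assumption, and the model ignores special cases such as the MEG/PEG SBD behavior.

```python
# Hypothetical model of the destination rules in this example: an ordinary
# BD creates EVPN destinations to the same BD on remote PEs and to every
# remote SBD; the SBD creates destinations to remote SBDs only.

def evpn_destinations(local_pe, local_bd, imets):
    """imets: list of (pe, bd, label) IMET advertisements; 'SBD' marks
    the supplementary broadcast domain."""
    dests = set()
    for pe, bd, label in imets:
        if pe == local_pe:
            continue  # never to self
        same_ordinary_bd = local_bd != "SBD" and bd == local_bd
        if bd == "SBD" or same_ordinary_bd:
            dests.add((pe, label))
    return dests

imets = [
    ("PE1", "BD1", "bd1-L11"), ("PE1", "BD2", "bd2-L12"), ("PE1", "SBD", "sbd-L13"),
    ("PE2", "BD1", "bd1-L21"), ("PE2", "SBD", "sbd-L22"),
    ("PE3", "BD3", "bd3-L31"), ("PE3", "SBD", "sbd-L32"),
]
print(sorted(evpn_destinations("PE1", "BD1", imets)))
# -> [('PE2', 'bd1-L21'), ('PE2', 'sbd-L22'), ('PE3', 'sbd-L32')]
```

Running the same function for PE3's BD3 yields only the remote SBD destinations, matching the example above.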

OISM interworking with MVPN and PIM for MEG or PEG gateways

For EVPN OISM to successfully interwork with MVPN and PIM, it is important to ensure that the MVPN/PIM procedures in the IPVPN network are not modified. In this interworking scenario, two (or more) OISM PEs act as the gateway between the EVPN and the MVPN/PIM network to ensure the OISM procedures are transparent to MVPN/PIM, and vice versa.

SR OS supports the MVPN-to-EVPN Gateway (MEG) and PIM-to-EVPN Gateway (PEG) functions in accordance with draft-ietf-bess-evpn-irb-mcast. Both Ingress Replication (IR) and mLDP trees are supported on the SBD, so that multicast traffic can be received from or transmitted to OISM PEs.

When more than one MEG or PEG is present per EVPN tenant (that is, per SBD), one of the MEGs or PEGs acts as the MEG or PEG designated router (DR). The following are the special functions of the MEG or PEG DR.

  • The DRs behave as a First Hop Router (FHR) from the MVPN/PIM network perspective and register sources in the OISM domain with the RP in the MVPN/PIM domain.

  • The DRs behave as a Last Hop Router (LHR) from the MVPN/PIM network perspective and join the shared or source tree. The non-DR PEs remove the SBD R-VPLS interface from the VPRN’s Layer 3 multicast OIF list, which prevents them from sending multicast traffic to the OISM receivers.

The MEG or PEG DR election occurs in each PE attached to the SBD configured as MEG or PEG. Each PE builds a DR candidate list based on the reception of the Inclusive Multicast Ethernet Tag (IMET) routes for the SBD that include the MEG and/or PEG flag. After the timer set using the dr-activation-timer expires, the PE runs the DR election based on the default algorithm used for EVPN DF election (modulo function of the EVI and number of PEs). The dr-activation-timer command is configured in the following context:

  • MD-CLI

    configure service vpls routed-vpls multicast evpn-gateway
  • Classic CLI

    configure service vpls allow-ip-int-bind evpn-mcast-gateway
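
The election described above can be sketched as follows. The modulo function of the EVI and the number of candidate PEs is from the text; ordering the candidates by IP address is an assumption for illustration and is not necessarily the SR OS implementation.

```python
import ipaddress

# Hypothetical sketch of the default modulo-based DR election: candidates
# (learned from SBD IMET routes carrying the MEG/PEG flag) are ordered by
# IP address, and the DR is the candidate at index (evi mod N).

def elect_dr(evi, candidate_ips):
    ordered = sorted(candidate_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[evi % len(ordered)]

candidates = ["192.0.2.4", "192.0.2.2", "192.0.2.3"]
print(elect_dr(6002, candidates))  # 6002 % 3 == 2 -> '192.0.2.4'
```

Note that all PEs run the same deterministic computation over the same candidate list, so they agree on the DR without any extra signaling.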
Note: A single DR election is run for MEGs and PEGs of the same SBD.

The advertisement of an IMET route with the MEG flag, the PEG flag, or both flags set is configured using the following command:
  • MD-CLI

    configure service vpls routed-vpls multicast evpn-gateway advertise
  • classic CLI

    configure service vpls allow-ip-int-bind evpn-mcast-gateway advertise
Note: This section describes the MVPN and PIM procedures for MEG PEs, using MVPN examples. These procedures also apply to PEG PEs, using PIM messages instead of C-MCAST shared or source join routes.

Procedures for sources in MVPN and PIM and receivers in OISM, Procedures for ASM sources in OISM and receivers in MVPN, and Procedures for SSM sources in OISM and receivers in MVPN describe the MVPN and PIM procedures depending on whether the sources and receivers are attached to the OISM or MVPN network.

Procedures for sources in MVPN and PIM and receivers in OISM

The MEG DR for the SBD generates C-multicast source/shared tree join routes for receivers in the OISM domain. The following information applies to this procedure:

  • This is similar to a PIM DR and its Last Hop Router (LHR) function.

  • For receivers that are directly connected to the MEG DR, the MEG DR creates a Layer 3 multicast state upon receiving an IGMP or MLD message and generates the corresponding C-multicast routes. This handling applies to MEG and PEG PEs.

  • For receivers not directly connected, the MEG DR creates a Layer 3 multicast state upon receiving an SMET route from the PE connected to the receiver. Based on this newly created state, the MEG generates the corresponding C-multicast routes. This scenario is shown in Sources in the MVPN and PIM network .

  • Use one of the following commands to trigger the non-DR MEG to also create the Layer 3 multicast state and advertise the C-multicast routes to attract the multicast traffic. The attracted multicast traffic is dropped at the non-DR MEG; however, configuring this command enables faster convergence in case of a MEG DR failure.
    • MD-CLI

      configure service vpls routed-vpls multicast evpn-gateway non-dr-attract-traffic from-pim-mvpn
    • Classic CLI

      configure service vpls allow-ip-int-bind evpn-mcast-gateway non-dr-attract-traffic from-pim-mvpn

The following figure displays sources in the MVPN and PIM network.

Figure 92. Sources in the MVPN and PIM network
Procedures for ASM sources in OISM and receivers in MVPN

When Any-Source Multicast (ASM) sources exist in the EVPN OISM domain, the MEG DR for the SBD needs to attract the ASM traffic from the EVPN sources and initiate the MVPN register and source discovery procedures. This is homologous to a PIM DR and its First Hop Router (FHR) function. The following figure displays ASM sources in the EVPN network.

Figure 93. ASM sources in the EVPN network

To attract ASM source traffic and act as the FHR, the MEG DR performs the following steps:

  • The MEG DR generates a wildcard SMET route.

    The wildcard SMET route is automatically generated as soon as the MEG is elected as DR. The wildcard SMET route is formatted in accordance with RFC 6625, with address and length equal to zero.

    In addition, to attract multicast traffic from ASM sources on the non-DR routers, the user can configure the following command in the configure service vpls context in the SBD. This command triggers the advertisement of the wildcard SMET route from the non-DR routers:

    • MD-CLI

      routed-vpls multicast evpn-gateway non-dr-attract-traffic from-evpn-pim-mvpn
    • Classic CLI

      allow-ip-int-bind evpn-mcast-gateway non-dr-attract-traffic from-evpn
  • When the MEG DR (for example, PE3 in ASM sources in the EVPN network) receives the ASM multicast traffic, it is handled as follows:

    • Assuming the MEG DR does not have the Layer 3 multicast state for it, the multicast traffic, (S1,G1) in the example shown in ASM sources in the EVPN network, is sent to the CPM.
    • The CPM encapsulates the multicast traffic into unicast register messages to the RP. The RP (PE5 in the example) decapsulates the traffic and sends it down the shared tree.
    • In MVPN, PE5 triggers Source A-D routes and a C-multicast route for (S1,G1), and the SPT switchover occurs.
  • If the MEG non-DR (for example, PE4) receives the ASM multicast traffic, it is handled as follows:

    • Assuming the MEG non-DR does not have the Layer 3 multicast state for it, the multicast traffic, (S1,G1) in the example shown in ASM sources in the EVPN network, is sent to the CPM and discarded.
    • The multicast traffic is heavily rate limited in the CPM.
  • The remote OISM PE attached to the ASM source (for example, PE1) creates an MFIB entry for (*,*) with OIFs for the MEGs that sent the wildcard SMET route. For example:
    • (*,*) OIF: evpn-dest-PE3 (assuming non-dr-attract-traffic false)
    • (*,*) OIF: evpn-dest-PE3, evpn-dest-PE4 (assuming non-dr-attract-traffic true)
  • Any multicast traffic is forwarded based on the MFIB for (*,*).
  • The preceding handling also applies to ASM sources attached to the non-DR MEG or PEG. The non-DR creates an EVPN destination (on the BD attached to the source) to the DR as OIF for (*,*).
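
The zero-length address encoding of the wildcard SMET route can be illustrated with a simplified byte layout. This sketch covers only the Ethernet tag and the source/group fields; the RD and originator fields of the actual SMET NLRI are omitted, and it is not a full implementation.

```python
import struct

# Simplified sketch of part of an SMET route body: Ethernet tag, then
# (length, value) pairs for source and group. A wildcard (*,*) route
# carries zero-length source and group fields, per the RFC 6625
# convention mentioned above. RD and originator fields are omitted.

def smet_body(eth_tag, source=b"", group=b""):
    body = struct.pack("!I", eth_tag)              # 4-byte Ethernet tag
    body += bytes([len(source) * 8]) + source      # source length in bits
    body += bytes([len(group) * 8]) + group        # group length in bits
    return body

wildcard = smet_body(0)  # (*,*): both lengths are zero
print(wildcard.hex())    # -> '000000000000'
```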
Procedures for SSM sources in OISM and receivers in MVPN

Irrespective of the DR election and source discovery process, when a MEG receives an MVPN C-multicast join route, it creates the Layer 3 multicast state and generates an SMET route for the (S,G). This is shown in the following figure.

Figure 94. SSM sources in the EVPN network
  • PE6 may pick PE4 as the Upstream Multicast Hop (UMH) PE for (S1,G1) following regular MVPN procedures. In this case, PE4 adds the SBD interface as the IIF and the MVPN tunnel as an OIF member, and generates an SMET (S1,G1) route to draw the multicast traffic.

  • After PE4 creates the state for (S1,G1), traffic to (S1,G1) is no longer sent to the CPM to be discarded, but it is forwarded in the datapath based on the Layer 3 MFIB state.

  • PE1 creates an MFIB for (S1,G1) and starts sending traffic to PE4. The following are two potential scenarios in this case.

    • If PE4 is configured in the classic CLI as no non-dr-attract-traffic (or in the MD-CLI as non-dr-attract-traffic none), it does not send the wildcard SMET. PE1 creates the following entries in the MFIB and sends traffic to both MEGs:
      • (*,*) oif: evpn-dest-PE3
      • (S1,G1) oif: evpn-dest-PE3, evpn-dest-PE4
    • If PE4 is configured in the classic CLI and MD-CLI as non-dr-attract-traffic from-evpn, PE4 and PE3 both send the wildcard SMET. PE1 ignores any SMET (S/*,G) routes from a PE when an SMET (*,*) route is received from the same PE. If the (*,*) route is removed, PE1 reverts to handling the (S/*,G) entries. For this reason, PE1 in this case creates only (*,*) OIFs and sends the traffic to both MEGs.

      The following OIF entry is created: (*,*) oif: evpn-dest-PE3, evpn-dest-PE4.
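
The OIF computation in these two cases can be modeled as follows. This is an illustrative model only, not SR OS code; the route tuple format is an assumption for the example.

```python
# Illustrative model (not SR OS code) of the OIF computation at the source
# PE: while a (*,*) SMET route is present from a given PE, more specific
# SMET routes from that same PE are ignored.

def build_oifs(smet_routes):
    """smet_routes: list of (pe, source, group); '*' means wildcard.
    Returns a dict mapping (source, group) -> set of destination PEs."""
    wildcard_pes = {pe for pe, s, g in smet_routes if (s, g) == ("*", "*")}
    oifs = {}
    for pe, s, g in smet_routes:
        if (s, g) != ("*", "*") and pe in wildcard_pes:
            continue  # shadowed by this PE's (*,*) route
        oifs.setdefault((s, g), set()).add(pe)
    return oifs

routes = [
    ("PE3", "*", "*"),    # MEG DR always advertises (*,*)
    ("PE4", "*", "*"),    # non-DR with non-dr-attract-traffic from-evpn
    ("PE4", "S1", "G1"),  # ignored while PE4's (*,*) route exists
]
print(build_oifs(routes))  # only the (*,*) entry remains, with PE3 and PE4
```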

Note: To avoid duplication between MEG or PEG nodes, MEG or PEG PEs do not create EVPN destinations to each other in the SBD.

MEG or PEG gateways and local receivers or sources

This section uses examples to describe the applicable considerations for local receivers and sources on MEG and PEG PEs.

Local singlehomed Receiver-2 on a MEG or PEG PE1, BD2

Local singlehomed receivers shows an initial situation where PE1 and PE2 are MEG/PEGs and PE2 is elected as the MEG or PEG DR. PE1 and PE2 do not have an EVPN destination between their SBDs.

The following shows a local singlehomed receiver.

Figure 95. Local singlehomed receivers

The following workflow applies to the example shown in the preceding graphic:

  1. PE1 learns via IGMP/MLD that Receiver-2 is interested in (S1,G1).

    As shown in Local singlehomed receivers, Receiver-1, which is connected to a remote OISM non-MEG PE, issues an IGMP join for the same group. This triggers the corresponding SMET route from PE3 and PE4.

  2. PE1 determines from its route table that there is a route to S1 via IP-VPN.

    PE1 originates an MVPN C-multicast source tree join (S,G) route or a PIM (S,G) join, via normal MVPN or PIM procedures.

    1. PE1 adds the MVPN tunnel or PIM interface as the Layer 3 IIF. The BD2 IRB is added to the Layer 3 OIF list.
    2. PE1 also issues an SMET route as usual.
    3. Since PE2 is the SBD’s MEG DR, PE2 also sends a PIM/C-multicast join route upon receiving the SMET route from PE3 and PE4.
  3. PE1 or PE2 receives the multicast traffic from the appropriate tunnel or interface, and the traffic passes the RPF check. PE1 sends the multicast traffic down the BD2 IRB to the receiver. Because PE1 is non-DR for the SBD, the SBD IRB is not in the Layer 3 OIF list.

    PE2’s SBD does not send the multicast flow to PE1, because there are no EVPN multicast destinations between MEG or PEG PEs of the same SBD.

  4. Only PE2, the SBD’s MEG or PEG DR, sends the multicast down the SBD’s IRB to the remote OISM PEs and regular OISM forwarding follows on PE3 and PE4.
Local multihomed Receiver-1 on a pair of MEG or PEG PE1 and PE2, BD2

Local multihomed receivers shows an initial situation where MEG or PEG routers PE1 and PE2 are multihomed to a local receiver in BD2. PE2 is the DR for the SBD. As both PEs are MEG or PEG for the same SBD, no EVPN multicast destination exists between the PEs for the SBD.

The following figure shows local multihomed receivers.

Figure 96. Local multihomed receivers

The following workflow applies to the example shown in the preceding graphic.

  1. PE2 learns, via IGMP/MLD, that Receiver-1 is interested in (S1,G1) and adds the ES SAP to the OIF list.

  2. PE2 synchronizes the (S1,G1) state with PE1 via BGP multicast synch routes, and PE1 adds the ES SAP to its OIF list.

  3. Both PE1 and PE2 originate an SMET (S1,G1) following normal OISM procedures.

    Both PEs generate the corresponding MVPN/PIM join route for (S1,G1), because the MEG or PEG DR election applies only to the SBD while this state is created in BD2.

  4. Step 3 causes traffic from the source to flow to both the DF and NDF, although only the DF forwards the traffic.
    1. The MEG or PEG DR and non-DR states only impact the addition of the SBD interface to the Layer 3 OIF.
    2. The datapath extensions prevent MVPN traffic from being sent to EVPN destinations other than an SBD EVPN destination.
  5. PE2’s SBD is added to the Layer 3 OIF list. However, since there is no EVPN multicast destination between the MEG/PEGs of the same SBD, multicast is not sent from PE2 to PE1.
Note: In OISM, all BDs are assumed to be EVPN enabled.

Local-bias behavior only applies to Layer 2 multicast (BUM in general) and not to Layer 3 multicast. That is, in Local multihomed receivers, the following applies to Layer 3 multicast traffic arriving at PE1 and PE2:

  • can be forwarded to single-homed and DF SAPs in BD2
  • cannot be forwarded to non-DF SAPs in BD2
  • cannot be forwarded to EVPN destinations in BD2, in accordance with the OISM rules
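
The three rules above can be captured in a small predicate. This is an illustrative sketch only; the port attributes are modeled as plain flags and do not reflect the actual datapath implementation.

```python
# Sketch of the Layer 3 multicast forwarding rules for BD2 listed above;
# port attributes are modeled as plain dict flags for illustration only.

def l3_mcast_can_forward(port):
    """port: dict with 'type' ('sap' or 'evpn'); multihomed ES SAPs also
    carry 'es' and 'df' (designated-forwarder) flags."""
    if port["type"] == "evpn":
        return False  # never to EVPN destinations in the BD (OISM rule)
    if port.get("es") and not port.get("df"):
        return False  # never to non-DF ES SAPs
    return True       # single-homed SAPs and DF SAPs

print(l3_mcast_can_forward({"type": "sap", "es": False}))              # True
print(l3_mcast_can_forward({"type": "sap", "es": True, "df": False}))  # False
print(l3_mcast_can_forward({"type": "evpn"}))                          # False
```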
Local multihomed source S2 on a pair of MEG or PEG PE1 and PE2, BD2

Local multihomed sources shows a scenario where PE1 and PE2 are multihomed to a local source. A local receiver is also using multihoming to the same MEG or PEG pair. PE2 is the SBD-DR.

Figure 97. Local multihomed sources

When the source sends multicast traffic for S2,G2, VXLAN local-bias or regular ESI-label filtering ensures the multihomed local receiver does not get duplicate traffic. The following applies in this scenario:

  • The MEG SBD-DR (PE2) still performs the FHR functionality in this case (sends register/Source A-D routes), even if the source was singlehomed to the non-DR.
  • If S2 was singlehomed to PE2 only, to avoid tromboning, the source S2 would be learned via ARP/ND as a host route and advertised in a VPN-IP route to attract the join route on PE2.
  • If S2 is multihomed, as shown in Local multihomed sources, tromboning may occur, but traffic still flows correctly. For example:
    1. the remote PE performs UMH selection and picks up PE1
    2. PE1 generates a SMET route in the SBD as usual
    3. the SMET is imported and the state added to BD2 in PE2
    4. traffic is received by PE2, forwarded to PE1 via BD2, and then forwarded to the remote MVPN/PIM PE

MEG or PEG configuration example for Ingress Replication on the SBD

This section shows a configuration example for a pair of redundant MEGs. For a PEG example, replace the MVPN configuration with PIM interfaces in the VPRN service.

Each MEG in the pair is configured with a VPRN that contains the MVPN configuration and an SBD R-VPLS. It is assumed that there are no local sources or receivers in this example. The use of domain-id in the VPRN and the SBD R-VPLS prevents control plane loops for unicast routes reinjected from the IP-VPN domain into the EVPN domain, and the other way around. Preventing these loops guarantees the correct installation of unicast routes in the MEGs' route tables, and therefore ensures the C-multicast routes are correctly advertised and processed. See BGP D-PATH attribute for Layer 3 loop protection for more information about the configuration of domain ID. The following CLI shows the configuration in MEG1.

// MEG1’s VPRN service

*A:MEG1# configure service vprn 6000 
*A:MEG1>config>service>vprn# info 
----------------------------------------------
            interface "SBD-6002" create
                vpls "SBD-6002"
                    evpn-tunnel supplementary-broadcast-domain
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    domain-id 64500:6000
                    route-distinguisher 192.0.2.2:6000
                    vrf-target target:64500:6000
                    no shutdown
                exit
            exit
            igmp
                interface "SBD-6002"
                    no shutdown
                exit
                no shutdown
            exit
            pim
                interface "SBD-6002"
                    multicast-senders always
                exit
                apply-to all
                rp
                    static
                        address 2.2.2.2
                            group-prefix 239.0.0.0/8
                        exit          
                    exit
                    bsr-candidate
                        shutdown
                    exit
                    rp-candidate
                        shutdown
                    exit
                exit
                no shutdown
            exit
            mvpn
                auto-discovery default
                c-mcast-signaling bgp
                intersite-shared persistent-type5-adv
                provider-tunnel
                    inclusive
                        mldp
                            no shutdown
                        exit
                    exit
                exit
                vrf-target unicast
                exit
            exit
            no shutdown
----------------------------------------------

// MEG1’s SBD service

*A:MEG1>config>service>vprn# /configure service vpls 6002 
*A:MEG1>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
                forward-ipv4-multicast-to-ip-int
                forward-ipv6-multicast-to-ip-int
                evpn-mcast-gateway create
                    non-dr-attract-traffic from-evpn from-pim-mvpn
                    no shutdown
                exit
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement domain-id 64500:6002
                sel-mcast-advertisement
                evi 6002
                mpls bgp 1
                    ingress-replication-bum-label
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            igmp-snooping
                no shutdown
            exit
            mld-snooping
                no shutdown
            exit
            no shutdown
----------------------------------------------

The configuration of the redundant MEG2 is as follows:

// MEG2’s VPRN configuration

*A:MEG2# configure service vprn 6000 
*A:MEG2>config>service>vprn# info 
----------------------------------------------
            interface "SBD-6002" create
                vpls "SBD-6002"
                    evpn-tunnel supplementary-broadcast-domain
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    domain-id 64500:6000
                    route-distinguisher 192.0.2.3:6000
                    vrf-target target:64500:6000
                    no shutdown
                exit
            exit
            igmp
                interface "SBD-6002"
                    no shutdown
                exit
                no shutdown
            exit
            pim
                interface "SBD-6002"
                    multicast-senders always
                exit
                apply-to all
                rp
                    static
                        address 3.3.3.3
                            group-prefix 239.0.0.0/8
                        exit          
                    exit
                    bsr-candidate
                        shutdown
                    exit
                    rp-candidate
                        shutdown
                    exit
                exit
                no shutdown
            exit
            mvpn
                auto-discovery default
                c-mcast-signaling bgp
                intersite-shared persistent-type5-adv
                provider-tunnel
                    inclusive
                        mldp
                            no shutdown
                        exit
                    exit
                exit
                vrf-target unicast
                exit
            exit
            no shutdown
----------------------------------------------

// MEG2 SBD configuration

*A:MEG2>config>service>vprn# /configure service vpls 6002 
*A:MEG2>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
                forward-ipv4-multicast-to-ip-int
                forward-ipv6-multicast-to-ip-int
                evpn-mcast-gateway create
                    non-dr-attract-traffic from-evpn from-pim-mvpn
                    no shutdown
                exit
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement domain-id 64500:6002
                sel-mcast-advertisement
                evi 6002
                mpls bgp 1
                    ingress-replication-bum-label
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            igmp-snooping
                no shutdown
            exit
            mld-snooping
                no shutdown
            exit
            no shutdown
----------------------------------------------

After the preceding configuration is added, MEG1 and MEG2 run the DR election. In the following example, which displays a sample DR election result, MEG1 is the DR:

*A:MEG1# show service id "SBD-6002" evpn-mcast-gateway all 

===============================================================================
Service Evpn Multicast Gateway
===============================================================================
Type                         : mvpn-pim                
Admin State                  : Enabled                 
DR Activation Timer          : 3 secs                  
Mvpn Evpn Gateway DR         : Yes                     
Pim  Evpn Gateway DR         : Yes                     
===============================================================================

===============================================================================
Mvpn Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs                  
DR                           : Yes                     
DR Last Change               : 09/27/2021 08:50:32     
===============================================================================

===============================================================================
Candidate list
===============================================================================
Orig-Ip                                 Time Added
-------------------------------------------------------------------------------
192.0.2.2                               09/27/2021 08:50:29
192.0.2.3                               09/27/2021 08:51:20
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================

===============================================================================
Pim Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs                  
DR                           : Yes                     
DR Last Change               : 09/27/2021 08:50:32     
===============================================================================

===============================================================================
Candidate list
===============================================================================
Orig-Ip                                 Time Added
-------------------------------------------------------------------------------
192.0.2.2                               09/27/2021 08:50:29
192.0.2.3                               09/27/2021 08:51:20
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================


*A:MEG2# show service id "SBD-6002" evpn-mcast-gateway all 

===============================================================================
Service Evpn Multicast Gateway
===============================================================================
Type                         : mvpn-pim                
Admin State                  : Enabled                 
DR Activation Timer          : 3 secs                  
Mvpn Evpn Gateway DR         : No                      
Pim  Evpn Gateway DR         : No                      
===============================================================================

===============================================================================
Mvpn Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs                  
DR                           : No                      
DR Last Change               : 09/27/2021 08:51:24     
===============================================================================

===============================================================================
Candidate list
===============================================================================
Orig-Ip                                 Time Added
-------------------------------------------------------------------------------
192.0.2.2                               09/27/2021 08:51:21
192.0.2.3                               09/27/2021 08:50:37
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================

===============================================================================
Pim Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs                  
DR                           : No                      
DR Last Change               : 09/27/2021 08:51:24     
===============================================================================

===============================================================================
Candidate list
===============================================================================
Orig-Ip                                 Time Added
-------------------------------------------------------------------------------
192.0.2.2                               09/27/2021 08:51:21
192.0.2.3                               09/27/2021 08:50:37
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================

If a source 40.0.0.1 located in a remote PE of the MVPN network is streaming group 239.0.0.44, the DR (in this example, MEG1) attracts the traffic (by sending a C-multicast source-join route) and forwards it to the SBD. The non-DR MEG2 does not add the SBD to the OIF list and, therefore, does not forward the multicast traffic to the OISM domain. The following is a sample output for this scenario.

// On the DR, MEG1, the SBD-6002 is added to the OIF list

*A:MEG1# show router 6000 pim group 239.0.0.44 detail 

===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address      : 239.0.0.44
Source Address     : 40.0.0.1
RP Address         : 2.2.2.2
Advt Router        : 192.0.2.4
Flags              : spt                Type               : (S,G)
Mode               : sparse             
MRIB Next Hop      : 192.0.2.4
MRIB Src Flags     : remote             
Keepalive Timer    : Not Running        
Up Time            : 2d 04:42:53        Resolved By        : rtable-u
 
Up JP State        : Joined             Up JP Expiry       : 0d 00:00:06
Up JP Rpt          : Not Joined StarG   Up JP Rpt Override : 0d 00:00:00
 
Register State     : No Info            
Reg From Anycast RP: No                 
 
Rpf Neighbor       : 192.0.2.4
Incoming Intf      : mpls-if-73731
Outgoing Intf List : SBD-6002
 
Curr Fwding Rate   : 0.000 kbps         
Forwarded Packets  : 9999               Discarded Packets  : 0
Forwarded Octets   : 839916             RPF Mismatches     : 0
Spt threshold      : 0 kbps             ECMP opt threshold : 7
Admin bandwidth    : 1 kbps             
-------------------------------------------------------------------------------
Groups : 1
===============================================================================

// SBD-6002 is not added to the OIF list on the non-DR MEG2

*A:MEG2# show router 6000 pim group 239.0.0.44 detail 

===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address      : 239.0.0.44
Source Address     : 40.0.0.1
RP Address         : 3.3.3.3
Advt Router        : 192.0.2.4
Flags              : spt                Type               : (S,G)
Mode               : sparse             
MRIB Next Hop      : 192.0.2.4
MRIB Src Flags     : remote             
Keepalive Timer    : Not Running        
Up Time            : 2d 04:43:02        Resolved By        : rtable-u
 
Up JP State        : Joined             Up JP Expiry       : 0d 00:00:58
Up JP Rpt          : Not Joined StarG   Up JP Rpt Override : 0d 00:00:00
 
Register State     : No Info            
Reg From Anycast RP: No                 
 
Rpf Neighbor       : 192.0.2.4
Incoming Intf      : mpls-if-73733
Outgoing Intf List : 
 
Curr Fwding Rate   : 0.000 kbps         
Forwarded Packets  : 0                  Discarded Packets  : 0
Forwarded Octets   : 0                  RPF Mismatches     : 0
Spt threshold      : 0 kbps             ECMP opt threshold : 7
Admin bandwidth    : 1 kbps             
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
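The DR behavior shown in the two outputs above reduces to a simple rule, sketched here in Python (illustrative only, not SR OS code): for a stream whose incoming interface is on the MVPN side, only the gateway DR places the SBD interface in the OIF list.

```python
# Only the MVPN/EVPN gateway DR forwards MVPN-sourced multicast into
# the OISM domain; the non-DR keeps an empty OIF list for the (S,G).

def oif_list(is_dr, sbd_interface="SBD-6002"):
    """OIF list for an (S,G) whose incoming interface is on the MVPN side."""
    return [sbd_interface] if is_dr else []

assert oif_list(is_dr=True) == ["SBD-6002"]   # MEG1 (DR) forwards to the SBD
assert oif_list(is_dr=False) == []            # MEG2 (non-DR) does not
```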

MEG or PEG configuration example for mLDP on the SBD

This section shows a configuration example for a pair of redundant MEGs that use mLDP in the SBD to transmit and receive multicast traffic.

As in the previous example, each MEG in the pair is configured with a VPRN that contains the MVPN configuration and an SBD R-VPLS. In this example, local sources and receivers are supported; they are attached to local BDs or local interfaces in the VPRN. A receiver connected to a local BD (BD-6023) is multihomed to MEG1 and MEG2. Also, as in the previous example, the use of domain-id in the VPRN and the SBD R-VPLS prevents control plane loops for unicast routes. The following CLI shows the configuration in MEG1.

// MEG1’s VPRN service

*A:MEG1# configure service vprn 6000 
*A:MEG1>config>service>vprn# info 
----------------------------------------------
            local-routes-domain-id 64500:2 // avoids loops for local routes
            interface "BD-6023" create  // local BD
                address 11.0.0.2/24
                vrrp 1 passive
                    backup 11.0.0.254
                exit
                vpls "BD-6023"
                    evpn
                        arp
                            no learn-dynamic
                            advertise dynamic
                        exit
                    exit
                exit
            exit
            interface "SBD-6002" create
                vpls "SBD-6002"
                    evpn-tunnel supplementary-broadcast-domain
                exit
            exit
            interface "local" create // local interface
                address 20.0.0.254/24
                sap pxc-6.a:600 create
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    domain-id 64500:6000
                    route-distinguisher 192.0.2.2:6000
                    vrf-target target:64500:6000
                    no shutdown
                exit
            exit
            igmp
                interface "BD-6023" 
                    no shutdown 
                exit 
                interface "SBD-6002" 
                    no shutdown 
                exit 
                interface "local" 
                    no shutdown 
                exit
            exit
            pim
                interface "SBD-6002"
                    multicast-senders always
                exit
                apply-to all
                rp
                    static
                        address 4.4.4.4
                            group-prefix 224.0.0.0/4
                        exit          
                    exit
                    bsr-candidate
                        shutdown
                    exit
                    rp-candidate
                        shutdown
                    exit
                exit
                no shutdown
            exit
            mvpn
                auto-discovery default
                c-mcast-signaling bgp
                intersite-shared persistent-type5-adv
                provider-tunnel
                    inclusive
                        mldp
                            no shutdown
                        exit
                    exit
                exit
                vrf-target unicast
                exit
            exit
            no shutdown
----------------------------------------------

// MEG1’s SBD service

*A:MEG1>config>service>vprn# /configure service vpls 6002 
*A:MEG1>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
                forward-ipv4-multicast-to-ip-int
                forward-ipv6-multicast-to-ip-int
                evpn-mcast-gateway create
                    non-dr-attract-traffic from-evpn from-pim-mvpn
                    no shutdown
                exit
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement domain-id 64500:6002
                sel-mcast-advertisement
                evi 6002
                mpls bgp 1
                    ingress-replication-bum-label
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            provider-tunnel // mldp is enabled on the SBD
                inclusive
                    owner bgp-evpn-mpls
                    data-delay-interval 10
                    root-and-leaf
                    mldp
                    no shutdown
                exit
            exit
            igmp-snooping
                no shutdown
            exit
            mld-snooping
                no shutdown
            exit
            no shutdown
----------------------------------------------

// MEG1’s local BD-6023 service

*A:MEG1>config>service>vprn# /configure service vpls 6023
*A:MEG1>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
                forward-ipv4-multicast-to-ip-int
                forward-ipv6-multicast-to-ip-int
                igmp-snooping
                    mrouter-port
                exit
                mld-snooping
                    mrouter-port
                exit
            exit
            bgp
            exit
            bgp-evpn
                evi 623
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            provider-tunnel
                inclusive
                    owner bgp-evpn-mpls
                    data-delay-interval 10
                    root-and-leaf
                    mldp
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            igmp-snooping
                no shutdown
            exit
            mld-snooping
                no shutdown
            exit
            sap lag-1:623 create
                igmp-snooping
                    send-queries 
                exit
                no shutdown
            exit
            no shutdown
----------------------------------------------

The configuration of the redundant MEG2 is as follows:

// MEG2’s VPRN configuration

*A:MEG2# configure service vprn 6000 
*A:MEG2>config>service>vprn# info 
----------------------------------------------
            local-routes-domain-id 64500:3
            interface "BD-6023" create
                address 11.0.0.3/24
                vrrp 1 passive
                    backup 11.0.0.254
                exit
                vpls "BD-6023"
                    evpn
                        arp
                            no learn-dynamic
                            advertise dynamic
                        exit
                    exit
                exit
            exit
            interface "SBD-6002" create
                vpls "SBD-6002"
                    evpn-tunnel supplementary-broadcast-domain
                exit
            exit
            interface "local" create
                address 30.0.0.254/24
                sap pxc-6.a:600 create
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    domain-id 64500:6000
                    route-distinguisher 192.0.2.3:6000
                    vrf-target target:64500:6000
                    no shutdown
                exit
            exit
            igmp
                interface "BD-6023" 
                    no shutdown 
                exit 
                interface "SBD-6002" 
                    no shutdown 
                exit 
                interface "local" 
                    no shutdown 
                exit
                no shutdown
            exit
            pim
                interface "SBD-6002"
                    multicast-senders always
                exit
                apply-to all
                rp
                    static
                        address 4.4.4.4
                            group-prefix 224.0.0.0/4
                        exit          
                    exit
                    bsr-candidate
                        shutdown
                    exit
                    rp-candidate
                        shutdown
                    exit
                exit
                no shutdown
            exit
            mvpn
                auto-discovery default
                c-mcast-signaling bgp
                intersite-shared persistent-type5-adv
                provider-tunnel
                    inclusive
                        mldp
                            no shutdown
                        exit
                    exit
                exit
                vrf-target unicast
                exit
            exit
            no shutdown
----------------------------------------------

// MEG2 SBD configuration

*A:MEG2>config>service>vprn# /configure service vpls 6002 
*A:MEG2>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
                forward-ipv4-multicast-to-ip-int
                forward-ipv6-multicast-to-ip-int
                evpn-mcast-gateway create
                    non-dr-attract-traffic from-evpn from-pim-mvpn
                    no shutdown
                exit
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement domain-id 64500:6002
                sel-mcast-advertisement
                evi 6002
                mpls bgp 1
                    ingress-replication-bum-label
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            provider-tunnel
                inclusive
                    owner bgp-evpn-mpls
                    data-delay-interval 10
                    root-and-leaf
                    mldp
                    no shutdown
                exit
            exit
            igmp-snooping
                no shutdown
            exit
            mld-snooping
                no shutdown
            exit
            no shutdown
----------------------------------------------

// MEG2’s local BD-6023 service

A:MEG2>config>service>vpls# /configure service vpls 6023 
A:MEG2>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
                forward-ipv4-multicast-to-ip-int
                forward-ipv6-multicast-to-ip-int
                igmp-snooping
                    mrouter-port
                exit
                mld-snooping
                    mrouter-port
                exit
            exit
            bgp
            exit
            bgp-evpn
                evi 623
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            provider-tunnel
                inclusive
                    owner bgp-evpn-mpls
                    data-delay-interval 10
                    root-and-leaf
                    mldp
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            igmp-snooping
                no shutdown
            exit
            mld-snooping
                no shutdown
            exit
            sap lag-1:623 create
                igmp-snooping
                    send-queries      
                exit
                no shutdown
            exit
            no shutdown
----------------------------------------------

After the preceding configuration is added, MEG1 and MEG2 run the DR election. As in the previous example, MEG1 is elected as the DR and MEG2 as the non-DR. In this example, the SBD uses mLDP instead of ingress replication to transmit and receive multicast traffic. The following sample output shows the status of the provider-tunnel on MEG1 and MEG2.

A:MEG1# show service id "SBD-6002" provider-tunnel 

===============================================================================
Service Provider Tunnel Information
===============================================================================
Type               : inclusive          Root and Leaf      : enabled
Admin State        : enabled            Data Delay Intvl   : 10 secs
PMSI Type          : ldp                LSP Template       : 
Remain Delay Intvl : 0 secs             LSP Name used      : 8195
PMSI Owner         : bgpEvpnMpls        
Oper State         : up                 Root Bind Id       : 32767
===============================================================================

A:MEG1# tools dump service id "SBD-6002" provider-tunnels   

===============================================================================
VPLS 6002 Inclusive Provider Tunnels Originating             
===============================================================================
ipmsi (LDP)                                     P2MP-ID  Root-Addr
-------------------------------------------------------------------------------
8195                                            8195    192.0.2.2       

-------------------------------------------------------------------------------

===============================================================================
VPLS 6002 Inclusive Provider Tunnels Terminating             
===============================================================================
ipmsi (LDP)                                     P2MP-ID  Root-Addr
-------------------------------------------------------------------------------
                                                8193    192.0.2.1       

-------------------------------------------------------------------------------

A:MEG2# show service id "SBD-6002" provider-tunnel 

===============================================================================
Service Provider Tunnel Information
===============================================================================
Type               : inclusive          Root and Leaf      : enabled
Admin State        : enabled            Data Delay Intvl   : 10 secs
PMSI Type          : ldp                LSP Template       : 
Remain Delay Intvl : 0 secs             LSP Name used      : 8195
PMSI Owner         : bgpEvpnMpls        
Oper State         : up                 Root Bind Id       : 32767
===============================================================================

A:MEG2# tools dump service id "SBD-6002" provider-tunnels   

===============================================================================
VPLS 6002 Inclusive Provider Tunnels Originating             
===============================================================================
ipmsi (LDP)                                     P2MP-ID  Root-Addr
-------------------------------------------------------------------------------
8195                                            8195    192.0.2.3       

-------------------------------------------------------------------------------

===============================================================================
VPLS 6002 Inclusive Provider Tunnels Terminating             
===============================================================================
ipmsi (LDP)                                     P2MP-ID  Root-Addr
-------------------------------------------------------------------------------
                                                8193    192.0.2.1       

-------------------------------------------------------------------------------

The example in MEG or PEG configuration example for Ingress Replication on the SBD showed the IIF and OIF lists on the MEGs for a source 40.0.0.1 that was connected to a remote MVPN PE and streaming group 239.0.0.44. In this example, a source 10.0.0.1 connected to a remote OISM PE is streaming group 239.0.0.1. The group has local receivers on the local BD-6023, which is multihomed to MEG1 and MEG2, and receivers on a remote MVPN PE. The remote MVPN PE is configured with the default mvpn umh-selection highest-ip; therefore, a local join triggers a C-multicast source-join route that is imported only by MEG2 (because MEG2 has a higher IP address than MEG1).
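The highest-ip UMH selection referenced above can be sketched as follows (illustrative Python, not SR OS code): among the candidate upstream multicast hops, the remote MVPN PE picks the one with the numerically highest IP address, which is why only MEG2 imports the source-join route.

```python
import ipaddress

# Default "umh-selection highest-ip": the candidate UMH with the
# numerically highest originator IP address wins.

def select_umh(candidate_ips):
    """Return the UMH candidate with the highest IP address."""
    return max(candidate_ips, key=ipaddress.ip_address)

# MEG1 = 192.0.2.2, MEG2 = 192.0.2.3, so MEG2 is selected:
assert select_umh(["192.0.2.2", "192.0.2.3"]) == "192.0.2.3"
```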

The following shows a sample output for this scenario.

// The source-join route for 239.0.0.1 is only imported by MEG2

A:MEG1# show router bgp routes mvpn-ipv4 type source-join group-ip 239.0.0.1 source-ip 10.0.0.1 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP MVPN-IPv4 Routes
===============================================================================
Flag  RouteType                   OriginatorIP           LocalPref   MED
      RD                          SourceAS               Path-Id     IGP Cost
      Nexthop                     SourceIP                           Label
      As-Path                     GroupIP                            
-------------------------------------------------------------------------------
No Matching Entries Found.
===============================================================================

A:PE-3# show router bgp routes mvpn-ipv4 type source-join group-ip 239.0.0.1 source-ip 10.0.0.1 
===============================================================================
 BGP Router ID:192.0.2.3        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP MVPN-IPv4 Routes
===============================================================================
Flag  RouteType                   OriginatorIP           LocalPref   MED
      RD                          SourceAS               Path-Id     IGP Cost
      Nexthop                     SourceIP                           Label
      As-Path                     GroupIP                            
-------------------------------------------------------------------------------
u*>i  Source-Join                 -                      100         0
      192.0.2.3:6000              64500                  None        -
      192.0.2.4                   10.0.0.1                            
      No As-Path                  239.0.0.1                           
-------------------------------------------------------------------------------
Routes : 1
===============================================================================

// Therefore, only MEG2 will add the MVPN tunnel to the OIF list for the group 
// MEG1 only adds the local BD-6023 to the OIF list

A:MEG1# show router 6000 pim group 239.0.0.1 detail 

===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address      : 239.0.0.1
Source Address     : 10.0.0.1
RP Address         : 4.4.4.4
Advt Router        : 
Flags              : spt                Type               : (S,G)
Mode               : sparse             
MRIB Next Hop      : 10.0.0.1
MRIB Src Flags     : direct             
Keepalive Timer Exp: 0d 00:02:43        
Up Time            : 1d 17:27:02        Resolved By        : rtable-u
 
Up JP State        : Joined             Up JP Expiry       : 0d 00:00:00
Up JP Rpt          : Not Joined StarG   Up JP Rpt Override : 0d 00:00:00
 
Register State     : Pruned             Register Stop Exp  : 0d 00:00:09
Reg From Anycast RP: No                 
 
Rpf Neighbor       : 10.0.0.1
Incoming Intf      : SBD-6002
Outgoing Intf List : BD-6023, SBD-6002
 
Curr Fwding Rate   : 67.200 kbps        
Forwarded Packets  : 15258              Discarded Packets  : 0
Forwarded Octets   : 1281672            RPF Mismatches     : 0
Spt threshold      : 0 kbps             ECMP opt threshold : 7
Admin bandwidth    : 1 kbps             
-------------------------------------------------------------------------------
Groups : 1
===============================================================================

// on MEG1's local BD-6023 there is a receiver on sap:lag-1:623

A:MEG1# show service id "BD-6023" mfib 

===============================================================================
Multicast FIB, Service 6023
===============================================================================
Source Address  Group Address         Port Id                      Svc Id   Fwd
                                                                            Blk
-------------------------------------------------------------------------------
*               *                     mpls:192.0.2.3:524258        Local    Fwd
10.0.0.1        239.0.0.1             sap:lag-1:623                Local    Fwd
                                      mpls:192.0.2.3:524258        Local    Fwd
*               * (mac)               mpls:192.0.2.3:524258        Local    Fwd
-------------------------------------------------------------------------------
Number of entries: 3
===============================================================================

// MEG2 adds the local BD-6023 and the MVPN tunnel to the OIF list

A:PE-3# show router 6000 pim tunnel-interface 

===============================================================================
PIM Interfaces ipv4
===============================================================================
Interface                        Originator Address   Adm  Opr  Transport Type
-------------------------------------------------------------------------------
mpls-if-73729                    192.0.2.3            Up   Up   Tx-IPMSI
mpls-if-73733                    192.0.2.4            Up   Up   Rx-IPMSI
mpls-if-73736                    192.0.2.2            Up   Up   Rx-IPMSI
-------------------------------------------------------------------------------
Interfaces : 3
===============================================================================

A:PE-3# show router 6000 pim group 239.0.0.1 detail 

===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address      : 239.0.0.1
Source Address     : 10.0.0.1
RP Address         : 4.4.4.4
Advt Router        : 
Flags              : spt                Type               : (S,G)
Mode               : sparse             
MRIB Next Hop      : 10.0.0.1
MRIB Src Flags     : direct             
Keepalive Timer    : Not Running        
Up Time            : 1d 17:32:43        Resolved By        : rtable-u
 
Up JP State        : Joined             Up JP Expiry       : 0d 00:00:00
Up JP Rpt          : Not Joined StarG   Up JP Rpt Override : 0d 00:00:00
 
Register State     : No Info            
Reg From Anycast RP: No                 
 
Rpf Neighbor       : 10.0.0.1
Incoming Intf      : SBD-6002
Outgoing Intf List : BD-6023, mpls-if-73729
 
Curr Fwding Rate   : 66.864 kbps        
Forwarded Packets  : 27221              Discarded Packets  : 0
Forwarded Octets   : 2286564            RPF Mismatches     : 0
Spt threshold      : 0 kbps             ECMP opt threshold : 7
Admin bandwidth    : 1 kbps             
-------------------------------------------------------------------------------
Groups : 1
===============================================================================

EVPN Layer-2 multicast (IGMP/MLD proxy)

SR OS supports EVPN Layer-2 multicast as described in the EVPN IGMP/MLD Proxy specification, RFC 9251. When enabled in a VPLS service with active IGMP or MLD snooping, IGMP or MLD messages are no longer sent to EVPN destinations. SMET routes (EVPN route type 6) are advertised instead, so that interest in a specific (S,G) can be signaled to the rest of the PEs attached to the same VPLS (also known as a Broadcast Domain (BD)). See SMET routes replace IGMP/MLD reports.

Figure 98. SMET routes replace IGMP/MLD reports

A VPLS service supporting EVPN-based proxy-IGMP/MLD functionality is configured as follows:

vpls 1 name "evi-1" customer 1 create 
            bgp
            exit
            bgp-evpn
                evi 1
                sel-mcast-advertisement
                vxlan
                    shutdown
                exit
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit                  
            exit
            igmp/mld-snooping
                evpn-proxy 
                    no shutdown           
                exit
            exit
            sap lag-1:101 create
                igmp-snooping
                    send-queries
                exit
                no shutdown
            exit 

Where:

  • The sel-mcast-advertisement command allows the advertisement of SMET routes.

    The received SMET routes are processed regardless of the command.

  • The evpn-proxy command in either the igmp-snooping or mld-snooping context:

    • triggers an IMET route update with the multicast flags EC and the proxy bits set. The multicast flags extended community carries a flag for IGMP proxy, which is set if igmp-snooping>evpn-proxy no shutdown is configured. Similarly, the MLD proxy flag is set if mld-snooping>evpn-proxy no shutdown is configured.

    • stops treating EVPN-MPLS destinations as an Mrouter port, when used in an EVPN-MPLS service

    • enables EVPN proxy (IGMP or MLD snooping must be shut down)
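The proxy-bit signaling can be illustrated with the following sketch. The type, sub-type, and flag bit positions are assumptions based on RFC 9251's multicast flags extended community, not SR OS internals; verify the exact encoding against the RFC:

```python
# Assumed constants based on RFC 9251's multicast flags extended community
EC_TYPE_EVPN = 0x06
EC_SUBTYPE_MULTICAST_FLAGS = 0x09
FLAG_IGMP_PROXY = 0x0001   # set when igmp-snooping>evpn-proxy is enabled
FLAG_MLD_PROXY = 0x0002    # set when mld-snooping>evpn-proxy is enabled

def multicast_flags_ec(igmp_proxy: bool, mld_proxy: bool) -> bytes:
    """Build the 8-byte extended community carried with the IMET route."""
    flags = (FLAG_IGMP_PROXY if igmp_proxy else 0) | \
            (FLAG_MLD_PROXY if mld_proxy else 0)
    return (bytes([EC_TYPE_EVPN, EC_SUBTYPE_MULTICAST_FLAGS])
            + flags.to_bytes(2, "big") + bytes(4))  # 4 reserved octets

def peer_supports_evpn_proxy(ec: bytes, mld: bool = False) -> bool:
    """A remote PE is treated as EVPN-proxy capable if the proxy bit is set."""
    flags = int.from_bytes(ec[2:4], "big")
    return bool(flags & (FLAG_MLD_PROXY if mld else FLAG_IGMP_PROXY))
```

A PE that receives an IMET route without this EC, or with the EC but without the proxy bit, is treated as a non-proxy PE for backward compatibility.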

When the VPLS service is configured as an EVPN proxy service, IGMP or MLD queries or reports are no longer forwarded to EVPN destinations of PEs that support EVPN proxy. The reports are also no longer processed when received from PEs that support EVPN proxy.

The IGMP or MLD snooping function works in the following manner when the evpn-proxy command is enabled:

  • IGMP or MLD works in proxy mode despite its configuration as IGMP or MLD snooping.

  • Received IGMP or MLD join or leave messages on SAP or SDP bindings are processed by the proxy database to summarize the IGMP or MLD state in the service based on the group joined (each join for a group lists all sources to join). The proxy database can be displayed as follows.

    # show service id 4000 igmp-snooping proxy-db 
     
    ===============================================================================
    IGMP Snooping Proxy-reporting DB for service 4000
    ===============================================================================
    Group Address    Mode     Up Time        Num Sources
    -------------------------------------------------------------------------------
    239.0.0.1        exclude  0d 00:53:00    0
    239.0.0.2        include  0d 00:53:01    1
    -------------------------------------------------------------------------------
    Number of groups: 2
    ===============================================================================
    
  • When evpn-proxy is enabled, an additional EVPN proxy database is created to hand the version flags over to BGP and generate the SMET routes with the proper IGMP or MLD version flags. This EVPN proxy database is populated with local reports received on SAP or SDP binds but not with received SMET routes (the regular proxy database includes reports from SMETs too, without the version). The EVPN proxy database can be displayed as follows:

    # show service id 4000 igmp-snooping evpn-proxy-db 
     
    ===============================================================================
    IGMP Snooping Evpn-Proxy-reporting DB for service 4000
    ===============================================================================
    Group Address    Mode     Up Time        Num Sources    V1   V2   V3
    -------------------------------------------------------------------------------
    239.0.0.1        exclude  0d 00:53:55    0                        V3
    239.0.0.2        include  0d 00:53:55    1                        V3
    -------------------------------------------------------------------------------
    Number of groups: 2
    ===============================================================================
    
  • The EVPN proxy database and the regular proxy database process IGMP or MLD reports as follows:

    • The EVPN proxy database result is communicated to the EVPN layer so that the corresponding SMET routes and flags are sent to the BGP peers. If multiple versions exist on the EVPN proxy database, multiple flags are set in the SMET routes.

    • The regular proxy database result is conveyed to the local Mrouter ports on SAP or SDP binds by IGMP or MLD reports; these reports are never sent to EVPN destinations of PEs with evpn-proxy configured.

  • IGMP or MLD messages received on local SAP or SDP bind Mrouter ports (which have a default (*,*) entry) and queries are not processed by the proxy database. Instead, they are forwarded to local SAP or SDP binds but never to EVPN destinations of PEs with evpn-proxy configured (they are, however, still sent to non-EVPN proxy PEs).

  • IGMP or MLD reports or queries are not received from EVPN PEs with evpn-proxy configured, but they are received and processed from EVPN PEs with no evpn-proxy configured. A PE determines whether a specific remote PE in the same BD supports EVPN proxy based on the igmp-proxy and mld-proxy flags received with the IMET routes.

  • The Layer-2 MFIB OIF list for an (S,G) is built out of the local IGMP or MLD reports and remote SMET routes.

    • For backward compatibility, PEs that advertise IMET routes without the multicast flags EC, or with the EC but without the proxy bit set, are considered Mrouters. That is, the EVPN binds to those PEs are added to all OIF lists, and reports are sent to them.

    • Even if MLD snooping is shut down and only IGMP snooping is enabled, the MFIB shows the EVPN binds added to *,* for MAC scope. If MLD snooping is enabled, the EVPN binds are not added as Mrouter ports for MAC scope.

  • When SMET routes are received for a specific (S,G), the corresponding reports are sent to local SAP or SDP binds connected to queriers. The report version is set based on the local version of the querier.
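The interplay between the two databases described above can be sketched as follows. This is a simplified illustrative model, not the actual SR OS data structures: local reports populate both databases (the EVPN proxy database additionally tracking the IGMP version), while state learned from remote SMET routes enters only the regular proxy database:

```python
class ProxyDatabases:
    """Illustrative model of the two report databases: the regular
    proxy DB summarizes all state; the EVPN proxy DB holds only
    locally received reports and tracks the version flags used to
    build the SMET routes."""

    def __init__(self):
        self.proxy_db = {}       # group -> {"mode": ..., "sources": set}
        self.evpn_proxy_db = {}  # group -> {"mode": ..., "sources": set, "versions": set}

    def add_report(self, group, mode, sources, version, local):
        entry = self.proxy_db.setdefault(group, {"mode": mode, "sources": set()})
        entry["sources"] |= set(sources)
        if local:  # SMET-learned state never enters the EVPN proxy DB
            e = self.evpn_proxy_db.setdefault(
                group, {"mode": mode, "sources": set(), "versions": set()})
            e["sources"] |= set(sources)
            e["versions"].add(version)

db = ProxyDatabases()
db.add_report("239.0.0.1", "exclude", [], "V3", local=True)
db.add_report("239.0.0.2", "include", ["10.0.0.2"], "V3", local=True)
db.add_report("239.0.0.3", "include", ["10.0.0.3"], None, local=False)  # from SMET
```

After these three reports, the proxy database holds all three groups, while the EVPN proxy database holds only the two locally learned groups with their V3 flags, matching the show output above.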

The IGMP or MLD EVPN proxy functionality is supported in VPLS services with EVPN-VXLAN or EVPN-MPLS, and along with ingress replication or mLDP provider-tunnel trees.

In addition, EVPN proxy VPLS services support EVPN multihoming with multicast state synchronization using EVPN route types 7 and 8. No additional command is needed to trigger the advertisement and processing of the multicast sync routes. In VPLS services, BGP sync routes are advertised or processed whenever the evpn-proxy command is enabled and there is a local Ethernet segment in the service. See EVPN OISM and multihoming for more information about the EVPN multicast synchronization routes and state synchronization in Ethernet segments.

Selective Provider Tunnels in OISM and EVPN-proxy services

Selective Provider Tunnels (S-PMSI)

Selective Provider Tunnels or Selective Provider Multicast Service Interface (S-PMSI) tunnels are supported in R-VPLS services configured in Optimized Inter-Subnet Multicast (OISM) mode or VPLS services configured in evpn-proxy mode.

Selective Provider Tunnels are signaled using the EVPN Selective Provider Multicast Service Interface Auto-Discovery (S-PMSI A-D) route, or EVPN route type 10. SR OS supports two types of Selective Provider Tunnels:

  • mLDP wildcard S-PMSI trees, which are used to optimize the delivery of multicast traffic and forward it only to PEs with IP multicast sources or receivers. Wildcard S-PMSIs are enabled by the following command.
    configure service vpls provider-tunnel selective wildcard-spmsi
  • mLDP specific S-PMSI trees for (S,G) or (*,G) groups, which are used to optimize the delivery of multicast groups that have receivers in only a limited number of PEs. (S,G) and (*,G) S-PMSIs are enabled by configuring the following command.
    configure service vpls provider-tunnel selective data-threshold

The configuration of mLDP S-PMSIs for EVPN is similar to that of mLDP S-PMSIs for MVPN. A data-threshold is configured for a group-address and mask. When the threshold (configured in kbps) for a group contained in the group-address and mask is exceeded, the router sets up a selective provider tunnel, and the PEs with receivers for that group join the mLDP selective tree. Options to set up the S-PMSIs based on the number of interested PEs are also supported, as well as a maximum-p2mp-spmsi parameter that limits the number of S-PMSI trees created per service.
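The rate-based trigger can be sketched as follows. This is an illustrative model (not SR OS code), assuming a per-group measured rate in kbps:

```python
def spmsi_needed(measured_kbps, threshold_kbps):
    """An S-PMSI is signaled for an (S,G) or (*,G) once its measured
    rate exceeds the configured data-threshold; a threshold of 0
    switches the group to a selective tree as soon as it is learned."""
    return threshold_kbps == 0 or measured_kbps > threshold_kbps

print(spmsi_needed(67.2, 10))  # True: stream exceeds 10 kbps
print(spmsi_needed(5, 10))     # False: stays on the inclusive tree
```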

BGP-EVPN S-PMSI A-D route

The EVPN Selective Provider Multicast Service Interface Auto-Discovery route or simply S-PMSI A-D route or route type 10 is required to advertise:

  • Wildcard PMSI routes to setup mLDP IP multicast trees
  • Selective S-PMSI routes to setup mLDP selective IP multicast trees.

The S-PMSI A-D route is specified in draft-ietf-bess-evpn-bum-procedure-updates and the format is depicted in S-PMSI A-D route format:

Figure 99. S-PMSI A-D route format

Where:

  • All the fields are considered part of the route key for BGP processing.
  • When a service is configured to advertise wildcard S-PMSI routes, a route type 10 is advertised with the Source and Group fields set to all zeros. Otherwise, the Source and Group fields are populated as in the other multicast routes.
  • The S-PMSI A-D route above is only supported along with tunnel type mLDP (in the Provider Tunnel Attribute). No other tunnel types are supported.
  • While in VPLS evpn-proxy services the S-PMSI A-D routes are advertised using the route distinguisher and route target of the service, in OISM mode the S-PMSI A-D routes are advertised from the SBD or from an ordinary BD:
    • When advertised from an ordinary BD, the route includes the route target (and route distinguisher) of the BD where the selective tree is configured, plus the SBD route target.
    • When advertised from the SBD, the route includes the SBD route target only. This is only required when the PE is a MEG/PEG.
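The route-target attachment rules above can be sketched with the following hypothetical helper (the RT strings are placeholders):

```python
def spmsi_route_targets(advertised_from, bd_rt, sbd_rt):
    """Route targets carried by the S-PMSI A-D route in OISM mode:
    an ordinary BD attaches its own RT plus the SBD RT; the SBD
    itself (MEG/PEG case) attaches the SBD RT only."""
    if advertised_from == "sbd":
        return [sbd_rt]
    return [bd_rt, sbd_rt]

print(spmsi_route_targets("bd", "target:64500:1", "target:64500:6002"))
```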

Wildcard Selective Provider tunnels

As per draft-ietf-bess-evpn-irb-mcast, wildcard S-PMSI tunnels allow PEs to decouple the trees used for BUM traffic from the trees used for IP multicast. If wildcard S-PMSIs are enabled, BUM I-PMSI tunnels are not used to send IP multicast traffic.
Note: Per draft-ietf-bess-evpn-irb-mcast: "Note that this will cause all the BUM traffic from a given BD in a Tenant Domain to be sent to all PEs that attach to that Tenant Domain, even the PEs that don't attach to the given BD. To avoid this, it is RECOMMENDED that the BUM tunnels not be used as IP Multicast inclusive tunnels, ..."

The wildcard S-PMSI A-D route is supported in OISM and VPLS evpn-proxy modes.

  • An mLDP tree can be configured as selective wildcard-spmsi in all the R-VPLS services of the tenant, including the SBD.
  • An mLDP tree can be configured as selective wildcard-spmsi in VPLS services as long as evpn-proxy is enabled.
  • The selective provider-tunnel configuration is blocked on services where neither evpn-proxy nor OISM is enabled.
Wildcard S-PMSI A-D route in OISM illustrates an example for OISM:
Figure 100. Wildcard S-PMSI A-D route in OISM
  • Based on the configuration of the following command, PE1 signals a wildcard S-PMSI A-D route for BD1 (in addition to the IMET routes, as in the regular OISM case or the EVPN proxy case). The route contains the SBD-RT (SBD's route target) in addition to the BD1-RT (BD1's route target).
    configure service vpls provider-tunnel selective wildcard-spmsi
  • PE2 and PE3 import the route as they would a BD1 IMET route in OISM mode. PE2 and PE3 join the wildcard S-PMSI mLDP tree if selective provider tunnels are enabled with the following command and they have local receivers that issued an IGMP/MLD join. A PE does not join the wildcard S-PMSI if no local receivers are joined.
    • MD-CLI
      configure service vpls provider-tunnel selective admin-state enable
    • classic CLI
      configure service vpls provider-tunnel selective no shutdown
  • The impact of this procedure is twofold:
    • PE1 now uses the wildcard S-PMSI mLDP tree for IP multicast traffic. The IP multicast traffic is delivered only to the downstream PEs that joined the wildcard S-PMSI tree, and not to the rest of the PEs of the tenant. Note that, in MVPN, the wildcard-spmsi does not carry traffic (the route does not even contain a PTA). In EVPN, the wildcard-spmsi carries IP multicast, and the route is advertised with an mLDP PTA.
    • PE1 now sends BUM traffic only to the PEs attached to the source BD, for example PE2, and not to PE3, while still using mLDP for multicast traffic. Without wildcard-spmsis, using mLDP for multicast also required using it for BUM traffic, which meant BUM traffic was attracted by PE3 as well in the example above.
  • The wildcard-spmsi is used for multicast, and the BUM EVPN destinations can be used for BUM. Note that PE1's EVPN SBD destination to PE3 is of type multicast ('m'), so it is not used for BUM.

Inclusive and Selective mLDP Provider Tunnels are not simultaneously supported in the same service.
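The split between the two traffic types can be sketched as follows. This is an illustrative model; the 'm' bind type comes from the example above, while the 'bum' label and PE names are placeholders:

```python
def forwarding_set(traffic_type, evpn_dests, wildcard_tree="wildcard-spmsi"):
    """With wildcard S-PMSI enabled, routed IP multicast rides the
    selective mLDP tree only, while BUM still floods over EVPN
    destinations, skipping binds of type 'm' (multicast-only, such as
    PE1's SBD destination to PE3), which never carry BUM."""
    if traffic_type == "ip-multicast":
        return [wildcard_tree]
    return [pe for pe, bind_type in evpn_dests if bind_type != "m"]

dests = [("PE2", "bum"), ("PE3", "m")]
print(forwarding_set("bum", dests))           # ['PE2']
print(forwarding_set("ip-multicast", dests))  # ['wildcard-spmsi']
```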

(S,G) and (*,G) Selective Provider Tunnels

SR OS supports mLDP Selective Provider Tunnels for specific (S,G) or (*,G) trees. Selective Provider Tunnels for OISM illustrates an example for OISM services.
Figure 101. Selective Provider Tunnels for OISM
  • PE1 may use wildcard-spmsi or regular inclusive forwarding for IP multicast traffic. In the example, PE1 uses wildcard-spmsi.
  • PE2 and PE3 are configured with the following command and therefore join the wildcard-spmsi tree.
    • MD-CLI
      configure service vpls provider-tunnel selective admin-state enable
    • classic CLI
      configure service vpls provider-tunnel selective no shutdown
  • Because PE2 receives a local IGMP join (*,G1), PE2 triggers an SMET (*,G1) route that creates an MFIB entry for (*,G1) on PE1.
  • PE1 is configured with a threshold in kbps for G1 and starts polling statistics for all MFIB entries that include G1. When the configured kbps threshold (and optionally, the number of PEs for an (S,G)) is exceeded, PE1 signals an S-PMSI A-D route for the (*,G) and, after the delay-interval, starts using the new tree for the (S,G).
  • If PE1 receives an SMET (S,G) route, it generates an S-PMSI A-D route for (S,G) instead.
  • If both SMET routes are received, for example (*,G1) and (S,G1), both S-PMSI types are generated with different mLDP tree information (in that way, a receiver interested only in (S,G) does not attract (*,G) traffic).
  • Interested PEs with local receivers for the (S,G) join the new tree. In the example, only PE2 joins the S-PMSI tree, because it is the only PE with a local MFIB entry for (*,G1).
While the example uses a (*,G) S-PMSI, (S,G) S-PMSIs are possible too. The use of S-PMSIs is configured as follows:
*A:PE-2>config>service>vpls>provider-tunnel# tree detail
selective
|
+---data-delay-interval <seconds>
| no data-delay-interval
|
+---mldp
| no mldp
|
+---wildcard-spmsi
| no wildcard-spmsi
|
+---data-threshold {<c-grp-ip-addr/mask>|<c-grp-ip-addr> <netmask>} <s-pmsi-threshold> [pe-threshold-add <pe-threshold-add>] [pe-threshold-delete <pe-threshold-delete>]
data-threshold <c-grp-ipv6-addr/prefix-length> <s-pmsi-threshold> [pe-threshold-add <pe-threshold-add>] [pe-threshold-delete <pe-threshold-delete>]
no data-threshold {<c-grp-ip-addr/mask>|<c-grp-ip-addr> <netmask>}
no data-threshold <c-grp-ipv6-addr/prefix-length>
|
+---maximum-p2mp-spmsi <range>
| no maximum-p2mp-spmsi
|
+---no shutdown
| shutdown

Where:

  • The selective container and the commands above are supported in VPLS services in evpn-proxy mode and R-VPLS services in OISM mode, in particular, in all ordinary BDs and the SBD of MEG/PEG nodes.
  • group-address/mask — specifies an IPv4 or IPv6 multicast group address and netmask length. Multiple group-address/masks can be specified. In case of overlapping ranges, for a given group, only the longest prefix match is used. For instance, if the following two overlapping ranges are configured and an SMET route for (*,232.0.1.1) is received, the S-PMSI tree for (*,232.0.1.1) is created only when the bandwidth threshold exceeds 10 kbps.
    *A:PE-4>config>service>vpls>pt>selective$ info
    ----------------------------------------------
    data-threshold 232.0.0.0/16 0
    data-threshold 232.0.1.0/24 10
    ----------------------------------------------
  • s-pmsi-threshold — specifies the rate in kbps. If the rate for a given (S,G) or (*,G) within the specified group range exceeds the threshold, traffic for the (S,G) or (*,G) included in the group range is switched to the selective provider tunnel. Threshold 0 is also supported for mLDP. When threshold 0 is configured, the (S,G) or (*,G) switches to the S-PMSI as soon as it is learned in the SBD/BD.
  • pe-threshold-add — specifies the number of receiver PEs for creating the S-PMSI. When the number of receiver PEs for a given multicast group configuration is non-zero and below this value, and the bandwidth threshold is satisfied, the S-PMSI is created. The number of receiver PEs is derived from the SMET count (of routes included in the group range) for the SBD/BD. The originator IP of the SMET route is checked so that the same PE is not counted multiple times. For example, for a (*,G1) S-PMSI set up by PE1, if PE2 has a local receiver for (S1,G1) and another one for (S2,G1), PE2 issues two SMET routes. However, those are received by PE1 with the same originator IP and therefore count as one PE. The pe-threshold-add command dictates when to bring back the S-PMSI tunnel after the receiver PE count has hit pe-threshold-delete and the S-PMSI tunnel for the group has been deleted. It has no implication on when to set up the S-PMSI tunnel initially, because the router always waits for the s-pmsi-threshold to be exceeded.
  • pe-threshold-delete — specifies the number of receiver PEs needed to delete the S-PMSI. When the number of receiver PEs for a given multicast group configuration is above the threshold, the S-PMSI is deleted and the multicast group is moved to ingress replication EVPN destinations, to a wildcard-spmsi if configured, or potentially to a (*,G) P2MP tree if the MFIB was previously using an (S,G) S-PMSI. It is recommended that the delete threshold be significantly larger than the add threshold, to avoid re-signaling of the S-PMSI as the receiver PE count fluctuates.
  • Note that the threshold add/delete commands are based on SMET route counts, which do not always match the number of receivers in the network for a specific (*,G) or (S,G). For instance:
    • SMET routes may be received from PEs without S-PMSI support enabled. These routes are counted; however, the receivers on those PEs do not get the traffic, because the PEs do not support S-PMSI trees.
    • SMET routes from a PE can be aggregated; for example, local (*,G) and (S1..Sn,G) state is aggregated into a single (*,G) SMET route. That does not provide a clear indication of the number of receivers for a specific group on the root PE.
  • Examples of how these thresholds work are shown in the following tables, assuming pe-threshold-add 2 and pe-threshold-delete 5:
    Table 6. Receiver PE count rising thresholds (PE count based on SMET routes)
    0→1: Selective (PE count < pe-threshold-add). The S-PMSI is used to carry traffic.
    1→2: Selective (PE count < pe-threshold-delete). Traffic remains on the S-PMSI.
    2→3: Selective (PE count < pe-threshold-delete). Traffic remains on the S-PMSI.
    3→4: Selective (PE count < pe-threshold-delete). Traffic remains on the S-PMSI.
    4→5: wildcard-spmsi if it exists, or EVPN destinations (Ingress Replication) (PE count = pe-threshold-delete). Traffic is switched back to the wildcard-spmsi if it exists, or to IR otherwise; or potentially to a (*,G) S-PMSI if the MFIB was using one before moving to the (S,G) S-PMSI.
    Table 7. Receiver PE count falling thresholds (PE count based on SMET routes)
    5: wildcard-spmsi if it exists, or EVPN destinations (Ingress Replication). Traffic flows on the wildcard-spmsi if it exists, or on IR; or potentially on a (*,G) S-PMSI if the SMET routes are for (S,G).
    5→4: wildcard-spmsi if it exists, or EVPN destinations (Ingress Replication) (PE count > pe-threshold-add). Traffic remains on the wildcard-spmsi if it exists, or on IR; or potentially on a (*,G) S-PMSI if the SMET routes are for (S,G).
    4→3: wildcard-spmsi if it exists, or EVPN destinations (Ingress Replication) (PE count > pe-threshold-add). Traffic remains on the wildcard-spmsi if it exists, or on IR; or potentially on a (*,G) S-PMSI if the SMET routes are for (S,G).
    3→2: Selective (PE count = pe-threshold-add). The S-PMSI is re-signaled and traffic is switched to it.
    2→1: Selective. Traffic flows on the S-PMSI.
  • maximum-p2mp-spmsi — determines the maximum number of originating S-PMSI tunnels in the service (including the wildcard-spmsi). This limit is not validated against the total number of P2MP tunnels supported in the system.
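The longest-prefix-match behavior for overlapping data-threshold ranges can be sketched as follows, reusing the overlapping-range example above (illustrative model, not SR OS code):

```python
import ipaddress

# Assumed config mirroring the overlapping-range example above
DATA_THRESHOLDS = {
    "232.0.0.0/16": 0,
    "232.0.1.0/24": 10,
}

def threshold_for(group):
    """Longest-prefix match over the configured data-threshold ranges:
    only the most specific range covering the group applies."""
    addr = ipaddress.ip_address(group)
    matches = [ipaddress.ip_network(n) for n in DATA_THRESHOLDS
               if addr in ipaddress.ip_network(n)]
    if not matches:
        return None
    best = max(matches, key=lambda n: n.prefixlen)
    return DATA_THRESHOLDS[str(best)]

print(threshold_for("232.0.1.1"))  # 10: the /24 wins over the /16
print(threshold_for("232.0.2.1"))  # 0: only the /16 matches
```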

Other parameters are configured as in the provider-tunnel inclusive context.
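The PE-count hysteresis in the tables above can be sketched as follows, assuming pe-threshold-add 2 and pe-threshold-delete 5 (an illustrative model; the zero-count behavior is an assumption):

```python
def next_tunnel_state(on_spmsi, pe_count, add_thr=2, del_thr=5):
    """Hysteresis from Table 6 and Table 7: the group leaves the
    S-PMSI when the receiver PE count reaches the delete threshold and
    returns once it falls back to the add threshold; between the two
    thresholds the current tunnel is kept (True = S-PMSI in use,
    False = wildcard-spmsi or ingress replication)."""
    if pe_count == 0:
        return False           # assumed: no receiver PEs, no selective tree
    if pe_count >= del_thr:
        return False           # switch back to wildcard-spmsi or IR
    if pe_count <= add_thr:
        return True            # S-PMSI (re-)signaled
    return on_spmsi            # hysteresis band: no change

# Walk the rising then falling sequences from Table 6 and Table 7
state = False
for count in [1, 2, 3, 4, 5, 4, 3, 2, 1]:
    state = next_tunnel_state(state, count)
```

The wide band between the two thresholds is what the recommendation above aims at: with add 2 and delete 5, a count oscillating between 3 and 4 never re-signals the tunnel.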

Selective Provider Tunnels are also supported in MEG/PEG gateways. In a MEG/PEG scenario, when the source is attached to an OISM PE, the PE may not use the S-PMSI tree for a given (x,G) if the only OIF is the MEG/PEG Designated Router (DR). This is because of the way the implementation handles wildcard SMET versus specific SMET routes in the MFIB (a specific SMET route does not create an entry if there is a wildcard SMET route from the same PE). Suppose MEG1 and MEG2 are the two MEGs between an OISM and an MVPN network, where the receivers are in the MVPN network and the source is in the OISM domain. In that case:

  • An OISM PE would create only a (*,*) entry if it received a wildcard SMET route and an (S,G) SMET route from the MEG DR. Therefore, even if the threshold for (S,G) is exceeded, the OISM PE still uses the wildcard S-PMSI as opposed to the more specific S-PMSI.
  • In addition, the same OISM PE would create an OIF to the non-DR MEG and an (S,G) entry (with OIFs to the two MEGs) if it received an (S,G) SMET route from the non-DR MEG.
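The SMET handling that causes this behavior can be sketched as follows (an illustrative model of the MFIB entry creation, with hypothetical PE names):

```python
def mfib_entries(smets_by_pe):
    """Model of the behavior described above: a specific (S,G) SMET
    route from a PE does not create its own entry when a wildcard
    SMET route from the same PE already exists, so the root keeps
    using the wildcard S-PMSI toward that PE (here, the MEG DR)."""
    entries = {}
    for pe, smets in smets_by_pe.items():
        if ("*", "*") in smets:
            entries[pe] = {("*", "*")}   # wildcard masks the specific SMET
        else:
            entries[pe] = set(smets)
    return entries

print(mfib_entries({"MEG-DR": {("*", "*"), ("S1", "G1")},
                    "MEG-nonDR": {("S1", "G1")}}))
```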

Configuration examples for selective provider tunnels

The following sections provide example configurations for selective provider tunnels in EVPN proxy and OISM services.

Use of Selective Provider Tunnels in EVPN proxy services
Suppose PE2, PE3 and PE4 are attached to the same VPLS service (named "evpn-proxy-bd-10k") that is configured in EVPN proxy mode. At the same time, PE2 and PE3 are multihomed to the same CE1. The PEs are configured as follows:
// PE2 

[ex:/configure service vpls "evpn-proxy-bd-10k"]
A:admin@PE-2# info
admin-state enable
service-id 10000
customer "1"
bgp 1 {
}
igmp-snooping {
  admin-state enable
  evpn-proxy {
    admin-state enable
  }
}
bgp-evpn {
  evi 10000
  routes {
    sel-mcast {
      advertise true
    }
  }
  mpls 1 {
    admin-state enable
    ingress-replication-bum-label true
    ecmp 2
    auto-bind-tunnel {
      resolution any
    }
  }
}
sap lag-1:100 {
  igmp-snooping {
    send-queries true
  }
}
provider-tunnel {
  selective {
    admin-state enable
    owner bgp-evpn-mpls
    wildcard-spmsi true
    mldp true
    data-threshold {
      group-prefix 224.0.0.0/4 {
        threshold 0
      }
    }
  }
}

// PE3

[ex:/configure service vpls "evpn-proxy-bd-10k"]
A:admin@PE-3# info
admin-state enable
service-id 10000
customer "1"
bgp 1 {
}
igmp-snooping {
  admin-state enable
  evpn-proxy {
    admin-state enable
  }
}
bgp-evpn {
  evi 10000
  routes {
    sel-mcast {
      advertise true
    }
  }
  mpls 1 {
    admin-state enable
    ingress-replication-bum-label true
    ecmp 2
    auto-bind-tunnel {
      resolution any
    }
  }
}
sap lag-1:100 {
  igmp-snooping {
    send-queries true
  }
}
provider-tunnel {
  selective {
    admin-state enable
    owner bgp-evpn-mpls
    wildcard-spmsi true
    mldp true
    data-threshold {
      group-prefix 224.0.0.0/4 {
        threshold 0
      }
    }
  }
}

// PE4


[ex:/configure service vpls "evpn-proxy-bd-10k"]
A:admin@PE-4# info
admin-state enable
service-id 10000
customer "1"
bgp 1 {
}
igmp-snooping {
  admin-state enable
  evpn-proxy {
    admin-state enable
  }
}
bgp-evpn {
  evi 10000
  routes {
    sel-mcast {
      advertise true
    }
  }
  mpls 1 {
    admin-state enable
    ingress-replication-bum-label true
    ecmp 2
    auto-bind-tunnel {
      resolution any
    }
  }
}
sap pxc-6.a:100 {
}
provider-tunnel {
  selective {
    admin-state enable
    data-delay-interval 5
    owner bgp-evpn-mpls
    wildcard-spmsi true
    mldp true
    data-threshold {
      group-prefix 224.0.0.0/4 {
        threshold 0
      }
      group-prefix 239.0.0.0/8 {
        threshold 1
      }    
    }
  }
}
Assume a source with IP address 10.0.0.4 connected to PE4 starts sending multicast traffic to 239.0.0.4. PE4 detects the stream and, as soon as it exceeds the configured threshold (1 kbps), advertises an S-PMSI A-D route. Because PE2 and PE3 receive an IGMP join for (10.0.0.4,239.0.0.4), they advertise the corresponding SMET routes and the S-PMSI trees are set up:
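The data-threshold evaluation can be sketched in a few lines of Python. This is an illustrative model, not SR OS internals: it assumes the longest matching group-prefix selects the threshold, and that a stream triggers an S-PMSI A-D route when its rate exceeds that threshold in kbps.

```python
from ipaddress import ip_address, ip_network

def spmsi_triggered(group, rate_kbps, thresholds):
    """Return True if a detected (S,G) stream should trigger an S-PMSI
    A-D route. thresholds maps a group-prefix string to a threshold in
    kbps; the longest matching prefix wins (assumption)."""
    best = None
    for prefix, thr in thresholds.items():
        net = ip_network(prefix)
        if ip_address(group) in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, thr)
    if best is None:
        return False  # no group-prefix configured for this group
    return rate_kbps > best[1]

# PE4's data-threshold configuration from the example above
thresholds = {"224.0.0.0/4": 0, "239.0.0.0/8": 1}
print(spmsi_triggered("239.0.0.4", 77, thresholds))  # prints True
```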
// PE4 advertises the S-PMSI A-D route because the received stream exceeds 1 kbps:

A:PE-4# 
439 2023/02/07 19:25:13.370 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.6
"Peer 1: 192.0.2.6: UPDATE
Peer 1: 192.0.2.6 - Send BGP UPDATE:
Withdrawn Length = 0
Total Path Attr Length = 100
Flag: 0x90 Type: 14 Len: 38 Multiprotocol Reachable NLRI:
Address Family EVPN
NextHop len 4 NextHop 192.0.2.4
Type: EVPN-SPMSI-AD Len: 27 RD: 192.0.2.4:10000, tag: 0, Mcast-Src-Len: 
32, Mcast-Src-Addr: 10.0.0.4, Mcast-Grp-Len: 32, Mcast-Grp-Addr: 239.0.0.4, Orig Addr: 192.0.2.4/32 
Flag: 0x40 Type: 1 Len: 1 Origin: 0
Flag: 0x40 Type: 2 Len: 0 AS Path:
Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
Flag: 0xc0 Type: 16 Len: 16 Extended Community:
target:64500:10000
bgp-tunnel-encap:MPLS
Flag: 0xc0 Type: 22 Len: 22 PMSI:
Tunnel-type LDP P2MP LSP (2)
Flags: (0x0)[Type: None BM: 0 U: 0 Leaf: not required]
MPLS Label 0
Root-Node 192.0.2.4, LSP-ID 0x2008
"
3 2023/02/07 19:25:13.368 UTC MAJOR: SVCMGR #2320 Base 
"Service Id 10000, Dynamic vplsPmsi SDP Bind Id 32767:4294967285 was created."

Outputs of the MFIB and S-PMSIs in PE2

show service id "10000" mfib statistics 

===============================================================================
Multicast FIB Statistics, Service 10000
===============================================================================
Source Address   Group Address   Matched Pkts   Matched Octets
                                                Forwarding Rate
-------------------------------------------------------------------------------
10.0.0.4         239.0.0.4       11190           1096620
                                                 77.616 kbps
*                * (mac)         0               0
                                                 0.000 kbps
-------------------------------------------------------------------------------
Number of entries: 2
===============================================================================
show service id "10000" provider-tunnel spmsi-tunnels 
===============================================================================
LDP Spmsi Tunnels
===============================================================================
LSP ID : 8199 
Root Address : 192.0.2.2
S-PMSI If Index : 73750 
Num. Leaf PEs : 1 
Uptime : 0d 04:32:59 
Group Address : 239.0.0.4
Source Address : 10.0.0.4
Origin IP Address : 192.0.2.2
State : TX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
LSP ID : 8200 
Root Address : 192.0.2.3
S-PMSI If Index : 73748 
Num. Leaf PEs : 1 
Uptime : 0d 04:33:02 
Group Address : 239.0.0.4
Source Address : 10.0.0.4
Origin IP Address : 192.0.2.3
State : RX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
LSP ID : 8200 
Root Address : 192.0.2.4
S-PMSI If Index : 73754 
Num. Leaf PEs : 1 
Uptime : 0d 00:00:32 
Group Address : 239.0.0.4
Source Address : 10.0.0.4
Origin IP Address : 192.0.2.4
State : RX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
LSP ID : 8197 
Root Address : 192.0.2.2
S-PMSI If Index : 73733 
Uptime : 0d 04:32:59 
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.2
State : TX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
LSP ID : 8198 
Root Address : 192.0.2.3
S-PMSI If Index : 73747 
Uptime : 0d 04:33:02 
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.3
State : RX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
LSP ID : 8197 
Root Address : 192.0.2.4
S-PMSI If Index : 73746 
Uptime : 0d 04:33:02 
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.4
State : RX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
===============================================================================
tools dump service id "10000" provider-tunnels type terminating
===============================================================================
VPLS 10000 Inclusive Provider Tunnels Terminating 
===============================================================================
ipmsi (LDP) P2MP-ID Root-Addr
-------------------------------------------------------------------------------
               8197 192.0.2.4 
               8198 192.0.2.3   
               8200 192.0.2.3 
               8200 192.0.2.4 
-------------------------------------------------------------------------------
===============================================================================
VPLS 10000 Selective Provider Tunnels Terminating 
===============================================================================
spmsi (LDP)   Source-Addr   Group-Addr   Root-Addr   LSP-ID   Lsp-Name
-------------------------------------------------------------------------------
              10.0.0.4       239.0.0.4   192.0.2.3   8200 
              10.0.0.4       239.0.0.4   192.0.2.4   8200
              *              *           192.0.2.3   8198 
              *              *           192.0.2.4   8197 
-------------------------------------------------------------------------------

Outputs in PE4

===============================================================================
Multicast FIB Statistics, Service 10000
===============================================================================
Source Address     Group Address     Matched Pkts     Matched Octets
                                                      Forwarding Rate
-------------------------------------------------------------------------------
*                   *                 0               0
                                                      0.000 kbps
10.0.0.4            239.0.0.4         31484           3211368
                                                      57.691 kbps
*                   * (mac)           0               0
                                                      0.000 kbps
-------------------------------------------------------------------------------
Number of entries: 3
===============================================================================
tools dump service id 10000 provider-tunnels type originating 
===============================================================================
VPLS 10000 Inclusive Provider Tunnels Originating 
===============================================================================
No Tunnels Found 
-------------------------------------------------------------------------------
===============================================================================
VPLS 10000 Selective Provider Tunnels Originating 
===============================================================================
spmsi (LDP) Source-Addr     Group-Addr   Root-Addr   LSP-ID   Lsp-Name
-------------------------------------------------------------------------------
            10.0.0.4        239.0.0.4    192.0.2.4   8200     8200 
            *               *            192.0.2.4   8197     8197 
-------------------------------------------------------------------------------
Use of Selective Provider Tunnels in OISM services
Suppose an OISM network exists on PE2, PE3, and PE4. The three PEs are configured with VPRN "oism-vprn-20000" and use the SBD "SBD20001"; however, PE2 and PE3 are attached to the same ordinary BD "BD20023", whereas PE4 is attached to the ordinary BD "BD20004". In this example, a source with address 40.0.0.4 is connected to PE4's ordinary BD (and its stream triggers the setup of an S-PMSI), and wildcard S-PMSIs and S-PMSI thresholds are configured appropriately. Because the configurations of PE2 and PE3 are equivalent, only the configurations of PE2 and PE4 are shown:

// PE2's relevant configuration for OISM

[ex:/configure service]
A:admin@PE-2# info
vpls "BD20023" {
  admin-state enable
  service-id 20023
  customer "1"
  routed-vpls {
    multicast {
      ipv4 {
        forward-to-ip-interface true
      }
    }
  }
  bgp 1 {
  }
  igmp-snooping {
    admin-state enable
  }
  bgp-evpn {
    evi 20023
    mpls 1 {
      admin-state enable
      ingress-replication-bum-label true
      auto-bind-tunnel {
        resolution any
      }
    }
  }
  sap lag-1:200 {
  }
  provider-tunnel {
    selective {
      admin-state enable
      owner bgp-evpn-mpls
      wildcard-spmsi true
      mldp true
    }
  }
}
vpls "SBD20001" {
  admin-state enable
  service-id 20001
  customer "1"
  routed-vpls {
    multicast {
      ipv4 {
        forward-to-ip-interface true
      }
    }
  }
  bgp 1 {
  }
  igmp-snooping {
    admin-state enable
  }
  bgp-evpn {
    evi 20001
    routes {
      ip-prefix {
        advertise true
      }
      sel-mcast {
        advertise true
      }
    }
    mpls 1 {
      admin-state enable
      auto-bind-tunnel {
        resolution any
      }
    }
  }
  provider-tunnel {
    selective {
      admin-state enable
      owner bgp-evpn-mpls
      mldp true
    }
  }
}
vprn "oism-vprn-20000" {
  admin-state enable
  service-id 20000
  customer "1"
  ecmp 2
  igmp {
    interface "BD20023" {
    }
  }
  pim {
    apply-to all
    ipv4 {
      rpf-table both
    }
    interface "SBD20001" {
      multicast-senders always
    }
  }
  interface "BD20023" {
    ipv4 {
      primary {
        address 10.0.0.2
        prefix-length 24
      }
      neighbor-discovery {
        learn-unsolicited true
        proactive-refresh true
        host-route {
          populate dynamic {
          }
        }
      }
      vrrp 1 {
        backup [10.0.0.254]
        passive true
      }
    }
  }
}

// PE4 relevant configuration for OISM


[ex:/configure service]
A:admin@PE-4# info
vpls "BD20004" {
  admin-state enable
  service-id 20004
  customer "1"
  routed-vpls {
    multicast {
      ipv4 {
        forward-to-ip-interface true
      }
    }
  }
  bgp 1 {
  }
  igmp-snooping {
    admin-state enable
  }
  bgp-evpn {
    evi 20004
    mpls 1 {
      admin-state enable
      ingress-replication-bum-label true
      auto-bind-tunnel {
        resolution any
      }
    }
  }
  sap pxc-6.a:200 {
  }
  provider-tunnel {
    selective {
      admin-state enable
      owner bgp-evpn-mpls
      wildcard-spmsi true
      mldp true
      data-threshold {
        group-prefix 239.0.0.0/8 {
          threshold 0
        }
      }
    }
  }
}
vpls "SBD20001" {
  admin-state enable
  service-id 20001
  customer "1"
  routed-vpls {
    multicast {
      ipv4 {
        forward-to-ip-interface true
      }
    }
  }
  bgp 1 {
  }
  igmp-snooping {
    admin-state enable
  }
  bgp-evpn {
    evi 20001
    routes {
      ip-prefix {
        advertise true
      }
      sel-mcast {
        advertise true
      }
    }
    mpls 1 {
      admin-state enable
      auto-bind-tunnel {
        resolution any
      }
    }
  }
  provider-tunnel {
    selective {
      admin-state enable
      owner bgp-evpn-mpls
      mldp true
    }
  }
}
vprn "oism-vprn-20000" {
  admin-state enable
  service-id 20000
  customer "1"
  ecmp 2
  igmp {
    interface "BD20004" {
    }
  }
  pim {
    apply-to all
    ipv4 {
      rpf-table both
    }
    interface "SBD20001" {
      multicast-senders always
    }
  }
  interface "BD20004" {
    ipv4 {
      primary {
        address 40.0.0.1
        prefix-length 24
      }
      neighbor-discovery {
        learn-unsolicited true
        proactive-refresh true
        host-route {
          populate dynamic {
          }
        }
      }
    }
    vpls "BD20004" {
      evpn {
        arp {
          learn-dynamic false
          advertise dynamic {
          }
        }
      }
    }
  }
  interface "SBD20001" {
    mac 00:00:00:00:00:04
    vpls "SBD20001" {
      evpn-tunnel {
        supplementary-broadcast-domain true
      }
    }
  }
}
With the above configuration on PE2, PE3 and PE4, PE4 advertises an S-PMSI A-D route for group (40.0.0.4,239.0.0.4), in addition to the wildcard-spmsi route:
show router bgp routes evpn spmsi-ad rd 192.0.2.4:20004 
===============================================================================
BGP Router ID:192.0.2.2 AS:64500 Local AS:64500 
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete


========================================================
BGP EVPN SPMSI AD Routes
========================================================
Flag   Route Dist.       Src Address
       Tag               Grp Address
                         Orig Address
--------------------------------------------------------
u*>i   192.0.2.4:20004   0.0.0.0
       0                 0.0.0.0
                         192.0.2.4
u*>i   192.0.2.4:20004   40.0.0.4
       0                 239.0.0.4
                         192.0.2.4
--------------------------------------------------------
Routes : 2
========================================================

The S-PMSI A-D route for 239.0.0.4 makes PE2 and PE3 join the selective mLDP tree set up by PE4. The multicast group is delivered over the S-PMSI tree. The following commands show the established S-PMSI trees (which are modeled as SDP binds at the service level and therefore consume SDP-bind resources). PE2 and PE3 join the S-PMSI tree for 239.0.0.4 on the SBD because they are not attached to the source ordinary BD. The traffic is received at Layer 3; therefore, the statistics are seen at the VPRN level and not at the MFIB level (as in the EVPN proxy case):
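The OISM leaf rule described above (a PE joins the S-PMSI on the source ordinary BD if locally attached to it, otherwise on the SBD) can be sketched as follows; the function and names are illustrative only:

```python
def join_bd(local_bds, source_bd, sbd):
    """Return the broadcast domain on which an OISM leaf PE joins the
    S-PMSI rooted at the source PE: the source ordinary BD when the PE
    is attached to it, otherwise the SBD (sketch of the rule above)."""
    return source_bd if source_bd in local_bds else sbd

# PE2/PE3 are attached to BD20023 only; the source sits in BD20004 on PE4
print(join_bd({"BD20023"}, "BD20004", "SBD20001"))  # prints SBD20001
```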

show service id "20001" provider-tunnel spmsi-tunnels detail
===============================================================================
LDP Spmsi Tunnels
===============================================================================
LSP ID : 8199 
Root Address : 192.0.2.4
S-PMSI If Index : 73752 
Num. Leaf PEs : 1 
Uptime : 0d 14:45:11 
Group Address : 239.0.0.4
Source Address : 40.0.0.4
Origin IP Address : 192.0.2.4
State : RX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
LSP ID : 8198 
Root Address : 192.0.2.4
S-PMSI If Index : 73751 
Uptime : 0d 14:45:11 
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.4
State : RX Joined 
Remain Delay Intvl : 0 
-------------------------------------------------------------------------------
===============================================================================
tools dump service id "20001" provider-tunnels type terminating

===============================================================================
VPLS 20001 Inclusive Provider Tunnels Terminating 
===============================================================================
No Tunnels Found 
-------------------------------------------------------------------------------
===============================================================================
VPLS 20001 Selective Provider Tunnels Terminating 
===============================================================================
spmsi (LDP)   Source-Addr    Group-Addr     Root-Addr   LSP-ID    Lsp-Name
-------------------------------------------------------------------------------
              40.0.0.4       239.0.0.4      192.0.2.4   8199 
              *              *              192.0.2.4   8198 
-------------------------------------------------------------------------------
show router "20000" pim group detail
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.4
Source Address : 40.0.0.4
RP Address : 0
Advt Router : 
Flags : Type : (S,G)
Mode : sparse 
MRIB Next Hop : 40.0.0.4
MRIB Src Flags : direct 
Keepalive Timer : Not Running 
Up Time : 0d 15:42:23 Resolved By : rtable-u

Up JP State : Joined                Up JP Expiry : 0d 00:00:14
Up JP Rpt : Not Joined StarG        Up JP Rpt Override : 0d 00:00:00

Register State : No Info 
Reg From Anycast RP: No 

Rpf Neighbor : 40.0.0.4
Incoming Intf : SBD20001
Outgoing Intf List : BD20023

Curr Fwding Rate : 67.200 kbps 
Forwarded Packets : 29945            Discarded Packets : 0
Forwarded Octets : 2515380           RPF Mismatches : 0
Spt threshold : 0 kbps               ECMP opt threshold : 7
Admin bandwidth : 1 kbps 
-------------------------------------------------------------------------------
Groups : 1
===============================================================================

EVPN-VPWS PW headend functionality

EVPN-VPWS is often used as an aggregation technology to connect access devices to the residential or business PE in the service provider network. The PE receives tagged traffic inside EVPN-VPWS circuits and maps each tag to a different service in the core, such as ESM services, Epipe services, or VPRN services.

SR OS implements this PW headend functionality by using PW ports that use multihomed Ethernet Segments (ESs) for redundancy. ESs can be associated with PW ports in two different modes of operation.

  • PW port-based ESs with multihoming procedures on PW SAPs
  • PW port-based ESs with multihoming procedures on stitching Epipe

PW port-based ESs with multihoming procedures on PW SAPs

PW ports in ESs and virtual ESs (vESs) are supported for EVPN-VPWS MPLS services. In addition to LAG, port, and SDP objects, a PW port ID can be configured in an Ethernet Segment. In this mode of operation, PW port-based ESs support only all-active configuration mode, not single-active configuration mode.

The following requirements apply:

  • Port-based or FPE-based PW ports can be used in ESs
  • PW port scenarios supported along with ESs are as follows:
    • port-based PW port
    • FPE-based PW port, where the stitching service uses a spoke SDP to the access CE
    • FPE-based PW port, where the stitching service uses EVPN-VPWS (MPLS) to the access CE

For all the preceding scenarios, fault propagation to the access CE only works in the case of physical failures. Administrative shutdown of individual Epipes, PW SAPs, ESs, or BGP-EVPN may result in traffic black holes.

The following figure shows the use of PW ports in ESs. In this example, an FPE-based PW port is associated with the ES, where the stitching service itself also uses EVPN-VPWS.

Figure 102. ES FPE-based PW port access using EVPN-VPWS

In this example, the following conditions apply:

  • Redundancy is driven by EVPN all-active multihoming. ES-1 is a virtual ES configured on the FPE-based PW port on PE-1 and PE-2.
  • The access network between the access PE (PE-A) and the network PEs (PE-1 and PE-2), uses EVPN-VPWS to backhaul the traffic. Therefore, PE-1 and PE-2 use EVPN-VPWS in the PW port stitching service, where:
    • PE-1 and PE-2 apply the same Ethernet tag configuration on the stitching service (Epipe 10)
    • Optionally PE-1 and PE-2 can use the same RD on the stitching service
    • AD per-EVI routes for the stitching service Ethernet tags are advertised with ESI=0
  • Forwarding in the CE-1 to CE-2 or CE-3 direction, works as follows:
    • PE-A forwards traffic based on the selection of the best AD per-EVI route advertised by PE-1 and PE-2 for the stitching Epipe 10. This selection is BGP-based if PE-1 and PE-2 use the same RD in the stitching service, or EVPN-based if different RDs are used.
    • When the PE-1 route is selected, PE-1 receives the traffic on the local PW-SAP for Epipe 1 or Epipe 2, and forwards it based on the customer EVPN-VPWS rules in the core.
  • Forwarding in the CE-2 or CE-3 to CE-1 direction, works as follows:
    • PE-3 forwards the traffic based on the configuration of ECMP and aliasing rules for Epipe 1 and Epipe 2.
    • For example, PE-3 can send the traffic to PE-2, and PE-2 forwards it to PE-A, so the two directions may follow different paths.
  • If the user needs the traffic to follow a symmetric path in both directions, then the AD per-EVI route selection on PE-A and PE-3 can be handled so that the same PE (PE-1 or PE-2) is selected for both directions.
  • For this example, the solution provides redundancy against node failures of PE-1 or PE-2. However, administrative shutdowns configured on some objects are not propagated to PE-A, leading to traffic blackholing. As a result, black holes may be caused by the following events on PE-1 or PE-2:
    • Epipe 1 or Epipe 2 service shutdown
    • Epipe 1 or Epipe 2 BGP-EVPN MPLS shutdown
    • vES-1 shutdown
    • BGP shutdown

PW port-based ESs with multihoming on stitching Epipe

The solution described in PW port-based ESs with multihoming procedures on PW SAPs provides PW-headend redundancy where the access PE selects one of the PW-headend PE devices based on BGP best path selection, and the traffic from the core to the access may follow an asymmetric path. This is because the multihoming procedures are actually run on the PW SAPs of the core services, and the AD per-EVI routes advertised in the context of the stitching Epipe use an ESI=0.

SR OS also supports a different mode of operation, called pw-port headend, which runs the multihoming procedures in the stitching Epipe and, therefore, uses regular EVPN-VPWS primary/backup signaling toward the access PE. This mode of operation is supported in single-active mode, as shown in the following figure.

Figure 103. ES FPE-based pw-port headend

The following configuration triggers the needed behavior:

// ES and stitching Epipe config

PE-1/2>config>service# info
  system
    bgp-evpn
      ethernet-segment "ES-1" create
        esi 00:12:12:12:12:12:12:12:12:12
        multi-homing single-active
        pw-port 1 pw-headend
        no shutdown

  epipe 300 name "stitching-300" customer 1 create
    pw-port 1 fpe 1 create
      no shutdown
    bgp-evpn
      local-attachment-circuit ac-23 eth-tag 23
      remote-attachment-circuit ac-1 eth-tag 1
      mpls bgp 1
        auto-bind-tunnel resolution any


// Services config

epipe 10
  sap pw-1:10 create
  bgp-evpn
    mpls bgp 1

epipe 11
  sap pw-1:10 create
  bgp-evpn
    mpls bgp 1

The configuration and functionality are divided into four aspects.

Configuration of single-active multihoming on ESs associated with PW ports of type pw-headend

In this mode, PW ports are associated with single-active, non-virtual Ethernet Segments. The pw-headend keyword is needed when associating the PW port with the ES.

PE-1/2>config>service# info
  system
    bgp-evpn
      ethernet-segment "ES-1" create
        esi 00:12:12:12:12:12:12:12:12:12
        multi-homing single-active
        pw-port 1 pw-headend
        no shutdown

The pw-port id pw-headend command indicates to the system that the multihoming procedures are run in the PW port stitching Epipe and that the routes advertised in the context of the stitching Epipe contain the ESI of the ES.

Configuration of the PW port stitching Epipe

A configuration example of the stitching Epipe follows.

epipe 300 name "stitching-300" customer 1 create
    pw-port 1 fpe 1 create
      no shutdown
    bgp-evpn
      local-attachment-circuit ac-23 eth-tag 23
      remote-attachment-circuit ac-1 eth-tag 1
      mpls bgp 1
        auto-bind-tunnel resolution any

The preceding example shows the configuration of a stitching EVPN-VPWS Epipe with MPLS transport; however, SRv6 transport is also supported.

When the ES is configured with a PW port in pw-headend mode, the stitching Epipe associated with the PW port runs the ES and DF election procedures. Therefore, the following actions apply:

  • an AD per-ES route is advertised with:
    • the RD or RT of the stitching Epipe
    • the configured ESI of the ES associated with the PW port
    • the ESI-label extended community with the multihomed mode indication and ESI label
  • an AD per EVI route is advertised with:
    • the RD or RT of the stitching Epipe
    • the configured ESI where the PW port resides
    • the P/B bits according to the DF election procedures
  • the non-DF brings the PW port operationally down with the flag MHStandby. As a result, all the PW SAPs contained in the PW port are brought operationally down. Optionally, the config>service>epipe>pw-port>oper-up-on-mhstandby command can be configured so that the PW port stays operationally up even if it is in MHStandby state (that is, the PE is non-DF). This command may speed up convergence when a significant number of PW SAPs are configured on the same PW port.
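The DF election that produces the MHStandby state can be sketched as follows, assuming RFC 7432 style modulo-based service carving (other DF election types exist); the PE originator addresses are illustrative:

```python
import ipaddress

def elect_df(candidate_pes, evi):
    """RFC 7432 style modulo-based service carving (sketch): order the
    candidate PEs by originator IP and pick entry (evi mod N)."""
    ordered = sorted(candidate_pes, key=lambda a: int(ipaddress.ip_address(a)))
    return ordered[evi % len(ordered)]

# Hypothetical originator addresses for PE-1 and PE-2
pes = ["192.0.2.1", "192.0.2.2"]
df = elect_df(pes, 300)  # DF for stitching Epipe 300
# In single-active mode, each non-DF holds its PW port down
# with the MHStandby flag.
mh_standby = {pe: pe != df for pe in pes}
```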

Configuration of the PW port-contained PW SAPs and edge services

The edge services that contain the PW SAPs of the pw-headend PW port are configured without any additional commands. These PW SAPs can be configured on Epipes, VPRN interfaces, subscriber interfaces, or VPLS services (capture SAPs). As an example, the following shows a PW SAP configured on an Epipe EVPN-VPWS service:

epipe 10
  sap pw-1:10 create
  bgp-evpn
    mpls bgp 1
The behavior of the PW SAPs when the PW port is configured with the pw-headend keyword is as follows:
  • The PW SAP is brought operationally down if the PW port is down. The PW port goes down with the reason MHStandby if the PE is a non-DF, or with reason stitching-svc-down if the EVPN destination is removed from the stitching Epipe.
  • If the PW SAP is configured in an EVPN-VPWS edge service as in the preceding example, the following actions are performed:
    • An AD per ES route is advertised for the EVPN-VPWS service with the RD or RT of the service Epipe, the configured ESI of the ES associated with the PW port, and the ESI-label extended community with the multihomed mode indication of the ES and ESI label (label is the same value as in the AD per ES for the stitching Epipe). If the PW port is only down because of the MHStandby flag, the AD per ES route for the Epipe service is still advertised.
    • In addition, an AD per EVI route is advertised with the RD or RT of the service Epipe, the configured ESI of the ES associated with the PW port, and the P/B flags of the ES:
      • P=1/B=0 on the DF
      • P=0/B=1 on backup
      • P=0/B=0 on non-DFs and non-backup
    • If the PW port is down only because of MHStandby, the AD per EVI route for the service Epipe is still advertised.
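The P/B flag assignment above can be summarized as a small mapping from the PE's DF-election role to the flags advertised in the AD per-EVI route (a sketch; the role names are illustrative):

```python
def pb_flags(role):
    """Map the DF-election role of a PE to the P/B bits advertised in
    the AD per-EVI route (sketch; role names are illustrative)."""
    return {
        "df":     (1, 0),  # P=1/B=0 on the DF
        "backup": (0, 1),  # P=0/B=1 on the backup
        "other":  (0, 0),  # P=0/B=0 on non-DF, non-backup PEs
    }[role]
```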

Some considerations and dependencies between the PW port and the service Epipe PW SAPs

  • If all the PW SAPs associated with the FPE PW port are brought down, the following rules apply:
    • the state of the PW port does not change
    • no AD per-ES/EVI or ES route withdrawal is triggered toward the CE from the stitching Epipe
  • Any event that brings down the PW port (except for MHStandby) triggers:
    • an AD per-EVI/ES route withdrawal within the context of the stitching Epipe
    • an ES route withdrawal
    • an AD per-EVI/ES routes withdrawal within the context of the service Epipes
    • the pw-port>monitoring-oper-group command can also modify the state of the PW port driven by the state of the operational group
  • If an individual PW SAP goes administratively or operationally down while the PW port is still operationally up, the following considerations apply:
    • black holes may be created for that particular service
    • the withdrawal of the AD per-EVI routes for the service Epipe is triggered (not the AD per-ES route, which remains advertised while the PW port is up)
    • if the PW SAP is not administratively shut down, the service Epipe AD per-ES/EVI routes mirror the AD per-ES/EVI routes of the stitching service, and they are advertised as long as the routes for the stitching Epipe are advertised

The PW SAP can also be configured on VPRN services (under regular interfaces or subscriber interfaces) and works without any special consideration, other than that a PW port in non-DF state brings down the PW SAP and, therefore, the interface. Similarly, VPLS services with capture PW SAPs support this mode of operation too.

Interaction of EVPN and other features

This section contains information about EVPN and how it interacts with other features.

Interaction of EVPN-VXLAN and EVPN-MPLS with existing VPLS features

When enabling existing VPLS features in an EVPN-VXLAN or an EVPN-MPLS enabled service, the following must be considered:

  • EVPN-VXLAN services are not supported on I-VPLS/B-VPLS; VXLAN cannot be enabled on those services. EVPN-MPLS is only supported in regular VPLS and B-VPLS. Other VPLS types, such as m-vpls, are not supported with either EVPN-VXLAN or EVPN-MPLS. VPLS E-Tree services are supported with EVPN-MPLS.

  • In general, no router-generated control packets are sent to the EVPN destination bindings, except for ARP, VRRP, ping, BFD and Eth-CFM for EVPN-VXLAN, and proxy-ARP/proxy-ND confirm messages and Eth-CFM for EVPN-MPLS.

  • The following rules apply to xSTP and M-VPLS services:

    • xSTP can be configured in BGP-EVPN services. BPDUs are not sent over the EVPN bindings.

    • bgp-evpn is blocked in m-vpls services; however, a different m-vpls service can manage a SAP or spoke SDP in a bgp-evpn-enabled service.

    • xSTP is not supported in BGP-EVPN services that use Ethernet segments for multihoming, and an M-VPLS must not drive the state of a BGP-EVPN service that uses Ethernet segments.

  • In bgp-evpn enabled VPLS services, mac-move can be used in SAPs/SDP bindings; however, the MACs being learned through BGP-EVPN are considered.

    Note: The MAC duplication already provides a protection against mac-moves between EVPN and SAPs/SDP bindings.
  • disable-learning and other fdb-related tools only work for data plane learned MAC addresses.

  • mac-protect cannot be used in conjunction with EVPN.

    Note: EVPN provides its own protection mechanism for static MAC addresses.
  • MAC OAM tools are not supported for bgp-evpn services, that is: mac-ping, mac-trace, mac-populate, mac-purge, and cpe-ping.

  • EVPN multihoming and BGP-MH can be enabled in the same VPLS service, as long as they are not enabled in the same SAP-SDP or spoke SDP. There is no limitation on the number of BGP-MH sites supported per EVPN-MPLS service.

    Note: The number of BGP-MH sites per EVPN-VXLAN service is limited to 1.
  • SAPs/SDP bindings that belong to a specified ES but are configured on non-BGP-EVPN-MPLS-enabled VPLS or Epipe services are kept down with the StandByForMHProtocol flag.

  • CPE-ping is not supported on EVPN services but it is in PBB-EVPN services (including I-VPLS and PBB-Epipe). CPE-ping packets are not sent over EVPN destinations. CPE-ping only works on local active SAP or SDP bindings in I-VPLS and PBB-Epipe services.

  • Other commands not supported in conjunction with bgp-evpn are:

    • Subscriber management commands under service, SAP, and SDP binding interfaces

    • BPDU translation

    • L2PT termination

    • MAC-pinning

  • Other commands not supported in conjunction with bgp-evpn mpls are:

    • SPB configuration and attributes

Interaction of PBB-EVPN with existing VPLS features

In addition to the B-VPLS considerations described in section Interaction of EVPN-VXLAN and EVPN-MPLS with existing VPLS features, the following specific interactions for PBB-EVPN should also be considered:

  • When bgp-evpn mpls is enabled in a b-vpls service, an i-vpls service linked to that b-vpls cannot be an R-VPLS (the allow-ip-int-bind command is not supported).

  • The ISID value of 0 is not allowed for PBB-EVPN services (I-VPLS and Epipes).

  • The ethernet-segments can be associated with b-vpls SAPs/SDP bindings and i-vpls/epipe SAPs/SDP bindings; however, the same ES cannot be associated with b-vpls and i-vpls/epipe SAP or SDP bindings at the same time.

  • When PBB-Epipes are used with PBB-EVPN multihoming, spoke SDPs are not supported on ethernet-segments.

  • When bgp-evpn mpls is enabled, eth-tunnels are not supported in the b-vpls instance.

Interaction of VXLAN, EVPN-VXLAN and EVPN-MPLS with existing VPRN or IES features

When enabling existing VPRN features on interfaces linked to VXLAN R-VPLS (static or BGP-EVPN based), or EVPN-MPLS R-VPLS interfaces, consider that the following are not supported:

  • the commands arp-populate and authentication-policy

  • dynamic routing protocols such as IS-IS, RIP, and OSPF

When enabling existing IES features on interfaces linked to VXLAN R-VPLS or EVPN-MPLS R-VPLS interfaces, the following commands are not supported:

  • if>vpls>evpn-tunnel

  • bgp-evpn>ip-route-advertisement

  • arp-populate

  • authentication-policy

Dynamic routing protocols such as IS-IS, RIP, and OSPF are also not supported.

Interaction of EVPN with BGP owners in the same VPRN service

SR OS allows multiple BGP owners in the same VPRN service to receive or advertise IP prefixes contained in the VPRN's route table. Specifically, the same VPRN route table can simultaneously install and process IPv4 or IPv6 prefixes for the following owners:

  • EVPN-IFL (EVPN Interface-less IP prefix routes)

  • EVPN-IFF (EVPN Interface-ful IP prefix routes)

  • VPN-IP (also referred to as IPVPN routes)

  • IP (also referred to as BGP PE-CE routes)

Different owners supported on the same VPRN shows the service architecture with multiple route owners supported on the same VPRN.

Figure 104. Different owners supported on the same VPRN

In the example shown in Different owners supported on the same VPRN, VPRN 10 is configured with regular interfaces and R-VPLS interfaces and receives the same prefix 10.0.0.0/24 via the four owners.

EVPN-IFL routes are EVPN IP-Prefix (or type 5) routes that are imported and exported based on the VPRN bgp-evpn>mpls configuration, as described in Interface-less IP-VRF-to-IP-VRF model (IP encapsulation) for MPLS tunnels.

EVPN-IFF routes are EVPN IP-Prefix (or type 5) routes that are imported and exported based on the configuration of the R-VPLS services attached to the VPRN. EVPN-IFF routes are advertised and processed if the R-VPLS services are configured with the configure>service>vpls>bgp-evpn>ip-route-advertisement command. Although installed in the VPRN service, EVPN-IFF routes use the route distinguisher and route targets determined by the configuration in the R-VPLS, and are supported in R-VPLS services with VXLAN or MPLS encapsulations. See Interface-ful IP-VRF-to-IP-VRF with SBD IRB model for more information about EVPN-IFF routes.

In addition to EVPN-IFL and EVPN-IFF routes, BGP IP and VPN-IP families are supported on the same VPRN.

Interworking of EVPN-IFL and IPVPN in the same VPRN

This section describes the SR OS interworking details for BGP owners in the same VPRN. The behavior is compliant with draft-ietf-bess-evpn-ipvpn-interworking.

A VPRN service can be configured to support EVPN-IFL and IPVPN simultaneously. For example, the following MD CLI excerpt shows a VPRN service configured for EVPN-IFL (vprn>bgp-evpn context) and IPVPN (vprn>bgp-ipvpn context):

[ex:/configure service vprn "vprn-ipvpn-evpnifl-AL-80"]
A:admin@PE-2# info
    admin-state enable
    service-id 80
    customer "1"
    bgp-evpn {
        mpls 1 {
            admin-state enable
            route-distinguisher "192.0.2.2:80"
            vrf-target {
                community "target:64500:80"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    bgp-ipvpn {
        mpls {
            admin-state enable
            route-distinguisher "192.0.2.2:80"
            vrf-target {
                community "target:64500:80"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    interface "lo0" {
        loopback true
        ipv4 {
            primary {
                address 2.2.2.2
                prefix-length 32
            }
        }
    }

When EVPN-IFL and IPVPN are both enabled on the same VPRN, the following rules apply:

  • IPVPN and EVPN-IFL routes are treated by BGP as separate routes; that is, the selection is done at route table level and not at the BGP level.

  • At the route table level, IPVPN and EVPN-IFL routes may have the same route table preference (by default, 170 for both); in that case, route selection between IPVPN and EVPN-IFL routes is based on regular BGP path selection.

  • ECMP across IPVPN and EVPN-IFL routes for the same prefix is not supported. When vprn>ecmp is configured to 2 or greater, installing multiple equal-cost next hops for the same prefix in the VPRN route table is only supported within the same route owner, IPVPN or EVPN-IFL.

  • When EVPN-IFL and IPVPN are both enabled in the same VPRN, by default, EVPN-IFL routes are exported into IPVPN and the other way around (CLI configuration is not required).

  • The configure>service>vprn>allow-export-bgp-vpn command is relevant within the same owner (either IPVPN or EVPN-IFL) and works as follows:

    • The command re-exports a received EVPN-IFL route into an EVPN-IFL route to a different peer.

    • The command also re-exports a received IPVPN route into an IPVPN route.

    • If EVPN-IFL and IPVPN are both configured in the same VPRN, an EVPN-IFL route is automatically re-exported into an IPVPN route. Conversely, an IPVPN route is re-exported into an EVPN-IFL route. This is true unless export policies prevent the automatic re-export function.
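Taken together, the cross-owner and same-owner re-export rules can be sketched as follows. This is a hypothetical Python illustration, not SR OS code; the function and owner names are invented for the example, and export policies (which can override the automatic cross-owner re-export) are omitted:

```python
# Hypothetical sketch of the re-export rules for a VPRN with both
# EVPN-IFL and IPVPN enabled (not SR OS code; names are illustrative).

def re_export_allowed(received_owner: str, export_owner: str,
                      allow_export_bgp_vpn: bool) -> bool:
    """Decide whether a received BGP-VPN route may be re-exported."""
    if received_owner != export_owner:
        # Cross-owner re-export (EVPN-IFL <-> IPVPN) is automatic when
        # both owners are enabled; no CLI configuration is required.
        return True
    # Same-owner re-export (EVPN-IFL to EVPN-IFL, or IPVPN to IPVPN)
    # requires the allow-export-bgp-vpn command.
    return allow_export_bgp_vpn
```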

Route selection across EVPN-IFL and other owners in the VPRN service

This section describes the rules for route selection among EVPN-IFL, VPN-IP, and IP route table owners.

A PE may receive an IPv4 or IPv6 prefix in routes from the same or different owners, and from the same or different BGP peers. For example, prefix 10.0.0.0/24 can be received as an EVPN-IFL route and also as a VPN-IPv4 route. Or prefix 2001:db8:1::/64 can be received in two EVPN-IFL routes with different route distinguishers from different peers. In all these examples, the router selects the best route in a deterministic way.

For EVPN-IFF route selection rules, see Route selection for EVPN-IFF routes in the VPRN service. In SR OS, the VPRN route table route selection for all BGP routes, excluding EVPN-IFF, is performed using the following ordered, tie-breaking rules:

  1. valid route wins over invalid route

  2. lowest origin validation state (valid<not found<invalid) wins

  3. lowest RTM (route table) preference wins

  4. highest local preference wins

  5. shortest D-PATH wins (skipped if d-path-length-ignore is configured)

  6. lowest AIGP metric wins

  7. shortest AS_PATH wins (skipped if the as-path-ignore command is configured for the route owner)

  8. lowest origin wins

  9. lowest MED wins

  10. lowest owner type wins (BGP<BGP-LABEL<BGP-VPN)

    Note: BGP-VPN refers to VPN-IP and EVPN-IFL in this context.
  11. EBGP wins

  12. lowest route table or tunnel-table cost to the next hop (skipped if the ignore-nh-metric command is configured)

  13. lowest next-hop type wins (resolution of next hop in TTM wins vs RTM) (skipped if the ignore-nh-metric command is configured)

  14. lowest router ID wins (skipped if the ignore-router-id command is configured)

  15. shortest cluster_list length wins

  16. lowest IP address

    Note: The IP address refers to the peer that advertised the route.
  17. EVPN-IFL wins over IPVPN routes

  18. next-hop check (IPv4 next hop wins over IPv6 next hop, and then lowest next hop wins)

    Note: This is a tiebreaker if BGP receives the same prefix for VPN-IPv6 and IFL. An IPv6-Prefix received as VPN-IPv6 is mapped as IPv6 next hop, whereas the same IPv6 prefix received as IFL could have an IPv4 next hop.
  19. RD check for RTM (lowest RD wins)

ECMP is not supported across EVPN-IFL and other owners, but it is supported within the EVPN-IFL owner for multiple EVPN-IFL routes received with the same IP prefix. When ECMP is configured with N number of paths in the VPRN, BGP orders the routes based on the previously described tie-break criteria breaking out after step 13 (lowest next-hop type). At that point, BGP creates an ECMP set with the best N routes.

Example:

In a scenario in which two EVPN-IFL routes are received on the same VPRN with the same prefix, 10.0.0.0/24; different RDs, 192.0.2.1:1 and 192.0.2.2:1; and different router IDs, 192.0.2.1 and 192.0.2.2; the following tie-breaking criteria are considered.

  • Assuming everything else is the same, BGP orders the routes based on the preceding criteria and prefers the route with the lowest router ID.

  • If vprn>ecmp=2, the two routes are treated as equal in the route table and added to the same ECMP set.
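The ordered tie-breaking above can be sketched in Python. This is a hypothetical illustration, not SR OS code; the field names are invented, and the final owner-specific tiebreakers (steps 17 to 19) are omitted for brevity:

```python
# Hypothetical sketch (not SR OS code) of the ordered tie-breaking for
# VPRN route table selection among EVPN-IFL, VPN-IP, and PE-CE BGP routes.

OWNER_RANK = {"BGP": 0, "BGP-LABEL": 1, "BGP-VPN": 2}  # lower wins

def selection_key(route):
    """Return a tuple so that min() picks the best route (rules 1-16)."""
    return (
        not route["valid"],              # 1. valid wins over invalid
        route["origin_validation"],      # 2. valid(0) < not-found(1) < invalid(2)
        route["rtm_preference"],         # 3. lowest RTM preference wins
        -route["local_pref"],            # 4. highest local preference wins
        route["d_path_len"],             # 5. shortest D-PATH wins
        route["aigp"],                   # 6. lowest AIGP metric wins
        len(route["as_path"]),           # 7. shortest AS_PATH wins
        route["origin"],                 # 8. lowest origin wins
        route["med"],                    # 9. lowest MED wins
        OWNER_RANK[route["owner"]],      # 10. BGP < BGP-LABEL < BGP-VPN
        not route["ebgp"],               # 11. EBGP wins over IBGP
        route["nh_cost"],                # 12. lowest cost to the next hop wins
        route["nh_type"],                # 13. TTM(0) resolution wins over RTM(1)
        route["router_id"],              # 14. lowest router ID wins
        len(route["cluster_list"]),      # 15. shortest cluster list wins
        route["peer_address"],           # 16. lowest (advertising peer) IP wins
    )

def best_route(routes):
    return min(routes, key=selection_key)
```

With this key, the preceding example resolves as described: two otherwise-equal routes are ordered by router ID, and a higher local preference overrides that.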

Route selection for EVPN-IFF routes in the VPRN service

While the route selection in the VPRN for other BGP owners, described in Route selection across EVPN-IFL and other owners in the VPRN service, follows similar criteria, the default selection for EVPN-IFF routes in the VPRN route table follows different rules:

  • By default, EVPN-IFF routes have a VPRN route table preference of 169; therefore, EVPN-IFF routes are preferred over EVPN-IFL, VPN-IP, or IP owners that have a preference of 170.

  • When two or more EVPN-IFF routes with the same IPv4 or IPv6 prefix and length, but with different route keys are received (for example, two routes with the same prefix and length but with different RDs), BGP hands the EVPN-IFF routes over to the EVPN application for selection. In this case, EVPN orders the routes based on their {R-VPLS Ifindex, RD, Ethernet Tag} and considers the top one for installing in the route table if ecmp is 1. If ecmp is N, the top N routes for the prefix are selected.

    Example:

    • Consider the following two IP-Prefix routes that are received on the same R-VPLS service:

      Route 1: (RD=192.0.0.1:30, Ethernet Tag=0, Prefix=10.0.0.0/24, next-hop 192.0.0.1)

      Route 2: (RD=192.0.0.2:30, Ethernet Tag=0, Prefix=10.0.0.0/24, next-hop 192.0.0.2)

    • Because their route key is different (their RDs do not match), EVPN orders them based on R-VPLS Ifindex first, then RD, and then Ethernet Tag. Because they are received on the same R-VPLS, the Ifindex is the same on both. The top route on the priority list is Route 1, based on its lower RD. If the VPRN's ecmp command has a value of 1, only Route 1 is installed in the VPRN's route table.

  • If the previously described way of selecting EVPN-IFF routes in the VPRN does not satisfy the user requirements, the configure service system bgp evpn ip-prefix-routes iff-bgp-path-selection command enables a BGP-based path selection for EVPN-IFF routes, which is equivalent to the selection followed for EVPN-IFL or IPVPN routes with the following considerations:

  • If the iff-bgp-path-selection command is configured, the EVPN-IFF path selection for the N routes that form the ECMP set follow the same rules as in Route selection across EVPN-IFL and other owners in the VPRN service for EVPN-IFL or IPVPN routes.

    Figure 105. EVPN-IFF path selection for N routes with iff-bgp-path-selection configured

Upon receiving the same IP prefix 10.0.0.0/24 for the same VPRN in EVPN-IFF routes with different RDs, as shown in EVPN-IFF path selection for N routes with iff-bgp-path-selection configured, the following selection criteria are used:

  • If the iff-bgp-path-selection command is configured, the selection is based on BGP path selection, and the selected route is the top route, based on the highest Local Preference (LP) (500>300>200).

  • However, if the iff-bgp-path-selection command is not configured, the bottom route is selected based on the lowest RD (1.1.1.1:1<1.1.1.5:1<1.1.1.10:1), assuming the three routes are received on the same R-VPLS.
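The default (non-BGP) EVPN-IFF selection can be sketched as follows; a hypothetical Python illustration, not SR OS code, with routes modeled as plain tuples:

```python
# Hypothetical sketch (not SR OS code): default EVPN-IFF selection, which
# orders candidate routes for the same prefix by {R-VPLS ifindex, RD,
# Ethernet Tag} and installs the top N routes when ecmp is N.

def iff_select(routes, ecmp=1):
    """routes: list of (ifindex, rd, eth_tag, next_hop) tuples; the RD is a
    comparable tuple, for example (192, 0, 0, 1, 30) for 192.0.0.1:30."""
    ordered = sorted(routes, key=lambda r: (r[0], r[1], r[2]))
    return ordered[:ecmp]
```

With the two routes from the earlier example (same R-VPLS, same Ethernet Tag), Route 1 with the lower RD is the only route installed when ecmp is 1.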

Although, by default, EVPN-IFF routes have an RTM preference of 169 and are preferred over the RTM preference of 170 used for the other BGP route owners, a selection across EVPN-IFF and the other owners may occur if the RTM preferences are made equal (via an import policy or the config>router>bgp>preference command). Note that the route table preference for EVPN-IFF routes can only be changed from the default value of 169 if the iff-attribute-uniform-propagation command is enabled and an import policy or the config>router>bgp>preference command is configured to change it.

If the RTM preference of another owner is made equal to that of EVPN-IFF routes, and multiple routes with the same key (but different RDs) are received for EVPN-IFF and that owner in the same VPRN, the selection order is as follows:

  1. BGP (IPv4 or IPv6)

  2. BGP-IPVPN

  3. EVPN-IFF

  4. EVPN-IFL

Note: The previous selection order applies to EVPN-IFF routes when compared with others. When BGP-IPVPN, BGP IP and EVPN-IFL routes are compared, regular BGP path selection is used as described in Route selection across EVPN-IFL and other owners in the VPRN service.

BGP path attribute propagation

A VPRN can receive and install routes for a specific BGP owner. The routes may be re-exported in the context of the same VPRN to the same BGP owner or a different one. For example, an EVPN-IFL route can be received from peer N, installed in VPRN 1, and re-exported to peer M using family VPN-IPv4.

When re-exporting BGP routes, the original BGP path attributes are preserved without any configuration in the following cases:

  • EVPN-IFL route re-exported into an IPVPN route, and the other way around

  • EVPN-IFL route re-exported into a BGP IP route (PE-CE), and the other way around

  • IPVPN route re-exported into a BGP IP route (PE-CE), and the other way around

  • EVPN-IFL, IPVPN or BGP IP routes re-exported into a route of the same owner. For example, EVPN-IFL to EVPN-IFL, when the allow-export-bgp-vpn command is configured.

BGP path attributes to or from EVPN-IFF are not preserved by default. If BGP Path Attribute propagation is required, the configure service system bgp-evpn ip-prefix-routes iff-attribute-uniform-propagation command must be configured. BGP path attribute propagation when iff-attribute-uniform-propagation is configured shows an example of BGP Path Attribute propagation from EVPN-IFF to the other BGP owners in the VPRN when the iff-attribute-uniform-propagation command is configured.

Figure 106. BGP path attribute propagation when iff-attribute-uniform-propagation is configured

In the example in BGP path attribute propagation when iff-attribute-uniform-propagation is configured, DGW1 propagates the received LP and communities on an EVPN-IFF route when advertising the same prefix into any type of BGP owner route, including VPN-IPv4/6, EVPN-IFL, EVPN-IFF, IPv4, or IPv6. If the iff-attribute-uniform-propagation command is not configured on DGW1, no BGP path attributes are propagated; they are re-originated instead. The propagation in the opposite direction follows the same rules; configuration of the iff-attribute-uniform-propagation command is required.

When propagating BGP path attributes, the following criteria are considered:

  • The propagation is compliant with the uniform propagation described in draft-ietf-bess-evpn-ipvpn-interworking.

  • The following extended communities are filtered or excluded when propagating attributes:

    • all extended communities of type 0x06 (EVPN type), in particular those supported by type 5 routes:

      • MAC Mobility extended community (sub-type 0x00)

      • EVPN Router's MAC extended community (sub-type 0x03)

    • BGP encapsulation extended community

    • all Route Target extended communities

  • The BGP Path Attribute propagation within the same owner is supported in the following cases:

    • EVPN-IFF to EVPN-IFF (route received on R-VPLS and advertised in a different R-VPLS context), assuming the iff-attribute-uniform-propagation command is configured

    • EVPN-IFL to EVPN-IFL (route received on a VPRN and re-advertised based on the configuration of vprn>allow-export-bgp-vpn)

    • VPN-IPv4/6 to VPN-IPv4/6 (route received on a VPRN and re-advertised based on the configuration of vprn>allow-export-bgp-vpn)

  • The propagation is supported for iBGP and eBGP as follows:

    • iBGP-only attributes can only be propagated to iBGP peers

    • non-transitive attributes are propagated based on existing rules

    • when peering an eBGP neighbor, the AS_PATH is prepended by the VPRN ASN

  • If ECMP is enabled in the VPRN and multiple routes of the same BGP owner with different Route Distinguishers are installed in the route table, only the BGP path attributes of the best route are subject for propagation.
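The extended-community filtering above can be sketched as follows. This is a hypothetical Python illustration, not SR OS code; it assumes the standard IANA encodings (EVPN extended communities are type 0x06, the BGP encapsulation extended community is type 0x03 sub-type 0x0c, and Route Targets are sub-type 0x02 of types 0x00, 0x01, and 0x02):

```python
# Hypothetical sketch (not SR OS code): extended communities dropped when
# BGP path attributes are propagated between owners. Each community is
# modeled as a (type, subtype, value) tuple.

def propagate_ext_communities(comms):
    """Drop EVPN (type 0x06), BGP encapsulation (0x03/0x0c), and all
    route-target (sub-type 0x02) extended communities; keep the rest."""
    kept = []
    for type_, subtype, value in comms:
        if type_ == 0x06:                      # all EVPN extended communities
            continue
        if (type_, subtype) == (0x03, 0x0c):   # BGP encapsulation
            continue
        if subtype == 0x02 and type_ in (0x00, 0x01, 0x02):  # route targets
            continue
        kept.append((type_, subtype, value))
    return kept
```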

BGP D-PATH attribute for Layer 3 loop protection

SR OS has a full implementation of the D-PATH attribute as described in draft-ietf-bess-evpn-ipvpn-interworking.

D-PATH is composed of a sequence of domain segments (similar to AS_PATH). Each domain segment is graphically represented as shown in the following figure.

Figure 107. D-PATH attribute

Where:

  • Each domain segment comprises <domain_segment_length, domain_segment_value>, where the domain segment value is a sequence of one or more domains.

  • Each domain is represented by <DOMAIN-ID:ISF_SAFI_TYPE>. A new domain, added by a gateway, is always prepended to the left of the existing domains.

  • The supported ISF_SAFI_TYPE values are:

    • 0 = Local ISF route
    • 1 = safi 1 (typically identifies PE-CE BGP domains)
    • 70 = evpn
    • 128 = safi 128 (IPVPN domains)
  • Labeled unicast IP routes do not support D-PATH.

  • The D-PATH attribute is only modified by a gateway and not by an ABR/ASBR or RR. A gateway is defined as a PE where a VPRN is instantiated, and that VPRN advertises or receives routes from multiple BGP owners (for example, EVPN-IFL and BGP-IPVPN) or multiple instances of the same owner (for example, a VPRN with two BGP-IPVPN instances).

Suppose a router receives prefix P in an EVPN-IFL instance with the following D-PATH from neighbor N.

If the router imports the route in VPRN-1, BGP-EVPN SRv6 instance with domain 65000:2, the router readvertises the route to its BGP-IPVPN MPLS instance as follows.

If the router imports the route in VPRN-1, BGP-EVPN SRv6 instance with domain 65000:3, the router readvertises the route to its BGP-EVPN MPLS instance as follows.

If the router imports the route in VPRN-1, BGP-EVPN MPLS instance with domain 65000:4, the router readvertises the route to its PE-CE BGP neighbor as follows.

When a BGP route of families that support D-PATH is received and must be imported in a VPRN, the following rules apply:

  • All domain IDs included in the D-PATH are compared with the local domain IDs configured in the VPRN. The local domain IDs for the VPRN include a list of (up to four) domain IDs configured at the vprn or vprn bgp instance level, including the domain IDs in local attached R-VPLS instances.

  • If one or more D-PATH domain IDs match any local domain IDs for the VPRN, the route is not installed in the VPRN’s route table.

  • In the case where the IP-VPN or EVPN route matches the import route target in multiple VRFs, the D-PATH loop detection works per VPRN. For example, for each VPRN, BGP checks if the received domain IDs match any locally configured (maximum 4) domain IDs for that VPRN. A route may have a looped domain for one VPRN and not the other. In this case, BGP installs a route only in the VPRN route table that does not have a loop; the route is not installed in the VPRN that has the loop.

  • A route that is not installed in any VPRN RTM (due to the domain ID matching any of the local domain IDs in the importing VPRNs) is still kept in the RIB-IN. The route is displayed in the show router bgp routes command with a DPath Loop VRFs field, indicating the VPRN in which the route is not installed due to a loop.

  • Route target-based leaking between VPRNs and D-PATH loop detection is described in the following example.

    Consider an EVPN-IFL route to prefix P imported in VPRN 20 (configured with domain 65000:20) is leaked into VPRN 30.

    When the route to prefix P is readvertised in the context of VPRN 30, which is enabled for BGP-IPVPN MPLS and BGP-EVPN MPLS, the readvertised BGP-IPVPN and BGP-EVPN routes have a D-PATH with a prepended domain 65000:20:0. That is, leaked routes are readvertised with the domain ID of the VPRN of origin and an ISF_SAFI_TYPE = 0, as described in draft-ietf-bess-evpn-ipvpn-interworking.
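The D-PATH loop check and domain prepending described above can be sketched as follows; a hypothetical Python illustration, not SR OS code, with a domain modeled as (global_admin, local_admin, ISF_SAFI_TYPE):

```python
# Hypothetical sketch (not SR OS code) of D-PATH handling at a gateway.

def has_dpath_loop(d_path, local_domain_ids):
    """Drop the route for this VPRN if any D-PATH domain ID matches one of
    the VPRN's (up to four) local domain IDs; the ISF_SAFI_TYPE is not
    part of the comparison."""
    return any((g, l) in local_domain_ids for g, l, _safi in d_path)

def readvertise(d_path, importing_domain_id, isf_safi_type):
    """Prepend the importing instance's domain at the left of the D-PATH.
    Leaked routes use the domain ID of the VPRN of origin with
    ISF_SAFI_TYPE = 0 (local ISF route)."""
    g, l = importing_domain_id
    return [(g, l, isf_safi_type)] + list(d_path)
```

For example, a VPRN whose local domain IDs include 65000:20 rejects any route carrying 65000:20 in its D-PATH, while a route leaked from that VPRN is readvertised with 65000:20:0 prepended.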

In the D-PATH example shown in the following figure, the different gateway PEs along the domains modify the D-PATH attribute by adding the source domain and family. If PE4 receives a route for the prefix with the domain of PE4 included in the D-PATH, PE4 does not install the route to avoid control plane loops.

Figure 108. D-PATH attribute example

In the D-PATH example shown in the following figure, DGW1 and DGW2 rely on the D-PATH attribute to automatically discard the prefixes received from the peer gateway in IPVPN and avoid loops by reinjecting the route back into the EVPN domain.

Figure 109. D-PATH attribute example two
Note: While site-of-origin extended communities and policies can be used in D-PATH attribute example two, the D-PATH method works across multiple domains and does not require policies.
BGP D-PATH configuration

The D-PATH attribute is modified on transmission or processed on reception based on the local VPRN or R-VPLS configuration. The domain ID is configured per BGP instance, and the ISF_SAFI_TYPE is automatically derived from the instance type that imported the original route.

The domain-id is configured at the service BGP instance level as a six-byte value that includes a global admin value and a local admin value, for example, 65000:1. Domain ID configuration is supported on:

  • VPRN BGP-EVPN MPLS and SRv6 instances (EVPN-IFL)
  • VPRN BGP-IPVPN MPLS and SRv6 instance
  • R-VPLS BGP-EVPN MPLS and VXLAN instances (EVPN-IFF only – the R-VPLS is configured with the evpn-tunnel command)
  • VPRN BGP neighbors (PE-CE)
  • VPRN level (for local routes)

The following is an example CLI configuration:

// domain-id configuration

*[ex:configure service vprn "blue" bgp-evpn mpls 1]
*[ex:configure service vprn "blue" bgp-evpn segment-routing-v6 1]
*[ex:configure service vprn "blue" bgp-ipvpn mpls 1]
*[ex:configure service vprn "blue" bgp-ipvpn segment-routing-v6 1]
*[ex:configure service vprn "blue" bgp]
*[ex:configure service vpls "blue" bgp-evpn routes ip-prefix]
+-- domain-id <global-field:local-field>


*[ex:configure service vprn "blue"]
A:admin@PE-2# 
+-- local-routes-domain-id <global-field:local-field>
// used as the domain-id for non-bgp routes in the VPRN.


// Example ‘a’

*[ex:configure service vprn "blue" bgp-ipvpn mpls 1]
    domain-id 65000:1

In the preceding "example 'a'", if a VPN-IPv4 route is received from a neighbor, imported in VPRN "blue" and exported to another neighbor as EVPN, the router prepends a D-PATH segment <65000:1:IPVPN> to the advertised EVPN RT5.


// Example ‘b’

*[ex:configure service vprn "blue"]
    local-routes-domain-id 65000:10

In the preceding "example 'b'", the local-routes-domain-id is configured at the vprn level. When configured, local routes (direct, static, IGP routes) are advertised with a D-PATH that contains the vprn>local-routes-domain-id.

The following additional considerations apply:

  • If vprn>local-routes-domain-id is not configured, the local routes are advertised into the BGP instances with no D-PATH.
  • If a VPRN BGP instance is not configured with a domain ID, the following handling applies:
    • Routes imported in the VPRN BGP instance are readvertised in a different instance without modifying the D-PATH.
    • Routes exported in the VPRN BGP instance are advertised with the D-PATH modified to include the domain ID of the instance that imported the route in the first place.
  • Up to a maximum of four domain IDs per VPRN are supported. This includes domain IDs configured in the associated R-VPLS services.
  • Modifying the domain IDs list initiates a route refresh for all address families associated with the VPRN.
BGP D-PATH and BGP best path selection

D-PATH is also considered for the BGP best path selection, as described in draft-ietf-bess-evpn-ipvpn-interworking.

As D-PATH is introduced in networks, not all PEs may support D-PATH for BGP path selection. To guarantee compatibility in networks with PEs that do not support D-PATH, the following commands determine whether the D-PATH is considered for BGP best-path selection.

[ex:/configure]
A:admin@PE-3# 
router "Base" bgp best-path-selection d-path-length-ignore <boolean> // default: false
service vprn <string> bgp best-path-selection d-path-length-ignore <boolean> // default: false
service vprn <string> d-path-length-ignore <boolean> // default: false

configure service system bgp evpn ip-prefix-routes d-path-length-ignore <boolean> // default: false

The following conditions apply to the d-path-length-ignore command usage:

  • When d-path-length-ignore is configured at the base router level (or vprn>bgp level for PE-CE routes), BGP ignores the D-PATH domain segment length for best path selection purposes. This ignores the D-PATH length when comparing two VPN routes or two IFL routes with the same RD. These VPN or IFL routes are processed in the main BGP instance.
  • When d-path-length-ignore is configured at the VPRN router level, the VPRN RTM ignores the D-PATH domain segment length for best path selection purposes (for routes in VPRN).
  • When d-path-length-ignore is configured at the service system bgp evpn ip-prefix-routes context, EVPN ignores the D-PATH length when iff-bgp-path-selection is enabled.
  • When d-path-length-ignore is not configured, the D-PATH length is considered in the BGP best path selection process (at the BGP, the RTM, and IFF levels, respectively).

Configuration examples

This section describes configuration examples for stitching IPVPN and EVPN-IFL domains and the propagation of BGP path attributes for EVPN-IFF.

Example 1 - stitching IPVPN and EVPN-IFL domains

In this configuration example, IPVPN and EVPN-IFL are simultaneously configured in VPRN 80 of PE2. This allows the stitching of IPVPN and EVPN-IFL domains, as shown in Stitching IPVPN and EVPN-IFL domains.

Figure 110. Stitching IPVPN and EVPN-IFL domains

The following is an example configuration of PE1, PE2, and PE4 for VPRN 80.

Note: In this scenario, the BGP path attributes added by CE801 are propagated all the way up to CE804, across the VPRN-IPv4 and EVPN-IFL families.
// PE1's VPRN 80
A:PE-1# configure service vprn 80 
A:PE-1>config>service>vprn# info 
----------------------------------------------
            router-id 192.0.2.1
            autonomous-system 64500
            interface "lo0" create
                address 1.1.1.1/32
                loopback
            exit
            interface "local" create
                address 10.0.0.254/24
                sap 1/1/c1/1:80 create
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 192.0.2.1:80
                    vrf-target target:64500:80
                    no shutdown
                exit
            exit
            bgp
                min-route-advertisement 1
                group "pe-ce"
                    family ipv4
                    type external
                    export "export-al-to-vnf"
                    neighbor 10.0.0.1
                        local-as 64500
                        peer-as 81
                    exit
                exit
                no shutdown
            exit
            no shutdown
// PE2's VPRN 80
A:PE-2# configure service vprn 80 
A:PE-2>config>service>vprn# info 
----------------------------------------------
            interface "lo0" create
                address 2.2.2.2/32
                loopback
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 192.0.2.2:80
                    vrf-target target:64500:80
                    no shutdown
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 192.0.2.2:80
                    vrf-target target:64500:80
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
// PE4's VPRN 80
A:PE-4# configure service vprn 80 
A:PE-4>config>service>vprn# info 
----------------------------------------------
            router-id 192.0.2.4
            autonomous-system 64500
            interface "lo0" create
                address 4.4.4.4/32
                loopback
            exit
            interface "local" create
                address 40.0.0.254/24
                sap 1/1/c1/1:80 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 192.0.2.4:80
                    vrf-target target:64500:80
                    no shutdown
                exit
            exit
            bgp
                min-route-advertisement 1
                group "pe-ce"
                    family ipv4
                    type external
                    export "export-bl-to-pe"
                    neighbor 40.0.0.1
                        local-as 64500
                        peer-as 84
                    exit
                exit
                no shutdown
            exit
            no shutdown
Example 2 - propagation of BGP path attributes for EVPN-IFF

In this configuration example, the DCGW PE2 re-exports EVPN-IFF routes into EVPN-IFF (leaked) routes and EVPN-IFL routes. The BGP path attributes are propagated as shown in Propagation of BGP path attributes for EVPN-IFF. As described in BGP path attribute propagation, EVPN extended communities, the BGP encapsulation extended community, and route targets are not propagated; they are re-originated instead.

Figure 111. Propagation of BGP path attributes for EVPN-IFF

The following is an example configuration for PE4 and PE2 (PE1 has an equivalent configuration to PE4).

// PE4 services for EVPN-IFF
A:PE-4>config>service>vprn# /configure service vprn 93 
A:PE-4>config>service>vprn# info 
----------------------------------------------
            router-id 4.4.4.4
            autonomous-system 64500
            interface "evi-95" create
                address 94.0.0.254/24
                vrrp 1 owner passive
                    backup 94.0.0.254
                exit
                vpls "evi-95"
                exit
            exit
            interface "evi-94" create
                vpls "evi-94"
                    evpn-tunnel
                exit
            exit
            bgp
                min-route-advertisement 1
                group "pe-ce"
                    family ipv4
                    type external
                    export "export-al-to-vnf"
                    neighbor 94.0.0.1
                        local-as 64500
                        peer-as 94
                    exit
                exit
                no shutdown
            exit
            no shutdown
----------------------------------------------
A:PE-4>config>service>vprn# /configure service vpls 95 
A:PE-4>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
            exit
            stp
                shutdown
            exit
            sap 1/1/c1/1:90 create
                no shutdown
            exit
            no shutdown
----------------------------------------------
A:PE-4>config>service>vpls# /configure service vpls 94 
A:PE-4>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 94 create
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement
                evi 94
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
// PE2 config
A:PE-2# configure service vprn 90 
A:PE-2>config>service>vprn# info 
----------------------------------------------
            interface "evi-91" create
                vpls "evi-91"
                    evpn-tunnel
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 192.0.2.2:90
                    vrf-export "leak-color-51-into-93"
                    vrf-target import target:64500:90
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
A:PE-2>config>service>vprn# /configure service vpls 91 
A:PE-2>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 91 create
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement
                evi 91
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
A:PE-2>config>service>vpls# /configure service vprn 93 
A:PE-2>config>service>vprn# info 
----------------------------------------------
            interface "evi-94" create
                vpls "evi-94"
                    evpn-tunnel
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 192.0.2.2:93
                    vrf-export "leak-color-51-into-90"
                    vrf-target import target:64500:93
                    no shutdown
                exit
            exit
            no shutdown
----------------------------------------------
A:PE-2>config>service>vprn# /configure service vpls 94 
A:PE-2>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 94 create
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement
                evi 94
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
A:PE-2>config>service>vpls# /show router policy "leak-color-51-into-90" 
    entry 10
        from
            community "color-51"
        exit
        action accept
            community add "RT64500:90" "RT64500:93"
        exit
    exit
    default-action accept
        community add "RT64500:93"
    exit
A:PE-2>config>service>vpls# /show router policy "leak-color-51-into-93" 
    entry 10
        from
            community "color-51"
        exit
        action accept
            community add "RT64500:90" "RT64500:93"
        exit
    exit
    default-action accept
        community add "RT64500:90"
    exit
Example 3 - D-PATH configuration

The example in the following figure shows a typical Layer 3 EVPN DC gateway scenario where EVPN-IFF routes are translated into IPVPN routes, and vice versa. Because redundant gateways are used, this scenario is subject to Layer 3 routing loops. The D-PATH attribute helps prevent these loops automatically, without the need for extra routing policies to tag or drop routes.

Figure 112. Use of D-PATH for Layer 3 DC gateway redundancy

The following is the configuration of the VPRN or R-VPLS services in DGW1 and DGW2 in the preceding figure.

A:DGW1# configure service vprn 20 
A:DGW1>config>service>vprn# info 
----------------------------------------------
            interface "sbd-1" create
                vpls "sbd-1"
                    evpn-tunnel
                exit
            exit  
            segment-routing-v6 1 create
                locator "LOC-1"
                    function
                        end-dt46
                    exit
                exit
            exit
            bgp-ipvpn
                segment-routing-v6
                    route-distinguisher 192.0.2.1:20
                    srv6-instance 1 default-locator "LOC-1"
                    source-address 2001:db8::1
                    vrf-target target:64500:20
                    domain-id 65000:2
                    no shutdown
                exit
            exit
            no shutdown 
*A:DGW1# configure service vpls "sbd-1" 
*A:DGW1>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 1 create
            exit
            bgp
            exit
            bgp-evpn
                evi 1
                ip-route-advertisement domain-id 65000:1
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit    

A:DGW2# configure service vprn 20 
A:DGW2>config>service>vprn# info 
----------------------------------------------
            interface "sbd-1" create
                vpls "sbd-1"
                    evpn-tunnel
                exit
            exit  
            segment-routing-v6 1 create
                locator "LOC-1"
                    function
                        end-dt46
                    exit
                exit
            exit
            bgp-ipvpn
                segment-routing-v6
                    route-distinguisher 192.0.2.2:20
                    srv6-instance 1 default-locator "LOC-1"
                    source-address 2001:db8::2
                    vrf-target target:64500:20
                    domain-id 65000:2
                    no shutdown
                exit
            exit
            no shutdown 
*A:DGW2# configure service vpls "sbd-1" 
*A:DGW2>config>service>vpls# info 
----------------------------------------------
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 1 create
            exit
            bgp
            exit
            bgp-evpn
                evi 1
                ip-route-advertisement domain-id 65000:1
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit       

The following considerations apply to the example configuration shown in Use of D-PATH for Layer 3 DC gateway redundancy.

  • Imported VPN-IP SRv6 routes are readvertised as EVPN-IFF VXLAN routes with a prepended D-PATH domain 65000:2:128.
  • Imported EVPN-IFF VXLAN routes are readvertised as VPN-IP SRv6 routes with a prepended D-PATH domain 65000:1:70.

If PE1 sends an EVPN-IFF route 10.0.0.0/24 that is imported by both DGW1 and DGW2, then, when DGW1 and DGW2 receive each other’s routes, they identify the D-PATH attribute and compare its list of domains with the locally configured domains in the VPRN. Because the domain matches one of the local domains, the route is not installed in the VPRN route table and is flagged as a looped route (the show router bgp routes detail or hunt commands show DPath Loop VRFs: 20). In this way, loops are prevented.
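The loop-check logic described above can be sketched in a few lines of Python (an illustrative simplification, not SR OS code; domain IDs are modeled as plain strings, and the ISF SAFI suffix carried in each D-PATH segment, for example the :128 in 65000:2:128, is omitted):

```python
# Simplified sketch of D-PATH loop detection on a receiving gateway:
# a route is flagged as looped when any domain carried in its D-PATH
# attribute matches a domain locally configured on the importing service.

def is_dpath_loop(route_dpath, local_domains):
    """route_dpath: domain IDs (such as "65000:1") carried in the route.
    local_domains: domain IDs configured on this node's VPRN/R-VPLS."""
    return any(domain in local_domains for domain in route_dpath)

# DGW2 re-advertises PE1's route with its own domain prepended.
route_from_dgw2 = ["65000:2", "65000:1"]

# DGW1 has both domains configured locally, so the route is treated as
# looped and is not installed in the VPRN route table.
print(is_dpath_loop(route_from_dgw2, {"65000:1", "65000:2"}))  # True
print(is_dpath_loop(["65000:3"], {"65000:1", "65000:2"}))      # False
```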

Routing policies for BGP EVPN routes

Routing policies can match on specific fields when EVPN routes are imported or exported. These matching fields (excluding route-table EVPN IP-prefix routes, unless explicitly mentioned) are:

  • communities, extended-communities, and large-communities

  • well-known communities (no-export | no-export-subconfed | no-advertise)

  • family EVPN

  • protocol BGP-VPN (this term also matches VPN-IPv4 and VPN-IPv6 routes)

  • prefix lists for routes type 2 when they contain an IP address, and for type 5

  • route tags that can be passed by EVPN to BGP from:

    • service>epipe/vpls>bgp-evpn>mpls/vxlan>default-route-tag (this route-tag can be matched on export only)

    • service>vpls>proxy-arp/nd>evpn-route-tag (this route tag can be matched on export only)

    • route table route-tags when exporting EVPN IP-prefix routes

  • EVPN type

  • BGP attributes that are applicable to EVPN routes (such as AS-path, local-preference, next-hop)

Additionally, the route tags can be used on export policies to match EVPN routes that belong to a service and BGP instance, routes that are created by the proxy-arp or proxy-nd application, or IP-Prefix routes that are added to the route table with a route tag.

EVPN can pass only one route tag to BGP to achieve matching on export policies. In case of a conflict, the default-route-tag has the least priority of the three potential tags added by EVPN.

For instance, if VPLS 10 is configured with proxy-arp>evpn-route-tag 20 and bgp-evpn>mpls>default-route-tag 10, all MAC/IP routes that are generated by the proxy-arp application use route tag 20. Export policies can then use "from tag 20" to match all those routes. In this case, Inclusive Multicast routes are matched by using "from tag 10".
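As a sketch, the precedence can be modeled as follows (the helper and its parameters are hypothetical; only the fact that the default-route-tag has the least priority is taken from the text above, and the relative order of the other two tags is an assumption):

```python
# Hypothetical helper modeling the single route tag EVPN passes to BGP.
# The default-route-tag is used only when no more specific tag applies;
# the ordering of proxy_tag versus route_table_tag is an assumption.

def select_route_tag(proxy_tag=None, route_table_tag=None, default_tag=None):
    for tag in (proxy_tag, route_table_tag, default_tag):
        if tag is not None:
            return tag
    return None

# VPLS 10: proxy-arp evpn-route-tag 20, bgp-evpn>mpls>default-route-tag 10.
print(select_route_tag(proxy_tag=20, default_tag=10))  # MAC/IP routes -> 20
print(select_route_tag(default_tag=10))                # IMET routes   -> 10
```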

Routing policies for BGP EVPN IP prefixes

BGP routing policies are supported for IP prefixes imported or exported through BGP-EVPN in R-VPLS services (EVPN-IFF routes) or VPRN services (EVPN-IFL routes).

When applying routing policies to control the distribution of prefixes between EVPN-IFF and IP-VPN (or EVPN-IFL), the user must consider that these owners are completely separate as far as BGP is concerned; when prefixes are imported into the VPRN routing table, the BGP attributes are lost to the other owner, unless the iff-attribute-uniform-propagation command is configured on the router.

If the iff-attribute-uniform-propagation command is disabled, the use of route tags allows the controlled distribution of prefixes across the two families.

IP-VPN import and EVPN export BGP workflow shows an example of how VPN-IPv4 routes are imported into the RTM (Routing Table Manager) and then passed to EVPN for its own processing.

Note: VPN-IPv4 routes can be tagged at ingress and that tag is preserved throughout the RTM and EVPN processing so that the tag can be matched at the egress BGP routing policy.
Figure 113. IP-VPN import and EVPN export BGP workflow

Policy tags can be used to match EVPN IP prefixes that were learned not only from BGP VPN-IPv4 but also from other routing protocols. The tag range supported for each protocol is different, as follows:

<tag>  : accepts in decimal or hex
        [0x1..0xFFFFFFFF]H (for OSPF and IS-IS)
        [0x1..0xFFFF]H (for RIP)
        [0x1..0xFF]H (for BGP)

EVPN import and IP-VPN export BGP workflow shows an example of the reverse workflow, where routes are imported from EVPN and exported from the RTM to BGP VPN-IPv4.

Figure 114. EVPN import and IP-VPN export BGP workflow

The behavior described above and the use of tags are also valid for vsi-import and vsi-export policies in the R-VPLS.

The following is a summary of the policy behavior for EVPN-IFF IP-prefixes when iff-attribute-uniform-propagation is disabled.

  • For EVPN-IFF routes received and imported in RTM, policy entries (peer or vsi-import) can match on any of the following fields, and can add tags (as an action):

    • communities, extended-communities or large communities

    • well-known communities

    • family EVPN

    • protocol bgp-vpn

    • prefix-lists

    • EVPN route type

    • BGP attributes (as-path, local-preference, next-hop)

  • For exporting RTM to EVPN-IFF prefix routes, policy entries only match on tags, and based on this matching, add communities, accept, or reject. This applies to the peer level or on the VSI export level. Policy entries can also add tags for static routes, RIP, OSPF, IS-IS, BGP, and ARP-ND routes, which can then be matched on the BGP peer export policy, or on the VSI export policy for EVPN-IFF routes.

The following applies if the iff-attribute-uniform-propagation command is enabled.

For exporting RTM to EVPN-IFF prefix routes, in addition to matching on tags, matching path attributes on EVPN-IFF routes is supported in the following:

  • vrf-export (when exporting the prefixes in VPN-IP or EVPN IFL or IP routes)

  • vsi-export policies (when exporting the prefixes in EVPN-IFF routes)

  • for non-BGP route-owners (RIP, OSPF, IS-IS, static, ARP-ND), there are no changes and the only match criterion in vsi-export for EVPN-IFF routes is tags

EVPN Weighted ECMP for IP prefix routes

SR OS supports weighted ECMP for EVPN IP prefix routes (IPv4 and IPv6), in the EVPN Interface-less (EVPN-IFL) and EVPN Interface-ful (EVPN-IFF) models.

Based on draft-ietf-bess-evpn-unequal-lb, the EVPN Link Bandwidth extended community is used in IP Prefix routes to indicate a weight that the receiving PE must consider when load balancing traffic to multiple EVPN next hops, CE next hops, or both. The supported weight in the extended community is of type Generalized weight and encodes the count of CEs that advertised prefix N to a PE in a BGP PE-CE route. The following figure shows the use of EVPN weighted ECMP.

Figure 115. Weighted ECMP for IP prefix routes use case

In the preceding figure, several multirack Container Network Functions (CNFs) are connected to a few TORs in the EVPN network. Each CNF advertises the same anycast service network 10.1.1.0/24 using a single PE-CE BGP session. Without Weighted ECMP, TOR2, TOR3, and TOR4 would re-advertise the prefix in an EVPN IP-Prefix route, and flows to 10.1.1.0/24 from Border Leaf-1 would be equally distributed among TOR2, TOR3, and TOR4. However, the needed load-balancing distribution is based on the count of CNFs attached to each TOR. That is, out of five flows to 10.1.1.0/24, three should be directed to TOR3 (because it has three CNFs attached), one to TOR4, and one to either TOR2 or TOR1 (because CNF1 is dual-homed to both).

Weighted ECMP achieves the needed unequal load balancing based on the CNF count on each TOR. In the Weighted ECMP for IP prefix routes use case example, if Weighted ECMP is enabled, the TORs add a weight encoded in the EVPN IP Prefix route, where the weight matches the count of CNFs that each TOR has locally. The Border Leaf creates an ECMP set for prefix 10.1.1.0/24 where the weights are considered when distributing the load to the prefix.
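The resulting distribution can be modeled with a short sketch that expands each next hop in the ECMP set in proportion to its weight and hashes flows onto the set (an illustrative model only; the actual SR OS hashing algorithm is not shown here):

```python
import zlib

# Build a weighted ECMP set by repeating each next hop "weight" times,
# then hash each flow key onto the expanded set. With weights TOR2=1,
# TOR3=3, TOR4=1, roughly three of every five flows land on TOR3.

def build_ecmp_set(next_hops):
    """next_hops: mapping of next-hop name to advertised weight."""
    ecmp = []
    for nh, weight in sorted(next_hops.items()):
        ecmp.extend([nh] * weight)
    return ecmp

def pick_next_hop(ecmp, flow_key):
    """Deterministically hash a flow identifier onto the ECMP set."""
    return ecmp[zlib.crc32(flow_key.encode()) % len(ecmp)]

ecmp = build_ecmp_set({"TOR2": 1, "TOR3": 3, "TOR4": 1})
counts = {}
for i in range(5000):
    nh = pick_next_hop(ecmp, f"flow-{i}")
    counts[nh] = counts.get(nh, 0) + 1
# counts now shows TOR3 carrying about three times the load of TOR2/TOR4.
```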

The procedures associated with EVPN Weighted ECMP for IP Prefix routes can be divided into advertising and receiving procedures:

  • Use the following commands to configure the advertising procedures for EVPN IFL.
    configure service vprn bgp-evpn mpls evpn-link-bandwidth advertise
    configure service vprn bgp-evpn segment-routing-v6 evpn-link-bandwidth advertise
    Use the following command to configure the advertising procedures for EVPN IFF.
    configure service vpls bgp-evpn ip-route-link-bandwidth advertise

    The advertise command triggers the advertisement of the EVPN Link Bandwidth extended community with a weight that matches the CE count advertised by the route. The dynamic weight can, optionally, be overridden by configuring the advertise weight value.

  • Use the following commands to configure the receiving procedures for EVPN-IFL.
    configure service vprn bgp-evpn mpls evpn-link-bandwidth weighted-ecmp
    configure service vprn bgp-evpn segment-routing-v6 evpn-link-bandwidth weighted-ecmp
    Use the following command to configure the receiving procedures for EVPN-IFF.
    configure service vpls bgp-evpn ip-route-link-bandwidth weighted-ecmp

    When the weighted-ecmp command is enabled, the receiving PE installs IP Prefix routes in the VPRN route-table associated with a normalized weight that is derived from the signaled weight.

    • For EVPN-IFL, for weighted ECMP across EVPN next hops and CE next hops, the following commands must be configured.
      configure service vprn bgp group evpn-link-bandwidth add-to-received-bgp
      configure service vprn bgp eibgp-loadbalance
    • For EVPN-IFF, Weighted ECMP can only be applied to EVPN next hops; the eibgp-loadbalance command does not apply.
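On the receiving side, one plausible normalization of signaled weights into route-table ECMP weights is to divide them by their greatest common divisor. This is only an assumption for illustration; the exact SR OS normalization algorithm is not described here:

```python
from functools import reduce
from math import gcd

# Illustrative normalization: reduce the signaled weights by their GCD so
# the route table stores the smallest integers with the same proportions.

def normalize_weights(weights):
    """weights: mapping of next hop to signaled link-bandwidth weight."""
    g = reduce(gcd, weights.values())
    return {nh: w // g for nh, w in weights.items()}

# PE2 receives weight 2 from PE4 and weight 1 from PE5.
print(normalize_weights({"PE4": 2, "PE5": 1}))  # {'PE4': 2, 'PE5': 1}
print(normalize_weights({"PE4": 4, "PE5": 2}))  # {'PE4': 2, 'PE5': 1}
```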

EVPN-IFL MPLS service configuration

The following example shows the configuration of the EVPN Weighted ECMP feature for EVPN-IFL routes with MPLS transport. A similar configuration applies to EVPN-IFL routes with SRv6 transport.

Suppose PE2, PE4, and PE5 are attached to the same EVPN-IFL service on VPRN 2000. PE4 is connected to two CEs (CE-41 and CE-42) and PE5 to one CE (CE-51). The three CEs advertise the same prefix 192.168.1.0/24 using PE-CE BGP, and the goal is for PE2 to send twice as many flows to 192.168.1.0/24 via PE4 as via PE5.

The configuration of PE4 and PE5 follows:

*A:PE-4# configure service vprn 2000 
*A:PE-4>config>service>vprn# info 
----------------------------------------------
            ecmp 10
            autonomous-system 64500
            interface "to-CE41" create
                address 10.41.0.1/24
                sap pxc-3.a:401 create
                exit
            exit
            interface "to-CE42" create
                address 10.42.0.1/24
                sap pxc-3.a:402 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2000
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.4:2000
                    vrf-target target:64500:2000
                    no shutdown
                exit
            exit
            bgp
                multi-path
                    ipv4 10
                exit
                eibgp-loadbalance
                router-id 4.4.4.4
                rapid-withdrawal
                group "pe-ce"
                    family ipv4 ipv6
                    neighbor 10.41.0.2
                        peer-as 64541
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                    neighbor 10.42.0.2
                        peer-as 64542
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
                no shutdown
            exit
            no shutdown


A:PE-5# configure service vprn 2000 
A:PE-5>config>service>vprn# info 
----------------------------------------------
            autonomous-system 64500
            interface "to-CE51" create
                address 10.51.0.1/24
                sap pxc-3.a:501 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2000
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.5:2000
                    vrf-target target:64500:2000
                    no shutdown
                exit
            exit
            bgp
                multi-path
                    ipv4 10
                exit
                eibgp-loadbalance
                router-id 5.5.5.5
                rapid-withdrawal
                group "pe-ce"
                    family ipv4 ipv6
                    neighbor 10.51.0.2
                        peer-as 64551
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
                no shutdown
            exit
            no shutdown

The configuration on PE2 follows:

*A:PE-2# configure service vprn 2000 
*A:PE-2>config>service>vprn# info 
----------------------------------------------
            ecmp 10
            interface "to-PE" create
                address 20.10.0.1/24
                sap pxc-3.a:2000 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2000
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.2:2000
                    vrf-target target:64500:2000
                    no shutdown
                exit
            exit
            no shutdown

PE4 and PE5 IP Prefix route advertisement

As a result of the preceding configuration, PE4 (next-hop 2001:db8::4) and PE5 (next-hop 2001:db8::5) advertise the IP Prefix route from the CEs with weights 2 and 1, respectively:

show router bgp routes evpn ip-prefix prefix 192.168.1.0/24 community target:64500:2000 hunt 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network        : n/a
Nexthop        : 2001:db8::4
Path Id        : None                   
From           : 2001:db8::4
Res. Nexthop   : fe80::b446:ffff:fe00:142
Local Pref.    : 100                    Interface Name : int-PE-2-PE-4
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:2000 evpn-bandwidth:1:2
                 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.4
Flags          : Used Valid Best IGP 
Route Source   : Internal
AS-Path        : 64541 
EVPN type      : IP-PREFIX              
ESI            : ESI-0
Tag            : 0                      
Gateway Address: 00:00:00:00:00:00
Prefix         : 192.168.1.0/24
Route Dist.    : 192.0.2.4:2000         
MPLS Label     : LABEL 524283           
Route Tag      : 0                      
Neighbor-AS    : 64541
Orig Validation: N/A                    
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
Last Modified  : 01h19m43s              
 
Network        : n/a                  
Nexthop        : 2001:db8::5
Path Id        : None                   
From           : 2001:db8::5
Res. Nexthop   : fe80::b449:1ff:fe01:1f
Local Pref.    : 100                    Interface Name : int-PE-2-PE-5
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:2000 evpn-bandwidth:1:1
                 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.5
Flags          : Used Valid Best IGP 
Route Source   : Internal
AS-Path        : 64551 
EVPN type      : IP-PREFIX              
ESI            : ESI-0
Tag            : 0                      
Gateway Address: 00:00:00:00:00:00
Prefix         : 192.168.1.0/24
Route Dist.    : 192.0.2.5:2000         
MPLS Label     : LABEL 524285           
Route Tag      : 0                      
Neighbor-AS    : 64551
Orig Validation: N/A                    
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
Last Modified  : 00h08m45s              
 
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 2
===============================================================================

PE2 prefix installation

The show router 2000 route-table extensive command performed on PE2 shows that PE2 installs the prefix with weights 2 and 1 for PE4 and PE5, respectively:

show router 2000 route-table 192.168.1.0/24 extensive 
===============================================================================
Route Table (Service: 2000)
===============================================================================
Dest Prefix             : 192.168.1.0/24
  Protocol              : EVPN-IFL
  Age                   : 01h22m47s
  Preference            : 170
  Indirect Next-Hop     : 2001:db8::4
    Label               : 524283
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 2
    Resolving Next-Hop  : 2001:db8::4 (LDP tunnel)
      Metric            : 10
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 2001:db8::5
    Label               : 524285
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8::5 (LDP tunnel)
      Metric            : 10
      ECMP-Weight       : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

*A:PE-2# show router 2000 fib 1 192.168.1.0/24 extensive                                              

===============================================================================
FIB Display (Service: 2000)
===============================================================================
Dest Prefix             : 192.168.1.0/24
  Protocol              : EVPN-IFL
  Installed             : Y
  Indirect Next-Hop     : 2001:db8::4
    Label               : 524283
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 2
    Resolving Next-Hop  : 2001:db8::4 (LDP tunnel)
      ECMP-Weight       : 1
  Indirect Next-Hop     : 2001:db8::5
    Label               : 524285
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8::5 (LDP tunnel)
      ECMP-Weight       : 1
===============================================================================
Total Entries : 1
===============================================================================

EVPN-IFL handling

In the case of EVPN-IFL, Weighted ECMP is also supported for EIBGP load balancing among EVPN and CE next hops. For example, PE4 installs the same prefix with an EVPN-IFL next hop and two CE next hops, each with its normalized weight:

show router 2000 route-table 192.168.1.0/24 extensive
===============================================================================
Route Table (Service: 2000)
===============================================================================
Dest Prefix             : 192.168.1.0/24
  Protocol              : BGP
  Age                   : 00h02m27s
  Preference            : 170
  Indirect Next-Hop     : 10.41.0.2
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 10.41.0.2
      Interface         : to-CE41
      Metric            : 0
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 10.42.0.2
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 10.42.0.2
      Interface         : to-CE42
      Metric            : 0
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 2001:db8::5
    Label               : 524285
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8::5 (LDP tunnel)
      Metric            : 10
      ECMP-Weight       : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

EVPN IP aliasing for IP prefix routes

SR OS supports IP aliasing for EVPN IP prefix routes in the EVPN-IFL (Interface-less) and EVPN-IFF (Interface-ful) models, as described in draft-sajassi-bess-evpn-ip-aliasing.

IP aliasing allows PEs to load-balance flows to multiple PEs attached to the same prefix, even if not all of them advertise reachability to the prefix in IP prefix routes. IP aliasing works based on the following principles:

  • It requires the configuration of a virtual Ethernet Segment (ES), for example, ES-1, that is associated with a vprn-next-hop and an evi configured in the vprn context. All PEs with reachability to the vprn-next-hop, via the non-EVPN route, advertise their attachment to the ES using EVPN Auto-Discovery per ES and per EVI routes in the VPRN service context.
  • Any PE that receives a BGP PE-CE route for a prefix P via next-hop N, where N matches the active vprn-next-hop, advertises an IP prefix route for P with the ESI of the ES; for example, ESI-1.
  • On reception, PEs importing IP prefix routes with ESI-1 install the prefix P in the route table using the next hops of the AD per-EVI routes for ESI-1, instead of the next hop of the IP prefix route.
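The resolution step in the last bullet can be sketched as follows (an illustrative model; the route fields and next-hop names are hypothetical):

```python
# Sketch of IP aliasing resolution: a prefix received with a non-zero ESI
# is installed using the next hops of the AD per-EVI routes for that ESI,
# rather than the next hop carried in the IP prefix route itself.

def resolve_prefix_next_hops(ip_prefix_route, ad_per_evi_routes):
    """ip_prefix_route: dict with 'prefix', 'next_hop', and 'esi' fields.
    ad_per_evi_routes: mapping of ESI to the PEs that advertised it."""
    esi = ip_prefix_route["esi"]
    if esi != "ESI-0" and esi in ad_per_evi_routes:
        return ad_per_evi_routes[esi]      # aliasing: all ES peers
    return [ip_prefix_route["next_hop"]]   # regular resolution

# Only Leaf-1 advertises the prefix, but both leaves advertised AD
# per-EVI routes for ESI-1, so traffic is load-balanced to both.
route = {"prefix": "192.168.0.0/24", "next_hop": "Leaf-1", "esi": "ESI-1"}
print(resolve_prefix_next_hops(route, {"ESI-1": ["Leaf-1", "Leaf-2"]}))
```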

Figure 116 shows an example of the use of IP aliasing in an EVPN-IFL model.

Figure 116. EVPN IP aliasing in an EVPN-IFL model

In the example shown in the preceding figure, a multirack Virtual Network Function (VNF) is attached to Leaf-1 and Leaf-2. Although the VNF supports only a single PE-CE eBGP session, to Leaf-1, the preferred behavior is for Border-Leaf-1 to load-balance the traffic toward the VNF using both Leaf-1 and Leaf-2 as next hops. EVPN IP aliasing achieves this behavior based on the following configuration.

An ES L3-ES-1 is configured in Leaf-1 and Leaf-2. The ES is configured for all-active mode and is associated with the vprn-next-hop of the VNF. The association with the evi of the VPRN where the next hop is installed is also required.

Leaf-1 and Leaf-2 ES configuration (MD-CLI)

[ex:/configure service system bgp evpn]
A:admin@node2# info 
    ethernet-segment "L3-ES-1" {
        admin-state enable
        type virtual
        esi 0x01010101010000000000
        multi-homing-mode all-active
        association {
            vprn-next-hop 1.1.1.1 {
                virtual-ranges {
                    evi 2500 { }
                }
            }
        }
    }

Leaf-1 and Leaf-2 ES configuration (classic CLI)

config>service>system>bgp-evpn# info
----------------------------------------------
                ethernet-segment "L3-ES-1" virtual create
                    esi 01:01:01:01:01:00:00:00:00:00
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    vprn-next-hop 1.1.1.1
                    evi
                        evi-range 2500
                    exit
                    no shutdown
                exit

The VPRN service configuration in Leaf-1 and Leaf-2 requires the configuration of the evi so that the ES is active on the service.

Leaf-1 VPRN configuration (MD-CLI)

[ex:/configure service vprn "2500"]
A:admin@node2# info
    admin-state enable
    customer "1"
    autonomous-system 64502
    bgp-evpn {
        mpls 1 {
            admin-state enable
            route-distinguisher "192.168.0.1:2500"
            evi 2500
            vrf-target {
                community "target:64500:2500"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    bgp {
        min-route-advertisement 1
        router-id 2.2.2.2
        rapid-withdrawal true
        ebgp-default-reject-policy {
            import false
            export false
        }
        next-hop-resolution {
            use-bgp-routes true
        }
        group "pe-ce" {
            multihop 10
            family {
                ipv4 true
                ipv6 true
            }
        }
        neighbor "1.1.1.1" {
            group "pe-ce"
            peer-as 64501
        }
    }
    interface "irb1" {
        ipv4 {
            primary {
                address 10.10.10.254
                prefix-length 24
            }
            vrrp 1 {
                backup [10.10.10.254]
                owner true
                passive true
            }
        }
        vpls "BD2501" {
            evpn {
                arp {
                    learn-dynamic false
                    advertise dynamic {
                    }
                }
            }
        }
    }
    interface "lo1" {
        loopback true
        ipv4 {
            primary {
                address 2.2.2.2
                prefix-length 32
            }
        }
    }
    static-routes {
        route 1.1.1.1/32 route-type unicast {
            next-hop "10.10.10.1" {
                admin-state enable
            }
        }
    }

Leaf-1 VPRN configuration (classic CLI)

config>service>vprn 2500 # info
----------------------------------------------
            autonomous-system 64502
            interface "irb1" create
                address 10.10.10.254/24
                vrrp 1 owner passive
                    backup 10.10.10.254
                exit
                vpls "BD2501"
                    evpn
                        arp
                            no learn-dynamic
                            advertise dynamic
                        exit
                    exit
                exit
            exit
            interface "lo1" create
                address 2.2.2.2/32
                loopback
            exit
            static-route-entry 1.1.1.1/32
                next-hop 10.10.10.1
                    no shutdown
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2500
                    route-distinguisher 192.168.0.1:2500
                    vrf-target target:64500:2500
                    no shutdown
                exit
            exit
            bgp
                min-route-advertisement 1
                router-id 2.2.2.2
                rapid-withdrawal
                next-hop-resolution
                    use-bgp-routes
                exit
                group "pe-ce"
                    family ipv4 ipv6
                    multihop 10
                    neighbor 1.1.1.1
                        peer-as 64501
                    exit
                exit
                no shutdown
            exit
            no shutdown

Leaf-2 VPRN configuration (MD-CLI)

[ex:/configure service vprn "2500"]
A:admin@node2# info
    admin-state enable
    customer "1"
    bgp-evpn {
        mpls 1 {
            admin-state enable
            route-distinguisher "192.168.0.2:2500"
            evi 2500
            vrf-target {
                community "target:64500:2500"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }
    interface "irb1" {
        ipv4 {
            primary {
                address 10.10.10.254
                prefix-length 24
            }
            vrrp 1 {
                backup [10.10.10.254]
                owner true
                passive true
            }
        }
        vpls "BD2501" {
            evpn {
                arp {
                    learn-dynamic false
                    advertise dynamic {
                    }
                }
            }
        }
    }
    static-routes {
        route 1.1.1.1/32 route-type unicast {
            next-hop "10.10.10.1" {
                admin-state enable
            }
        }
    }

Leaf-2 VPRN configuration (classic CLI)

config>service>vprn 2500 # info
----------------------------------------------
            interface "irb1" create
                address 10.10.10.254/24
                vrrp 1 owner passive
                    backup 10.10.10.254
                exit
                vpls "BD2501"
                    evpn
                        arp
                            no learn-dynamic
                            advertise dynamic
                        exit
                    exit
                exit
            exit
            static-route-entry 1.1.1.1/32
                next-hop 10.10.10.1
                    no shutdown
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2500
                    route-distinguisher 192.168.0.2:2500
                    vrf-target target:64500:2500
                    no shutdown
                exit
            exit
            no shutdown

The Border-Leaf-1 configuration also requires the addition of the evi in the VPRN. This allows the creation of ECMP sets in which the next hops of the received IP prefixes are linked to the next hops of the AD per-EVI routes.

Border-Leaf-1 VPRN configuration (MD-CLI)

[ex:/configure service vprn "2500"]
A:admin@node2# info 
    admin-state enable
    customer "1"
    ecmp 4
    bgp-evpn {
        mpls 1 {
            admin-state enable
            route-distinguisher "192.168.0.3:2500"
            evi 2500
            vrf-target {
                community "target:64500:2500"
            }
            auto-bind-tunnel {
                resolution any
            }
        }
    }

Border-Leaf-1 VPRN configuration (classic CLI)

config>service>vprn 2500 # info
----------------------------------------------
            ecmp 4
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2500
                    route-distinguisher 192.168.0.3:2500
                    vrf-target target:64500:2500
                    no shutdown
                exit
            exit
            no shutdown

Based on the preceding configuration and the reachability of next-hop 1.1.1.1 via a non-EVPN route, the two leaf nodes advertise their attachment to the ES via AD per-ES or EVI routes. Use the following command to display the advertisement status for ESI routes.

show router bgp routes evpn auto-disc esi 01:01:01:01:01:00:00:00:00:00

Advertisement of Auto-Discovery per-ES routes

===============================================================================
 BGP Router ID:192.0.2.3        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN Auto-Disc Routes
===============================================================================
Flag  Route Dist.         ESI                           NextHop
      Tag                                               Label
-------------------------------------------------------------------------------
u*>i  192.168.0.1:2500    01:01:01:01:01:00:00:00:00:00 192.168.0.1
      0                                                 LABEL 524282

u*>i  192.168.0.1:2500    01:01:01:01:01:00:00:00:00:00 192.168.0.1
      MAX-ET                                            LABEL 0

u*>i  192.168.0.2:2500    01:01:01:01:01:00:00:00:00:00 192.168.0.2
      0                                                 LABEL 524283

u*>i  192.168.0.2:2500    01:01:01:01:01:00:00:00:00:00 192.168.0.2
      MAX-ET                                            LABEL 0

-------------------------------------------------------------------------------
Routes : 4
===============================================================================

At the same time, upon reception of the BGP PE-CE route from the VNF with prefix 11.11.11.11/32 (with next-hop 1.1.1.1, matching the vprn-next-hop), Leaf-1 readvertises the route in an IP prefix route with the ESI of the IP aliasing ES. Use the following command.

show router bgp routes evpn ip-prefix prefix 11.11.11.11/32 

Leaf-1 readvertises the route in an IP Prefix route with the ESI of the IP Aliasing ES

===============================================================================
 BGP Router ID:192.0.2.3        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
Flag  Route Dist.         Prefix
      Tag                 Gw Address
                          NextHop
                          Label
                          ESI
-------------------------------------------------------------------------------
u*>i  192.168.0.3:2500    11.11.11.11/32
      0                   00:00:00:00:00:00
                          192.168.0.1
                          LABEL 524279
                          01:01:01:01:01:00:00:00:00:00

-------------------------------------------------------------------------------
Routes : 1
===============================================================================

The IP prefix routes with a non-zero ESI are also processed and recursively resolved on the PEs that are part of the ES. In the EVPN IP aliasing in an EVPN-IFL model example, Leaf-2 installs the prefix with the next hop associated with the ES instead of the next hop of the IP prefix route; that is, the resolved next hop is 1.1.1.1 instead of the IP prefix route next-hop 192.168.0.1. Use the following command to display the prefix with the next-hop association.

show router 2500 route-table 11.11.11.11/32 extensive 

Prefix with next hop associated with ES

===============================================================================
Route Table (Service: 2500)
===============================================================================
Dest Prefix             : 11.11.11.11/32
  Protocol              : EVPN-IFL
  Age                   : 22h41m44s
  Preference            : 170
  Indirect Next-Hop     : 1.1.1.1
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : N/A
    Resolving Next-Hop  : 1.1.1.1
      Interface         : irb1
      Metric            : 0
      ECMP-Weight       : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

Although the preceding example is based on the EVPN IFL model, a vprn-next-hop ES can also be associated with VPRNs that use the EVPN IFF model to exchange IP prefix routes. In the EVPN IFF model, the vprn-next-hop ES is associated with the VPRN if the R-VPLS connected to its EVPN tunnel contains the evi configured in the ES.

The following considerations about vprn-next-hop ES apply:

  • The ES is operationally up as long as it is administratively enabled. The operational state does not reflect the presence of the VPRN next hop in the VPRN’s route table.
  • The AD per-ES or EVI routes for the ES are advertised as long as the VPRN next hop is installed in the route table (as a non-EVPN route). If the vprn-next-hop is installed in the VPRN’s route table as an EVPN IP prefix route, the AD per-ES or EVI routes are not advertised.
  • A node can generate an IP prefix route with the ESI of a vprn-next-hop as long as the node has the vprn-next-hop installed in its VPRN’s route table, even as an EVPN IP prefix route.
  • The AD per-ES or EVI routes are advertised with the RD and route target of the VPRN instance associated with the evi configured for the vprn-next-hop.
  • ES routes are also advertised for the ES and are responsible for the DF Election in the ES in the case of single-active mode.
  • All the non-DF PEs in the ES advertise their AD per-EVI routes with bit P=0 and bit B=1, whereas the DF PE advertises its AD per-EVI route with P=1 and B=0. When creating the ECMP set for a prefix associated with an ESI, the remote PEs exclude the PEs whose AD per-EVI routes indicate P=0.
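The P-bit filtering in the last bullet can be sketched as follows. This is a hypothetical illustration, not SR OS code; the data structures and function name are assumptions made for the sketch.

```python
# Hypothetical sketch of how a remote PE builds the ECMP set for a prefix
# associated with an ESI: only AD per-EVI routes advertising P=1 (the DF
# in single-active mode) contribute next hops; P=0 routes are excluded.

def ecmp_set_for_esi(ad_per_evi_routes, esi):
    """Next hops eligible for the ECMP set: AD per-EVI routes with P=1."""
    return sorted(r["next_hop"] for r in ad_per_evi_routes
                  if r["esi"] == esi and r["p_bit"] == 1)

routes = [
    {"esi": "esi-1", "next_hop": "192.168.0.1", "p_bit": 1},  # DF: P=1, B=0
    {"esi": "esi-1", "next_hop": "192.168.0.2", "p_bit": 0},  # non-DF: P=0, B=1
]
print(ecmp_set_for_esi(routes, "esi-1"))  # only the DF: ['192.168.0.1']
```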

EVPN Sticky ECMP for IP prefix routes

SR OS supports sticky ECMP for EVPN-IFL and EVPN-IFF IP prefix routes. With non-sticky ECMP, or simply ECMP, for a specific IP prefix with n next hops, the router must rehash the flows when one of the next hops is removed or added. This can impact flows that are now sent to a different next hop.

Sticky ECMP refers to the ability of the router to minimize the impact of a change in the number of next hops for a specific IP prefix. For example, suppose an EVPN IP prefix P has an associated ECMP set of four next hops. In this case, the following actions occur when sticky ECMP is enabled:
  • Upon withdrawal of one of the next hops, only the affected flows are redistributed into the remaining three next hops, as equally as possible.
  • Upon addition of the fifth next hop, the router minimizes the impact on existing flows.

Sticky ECMP is implemented in software. The router emulates the behavior by repeating each ECMP next hop of the sticky route a number of times (according to the next-hop normalized weight) in different hash buckets, creating a fill pattern of size N for the incoming flows. In general, the closer the number of next hops gets to the maximum number of ECMP paths, the less even the resulting distribution. For detailed information about the general implementation of sticky ECMP in SR OS, see section BGP support for sticky ECMP in the 7450 ESS, 7750 SR, 7950 XRS, and VSR Unicast Routing Protocols Guide.
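The bucket-based emulation can be illustrated with a rough sketch. The bucket count, round-robin fill strategy, and function names below are assumptions for illustration only, not the actual SR OS algorithm.

```python
# Hypothetical illustration of software-based sticky ECMP: each next hop
# is repeated in a fixed-size table of hash buckets, and when a next hop
# is withdrawn only its buckets are reassigned, so flows hashed to the
# surviving next hops are untouched.

N_BUCKETS = 12  # illustrative fill-pattern size, not the SR OS value

def fill_pattern(next_hops):
    """Round-robin the next hops into N_BUCKETS hash buckets."""
    return [next_hops[i % len(next_hops)] for i in range(N_BUCKETS)]

def withdraw(buckets, gone, remaining):
    """Reassign only the buckets of the withdrawn next hop."""
    out, i = [], 0
    for nh in buckets:
        if nh == gone:
            nh = remaining[i % len(remaining)]  # spread as equally as possible
            i += 1
        out.append(nh)
    return out

before = fill_pattern(["A", "B", "C", "D"])
after = withdraw(before, "D", ["A", "B", "C"])
# Buckets that hashed to A, B, or C are unchanged; only D's buckets
# are redistributed among the remaining three next hops.
unchanged = sum(b == a for b, a in zip(before, after) if b != "D")
print(unchanged == sum(b != "D" for b in before))  # True
```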

An IP prefix is made sticky by configuring the sticky-ecmp policy action in an import policy (at the peer or VPRN level). Sticky ECMP for EVPN IP prefix routes is supported in combination with other ECMP features such as EVPN unequal ECMP or IP aliasing.

EVPN VLAN-aware bundle mode for BGP-EVPN VPLS or R-VPLS services

SR OS supports VLAN-aware bundle mode for BGP-EVPN VPLS and R-VPLS services, in compliance with RFC 7432. A Broadcast Domain (BD) in RFC 7432 is mapped to a VPLS service in SR OS. Multiple BDs (VPLS services) can be grouped under the same VLAN-aware bundle, where each BD is assigned a different Ethernet Tag ID.

Use the following command to associate a VPLS service with a bundle name.

configure service vpls bgp-evpn vlan-aware-bundle

Use the following commands to indicate the Ethernet Tag ID allocated for the VPLS service within the bundle:

  • MD-CLI
    configure service vpls bgp-evpn routes vlan-aware-bundle-eth-tag
  • classic CLI
    configure service vpls bgp-evpn vlan-aware-bundle eth-tag 

When the vlan-aware-bundle-eth-tag command is set to a non-zero value, the EVPN service routes (types 1, 2, and 3) advertised for the VPLS service carry this value in the Ethernet Tag ID field. On reception of EVPN routes with a non-zero Ethernet Tag ID, BGP imports the routes based on the import route target as usual. However, the system checks the received Ethernet Tag ID field and only processes routes whose Ethernet Tag ID matches the local vlan-aware-bundle-eth-tag value. In addition, use commands in the following context to display details of the VPLS services in a specific bundle.

show service vlan-aware-bundle 
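The Ethernet Tag ID check described above effectively demultiplexes route-target-imported routes to the BDs of a bundle. The following is a hypothetical sketch (not SR OS code); the function name and data structures are assumptions, with service names taken from the example below.

```python
# Hypothetical sketch: after BGP imports an EVPN route on its route
# target, the route is handed only to the BD (VPLS service) whose
# configured vlan-aware-bundle-eth-tag matches the route's Ethernet
# Tag ID; routes with no matching tag are not processed.

def demux_to_bd(routes, bundle):
    """Map each received route to the BD whose eth-tag matches, else None."""
    by_tag = {svc["eth_tag"]: svc["service"] for svc in bundle}
    return [(r["eth_tag"], by_tag.get(r["eth_tag"])) for r in routes]

bundle_1 = [{"service": "vpls-120-bundle-1", "eth_tag": 120},
            {"service": "vpls-121-bundle-1", "eth_tag": 121}]
routes = [{"eth_tag": 120}, {"eth_tag": 121}, {"eth_tag": 999}]
print(demux_to_bd(routes, bundle_1))
# [(120, 'vpls-120-bundle-1'), (121, 'vpls-121-bundle-1'), (999, None)]
```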

The following example shows a configuration for a VLAN-aware bundle, “bundle-1”. This bundle is composed of two services, with Ethernet Tag IDs 120 and 121, respectively.

MD-CLI

[ex:/configure service]
A:admin@node-2# info
    vpls "vpls-120-bundle-1" {
        admin-state enable
        service-id 120
        customer "1"
        routed-vpls {
        }
        bgp 1 {
        }
        bgp-evpn {
            evi 120
            vlan-aware-bundle "bundle-1"
            routes {
                vlan-aware-bundle-eth-tag 120
            }
            mpls 1 {
                admin-state enable
                ingress-replication-bum-label true
                auto-bind-tunnel {
                    resolution any
                }
            }
        }
        sap pxc-10.a:120 {
        }
    }
    vpls "vpls-121-bundle-1" {
        admin-state enable
        service-id 121
        customer "1"
        segment-routing-v6 1 {
            locator "LOC-1" {
                function {
                    end-dt2u {
                    }
                    end-dt2m {
                    }
                }
            }
        }
        bgp 1 {
            route-target {
                export "target:64500:120"
                import "target:64500:120"
            }
        }
        bgp-evpn {
            evi 121
            vlan-aware-bundle "bundle-1"
            routes {
                vlan-aware-bundle-eth-tag 121
            }
            segment-routing-v6 1 {
                admin-state enable
                source-address 2001:db8::2
                srv6 {
                    instance 1
                    default-locator "LOC-1"
                }
            }
        }
    }

classic CLI

A:node-2>config>service# info 
----------------------------------------------
        vpls 120 name "vpls-120-bundle-1" customer 1 create
            allow-ip-int-bind
            exit
            bgp
            exit
            bgp-evpn
                vlan-aware-bundle "bundle-1" eth-tag 120
                evi 120
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap pxc-10.a:120 create
                no shutdown
            exit
            no shutdown
        exit
        vpls 121 name "vpls-121-bundle-1" customer 1 create
            segment-routing-v6 1 create
                locator "LOC-1"
                    function
                        end-dt2u
                        end-dt2m
                    exit              
                exit
            exit
            bgp
                route-target export target:64500:120 import target:64500:120
            exit
            bgp-evpn
                vlan-aware-bundle "bundle-1" eth-tag 121
                evi 121
                segment-routing-v6 bgp 1 srv6-instance 1 default-locator "LOC-1" create
                    source-address 2001:db8::2
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit
Use the following command to display VLAN-aware bundle information.
# show service vlan-aware-bundle 
================================================================
VLAN Aware Bundle
================================================================
Bundle                           Service Id Eth Tag   Evi
----------------------------------------------------------------
bundle-1                         120        120       120
                                 121        121       121
----------------------------------------------------------------
Number of entries: 2
----------------------------------------------------------------
================================================================

===============================================================================
VLAN Aware Bundle Summary
===============================================================================
MAC Entries                            : 2
EVPN-MPLS Destinations                 : 2
EVPN-MPLS Ethernet Segment Destinations: 0
VXLAN Destinations                     : 0
VXLAN Ethernet Segment Destinations    : 0
SRv6 Destinations                      : 2
SRv6 Ethernet segment Destinations     : 0
===============================================================================
Use the following command to display VLAN-aware bundle forwarding database information.
# show service vlan-aware-bundle "bundle-1" fdb 
===============================================================================
Service Id: 120  Name: vpls-120-bundle-1

===============================================================================
Forwarding Database, Service 120
===============================================================================
ServId     MAC               Source-Identifier       Type     Last Change
            Transport:Tnl-Id                         Age      
-------------------------------------------------------------------------------
120        00:ca:fe:ca:fe:01 mpls-1:                 EvpnS:P  01/22/24 14:33:31
                             192.0.2.5:524270
           ldp:65543
-------------------------------------------------------------------------------
No. of MAC Entries: 1
-------------------------------------------------------------------------------
Legend:L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf T=Trusted
===============================================================================
===============================================================================
Service Id: 121  Name: vpls-121-bundle-1

===============================================================================
Forwarding Database, Service 121
===============================================================================
ServId     MAC               Source-Identifier       Type     Last Change
            Transport:Tnl-Id                         Age      
-------------------------------------------------------------------------------
121        00:ca:fe:ca:fe:01 srv6-1:                 EvpnS:P  01/22/24 14:33:39
                             192.0.2.5
           cafe:1:0:5:7b1c:d000::
-------------------------------------------------------------------------------
No. of MAC Entries: 1
-------------------------------------------------------------------------------
Legend:L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf T=Trusted
===============================================================================

Configuring an EVPN service with CLI

This section provides information to configure EVPN services using the command line interface.

EVPN-VXLAN configuration examples

Layer 2 PE example

This section shows a configuration example for three PEs in a Data Center, based on the following assumptions:

  • PE-1 is a Data Center Network Virtualization Edge device (NVE) where service VPLS 2000 is configured.

  • PE-2 and PE-3 are redundant Data Center Gateways providing Layer 2 connectivity to the WAN for service VPLS 2000.

DC PE-1 configuration for service VPLS 2000

DC PE-2 and PE-3 configuration with SAPs at the WAN side (advertisement of all macs and unknown-mac-route):

            vpls 2000 name "2000" customer 1 create
                vxlan instance 1 vni 2000 create
                exit
                bgp
                    route-distinguisher 65001:2000
                    route-target export target:65000:2000 import target:65000:2000
                exit
                bgp-evpn
                    unknown-mac-route
                    vxlan bgp 1 vxlan-instance 1
                        no shutdown
                    exit
                exit
                site "site-1" create
                    site-id 1
                    sap 1/1/1:1           
                    no shutdown           
                exit                      
                sap 1/1/1:1 create        
                    no shutdown           
                exit                      
                no shutdown               
            exit 

DC PE-2 and PE-3 configuration with BGP-AD spoke-SDPs at the WAN side (mac-advertisement disable, only unknown-mac-route advertised):

service vpls 2000 name "vpls2000" customer 1 create
    vxlan instance 1 vni 2000 create
    bgp 
        pw-template-binding 1 split-horizon-group "to-WAN" import-rt target:65000:2500
        vsi-export "export-policy-1" #policy exporting the WAN and DC RTs
        vsi-import "import-policy-1" #policy importing the WAN and DC RTs
        route-distinguisher 65001:2000
    bgp-ad
        no shutdown
        vpls-id 65000:2000
    bgp-evpn
        mac-advertisement disable 
        unknown-mac-route
        vxlan bgp 1 vxlan-instance 1
            no shutdown
    site site-1 create
        split-horizon-group "to-WAN"
        no shutdown
        site-id 1

EVPN for VXLAN in R-VPLS services example

This section shows a configuration example for three 7750 SR, 7450 ESS, or 7950 XRS PEs in a Data Center, based on the following assumptions:

PE-1 is a Data Center Network Virtualization Edge device (NVE) where the following services are configured:

  • R-VPLS 2001 and R-VPLS 2002 are subnets where Tenant Systems are connected

  • VPRN 500 is a VPRN instance providing inter-subnet forwarding between the local subnets and from local subnets to the WAN subnets

  • R-VPLS 501 is an IRB backhaul R-VPLS service that provides EVPN-VXLAN connectivity to the VPRNs in PE-2 and PE-3

*A:PE-1>config>service# info
        vprn 500 name "vprn500" customer 1 create
            ecmp 4
            route-distinguisher 65071:500
            vrf-target target:65000:500
            interface "evi-501" create
                address 10.30.30.1/24
                vpls "evpn-vxlan-501"
                exit
            exit
            interface "subnet-2001" create
                address 10.10.10.1/24
                vpls "r-vpls 2001"
                exit
            exit
            interface "subnet-2002" create
                address 10.20.20.1/24
                vpls "r-vpls 2002"
                exit
            exit
            no shutdown
        exit
        vpls 501 name "evpn-vxlan-501" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 501 create
            exit
            bgp
                route-distinguisher 65071:501
                route-target export target:65000:501 import target:65000:501
            exit
            bgp-evpn
                ip-route-advertisement incl-host
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit
        vpls 2001 name "r-vpls 2001" customer 1 create
            allow-ip-int-bind
            sap 1/1/1:21 create
            exit
            sap 1/1/1:501 create
            exit
            no shutdown
        exit                          
        vpls 2002 name "r-vpls 2002" customer 1 create
            allow-ip-int-bind
            sap 1/1/1:22 create
            exit
            sap 1/1/1:502 create
            exit
            no shutdown
        exit                          

PE-2 and PE-3 are redundant Data Center Gateways providing Layer 3 connectivity to the WAN for subnets "subnet-2001" and "subnet-2002". The following configuration excerpt shows an example for PE-2. PE-3 would have an equivalent configuration.

*A:PE-2>config>service# info
        vprn 500  name "vprn500" customer 1 create
            ecmp 4
            route-distinguisher 65072:500
            auto-bind-tunnel
              resolution-filter
                gre
                ldp
                rsvp
              exit
              resolution filter
            exit
            vrf-target target:65000:500
            interface "evi-501" create
                address 10.30.30.2/24
                vpls "evpn-vxlan-501"
                exit
            exit
            no shutdown
        exit
        vpls 501 name "evpn-vxlan-501" customer 1 create
            allow-ip-int-bind
            vxlan instance 1 vni 501 create
            exit
            bgp
                route-distinguisher 65072:501
                route-target export target:65000:501 import target:65000:501
            exit                      
            bgp-evpn
                ip-route-advertisement incl-host
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit

EVPN for VXLAN in EVPN tunnel R-VPLS services example

The example in EVPN for VXLAN in R-VPLS services example can be optimized by using EVPN tunnel R-VPLS services instead of regular IRB backhaul R-VPLS services. If EVPN tunnels are used, the corresponding R-VPLS services cannot contain SAPs or SDP-bindings and the VPRN interfaces do not need IP addresses.

The following excerpt shows the configuration in PE-1 for the VPRN 500. The R-VPLS 501, 2001 and 2002 can keep the same configuration as shown in the previous section.

*A:PE-1>config>service# info
        vprn 500 name "vprn500" customer 1 create
            ecmp 4
            route-distinguisher 65071:500
            vrf-target target:65000:500
            interface "evi-501" create
                vpls "evpn-vxlan-501"
                    evpn-tunnel   # no need to configure an IP address
                exit
            exit
            interface "subnet-2001" create
                address 10.10.10.1/24
                vpls "r-vpls 2001"
                exit
            exit
            interface "subnet-2002" create
                address 20.20.20.1/24
                vpls "r-vpls 2002"
                exit
            exit
            no shutdown
        exit

The VPRN 500 configuration on PE-2 and PE-3 is changed in the same way: the evpn-tunnel command is added and the IP address of the EVPN-tunnel R-VPLS interface is removed. No other changes are required.

*A:PE-2>config>service# info
        vprn 500 name "vprn500" customer 1 create
            ecmp 4
            route-distinguisher 65072:500
            auto-bind-tunnel
              resolution-filter
                gre
                ldp
                rsvp
              exit
              resolution filter
            exit
            vrf-target target:65000:500
            interface "evi-501" create
                vpls "evpn-vxlan-501"
                    evpn-tunnel   # no need to configure an IP address
                exit
            exit
            no shutdown
        exit

EVPN for VXLAN in R-VPLS services with IPv6 interfaces and prefixes example

In the following configuration example, PE1 is connected to CE1 in VPRN 30 through a dual-stack IP interface. VPRN 30 is connected to an EVPN-tunnel R-VPLS interface enabled for IPv6.

In the following configuration excerpt, PE1 advertises the 172.16.0.0/24 and 2001:db8:1000::1 prefixes in BGP EVPN in two separate NLRIs. The NLRI for the IPv4 prefix uses a gateway IP of 0 and a non-zero gateway MAC, whereas the NLRI for the IPv6 prefix is sent with the gateway IP set to the link-local address of interface "int-evi-301" and no gateway MAC.

*A:PE1>config>service# info 
        vprn 30 name "vprn30" customer 1 create
            route-distinguisher 192.0.2.1:30
            vrf-target target:64500:30
            interface "int-PE-1-CE-1" create
                enable-ingress-stats
                address 172.16.0.254/24
                ipv6
                    address 2001:db8:1000::1/64 
                exit
                sap 1/1/1:30 create
                exit
            exit
            interface "int-evi-301" create
                ipv6
                exit
                vpls "evi-301"
                    evpn-tunnel
                exit
            exit
            no shutdown
----------------------------------------------
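The structure of the two prefix advertisements can be sketched as follows. This is a simplified illustration only, not the on-the-wire encoding; the gateway MAC and link-local address values are hypothetical placeholders:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IpPrefixRoute:
    # simplified view of an EVPN IP-prefix advertisement, for illustration only
    prefix: str
    gw_ip: str                 # gateway IP field of the NLRI
    gw_mac: Optional[str]      # gateway MAC, carried as a BGP extended community

# IPv4 prefix: gateway IP = 0, non-zero gateway MAC (MAC value is hypothetical)
rt5_v4 = IpPrefixRoute("172.16.0.0/24", "0.0.0.0", "00:0d:01:01:00:01")

# IPv6 prefix: gateway IP = link-local address of "int-evi-301"
# (address shown is hypothetical), and no gateway MAC
rt5_v6 = IpPrefixRoute("2001:db8:1000::/64", "fe80::7c:61ff:fe52:c1a1", None)
```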

EVPN-MPLS configuration examples

EVPN all-active multihoming example

This section shows a configuration example for three 7750 SR, 7450 ESS, or 7950 XRS PEs, based on the following assumptions:

  • PE-1 and PE-2 are multihomed to CE-12, which uses a LAG to connect to the network. CE-12 is connected to LAG SAPs configured in an all-active multihoming Ethernet segment.

  • PE-3 is a remote PE that performs aliasing for traffic destined for the CE-12.
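With aliasing, PE-3 can load-balance known unicast traffic toward CE-12 across both PE-1 and PE-2, because both advertise reachability for the same ESI. The following is a minimal Python sketch of that flow-based selection; the hashing scheme is an illustration of the concept, not the router's actual algorithm:

```python
import hashlib

def aliasing_next_hop(esi_peers, flow_key, ecmp=4):
    """Pick an egress PE for a unicast flow toward a multihomed ESI.

    esi_peers: PEs that advertised AD routes for the ESI (e.g. PE-1, PE-2).
    ecmp: maximum number of paths used, as set by the ecmp command.
    """
    # limit the candidate set to the configured ECMP width
    paths = sorted(esi_peers)[:ecmp]
    # hash the flow so that one flow always maps to the same PE
    digest = hashlib.sha256(flow_key.encode()).hexdigest()
    return paths[int(digest, 16) % len(paths)]

# PE-3 maps each flow to one of the PEs attached to the Ethernet segment
pe = aliasing_next_hop(["192.0.2.69", "192.0.2.72"], "src-mac->dst-mac")
```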

The following configuration excerpt applies to a VPLS-1 on PE-1 and PE-2, as well as the corresponding Ethernet-segment and LAG commands.


A:PE1# configure lag 1 
A:PE1>config>lag# info 
----------------------------------------------
        mode access
        encap-type dot1q
        port 1/1/2 
        lacp active administrative-key 1 system-id 00:00:00:00:69:72 
        no shutdown
----------------------------------------------
A:PE1>config>lag# /configure service system bgp-evpn 
A:PE1>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.69:0
                ethernet-segment "ESI-71" create
                    esi 0x01000000007100000001
                    es-activation-timer 10
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    lag 1
                    no shutdown
                exit
----------------------------------------------
A:PE1>config>service>system>bgp-evpn# /configure service vpls 1 
A:PE1>config>service>vpls# info 
----------------------------------------------
            bgp
            exit
            bgp-evpn
                cfm-mac-advertisement
                evi 1
                vxlan
                    shutdown
                exit
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap lag-1:1 create
            exit
            no shutdown
----------------------------------------------

A:PE2# configure lag 1 
A:PE2>config>lag# info 
----------------------------------------------
        mode access
        encap-type dot1q
        port 1/1/3 
        lacp active administrative-key 1 system-id 00:00:00:00:69:72 
        no shutdown
----------------------------------------------
A:PE2>config>lag# /configure service system bgp-evpn 
A:PE2>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.72:0
                ethernet-segment "ESI-71" create
                    esi 0x01000000007100000001
                    es-activation-timer 10
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    lag 1
                    no shutdown
                exit
----------------------------------------------
A:PE2>config>service>system>bgp-evpn# /configure service vpls 1 
A:PE2>config>service>vpls# info 
----------------------------------------------
            bgp
            exit
            bgp-evpn
                cfm-mac-advertisement
                evi 1
                vxlan
                    shutdown
                exit
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap lag-1:1 create
            exit
            no shutdown
----------------------------------------------

The following configuration applies to the remote PE (PE-3), which supports aliasing to PE-1 and PE-2. PE-3 does not have any Ethernet segment configured; it only requires the VPLS-1 configuration and an ecmp setting greater than 1 to perform aliasing.

*A:PE3>config>service>vpls# info 
----------------------------------------------
            bgp
            exit
            bgp-evpn
                cfm-mac-advertisement
                evi 1
                mpls bgp 1
                    ingress-replication-bum-label
                    ecmp 4
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap 1/1/1:1 create
            exit
            spoke-sdp 4:13 create
                no shutdown
            exit
            no shutdown
----------------------------------------------

EVPN single-active multihoming example

To use single-active multihoming on PE-1 and PE-2 instead of all-active multihoming, only the following modifications are needed:

  • change the LAG configuration to single-active

    CE-12 is now configured with two different LAGs; therefore, the key, system-id, and system-priority must be different on PE-1 and PE-2.

  • change the Ethernet-segment configuration to single-active

No changes are needed at service level on any of the three PEs.
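With single-active multihoming, only the Designated Forwarder (DF) elected for the Ethernet segment forwards traffic for a service. With service-carving mode auto, DF election follows the RFC 7432 procedure: the candidate PE addresses are ordered and the service number is taken modulo the number of PEs. The sketch below illustrates that procedure; it is a hedged approximation, not the exact SR OS implementation:

```python
import ipaddress

def elect_df(candidate_pes, evi):
    # RFC 7432 section 8.5: order the originating PE addresses of the
    # Ethernet Segment routes in increasing numeric value; the PE at
    # position (service number mod N) is the DF for that service
    ordered = sorted(candidate_pes, key=ipaddress.ip_address)
    return ordered[evi % len(ordered)]

# PEs of Ethernet segment "ESI-71"; VPLS 1 uses evi 1
df = elect_df(["192.0.2.72", "192.0.2.69"], evi=1)
```

The non-DF PE keeps its LAG SAP operationally standby, and the es-activation-timer delays the new DF's activation after a failover to avoid transient loops.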

The differences between single-active and all-active multihoming can be seen in the following example excerpts:

A:PE1# configure lag 1 
A:PE1>config>lag# info 
----------------------------------------------
        mode access
        encap-type dot1q
        port 1/1/2 
        lacp active administrative-key 1 system-id 00:00:00:00:69:69 
        no shutdown
----------------------------------------------
A:PE1>config>lag# /configure service system bgp-evpn 
A:PE1>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.69:0
                ethernet-segment "ESI-71" create
                    esi 0x01000000007100000001
                    es-activation-timer 10
                    service-carving
                        mode auto
                    exit
                    multi-homing single-active
                    lag 1
                    no shutdown
                exit
----------------------------------------------

A:PE2# configure lag 1 
A:PE2>config>lag# info 
----------------------------------------------
        mode access
        encap-type dot1q
        port 1/1/3 
        lacp active administrative-key 1 system-id 00:00:00:00:72:72 
        no shutdown
----------------------------------------------
A:PE2>config>lag# /configure service system bgp-evpn 
A:PE2>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.72:0
                ethernet-segment "ESI-71" create
                    esi 0x01000000007100000001
                    es-activation-timer 10
                    service-carving
                        mode auto
                    exit
                    multi-homing single-active
                    lag 1
                    no shutdown
                exit
----------------------------------------------

PBB-EVPN configuration examples

PBB-EVPN all-active multihoming example

As in the EVPN all-active multihoming example, this section shows a configuration example for three 7750 SR, 7450 ESS, or 7950 XRS PEs; however, PBB-EVPN is used in this excerpt, as follows:

  • PE-1 and PE-2 are multihomed to CE-12, which uses a LAG to connect to I-VPLS 20001. CE-12 is connected to LAG SAPs configured in an all-active multihoming Ethernet segment.

  • PE-3 is a remote PE that performs aliasing for traffic destined for the CE-12.

  • The three PEs are connected through B-VPLS 20000, a Backbone VPLS where EVPN is enabled.

The following excerpt shows the example configuration for I-VPLS 20001 and B-VPLS 20000 on PE-1 and PE-2, as well as the corresponding Ethernet-segment and LAG commands:


*A:PE1# configure lag 1 
*A:PE1>config>lag# info 
----------------------------------------------
        mode access
        encap-type dot1q
        port 1/1/2 
        lacp active administrative-key 1 system-id 00:00:00:00:69:72 
        no shutdown
----------------------------------------------
*A:PE1>config>lag# /configure service system bgp-evpn 
*A:PE1>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.69:0
                ethernet-segment "ESI-71" create
                    esi 01:00:00:00:00:71:00:00:00:01
                    source-bmac-lsb 71-71 es-bmac-table-size 8
                    es-activation-timer 5
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    lag 1
                    no shutdown
                exit
----------------------------------------------
*A:PE1>config>service>system>bgp-evpn# /configure service vpls 20001
*A:PE1>config>service>vpls# info 
----------------------------------------------
            pbb
                backbone-vpls 20000
                exit
            exit
            stp
                shutdown
            exit
            sap lag-1:71 create
            exit
            no shutdown
----------------------------------------------
*A:PE1>config>service>vpls# /configure service vpls 20000 
*A:PE1>config>service>vpls# info 
----------------------------------------------
            service-mtu 2000
            pbb
                source-bmac 00:00:00:00:00:69
                use-es-bmac
            exit
            bgp-evpn
                evi 20000
                mpls bgp 1
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------

*A:PE2# configure lag 1 
*A:PE2>config>lag# info 
----------------------------------------------
        mode access
        encap-type dot1q
        port 1/1/3 
        lacp active administrative-key 1 system-id 00:00:00:00:69:72 
        no shutdown
----------------------------------------------
*A:PE2>config>lag# /configure service system bgp-evpn 
*A:PE2>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.72:0
                ethernet-segment "ESI-71" create
                    esi 01:00:00:00:00:71:00:00:00:01
                    source-bmac-lsb 71-71 es-bmac-table-size 8
                    es-activation-timer 5
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    lag 1
                    no shutdown
                exit
----------------------------------------------
*A:PE2>config>service>system>bgp-evpn# /configure service vpls 20001 
*A:PE2>config>service>vpls# info 
----------------------------------------------
            pbb
                backbone-vpls 20000
                exit
            exit
            stp
                shutdown
            exit
            sap lag-1:71 create
            exit
            no shutdown
----------------------------------------------
*A:PE2>config>service>vpls# /configure service vpls 20000 
*A:PE2>config>service>vpls# info 
----------------------------------------------
            service-mtu 2000
            pbb
                source-bmac 00:00:00:00:00:72
                use-es-bmac
            exit
            bgp-evpn
                evi 20000
                mpls bgp 1
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
*A:PE2>config>service>vpls#

The combination of the pbb source-bmac and the Ethernet-segment source-bmac-lsb creates the same BMAC for all the packets sourced from both PE-1 and PE-2 for Ethernet segment "ESI-71".
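The resulting es-bmac can be derived as follows: the source-bmac-lsb value replaces the two least-significant bytes of the configured pbb source-bmac, so both PEs end up sourcing the same BMAC for the segment. A simplified sketch of this composition:

```python
def es_bmac(source_bmac: str, lsb: str) -> str:
    # keep the four most-significant bytes of the configured pbb
    # source-bmac and replace the two least-significant bytes with
    # the ethernet-segment source-bmac-lsb value
    return ":".join(source_bmac.split(":")[:4] + lsb.split("-"))

# PE-1 and PE-2 derive the same BMAC for traffic from "ESI-71"
pe1 = es_bmac("00:00:00:00:00:69", "71-71")  # -> 00:00:00:00:71:71
pe2 = es_bmac("00:00:00:00:00:72", "71-71")  # -> 00:00:00:00:71:71
```

Because remote PEs learn CMACs against this shared es-bmac, a failover between PE-1 and PE-2 does not invalidate their learned entries.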

PBB-EVPN single-active multihoming example

In the following configuration example, PE-70 and PE-73 are part of the same single-active multihoming Ethernet segment, ESI-7413. In this case, the CE is connected to PE-70 and PE-73 through spoke-SDPs 4:74 and 34:74, respectively.

In this example, PE-70 and PE-73 use different source BMACs for packets coming from ESI-7413, and not a shared es-bmac as in the PBB-EVPN all-active multihoming example.

*A:PE70# configure service system bgp-evpn 
*A:PE70>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.70:0
                ethernet-segment "ESI-7413" create
                    esi 01:74:13:00:74:13:00:00:74:13
                    es-activation-timer 0
                    service-carving
                        mode auto
                    exit
                    multi-homing single-active
                    sdp 4
                    no shutdown
                exit
----------------------------------------------
*A:PE70>config>service>system>bgp-evpn# /configure service vpls 20001 
*A:PE70>config>service>vpls# info 
----------------------------------------------
            pbb
                backbone-vpls 20000
                exit
            exit
            stp
                shutdown
            exit
            spoke-sdp 4:74 create
                no shutdown
            exit
            no shutdown
----------------------------------------------
*A:PE70>config>service>vpls# /configure service vpls 20000 
*A:PE70>config>service>vpls# info 
----------------------------------------------
            service-mtu 2000
            pbb
                source-bmac 00:00:00:00:00:70
            exit
            bgp-evpn
                evi 20000
                mpls bgp 1
                    ecmp 2
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
*A:PE70>config>service>vpls#


A:PE73>config>service>system>bgp-evpn# info 
----------------------------------------------
                route-distinguisher 192.0.2.73:0
                ethernet-segment "ESI-7413" create
                    esi 01:74:13:00:74:13:00:00:74:13
                    es-activation-timer 0
                    service-carving
                        mode auto
                    exit
                    multi-homing single-active
                    sdp 34
                    no shutdown
                exit
----------------------------------------------
A:PE73>config>service>system>bgp-evpn# /configure service vpls 20001 
A:PE73>config>service>vpls# info 
----------------------------------------------
            pbb
                backbone-vpls 20000
                exit
            exit
            stp
                shutdown
            exit
            spoke-sdp 34:74 create
                no shutdown
            exit
            no shutdown
----------------------------------------------
A:PE73>config>service>vpls# /configure service vpls 20000 
A:PE73>config>service>vpls# info 
----------------------------------------------
            service-mtu 2000
            pbb
                source-bmac 00:00:00:00:00:73
            exit
            bgp-evpn
                evi 20000
                mpls bgp 1
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
----------------------------------------------
A:PE73>config>service>vpls#