EVPN
Overview and EVPN applications
Ethernet Virtual Private Networks (EVPN) is an IETF technology per RFC 7432, BGP MPLS-Based Ethernet VPN, that uses a new BGP address family and allows VPLS services to be operated as IP-VPNs, where the MAC addresses and the information to set up the flooding trees are distributed by BGP.
EVPN is defined to fill the gaps of other L2VPN technologies such as VPLS. The main objective of EVPN is to build E-LAN services in a similar way to RFC 4364 IP-VPNs, while supporting MAC learning within the control plane (distributed by MP-BGP), efficient multi-destination traffic delivery, and active-active multihoming.
EVPN can be used as the control plane for different data plane encapsulations. The Nokia implementation supports the following data planes:
EVPN for VXLAN overlay tunnels (EVPN-VXLAN)
The Data Center Gateway (DGW) function is the main application for this feature. In this application, VXLAN is expected within the Data Center, and VPLS SDP bindings or SAPs are expected for the connectivity to the WAN. R-VPLS and VPRN connectivity to the WAN is also supported.
The EVPN-VXLAN functionality is standardized in RFC 8365.
EVPN for MPLS tunnels (EVPN-MPLS)
PEs are connected by any type of MPLS tunnel in this data plane. EVPN-MPLS is generally used as an evolution of VPLS services in the WAN, and Data Center Interconnect is one of its main applications.
The EVPN-MPLS functionality is standardized in RFC 7432.
EVPN for PBB over MPLS tunnels (PBB-EVPN)
PEs are connected by PBB over MPLS tunnels in this data plane. It is usually used for large scale E-LAN and E-Line services in the WAN.
The PBB-EVPN functionality is standardized in RFC 7623.
The 7750 SR, 7450 ESS, or 7950 XRS EVPN VXLAN implementation is integrated in the Nuage Data Center architecture, where the router serves as the DGW.
For more information about the Nuage Networks architecture and products, see the Nuage Networks Virtualized Service Platform Guide. The following sections describe the applications supported by EVPN in the 7750 SR, 7450 ESS, or 7950 XRS implementation.
EVPN for VXLAN tunnels in a Layer 2 DGW (EVPN-VXLAN)
Layer 2 DC PE with VPLS to the WAN shows the use of EVPN for VXLAN overlay tunnels on the 7750 SR, 7450 ESS, or 7950 XRS when it is used as a Layer 2 DGW.
DC providers require a DGW solution that can extend tenant subnets to the WAN. Customers can deploy the NVO3-based solutions in the DC, where EVPN is the standard control plane and VXLAN is a predominant data plane encapsulation. The Nokia DC architecture uses EVPN and VXLAN as the control and data plane solutions for Layer 2 connectivity within the DC and so does the SR OS.
While EVPN VXLAN is used within the DC, some service providers use VPLS and H-VPLS as the solution to extend Layer 2 VPN connectivity. Layer 2 DC PE with VPLS to the WAN shows the Layer 2 DGW function on the 7750 SR, 7450 ESS, and 7950 XRS routers, providing VXLAN connectivity to the DC and regular VPLS connectivity to the WAN.
The WAN connectivity is based on VPLS where SAPs (null, dot1q, and qinq), spoke SDPs (FEC type 128 and 129), and mesh-SDPs are supported.
The DC GWs can provide multihoming resiliency through the use of BGP multihoming.
EVPN-MPLS can also be used in the WAN. In this case, the Layer 2 DGW function provides translation between EVPN-VXLAN and EVPN-MPLS. EVPN multihoming can be used to provide DGW redundancy.
If point-to-point services are needed in the DC, SR OS supports the use of EVPN-VPWS for VXLAN tunnels, including multihoming, according to RFC 8214.
EVPN for VXLAN tunnels in a Layer 2 DC with integrated routing bridging connectivity on the DGW
Gateway IRB on the DC PE for an L2 EVPN/VXLAN DC shows the use of EVPN for VXLAN overlay tunnels on the 7750 SR, 7450 ESS, or 7950 XRS when the DC provides Layer 2 connectivity and the DGW can route the traffic to the WAN through an R-VPLS and linked VPRN.
In some cases, the DGW must provide a Layer 3 default gateway function to all the hosts in a specified tenant subnet. In this case, the VXLAN data plane is terminated in an R-VPLS on the DGW, and connectivity to the WAN is accomplished through regular VPRN connectivity. The 7750 SR, 7450 ESS, and 7950 XRS support IPv4 and IPv6 interfaces as default gateways in this scenario.
EVPN for VXLAN tunnels in a Layer 3 DC with integrated routing bridging connectivity among VPRNs
Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows the use of EVPN for VXLAN tunnels on the 7750 SR, 7450 ESS, or 7950 XRS when the DC provides distributed Layer 3 connectivity to the DC tenants.
Each tenant has several subnets for which each DC Network Virtualization Edge (NVE) provides intra-subnet forwarding. An NVE may be a Nuage VSG, VSC/VRS, or any other NVE in the market supporting the same constructs, and each subnet normally corresponds to an R-VPLS. For example, in Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC, subnet 10.20.0.0 corresponds to R-VPLS 2001 and subnet 10.10.0.0 corresponds to R-VPLS 2000.
In this example, the NVE provides inter-subnet forwarding too, by connecting all the local subnets to a VPRN instance. When the tenant requires Layer 3 connectivity to the IP-VPN in the WAN, a VPRN is defined in the DGWs, which connects the tenant to the WAN. That VPRN instance is connected to the VPRNs in the NVEs by means of an IRB (Integrated Routing and Bridging) backhaul R-VPLS. This IRB backhaul R-VPLS provides a scalable solution because it allows Layer 3 connectivity to the WAN without the need for defining all of the subnets in the DGW.
The 7750 SR, 7450 ESS, and 7950 XRS DGW support the IRB backhaul R-VPLS model, where the R-VPLS runs EVPN-VXLAN and the VPRN instances exchange IP prefixes (IPv4 and IPv6) through the use of EVPN. Interoperability between the EVPN and IP-VPN for IP prefixes is also fully supported.
EVPN for VXLAN tunnels in a Layer 3 DC with EVPN-tunnel connectivity among VPRNs
EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows the use of EVPN for VXLAN tunnels on the 7750 SR, 7450 ESS, or 7950 XRS, when the DC provides distributed Layer 3 connectivity to the DC tenants and the VPRN instances are connected through EVPN tunnels.
The solution described in section EVPN for VXLAN tunnels in a Layer 3 DC with integrated routing bridging connectivity among VPRNs provides a scalable IRB backhaul R-VPLS service where all the VPRN instances for a specified tenant can be connected by using IRB interfaces. When this IRB backhaul R-VPLS is exclusively used as a backhaul and does not have any SAPs or SDP bindings directly attached, the solution can be optimized by using EVPN tunnels.
EVPN tunnels are enabled using the evpn-tunnel command under the R-VPLS interface configured on the VPRN. EVPN tunnels provide the following benefits to EVPN-VXLAN IRB backhaul R-VPLS services:
easier provisioning of the tenant service
If an EVPN tunnel is configured in an IRB backhaul R-VPLS, there is no need to provision the IRB IPv4 addresses on the VPRN. This makes the provisioning easier to automate and saves IP addresses from the tenant space.
Note: IPv6 interfaces do not require the provisioning of an IPv6 Global Address; a Link Local Address is automatically assigned to the IRB interface.
higher scalability of the IRB backhaul R-VPLS
If EVPN tunnels are enabled, multicast traffic is suppressed in the EVPN-VXLAN IRB backhaul R-VPLS service (it is not required). As a result, the number of VXLAN binds in IRB backhaul R-VPLS services with EVPN-tunnels can be much higher.
This optimization is fully supported by the 7750 SR, 7450 ESS, and 7950 XRS.
EVPN for MPLS tunnels in E-LAN services
EVPN for MPLS in VPLS services shows the use of EVPN for MPLS tunnels on the 7750 SR, 7450 ESS, and 7950 XRS. In this case, EVPN is used as the control plane for E-LAN services in the WAN.
EVPN-MPLS is standardized in RFC 7432 as an L2VPN technology that can fill the gaps in VPLS for E-LAN services. A significant number of service providers offering E-LAN services today are requesting EVPN for their multihoming capabilities, as well as the optimization EVPN provides. EVPN supports all-active multihoming (per-flow load-balancing multihoming) as well as single-active multihoming (per-service load-balancing multihoming).
EVPN is a standards-based technology that supports all-active multihoming. Although VPLS already supports single-active multihoming, EVPN's single-active multihoming is perceived as superior because of its mass-withdrawal capabilities, which speed up convergence in scaled environments.
EVPN technology provides a number of significant benefits, including:
superior multihoming capabilities
an IP-VPN-like operation and control for E-LAN services
reduction and (in some cases) suppression of the BUM (Broadcast, Unknown unicast, and Multicast) traffic in the network
simple provisioning and management
new set of tools to control the distribution of MAC addresses and ARP entries in the network
The SR OS EVPN-MPLS implementation is compliant with RFC 7432.
EVPN-MPLS can also be enabled in R-VPLS services with the same feature-set that is described for VXLAN tunnels in sections EVPN for VXLAN tunnels in a Layer 3 DC with integrated routing bridging connectivity among VPRNs and EVPN for VXLAN tunnels in a Layer 3 DC with EVPN-tunnel connectivity among VPRNs.
EVPN for MPLS tunnels in E-Line services
The MPLS network used by EVPN for E-LAN services can also be shared by E-Line services using EVPN in the control plane. EVPN for E-Line services (EVPN-VPWS) is a simplification of the RFC 7432 procedures, and it is supported in compliance with RFC 8214.
EVPN for MPLS tunnels in E-Tree services
The MPLS network used by E-LAN and E-Line services can also be shared by Ethernet-Tree (E-Tree) services using the EVPN control plane. EVPN E-Tree services use the EVPN control plane extensions described in IETF RFC 8317 and are supported on the 7750 SR, 7450 ESS, and 7950 XRS.
EVPN for PBB over MPLS tunnels (PBB-EVPN)
EVPN for PBB over MPLS shows the use of EVPN for MPLS tunnels on the 7750 SR, 7450 ESS, and 7950 XRS. In this case, EVPN is used as the control plane for E-LAN services in the WAN.
EVPN for PBB over MPLS (hereafter called PBB-EVPN) is specified in RFC 7623. It provides a simplified version of EVPN for cases where the network requires very high scalability and does not need all the advanced features supported by EVPN-MPLS (but still requires single-active and all-active multihoming capabilities).
PBB-EVPN is a combination of 802.1ah PBB and RFC 7432 EVPN and reuses the PBB-VPLS service model, where BGP-EVPN is enabled in the B-VPLS domain. EVPN is used as the control plane in the B-VPLS domain to control the distribution of B-MACs and to set up per-ISID flooding trees for I-VPLS services. The learning of the C-MACs, either on local SAPs/SDP bindings or associated with remote B-MACs, is still performed in the data plane. Only the learning of B-MACs in the B-VPLS is performed through BGP.
The SR OS PBB-EVPN implementation supports PBB-EVPN for I-VPLS and PBB-Epipe services, including single-active and all-active multihoming.
EVPN for VXLAN tunnels and cloud technologies
This section provides information about EVPN for VXLAN tunnels and cloud technologies.
VXLAN
The SR OS, SR Linux and Nuage solution for DC supports VXLAN (Virtual eXtensible Local Area Network) overlay tunnels as per RFC 7348.
VXLAN addresses the data plane needs for overlay networks within virtualized data centers accommodating multiple tenants. The main attributes of the VXLAN encapsulation are:
- VXLAN is an overlay network encapsulation used to carry MAC traffic between VMs over a logical Layer 3 tunnel.
- Avoids the Layer 2 MAC explosion, because VM MACs are only learned at the edge of the network. Core nodes simply route the traffic based on the destination IP (which is the system IP address of the remote PE or VTEP, the VXLAN Tunnel End Point).
- Supports multi-path scalability through ECMP (to a remote VTEP address, based on source UDP port entropy) while preserving the Layer 2 connectivity between VMs. xSTP is no longer needed in the network.
- Supports multiple tenants, each with their own isolated Layer 2 domain. The tenant identifier is encoded in the VNI field (VXLAN Network Identifier) and allows up to 16M values, as opposed to the 4k values provided by the 802.1q VLAN space.
VXLAN frame format shows an example of the VXLAN encapsulation supported by the Nokia implementation.
As shown in VXLAN frame format, VXLAN encapsulates the inner Ethernet frames into VXLAN + UDP/IP packets. The main pieces of information encoded in this encapsulation are:
- VXLAN header (8 bytes)
  - Flags (8 bits), where the I flag is set to 1 to indicate that the VNI is present and valid. The rest of the flags (“Reserved” bits) are set to 0.
  - The VNI field (24-bit value), or VXLAN network identifier, identifies an isolated Layer 2 domain within the DC network.
  - The rest of the fields are reserved for future use.
- UDP header (8 bytes)
  - The destination port is a well-known UDP port assigned by IANA (4789).
  - The source port is derived from a hash of the inner source and destination MAC/IP addresses that the 7750 SR, 7450 ESS, or 7950 XRS computes at ingress. This creates an “entropy” value that can be used by the core DC nodes for load balancing on ECMP paths.
  - The checksum is set to zero.
- Outer IP and Ethernet headers (34 or 38 bytes)
  - The source IP and source MAC identify the source VTEP. That is, these fields are populated with the PE’s system IP and chassis MAC address.
    Note: The source MAC address is changed on all the IP hops along the path, as is usual in regular IP routing.
  - The destination IP identifies the remote VTEP (remote system IP) and is the result of the destination MAC lookup in the service Forwarding Database (FDB).
    Note: All remote MACs are learned by the EVPN BGP and associated with a remote VTEP address and VNI.
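As a rough illustration of the header layout described above, the following Python sketch packs and parses the 8-byte VXLAN header. The function names are hypothetical and are not part of SR OS; only the field sizes, the I flag, and UDP port 4789 come from the description above and RFC 7348.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789  # IANA-assigned destination UDP port
I_FLAG = 0x08          # "VNI present and valid" bit within the 8-bit flags field


def pack_vxlan_header(vni: int, i_flag: bool = True) -> bytes:
    """Build the 8-byte header: flags (8) + reserved (24) + VNI (24) + reserved (8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    flags = I_FLAG if i_flag else 0
    # Word 0: flags in the top octet, reserved bits left at 0.
    # Word 1: VNI in the top 24 bits, reserved octet left at 0.
    return struct.pack("!II", flags << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> Tuple[bool, int]:
    """Return (i_flag_set, vni) from an 8-byte VXLAN header."""
    if len(header) != 8:
        raise ValueError("a VXLAN header is exactly 8 bytes")
    word0, word1 = struct.unpack("!II", header)
    return bool((word0 >> 24) & I_FLAG), word1 >> 8


header = pack_vxlan_header(2001)
print(header.hex())                 # 080000000007d100
print(unpack_vxlan_header(header))  # (True, 2001)
```

The 24-bit VNI is what yields the 16M (2^24) tenant identifiers mentioned earlier, compared with the 4k (2^12) values of the 802.1q VLAN space.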
Some considerations related to the support of VXLAN on the 7750 SR, 7450 ESS, and 7950 XRS are:
- VXLAN is only supported on network or hybrid ports with null or dot1q encapsulation.
- VXLAN is supported on Ethernet/LAG and POS/APS.
- IPv4 and IPv6 unicast addresses are supported as VTEPs.
- By default, system IP addresses are supported, as VTEPs, for originating and terminating VXLAN tunnels. Non-system IPv4 and IPv6 addresses are supported by using a Forwarding Path Extension (FPE).
VXLAN ECMP and LAG
The DGW supports ECMP load balancing to reach the destination VTEP. Also, any intermediate core node in the Data Center should be able to provide further load balancing across ECMP paths because the source UDP port of each tunneled packet is derived from a hash of the customer inner packet. The following must be considered:
ECMP for VXLAN is supported on VPLS services, but not for BUM traffic. Unicast spraying is based on the packet contents.
ECMP for VXLAN on R-VPLS services is supported for VXLAN IPv6 tunnels.
ECMP for VXLAN IPv4 tunnels on R-VPLS is only supported if the command configure service vpls allow-ip-int-bind vxlan-ipv4-tep-ecmp is enabled on the R-VPLS (as well as config>router>ecmp).
ECMP for Layer 3 multicast traffic on R-VPLS services with EVPN-VXLAN destinations is only supported if the vpls allow-ip-int-bind ip-multicast-ecmp command is enabled (as well as config>router>ecmp).
In the cases where ECMP is not supported (BUM traffic in VPLS and ECMP on R-VPLS if not enabled), each VXLAN binding is tied to a single (different) ECMP path, so that in a normal deployment with a reasonable number of remote VTEPs, there should be a fair distribution of the traffic across the paths. In other words, only per-VTEP load-balancing is supported, instead of per-flow load-balancing.
LAG spraying based on the packet hash is supported in all the cases (VPLS unicast, VPLS BUM, and R-VPLS).
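The difference between per-flow and per-VTEP load balancing can be sketched as follows. This is an illustrative model only: the hash function used by the router is not described here, so CRC32 stands in for it, and the source port range (49152 to 65535) and all function names are assumptions.

```python
import zlib
from typing import Sequence


def entropy_source_port(src_mac: str, dst_mac: str, src_ip: str, dst_ip: str) -> int:
    """Derive a UDP source port from inner-frame fields (stand-in hash: CRC32)."""
    key = "|".join((src_mac, dst_mac, src_ip, dst_ip)).encode()
    return 49152 + zlib.crc32(key) % 16384  # assumed dynamic port range


def per_flow_path(source_port: int, paths: Sequence[str]) -> str:
    """Per-flow balancing: the selected path depends on the flow's entropy value."""
    return paths[source_port % len(paths)]


def per_vtep_path(vtep_ip: str, paths: Sequence[str]) -> str:
    """Per-VTEP balancing: every flow to the same VTEP uses the same path."""
    return paths[zlib.crc32(vtep_ip.encode()) % len(paths)]


paths = ["path-A", "path-B"]
flows = [
    ("00:00:5e:00:53:01", "00:00:5e:00:53:02", "192.0.2.1", "192.0.2.2"),
    ("00:00:5e:00:53:03", "00:00:5e:00:53:04", "192.0.2.3", "192.0.2.4"),
]
for flow in flows:
    port = entropy_source_port(*flow)
    print(port, per_flow_path(port, paths), per_vtep_path("10.20.1.4", paths))
```

In the per-VTEP case the path never changes for a specific remote VTEP, so the traffic distribution depends on how many remote VTEPs exist rather than on how many flows are active.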
VXLAN VPLS tag handling
The following describes the behavior on the 7750 SR, 7450 ESS, and 7950 XRS with respect to VLAN tag handling for VXLAN VPLS services:
Dot1q, QinQ, and null SAPs, as well as regular VLAN handling procedures at the WAN side, are supported on VXLAN VPLS services.
No ‟vc-type vlan” like VXLAN VNI bindings are supported. Therefore, at the egress of the VXLAN network port, the router does not add any inner VLAN tag on top of the VXLAN encapsulation, and at the ingress network port, the router ignores any VLAN tag received and considers it as part of the payload.
VXLAN MTU considerations
For VXLAN VPLS services, the network port MTU must be at least 50 Bytes (54 Bytes if dot1q) greater than the Service-MTU to allow enough room for the VXLAN encapsulation.
The Service-MTU is only enforced on SAPs, (any SAP ingress packet with MTU greater than the service-mtu is discarded) and not on VXLAN termination (any VXLAN ingress packet makes it to the egress SAP regardless of the configured service-mtu).
If BGP-EVPN is enabled in a VXLAN VPLS service, the Service-MTU can be advertised in the Inclusive Multicast Ethernet Tag routes to enforce that all the routers attached to the same EVPN service have the same Service-MTU configured.
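The MTU rule above reduces to simple arithmetic, sketched below in Python. The constant and function names are hypothetical, the 1514-byte Service-MTU is only an example value, and the overhead breakdown assumes an outer IPv4 header, consistent with the 50-byte and 54-byte figures quoted above.

```python
# Outer Ethernet (14) + outer IPv4 (20) + UDP (8) + VXLAN (8) = 50 bytes
VXLAN_OVERHEAD_BYTES = 14 + 20 + 8 + 8
DOT1Q_TAG_BYTES = 4


def min_network_port_mtu(service_mtu: int, dot1q: bool = False) -> int:
    """Smallest network port MTU that leaves room for the VXLAN encapsulation."""
    return service_mtu + VXLAN_OVERHEAD_BYTES + (DOT1Q_TAG_BYTES if dot1q else 0)


def port_mtu_is_sufficient(port_mtu: int, service_mtu: int, dot1q: bool = False) -> bool:
    """Check a network port MTU against the minimum for a given Service-MTU."""
    return port_mtu >= min_network_port_mtu(service_mtu, dot1q)


print(min_network_port_mtu(1514))                      # 1564
print(min_network_port_mtu(1514, dot1q=True))          # 1568
print(port_mtu_is_sufficient(1564, 1514, dot1q=True))  # False
```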
VXLAN QoS
VXLAN is a network port encapsulation; therefore, the QoS settings for VXLAN are controlled from the network QoS policies.
Ingress
The network ingress QoS policy can be applied either to the network interface over which the VXLAN traffic arrives or under vxlan/network/ingress within the EVPN service.
Regardless of where the network QoS policy is applied, the ingress network QoS policy is used to classify the VXLAN packets based on the outer dot1p (if present), then the outer DSCP, to yield an FC/profile.
If the ingress network QoS policy is applied to the network interface over which the VXLAN traffic arrives, then the VXLAN unicast traffic uses the network ingress queues configured on the FP where the network interface resides. QoS control of BUM traffic received on the VXLAN tunnels is possible by separately redirecting these traffic types to policers within an FP ingress network queue group. This QoS control uses the per forwarding class fp-redirect-group parameter together with broadcast-policer, unknown-policer, and mcast-policer within the ingress section of a network QoS policy. This QoS control applies to all BUM traffic received for that forwarding class on the network IP interface on which the network QoS policy is applied.
The ingress network QoS policy can also be applied within the EVPN service by referencing an FP queue group instance, as follows:
configure
    service
        vpls <service-id>
            vxlan vni <vni-id>
                network
                    ingress
                        qos <network-policy-id>
                            fp-redirect-group <queue-group-name>
                                instance <instance-id>
In this case, the redirection to a specific ingress FP queue group applies as a single entity (per forwarding class) to all VXLAN traffic received only by this service. This overrides the QoS applied to the related network interfaces for traffic arriving on VXLAN tunnels in that service but does not affect traffic received on a spoke SDP in the same service. It is also possible to redirect unicast traffic to a policer using the per forwarding class fp-redirect-group policer parameter, as well as the BUM traffic as described above, within the ingress section of a network QoS policy. The ler-use-dscp, ip-criteria, and ipv6-criteria statements are ignored if configured in the ingress section of the referenced network QoS policy. If the instance of the named queue group template referenced in the qos command is not configured on an FP receiving the VXLAN traffic, then the traffic uses the ingress network queues or queue group related to the network interface.
Egress
On egress, there is no need to specify ‟remarking” in the policy to mark the DSCP. This is because the VXLAN adds a new IPv4 header, and the DSCP is always marked based on the egress network qos policy.
VXLAN ping
A new VXLAN troubleshooting tool, VXLAN Ping, is available to verify VXLAN VTEP connectivity. The VXLAN Ping command is available from interactive CLI and SNMP.
This tool allows the operator to specify a wide range of variables to influence how the packet is forwarded from the VTEP source to VTEP termination. The ping function requires the operator to specify a different test-id (equates to originator handle) for each active and outstanding test. The required local service identifier from which the test is launched determines the source IP (the system IP address) to use in the outer IP header of the packet. This IP address is encoded into the VXLAN header Source IP TLV. The service identifier also encodes the local VNI. The outer-ip-destination must equal the VTEP termination point on the remote node, and the dest-vni must be a valid VNI within the associated service on the remote node. The outer source IP address is automatically detected and inserted in the IP header of the packet. The outer source IP address uses the IPv4 system address by default.
If the VTEP is created using a non-system source IP address through the vxlan-src-vtep command, the outer source IP address uses the address specified by vxlan-src-vtep. The remainder of the variables are optional.
The VXLAN PDU is encapsulated in the appropriate transport header and forwarded within the overlay to the appropriate VTEP termination. The VXLAN router alert (RA) bit is set to prevent the OAM PDU from being forwarded beyond the terminating VTEP. Because handling of the router alert bit was not defined in some early releases of VXLAN implementations, the VNI Informational bit (I-bit) is set to ‟0” for OAM packets. This indicates that the VNI is invalid, and the packet should not be forwarded. This safeguard can be overridden by including the i-flag-on option, which sets the bit to ‟1” (valid VNI). Ensure that OAM frames meant to be contained to the VTEP are not forwarded beyond its endpoints.
The supporting VXLAN OAM ping draft includes a requirement to encode a reserved IEEE MAC address as the inner destination value. However, at the time of implementation, that IEEE MAC address had not been assigned. The inner IEEE MAC address defaults to 00:00:00:00:00:00, but may be changed using the inner-l2 option. Inner IEEE MAC addresses that are included with OAM packets are not learned in the local Layer 2 forwarding databases.
The echo responder terminates the VXLAN OAM frame, takes the appropriate response action, and includes relevant return codes. By default, the response is sent back using the IP network as an IPv4 UDP response. The operator can choose to override this default by changing the reply-mode to overlay. The overlay return mode forces the responder to use the VTEP connection representing the source IP and source VTEP. If a return overlay is not available, the echo response is dropped by the responder.
Support is included for:
IPv4 VTEP
Optional specification of the outer UDP source port, which helps downstream network elements along the path with ECMP to hash the flow to the same path
Optional configuration of the inner IP information, which helps the operator test different equal paths where ECMP is deployed on the source. A test only validates a single path where ECMP functions are deployed. The inner IP information is processed by a hash function, and there is no guarantee that changing the IP information between tests selects different paths.
Optional end system validation for a single L2 IEEE MAC address per test. This function checks the remote FDB for the configured IEEE MAC Address. Only one end system IEEE MAC Address can be configured per test.
Reply mode UDP (default) or Overlay
Optional additional padding can be added to each packet. There is an option that indicates how the responder should handle the pad TLV. By default, the padding is not reflected to the source. The operator can change this behavior by including the reflect-pad option. The reflect-pad option is not supported when the reply mode is set to UDP.
Configurable send counts, intervals, timeouts, and forwarding class
The VXLAN OAM PDU includes two timestamps. These timestamps are used to report forward direction delay. Unidirectional delay metrics require accurate time of day clock synchronization. Negative unidirectional delay values are reported as ‟0.000”. The round trip value includes the entire round trip time including the time that the remote peer takes to process that packet. These reported values may not be representative of network delay.
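The effect of clock offset on the reported delays can be sketched as follows. This is a simplified model with hypothetical function and variable names; it only illustrates the clamping of negative forward delays to zero and the fact that the round-trip value is measured on the sender's clock alone.

```python
def reported_forward_delay_ms(tx_ms: float, rx_ms: float) -> float:
    """Forward delay from sender and responder timestamps; negatives are clamped to 0."""
    return max(0.0, rx_ms - tx_ms)


def round_trip_delay_ms(tx_ms: float, reply_rx_ms: float) -> float:
    """Round trip measured on the sender's clock only (includes responder processing)."""
    return reply_rx_ms - tx_ms


# The sender transmits at 1000.0 ms on its own clock.
tx = 1000.0

# Hypothetical case: the true one-way delay is 0.8 ms, but the responder clock
# runs 1.0 ms behind the sender clock, so the responder timestamp is 999.8 ms.
print(f"{reported_forward_delay_ms(tx, 999.8):.3f}")   # 0.000
print(f"{reported_forward_delay_ms(tx, 1001.5):.3f}")  # 1.500
print(f"{round_trip_delay_ms(tx, 1001.75):.3f}")       # 1.750
```

A responder clock that runs ahead of the sender clock has the opposite effect and inflates the forward delay, which is one way a reported forward delay can exceed the round-trip value for the same packet.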
The following example commands and outputs show how the VXLAN Ping function can be used to validate connectivity. The echo output includes a new header to better describe the VXLAN ping packet headers and the various levels.
oam vxlan-ping test-id 1 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 interval 0.1 send-count 10
TestID 1, Service 1, DestVNI 2, ReplyMode UDP, IFlag Off, PadSize 0, ReflectPad No,
SendCount 10, Interval 0.1, Timeout 5
Outer: SourceIP 10.20.1.3, SourcePort Dynamic, DestIP 10.20.1.4, TTL 10, FC be, Profile In
Inner: DestMAC 00:00:00:00:00:00, SourceIP 10.20.1.3, DestIP 127.0.0.1
! ! ! ! ! ! ! ! ! !
---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
10 packets transmitted, 10 packets received, 0.00% packet loss
10 non-errored responses(!), 0 out-of-order(*), 0 malformed echo responses(.)
0 send errors(.), 0 time outs(.)
0 overlay segment not found, 0 overlay segment not operational
forward-delay min = 1.097ms, avg = 2.195ms, max = 2.870ms, stddev = 0.735ms
round-trip-delay min = 1.468ms, avg = 1.693ms, max = 2.268ms, stddev = 0.210ms
oam vxlan-ping test-id 2 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 outer-ip-source-udp 65000 outer-ip-ttl 64 inner-l2 d0:0d:1e:00:00:01 inner-ip-source 192.168.1.2 inner-ip-destination 127.0.0.8 reply-mode overlay send-count 20 interval 1 timeout 3 padding 1000 reflect-pad fc nc profile out
TestID 2, Service 1, DestVNI 2, ReplyMode overlay, IFlag Off, PadSize 1000, ReflectPad Yes,
SendCount 20, Interval 1, Timeout 3
Outer: SourceIP 10.20.1.3, SourcePort 65000, DestIP 10.20.1.4, TTL 64, FC nc, Profile out
Inner: DestMAC d0:0d:1e:00:00:01, SourceIP 192.168.1.2, DestIP 127.0.0.8
===================================================================================
rc=1 Malformed Echo Request Received, rc=2 Overlay Segment Not Present, rc=3 Overlay
Segment Not Operational, rc=4 Ok
===================================================================================
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=1 ttl=255 rtt-time=1.733ms fwd-time=0.302ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=2 ttl=255 rtt-time=1.549ms fwd-time=1.386ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=3 ttl=255 rtt-time=3.243ms fwd-time=0.643ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=4 ttl=255 rtt-time=1.551ms fwd-time=2.350ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=5 ttl=255 rtt-time=1.644ms fwd-time=1.080ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=6 ttl=255 rtt-time=1.670ms fwd-time=1.307ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=7 ttl=255 rtt-time=1.636ms fwd-time=0.490ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=8 ttl=255 rtt-time=1.649ms fwd-time=0.005ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=9 ttl=255 rtt-time=1.401ms fwd-time=0.685ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=10 ttl=255 rtt-time=1.634ms fwd-time=0.373ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=11 ttl=255 rtt-time=1.559ms fwd-time=0.679ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=12 ttl=255 rtt-time=1.666ms fwd-time=0.880ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=13 ttl=255 rtt-time=1.629ms fwd-time=0.669ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=14 ttl=255 rtt-time=1.280ms fwd-time=1.029ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=15 ttl=255 rtt-time=1.458ms fwd-time=0.268ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=16 ttl=255 rtt-time=1.659ms fwd-time=0.786ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=17 ttl=255 rtt-time=1.636ms fwd-time=1.071ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=18 ttl=255 rtt-time=1.568ms fwd-time=2.129ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=19 ttl=255 rtt-time=1.657ms fwd-time=1.326ms. rc=4
1132 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=20 ttl=255 rtt-time=1.762ms fwd-time=1.335ms. rc=4
---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
20 packets transmitted, 20 packets received, 0.00% packet loss
20 valid responses, 0 out-of-order, 0 malformed echo responses
0 send errors, 0 time outs
0 overlay segment not found, 0 overlay segment not operational
forward-delay min = 0.005ms, avg = 0.939ms, max = 2.350ms, stddev = 0.577ms
round-trip-delay min = 1.280ms, avg = 1.679ms, max = 3.243ms, stddev = 0.375ms
oam vxlan-ping test-id 1 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 send-count 10 end-system 00:00:00:00:00:01 interval 0.1
TestID 1, Service 1, DestVNI 2, ReplyMode UDP, IFlag Off, PadSize 0, ReflectPad No,
EndSystemMAC 00:00:00:00:00:01, SendCount 10, Interval 0.1, Timeout 5
Outer: SourceIP 10.20.1.3, SourcePort Dynamic, DestIP 10.20.1.4, TTL 10, FC be, Profile In
Inner: DestMAC 00:00:00:00:00:00, SourceIP 10.20.1.3, DestIP 127.0.0.1
2 2 2 2 2 2 2 2 2 2
---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
10 packets transmitted, 10 packets received, 0.00% packet loss
10 non-errored responses(!), 0 out-of-order(*), 0 malformed echo responses(.)
0 send errors(.), 0 time outs(.)
0 overlay segment not found, 0 overlay segment not operational
0 end-system present(1), 10 end-system not present(2)
forward-delay min = 0.467ms, avg = 0.979ms, max = 1.622ms, stddev = 0.504ms
round-trip-delay min = 1.501ms, avg = 1.597ms, max = 1.781ms, stddev = 0.088ms
oam vxlan-ping test-id 1 service 1 dest-vni 2 outer-ip-destination 10.20.1.4 send-count 10 end-system 00:00:00:00:00:01
TestID 1, Service 1, DestVNI 2, ReplyMode UDP, IFlag Off, PadSize 0, ReflectPad No,
EndSystemMAC 00:00:00:00:00:01, SendCount 10, Interval 1, Timeout 5
Outer: SourceIP 10.20.1.3, SourcePort Dynamic, DestIP 10.20.1.4, TTL 10, FC be, Profile In
Inner: DestMAC 00:00:00:00:00:00, SourceIP 10.20.1.3, DestIP 127.0.0.1
===================================================================================
rc=1 Malformed Echo Request Received, rc=2 Overlay Segment Not Present, rc=3 Overlay
Segment Not Operational, rc=4 Ok
mac=1 End System Present, mac=2 End System Not Present
===================================================================================
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=1 ttl=255 rtt-time=2.883ms fwd-time=4.196ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=2 ttl=255 rtt-time=1.596ms fwd-time=1.536ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=3 ttl=255 rtt-time=1.698ms fwd-time=0.000ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=4 ttl=255 rtt-time=1.687ms fwd-time=1.766ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=5 ttl=255 rtt-time=1.679ms fwd-time=0.799ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=6 ttl=255 rtt-time=1.678ms fwd-time=0.000ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=7 ttl=255 rtt-time=1.709ms fwd-time=0.031ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=8 ttl=255 rtt-time=1.757ms fwd-time=1.441ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=9 ttl=255 rtt-time=1.613ms fwd-time=2.570ms. rc=4 mac=2
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=10 ttl=255 rtt-time=1.631ms fwd-time=2.130ms. rc=4 mac=2
---- vxlan-id 2 ip-address 10.20.1.4 PING Statistics ----
10 packets transmitted, 10 packets received, 0.00% packet loss
10 valid responses, 0 out-of-order, 0 malformed echo responses
0 send errors, 0 time outs
0 overlay segment not found, 0 overlay segment not operational
0 end-system present, 10 end-system not present
forward-delay min = 0.000ms, avg = 1.396ms, max = 4.196ms, stddev = 1.328ms
round-trip-delay min = 1.596ms, avg = 1.793ms, max = 2.883ms, stddev = 0.366ms
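The round-trip summary line can be cross-checked against the individual replies. The following Python sketch is illustrative only: the regular expression and the function name `rtt_summary` are assumptions, not part of any Nokia tooling, and the forward-time part of each reply is omitted. With the sample values, a population standard deviation appears to reproduce the reported figure.

```python
import re
import statistics

# Reply lines copied from the sample output (forward-time part omitted).
REPLIES = """
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=1 ttl=255 rtt-time=2.883ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=2 ttl=255 rtt-time=1.596ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=3 ttl=255 rtt-time=1.698ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=4 ttl=255 rtt-time=1.687ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=5 ttl=255 rtt-time=1.679ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=6 ttl=255 rtt-time=1.678ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=7 ttl=255 rtt-time=1.709ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=8 ttl=255 rtt-time=1.757ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=9 ttl=255 rtt-time=1.613ms
92 bytes from vxlan-id 2 10.20.1.4: vxlan_seq=10 ttl=255 rtt-time=1.631ms
"""

# Hypothetical pattern; it only captures the rtt-time value of each reply.
RTT_RE = re.compile(r"vxlan_seq=\d+ ttl=\d+ rtt-time=([\d.]+)ms")


def rtt_summary(text):
    """Return (min, avg, max, population stddev) of the round-trip times in ms."""
    rtts = [float(m.group(1)) for m in RTT_RE.finditer(text)]
    return min(rtts), statistics.fmean(rtts), max(rtts), statistics.pstdev(rtts)


lo, avg, hi, sd = rtt_summary(REPLIES)
print(f"min = {lo:.3f}ms, avg = {avg:.3f}ms, max = {hi:.3f}ms, stddev = {sd:.3f}ms")
```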
EVPN-VXLAN routed VPLS multicast routing support
IPv4 and IPv6 multicast routing is supported in an EVPN-VXLAN VPRN and IES routed VPLS service through its IP interface when the source of the multicast stream is on one side of its IP interface and the receivers are on either side of the IP interface. For example, the source for multicast stream G1 could be on the IP side, sending to receivers on both other regular IP interfaces and the VPLS of the routed VPLS service, while the source for group G2 could be on the VPLS side sending to receivers on both the VPLS and IP side of the routed VPLS service. See IPv4 and IPv6 multicast routing support for more details.
IGMP and MLD snooping on VXLAN
The delivery of IP multicast in VXLAN services can be optimized with IGMP and MLD snooping. IGMP and MLD snooping are supported in EVPN-VXLAN VPLS services and in EVPN-VXLAN VPRN/IES R-VPLS services. When enabled, IGMP and MLD reports are snooped on SAPs or SDP bindings, but also on VXLAN bindings, to create or modify entries in the MFIB for the VPLS service.
When configuring IGMP and MLD snooping in EVPN-VXLAN VPLS services, consider the following:
To enable IGMP snooping in the VPLS service on VXLAN, use the configure service vpls igmp-snooping no shutdown command.
To enable MLD snooping in the VPLS service on VXLAN, use the configure service vpls mld-snooping no shutdown command.
The VXLAN bindings only support basic IGMP/MLD snooping functionality. Features configurable under SAPs or SDP bindings are not available for VXLAN (VXLAN bindings are configured with the default values used for SAPs and SDP bindings). By default, a specified VXLAN binding only becomes a dynamic Mrouter when it receives IGMP or MLD queries and adds a specified multicast group to the MFIB when it receives an IGMP or MLD report for that group.
Alternatively, it is possible to configure all VXLAN bindings for a particular VXLAN instance to be Mrouter ports using the configure service vpls vxlan igmp-snooping mrouter-port and configure service vpls vxlan mld-snooping mrouter-port commands.
The show service id igmp-snooping, clear service id igmp-snooping, show service id mld-snooping, and clear service id mld-snooping commands are also available for VXLAN bindings.
Note: MLD snooping uses MAC-based forwarding. See MAC-based IPv6 multicast forwarding for more details.
The following CLI commands show how the system displays IGMP snooping information and statistics on VXLAN bindings (the equivalent MLD output is similar).
*A:PE1# show service id 1 igmp-snooping port-db vxlan vtep 192.0.2.72 vni 1 detail
===============================================================================
IGMP Snooping VXLAN 192.0.2.72/1 Port-DB for service 1
===============================================================================
-------------------------------------------------------------------------------
IGMP Group 239.0.0.1
-------------------------------------------------------------------------------
Mode : exclude Type : dynamic
Up Time : 0d 19:07:05 Expires : 137s
Compat Mode : IGMP Version 3
V1 Host Expires : 0s V2 Host Expires : 0s
-------------------------------------------------------
Source Address Up Time Expires Type Fwd/Blk
-------------------------------------------------------
No sources.
-------------------------------------------------------------------------------
IGMP Group 239.0.0.2
-------------------------------------------------------------------------------
Mode : include Type : dynamic
Up Time : 0d 19:06:39 Expires : 0s
Compat Mode : IGMP Version 3
V1 Host Expires : 0s V2 Host Expires : 0s
-------------------------------------------------------
Source Address Up Time Expires Type Fwd/Blk
-------------------------------------------------------
10.0.0.232 0d 19:06:39 137s dynamic Fwd
-------------------------------------------------------------------------------
Number of groups: 2
===============================================================================
*A:PE1# show service id 1 igmp-snooping statistics vxlan vtep 192.0.2.72 vni 1
===============================================================================
IGMP Snooping Statistics for VXLAN 192.0.2.72/1 (service 1)
===============================================================================
Message Type Received Transmitted Forwarded
-------------------------------------------------------------------------------
General Queries 0 0 556
Group Queries 0 0 0
Group-Source Queries 0 0 0
V1 Reports 0 0 0
V2 Reports 0 0 0
V3 Reports 553 0 0
V2 Leaves 0 0 0
Unknown Type 0 N/A 0
-------------------------------------------------------------------------------
Drop Statistics
-------------------------------------------------------------------------------
Bad Length : 0
Bad IP Checksum : 0
Bad IGMP Checksum : 0
Bad Encoding : 0
No Router Alert : 0
Zero Source IP : 0
Wrong Version : 0
Lcl-Scope Packets : 0
Rsvd-Scope Packets : 0
Send Query Cfg Drops : 0
Import Policy Drops : 0
Exceeded Max Num Groups : 0
Exceeded Max Num Sources : 0
Exceeded Max Num Grp Srcs: 0
MCAC Policy Drops : 0
===============================================================================
*A:PE1# show service id 1 mfib
===============================================================================
Multicast FIB, Service 1
===============================================================================
Source Address Group Address SAP or SDP Id Svc Id Fwd/Blk
-------------------------------------------------------------------------------
* * sap:1/1/1:1 Local Fwd
* 239.0.0.1 sap:1/1/1:1 Local Fwd
vxlan:192.0.2.72/1 Local Fwd
10.0.0.232 239.0.0.2 sap:1/1/1:1 Local Fwd
vxlan:192.0.2.72/1 Local Fwd
-------------------------------------------------------------------------------
Number of entries: 3
===============================================================================
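The default snooping behavior on VXLAN bindings (a binding becomes a dynamic Mrouter when it receives a query, and joins a group in the MFIB when it sends a report) can be pictured with a small model. This is a simplified sketch and not the router implementation: the class `SnoopingVpls` and its methods are hypothetical, timers and source-specific entries are ignored, and forwarding group traffic to Mrouter ports as well as member ports is assumed from general IGMP snooping practice.

```python
from collections import defaultdict


class SnoopingVpls:
    """Toy model of IGMP/MLD snooping state for one VPLS service."""

    def __init__(self, static_mrouter_ports=()):
        # Ports configured with mrouter-port start out as Mrouter ports.
        self.mrouter_ports = set(static_mrouter_ports)
        self.mfib = defaultdict(set)  # group address -> member ports

    def on_query(self, port):
        """A port that receives a query becomes a dynamic Mrouter port."""
        self.mrouter_ports.add(port)

    def on_report(self, port, group):
        """A report for a group adds the port to that group's MFIB entry."""
        self.mfib[group].add(port)

    def egress_ports(self, group, ingress_port):
        """Ports that receive traffic for the group (assumed: members plus Mrouters)."""
        ports = self.mfib.get(group, set()) | self.mrouter_ports
        return sorted(ports - {ingress_port})


vpls = SnoopingVpls()
vpls.on_query("sap:1/1/1:1")                       # querier sits behind the SAP
vpls.on_report("vxlan:192.0.2.72/1", "239.0.0.1")  # receiver behind the VTEP
print(vpls.egress_ports("239.0.0.1", ingress_port="sap:1/1/1:1"))
```

The port names mirror the MFIB sample output, where the SAP and the VXLAN binding both appear under group 239.0.0.1.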
PIM snooping on VXLAN
PIM snooping for IPv4 and IPv6 is supported in an EVPN-VXLAN VPLS or R-VPLS service (with the R-VPLS attached to a VPRN or IES service). The snooping operation is similar to that within a VPLS service (see PIM snooping for VPLS) and supports both PIM snooping and PIM proxy modes.
PIM snooping for IPv4 is enabled using the configure service vpls pim-snooping command.
PIM snooping for IPv6 is enabled using the configure service vpls pim-snooping no ipv6-multicast-disable command.
When using PIM snooping for IPv6, the default forwarding is MAC-based with optional support for SG-based (see IPv6 multicast forwarding). SG-based forwarding requires FP3- or higher-based hardware.
It is not possible to configure max-num-groups for VXLAN bindings.
Static VXLAN termination in Epipe services
By default, the system IP address is used to terminate and generate VXLAN traffic. The following configuration example shows an Epipe service that supports static VXLAN termination:
config service epipe 1 name "epipe1" customer 1 create
sap 1/1/1:1 create
exit
vxlan vni 100 create
egr-vtep 192.0.2.1
oper-group op-grp-1
exit
no shutdown
Where:
vxlan vni vni create specifies the ingress VNI the router uses to identify packets for the service. The following considerations apply:
In services that use EVPN, the configured VNI is only used as the ingress VNI to identify packets that belong to the service. Egress VNIs are learned from the BGP EVPN. In the case of Static VXLAN, the configured VNI is also used as egress VNI (because there is no BGP EVPN control plane).
The configured VNI is unique in the system, and as a result, it can only be configured in one service (VPLS or Epipe).
egr-vtep ip-address specifies the remote VTEP the router uses when encapsulating frames into VXLAN packets. The following considerations apply:
When the PE receives VXLAN packets, the source VTEP is not checked against the configured egress VTEP.
The ip-address must be present in the global routing table so that the VXLAN destination is operationally up.
The oper-group may be added under egr-vtep. The expected behavior for the operational group and service status is as follows:
If the egr-vtep entry is not present in the routing table, the VXLAN destination (in the show service id vxlan command) and the provisioned operational group under egr-vtep enter the operationally down state.
If the Epipe SAP goes down, the service goes down. The service is not affected if the VXLAN destination goes down.
If the service is admin shutdown, then in addition to the SAP, the VXLAN destination and the oper-group also enter the operationally down state.
Note: The operational group configured under egr-vtep cannot be monitored on the SAP of the Epipe where it is configured.
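The expected status behavior can be summarized as a small truth function. The sketch below only restates the three rules listed for the operational group and service status; the names `EpipeState` and `epipe_oper_state` are hypothetical, and any other condition that can affect the operational state is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpipeState:
    service_up: bool
    sap_up: bool
    vxlan_dest_up: bool
    oper_group_up: bool


def epipe_oper_state(admin_up, sap_link_up, egr_vtep_in_route_table):
    """Derive the operational states of an Epipe with a static VXLAN destination."""
    if not admin_up:
        # Admin shutdown: SAP, VXLAN destination, and oper-group all go down.
        return EpipeState(False, False, False, False)
    # The VXLAN destination and its oper-group follow egr-vtep reachability.
    vxlan_up = egr_vtep_in_route_table
    # The service follows the SAP only; a down VXLAN destination does not affect it.
    return EpipeState(
        service_up=sap_link_up,
        sap_up=sap_link_up,
        vxlan_dest_up=vxlan_up,
        oper_group_up=vxlan_up,
    )


# egr-vtep missing from the routing table while the SAP is up
print(epipe_oper_state(admin_up=True, sap_link_up=True, egr_vtep_in_route_table=False))
```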
The following features are not supported by Epipe services with VXLAN destinations:
per-service hashing
SDP-binds
PBB context
BGP-VPWS
spoke SDP-FEC
PW-port
Static VXLAN termination in VPLS/R-VPLS services
VXLAN instances in VPLS and R-VPLS can be configured with egress VTEPs. These are referred to as static VXLAN instances. The following configuration example shows a VPLS service that supports a static VXLAN instance:
config service vpls 1 name "vpls-1" customer 1 create
sap 1/1/1:1 create
exit
vxlan instance 1 vni 100 create
source-vtep-security
no disable-aging /* default: disable-aging
no disable-learning /* default: disable-learning
no discard-unknown-source
no max-nbr-mac-addr <table-size>
restrict-protected-src discard-frame
egr-vtep 192.0.2.1 create
exit
egr-vtep 192.0.2.2 create
exit
vxlan instance 2 vni 101 create
egr-vtep 192.0.2.3 create
exit
no shutdown
Specifically the following can be stated:
Each VPLS service can have up to two static VXLAN instances. Each instance is an implicit split-horizon-group, and up to 255 static VXLAN binds are supported in total, shared between the two VXLAN instances.
Single VXLAN instance VPLS services with static VXLAN are supported along with SAPs and SDP bindings. Therefore:
VNIs configured in static VXLAN instances are ‟symmetric”, that is, the same ingress and egress VNIs are used for VXLAN packets using that instance. Note that asymmetric VNIs are actually possible in EVPN VXLAN instances.
The addresses can be IPv4 or IPv6 (but not a mix within the same service).
A specified VXLAN instance can be configured with static egress VTEPs, or be associated with BGP EVPN, but the same instance cannot be configured to support both static and BGP-EVPN based VXLAN bindings.
Up to two VXLAN instances are supported per VPLS.
When two VXLAN instances are configured in the same VPLS service, any combination of static and BGP-EVPN enabled instances is supported. That is, the two VXLAN instances can be static, or BGP-EVPN enabled, or one of each type.
When a service is configured with EVPN and there is a static VXLAN instance in the same service, the user must configure restrict-protected-src discard-frame along with no disable-learning in the static VXLAN instance (service>vpls>vxlan context).
MAC addresses are also learned on the VXLAN bindings of the static VXLAN instance. Therefore, they are shown in the FDB commands. Note that disable-learning and disable-aging are enabled by default in a static VXLAN instance.
The learned MAC addresses are subject to the remote-age, and not the local-age (only MACs learned on SAPs use the local-age setting).
MAC addresses are learned on a VTEP as long as no disable-learning is configured, and the VXLAN VTEP is present in the base route table. When the VTEP disappears from the route table, the associated MACs are flushed.
The vpls vxlan source-vtep-security command can be configured per VXLAN instance on VPLS services. When enabled, the router performs an IPv4 source-vtep lookup to discover if the VXLAN packet comes from a trusted VTEP. If not, the router discards the frame. If the lookup yields a trusted source VTEP, then the frame is accepted.
A trusted VTEP is an egress VTEP that has been statically configured, or dynamically learned (through EVPN) in any service, Epipe or VPLS.
The command show service vxlan shows the list of trusted VTEPs in the router.
The command source-vtep-security works for static VXLAN instances or BGP-EVPN enabled VXLAN instances, but only for IPv4 VTEPs.
The command is mutually exclusive with assisted-replication (replicator or leaf) in the VNI instance. AR can still be configured in a different instance.
Static VXLAN instances can use non-system IPv4/IPv6 termination.
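Some of the limits listed above lend themselves to a pre-provisioning check. The following Python sketch is an assumption-laden illustration, not a Nokia tool: `check_static_vxlan` and its input format are hypothetical, and only three of the stated rules are checked (at most two VXLAN instances per VPLS, at most 255 static VXLAN binds shared between them, and no mix of IPv4 and IPv6 addresses within the service).

```python
import ipaddress

MAX_VXLAN_INSTANCES = 2   # per VPLS service
MAX_STATIC_BINDS = 255    # egress VTEPs, shared between the instances


def check_static_vxlan(instances):
    """Check a planned static VXLAN layout for one VPLS service.

    instances maps a VXLAN instance id to its list of egress VTEP addresses.
    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    if len(instances) > MAX_VXLAN_INSTANCES:
        problems.append("more than two VXLAN instances")
    vteps = [ipaddress.ip_address(a) for addrs in instances.values() for a in addrs]
    if len(vteps) > MAX_STATIC_BINDS:
        problems.append("more than 255 static VXLAN binds")
    if len({v.version for v in vteps}) > 1:
        problems.append("IPv4 and IPv6 egress VTEPs mixed in one service")
    return problems


# The layout from the configuration example: two instances, three egress VTEPs
print(check_static_vxlan({1: ["192.0.2.1", "192.0.2.2"], 2: ["192.0.2.3"]}))
```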
Non-system IPv4 and IPv6 VXLAN termination in VPLS, R-VPLS, and Epipe services
By default, only VXLAN packets with the same IP destination address as the system IPv4 address of the router can be terminated and processed for a subsequent MAC lookup. A router can simultaneously terminate VXLAN tunnels destined for its system IP address and three additional non-system IPv4 or IPv6 addresses, which can be on the base router or VPRN instances. This section describes the configuration requirements for services to terminate VXLAN packets destined for a non-system loopback IPv4 or IPv6 address on the base router or VPRN.
- Create the FPE (see FPE creation)
- Associate the FPE with VXLAN termination (see FPE association with VXLAN termination)
- Configure the router loopback interface (see VXLAN router loopback interface)
- Configure VXLAN termination (non-system) VTEP addresses (see VXLAN termination VTEP addresses)
- Add the service configuration (see VXLAN services)
-
FPE creation
A Forwarding Path Extension (FPE) is required to terminate non-system IPv4 or IPv6 VXLAN tunnels.
In a non-system IPv4 VXLAN termination, the FPE function is used for additional processing required at ingress (VXLAN tunnel termination) only, and not at egress (VXLAN tunnel origination).
If the IPv6 VXLAN terminates on a VPLS or Epipe service, the FPE function is used at ingress only, and not at egress.
For R-VPLS services terminating IPv6 VXLAN tunnels and also for VPRN VTEPs, the FPE is used for the egress as well as the VXLAN termination function. In the case of R-VPLS, an internal static SDP is created to allow the required extra processing.
For information about FPE configuration and functions, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Interface Configuration Guide, "Forwarding Path Extension".
-
FPE association with VXLAN termination
The FPE must be associated with the VXLAN termination application. The following example configuration shows two FPEs and their corresponding association. FPE 1 uses the base router and FPE 2 is configured for VXLAN termination on VPRN 10.
configure
    fwd-path-ext
        fpe 1 create
            path pxc pxc-1
            vxlan-termination
        fpe 2 create
            path pxc pxc-2
            vxlan-termination router 10
-
VXLAN router loopback interface
Create the interface that terminates and originates the VXLAN packets. The interface is created as a router interface, which is added to the Interior Gateway Protocol (IGP) and used by the BGP as the EVPN NLRI next hop.
Because the system cannot terminate the VXLAN on a local interface address, a subnet must be assigned to the loopback interface and not a host IP address that is /32 or /128. In the following example, all the addresses in subnet 10.11.11.0/24 (except 10.11.11.1, which is the interface IP) and subnet 10.1.1.0/24 (except 10.1.1.1) can be used for tunnel termination. The subnet is advertised using the IGP and is configured on either the base router or a VPRN. In the example, two subnets are assigned, in the base router and VPRN 10 respectively.
configure
    router
        interface "lo1"
            loopback
            address 10.11.11.1/24
        isis
            interface "lo1"
                passive
                no shutdown
configure
    service
        vprn 10 name "vprn10" customer 1 create
            interface "lo1"
                loopback
                address 10.1.1.1/24
            isis
                interface "lo1"
                    passive
                    no shutdown
A local interface address cannot be configured as a VXLAN tunnel-termination IP address in the CLI, as shown in the following example.
*A:PE-3# configure service system vxlan tunnel-termination 192.0.2.3 fpe 1 create
MINOR: SVCMGR #8353 VXLAN Tunnel termination IP address cannot be configured - IP address in use by another application or matches a local interface IP address
The subnet can be up to 31 bits. For example, to use 10.11.11.1 as the VXLAN termination address, the subnet should be configured and advertised as shown in the following example configuration.
interface "lo1"
    address 10.11.11.0/31
    loopback
    no shutdown
exit
isis 0
    interface "lo1"
        passive
        no shutdown
    exit
    no shutdown
exit
It is not a requirement for the remote PEs and NVEs to have the specific /32 or /128 IP address in their RTM to resolve the BGP EVPN NLRI next hop or forward the VXLAN packets. An RTM with a subnet that contains the remote VTEP can also perform these tasks.
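The addressing rule for the loopback subnet can be expressed with the Python standard `ipaddress` module. This is a planning aid sketched under assumptions: the function name `can_terminate` is hypothetical and the check only covers the two conditions stated above (the termination address must fall inside the loopback subnet and must differ from the interface address itself).

```python
import ipaddress


def can_terminate(loopback, candidate):
    """Return True if candidate is usable as a VXLAN termination address.

    loopback is the address/prefix configured on the loopback interface,
    for example "10.11.11.1/24". The candidate must belong to that subnet
    and must not be the local interface address.
    """
    iface = ipaddress.ip_interface(loopback)
    addr = ipaddress.ip_address(candidate)
    if addr.version != iface.version:
        return False
    return addr in iface.network and addr != iface.ip


print(can_terminate("10.11.11.1/24", "10.11.11.2"))   # inside the subnet
print(can_terminate("10.11.11.1/24", "10.11.11.1"))   # the interface address itself
print(can_terminate("10.11.11.0/31", "10.11.11.1"))   # the /31 case
```

A /32 or /128 loopback always fails this check, because the only address in the subnet is the interface address.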
Note: The system does not check for a pre-existing local base router loopback interface with a subnet corresponding to the VXLAN tunnel termination address. If a tunnel termination address is configured and the FPE is operationally up, the system starts terminating VXLAN traffic and responding to ICMP messages for that address. The following conditions are ignored in this scenario:
-
the presence of a loopback interface in the base router
-
the presence of an interface with the address contained in the configured subnet, and no loopback
The following example configuration includes an IPv6 address in the base router. It could also be configured in a VPRN instance.
configure
    router
        interface "lo1"
            loopback
            address 10.11.11.1/24
            ipv6
                address 2001:db8::/127
            exit
        isis
            interface "lo1"
                passive
                no shutdown
-
-
VXLAN termination VTEP addresses
The service>system>vxlan>tunnel-termination context allows the user to configure non-system IP addresses that can terminate the VXLAN and their corresponding FPEs.
As shown in the following example, an IP address may be associated with a new or existing FPE already terminating the VXLAN. The list of addresses that can terminate the VXLAN can include IPv4 and IPv6 addresses.
config service system vxlan#
    tunnel-termination 10.11.11.1 fpe 1 create
    tunnel-termination 2001:db8:1000::1 fpe 1 create
config service vprn 10 vxlan#
    tunnel-termination 10.1.1.2 fpe 2 create
The tunnel-termination command creates internal loopback interfaces that can respond to ICMP requests. In the following sample output, an internal loopback is created when the tunnel termination address is added (for 10.11.11.1 and 2001:db8:1000::1). The internal FPE router interfaces created by the VXLAN termination function are also shown in the output. Similar loopback and interfaces are created for tunnel termination addresses in a VPRN (not shown).
*A:PE1# show router interface
===============================================================================
Interface Table (Router: Base)
===============================================================================
Interface-Name                   Adm       Opr(v4/v6)  Mode    Port/SapId
   IP-Address                                                  PfxState
-------------------------------------------------------------------------------
_tmnx_fpe_1.a                    Up        Up/Up       Network pxc-2.a:1
   fe80::100/64                                                PREFERRED
_tmnx_fpe_1.b                    Up        Up/Up       Network pxc-2.b:1
   fe80::101/64                                                PREFERRED
_tmnx_vli_vxlan_1_131075         Up        Up/Up       Network loopback
   10.11.11.1/32                                               n/a
   2001:db8:1000::1                                            PREFERRED
   fe80::6cfb:ffff:fe00:0/64                                   PREFERRED
lo1                              Up        Up/Down     Network loopback
   10.11.11.0/31                                               n/a
system                           Up        Up/Down     Network system
   1.1.1.1/32                                                  n/a
<snip>
-
VXLAN services
By default, the VXLAN services use the system IP address as the source VTEP of the VXLAN encapsulated frames. The vxlan-src-vtep command in the config>service>vpls or config>service>epipe context enables the system to use a non-system IPv4 or IPv6 address as the source VTEP for the VXLAN tunnels in that service.
A different vxlan-src-vtep can be used for different services, as shown in the following example where two different services use different non-system IP addresses as source VTEPs.
configure service vpls 1 vxlan-src-vtep 10.11.11.1
configure service vpls 2 vxlan-src-vtep 2001:db8:1000::1
In addition, if a vxlan-src-vtep is configured and the service uses EVPN, the IP address is also used to set the BGP NLRI next hop in EVPN route advertisements for the service.
Note: The BGP EVPN next hop can be overridden by the use of export policies based on the following rules:
-
A BGP peer policy can override a next hop pushed by the vxlan-src-vtep configuration.
-
If the VPLS service is IPv6 (that is, the vxlan-src-vtep is IPv6) and a BGP peer export policy is configured with next-hop-self, the BGP next-hop is overridden with an IPv6 address auto-derived from the IP address of the system. The auto-derivation is based on RFC 4291. For example, ::ffff:10.20.1.3 is auto-derived from system IP 10.20.1.3.
-
The policy checks the address type of the next hop provided by the vxlan-src-vtep command. If the command provides an IPv6 next hop, the policy is unable to use an IPv4 address to override the IPv6 address provided by the vxlan-src-vtep command.
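The auto-derived next hop in the next-hop-self case is an IPv4-mapped IPv6 address (RFC 4291): the upper 80 bits are zero, the next 16 bits are all ones, and the low 32 bits carry the IPv4 address. A minimal Python illustration follows; the function name `ipv4_mapped_next_hop` is hypothetical and the code only shows the address arithmetic, not router behavior.

```python
import ipaddress


def ipv4_mapped_next_hop(system_ip):
    """Build the IPv4-mapped IPv6 address (::ffff:a.b.c.d) for an IPv4 address."""
    v4 = ipaddress.IPv4Address(system_ip)
    # 0xFFFF occupies bits 32..47; the IPv4 address fills the low 32 bits.
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


next_hop = ipv4_mapped_next_hop("10.20.1.3")
# Same address as the ::ffff:10.20.1.3 example given in the text.
print(next_hop == ipaddress.IPv6Address("::ffff:10.20.1.3"))
print(next_hop.ipv4_mapped)
```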
After the preceding steps are performed to configure a VXLAN termination, the VPLS, R-VPLS, or Epipe service can be used normally, except that the service terminates VXLAN tunnels with a non-system IPv4 or IPv6 destination address (in the base router or a VPRN instance) instead of the system IP address only.
The FPE vxlan-termination function creates internal router interfaces and loopbacks that are displayed by the show commands. When configuring IPv6 VXLAN termination on an R-VPLS service, as well as the internal router interfaces and loopbacks, the system creates internal SDP bindings for the required egress processing. The following output shows an example of an internal FPE-type SDP binding created for IPv6 R-VPLS egress processing.
*A:PE1# show service sdp-using
===============================================================================
SDP Using
===============================================================================
SvcId      SdpId              Type   Far End              Opr   I.Label  E.Label
                                                          State
-------------------------------------------------------------------------------
2002       17407:2002         Fpe    fpe_1.b              Up    262138   262138
-------------------------------------------------------------------------------
Number of SDPs : 1
-------------------------------------------------------------------------------
===============================================================================
When BGP EVPN is used, the BGP peer over which the EVPN-VXLAN updates are received can be an IPv4 or IPv6 peer, regardless of whether the next-hop is an IPv4 or IPv6 address.
The same VXLAN tunnel termination address cannot be configured on different router instances; that is, on two different VPRN instances or on a VPRN and the base router.
-
EVPN for overlay tunnels
This section describes the specifics of EVPN for non-MPLS Overlay tunnels.
BGP-EVPN control plane for VXLAN overlay tunnels
RFC 8365 describes EVPN as the control plane for overlay-based networks. The 7750 SR, 7450 ESS, and 7950 XRS support all routes and features described in RFC 7432 that are required for the DGW function. EVPN multihoming and BGP multihoming based on the L2VPN BGP address family are both supported if redundancy is needed.
EVPN-VXLAN required routes and communities shows the EVPN MP-BGP NLRI, required attributes and extended communities, and two route types supported for the DGW Layer 2 applications:
- route type 3: Inclusive Multicast Ethernet Tag route
- route type 2: MAC/IP advertisement route
EVPN route type 3 – inclusive multicast Ethernet tag route
Route type 3 is used to set up the flooding tree (BUM flooding) for a specified VPLS service in the data center. The received inclusive multicast routes add entries to the VPLS flood list in the 7750 SR, 7450 ESS, and 7950 XRS. The tunnel types supported in an EVPN route type 3 when BGP-EVPN MPLS is enabled are ingress replication, P2MP MLDP, and composite tunnels.
Ingress Replication (IR) and Assisted Replication (AR) are supported for VXLAN tunnels. See Layer 2 multicast optimization for VXLAN (Assisted-Replication) for more information about the AR.
If ingress-repl-inc-mcast-advertisement is enabled, a route type 3 is generated by the router per VPLS service as soon as the service is in an operationally up state. The following fields and values are used:
-
Route Distinguisher is taken from the RD of the VPLS service within the BGP context.
Note: The RD can be configured or derived from the bgp-evpn evi value.
-
Ethernet Tag ID is 0.
-
IP address length is always 32.
-
Originating router’s IP address carries an IPv4 or IPv6 address.
Note: By default, the IP address of the Originating router is derived from the system IP address. However, this can be overridden by the configure service vpls bgp-evpn incl-mcast-orig-ip ip-address command for the Ingress Replication (and mLDP if MPLS is used) tunnel type.
-
For PMSI Tunnel Attribute (PTA), tunnel type = Ingress replication (6) or Assisted Replication (10)
-
Leaf not required for Flags.
-
MPLS label carries the VNI configured in the VPLS service. Only one VNI can be configured per VPLS service.
-
Tunnel endpoint is equal to the system IP address.
As shown in PMSI attribute flags field for AR, additional flags are used in the PTA when the service is configured for AR.
The Flags field is defined as a Type field (for AR) with two new flags that are defined as follows:
-
T is the AR Type field (2 bits):
-
00 (decimal 0) = RNVE (non-AR support)
-
01 (decimal 1) = AR REPLICATOR
-
10 (decimal 2) = AR LEAF
-
-
The U and BM flags defined in IETF Draft draft-ietf-bess-evpn-optimized-ir are not used in the SR OS.
AR-R and AR-L routes and usage describes the inclusive multicast route information sent per VPLS service when the router is configured as assisted-replication replicator (AR-R) or assisted-replication leaf (AR-L). A Regular Network Virtualization Edge device (RNVE) is defined as an EVPN-VXLAN router that does not support (or is not configured for) Assisted-Replication.
Note: For AR-R, two inclusive multicast routes may be advertised if ingress-repl-inc-mcast-advertisement is enabled: a route with tunnel-type IR, tunnel-id = IR IP (generally system-ip) and a route with tunnel-type AR, tunnel-id = AR IP (the address configured in the assisted-replication-ip command).
Table 1. AR-R and AR-L routes and usage
AR role | Function | Inclusive Mcast routes advertisement
AR-R | Assists AR-LEAFs | IR included in the Mcast route (uses IR IP) if ingress-repl-inc-mcast-advertisement is enabled; AR included in the Mcast route (uses AR IP, tunnel type=AR, T=1)
AR-LEAF | Sends BM only to AR-Rs | IR inclusive multicast route (IR IP, T=2) if ingress-repl-inc-mcast-advertisement is enabled
RNVE | Non-AR support | IR inclusive multicast route (IR IP) if ingress-repl-inc-mcast-advertisement is enabled
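The per-role advertisement rules in Table 1 can be restated as a short Python sketch. The names `ArType` and `inclusive_mcast_routes` are hypothetical. The T value of the IR route sent by an AR-R is not given in the table, so the sketch leaves it as None rather than guessing.

```python
from enum import IntEnum


class ArType(IntEnum):
    """AR Type (T) values carried in the PTA Flags field."""
    RNVE = 0b00
    AR_REPLICATOR = 0b01
    AR_LEAF = 0b10


def inclusive_mcast_routes(role, ir_ip, ar_ip=None, ir_advertisement=True):
    """Return the inclusive multicast routes a node advertises per VPLS service.

    Each route is a (tunnel_type, tunnel_id, t_value) tuple. ir_advertisement
    stands for the ingress-repl-inc-mcast-advertisement setting.
    """
    routes = []
    if ir_advertisement:
        if role is ArType.AR_REPLICATOR:
            t_value = None  # not stated in Table 1 for the IR route of an AR-R
        else:
            t_value = role  # T=2 for AR-LEAF, T=0 for RNVE
        routes.append(("IR", ir_ip, t_value))
    if role is ArType.AR_REPLICATOR:
        routes.append(("AR", ar_ip, ArType.AR_REPLICATOR))
    return routes


# Example addresses are placeholders
for route in inclusive_mcast_routes(ArType.AR_REPLICATOR, "192.0.2.1", "192.0.2.101"):
    print(route)
```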
EVPN route type 2 – MAC/IP advertisement route
The 7750 SR, 7450 ESS, and 7950 XRS generate this route type for advertising MAC addresses. If mac-advertisement is enabled, the router generates MAC advertisement routes for the following:
-
learned MACs on SAPs or SDP bindings
-
conditional static MACs
The route type 2 generated by a router uses the following fields and values:
-
Route Distinguisher is taken from the RD of the VPLS service within the BGP context.
Note: The RD can be configured or derived from the bgp-evpn evi value.
-
Ethernet Segment Identifier (ESI) value = 0:0:0:0:0:0:0:0:0:0 or non-zero, depending on whether the MAC addresses are learned on an Ethernet Segment.
-
Ethernet Tag ID is 0.
-
MAC address length is always 48.
-
MAC Address:
-
is 00:00:00:00:00:00 for the Unknown MAC route address.
-
is different from 00:…:00 for the rest of the advertised MACs.
-
-
IP address and IP address length:
-
The length of the IP address associated with the MAC being advertised is either 32 for IPv4 or 128 for IPv6.
-
If the MAC address is the Unknown MAC route, the IP address length is zero and the IP omitted.
-
In general, any MAC route without IP has IPL=0 (IP length) and the IP is omitted.
-
A received route with an IPL value other than 0, 32, or 128 is discarded.
-
-
MPLS Label 1 carries the VNI configured in the VPLS service. Only one VNI can be configured per VPLS.
-
MPLS Label 2 is 0.
-
MAC Mobility extended community is used for signaling the sequence number in case of MAC moves and the sticky bit in case of advertising conditional static MACs. If a MAC route is received with a MAC mobility ext-community, the sequence number and the sticky bit are considered for the route selection.
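The IP address length rules for route type 2 are simple enough to show as code. The following Python sketch is illustrative only: the function names are hypothetical and the checks cover just the IPL and Unknown MAC rules listed above, not full route validation.

```python
import ipaddress

VALID_IP_LENGTHS = {0, 32, 128}   # no IP, IPv4, IPv6
UNKNOWN_MAC = "00:00:00:00:00:00"


def ip_length_for(ip=None):
    """IP address length (IPL) to advertise: 0 when the IP is omitted."""
    if ip is None:
        return 0
    return 32 if ipaddress.ip_address(ip).version == 4 else 128


def accept_ip_length(ip_length):
    """A received MAC/IP route is discarded unless the IPL is 0, 32, or 128."""
    return ip_length in VALID_IP_LENGTHS


def is_unknown_mac_route(mac, ip_length):
    """The Unknown MAC route uses the all-zero MAC and carries no IP address."""
    return mac.lower() == UNKNOWN_MAC and ip_length == 0


print(ip_length_for("10.0.0.232"), ip_length_for("2001:db8::1"), ip_length_for())
print(accept_ip_length(24))
```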
When EVPN-VXLAN multihoming is enabled, type 1 routes (Auto-Discovery per-ES and per-EVI routes) and type 4 routes (ES routes) are also generated and processed. See BGP-EVPN control plane for MPLS tunnels for more information about route types 1 and 4.
EVPN route type 5 – IP prefix route
EVPN route-type 5 shows the IP prefix route or route-type 5.
The router generates this route type for advertising IP prefixes in EVPN. The router generates IP prefix advertisement routes for IP prefixes existing in a VPRN linked to the IRB backhaul R-VPLS service.
The route-type 5 generated by a router uses the following fields and values:
-
Route Distinguisher: taken from the RD configured in the IRB backhaul R-VPLS service within the BGP context
-
Ethernet Segment Identifier (ESI): value = 0:0:0:0:0:0:0:0:0:0
-
Ethernet Tag ID: 0
-
IP address length: any value in the 0 to 128 range
-
IP address: any valid IPv4 or IPv6 address
-
Gateway IP address: can carry two different values:
-
if different from zero, the route-type 5 carries the primary IP interface address of the VPRN behind which the IP prefix is known. This is the case for the regular IRB backhaul R-VPLS model.
-
if 0.0.0.0, the route-type 5 is sent with a MAC next-hop extended community that carries the VPRN interface MAC address. This is the case for the EVPN tunnel R-VPLS model.
-
-
MPLS Label: carries the VNI configured in the VPLS service. Only one VNI can be configured per VPLS service.
All the routes in EVPN-VXLAN are sent with the RFC 5512 tunnel encapsulation extended community, with the tunnel type value set to VXLAN.
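The two meanings of the Gateway IP address field in route type 5 can be sketched as a small classifier. The function name `r_vpls_model` and the returned labels are hypothetical. The text only mentions 0.0.0.0 for the zero case, so treating the IPv6 unspecified address (::) the same way is an assumption.

```python
import ipaddress


def r_vpls_model(gateway_ip):
    """Classify a route type 5 by its Gateway IP address field.

    A zero gateway IP corresponds to the EVPN tunnel R-VPLS model (the route
    carries a MAC next-hop extended community with the VPRN interface MAC).
    A non-zero gateway IP corresponds to the regular IRB backhaul R-VPLS model
    (the field carries the primary IP interface address of the VPRN).
    """
    gw = ipaddress.ip_address(gateway_ip)
    # Assumption: "::" is treated like 0.0.0.0.
    return "evpn-tunnel" if int(gw) == 0 else "irb-backhaul"


print(r_vpls_model("0.0.0.0"))
print(r_vpls_model("10.1.1.1"))
```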
EVPN for VXLAN in VPLS services
The EVPN-VXLAN service is designed around the current VPLS objects and the additional VXLAN construct.
Layer 2 DC PE with VPLS to the WAN shows a DC with a Layer 2 service that carries the traffic for a tenant who wants to extend a subnet beyond the DC. The DC PE function is carried out by the 7750 SR, 7450 ESS, and 7950 XRS where a VPLS instance exists for that particular tenant. Within the DC, the tenant has VPLS instances in all the Network Virtualization Edge (NVE) devices where they require connectivity (such VPLS instances can be instantiated in TORs, Nuage VRS, VSG, and so on). The VPLS instances in the redundant DGW and the DC NVEs are connected by VXLAN bindings. BGP-EVPN provides the required control plane for such VXLAN connectivity.
The DGW routers are configured with a VPLS per tenant that provides the VXLAN connectivity to the Nuage VPLS instances. On the router, each tenant VPLS instance is configured with:
-
The WAN-related parameters (SAPs, spoke SDPs, mesh-SDPs, BGP-AD, and so on).
-
The BGP-EVPN and VXLAN (VNI) parameters. The following CLI output shows an example for an EVPN-VXLAN VPLS service.
*A:DGW1>config>service>vpls# info
----------------------------------------------
description "vxlan-service"
vxlan instance 1 vni 1 create
exit
bgp
route-distinguisher 65001:1
route-target export target:65000:1 import target:65000:1
exit
bgp-evpn
unknown-mac-route
mac-advertisement
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
sap 1/1/1:1 create
exit
no shutdown
----------------------------------------------
The bgp-evpn context specifies the encapsulation type (only vxlan is supported) to be used by EVPN and other parameters like the unknown-mac-route and mac-advertisement commands. These commands are typically configured in three different ways:
-
If the operator configures no unknown-mac-route and mac-advertisement (default option), the router advertises new learned MACs (on the SAPs or SDP bindings) or new conditional static MACs.
-
If the operator configures unknown-mac-route and no mac-advertisement, the router only advertises an unknown-mac-route as long as the service is operationally up (if no BGP-MH site is configured in the service) or the router is the DF (if BGP-MH is configured in the service).
-
If the operator configures unknown-mac-route and mac-advertisement, the router advertises newly learned MACs, conditional static MACs, and the unknown-mac-route. The unknown-mac-route is advertised only under the conditions described in the preceding item.
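The three combinations can be summarized as a small decision function. The following Python sketch is illustrative only: the function name, the parameter names, and the route-kind strings are hypothetical and are not part of the router software.

```python
def advertised_routes(unknown_mac_route, mac_advertisement,
                      service_up, site_configured=False, is_df=False):
    """Return the kinds of EVPN MAC routes the router would advertise."""
    routes = set()
    if mac_advertisement:
        # Learned MACs (on SAPs or SDP bindings) and conditional static MACs.
        routes.update({"learned-macs", "conditional-static-macs"})
    if unknown_mac_route:
        # Without a BGP-MH site, the service must be operationally up.
        # With a BGP-MH site, the router must be the DF.
        allowed = is_df if site_configured else service_up
        if allowed:
            routes.add("unknown-mac-route")
    return routes


# Default option: no unknown-mac-route, mac-advertisement enabled.
print(sorted(advertised_routes(False, True, service_up=True)))
```

The sketch treats the DF condition and the operationally-up condition as mutually exclusive checks, which mirrors the wording of the list but may simplify the actual behavior.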
Other parameters related to EVPN or VXLAN are:
-
MAC duplication parameters
-
VXLAN VNI (defines the VNI that the router uses in the EVPN routes generated for the VPLS service)
After the VPLS is configured and operationally up, the router sends or receives Inclusive Multicast Ethernet Tag routes, and a full mesh of VXLAN connections is automatically created. These VXLAN "auto-bindings" can be characterized as follows:
-
The VXLAN auto-binding model is based on an IP-VPN-like design, where no SDPs or SDP binding objects are created by or visible to the user. The VXLAN auto-binds are composed of remote VTEPs and egress VNIs, and can be displayed with the following command:
-
show service id 112 vxlan destinations
Output example
===============================================================================
Egress VTEP, VNI (Instance 1)
===============================================================================
VTEP Address                    Egress VNI    Oper     Mcast    Num
                                              State             MACs
-------------------------------------------------------------------------------
192.0.2.2                       112           Up       BUM      1
192.0.2.3                       112           Down     BUM      0
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 2
===============================================================================
-
show service id 112 vxlan destinations detail
Output example
===============================================================================
Egress VTEP, VNI (Instance 1)
===============================================================================
VTEP Address                    Egress VNI    Oper     Mcast    Num
                                              State             MACs
-------------------------------------------------------------------------------
192.0.2.2                       112           Up       BUM      1
  Oper Flags       : None
  Type             : evpn
  L2 PBR           : No
  Sup BCast Domain : No
  Last Update      : 02/03/2023 22:15:06
192.0.2.3                       112           Down     BUM      0
  Oper Flags       : MTU-Mismatch
  Type             : evpn
  L2 PBR           : No
  Sup BCast Domain : No
  Last Update      : 01/31/2023 21:28:39
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 2
===============================================================================
-
If the following command is configured on the PEs attached to the same service, the service MTU value is advertised in the EVPN Layer-2 Attributes extended community along with the Inclusive Multicast Ethernet Tag routes.
- MD-CLI
configure service vpls bgp-evpn routes incl-mcast advertise-l2-attributes
- classic CLI
configure service vpls bgp-evpn incl-mcast-l2-attributes-advertisement
configure service vpls bgp-evpn ignore-mtu-mismatch
- MD-CLI
-
The VXLAN bindings observe the VPLS split-horizon rule. This is performed automatically without the need for any split-horizon configuration.
-
BGP Next-Hop Tracking for EVPN is fully supported. If the BGP next-hop for a specified received BGP EVPN route disappears from the routing table, the BGP route is not marked as "used" and the respective entry in show service id vxlan destinations is removed.
After the flooding domain is set up, the routers and DC NVEs start advertising MAC addresses, and the routers can learn MACs and install them in the FDB. Some considerations are the following:
-
All the MAC addresses associated with remote VTEP/VNIs are always learned in the control plane by EVPN. Data plane learning on VXLAN auto-bindings is not supported.
-
When unknown-mac-route is configured, it is generated when no (BGP-MH) site is configured, or a site is configured AND the site is DF in the PE.
Note: The unknown-mac-route is not installed in the FDB; therefore, it does not show up in the output of the show service id svc-id fdb detail command.
-
While the router can be configured with only one VNI (and signals a single VNI per VPLS), it can accept any VNI in the received EVPN routes as long as the route target is properly imported. The VTEPs and VNIs show up in the FDB associated with MAC addresses:
A:PE65# show service id 1000 fdb detail
===============================================================================
Forwarding Database, Service 1000
===============================================================================
ServId    MAC               Source-Identifier        Type     Last Change
                                                     Age
-------------------------------------------------------------------------------
1000      00:00:00:00:00:01 vxlan-1:                 Evpn     10/05/13 23:25:57
                            192.0.2.63:1063
1000      00:00:00:00:00:65 sap:1/1/1:1000           L/30     10/05/13 23:25:57
1000      00:ca:ca:ca:ca:00 vxlan-1:                 EvpnS    10/04/13 17:35:43
                            192.0.2.63:1063
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
Resiliency and BGP multihoming
The DC overlay infrastructure relies on IP tunneling, that is, VXLAN; therefore, the underlay IP layer resolves failures in the DC core. The IGP should be optimized for the fastest possible convergence.
From a service perspective, resilient connectivity to the WAN may be provided by BGP multihoming.
Use of BGP-EVPN, BGP-AD, and sites in the same VPLS service
BGP-EVPN (control plane for a VXLAN DC), BGP-AD (control plane for MPLS-based spoke SDPs connected to the WAN), and one site for BGP multihoming (control plane for the multihomed connection to the WAN) can all be configured in the same service on a specified system. If that is the case, the following considerations apply:
The configured BGP route-distinguisher and route-target are used by BGP for the two families, that is, evpn and l2vpn. If different import/export route targets are to be used per family, vsi-import/export policies must be used.
The pw-template-binding command under BGP does not have any effect on evpn or bgp-mh. It is only used for the instantiation of the BGP-AD spoke SDPs.
If the same import/export route-targets are used in the two redundant DGWs, VXLAN binding as well as a fec129 spoke SDP binding is established between the two DGWs, creating a loop. To avoid creating a loop, the router allows the establishment of an EVPN VXLAN binding and an SDP binding to the same far-end, but the SDP binding is kept operationally down. Only the VXLAN binding is operationally up.
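The loop-avoidance rule in the last consideration can be pictured as follows. This Python sketch is a simplification under stated assumptions: the function name and the tuple layout are hypothetical, and every binding is assumed to be otherwise able to come up.

```python
def binding_oper_states(bindings):
    """bindings: list of (kind, far_end) tuples; kind is 'vxlan' or 'sdp'.

    Returns a dict mapping each binding to 'up' or 'down'.
    """
    # Far-ends that already have an EVPN VXLAN binding.
    vxlan_far_ends = {far_end for kind, far_end in bindings if kind == "vxlan"}
    states = {}
    for kind, far_end in bindings:
        if kind == "sdp" and far_end in vxlan_far_ends:
            # An SDP binding to the same far-end is kept operationally down.
            states[(kind, far_end)] = "down"
        else:
            states[(kind, far_end)] = "up"
    return states


# 192.0.2.2 stands for the peer DGW; 192.0.2.9 is a WAN-only PE (both hypothetical).
states = binding_oper_states(
    [("vxlan", "192.0.2.2"), ("sdp", "192.0.2.2"), ("sdp", "192.0.2.9")]
)
print(states)
```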
Use of the unknown-mac-route
This section describes the behavior of the EVPN-VXLAN service in the router when the unknown-mac-route and BGP-MH are configured at the same time.
The use of EVPN as the control plane of NVO networks in the DC provides a significant number of benefits, as described in RFC 8365 (formerly draft-ietf-bess-evpn-overlay).
However, there is a potential issue that must be addressed when a VPLS DCI is used for an NVO3-based DC: all the MAC addresses learned from the WAN side of the VPLS must be advertised by BGP EVPN updates. Even if optimized BGP techniques like RT-constraint are used, the number of MAC addresses to advertise or withdraw (in case of failure) from the DC GWs can be difficult to control and overwhelming for the DC network, especially when the NVEs reside in the hypervisors.
The 7750 SR, 7450 ESS, and 7950 XRS solution to this issue is based on the use of an unknown-mac-route address that is advertised by the DC PEs. By using this unknown-mac-route advertisement, the DC tenant may decide to optionally turn off the advertisement of WAN MAC addresses in the DGW, therefore, reducing the control plane overhead and the size of the FDB tables in the NVEs.
The use of the unknown-mac-route is optional and helps to reduce the amount of unknown-unicast traffic within the data center. All the receiving NVEs supporting this concept send any unknown-unicast packet to the owner of the unknown-mac-route, as opposed to flooding the unknown-unicast traffic to all other NVEs that are part of the same VPLS.
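From the point of view of a receiving NVE, the effect of the unknown-mac-route on unknown-unicast traffic can be sketched as below. The function and parameter names are hypothetical, and the sketch models only the forwarding decision, not any NVE product.

```python
def unicast_targets(dst_mac, fdb, umr_owner, other_nves):
    """Return the list of destinations for a unicast frame on an NVE.

    fdb maps known MAC addresses to their destination.
    umr_owner is the advertiser of the unknown-mac-route, or None.
    other_nves lists all other NVEs in the same VPLS.
    """
    if dst_mac in fdb:
        return [fdb[dst_mac]]      # known MAC: single destination
    if umr_owner is not None:
        return [umr_owner]         # unknown MAC: send to the route owner only
    return list(other_nves)        # no unknown-mac-route: flood


fdb = {"00:00:00:00:00:01": "nve-2"}
print(unicast_targets("00:00:00:00:00:99", fdb, "dgw-1", ["nve-2", "nve-3", "dgw-1"]))
```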
The use of the unknown-mac-route assumes the following:
A fully virtualized DC where all the MACs are learned in the control plane, and are learned before any communication takes place (no legacy TORs or VLAN-connected servers).
The only exception is MACs learned over the SAPs/SDP bindings that are part of the BGP-MH WAN site-id. Only one site-id is supported in this case.
No other SAPs/SDP bindings out of the WAN site-id are supported, unless only static MACs are used on those SAPs/SDP bindings.
Therefore, when unknown-mac-route is configured, it is only generated when one of the following applies:
No site is configured and the service is operationally up.
A BGP-MH site is configured AND the DGW is Designated Forwarder (DF) for the site. In case of BGP-MH failover, the unknown-mac-route is withdrawn by the former DF and advertised by the new DF.
EVPN for VXLAN in R-VPLS services
Gateway IRB on the DC PE for an L2 EVPN/VXLAN DC shows a DC with a Layer 2 service that carries the traffic for a tenant who extends a subnet within the DC, while the DGW is the default gateway for all the hosts in the subnet. The DGW function is carried out by the 7750 SR, 7450 ESS, and 7950 XRS where an R-VPLS instance exists for that particular tenant. Within the DC, the tenant has VPLS instances in all the NVE devices where they require connectivity (such VPLS instances can be instantiated in TORs, Nuage VRS, VSG, and so on). The WAN connectivity is based on existing IP-VPN features.
In this model, the DGW routers are configured with an R-VPLS per tenant (bound to the VPRN that provides the WAN connectivity) that provides the VXLAN connectivity to the Nuage VPLS instances. This model provides inter-subnet forwarding for L2-only TORs and other L2 DC NVEs.
On the router:
The VPRN is configured with an interface bound to the backhaul R-VPLS. That interface is a regular IP interface (IP address configured or possibly a Link Local Address if IPv6 is added).
The VPRN can support other numbered interfaces to the WAN or even to the DC.
The R-VPLS is configured with the BGP, BGP-EVPN and VXLAN (VNI) parameters.
The Nuage VSGs and NVEs use a regular VPLS service model with BGP EVPN and VXLAN parameters.
Consider the following:
Route-type 2 routes with MACs and IPs are advertised. Some considerations about MAC+IP and ARP/ND entries are:
The 7750 SR advertises its IRB MAC+IP in a route-type 2 route and, if it runs VRRP and is the active router, also the VRRP vMAC+vIP. In both cases, the MACs are advertised as static MACs and are therefore protected by the receiving PEs.
If the 7750 SR VPRN interface is configured with one or more additional secondary IP addresses, they are all advertised in route-type 2 routes, as static MACs.
The 7750 SR processes route-type 2 routes as usual, populating the FDB with the received MACs and the VPRN ARP/ND table with the MAC and IPs, respectively.
Note: ND entries received from the EVPN are installed as Router entries. The ARP/ND entries coming from the EVPN are tagged as evpn.
-
When a VPLS containing proxy-ARP/proxy-ND entries is bound to a VPRN (allow-ip-int-bind) all the proxy-ARP/proxy-ND entries are moved to the VPRN ARP/ND table. ARP/ND entries are also moved to proxy-ARP/proxy-ND entries if the VPLS is unbound.
-
EVPN does not program EVPN-received ARP/ND entries if the receiving VPRN has no IP addresses for the same subnet. The entries are added when the IP address for the same subnet is added.
-
Static ARP/ND entries have precedence over dynamic and EVPN ARP/ND entries.
VPRN interface binding to VPLS service brings down the VPRN interface operational status, if the VPRN interface MAC or the VRRP MAC matches a static-mac or OAM MAC configured in the associated VPLS service. If that is the case, a trap is generated.
Redundancy is handled by VRRP. The active 7750 SR advertises vMAC and vIP, as discussed, including the MAC mobility extended community and the sticky bit.
EVPN-enabled R-VPLS services are also supported on IES interfaces.
EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes
Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows a Layer 3 DC model, where a VPRN is defined in the DGWs, connecting the tenant to the WAN. That VPRN instance is connected to the VPRNs in the NVEs by means of an IRB backhaul R-VPLS. Because the IRB backhaul R-VPLS provides connectivity only to all the IRB interfaces and the DGW VPRN is not directly connected to all the tenant subnets, the WAN ip-prefixes in the VPRN routing table must be advertised in EVPN. In the same way, the NVEs send IP prefixes in EVPN that are received by the DGW and imported in the VPRN routing table.
Local router interface host addresses are not advertised in EVPN by default. To advertise them, the ip-route-advertisement incl-host command must be enabled. For example:
===============================================================================
Route Table (Service: 2)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Active Metric
-------------------------------------------------------------------------------
10.1.1.0/24 Local Local 00h00m11s 0
if Y 0
10.1.1.100/32 Local Host 00h00m11s 0
if Y 0
==============================================================================
For the case displayed by the output above, the behavior is the following:
ip-route-advertisement (default): only the local subnet, 10.1.1.0/24, is advertised.
ip-route-advertisement incl-host: the local subnet and the host address are advertised, that is, 10.1.1.0/24 and 10.1.1.100/32.
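The difference between the two settings can be expressed as a filter over the local route table entries shown in the output. This Python sketch is illustrative: the tuple layout and the function name are hypothetical, and only the Proto column is used to recognize host routes.

```python
# (prefix, type, proto) taken from the route table output above.
LOCAL_ROUTES = [
    ("10.1.1.0/24", "Local", "Local"),
    ("10.1.1.100/32", "Local", "Host"),
]


def advertised_prefixes(routes, incl_host=False):
    """Return the prefixes advertised in EVPN for the given setting."""
    selected = []
    for prefix, _rtype, proto in routes:
        # Host addresses are skipped unless incl-host is enabled.
        if proto == "Host" and not incl_host:
            continue
        selected.append(prefix)
    return selected


print(advertised_prefixes(LOCAL_ROUTES))                  # default
print(advertised_prefixes(LOCAL_ROUTES, incl_host=True))  # incl-host
```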
Below is an example of VPRN (500) with two IRB interfaces connected to backhaul R-VPLS services 501 and 502 where EVPN-VXLAN runs:
vprn 500 customer 1 create
ecmp 4
route-distinguisher 65072:500
vrf-target target:65000:500
interface "evi-502" create
address 10.20.20.72/24
vpls "evpn-vxlan-502"
exit
exit
interface "evi-501" create
address 10.10.10.72/24
vpls "evpn-vxlan-501"
exit
exit
no shutdown
vpls 501 name "evpn-vxlan-501" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 501 create
exit
bgp
route-distinguisher 65072:501
route-target export target:65000:501 import target:65000:501
exit
bgp-evpn
ip-route-advertisement incl-host
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
no shutdown
exit
vpls 502 name "evpn-vxlan-502" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 502 create
exit
bgp
route-distinguisher 65072:502
route-target export target:65000:502 import target:65000:502
exit
bgp-evpn
ip-route-advertisement incl-host
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
no shutdown
exit
When the above commands are enabled, the router behaves as follows:
Receive route-type 5 routes and import the IP prefixes and associated IP next-hops into the VPRN routing table.
If the route-type 5 is successfully imported by the router, the prefix included in the route-type 5 (for example, 10.0.0.0/24) is added to the VPRN routing table with a next-hop equal to the gateway IP included in the route (for example, 192.0.0.1, which refers to the IRB IP address of the remote VPRN behind which the IP prefix sits).
When the router receives a packet from the WAN to the 10.0.0.0/24 subnet, the IP lookup on the VPRN routing table yields 192.0.0.1 as the next-hop. That next-hop is resolved to a MAC in the ARP table, and the MAC is resolved to a VXLAN tunnel in the FDB table.
Note: IRB MAC and IP addresses are advertised in the IRB backhaul R-VPLS in routes type 2.
Generate route-type 5 routes for the IP prefixes in the associated VPRN routing table.
For example, if VPRN-1 is attached to EVPN R-VPLS 1 and EVPN R-VPLS 2, and R-VPLS 2 has bgp-evpn ip-route-advertisement configured, the 7750 SR advertises the R-VPLS 1 interface subnet in one route-type 5.
Routing policies can filter the imported and exported IP prefix routes accordingly.
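The receive-side lookup chain (IP prefix to gateway IP, gateway IP to MAC, MAC to VXLAN tunnel) can be sketched with three dictionaries. This is a minimal Python illustration: the function name, the MAC address, and the VTEP/VNI values are hypothetical, and a real router would attempt ARP resolution where this sketch simply returns None.

```python
import ipaddress


def resolve_vxlan_tunnel(dst_ip, route_table, arp_table, fdb):
    """Resolve a destination IP to a (VTEP, VNI) tuple, or None."""
    dst = ipaddress.ip_address(dst_ip)
    # Step 1: longest-prefix match in the VPRN routing table.
    matches = [
        (ipaddress.ip_network(prefix), gw_ip)
        for prefix, gw_ip in route_table.items()
        if dst in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    _, gw_ip = max(matches, key=lambda m: m[0].prefixlen)
    # Step 2: gateway IP to MAC through the ARP table.
    mac = arp_table.get(gw_ip)
    if mac is None:
        return None
    # Step 3: MAC to VXLAN tunnel through the FDB.
    return fdb.get(mac)


route_table = {"10.0.0.0/24": "192.0.0.1"}           # from the route-type 5
arp_table = {"192.0.0.1": "00:00:5e:00:53:01"}       # from the route-type 2
fdb = {"00:00:5e:00:53:01": ("192.0.2.63", 501)}     # remote VTEP and VNI
print(resolve_vxlan_tunnel("10.0.0.25", route_table, arp_table, fdb))
```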
The VPRN routing table can receive routes from all the supported protocols (BGP-VPN, OSPF, IS-IS, RIP, static routing) as well as from IP prefixes from EVPN, as shown below:
*A:PE72# show router 500 route-table
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.20.20.0/24 Local Local 01d11h10m 0
evi-502 0
10.20.20.71/32 Remote BGP EVPN 00h02m26s 169
10.10.10.71 0
10.10.10.0/24 Remote Static 00h00m05s 5
10.10.10.71 1
10.16.0.1/32 Remote BGP EVPN 00h02m26s 169
10.10.10.71 0
-------------------------------------------------------------------------------
No. of Routes: 4
The following considerations apply:
The route Preference for EVPN IP prefixes is 169.
BGP IP-VPN routes have a preference of 170 by default, therefore, if the same route is received from the WAN over BGP-VPRN and from BGP-EVPN, then the EVPN route is preferred.
When the same route-type 5 prefix is received from different gateway IPs, ECMP is supported if configured in the VPRN.
All routes in the VPRN routing table (as long as they do not point back to the EVPN R-VPLS interface) are advertised via EVPN.
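The preference and ECMP considerations can be illustrated with a simplified selection function. The Python sketch below assumes that the lowest preference value wins and that equal-preference next-hops are used up to the configured ECMP limit; real route selection involves additional tie-breakers that are not modeled here. All names are hypothetical.

```python
def select_next_hops(candidates, ecmp=1):
    """candidates: list of (protocol, preference, next_hop) for one prefix.

    Returns the next-hops installed for the prefix.
    """
    best_pref = min(pref for _proto, pref, _nh in candidates)
    best = [nh for _proto, pref, nh in candidates if pref == best_pref]
    # Install at most 'ecmp' equal-preference next-hops.
    return best[:max(ecmp, 1)]


candidates = [
    ("BGP EVPN", 169, "10.10.10.71"),
    ("BGP EVPN", 169, "10.10.10.73"),
    ("BGP VPN", 170, "192.0.2.69"),
]
print(select_next_hops(candidates, ecmp=4))
```

With the default preferences, the BGP IP-VPN path is never selected while an EVPN path for the same prefix exists.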
Although the description above is focused on IPv4 interfaces and prefixes, it applies to IPv6 interfaces too. The following considerations are specific to IPv6 VPRN R-VPLS interfaces:
IPv4 and IPv6 interfaces can be defined on R-VPLS IP interfaces at the same time (dual-stack).
The user may configure specific IPv6 Global Addresses on the VPRN R-VPLS interfaces. If a specific Global IPv6 Address is not configured on the interface, the Link Local Address interface MAC/IP is advertised in a route type 2 as soon as IPv6 is enabled on the VPRN R-VPLS interface.
Routes type 5 for IPv6 prefixes are advertised using either the configured Global Address or the implicit Link Local Address (if no Global Address is configured).
If more than one Global Address is configured, normally the first IPv6 address is used as gateway IP. The "first IPv6 address" refers to the first one on the list of IPv6 addresses shown through show router id interface interface ipv6 or through SNMP.
The rest of the addresses are advertised only in MAC-IP routes (Route Type 2) but not used as gateway IP for IPv6 prefix routes.
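The gateway IP choice for IPv6 prefix routes can be summarized as follows. This Python sketch uses hypothetical function names and documentation addresses; the order of the list stands for the order shown by the show command.

```python
def ipv6_gateway_ip(global_addresses, link_local):
    """Return the gateway IP used in route-type 5 for IPv6 prefixes."""
    # The first Global Address is used; otherwise the Link Local Address.
    return global_addresses[0] if global_addresses else link_local


def mac_ip_only_addresses(global_addresses):
    """Addresses advertised only in route-type 2, never as gateway IP."""
    return global_addresses[1:]


globals_ = ["2001:db8:1000::1", "2001:db8:1000::2"]
print(ipv6_gateway_ip(globals_, "fe80::1"))
print(mac_ip_only_addresses(globals_))
```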
EVPN for VXLAN in EVPN tunnel R-VPLS services
EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows an L3 connectivity model that optimizes the solution described in EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes. Instead of regular IRB backhaul R-VPLS services for the connectivity of all the VPRN IRB interfaces, EVPN tunnels can be configured. The main advantage of using EVPN tunnels is that they do not need the configuration of IP addresses, as regular IRB R-VPLS interfaces do.
In addition to the ip-route-advertisement command, this model requires the configuration of the config>service>vprn>if>vpls <name> evpn-tunnel command.
The example below shows a VPRN (500) with an EVPN-tunnel R-VPLS (504):
vprn 500 name "vprn500" customer 1 create
ecmp 4
route-distinguisher 65071:500
vrf-target target:65000:500
interface "evi-504" create
vpls "evpn-vxlan-504"
evpn-tunnel
exit
exit
no shutdown
exit
vpls 504 name "evpn-vxlan-504" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 504 create
exit
bgp
route-distinguisher 65071:504
route-target export target:65000:504 import target:65000:504
exit
bgp-evpn
ip-route-advertisement
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
no shutdown
exit
A specified VPRN supports regular IRB backhaul R-VPLS services as well as EVPN tunnel R-VPLS services.
The process followed upon receiving a route-type 5 on a regular IRB R-VPLS interface differs from the one for an EVPN-tunnel type:
IRB backhaul R-VPLS VPRN interface:
When a route-type 2 that includes an IP address is received and it becomes active, the MAC/IP information is added to the FDB and ARP tables. This can be checked with the show router arp command and the show service id fdb detail command.
When route-type 5 is received and becomes active for the R-VPLS service, the IP prefix is added to the VPRN routing table, regardless of the existence of a route-type 2 that can resolve the gateway IP address. If a packet is received from the WAN side and the IP lookup hits an entry for which the gateway IP (IP next-hop) does not have an active ARP entry, the system uses ARP to get a MAC. If ARP is resolved but the MAC is unknown in the FDB table, the system floods into the TLS multicast list. Routes type 5 can be checked in the routing table with the show router route-table and show router fib commands.
EVPN tunnel R-VPLS VPRN interface:
When route-type 2 is received and becomes active, the MAC address is added to the FDB (only).
When a route-type 5 is received and active, the IP prefix is added to the VPRN routing table with next-hop equal to EVPN tunnel: GW-MAC.
For example, ET-d8:45:ff:00:01:35, where the GW-MAC is added from the GW-MAC extended community sent along with the route-type 5.
If a packet is received from the WAN side, and the IP lookup hits an entry for which the next-hop is an EVPN tunnel: GW-MAC, the system looks up the GW-MAC in the FDB. Usually a route-type 2 with the GW-MAC is previously received so that the GW-MAC can be added to the FDB. If the GW-MAC is not present in the FDB, the packet is dropped.
IP prefixes with GW-MACs as next-hops are displayed by the show router route-table command, as shown below:
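The EVPN-tunnel forwarding decision can be sketched as a single FDB lookup keyed by the GW-MAC. In this Python illustration the function name and the FDB value format are hypothetical; the ET- prefix follows the next-hop notation used in the example above.

```python
def evpn_tunnel_forward(next_hop, fdb):
    """Return ('forward', destination) or ('drop', None).

    next_hop is an EVPN-tunnel next-hop such as 'ET-d8:45:ff:00:01:35'.
    fdb maps MAC addresses to a VXLAN destination.
    """
    prefix = "ET-"
    if not next_hop.startswith(prefix):
        raise ValueError("not an EVPN-tunnel next-hop: " + next_hop)
    gw_mac = next_hop[len(prefix):]
    destination = fdb.get(gw_mac)
    if destination is None:
        # GW-MAC not in the FDB: the packet is dropped.
        return ("drop", None)
    return ("forward", destination)


fdb = {"d8:45:ff:00:01:35": "192.0.2.72:504"}
print(evpn_tunnel_forward("ET-d8:45:ff:00:01:35", fdb))
```

Unlike the IRB backhaul case, no ARP resolution or flooding takes place in this model; a missing FDB entry results in a drop.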
*A:PE71# show router 500 route-table
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.20.20.72/32 Remote BGP EVPN 00h23m50s 169
10.10.10.72 0
10.30.30.0/24 Remote BGP EVPN 01d11h30m 169
evi-504 (ET-d8:45:ff:00:01:35) 0
10.10.10.0/24 Remote BGP VPN 00h20m52s 170
192.0.2.69 (tunneled) 0
10.1.0.0/16 Remote BGP EVPN 00h22m33s 169
evi-504 (ET-d8:45:ff:00:01:35) 0
-------------------------------------------------------------------------------
No. of Routes: 4
The GW-MAC as well as the rest of the IP prefix BGP attributes are displayed by the show router bgp routes evpn ip-prefix command.
*A:Dut-A# show router bgp routes evpn ip-prefix prefix 3.0.1.6/32 detail
===============================================================================
BGP Router ID:10.20.1.1 AS:100 Local AS:100
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
Origin codes : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
Original Attributes
Network : N/A
Nexthop : 10.20.1.2
From : 10.20.1.2
Res. Nexthop : 192.168.19.1
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:100:1 mac-nh:00:00:01:00:01:02
bgp-tunnel-encap:VXLAN
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 10.20.1.2
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : No As-Path
EVPN type : IP-PREFIX
ESI : N/A Tag : 1
Gateway Address: 00:00:01:00:01:02
Prefix : 3.0.1.6/32 Route Dist. : 10.20.1.2:1
MPLS Label : 262140
Route Tag : 0xb
Neighbor-AS : N/A
Orig Validation: N/A
Source Class : 0 Dest Class : 0
Modified Attributes
Network : N/A
Nexthop : 10.20.1.2
From : 10.20.1.2
Res. Nexthop : 192.168.19.1
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:100:1 mac-nh:00:00:01:00:01:02
bgp-tunnel-encap:VXLAN
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 10.20.1.2
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : 111
EVPN type : IP-PREFIX
ESI : N/A Tag : 1
Gateway Address: 00:00:01:00:01:02
Prefix : 3.0.1.6/32 Route Dist. : 10.20.1.2:1
MPLS Label : 262140
Route Tag : 0xb
Neighbor-AS : 111
Orig Validation: N/A
Source Class : 0 Dest Class : 0
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
EVPN tunneling is also supported on IPv6 VPRN interfaces. When sending IPv6 prefixes from IPv6 interfaces, the GW-MAC in the route type 5 (IP-prefix route) is always zero. If no specific Global Address is configured on the IPv6 interface, the routes type 5 for IPv6 prefixes are always sent using the Link Local Address as GW-IP. The following example output shows an IPv6 prefix received via BGP EVPN.
*A:PE71# show router 30 route-table ipv6
===============================================================================
IPv6 Route Table (Service: 30)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
2001:db8:1000::/64 Local Local 00h01m19s 0
int-PE-71-CE-1 0
2001:db8:2000::1/128 Remote BGP EVPN 00h01m20s 169
fe80::da45:ffff:fe00:6a-"int-evi-301" 0
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
*A:PE71# show router bgp routes evpn ipv6-prefix prefix 2001:db8:2000::1/128 hunt
===============================================================================
BGP Router ID:192.0.2.71 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked
Origin codes : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network : N/A
Nexthop : 192.0.2.69
From : 192.0.2.69
Res. Nexthop : 192.168.19.2
Local Pref. : 100 Interface Name : int-71-69
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:64500:301 bgp-tunnel-encap:VXLAN
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 192.0.2.69
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : No As-Path
EVPN type : IP-PREFIX
ESI : N/A Tag : 301
Gateway Address: fe80::da45:ffff:fe00:*
Prefix : 2001:db8:2000::1/128 Route Dist. : 192.0.2.69:301
MPLS Label : 0
Route Tag : 0
Neighbor-AS : N/A
Orig Validation: N/A
Source Class : 0 Dest Class : 0
Add Paths Send : Default
Last Modified : 00h41m17s
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
EVPN-VPWS for VXLAN tunnels
BGP-EVPN control plane for EVPN-VPWS
EVPN-VPWS uses route-type 1 and route-type 4; it does not use route-types 2, 3 or 5. EVPN-VPWS BGP extensions shows the encoding of the required extensions for the Ethernet A-D per-EVI routes. The encoding follows the guidelines described in RFC 8214.
If the advertising PE has an access SAP-SDP or spoke SDP that is not part of an Ethernet Segment (ES), the PE populates the fields of the AD per-EVI route with the following values:
-
Ethernet Tag ID field is encoded with the value configured by the user in the service bgp-evpn local-attachment-circuit eth-tag value command.
-
RD and MPLS label values are encoded as specified in RFC 7432. For VXLAN, the MPLS field encodes the VXLAN VNI.
-
ESI is 0.
-
The route is sent along with an EVPN L2 attributes extended community, as specified in RFC 8214, where:
-
type and subtype are 0x06 and 0x04 as allocated by IANA
-
flag C is set if a control word is configured in the service; C is always zero for VXLAN tunnels
-
P and B flags are zero
-
L2 MTU is encoded with a service MTU configured in the Epipe service
-
If the advertising PE has an access SAP-SDP or spoke SDP that is part of an ES, the AD per-EVI route is sent with the information described above, with the following minor differences:
-
The ESI encodes the corresponding non-zero value.
-
The P and B flags are set in the following cases:
-
All-active multihoming
-
All PEs that are part of the ES always set the P flag.
-
The B flag is never set in the all-active multihoming ES case.
-
-
Single-active multihoming
-
Only the DF PE sets the P bit for an EVI and the remaining PEs send it as P=0.
-
Only the backup DF PE sets the B bit.
If more than two PEs are present in the same single-active ES, the backup PE is the winner of a second DF election (excluding the DF). The remaining non-DF PEs send B=0.
-
-
Also, ES and AD per-ES routes are advertised and processed for the Ethernet-Segment, as described in RFC 7432 ESs. The ESI label sent with the AD per-ES route is used by BUM traffic on VPLS services; it is not used for Epipe traffic.
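The rules for the P and B flags in the AD per-EVI route can be collected in one function. The Python sketch below uses hypothetical mode and role labels; it models only the flag values described above, not the encoding of the extended community.

```python
def ad_per_evi_flags(mode, role="other"):
    """Return (P, B) for the AD per-EVI route.

    mode: 'none' (no ES), 'all-active', or 'single-active'.
    role: 'df', 'backup-df', or 'other' (used for single-active only).
    """
    if mode == "none":
        return (0, 0)              # not part of an ES: P and B are zero
    if mode == "all-active":
        return (1, 0)              # every PE sets P; B is never set
    if mode == "single-active":
        p = 1 if role == "df" else 0          # only the DF sets P
        b = 1 if role == "backup-df" else 0   # only the backup DF sets B
        return (p, b)
    raise ValueError("unknown multihoming mode: " + mode)


for role in ("df", "backup-df", "other"):
    print(role, ad_per_evi_flags("single-active", role))
```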
EVPN-VPWS for VXLAN tunnels in Epipe services
BGP-EVPN can be enabled in Epipe services with either SAPs or spoke SDPs at the access, as shown in EVPN-MPLS VPWS.
EVPN-VPWS is supported in VXLAN networks that also run EVPN-VXLAN in VPLS services. From a control plane perspective, EVPN-VPWS is a simplified point-to-point version of RFC 7432 for E-Line services for the following reasons:
-
EVPN-VPWS does not use inclusive multicast, MAC/IP routes or IP-prefix routes.
-
AD Ethernet per-EVI routes are used to advertise the local attachment circuit identifiers at each side of the VPWS instance. The attachment circuit identifiers are configured as local and remote Ethernet tags. When an AD per-EVI route is imported and the Ethernet tag matches the configured remote Ethernet tag, an EVPN destination is created for the Epipe.
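The Ethernet tag matching rule can be sketched as a filter over imported AD per-EVI routes. This Python example is illustrative: the dictionary keys, the function name, and the next-hop addresses are hypothetical.

```python
def evpn_destinations(remote_eth_tag, imported_routes):
    """Return (next_hop, vni) for routes matching the remote Ethernet tag."""
    return [
        (route["next_hop"], route["vni"])
        for route in imported_routes
        if route["eth_tag"] == remote_eth_tag
    ]


imported = [
    {"next_hop": "192.0.2.4", "eth_tag": 200, "vni": 2},
    {"next_hop": "192.0.2.5", "eth_tag": 300, "vni": 2},
]
# A PE configured with remote Ethernet tag 200 creates one destination.
print(evpn_destinations(200, imported))
```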
In the following configuration example, Epipe 2 is an EVPN-VPWS service between PE2 and PE4 (as shown in EVPN-MPLS VPWS).
PE2>config>service>epipe(2)#
-----------------------
vxlan vni 2 instance 1 create
exit
bgp
exit
bgp-evpn
evi 2
local-attachment-circuit "AC-1"
eth-tag 100
remote-attachment-circuit "AC-2"
eth-tag 200
vxlan bgp 1 vxlan-instance 1
ecmp 2
no shutdown
sap 1/1/1:1 create
PE4>config>service>epipe(2)#
-----------------------
vxlan vni 2 instance 1 create
exit
bgp
exit
bgp-evpn
evi 2
local-attachment-circuit "AC-2"
eth-tag 200
remote-attachment-circuit "AC-1"
eth-tag 100
vxlan bgp 1 vxlan-instance 1
ecmp 2
no shutdown
spoke-sdp 1:1
The following considerations apply to the preceding example configuration:
-
When the EVI value is lower than 65535, the EVI is used to automatically derive the route-target or route-distinguisher of the service. For EVI values greater than 65535, the route-distinguisher is not automatically derived and the route-target is automatically derived, if evi-three-byte-auto-rt is configured. The EVI values must be unique in the system regardless of the type of service to which they are assigned (Epipe or VPLS).
-
Support for the following BGP-EVPN commands in Epipe services is the same as in VPLS services:
-
vxlan bgp 1 vxlan-instance 1
-
vxlan send-tunnel-encap
-
vxlan shutdown
-
vxlan ecmp
-
The following BGP-EVPN commands identify the local and remote attachment circuits, with the configured Ethernet tags encoded in the advertised and received AD Ethernet per-EVI routes:
-
local-attachment-circuit name
-
local-attachment-circuit name eth-tag tag-value; where tag-value is 1 to 16777215
-
remote-attachment-circuit name
-
remote-attachment-circuit name eth-tag tag-value; where tag-value is 1 to 16777215
Changes to remote Ethernet tags are allowed without shutting down BGP-EVPN VXLAN or the Epipe service. The local AC Ethernet tag value cannot be changed without BGP-EVPN VXLAN shutdown.
Both local and remote Ethernet tags are mandatory to bring up the Epipe service.
-
EVPN-VPWS Epipes can also be configured with the following characteristics:
-
Access attachment circuits can be SAPs or spoke SDPs. Only manually configured spoke SDPs are supported; BGP-VPWS and endpoints are not supported. The VC switching configuration is not supported on BGP-EVPN enabled Epipes.
-
EVPN-VPWS Epipes can advertise the Layer 2 (service) MTU and check its consistency as follows:
-
The advertised MTU value is taken from the configured service MTU in the Epipe service.
-
The received L2 MTU is compared to the local value. In case of a mismatch between the received MTU and the configured service MTU, the system does not set up the EVPN destination; as a result, the service does not come up.
Consider the following:
-
The system does not check the network port MTU value.
-
If the received L2 MTU value is 0, the MTU is ignored.
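The MTU consistency rules above can be summarized in a short Python sketch. The function name `mtu_allows_destination` is a hypothetical helper introduced for illustration; it only models the comparison described in the text and is not the actual system logic.

```python
def mtu_allows_destination(local_service_mtu: int, received_l2_mtu: int) -> bool:
    """Return True if the EVPN destination may be set up.

    Models only the rules stated in the text: a received value of 0 is
    ignored, otherwise the received L2 MTU must equal the configured
    service MTU. The network port MTU is not checked.
    """
    if received_l2_mtu == 0:
        # A received L2 MTU of 0 means the MTU is ignored.
        return True
    return received_l2_mtu == local_service_mtu


print(mtu_allows_destination(1514, 1514))  # same MTU on both sides
print(mtu_allows_destination(1514, 9000))  # mismatch: no destination
print(mtu_allows_destination(1514, 0))     # 0 is ignored
```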
Using A/S PW and MC-LAG with EVPN-VPWS Epipes
The use of A/S PW (for access spoke SDPs) and MC-LAG (for access SAPs) provides an alternative redundancy solution for EVPN-VPWS services that does not use the EVPN multihoming procedures described in RFC 8214. A/S PW and MC-LAG support on EVPN-VPWS shows the use of both mechanisms in a single Epipe.
In A/S PW and MC-LAG support on EVPN-VPWS, an A/S PW connects the CE to PE1 and PE2 (left side of the diagram), and an MC-LAG connects the CE to PE3 and PE4 (right side of the diagram). Because EVPN multihoming is not used, there are no AD per-ES routes or ES routes. The redundancy is handled as follows:
-
PE1 and PE2 are configured with Epipe-1, where a spoke SDP connects the service in each PE to the access CE. The local AC Ethernet tag is 1 and the remote AC Ethernet tag is 2 (in PE1/PE2).
-
PE3 and PE4 are configured with Epipe-1, where each PE has a lag SAP that belongs to a previously-configured MC-LAG construct. The local AC Ethernet tag is 2 and the remote AC Ethernet tag is 1.
-
An endpoint and an A/S PW are configured on the CE on the left side of the diagram. PE1 and PE2 advertise Ethernet tag 1 based on the operating status or the forwarding status of the spoke SDP.
For example, if PE1 receives a standby PW status indication from the CE and the previous status was forward, it withdraws the AD EVI route for Ethernet tag 1. If PE2 receives a forward PW status indication and the previous status was standby or down, it advertises the AD EVI route for Ethernet tag 1.
-
The user can configure MC-LAG for access SAPs using the example configuration of PE3 and PE4, as shown in A/S PW and MC-LAG support on EVPN-VPWS. In this case, the MC-LAG determines which chassis is active and which is standby.
If PE4 becomes the standby chassis, the entire LAG port is brought down. As a result, the SAP goes operationally down and PE4 withdraws any previous AD EVI routes for Ethernet tag 2.
If PE3 becomes the active chassis, the LAG port becomes operationally up. As a result, the SAP comes up and PE3 advertises the AD per-EVI route for Ethernet tag 2.
EVPN multihoming for EVPN-VPWS services
EVPN multihoming is supported for EVPN-VPWS Epipe services with the following considerations:
-
Single-active and all-active multihoming are supported for SAPs and spoke SDPs.
-
ESs can be shared between the Epipe (MPLS and VXLAN) and VPLS (MPLS) services for LAGs, ports, and SDPs.
-
A split-horizon function is not required because there is no traffic between the Designated Forwarder (DF) and the non-DF for Epipe services. As a result, the ESI label is never used, and the ethernet-segment multi-homing single-active no-esi-label and ethernet-segment source-bmac-lsb commands do not affect Epipe services.
-
The local Ethernet tag values must match on all PEs that are part of the same ES, regardless of the multihoming mode. The PEs in the ES use the AD per-EVI routes from the peer PEs to validate the PEs as DF election candidates for a specific EVI.
The DF election for Epipes that are defined in an all-active multihoming ES is not relevant because all PEs in the ES behave in the same way, as follows:
-
All PEs send P=1 on the AD per-EVI routes.
-
All PEs can send upstream and downstream traffic, regardless of whether the traffic is unicast, multicast, or broadcast (all traffic is treated as unicast in the Epipe services).
Therefore, the following tools command shows N/A when all-active multihoming is configured.
*A:PE-2# tools dump service system bgp-evpn ethernet-segment "ESI-12" evi 6000 df
[03/18/2016 20:31:35] All Active VPWS - DF N/A
Aliasing is supported for traffic sent to an ES destination. If ECMP is enabled on the ingress PE, per-flow load balancing is performed to all PEs that advertise P=1. The PEs that advertise P=0 are not considered as next hops for an ES destination.
Although DF election is not relevant for Epipes in an all-active multihoming ES, it is essential for the following forwarding and backup functions in a single-active multihoming ES:
-
The PE elected as DF is the primary PE for the ES in the Epipe. The primary PE unblocks the SAP or spoke SDP for upstream and downstream traffic; the remaining PEs in the ES bring their ES SAPs or spoke SDPs operationally down.
-
The DF candidate list is built from the PEs sending ES routes for the same ES and is pruned for a specific service, depending on the availability of the AD per-ES and per-EVI routes.
-
When the SAP or spoke SDPs that are part of the ES come up, the AD per-EVI routes are sent with P=0 and B=0. The remote PEs do not start sending traffic until the DF election process is complete, the ES activation timer has expired, and the PEs advertise AD per-EVI routes with P and B bits other than zero.
-
The backup PE function is supported as defined in RFC 8214. The primary PE, backup, or none status is signaled by the PEs (part of the same single-active MH ES) in the P or B flags of the EVPN L2 attributes extended community. EVPN-VPWS single-active multihoming shows the advertisement and use of the primary, backup, or none indication by the PEs in the ES.
As specified in RFC 7432, the remote PEs in VPLS services have knowledge of the primary PE in the remote single-active ES, based on the advertisement of the MAC/IP routes because only the DF learns and advertises MAC/IP routes.
Because there are no MAC/IP routes in EVPN-VPWS, the remote PEs can forward the traffic based on the P/B bits. The process is described in the following list:
-
The DF PE for an EVI (PE1) sends P=1 and B=0.
-
For each ES or EVI, a second DF election is run among the PEs in the backup candidate list to elect the backup PE. The backup PE sends P=0 and B=1 (PE2).
-
All remaining multihoming PEs send P=0 and B=0 (PE3 and PE4).
-
At the remote PEs (PE5), the P and B flags are used to identify the primary and backup PEs within the ES destination. The traffic is then sent to the primary PE, provided that it is active.
-
When a remote PE receives the withdrawal of an Ethernet AD per-ES (or per-EVI) route from the primary PE, the remote PE immediately switches the traffic to the backup PE for the affected EVIs. The backup PE takes over immediately without waiting for the ES activation timer to bring up its SAP or spoke SDP.
-
The BGP-EVPN MPLS ECMP setting also governs the forwarding in single-active multihoming, regardless of the single-active multihoming bit in the AD per-ES route received at the remote PE (PE5).
-
PE5 always sends the traffic to the primary remote PE (the owner of the P=1 bit). In case of multiple primary PEs and ECMP>1, PE5 load balances the traffic to all primary PEs, regardless of the multihoming mode.
-
If the last primary PE withdraws its AD per-EVI or per-ES route, PE5 sends the traffic to the backup PE or PEs. In case of multiple backup PEs and ECMP>1, PE5 load balances the traffic to the backup PEs.
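The way a remote PE chooses next hops from the P and B flags can be sketched as follows. This is a simplified illustration, not the actual forwarding code: the `EsPeer` type and the `next_hops` helper are hypothetical, the PE names are labels only, and truncating the list to the first `ecmp` entries is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EsPeer:
    """Hypothetical view of one PE in the remote ES, as seen by PE5."""
    name: str
    p: int  # Primary flag from the AD per-EVI route
    b: int  # Backup flag from the AD per-EVI route


def next_hops(peers: List[EsPeer], ecmp: int) -> List[EsPeer]:
    # Primary PEs (P=1) are always preferred.
    primaries = [peer for peer in peers if peer.p == 1]
    # Backup PEs (P=0, B=1) are used only when no primary PE is left.
    backups = [peer for peer in peers if peer.p == 0 and peer.b == 1]
    pool = primaries if primaries else backups
    # PEs that send P=0 and B=0 are never used. ECMP limits how many
    # equal next hops are load balanced (assumed: first N entries).
    return pool[: max(ecmp, 1)]


es = [EsPeer("PE1", 1, 0), EsPeer("PE2", 0, 1),
      EsPeer("PE3", 0, 0), EsPeer("PE4", 0, 0)]
print([peer.name for peer in next_hops(es, ecmp=2)])      # primary only
print([peer.name for peer in next_hops(es[1:], ecmp=2)])  # PE1 withdrawn
```

After the primary PE withdraws its route, the same function returns the backup PE, which matches the switchover behavior described above.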
Non-system IPv4/IPv6 VXLAN termination for EVPN-VPWS services
EVPN-VPWS services support non-system IPv4/IPv6 VXLAN termination. For system configuration information, see Non-system IPv4 and IPv6 VXLAN termination in VPLS, R-VPLS, and Epipe services.
EVPN multihoming is supported when the PEs use non-system IP termination; however, the following additional configuration steps are needed:
-
The configure service system bgp-evpn eth-seg es-orig-ip ip-address command must be configured with the non-system IPv4/IPv6 address used for the EVPN-VPWS VXLAN service. As a result, this command modifies the originating-ip field in the ES routes advertised for the Ethernet Segment, and makes the system use this IP address when adding the local PE as DF candidate.
-
The configure service system bgp-evpn eth-seg route-next-hop ip-address command must be configured with the non-system IP address, too. The command changes the next-hop of the ES and AD per-ES routes to the configured address.
-
The non-system IP address (in each of the PEs in the ES) must match in these three commands for the local PE to be considered suitable for DF election:
-
es-orig-ip ip-address
-
route-next-hop ip-address
-
vxlan-src-vtep ip-address
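The address consistency requirement across the three commands can be expressed as a simple check. The sketch below is illustrative only; `suitable_for_df_election` is a hypothetical helper, and the addresses in the example are assumed values. Addresses are parsed with the standard `ipaddress` module so that equivalent IPv6 spellings compare as equal.

```python
import ipaddress


def suitable_for_df_election(es_orig_ip: str,
                             route_next_hop: str,
                             vxlan_src_vtep: str) -> bool:
    """Return True when all three configured addresses are the same."""
    addresses = {
        ipaddress.ip_address(value)
        for value in (es_orig_ip, route_next_hop, vxlan_src_vtep)
    }
    # A single distinct address means the three commands match.
    return len(addresses) == 1


print(suitable_for_df_election("10.1.1.1", "10.1.1.1", "10.1.1.1"))
print(suitable_for_df_election("10.1.1.1", "10.1.1.1", "10.1.1.2"))
```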
EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes
Gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows a Layer 3 DC model, where a VPRN is defined in the DGWs, connecting the tenant to the WAN. That VPRN instance is connected to the VPRNs in the NVEs by means of an IRB backhaul R-VPLS. Because the IRB backhaul R-VPLS provides connectivity only to all the IRB interfaces and the DGW VPRN is not directly connected to all the tenant subnets, the WAN IP prefixes in the VPRN routing table must be advertised in EVPN. In the same way, the NVEs send IP prefixes in EVPN that are received by the DGW and imported into the VPRN routing table.
Local router interface host addresses are not advertised in EVPN by default. To advertise them, the ip-route-advertisement incl-host command must be enabled. For example:
===============================================================================
Route Table (Service: 2)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Active Metric
-------------------------------------------------------------------------------
10.1.1.0/24 Local Local 00h00m11s 0
if Y 0
10.1.1.100/32 Local Host 00h00m11s 0
if Y 0
==============================================================================
For the case displayed by the output above, the behavior is the following:
ip-route-advertisement (default): only the local subnet is advertised (10.1.1.0/24).
ip-route-advertisement incl-host: the local subnet and the host route are advertised (10.1.1.0/24 and 10.1.1.100/32).
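The effect of the incl-host option on the two local routes shown in the output can be modeled with a small filter. This is a sketch that only covers the Local and Host entries of the example; the `advertised_prefixes` function is a hypothetical name, not a router API.

```python
from typing import List, Tuple

# (prefix, protocol) pairs taken from the route table output above.
Route = Tuple[str, str]


def advertised_prefixes(routes: List[Route], incl_host: bool = False) -> List[str]:
    """Return the prefixes that would be advertised in EVPN."""
    result = []
    for prefix, proto in routes:
        # Host routes are advertised only when incl-host is configured.
        if proto == "Host" and not incl_host:
            continue
        result.append(prefix)
    return result


table = [("10.1.1.0/24", "Local"), ("10.1.1.100/32", "Host")]
print(advertised_prefixes(table))
print(advertised_prefixes(table, incl_host=True))
```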
Below is an example of VPRN (500) with two IRB interfaces connected to backhaul R-VPLS services 501 and 502 where EVPN-VXLAN runs:
vprn 500 customer 1 create
ecmp 4
route-distinguisher 65072:500
vrf-target target:65000:500
interface "evi-502" create
address 10.20.20.72/24
vpls "evpn-vxlan-502"
exit
exit
interface "evi-501" create
address 10.10.10.72/24
vpls "evpn-vxlan-501"
exit
exit
no shutdown
vpls 501 name "evpn-vxlan-501" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 501 create
exit
bgp
route-distinguisher 65072:501
route-target export target:65000:501 import target:65000:501
exit
bgp-evpn
ip-route-advertisement incl-host
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
no shutdown
exit
vpls 502 name "evpn-vxlan-502" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 502 create
exit
bgp
route-distinguisher 65072:502
route-target export target:65000:502 import target:65000:502
exit
bgp-evpn
ip-route-advertisement incl-host
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
no shutdown
exit
When the above commands are enabled, the router behaves as follows:
Receive route-type 5 routes and import the IP prefixes and associated IP next-hops into the VPRN routing table.
If the route-type 5 is successfully imported by the router, the prefix included in the route-type 5 (for example, 10.0.0.0/24) is added to the VPRN routing table with a next-hop equal to the gateway IP included in the route (for example, 192.0.0.1, which refers to the IRB IP address of the remote VPRN behind which the IP prefix sits).
When the router receives a packet from the WAN to the 10.0.0.0/24 subnet, the IP lookup on the VPRN routing table yields 192.0.0.1 as the next-hop. That next-hop is resolved to a MAC in the ARP table, and the MAC is resolved to a VXLAN tunnel in the FDB table.
Note: IRB MAC and IP addresses are advertised in the IRB backhaul R-VPLS in route-type 2 routes.
Generate route-type 5 routes for the IP prefixes in the associated VPRN routing table.
For example, if VPRN-1 is attached to EVPN R-VPLS 1 and EVPN R-VPLS 2, and R-VPLS 2 has bgp-evpn ip-route-advertisement configured, the 7750 SR advertises the R-VPLS 1 interface subnet in one route-type 5.
Routing policies can filter the imported and exported IP prefix routes accordingly.
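The recursive resolution described for received route-type 5 routes (IP prefix to gateway IP, gateway IP to MAC, MAC to VXLAN tunnel) can be sketched as three table lookups. This is an illustration under assumptions: the `resolve_tunnel` helper, the MAC address, and the tunnel label are hypothetical, and unresolved lookups simply return `None` here rather than modeling what the router does in those cases.

```python
import ipaddress
from typing import Dict, Optional


def resolve_tunnel(dst_ip: str,
                   routes: Dict[str, str],
                   arp: Dict[str, str],
                   fdb: Dict[str, str]) -> Optional[str]:
    """Resolve a destination IP to a VXLAN tunnel through three tables."""
    dst = ipaddress.ip_address(dst_ip)
    best = None
    # Step 1: longest-prefix match in the VPRN routing table.
    for prefix, gateway_ip in routes.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway_ip)
    if best is None:
        return None
    # Step 2: the gateway IP is resolved to a MAC in the ARP table.
    mac = arp.get(best[1])
    if mac is None:
        return None
    # Step 3: the MAC is resolved to a VXLAN tunnel in the FDB.
    return fdb.get(mac)


routes = {"10.0.0.0/24": "192.0.0.1"}            # from the route-type 5
arp = {"192.0.0.1": "00:00:5e:00:53:01"}         # assumed MAC (route-type 2)
fdb = {"00:00:5e:00:53:01": "vxlan 192.0.2.71 vni 501"}  # assumed tunnel
print(resolve_tunnel("10.0.0.5", routes, arp, fdb))
```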
The VPRN routing table can receive routes from all the supported protocols (BGP-VPN, OSPF, IS-IS, RIP, static routing) as well as from IP prefixes from EVPN, as shown below:
*A:PE72# show router 500 route-table
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.20.20.0/24 Local Local 01d11h10m 0
evi-502 0
10.20.20.71/32 Remote BGP EVPN 00h02m26s 169
10.10.10.71 0
10.10.10.0/24 Remote Static 00h00m05s 5
10.10.10.71 1
10.16.0.1/32 Remote BGP EVPN 00h02m26s 169
10.10.10.71 0
-------------------------------------------------------------------------------
No. of Routes: 4
The following considerations apply:
The route Preference for EVPN IP prefixes is 169.
BGP IP-VPN routes have a preference of 170 by default. Therefore, if the same route is received from the WAN over BGP-VPRN and from BGP-EVPN, the EVPN route is preferred.
When the same route-type 5 prefix is received from different gateway IPs, ECMP is supported if configured in the VPRN.
All routes in the VPRN routing table (as long as they do not point back to the EVPN R-VPLS interface) are advertised via EVPN.
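The preference and ECMP considerations above can be illustrated with a short selection function. This is a simplified sketch: real route selection involves additional tie-breakers that are not modeled, and the `active_routes` helper and the dictionary layout are assumptions made for the example.

```python
from typing import Dict, List

EVPN_IP_PREFIX_PREFERENCE = 169   # stated in the text
BGP_VPN_DEFAULT_PREFERENCE = 170  # stated in the text


def active_routes(candidates: List[Dict], ecmp: int) -> List[Dict]:
    """Pick the routes for one prefix: lowest preference wins, up to ECMP."""
    if not candidates:
        return []
    best_pref = min(route["pref"] for route in candidates)
    best = [route for route in candidates if route["pref"] == best_pref]
    # Several equally preferred gateway IPs are used only if ECMP allows it.
    return best[: max(ecmp, 1)]


same_prefix = [
    {"proto": "BGP VPN", "pref": BGP_VPN_DEFAULT_PREFERENCE, "next_hop": "192.0.2.69"},
    {"proto": "BGP EVPN", "pref": EVPN_IP_PREFIX_PREFERENCE, "next_hop": "10.10.10.71"},
]
print([route["proto"] for route in active_routes(same_prefix, ecmp=4)])
```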
Although the description above is focused on IPv4 interfaces and prefixes, it applies to IPv6 interfaces too. The following considerations are specific to IPv6 VPRN R-VPLS interfaces:
IPv4 and IPv6 interfaces can be defined on R-VPLS IP interfaces at the same time (dual-stack).
The user may configure specific IPv6 Global Addresses on the VPRN R-VPLS interfaces. If a specific Global IPv6 Address is not configured on the interface, the Link Local Address interface MAC/IP is advertised in a route type 2 as soon as IPv6 is enabled on the VPRN R-VPLS interface.
Route-type 5 routes for IPv6 prefixes are advertised using either the configured Global Address or the implicit Link Local Address (if no Global Address is configured).
If more than one Global Address is configured, normally the first IPv6 address is used as gateway IP. The "first IPv6 address" refers to the first one on the list of IPv6 addresses shown by the show router <id> interface interface IPv6 command or through SNMP.
The rest of the addresses are advertised only in MAC-IP routes (Route Type 2) but are not used as gateway IP for IPv6 prefix routes.
EVPN for VXLAN in EVPN tunnel R-VPLS services
EVPN-tunnel gateway IRB on the DC PE for an L3 EVPN/VXLAN DC shows an L3 connectivity model that optimizes the solution described in EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes. Instead of regular IRB backhaul R-VPLS services for the connectivity of all the VPRN IRB interfaces, EVPN tunnels can be configured. The main advantage of using EVPN tunnels is that they do not need the configuration of IP addresses, as regular IRB R-VPLS interfaces do.
In addition to the ip-route-advertisement command, this model requires the configuration of the config>service>vprn>if>vpls <name> evpn-tunnel command.
The example below shows a VPRN (500) with an EVPN-tunnel R-VPLS (504):
vprn 500 customer 1 create
ecmp 4
route-distinguisher 65071:500
vrf-target target:65000:500
interface "evi-504" create
vpls "evpn-vxlan-504"
evpn-tunnel
exit
exit
no shutdown
exit
vpls 504 name "evpn-vxlan-504" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 504 create
exit
bgp
route-distinguisher 65071:504
route-target export target:65000:504 import target:65000:504
exit
bgp-evpn
ip-route-advertisement
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
no shutdown
exit
A given VPRN can support both regular IRB backhaul R-VPLS services and EVPN tunnel R-VPLS services.
The process followed upon receiving a route-type 5 on a regular IRB R-VPLS interface differs from the one for an EVPN-tunnel type:
IRB backhaul R-VPLS VPRN interface:
When a route-type 2 that includes an IP address is received and becomes active, the MAC/IP information is added to the FDB and ARP tables. This can be checked with the show router arp command and the show service id fdb detail command.
When route-type 5 is received and becomes active for the R-VPLS service, the IP prefix is added to the VPRN routing table, regardless of the existence of a route-type 2 that can resolve the gateway IP address. If a packet is received from the WAN side and the IP lookup hits an entry for which the gateway IP (IP next-hop) does not have an active ARP entry, the system uses ARP to get a MAC. If ARP is resolved but the MAC is unknown in the FDB table, the system floods into the TLS multicast list. Routes type 5 can be checked in the routing table with the show router route-table and show router fib commands.
EVPN tunnel R-VPLS VPRN interface:
When route-type 2 is received and becomes active, the MAC address is added to the FDB (only).
When a route-type 5 is received and active, the IP prefix is added to the VPRN routing table with next-hop equal to EVPN tunnel: GW-MAC.
For example, ET-d8:45:ff:00:01:35, where the GW-MAC is added from the GW-MAC extended community sent along with the route-type 5.
If a packet is received from the WAN side and the IP lookup hits an entry for which the next-hop is an EVPN tunnel: GW-MAC, the system looks up the GW-MAC in the FDB. Usually a route-type 2 with the GW-MAC is previously received so that the GW-MAC can be added to the FDB. If the GW-MAC is not present in the FDB, the packet is dropped.
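For an EVPN tunnel interface, the next-hop resolution reduces to a single FDB lookup on the GW-MAC, with no ARP step. The sketch below is illustrative only; `evpn_tunnel_lookup` is a hypothetical helper, the tunnel label is an assumed value, and a `None` result stands for the packet being dropped.

```python
from typing import Dict, Optional


def evpn_tunnel_lookup(gw_mac: str, fdb: Dict[str, str]) -> Optional[str]:
    """Return the VXLAN destination for a GW-MAC, or None to drop."""
    # The GW-MAC comes from the route-type 5; the FDB entry is normally
    # installed by a previously received route-type 2.
    return fdb.get(gw_mac.lower())


# GW-MAC taken from the ET-d8:45:ff:00:01:35 example; tunnel is assumed.
fdb = {"d8:45:ff:00:01:35": "vxlan 192.0.2.72 vni 504"}
print(evpn_tunnel_lookup("d8:45:ff:00:01:35", fdb))
print(evpn_tunnel_lookup("00:00:5e:00:53:99", fdb))  # unknown: dropped
```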
IP prefixes with GW-MACs as next-hops are displayed by the show router command, as shown below:
*A:PE71# show router 500 route-table
===============================================================================
Route Table (Service: 500)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.20.20.72/32 Remote BGP EVPN 00h23m50s 169
10.10.10.72 0
10.30.30.0/24 Remote BGP EVPN 01d11h30m 169
evi-504 (ET-d8:45:ff:00:01:35) 0
10.10.10.0/24 Remote BGP VPN 00h20m52s 170
192.0.2.69 (tunneled) 0
10.1.0.0/16 Remote BGP EVPN 00h22m33s 169
evi-504 (ET-d8:45:ff:00:01:35) 0
-------------------------------------------------------------------------------
No. of Routes: 4
The GW-MAC as well as the rest of the IP prefix BGP attributes are displayed by the show router bgp routes evpn ip-prefix command.
*A:Dut-A# show router bgp routes evpn ip-prefix prefix 3.0.1.6/32 detail
===============================================================================
BGP Router ID:10.20.1.1 AS:100 Local AS:100
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
Origin codes : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
Original Attributes
Network : N/A
Nexthop : 10.20.1.2
From : 10.20.1.2
Res. Nexthop : 192.168.19.1
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:100:1 mac-nh:00:00:01:00:01:02
bgp-tunnel-encap:VXLAN
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 10.20.1.2
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : No As-Path
EVPN type : IP-PREFIX
ESI : N/A Tag : 1
Gateway Address: 00:00:01:00:01:02
Prefix : 3.0.1.6/32 Route Dist. : 10.20.1.2:1
MPLS Label : 262140
Route Tag : 0xb
Neighbor-AS : N/A
Orig Validation: N/A
Source Class : 0 Dest Class : 0
Modified Attributes
Network : N/A
Nexthop : 10.20.1.2
From : 10.20.1.2
Res. Nexthop : 192.168.19.1
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:100:1 mac-nh:00:00:01:00:01:02
bgp-tunnel-encap:VXLAN
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 10.20.1.2
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : 111
EVPN type : IP-PREFIX
ESI : N/A Tag : 1
Gateway Address: 00:00:01:00:01:02
Prefix : 3.0.1.6/32 Route Dist. : 10.20.1.2:1
MPLS Label : 262140
Route Tag : 0xb
Neighbor-AS : 111
Orig Validation: N/A
Source Class : 0 Dest Class : 0
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
EVPN tunneling is also supported on IPv6 VPRN interfaces. When sending IPv6 prefixes from IPv6 interfaces, the GW-MAC in the route type 5 (IP-prefix route) is always zero. If no specific Global Address is configured on the IPv6 interface, the routes type 5 for IPv6 prefixes are always sent using the Link Local Address as GW-IP. The following example output shows an IPv6 prefix received through BGP EVPN.
*A:PE71# show router 30 route-table ipv6
===============================================================================
IPv6 Route Table (Service: 30)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
2001:db8:1000::/64 Local Local 00h01m19s 0
int-PE-71-CE-1 0
2001:db8:2000::1/128 Remote BGP EVPN 00h01m20s 169
fe80::da45:ffff:fe00:6a-"int-evi-301" 0
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
*A:PE71# show router bgp routes evpn ipv6-prefix prefix 2001:db8:2000::1/128 hunt
===============================================================================
BGP Router ID:192.0.2.71 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked
Origin codes : i - IGP, e - EGP, ? - incomplete, > - best, b - backup
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network : N/A
Nexthop : 192.0.2.69
From : 192.0.2.69
Res. Nexthop : 192.168.19.2
Local Pref. : 100 Interface Name : int-71-69
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:64500:301 bgp-tunnel-encap:VXLAN
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 192.0.2.69
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : No As-Path
EVPN type : IP-PREFIX
ESI : N/A Tag : 301
Gateway Address: fe80::da45:ffff:fe00:*
Prefix : 2001:db8:2000::1/128 Route Dist. : 192.0.2.69:301
MPLS Label : 0
Route Tag : 0
Neighbor-AS : N/A
Orig Validation: N/A
Source Class : 0 Dest Class : 0
Add Paths Send : Default
Last Modified : 00h41m17s
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
Layer 2 multicast optimization for VXLAN (Assisted-Replication)
The Assisted-Replication feature for IPv4 VXLAN tunnels (both Leaf and Replicator functions) is supported in compliance with the non-selective mode described in IETF Draft draft-ietf-bess-evpn-optimized-ir.
The Assisted-Replication feature is a Layer 2 multicast optimization feature that helps software-based PEs and NVEs with low-performance replication capabilities to deliver broadcast and multicast Layer 2 traffic to remote VTEPs in the VPLS service.
The EVPN and proxy-ARP/ND capabilities can reduce the amount of broadcast and unknown unicast in the VPLS service; ingress replication is sufficient for most use cases in this scenario. However, when multicast applications require a significant amount of replication at the ingress node, software-based nodes struggle because of their limited replication performance. By enabling the Assisted-Replication Leaf function on the software-based SR-series router, all the broadcast and multicast packets are sent to a 7x50 router configured as a Replicator, which replicates the traffic to all the VTEPs in the VPLS service on behalf of the Leaf. This guarantees that the broadcast or multicast traffic is delivered to all the VPLS participants without any packet loss caused by performance issues.
The Leaf or Replicator function is enabled per VPLS service by the configure service vpls vxlan assisted-replication {replicator | leaf} command. In addition, the Replicator requires the configuration of an Assisted-Replication IP (AR-IP) address. The AR-IP loopback address indicates whether the received VXLAN packets have to be replicated to the remote VTEPs. The AR-IP address is configured using the configure service system vxlan assisted-replication-ip <ip-address> command.
Based on the assisted-replication {replicator | leaf} configuration, the SR-series router can behave as a Replicator (AR-R), Leaf (AR-L), or Regular Network Virtualization Edge (RNVE) router. An RNVE router does not support the Assisted-Replication feature. Because it is configured with no assisted replication, the RNVE router ignores the AR-R and AR-L information and replicates to its flooding list where VTEPs are added based on the regular ingress replication routes.
Replicator (AR-R) procedures
An AR-R configuration is shown in the following example.
*A:PE-2>config>service>system>vxlan# info
----------------------------------------------
assisted-replication-ip 10.2.2.2
----------------------------------------------
*A:PE-2>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 4000 create
assisted-replication replicator
exit
bgp
exit
bgp-evpn
evi 4000
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
<snip>
no shutdown
----------------------------------------------
In this example configuration, BGP advertises a new inclusive multicast route with tunnel-type = AR, type (T) = AR-R, and tunnel-id = originating-ip = next-hop = assisted-replication-ip (IP address 10.2.2.2 in the preceding example). In addition to the AR route, the AR-R sends a regular IR route if ingress-repl-inc-mcast-advertisement is enabled.
The AR-R builds a flooding list composed of ACs (SAPs and SDP bindings) and VXLAN tunnels to remote nodes in the VPLS. All objects in the flooding list are broadcast/multicast (BM) and unknown unicast (U) capable. The following example output of the show service id vxlan command shows that the VXLAN destinations in the flooding list are tagged as "BUM".
*A:PE-2# show service id 4000 vxlan
===============================================================================
Vxlan Src Vtep IP: N/A
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 4000
Creation Origin: manual
Assisted-Replication: replicator
RestProtSrcMacAct: none
===============================================================================
VPLS VXLAN service Network Specifics
===============================================================================
Ing Net QoS Policy : none Vxlan VNI Id : 4000
Ingress FP QGrp : (none) Ing FP QGrp Inst : (none)
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address Egress VNI Num. MACs Mcast Oper L2
State PBR
-------------------------------------------------------------------------------
192.0.2.3 4000 0 BUM Up No
192.0.2.5 4000 0 BUM Up No
192.0.2.6 4000 0 BUM Up No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================
When the AR-R receives a BUM packet on an AC, the AR-R forwards the packet to its flooding list (including the local ACs and remote VTEPs).
When the AR-R receives a BM packet on a VXLAN tunnel, it checks the IP DA of the underlay IP header and performs the following BM packet processing.
If the destination IP matches its AR-IP, the AR-R forwards the BM packet to its flooding list (ACs and VXLAN tunnels). The AR-R performs source suppression to ensure that the traffic is not sent back to the originating Leaf.
If the destination IP matches its regular VXLAN termination IP (IR-IP), the AR-R skips all the VXLAN tunnels from the flooding list and only replicates to the local ACs. This is the default Ingress Replication (IR) behavior.
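The two cases for BM packets received on a VXLAN tunnel can be sketched as a replication decision. This is a minimal model, not the data path implementation: the `ar_r_replication_targets` helper, the SAP label, and the IR-IP value used in the example are assumptions introduced for illustration.

```python
from typing import List


def ar_r_replication_targets(dst_ip: str, src_vtep: str,
                             ar_ip: str, ir_ip: str,
                             acs: List[str], vteps: List[str]) -> List[str]:
    """Return where an AR-R replicates a BM packet received over VXLAN."""
    if dst_ip == ar_ip:
        # Assisted replication: local ACs plus all VXLAN tunnels, with
        # source suppression toward the originating Leaf.
        return acs + [vtep for vtep in vteps if vtep != src_vtep]
    if dst_ip == ir_ip:
        # Regular ingress replication behavior: local ACs only.
        return list(acs)
    # Any other destination IP is outside the scope of this sketch.
    return []


acs = ["sap-1"]                                  # hypothetical local AC
vteps = ["192.0.2.3", "192.0.2.5", "192.0.2.6"]  # from the example output
print(ar_r_replication_targets("10.2.2.2", "192.0.2.3",
                               "10.2.2.2", "192.0.2.2", acs, vteps))
```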
Leaf (AR-L) procedures
An AR-L is configured as shown in the following example.
A:PE-3>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 4000 create
assisted-replication leaf replicator-activation-time 30
bgp
exit
bgp-evpn
evi 4000
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
mpls
shutdown
exit
exit
stp
shutdown
exit
sap 1/1/1:4000 create
no shutdown
exit
no shutdown
----------------------------------------------
In this example configuration, BGP advertises a new inclusive multicast route with tunnel-type = IR, type (T) = AR-L, and tunnel-id = originating-ip = next-hop = IR-IP (the IP address that normally terminates VXLAN, either the system-ip or the vxlan-src-vtep address).
The AR-L builds a single flooding list per service, in which replication to each destination is controlled by the BM and U flags. These flags are displayed in the following show service id vxlan command example output.
A:PE-3# show service id 4000 vxlan
===============================================================================
Vxlan Src Vtep IP: N/A
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 4000
Creation Origin: manual
Assisted-Replication: leaf Replicator-Activation-Time: 30
RestProtSrcMacAct: none
===============================================================================
VPLS VXLAN service Network Specifics
===============================================================================
Ing Net QoS Policy : none Vxlan VNI Id : 4000
Ingress FP QGrp : (none) Ing FP QGrp Inst : (none)
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address Egress VNI Num. MACs Mcast Oper L2
State PBR
-------------------------------------------------------------------------------
10.2.2.2 4000 0 BM Up No
10.4.4.4 4000 0 - Up No
192.0.2.2 4000 0 U Up No
192.0.2.5 4000 0 U Up No
192.0.2.6 4000 0 U Up No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 5
-------------------------------------------------------------------------------
===============================================================================
The AR-L creates the following VXLAN destinations when it receives and selects a Replicator-AR route or the Regular-IR routes:
A VXLAN destination to each remote PE that sent an IR route. These bindings have the U flag set.
A VXLAN destination to the selected AR-R. These bindings have only the BM flag set; the U flag is not set.
For each non-selected AR-R, the AR-L creates a binding with flag "-" (in the CPM) that is displayed by the show service id vxlan command. Although the VXLAN destinations to non-selected AR-Rs do not carry any traffic, the destinations count against the total limit and must be considered when accounting for consumed VXLAN destinations in the router.
The BM traffic is only sent to the selected AR-R, whereas the U (unknown unicast) traffic is sent to all the destinations with the U flag.
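As a rough illustration of how the flags steer traffic, the following Python sketch filters the destinations from the show output above by traffic class. The function name and the data layout are hypothetical; the flag semantics follow the description in this section.

```python
# (VTEP address, Mcast flag) pairs copied from the show output above
DESTINATIONS = [
    ("10.2.2.2", "BM"),   # selected AR-R
    ("10.4.4.4", "-"),    # non-selected AR-R, carries no flooded traffic
    ("192.0.2.2", "U"),
    ("192.0.2.5", "U"),
    ("192.0.2.6", "U"),
]


def vteps_for(traffic_class, destinations=DESTINATIONS):
    """Return the VTEPs that receive 'BM' or 'U' (unknown unicast) traffic.

    A destination flagged 'BUM' serves both classes; '-' serves neither.
    """
    return [vtep for vtep, flag in destinations
            if flag == traffic_class or flag == "BUM"]


print(vteps_for("BM"))  # only the selected AR-R
print(vteps_for("U"))   # every destination with the U flag
```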
The AR-L performs per-service load-balancing of the BM traffic when two or more AR-Rs exist in the same service. The AR Leaf creates a list of candidate PEs for each AR-R (ordered by IP and VNI; candidate 0 being the lowest IP and VNI). The replicator is selected out of a modulo function of the service-id and the number of replicators, as shown in the following example output.
A:PE-3# show service id 4000 vxlan assisted-replication replicator
===============================================================================
Vxlan AR Replicator Candidates
===============================================================================
VTEP Address Egress VNI In Use In Candidate List Pending Time
-------------------------------------------------------------------------------
10.2.2.2 4000 yes yes 0
10.4.4.4 4000 no yes 0
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================
A change in the number of Replicator-AR routes (for example, if a route is withdrawn or a new route appears) affects the result of the hashing, which may cause a different AR-R to be selected.
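The selection described above can be approximated with the following Python sketch. It assumes the candidate list is ordered by IP address and then VNI, and that the index is the service-id modulo the number of candidates; the function name is hypothetical and the exact inputs used by the router may differ.

```python
import ipaddress


def select_replicator(service_id, candidates):
    """Pick an AR-R from a list of (vtep_ip, vni) candidates.

    Candidates are ordered by IP and VNI (candidate 0 is the lowest), and
    the index is service_id modulo the number of candidates.
    """
    if not candidates:
        return None  # no Replicator-AR routes: no AR-R to select
    ordered = sorted(candidates,
                     key=lambda c: (ipaddress.ip_address(c[0]), c[1]))
    return ordered[service_id % len(ordered)]


candidates = [("10.4.4.4", 4000), ("10.2.2.2", 4000)]
print(select_replicator(4000, candidates))  # service 4000, two candidates
print(select_replicator(4001, candidates))  # a different service-id
```

With the two candidates from the output above, service 4000 maps to index 0, which is consistent with 10.2.2.2 being shown as in use. Removing or adding a candidate changes the divisor, which is why a route withdrawal can move a service to a different AR-R.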
The following list summarizes other aspects of the AR-L behavior:
When a Leaf receives a BM packet on an AC, it sends the packet to its flood list, which includes access SAPs or SDP bindings and the VXLAN destinations with BM or BUM flags. If a single AR-R is selected, only one VXLAN destination has the BM flag set.
Control plane-generated BM packets, such as ARP/ND (when proxy-ARP/ND is enabled) or Eth-CFM, follow the behavior of regular data plane BM packets.
When a Leaf receives an unknown unicast packet on an AC, it sends the packet to the flood-list, skipping the AR destination because the U flag is set to 0. To avoid packet re-ordering, the unknown unicast packets do not go through the AR-R.
When a Leaf receives a BUM packet on an overlay tunnel, it forwards the packet to the flood list, skipping the VXLAN tunnels (that is, the packet is sent to the local ACs and never to a VXLAN tunnel). This is the default IR behavior.
When the last Replicator-AR route is withdrawn, the AR-L removes the AR destination from the flood list and falls back to ingress replication.
AR BM replication behavior for a BM packet shows the expected replication behavior for BM traffic when received at the access on an AR-R, AR-L, or RNVE router. Unknown unicast follows regular ingress replication behavior regardless of the role of the ingress node for the specific service.
Assisted-Replication interaction with other VPLS features
The Assisted-Replication feature has the following limitations:
The following features are not supported on the same service where the Assisted-Replication feature is enabled.
Aggregate QoS per VNI
VXLAN IPv6 transport
IGMP/MLD/PIM-snooping
Assisted-Replication Leaf and Replicator functions are mutually exclusive within the same VPLS service.
The Assisted-Replication feature is supported with IPv4 non-system-ip VXLAN termination. However, the configured assisted-replication-ip (AR-IP) must be different from the tunnel termination IP address.
The AR-IP address must be a /32 loopback interface on the base router.
The Assisted-Replication feature is only supported in EVPN-VXLAN services (VPLS with BGP-EVPN vxlan enabled). Although services with a combination of EVPN-MPLS and EVPN-VXLAN are supported, the Assisted-Replication configuration is only relevant to the VXLAN.
DGW policy based forwarding/routing to an EVPN ESI
The Nuage Virtual Services Platform (VSP) supports a service chaining function that ensures traffic traverses a number of services, also known as Service Functions (FW, LB, NAT, IPS/IDS, and so on), between application hosts if the operator needs to do so. In the DC, tenants want the ability to specify these functions and their sequence, so that services can be added or removed without requiring changes to the underlying application.
This service chaining function is built based on a series of policy based routing/forwarding redirecting rules that are automatically coordinated and abstracted by the Nuage Virtual Services Directory (VSD). From a networking perspective, the packets are hop-by-hop redirected based on the location of the corresponding SF (Service Function) in the DC fabric. The location of the SF is specified by its VTEP and VNI and is advertised by BGP-EVPN along with an Ethernet Segment Identifier that is uniquely associated with the SF.
For more information about the Nuage Service Chaining solution, see the Nuage VSP documentation.
The 7750 SR, 7450 ESS, or 7950 XRS can be integrated as the first hop in the chain in a Nuage DC. This service chaining integration is intended to be used as described in the following three use cases.
Policy based forwarding in VPLS services for Nuage Service Chaining integration in L2-domains
PBF to ESI function shows the 7750 SR, 7450 ESS, and 7950 XRS Service Chaining integration with the Nuage VSP on VPLS services. In this example, the DC gateway, PE1, is connected to an L2-DOMAIN that exists in the DC and must redirect the traffic to the Service Function SF-1. The regular Layer 2 forwarding procedures would have taken the packets to PE2, as opposed to SF-1.
An operator must configure a PBF match/action filter policy entry in an IPv4 or MAC ingress access or network filter deployed on a VPLS interface using CLI/SNMP/NETCONF management interfaces. The PBF target is the first service function in the chain (SF-1) that is identified by an ESI.
In the example shown in PBF to ESI function, the PBF filter redirects the matching packets to ESI 0x01 in VPLS-1.
As soon as the redirection target is configured and associated with the vport connected to SF-1, the Nuage VSC (Virtual Services Controller, or the remote PE3 in the example) advertises the location of SF-1 via an Auto-Discovery Ethernet Tag route (route type 1) per-EVI. In this AD route, the ESI associated with SF-1 (ESI 0x01) is advertised along with the VTEP (PE3's IP) and VNI (VNI-1) identifying the vport where SF-1 is connected. PE1 sends all the frames matching the ingress filter to PE3's VTEP and VNI-1.
The following filter configuration shows an example of a PBF rule redirecting all the frames to an ESI.
A:PE1>config>filter>mac-filter# info
----------------------------------------------
default-action forward
entry 10 create
action
forward esi ff:00:00:00:00:00:00:00:00:01 service-id 301
exit
exit
When the filter is properly applied to the VPLS service (VPLS-301 in this example), it shows 'Active' in the following show commands as long as the Auto-Discovery route for the ESI is received and imported.
A:PE1# show filter mac 1
===============================================================================
Mac Filter
===============================================================================
Filter Id : 1 Applied : Yes
Scope : Template Def. Action : Forward
Entries : 1 Type : normal
Description : (Not Specified)
-------------------------------------------------------------------------------
Filter Match Criteria : Mac
-------------------------------------------------------------------------------
Entry : 10 FrameType : Ethernet
Description : (Not Specified)
Log Id : n/a
Src Mac : Undefined
Dest Mac : Undefined
Dot1p : Undefined Ethertype : Undefined
DSAP : Undefined SSAP : Undefined
Snap-pid : Undefined ESnap-oui-zero : Undefined
Match action: Forward (ESI) Active
ESI : ff:00:00:00:00:00:00:00:00:01
Svc Id : 301
PBR Down Act: Forward (entry-default)
Ing. Matches: 3 pkts
Egr. Matches: 0 pkts
===============================================================================
A:PE1# show service id 301 es-pbr
===============================================================================
L2 ES PBR
===============================================================================
ESI Users Status
VTEP:VNI
-------------------------------------------------------------------------------
ff:00:00:00:00:00:00:00:00:01 1 Active
192.0.2.72:7272
-------------------------------------------------------------------------------
Number of entries : 1
-------------------------------------------------------------------------------
===============================================================================
Details of the received AD route that resolves the filter forwarding are shown in the following show router bgp routes command.
A:PE1# show router bgp routes evpn auto-disc esi ff:00:00:00:00:00:00:00:00:01
===============================================================================
BGP Router ID:192.0.2.71 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP EVPN Auto-Disc Routes
===============================================================================
Flag Route Dist. ESI NextHop
Tag Label
-------------------------------------------------------------------------------
u*>i 192.0.2.72:100 ff:00:00:00:00:00:00:00:00:01 192.0.2.72
0 VNI 7272
-------------------------------------------------------------------------------
Routes : 1
=============================================================
This AD route, when used for PBF redirection, is added to the list of EVPN-VXLAN bindings for the VPLS service and shown as 'L2 PBR' type:
A:PE1# show service id 301 vxlan
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 301
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address Egress VNI Num. MACs Mcast Oper State L2 PBR
-------------------------------------------------------------------------------
192.0.2.69 301 1 Yes Up No
192.0.2.72 301 1 Yes Up No
192.0.2.72 7272 0 No Up Yes
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================
If the AD route is withdrawn, the binding disappears and the filter is inactive again. The user can control whether the matching packets are dropped or forwarded if the PBF target cannot be resolved by BGP.
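The dependency between the filter state and the AD route can be sketched as follows. This is an illustrative Python model only: the function name, the dictionary of imported AD routes, and the returned tuple are hypothetical, and the "forward" or "drop" fallback stands in for the user-configurable behavior mentioned above.

```python
def resolve_pbf(esi, ad_routes, pbr_down_action="forward"):
    """Resolve a PBF target ESI against imported AD per-EVI routes.

    ad_routes maps an ESI to the (VTEP, VNI) advertised in its AD route.
    Returns (status, action, target).
    """
    target = ad_routes.get(esi)
    if target is not None:
        # AD route received and imported: redirect to the advertised VTEP/VNI.
        return ("Active", "redirect", target)
    # AD route missing or withdrawn: apply the configured fallback action.
    return ("Inactive", pbr_down_action, None)


esi = "ff:00:00:00:00:00:00:00:00:01"
ad_routes = {esi: ("192.0.2.72", 7272)}  # values from the show output above
print(resolve_pbf(esi, ad_routes))
print(resolve_pbf(esi, {}, pbr_down_action="drop"))  # after a withdrawal
```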
Policy based routing in VPRN services for Nuage Service Chaining integration in L2-DOMAIN-IRB domains
PBR to ESI function shows the 7750 SR, 7450 ESS, and 7950 XRS Service Chaining integration with the Nuage VSP on L2-DOMAIN-IRB domains. In this example, the DC gateway, PE1, is connected to an L2-DOMAIN-IRB that exists in the DC and must redirect the traffic to the Service Function SF-1 with IP address 10.10.10.1. The regular Layer 3 forwarding procedures would have taken the packets to PE2, as opposed to SF-1.
In this case, an operator must configure a PBR match/action filter policy entry in an IPv4 ingress access or network filter deployed on an IES/VPRN interface using CLI, SNMP, or NETCONF management interfaces. The PBR target identifies the first service function in the chain (ESI 0x01 in PBR to ESI function, identifying where the Service Function is connected, and the IPv4 address of the SF) and the EVPN-VXLAN egress interface on the PE (VPRN routing instance and R-VPLS interface name). The BGP control plane together with the ESI PBR configuration are used to forward the matching packets to the next hop in the EVPN-VXLAN data center chain (through resolution to a VNI and VTEP). If the BGP control plane information is not available, the packets matching the ESI PBR entry are, by default, forwarded using regular routing. Optionally, an operator can choose to drop the packets when the ESI PBR target is not reachable.
The following filter configuration shows an example of a PBR rule redirecting all the matching packets to an ESI.
*A:PE1>config>filter>ip-filter# info
----------------------------------------------
default-action forward
entry 10 create
match
dst-ip 10.10.10.253/32
exit
action
forward esi ff:00:00:00:00:21:5f:00:df:e5 sf-ip 10.10.10.1 vas-interface "evi-301" router 300
exit
pbr-down-action-override filter-default-action
exit
----------------------------------------------
In this use case, the following are required in addition to the ESI: the sf-ip (10.10.10.1 in the example above), router instance (300), and vas-interface.
The sf-ip is used by the system to know which inner MAC DA it has to use when sending the redirected packets to the SF. The SF-IP is resolved to the SF MAC following regular ARP procedures in EVPN-VXLAN.
The router instance may be the same as the one where the ingress filter is configured or may be different: for instance, the ingress PBR filter can be applied on an IES interface pointing at a VPRN router instance that is connected to the DC fabric.
The vas-interface refers to the R-VPLS interface name through which the SF can be found. The VPRN instance may have more than one R-VPLS interface, therefore, it is required to specify which R-VPLS interface to use.
When the filter is properly applied to the VPRN or IES service (VPRN-300 in this example), it shows 'Active' in the following show commands as long as the Auto-Discovery route for the ESI is received and imported and the SF-IP is resolved to a MAC address.
*A:PE1# show filter ip 1
===============================================================================
IP Filter
===============================================================================
Filter Id : 1 Applied : Yes
Scope : Template Def. Action : Forward
System filter: Unchained
Radius Ins Pt: n/a
CrCtl. Ins Pt: n/a
RadSh. Ins Pt: n/a
PccRl. Ins Pt: n/a
Entries : 1
Description : (Not Specified)
-------------------------------------------------------------------------------
Filter Match Criteria : IP
-------------------------------------------------------------------------------
Entry : 10
Description : (Not Specified)
Log Id : n/a
Src. IP : 0.0.0.0/0
Src. Port : n/a
Dest. IP : 10.16.0.253/32
Dest. Port : n/a
Protocol : Undefined Dscp : Undefined
ICMP Type : Undefined ICMP Code : Undefined
Fragment : Off Src Route Opt : Off
Sampling : Off Int. Sampling : On
IP-Option : 0/0 Multiple Option: Off
TCP-syn : Off TCP-ack : Off
Option-pres : Off
Egress PBR : Undefined
Match action : Forward (ESI) Active
ESI : ff:00:00:00:00:21:5f:00:df:e5
SF IP : 10.10.10.1
VAS If name: evi-301
Router : 300
PBR Down Act : Forward (filter-default-action)
Ing. Matches : 3 pkts (318 bytes)
Egr. Matches : 0 pkts
===============================================================================
*A:PE1# show service id 300 es-pbr
===============================================================================
L3 ES PBR
===============================================================================
SF IP ESI Users Status
Interface MAC
VTEP:VNI
-------------------------------------------------------------------------------
10.10.10.1 ff:00:00:00:00:21:5f:00:df:e5 1 Active
evi-301 d8:47:01:01:00:0a
192.0.2.71:7171
-------------------------------------------------------------------------------
Number of entries : 1
-------------------------------------------------------------------------------
=================================================================================
In the FDB for the R-VPLS 301, the MAC address is associated with the VTEP and VNI specified by the AD route, and no longer with those of the MAC/IP route. When a PBR filter with a forward action to an ESI and SF-IP (Service Function IP) exists, a MAC route is auto-created by the system and this route has higher priority than the remote MAC or IP routes for the MAC (see BGP and EVPN route selection for EVPN routes).
The following shows that the AD route creates a new EVPN-VXLAN binding and the MAC address associated with the SF-IP uses that 'binding':
*A:PE1# show service id 301 vxlan
===============================================================================
VPLS VXLAN, Ingress VXLAN Network Id: 301
===============================================================================
Egress VTEP, VNI
===============================================================================
VTEP Address Egress VNI Num. MACs Mcast Oper State L2 PBR
-------------------------------------------------------------------------------
192.0.2.69 301 1 Yes Up No
192.0.2.71 301 0 Yes Up No
192.0.2.71 7171 1 No Up No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================
*A:PE1# show service id 301 fdb detail
===============================================================================
Forwarding Database, Service 301
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
301 d8:45:ff:00:00:6a vxlan-1: EvpnS 06/15/15 21:55:27
192.0.2.69:301
301 d8:47:01:01:00:0a vxlan-1: EvpnS 06/15/15 22:32:56
192.0.2.71:7171
301 d8:48:ff:00:00:6a cpm Intf 06/15/15 21:54:12
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
As in the Layer 2 case, if the AD route is withdrawn or the SF-IP ARP is not resolved, the filter becomes inactive again. The user can control whether the matching packets are dropped or forwarded if the PBR target cannot be resolved.
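The Layer 3 case adds a second condition to the resolution: the SF-IP must also resolve to a MAC address. The following Python sketch models both conditions; the function name, the dictionaries, and the returned fields are hypothetical and serve only to illustrate the logic described in this section.

```python
def resolve_l3_pbr(esi, sf_ip, ad_routes, arp_table, pbr_down_action="forward"):
    """Resolve an ESI PBR target that also needs the SF-IP mapped to a MAC.

    ad_routes maps ESI -> (VTEP, VNI); arp_table maps SF-IP -> SF MAC.
    """
    target = ad_routes.get(esi)
    sf_mac = arp_table.get(sf_ip)
    if target is None or sf_mac is None:
        # Either the AD route or the ARP entry is missing: filter inactive.
        return {"status": "Inactive", "action": pbr_down_action}
    vtep, vni = target
    # Redirected packets use the SF MAC as inner MAC DA toward VTEP/VNI.
    return {"status": "Active", "action": "redirect",
            "inner_mac_da": sf_mac, "vtep": vtep, "vni": vni}


esi = "ff:00:00:00:00:21:5f:00:df:e5"
ad_routes = {esi: ("192.0.2.71", 7171)}        # from the es-pbr output above
arp_table = {"10.10.10.1": "d8:47:01:01:00:0a"}
print(resolve_l3_pbr(esi, "10.10.10.1", ad_routes, arp_table))
print(resolve_l3_pbr(esi, "10.10.10.1", ad_routes, {}))  # ARP unresolved
```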
EVPN VXLAN multihoming
SR OS supports EVPN VXLAN multihoming as specified in RFC 8365. Similar to EVPN-MPLS, as described in EVPN for MPLS tunnels, ESs and virtual ESs can be associated with VPLS and R-VPLS services where BGP-EVPN VXLAN is enabled. EVPN multihoming for EVPN-VXLAN illustrates the use of ESs in EVPN VXLAN networks.
As described in EVPN multihoming in VPLS services, the multihoming procedures consist of three components:
Designated Forwarder (DF) election
split-horizon
aliasing
DF election is the mechanism by which the PEs attached to the same ES elect a single PE to forward all traffic (in case of single-active mode) or all BUM traffic (in case of all-active mode) to the multihomed CE. The same DF Election mechanisms described in EVPN for MPLS tunnels are supported for VXLAN services.
Split-horizon is the mechanism by which BUM traffic received from a peer ES PE is filtered so that it is not looped back to the CE that first transmitted the frame. It is applicable to all-active multihoming. This is illustrated in EVPN multihoming for EVPN-VXLAN, where PE4 receives BUM traffic from PE3 but, in spite of being the DF for ES-2, PE4 filters the traffic and does not send it back to host-1. While split-horizon filtering uses ESI-labels in EVPN MPLS services, an alternative procedure called ‟Local Bias” is applied in VXLAN services, as described in RFC 8365. In MPLS services, split-horizon filtering may be used in single-active mode to avoid in-flight BUM packets from being looped back to the CE during transient times. In VXLAN services, split-horizon filtering is only used with all-active mode.
Aliasing is the procedure by which PEs that are not attached to the ES can process non-zero ESI MAC/IP routes and AD routes and create ES destinations to which per-flow ECMP can be applied. Aliasing only applies to all-active mode.
As an example, the configuration of an ES that is used for VXLAN services follows. Note that this ES can be used for VXLAN services and MPLS services (in both cases VPLS and Epipes).
A:PE-3# configure service system bgp-evpn ethernet-segment "ES-2"
A:PE-3>config>service>system>bgp-evpn>eth-seg# info
----------------------------------------------
esi 01:02:00:00:00:00:00:00:00:00
service-carving
mode manual
manual
preference non-revertive create
value 10
exit
exit
exit
multi-homing all-active
lag 1
no shutdown
----------------------------------------------
An example of configuration of a VXLAN service using the above ES follows:
A:PE-3# configure service vpls 1
A:PE-3>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 1 create
exit
bgp
exit
bgp-evpn
evi 1
vxlan bgp 1 vxlan-instance 1
ecmp 2
auto-disc-route-advertisement
mh-mode network
no shutdown
exit
exit
stp
shutdown
exit
sap lag-1:30 create
no shutdown
exit
no shutdown
----------------------------------------------
The auto-disc-route-advertisement and mh-mode network commands are required in all services that are attached to at least one ES, and they must be configured on both the PEs attached to the ES locally and the remote PEs in the same service. The former enables the advertising of multihoming routes in the service, whereas the latter activates the multihoming procedures for the service, including the local bias mode for split-horizon.
In addition, the configuration of vpls>bgp-evpn>vxlan>ecmp 2 (or greater) is required so that VXLAN ES destinations with two or more next hops can be used for per-flow load balancing. The following command shows how PE1, as shown in EVPN multihoming for EVPN-VXLAN, creates an ES destination composed of two VXLAN next hops.
A:PE-1# show service id 1 vxlan destinations
===============================================================================
Egress VTEP, VNI
===============================================================================
Instance VTEP Address Egress VNI Evpn/ Num.
Mcast Oper State L2 PBR Static MACs
-------------------------------------------------------------------------------
1 192.0.2.3 1 evpn 0
BUM Up No
1 192.0.2.4 1 evpn 0
BUM Up No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 2
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-VXLAN Ethernet Segment Dest
===============================================================================
Instance Eth SegId Num. Macs Last Change
-------------------------------------------------------------------------------
1 01:02:00:00:00:00:00:00:00:00 1 04/01/2019 08:54:54
-------------------------------------------------------------------------------
Number of entries: 1
-------------------------------------------------------------------------------
===============================================================================
A:PE-1# show service id 1 vxlan esi 01:02:00:00:00:00:00:00:00:00
===============================================================================
BGP EVPN-VXLAN Ethernet Segment Dest
===============================================================================
Instance Eth SegId Num. Macs Last Change
-------------------------------------------------------------------------------
1 01:02:00:00:00:00:00:00:00:00 1 04/01/2019 08:54:54
-------------------------------------------------------------------------------
Number of entries: 1
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-VXLAN Dest TEP Info
===============================================================================
Instance TEP Address Egr VNI Last Change
-------------------------------------------------------------------------------
1 192.0.2.3 1 04/01/2019 08:54:54
1 192.0.2.4 1 04/01/2019 08:54:54
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================
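Per-flow load balancing over an ES destination such as the one above can be pictured with the following Python sketch. It is not the router's hashing algorithm: the CRC32 hash, the choice of flow fields, the function name, and the way the ecmp value caps the number of usable next hops are all assumptions made for illustration.

```python
import zlib


def pick_next_hop(flow, next_hops, ecmp=2):
    """Map a flow onto one of the VXLAN next hops of an ES destination.

    flow is a tuple of header fields; next_hops is a list of (vtep, vni).
    Assumption: at most `ecmp` next hops are used, taken in sorted order.
    """
    usable = sorted(next_hops)[:ecmp]
    if not usable:
        return None
    key = "|".join(str(field) for field in flow).encode()
    # All packets of one flow hash to the same next hop, so per-flow
    # ordering is preserved while different flows spread across the PEs.
    return usable[zlib.crc32(key) % len(usable)]


es_next_hops = [("192.0.2.3", 1), ("192.0.2.4", 1)]  # from the output above
flow = ("10.0.0.1", "10.0.0.2", 6, 49152, 443)       # hypothetical 5-tuple
print(pick_next_hop(flow, es_next_hops))
```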
Local bias for EVPN VXLAN multihoming
EVPN MPLS, as described in EVPN for MPLS tunnels, uses ESI-labels to identify the BUM traffic sourced from a specified ES. The egress PE performs a label lookup to find the ESI label below the EVI label and to determine if a frame can be forwarded to a local ES. Because VXLAN does not support ESI-labels, or any MPLS label for that matter, the split-horizon filtering must be based on the tunnel source IP address. This also implies that the SAP-to-SAP forwarding rules must be changed when the SAPs belong to local ESs, irrespective of the DF state. This new forwarding is what RFC 8365 refers to as local bias. EVPN-VXLAN multihoming with local bias illustrates the local bias forwarding behavior.
Local bias is based on the following principles:
Every PE knows the IP addresses associated with the other PEs with which it has shared multihomed ESs.
When the PE receives a BUM frame from a VXLAN bind, it looks up the source IP address in the tunnel header and filters out the frame on all local interfaces connected to ESs that are shared with the ingress PE.
With this approach, the ingress PE must perform replication locally to all directly-attached ESs (regardless of the DF Election state) for all flooded traffic coming from the access interfaces. BUM frames received on any SAP are flooded to:
local non-ES SAPs and non-ES SDP-binds
local all-active ES SAPs (DF and NDF)
local single-active ES SDP-binds and SAPs (DF only)
EVPN-VXLAN destinations
As an example, in EVPN-VXLAN multihoming with local bias, PE2 receives BUM traffic from Host-3 and it forwards it to the remote PEs and the local ES SAP, even though the SAP is in NDF state.
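The ingress flooding rules listed above can be sketched in Python as follows. The Endpoint type, its field names, and the endpoint labels are hypothetical, and excluding the ingress SAP itself from the result is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    name: str
    kind: str                      # "sap", "sdp-bind", or "vxlan"
    es_mode: Optional[str] = None  # None, "all-active", or "single-active"
    is_df: bool = False


def ingress_flood_list(ingress, endpoints):
    """Return where a BUM frame received on a local SAP is flooded."""
    out = []
    for ep in endpoints:
        if ep == ingress:
            continue  # assumption: never flood back to the ingress SAP
        if ep.kind == "vxlan" or ep.es_mode is None:
            out.append(ep.name)  # EVPN-VXLAN destinations and non-ES objects
        elif ep.es_mode == "all-active":
            out.append(ep.name)  # local bias: DF and NDF alike
        elif ep.es_mode == "single-active" and ep.is_df:
            out.append(ep.name)  # single-active: DF only
    return out


host3 = Endpoint("sap-host-3", "sap")
endpoints = [
    host3,
    Endpoint("sap-es-ndf", "sap", es_mode="all-active", is_df=False),
    Endpoint("vxlan-remote-pe", "vxlan"),
]
print(ingress_flood_list(host3, endpoints))
```

In this model the all-active ES SAP receives the frame even though it is NDF, which mirrors the PE2 example above.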
The following rules apply to egress PE forwarding for EVPN-VXLAN services:
The source VTEP is looked up for BUM frames received on EVPN-VXLAN.
If the source VTEP matches one of the PEs with which the local PE shares both an ES and a VXLAN service:
the local PE does not forward to the local SAPs of the shared ES
the local PE forwards normally to the non-shared ES SAPs unless they are in NDF state
Because there is no multicast label or multicast B-MAC in VXLAN, the egress PE only identifies BUM traffic using the customer MAC DA; as a result, BM or unknown MAC DAs identify BUM traffic.
For example, in EVPN-VXLAN multihoming with local bias, PE3 receives BUM traffic on VXLAN. PE3 identifies the source VTEP as a PE with which two ESs are shared, therefore it does not forward the BUM frames to the two shared ESs. It forwards to the non-shared ES (Host-5) because it is in DF state. PE4 receives BUM traffic and forwards it based on normal rules because it does not share any ESs with PE2.
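The egress rules above can be modeled with the following Python sketch. The function name, the data layout, and the SAP and ES labels are hypothetical; only ES SAPs are modeled, and non-ES SAPs are left out for brevity.

```python
def egress_forward_saps(source_vtep, local_es_saps, es_peers):
    """Return the local ES SAPs that receive a BUM frame from VXLAN.

    local_es_saps: list of (sap_name, es_name, is_df).
    es_peers: maps an ES name to the set of peer VTEPs that share both
    that ES and the VXLAN service with the local PE.
    """
    out = []
    for sap, es, is_df in local_es_saps:
        if source_vtep in es_peers.get(es, set()):
            continue  # shared ES with the ingress PE: filtered, even if DF
        if is_df:
            out.append(sap)  # non-shared ES: normal DF/NDF rules apply
    return out


# Hypothetical view from PE3: two ESs shared with PE2, one not shared
local_es_saps = [("sap-es1", "ES-1", True),
                 ("sap-es2", "ES-2", False),
                 ("sap-es3", "ES-3", True)]
es_peers = {"ES-1": {"192.0.2.2"}, "ES-2": {"192.0.2.2"}, "ES-3": {"192.0.2.4"}}
print(egress_forward_saps("192.0.2.2", local_es_saps, es_peers))
```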
The following command can be used to check whether the local PE has enabled the local bias procedures for a specific ES:
A:PE-2# tools dump service system bgp-evpn ethernet-segment "ES-1" local-bias
-------------------------------------------------------------------------------
[04/01/2019 08:45:08] Vxlan Local Bias Information
----------------------------------------------------------------------+--------
Peer | Enabled
----------------------------------------------------------------------+--------
192.0.2.3 | Yes
-------------------------------------------------------------------------------
Known limitations for local bias
In EVPN MPLS networks, an ingress PE that uses ingress replication to flood unknown unicast traffic pushes a BUM MPLS label that is different from a unicast label. The egress PEs use this BUM label to identify such BUM traffic to apply DF filtering for All-Active multihomed sites. In PBB-EVPN, in addition to the multicast label, the egress PE can also rely on the multicast B-MAC DA to identify customer BUM traffic.
In VXLAN there are no BUM labels or any tunnel indication that can assist the egress PE in identifying the BUM traffic. As such, the egress PE must solely rely on the C-MAC destination address, which may create some transient issues that are depicted in EVPN-VXLAN multihoming and unknown unicast issues.
As shown in EVPN-VXLAN multihoming and unknown unicast issues, top diagram, in the absence of the mentioned unknown unicast traffic indication, there can be transient duplicate traffic to All-Active multihomed sites under the following condition: CE1’s MAC address is learned by the egress PEs (PE1 and PE2) and advertised to the ingress PE3; however, the MAC advertisement has not yet been received or processed by the ingress PE, so the host MAC address is unknown on the ingress PE3 but known on the egress PEs. Therefore, when a packet destined for CE1’s address arrives on PE3, PE3 floods it through ingress replication to PE1 and PE2 and, because CE1’s MAC is known to PE1 and PE2, multiple copies are sent to CE1.
Another issue is shown at the bottom of EVPN-VXLAN multihoming and unknown unicast issues. In this case, CE1’s MAC address is known on the ingress PE3 but unknown on PE1 and PE2. If PE3’s aliasing hashing picks up the path to the ES’ NDF, a black-hole occurs.
The above two issues are solved in MPLS, as unicast known and unknown frames are identified with different labels.
Finally, another issue is described in Blackhole created by a remote SAP shutdown. Under normal circumstances, when CE3 sends BUM traffic to PE3, the traffic is "local-biased" to PE3’s SAP3 even though PE3 is NDF for the ES. The traffic flooded to PE2 is forwarded to CE2, but not to SAP2, because local bias split-horizon filtering takes place.
The right side of the diagram in Blackhole created by a remote SAP shutdown shows an issue when SAP3 is manually shut down. In this case, PE3 withdraws the AD per-EVI route corresponding to SAP3; however, this does not change the local bias filtering for SAP2 in PE2. Therefore, when CE3 sends BUM traffic, it can neither be forwarded to CE23 through the local SAP3 nor be forwarded by PE2.
Non-system IPv4 and IPv6 VXLAN termination for EVPN VXLAN multihoming
EVPN VXLAN multihoming is supported on VPLS and R-VPLS services when the PEs use non-system IPv4 or IPv6 termination. However, as with EVPN VPWS services, additional configuration steps are required.
The configure service system bgp-evpn eth-seg es-orig-ip ip-address command must be configured with the non-system IPv4 or IPv6 address used for the EVPN-VXLAN service. This command modifies the originating-ip field in the ES routes advertised for the Ethernet Segment, and makes the system use this IP address when adding the local PE as DF candidate.
The configure service system bgp-evpn eth-seg route-next-hop ip-address command must also be configured with the non-system IP address. This command changes the next-hop of the ES and AD per-ES routes to the configured address.
Finally, the non-system IP address (in each of the PEs in the ES) must match in these three commands for the local PE to be considered suitable for DF election:
es-orig-ip ip-address
route-next-hop ip-address
vxlan-src-vtep ip-address
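As a minimal sketch of the consistency rule above, the following check compares the three configured addresses after normalizing their textual form. The function name and the idea of running such a check outside the router are assumptions for illustration; the router performs its own validation.

```python
import ipaddress


def is_df_candidate(es_orig_ip, route_next_hop, vxlan_src_vtep):
    """Return True when the three non-system addresses are the same address.

    Addresses are parsed first so that equivalent spellings of the same
    IPv6 address (for example, different case or zero compression) compare
    as equal.
    """
    addresses = {
        ipaddress.ip_address(value)
        for value in (es_orig_ip, route_next_hop, vxlan_src_vtep)
    }
    return len(addresses) == 1
```

A pre-deployment script could run this check for every PE in the ES before the configuration is pushed.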
EVPN for MPLS tunnels
This section provides information about EVPN for MPLS tunnels.
BGP-EVPN control plane for MPLS tunnels
EVPN routes and usage lists all the EVPN routes supported in 7750 SR, 7450 ESS, or 7950 XRS SR OS and their usage in EVPN-VXLAN, EVPN-MPLS, and PBB-EVPN.
EVPN route | Usage | EVPN-VXLAN | EVPN-MPLS | PBB-EVPN |
---|---|---|---|---|
Type 1 - Ethernet Auto-Discovery route (A-D) | Mass-withdraw, ESI labels, Aliasing | Y | Y | - |
Type 2 - MAC/IP Advertisement route | MAC/IP advertisement, IP advertisement for ARP resolution | Y | Y | Y |
Type 3 - Inclusive Multicast Ethernet Tag route | Flooding tree setup (BUM flooding) | Y | Y | Y |
Type 4 - ES route | ES discovery and DF election | Y | Y | Y |
Type 5 - IP Prefix advertisement route | IP Routing | Y | Y | - |
Type 6 - Selective Multicast Ethernet Tag route | Signal interest on a multicast group | Y | Y | - |
Type 7 - Multicast Join Synch route | Join a multicast group on a multihomed ES | Y | Y | - |
Type 8 - Multicast Leave Synch route | Leave a multicast group on a multihomed ES | Y | Y | - |
Type 10 - Selective Provider Multicast Service Interface Auto-Discovery route | Signal and setup Selective Provider Tunnels for IP Multicast | - | Y | - |
RFC 7432 describes the BGP-EVPN control plane for MPLS tunnels. If EVPN multihoming is not required, two route types are needed to set up a basic EVI (EVPN Instance): MAC/IP Advertisement and the Inclusive Multicast Ethernet Tag routes. If multihoming is required, the ES and the Auto-Discovery routes are also needed.
The route fields and extended communities for route types 2 and 3 are shown in EVPN-VXLAN required routes and communities (see BGP-EVPN control plane for VXLAN overlay tunnels). The changes compared to their use in EVPN-VXLAN are described below.
EVPN route type 3 - inclusive multicast Ethernet tag route
As in EVPN-VXLAN, route type 3 is used for setting up the flooding tree (BUM flooding) for a specified VPLS service. The received inclusive multicast routes add entries to the VPLS flood list in the 7750 SR, 7450 ESS, and 7950 XRS. Ingress replication, p2mp mLDP, and composite tunnels are supported as tunnel types in route type 3 when BGP-EVPN MPLS is enabled.
The following route values are used for EVPN-MPLS services:
-
Route Distinguisher is taken from the RD of the VPLS service within the BGP context. The RD can be configured or derived from the bgp-evpn evi value.
-
Ethernet Tag ID is 0.
-
IP address length is always 32.
-
Originating router's IP address carries an IPv4 or IPv6 address.
-
The PMSI attribute can have different formats depending on the tunnel type enabled in the service.
-
Tunnel type = Ingress replication (6)
The route is referred to as an Inclusive Multicast Ethernet Tag IR (IMET-IR) route and the PMSI Tunnel Attribute (PTA) fields are populated as follows:
-
Leaf not required for Flags.
-
MPLS label carries the MPLS label allocated for the service in the high-order 20 bits of the label field.
Unless bgp-evpn mpls ingress-replication-bum-label is configured in the service, the MPLS label used is the same as that used in the MAC/IP routes for the service.
-
Tunnel endpoint is equal to the originating IP address.
-
Tunnel type=p2mp mLDP (2)
The route is referred to as an IMET-P2MP route and its PTA fields are populated as follows:
-
Leaf not required for Flags.
-
MPLS label is 0.
-
Tunnel endpoint includes the route node address and an opaque number. This is the tunnel identifier that the leaf-nodes use to join the mLDP P2MP tree.
-
Tunnel type=Composite tunnel (130)
The route is referred to as an IMET-P2MP-IR route and its PTA fields are populated as follows:
-
Leaf not required for Flags.
-
MPLS label 1 is 0.
-
Tunnel endpoint identifier includes the following:
- MPLS label2
- non-zero, downstream allocated label (like any other IR label). The leaf-nodes use the label to set up an EVPN-MPLS destination to the root and add it to the default-multicast list.
- mLDP tunnel identifier
- the route node address and an opaque number. This is the tunnel identifier that the leaf-nodes use to join the mLDP P2MP tree.
IMET-P2MP-IR routes are used in EVIs with a few root nodes and a significant number of leaf-only PEs. In this scenario, a combination of P2MP and IR tunnels can be used in the network, such that the root nodes use P2MP tunnels to send broadcast, unknown unicast, and multicast traffic but the leaf-PE nodes use IR to send traffic to the roots. This use case is documented in IETF RFC 8317 and the main advantage it offers is the significant savings in P2MP tunnels that the PE/P routers in the EVI need to handle (as opposed to a full mesh of P2MP tunnels among all the PEs in an EVI).
In this case, the root PEs signal a special tunnel type in the PTA, indicating that they intend to transmit BUM traffic using an mLDP P2MP tunnel but can also receive traffic over an IR evpn-mpls binding. An IMET route with this special "composite" tunnel type in the PTA is called an IMET-P2MP-IR route and the encoding of its PTA is shown in Composite p2mp mLDP and IR tunnels—PTA.
EVPN route type 2 - MAC/IP advertisement route
The 7750 SR, 7450 ESS, or 7950 XRS router generates this route type for advertising MAC addresses (and IP addresses if proxy-ARP/proxy-ND is enabled). If mac-advertisement is enabled, the router generates MAC advertisement routes for the following:
-
learned MACs on SAPs or SDP bindings
-
conditional static MACs
Note: The unknown-mac-route is not supported for EVPN-MPLS services.
The route type 2 generated by a router uses the following fields and values:
-
Route Distinguisher is taken from the RD of the VPLS service within the BGP context. The RD can be configured or derived from the bgp-evpn evi value.
-
Ethernet Segment Identifier (ESI) is zero for MACs learned from single-homed CEs and different from zero for MACs learned from multihomed CEs.
-
Ethernet Tag ID is 0.
-
MAC address length is always 48.
-
MAC address can be learned or statically configured.
-
IP address and IP address length:
-
It is the IP address associated with the MAC being advertised with a length of 32 (or 128 for IPv6).
-
In general, any MAC route without IP has IPL=0 (IP length) and the IP is omitted.
-
A received route with any IPL value other than 0, 32, or 128 is discarded.
-
MPLS Label 1 carries the MPLS label allocated by the system to the VPLS service. The label value is encoded in the high-order 20 bits of the field and is the same label used in the routes type 3 for the same service unless bgp-evpn mpls ingress-replication-bum-label is configured in the service.
-
MPLS Label 2 is 0.
-
The MAC mobility extended community is used for signaling the sequence number in case of MAC moves and the sticky bit in case of advertising conditional static MACs. If a MAC route is received with a MAC mobility ext-community, the sequence number and the 'sticky' bit are considered for the route selection.
When EVPN multihoming is enabled in the system, two more routes are required. EVPN routes type 1 and 4 shows the fields in routes type 1 and 4 and their associated extended communities.
EVPN route type 1 - Ethernet auto-discovery route (AD route)
The 7750 SR, 7450 ESS, or 7950 XRS router generates this route type to support multihoming functions. The system can generate two types of AD routes:
-
Ethernet AD route per-ESI (Ethernet Segment ID)
-
Ethernet AD route per-EVI (EVPN Instance)
The Ethernet AD per-ESI route generated by a router uses the following fields and values:
-
Route Distinguisher is taken from the system level RD or service level RD.
-
Ethernet Segment Identifier (ESI) contains a 10-byte identifier as configured in the system for a specified ethernet-segment.
-
Ethernet Tag ID is MAX-ET (0xFFFFFFFF). This value is reserved and used only for AD routes per ESI.
-
MPLS label is 0.
-
ESI Label Extended community includes the single-active bit (0 for all-active and 1 for single-active) and ESI label for all-active multihoming split-horizon.
-
Route target extended community is taken from the service level RT or an RT-set for the services defined on the Ethernet segment.
The system can either send a separate Ethernet AD per-ESI route per service, or a few Ethernet AD per-ESI routes aggregating the route-targets for multiple services. While both alternatives inter-operate, RFC 7432 states that the EVPN Auto-Discovery per-ES route must be sent with a set of route-targets corresponding to all the EVIs defined on the Ethernet Segment (ES). Either option can be enabled using the command: config>service>system>bgp-evpn#ad-per-es-route-target <[evi-rt ] | [evi-rt-set]> route-distinguisher ip-address [extended-evi-range]
The default option ad-per-es-route-target evi-rt configures the system to send a separate AD per-ES route per service. When enabled, the evi-rt-set option supports route aggregation: a single AD per-ES route with the associated RD (ip-address:1) and a set of EVI route targets (up to a maximum of 128) is advertised. When the number of EVIs defined in the Ethernet Segment (and therefore the number of route targets) is large, the system sends more than one route. For example:
-
AD per-ES route for evi-rt-set 1 is sent with RD ip-address:1
-
AD per-ES route for evi-rt-set 2 is sent with RD ip-address:2
-
and so on, up to an AD per-ES route sent with RD ip-address:512
The extended-evi-range option is needed for the use of evi-rt-set with a comm-val extended range of 1 through 65535. This option is recommended when EVIs greater than 65535 are configured in some services. In this case, there are more EVIs for which the route-targets must be packed in the AD per-ES routes. This command option extends the maximum number of AD per-ES routes that can be sent (since the RD now supports up to ip-address:65535) and allows many more route-targets to be included in each set.
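The aggregation behavior of evi-rt-set can be sketched as a simple chunking routine. The function name and its parameters are hypothetical. The defaults of 128 route targets per route and 512 routes come from the text above; the per-route capacity with extended-evi-range is not stated here, so both limits are left as parameters rather than guessed.

```python
def pack_ad_per_es_routes(route_targets, ip_address, per_route=128, max_routes=512):
    """Group route targets into AD per-ES routes, one RD per group.

    Returns a list of (route_distinguisher, [route targets]) tuples, where
    the RD of the n-th route is "<ip_address>:n", starting at n = 1.
    """
    if per_route < 1:
        raise ValueError("per_route must be at least 1")
    routes = []
    for start in range(0, len(route_targets), per_route):
        index = start // per_route + 1
        if index > max_routes:
            raise ValueError("route targets exceed the available RD range")
        rd = f"{ip_address}:{index}"
        routes.append((rd, route_targets[start:start + per_route]))
    return routes


# 300 services on one Ethernet Segment (illustrative values only).
example_rts = [f"target:64500:{n}" for n in range(1, 301)]
example_routes = pack_ad_per_es_routes(example_rts, "192.0.2.1")
```

In this illustrative case the 300 route targets are carried in three routes, with RDs 192.0.2.1:1 through 192.0.2.1:3.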
The Ethernet AD per-EVI route generated by a router uses the following fields and values:
-
Route Distinguisher is taken from the service level RD.
-
Ethernet Segment Identifier (ESI) contains a 10-byte identifier as configured in the system for a specified Ethernet Segment.
-
Ethernet Tag ID is 0.
-
MPLS label encodes the unicast label allocated for the service (high-order 20 bits).
-
Route-target extended community is taken from the service level RT.
EVPN route type 4 - ES route
The router generates this route type for multihoming ES discovery and DF (Designated Forwarder) election.
-
Route Distinguisher is taken from the service level RD.
-
Ethernet Segment Identifier (ESI) contains a 10-byte identifier as configured in the system for a specified ethernet-segment.
-
The value of ES-import route-target community is automatically derived from the MAC address portion of the ESI. This extended community is treated as a route-target and is supported by RT-constraint (route-target BGP family).
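A minimal sketch of the ES-import derivation follows. It assumes that the "MAC address portion" is the six octets that immediately follow the ESI type octet, which is how RFC 7432 describes the automatic derivation for ESI types 1, 2, and 3. The function name and the colon-separated input format are illustrative choices.

```python
def es_import_rt(esi):
    """Derive the ES-import route target value from a 10-octet ESI string.

    esi: colon-separated hex octets, for example
         "01:00:00:00:00:71:00:00:00:01".
    Returns the six octets after the type octet, formatted like a MAC.
    """
    octets = bytes(int(part, 16) for part in esi.split(":"))
    if len(octets) != 10:
        raise ValueError("an ESI must contain exactly 10 octets")
    # Octet 0 is the ESI type; octets 1..6 are assumed to be the MAC portion.
    return ":".join(f"{octet:02x}" for octet in octets[1:7])
```

Because every PE attached to the same ES configures the same ESI, each derives the same ES-import value and therefore imports the ES routes of its peers.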
EVPN route type 5 - IP prefix route
IP Prefix Routes are also supported for MPLS tunnels. The route fields for route type 5 are shown in EVPN route-type 5. The 7750 SR, 7450 ESS, or 7950 XRS router generates this route type for advertising IP prefixes in EVPN using the same fields that are described in section BGP-EVPN control plane for VXLAN overlay tunnels, with the following exceptions:
-
MPLS label carries the MPLS label allocated for the service.
-
This route is sent with the RFC 5512 tunnel encapsulation extended community with the tunnel type value set to MPLS.
RFC 5512 - BGP tunnel encapsulation extended community
The following routes are sent with the RFC 5512 BGP Encapsulation Extended Community: MAC/IP, Inclusive Multicast Ethernet Tag, and AD per-EVI routes. ES and AD per-ESI routes are not sent with this Extended Community.
The router processes the following BGP Tunnel Encapsulation tunnel values registered by IANA for RFC 5512:
-
VXLAN encapsulation is 8.
-
MPLS encapsulation is 10.
Any other tunnel value makes the route 'treat-as-withdraw'.
If the encapsulation value is MPLS, the BGP validates the high-order 20-bits of the label field, ignoring the low-order 4 bits. If the encapsulation is VXLAN, the BGP takes the entire 24-bit value encoded in the MPLS label field as the VNI.
If the encapsulation extended community (as defined in RFC 5512) is not present in a received route, BGP treats the route as MPLS or VXLAN based on the configuration of the config>router>bgp>neighbor# def-recv-evpn-encap [mpls | vxlan] command. The command is also available at the bgp and group levels.
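The handling of the encapsulation extended community and the 24-bit label field described above can be summarized in a short sketch. The function name, the return convention (None for treat-as-withdraw), and the default_encap parameter standing in for def-recv-evpn-encap are assumptions for illustration.

```python
TUNNEL_TYPE_VXLAN = 8
TUNNEL_TYPE_MPLS = 10


def interpret_label_field(label_field, tunnel_type=None, default_encap="mpls"):
    """Interpret the 24-bit label field of a received EVPN route.

    tunnel_type:   value of the encapsulation extended community, or None
                   when the community is absent.
    default_encap: "mpls" or "vxlan", used only when tunnel_type is None.
    Returns (encap, value), or None when the route is treated as withdrawn.
    """
    if not 0 <= label_field < (1 << 24):
        raise ValueError("the label field is 24 bits wide")
    if tunnel_type is None:
        encap = default_encap
    elif tunnel_type == TUNNEL_TYPE_VXLAN:
        encap = "vxlan"
    elif tunnel_type == TUNNEL_TYPE_MPLS:
        encap = "mpls"
    else:
        return None  # any other tunnel value: treat-as-withdraw
    if encap == "mpls":
        # Only the high-order 20 bits carry the label.
        return ("mpls", label_field >> 4)
    # VXLAN uses the whole 24-bit value as the VNI.
    return ("vxlan", label_field)
```

For example, a field carrying MPLS label 524118 holds that value shifted left by four bits, and the same routine recovers it.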
EVPN for MPLS tunnels in VPLS services (EVPN-MPLS)
EVPN can be used in MPLS networks where PEs are interconnected through any type of tunnel, including RSVP-TE, Segment-Routing TE, LDP, BGP, Segment Routing IS-IS, Segment Routing OSPF, RIB-API, MPLS-forwarding-policy, SR-Policy, or MPLSoUDP. As with VPRN services, tunnel selection for a VPLS service (with BGP-EVPN MPLS enabled) is based on the auto-bind-tunnel command. The BGP EVPN routes next-hops can be IPv4 or IPv6 addresses and can be resolved to a tunnel in the IPv4 tunnel-table or IPv6 tunnel-table.
EVPN-MPLS is modeled similarly to EVPN-VXLAN, that is, using a VPLS service where EVPN-MPLS "bindings" can coexist with SAPs and SDP bindings. The following shows an example of a VPLS service with EVPN-MPLS.
*A:PE-1>config>service>vpls# info
----------------------------------------------
description "evpn-mpls-service"
bgp
exit
bgp-evpn
evi 10
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
exit
sap 1/1/1:1 create
exit
spoke-sdp 1:1 create
First configure a bgp-evpn context where VXLAN must be disabled and MPLS enabled. In addition to enabling MPLS, the minimum set of commands required to set up the EVPN-MPLS instance is the evi and the auto-bind-tunnel resolution commands. The relevant configuration options are the following.
evi {1..16777215} — This EVPN identifier is unique in the system and is used for the service-carving algorithm used for multihoming (if configured), and for auto-deriving the route target and route distinguishers (if lower than 65535) in the service. It can be used for EVPN-MPLS and EVPN-VXLAN services.
The following options are supported:
- If this EVPN identifier is not specified, the value is zero and no route distinguisher or route target is automatically derived from it.
- If the specified EVPN identifier is lower than 65535 and no other route distinguisher or route target is configured in the service, the following applies:
- The route distinguisher is derived from <system_ip>:evi.
- The route target is derived from <autonomous-system>:evi.
- If the specified EVPN identifier is higher than 65535 and no other route distinguisher or route target is configured in the service, the following applies:
- The route distinguisher cannot be automatically derived. An error is generated if enabling EVPN is attempted without a route distinguisher. A manual or an auto-rd route distinguisher must be configured.
- The route target can only be automatically derived if the evi-three-byte-auto-rt command is configured. If configured, the route target is automatically derived in accordance with the following rules described in RFC 8365.
- The route target is composed of ASN(2-octets):A/type/D-ID/EVI.
- The ASN is a 2-octet value configured in the system. For AS numbers exceeding the 2-byte limit, the low order 16-bit value is used.
- The A=0 value is used for auto-derivation.
- The type=4 (EVI-based) is used.
- The BGP instance is encoded using D-ID= [1..2]. This allows the automatic derivation of different RTs in multi-instance services. The value is inherited from the corresponding BGP instance.
- EVI indicates the configured EVI in the service
For example, consider a service with the following characteristics:
- ASN=64500
- VPLS with two BGP instances, bgp 1 for VXLAN-instance 1 and bgp 2 for EVPN-MPLS
- EVI=100000
The automatically derived route targets for this service are:
- bgp 1 — 64500:1090619040 (ASN:0x410186A0)
- bgp 2 — 64500:1107396256 (ASN:0x420186A0)
When the evi is configured, a configure service vpls bgp node (even empty) is required so that the show service id 1 bgp and show service system bgp-route-distinguisher commands display the correct information.
The configuration of an evi is enforced for EVPN services with SAPs/SDP bindings in an ethernet-segment. See EVPN multihoming in VPLS services for more information about ESs.
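The three-byte route-target derivation described above for evi values higher than 65535 can be reproduced with a few bit operations. The function name is hypothetical, and the field widths (1-bit A, 3-bit type, 4-bit D-ID, 24-bit EVI) are taken from RFC 8365 and are consistent with the hexadecimal values in the example.

```python
def auto_rt_three_byte(asn, evi, bgp_instance):
    """Derive the route target ASN:A/type/D-ID/EVI as "<asn>:<number>".

    asn:          configured AS number (only the low-order 16 bits are used)
    evi:          configured EVI, 1..16777215
    bgp_instance: 1 or 2, encoded as the D-ID
    """
    if not 0 < evi < (1 << 24):
        raise ValueError("evi must fit in 24 bits and be non-zero")
    if bgp_instance not in (1, 2):
        raise ValueError("bgp_instance must be 1 or 2")
    a_bit = 0    # A=0 indicates auto-derivation
    rt_type = 4  # type 4 is EVI-based
    first_octet = (a_bit << 7) | (rt_type << 4) | bgp_instance
    local_admin = (first_octet << 24) | evi
    return f"{asn & 0xFFFF}:{local_admin}"
```

Calling the function with ASN 64500 and EVI 100000 reproduces the two route targets listed in the example for bgp 1 and bgp 2.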
The following options are specific to EVPN-MPLS (and defined in configure service vpls bgp-evpn mpls):
-
control word
Enable or disable control word capability to guarantee interoperability with other vendors. When enabled along with the following command, the control word capability is signaled in the C flag of the EVPN Layer 2 Attributes extended community, as per draft-ietf-bess-rfc7432bis:
- MD-CLI
configure service vpls bgp-evpn routes incl-mcast advertise-l2-attributes
- classic CLI
configure service vpls bgp-evpn incl-mcast-l2-attributes-advertisement
Note: The control-word is required as per RFC 7432 to avoid frame disordering.
-
hash-label
Enables or disables the use of hash-label (also known as Flow Aware Transport label) in the EVPN unicast destinations. Similar to the control-word command, when the hash-label command is enabled along with the incl-mcast-l2-attributes-advertisement (advertise-l2-attributes in MD-CLI) command, the F flag capability is signaled in the EVPN Layer 2 Attributes extended community, as per draft-ietf-bess-rfc7432bis. In addition:
- When the hash-label is enabled and advertise-l2-attributes false is configured, the hash-label is always pushed to a unicast EVPN destination. The hash label is never used for BUM packets, as per draft-ietf-bess-rfc7432bis.
- When hash-label is enabled and advertise-l2-attributes true is configured, the F bit is set in the Layer-2 Attributes extended community of the EVPN Inclusive Multicast Ethernet Tag (IMET) route for the service. The hash-label towards a specific remote PE is pushed in the datapath only if the remote PE previously signaled support for hash-label (F=1). Otherwise, the unicast EVPN destination is brought operationally down, with the corresponding operational flag indicating the reason.
-
auto bind tunnel
Select which type of MPLS transport tunnel to use for a particular instance; this command is used in the same way as in VPRN services.
For BGP-EVPN MPLS, you must explicitly add BGP to the resolution filter in EVPN (BGP is implicit in VPRNs).
-
force VLAN VC forwarding
This option allows the system to preserve the VLAN ID and pbits of the service-delimiting qtag in a new tag added in the customer frame before sending it to the EVPN core.
Note: You can use this option in conjunction with the sap ingress vlan-translation command. If so, the configured translated VLAN ID is sent to the EVPN binds as opposed to the service-delimiting tag VLAN ID. If the ingress SAP/binding is null-encapsulated, the output VLAN ID and pbits are zero.
-
force QinQ VC forwarding with c-tag-c-tag or s-tag-c-tag
This command allows the system to preserve the VLAN ID and pbits of the service-delimiting Q-tags (up to two tags) in customer frames before sending them to the EVPN core.
Note: You can use this option in conjunction with the sap ingress qinq-vlan-translation s-tag.c-tag command. If so, the configured translated S-tag and C-tag VLAN IDs are the VLAN IDs sent to the EVPN binds as opposed to the service-delimiting tags VLAN IDs. If the ingress SAP or binding is null-encapsulated, the output VLAN ID and pbits are zero.
-
split horizon group
This command allows the association of a user-created split horizon group to all the EVPN-MPLS destinations. See EVPN and VPLS integration for more information.
-
ecmp
Set this option to a value greater than 1 to activate aliasing to the remote PEs that are defined in the same all-active multihoming ES. See EVPN all-active multihoming for more information.
-
ingress replication bum label
You can use this option when you want the PE to advertise a label for BUM traffic (Inclusive Multicast routes) that is different from the label advertised for unicast traffic (with the MAC/IP routes). This is useful to avoid potential transient packet duplication in all-active multihoming.
In addition to these options, the following BGP EVPN options are also available for EVPN-MPLS services:
-
mac-advertisement
-
mac-duplication and settings
-
incl-mcast advertise-l2-attributes (MD-CLI)
incl-mcast-l2-attributes-advertisement (classic CLI)
This function enables the advertisement and processing of the EVPN Layer 2 Attributes extended community. The control-word, hash-label configuration, and the Service-MTU value are advertised in the extended community. On reception, the received MTU, hash-label, and control-word flags are compared with the local MTU and the local hash-label and control-word configuration. In case of a mismatch in any of the three settings, the EVPN destination goes operationally down with the corresponding operational flag indicating what the mismatch is. The absence of an IMET route from an egress PE, or the absence of the EVPN L2 Attributes extended community on a received IMET route from that PE, brings down the EVPN destinations to that PE.
-
ignore-mtu-mismatch
This command makes the router ignore the received Layer 2 MTU in the EVPN L2 Attributes extended community of the IMET route for a peer. If disabled, the local service MTU is compared against the received Layer 2 MTU. If there is a mismatch, the EVPN destinations to the peer stay oper-state down.
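The comparison performed when the Layer 2 Attributes extended community is processed can be sketched as follows. The class, the function, and the reason strings are hypothetical names chosen for illustration; they are not the operational flags shown by the router.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class L2Attrs:
    mtu: int
    control_word: bool
    hash_label: bool


def evpn_destination_down_reasons(
    local: L2Attrs,
    received: Optional[L2Attrs],
    ignore_mtu_mismatch: bool = False,
) -> List[str]:
    """Return the reasons to keep an EVPN destination down (empty if none).

    received is None when the peer sent no IMET route or the IMET route
    carried no Layer 2 Attributes extended community.
    """
    if received is None:
        return ["no-l2-attributes"]
    reasons = []
    if not ignore_mtu_mismatch and local.mtu != received.mtu:
        reasons.append("mtu-mismatch")
    if local.control_word != received.control_word:
        reasons.append("control-word-mismatch")
    if local.hash_label != received.hash_label:
        reasons.append("hash-label-mismatch")
    return reasons


local_attrs = L2Attrs(mtu=1514, control_word=True, hash_label=False)
```

The sketch only models the case where advertisement of the Layer 2 attributes is enabled; when it is disabled, no comparison of this kind takes place.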
When EVPN-MPLS is established among some PEs in the network, EVPN unicast and multicast 'bindings' are created on each PE to the remote EVPN destinations. A specified ingress PE creates:
-
A unicast EVPN-MPLS destination binding to a remote egress PE as soon as a MAC/IP route is received from that egress PE.
-
A multicast EVPN-MPLS destination binding to a remote egress PE, if and only if the egress PE advertises an Inclusive Multicast Ethernet Tag Route with a BUM label. That is only possible if the egress PE is configured with ingress-replication-bum-label.
Those bindings, as well as the MACs learned on them, can be checked through the following show commands. In the following example, the remote PE(192.0.2.69) is configured with no ingress-replication-bum-label and PE(192.0.2.70) is configured with ingress-replication-bum-label. Therefore, Dut has a single EVPN-MPLS destination binding to PE(192.0.2.69) and two bindings (unicast and multicast) to PE(192.0.2.70).
show service id 1 evpn-mpls
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address Transport:Tnl Egr Label Oper Mcast Num
State MACs
-------------------------------------------------------------------------------
192.0.2.69 ldp:65537 524118 Up bum 0
192.0.2.70 ldp:65538 524160 Up none 1
192.0.2.70 ldp:65538 524164 Up bum 0
192.0.2.72 ldp:65547 524144 Up bum 0
192.0.2.72 ldp:65547 524138 Up none 2
192.0.2.73 ldp:65548 524148 Up bum 1
192.0.2.254 ldp:65550 524150 Up bum 0
-------------------------------------------------------------------------------
Number of entries : 7
-------------------------------------------------------------------------------
===============================================================================
show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:ca:fe:ca:fe:69 eMpls: EvpnS 06/11/15 21:53:48
192.0.2.69:262118
1 00:ca:fe:ca:fe:70 eMpls: EvpnS 06/11/15 19:59:57
192.0.2.70:262140
1 00:ca:fe:ca:fe:72 eMpls: EvpnS 06/11/15 19:59:57
192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
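The relationship between received routes and the destination bindings shown in the preceding output can be modeled with a small sketch. It assumes a binding is identified by the pair of TEP address and egress label, which matches the rows of the example output; the function name and the tuple format of the input are hypothetical.

```python
def build_bindings(routes):
    """Build EVPN-MPLS destination bindings from received routes.

    routes: iterable of (route_type, tep, label), where route_type is
            "imet" (Inclusive Multicast) or "mac-ip" (MAC/IP Advertisement).
    Returns {(tep, label): {"mcast": "bum" | "none", "macs": count}}.
    """
    bindings = {}
    for route_type, tep, label in routes:
        entry = bindings.setdefault((tep, label), {"mcast": "none", "macs": 0})
        if route_type == "imet":
            entry["mcast"] = "bum"  # the label is used for flooded traffic
        elif route_type == "mac-ip":
            entry["macs"] += 1      # one more MAC reachable over this binding
        else:
            raise ValueError(f"unexpected route type: {route_type}")
    return bindings


example_bindings = build_bindings([
    ("imet", "192.0.2.69", 524118),    # no separate BUM label configured
    ("imet", "192.0.2.70", 524164),    # ingress-replication-bum-label
    ("mac-ip", "192.0.2.70", 524160),  # unicast label of the same PE
])
```

The result contains one binding for 192.0.2.69 and two for 192.0.2.70, in line with the first three rows of the show service id 1 evpn-mpls output.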
EVPN and VPLS integration
The SR OS EVPN implementation on the 7750 SR, 7450 ESS, or 7950 XRS router supports RFC 8560 so that EVPN-MPLS and VPLS can be integrated into the same network and within the same service. Because EVPN is typically introduced into networks that already run VPLS rather than into green-field networks, this feature is useful for the integration between both technologies and even for the migration of VPLS services to EVPN-MPLS.
-
Systems with EVPN endpoints and SDP bindings to the same far-end bring down the SDP bindings.
-
The router allows the establishment of an EVPN endpoint and an SDP binding to the same far-end but the SDP binding is kept operationally down. Only the EVPN endpoint remains operationally up. This is true for spoke SDPs (manual, BGP-AD, and BGP-VPLS) and mesh SDPs. It is also possible between VXLAN and SDP bindings.
-
If there is an existing EVPN endpoint to a specified far-end and a spoke SDP establishment is attempted, the spoke SDP is set up but kept down with an operational flag indicating that there is an EVPN route to the same far-end.
-
If there is an existing spoke SDP and a valid/used EVPN route arrives, the EVPN endpoint is set up and the spoke SDP is brought down with an operational flag indicating that there is an EVPN route to the same far-end.
-
In the case of an SDP binding and EVPN endpoint to different far-end IPs on the same remote PE, both links are up. This can happen if the SDP binding is terminated in an IPv6 address or IPv4 address different from the system address where the EVPN endpoint is terminated.
-
The user can add spoke SDPs and all the EVPN-MPLS endpoints in the same split horizon group (SHG).
-
A CLI command is added under the bgp-evpn>mpls> context so that the EVPN-MPLS endpoints can be added to a split horizon group: bgp-evpn>mpls> [no] split-horizon-group group-name
-
The bgp-evpn mpls split-horizon-group must reference a user-configured split horizon group. User-configured split horizon groups can be configured within the service context. The same group-name can be associated with SAPs, spoke SDPs, pw-templates, pw-template-bindings, and EVPN-MPLS endpoints.
-
If the split-horizon-group command in bgp-evpn>mpls> is not used, the default split horizon group (that contains all the EVPN endpoints) is still used, but it is not possible to refer to it on SAPs/spoke SDPs.
-
SAPs and SDP bindings that share the same split horizon group of the EVPN-MPLS provider-tunnel are brought operationally down if the point-to-multipoint tunnel is operationally up.
-
The system disables the advertisement of MACs learned on spoke SDPs and SAPs that are part of an EVPN split horizon group.
-
When the SAPs and spoke SDPs (manual or BGP-AD/VPLS-discovered) are configured within the same split horizon group as the EVPN endpoints, MAC addresses are still learned on them, but they are not advertised in EVPN.
-
The preceding statement is also true if proxy-ARP/proxy-ND is enabled and an IP-MAC pair is learned on a SAP or SDP binding that belongs to the EVPN split horizon group.
-
The SAPs or spoke SDPs, or both, added to an EVPN split horizon group should not be part of any EVPN multihomed ES. If that happened, the PE would still advertise the AD per-EVI route for the SAP or spoke SDP, or both, attracting EVPN traffic that could not possibly be forwarded to that SAP or SDP binding, or both.
-
Similar to the preceding statement, a split horizon group composed of SAPs/SDP bindings used in a BGP-MH site should not be configured under bgp-evpn>mpls>split-horizon-group. This misconfiguration would prevent traffic being forwarded from the EVPN to the BGP-MH site, regardless of the DF/NDF state.
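Two of the integration rules above lend themselves to a compact sketch: an SDP binding stays down while an EVPN endpoint exists to the same far-end address, and MACs learned on objects in the EVPN split horizon group are not advertised. Both function names and their arguments are hypothetical and model only the decision, not the router's state machine.

```python
def sdp_binding_oper_up(sdp_far_end, evpn_far_ends):
    """An SDP binding may be up only if no EVPN endpoint uses its far-end.

    sdp_far_end:   far-end address of the SDP binding (string)
    evpn_far_ends: set of far-end addresses with an active EVPN endpoint
    """
    return sdp_far_end not in evpn_far_ends


def advertise_mac_in_evpn(learned_on_shg, evpn_shg):
    """Decide whether a locally learned MAC is advertised in EVPN.

    learned_on_shg: split horizon group of the SAP or spoke SDP where the
                    MAC was learned, or None if it has none
    evpn_shg:       group configured under bgp-evpn mpls, or None when the
                    default (non-referable) group is in use
    """
    in_evpn_shg = evpn_shg is not None and learned_on_shg == evpn_shg
    return not in_evpn_shg
```

An SDP binding terminated on a different address of the same remote PE (for example, an IPv6 address) passes the first check, which matches the case where both links stay up.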
EVPN-VPLS integration shows an example of EVPN-VPLS integration.
An example CLI configuration for PE1, PE5, and PE2 is provided below.
*A:PE1>config>service# info
----------------------------------------------
pw-template 1 create
vpls 1 name "vpls-1" customer 1 create
split-horizon-group "SHG-1" create
bgp
route-target target:65000:1
pw-template-binding 1 split-horizon-group SHG-1
exit
bgp-ad
no shutdown
vpls-id 65000:1
exit
bgp-evpn
evi 1
mpls bgp 1
no shutdown
split-horizon-group SHG-1
exit
spoke-sdp 12:1 create
exit
sap 1/1/1:1 create
exit
*A:PE5>config>service# info
----------------------------------------------
pw-template 1 create
exit
vpls 1 customer 1 create
bgp
route-target target:65000:1
pw-template-binding 1 split-horizon-group SHG-1 # auto-created SHG
exit
bgp-ad
no shutdown
vpls-id 65000:1
exit
spoke-sdp 52:1 create
exit
*A:PE2>config>service# info
----------------------------------------------
vpls 1 name "vpls-1" customer 1 create
end-point CORE create
no suppress-standby-signaling
exit
spoke-sdp 21:1 end-point CORE
precedence primary
exit
spoke-sdp 25:1 end-point CORE
-
PE1, PE3, and PE4 have BGP-EVPN and BGP-AD enabled in VPLS-1. PE5 has BGP-AD enabled and PE2 has active/standby spoke SDPs to PE1 and PE5.
In this configuration:
-
PE1, PE3, and PE4 attempt to establish BGP-AD spoke SDPs, but they are kept operationally down as long as there are EVPN endpoints active among them.
-
BGP-AD spoke SDPs and EVPN endpoints are instantiated within the same split horizon group, for example, SHG-1.
-
Manual spoke SDPs from PE1 and PE5 to PE2 are not part of SHG-1.
-
EVPN MAC advertisements:
-
MACs learned on FEC128 spoke SDPs are advertised normally in EVPN.
-
MACs learned on FEC129 spoke SDPs are not advertised in EVPN (because they are part of SHG-1, which is the split horizon group used for bgp-evpn>mpls). This prevents any data plane MACs learned on the SHG from being advertised in EVPN.
-
BUM operation on PE1:
-
When CE1 sends BUM, PE1 floods to all the active bindings.
-
When CE2 sends BUM, PE2 sends it to PE1 (active spoke SDP) and PE1 floods to all the bindings and SAPs.
-
When CE5 sends BUM, PE5 floods to the three EVPN PEs. PE1 floods to the active spoke SDP and SAPs, never to the EVPN PEs because they are part of the same SHG.
-
-
The operation in services with BGP-VPLS and BGP-EVPN is equivalent to the one described above for BGP-AD and BGP-EVPN.
EVPN single-active multihoming and BGP-VPLS integration
In a VPLS service to which multiple EVPN PEs and BGP-VPLS PEs are attached, single-active multihoming is supported on two or more of the EVPN PEs with no special considerations. All-active multihoming is not supported, because the traffic from the all-active multihomed CE could cause a MAC flip-flop effect on remote BGP-VPLS PEs, asymmetric flows, or other issues.
BGP-VPLS to EVPN integration and single-active MH illustrates a scenario with a single-active Ethernet-segment used in a service where EVPN PEs and BGP-VPLS are integrated.
Although other single-active examples are supported, in BGP-VPLS to EVPN integration and single-active MH, CE1 is connected to the EVPN PEs through a single LAG (lag-1). The LAG is associated with the Ethernet-segment 1 on PE1 and PE2, which is configured as single-active and with oper-group 1. PE1 and PE2 make use of lag>monitor-oper-group 1 so that the non-DF PE can signal the non-DF state to CE1 (in the form of LACP out-of-sync or power-off).
In addition to the BGP-VPLS routes sent for the service ve-id, the multihoming PEs in this case need to generate additional BGP-VPLS routes per Ethernet Segment (per VPLS service) for the purpose of MAC flush on the remote BGP-VPLS PEs in case of failure.
The sap>bgp-vpls-mh-veid number command should be configured on the SAPs that are part of an EVPN single-active Ethernet Segment, and allows the advertisement of L2VPN routes that indicate the state of the multihomed SAPs to the remote BGP-VPLS PEs. Upon a Designated Forwarder (DF) switchover, the F and D bits of the generated L2VPN routes for the SAP ve-id are updated so that the remote BGP-VPLS PEs can perform a mac-flush operation on the service and avoid blackholes.
As an example, in case of a failure on the Ethernet-segment SAP on PE1, PE1 must indicate to PE3 and PE4 the need to flush the MAC addresses learned from PE1 (flush-all-from-me message). Otherwise, for example, PE3 continues sending traffic with MAC DA = CE1 to PE1, and PE1 blackholes the traffic.
In the BGP-VPLS to EVPN integration and single-active MH example:
Both ES peers (PE1 and PE2) should be configured with the same ve-id for the ES SAP. However, this is not mandatory.
In addition to the regular service ve-id L2VPN route, based on the sap>bgp-vpls-mh-ve-id configuration and upon BGP VPLS being enabled, the PE advertises an L2VPN route with the following fields:
ve-id = sap>bgp-vpls-mh-ve-id identifier
RD, RT, next hop and other attributes same as the service BGP VPLS route
L2VPN information extended community with the following flags:
D=0 if the SAP is oper-up or oper-down with a flag MHStandby (for example, the PE is non-DF in single-active MH)
D=0 also if there is an ES oper-group and the port is down because of the oper-group
D=1 if the SAP is oper-down with a different flag (for example, port-down or admin-down)
F (DF bit) =1 if the SAP is oper-up, F=0 otherwise
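The flag rules above can be summarized in a short sketch. This is an illustrative Python function, not the router implementation; the function name and the `down_reason` strings are hypothetical labels chosen to mirror the cases listed in the text.

```python
def l2vpn_mh_flags(sap_oper_up: bool, down_reason: str = "") -> dict:
    """Return the D and F bits for the SAP ve-id L2VPN route.

    down_reason is only read when the SAP is oper-down. The values
    "MHStandby" and "oper-group" are assumed labels for the two cases in
    which the text keeps D=0; any other reason (for example "port-down"
    or "admin-down") yields D=1.
    """
    if sap_oper_up:
        return {"D": 0, "F": 1}  # SAP up: not down, and acting as DF
    d_bit = 0 if down_reason in ("MHStandby", "oper-group") else 1
    return {"D": d_bit, "F": 0}  # F is 0 whenever the SAP is not oper-up


print(l2vpn_mh_flags(True))                # DF PE in steady state
print(l2vpn_mh_flags(False, "MHStandby"))  # non-DF PE in steady state
print(l2vpn_mh_flags(False, "port-down"))  # access failure
```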
Upon a failure on the access SAP, mac-flush messages are triggered only if the bgp-vpls-mh-ve-id command is configured on the failing SAP. If it is configured with ve-id 1:
If the non-DF PE has a failure on the access SAP, PE2 sends an update with ve-id=1/D=1/F=0. This is an indication for PE3/PE4 that PE2's SAP is oper-down but it should not trigger a mac-flush on PE3/PE4.
If the DF PE has a failure on the SAP, PE1 advertises ve-id=1/D=1/F=0. Upon receiving this update, PE3 and PE4 flush all their MACs associated with the spoke SDP to PE1. Note that the failure on PE1 triggers an EVPN DF election on PE2, which becomes DF and advertises ve-id=1/D=0/F=1. This message does not trigger any mac-flush procedures.
Other considerations:
PE3/PE4 are SR OS or any third-party PEs that support the procedures in draft-ietf-bess-vpls-multihoming, so that BGP-VPLS mac-flush signaling is understood.
PE1 and PE2 are expected to run an SR OS version that supports the sap>bgp-vpls-mh-veid number configuration on the multihomed SAPs. Otherwise, the mac-flush behavior would not work as expected.
The procedures described above are also supported if the EVPN PEs use MC-LAG instead of an ES for the CE1 redundancy. In this case, the SAP ve-id route for the standby PE is sent as ve-id=1/D=1/F=0, whereas the active chassis advertises ve-id=1/D=0/F=1. A switchover triggers mac-flush on the remote PEs as described earlier.
The L2VPN routes generated for the ES and SAPs with the sap bgp-vpls-mh-veid number command are decoded in the remote nodes as bgp-mh routes (because they do not have label information) in the show router bgp routes l2-vpn command and debug.
Auto-derived RD in services with multiple BGP families
In a VPLS service, multiple BGP families and protocols can be enabled at the same time. When bgp-evpn is enabled, bgp-ad and bgp-mh are also supported. A single RD is used per service and not per BGP family or protocol.
The following rules apply:
The VPLS RD is selected based on the following precedence:
A manual RD or automatic RD always takes precedence when configured.
If there is no manual or automatic RD configuration, the RD is derived from the bgp-ad>vpls-id.
If manual RD, automatic RD, or VPLS ID are not configured, the RD is derived from the bgp-evpn>evi, except for bgp-mh and except when the EVI is greater than 65535. In these two cases, no EVI-derived RD is possible.
If manual RD, automatic RD, VPLS ID, or EVI is not configured, there is no RD and the service fails.
The selected RD (see preceding rules) is displayed by the Oper Route Dist field of the show service id bgp command.
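The precedence rules can be read as a simple decision chain. The following Python sketch is one interpretation under stated assumptions: the function and parameter names are hypothetical, the function returns only which source supplies the RD (not the RD value itself), and the bgp-mh exception is modeled as a single boolean flag.

```python
from typing import Optional


def select_rd_source(configured_rd: Optional[str],
                     vpls_id: Optional[str],
                     evi: Optional[int],
                     bgp_mh: bool = False) -> Optional[str]:
    """Return the source of the service RD, or None if no RD can be selected.

    configured_rd stands for a manual or automatic RD; vpls_id for the
    bgp-ad>vpls-id; evi for the bgp-evpn>evi. A None result corresponds to
    the case where the service fails.
    """
    if configured_rd is not None:
        return "configured-rd"
    if vpls_id is not None:
        return "vpls-id"
    # An EVI-derived RD is not possible for bgp-mh or for EVIs above 65535.
    if evi is not None and not bgp_mh and evi <= 65535:
        return "evi"
    return None


print(select_rd_source(None, "65000:1", 1))  # the vpls-id wins over the evi
print(select_rd_source(None, None, 70000))   # EVI too large: no RD
```

Because the vpls-id is checked before the evi, adding an evi to a service that already has a vpls-id leaves the selected source unchanged.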
The service supports dynamic RD changes. For example, the CLI allows the VPLS ID to be updated dynamically, even if it is used to automatically derive the service RD for bgp-ad, bgp-vpls, or bgp-mh.
Note: When the RD changes, the active routes for that VPLS are withdrawn and readvertised with the new RD.
If one of the mechanisms to derive the RD for a specified service is removed from the configuration, the system selects a new RD based on the preceding rules. For example, if the VPLS ID is removed from the configuration, the routes are withdrawn, the new RD is selected from the EVI, and the routes are readvertised with the new RD.
Note: This reconfiguration fails if the new RD already exists in a different VPLS or Epipe.
Because the vpls-id takes precedence over the EVI when deriving the RD automatically, adding evpn to an existing bgp-ad service does not impact the existing RD. This is important to support bgp-ad to evpn migration.
EVPN multihoming in VPLS services
The EVPN multihoming implementation is based on the concept of the ethernet-segment. An ethernet-segment is a logical structure that can be defined in one or more PEs and identifies the CE (or access network) multihomed to the EVPN PEs. An ethernet-segment is associated with port, LAG, PW port, or SDP objects and is shared by all the services defined on those objects. In the case of virtual ESs, individual VID or VC-ID ranges can be associated with the port, LAG, PW port, or SDP objects defined in the ethernet-segment.
Each ethernet-segment has a unique Ethernet Segment Identifier (ESI) that is 10 bytes long and is manually configured in the router.
This section describes the behavior of the EVPN multihoming implementation in an EVPN-MPLS service.
EVPN all-active multihoming
As described in RFC 7432, all-active multihoming is only supported on access LAG SAPs and it is mandatory that the CE is configured with a LAG to avoid duplicated packets to the network. Configuring LACP is optional. SR OS also supports the association of a PW port or a normal port to an all-active multihoming ES. When the ES is associated with a physical port and not a LAG, the CE must be configured with a single LAG without LACP.
Three different procedures are implemented in 7750 SR, 7450 ESS, and 7950 XRS SR OS to provide all-active multihoming for a specified Ethernet-Segment:
DF (Designated Forwarder) election
Split-horizon
Aliasing
DF election shows the need for DF election in all-active multihoming.
The DF election in EVPN all-active multihoming avoids duplicate packets on the multihomed CE. The DF election procedure is responsible for electing one DF PE per ESI per service; the rest of the PEs being non-DF for the ESI and service. Only the DF forwards BUM traffic from the EVPN network toward the ES SAPs (the multihomed CE). The non-DF PEs do not forward BUM traffic to the local Ethernet-Segment SAPs.
Split-horizon shows the EVPN split-horizon concept for all-active multihoming.
The EVPN split-horizon procedure ensures that the BUM traffic originated by the multihomed CE and sent from the non-DF to the DF is not replicated back to the CE (echoed packets on the CE). To avoid these echoed packets, the non-DF (PE1) sends all the BUM packets to the DF (PE2) with an indication of the source Ethernet-Segment. That indication is the ESI label (ESI2 in the example), previously signaled by PE2 in the AD per-ESI route for the Ethernet-Segment. When PE2 receives an EVPN packet (after the EVPN label lookup), PE2 finds the ESI label that identifies its local Ethernet-Segment ESI2. The BUM packet is replicated to other local CEs but not to the ESI2 SAP.
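The DF election and split-horizon checks combine into a single egress decision for BUM traffic arriving from the EVPN network. The following is a minimal Python sketch of that decision, not router code; the function and parameter names are hypothetical, and the label value in the usage lines is simply the ES SHG label that appears in the show output later in this section.

```python
from typing import Optional


def forward_bum_to_es_sap(is_df: bool,
                          local_esi_label: int,
                          received_esi_label: Optional[int]) -> bool:
    """Decide whether a BUM frame received from EVPN is sent to the ES SAP.

    received_esi_label is the ESI label found at the bottom of the stack,
    or None when the frame carries no ESI label.
    """
    if not is_df:
        return False  # only the DF forwards EVPN BUM to the multihomed CE
    if received_esi_label == local_esi_label:
        return False  # frame originated on this same ES: avoid echo to CE
    return True


# DF receiving BUM that the ES peer got from the shared CE: dropped.
print(forward_bum_to_es_sap(True, 262142, 262142))
# DF receiving BUM from a remote PE without an ESI label: forwarded.
print(forward_bum_to_es_sap(True, 262142, None))
```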
Aliasing shows the EVPN aliasing concept for all-active multihoming.
Because CE2 is multihomed to PE1 and PE2 using an all-active Ethernet-Segment, 'aliasing' is the procedure by which PE3 can load-balance the known unicast traffic between PE1 and PE2, even if the destination MAC address was only advertised by PE1 as in the example. When PE3 installs MAC1 in the FDB, it associates MAC1 not only with the advertising PE (PE1) but also with all the PEs advertising the same ESI (ESI2) for the service. In this example, PE1 and PE2 advertise an AD per-EVI route for ESI2; therefore, PE3 installs the two next-hops associated with MAC1.
Aliasing is enabled by configuring ECMP greater than 1 in the bgp-evpn>mpls context.
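The aliasing lookup on the remote PE can be sketched as follows. This is an illustrative Python model under stated assumptions, not the SR OS data path: the function name and the `ad_per_evi` dictionary are hypothetical, and both the fallback to the advertising PE and the truncation of the list to the ECMP value are assumptions about behavior that the text does not spell out.

```python
from typing import Dict, List, Optional


def mac_next_hops(advertising_pe: str,
                  esi: Optional[str],
                  ad_per_evi: Dict[str, List[str]],
                  ecmp: int) -> List[str]:
    """Return the next-hop PEs installed for a MAC learned through EVPN.

    ad_per_evi maps an ESI to the PEs that advertised an AD per-EVI route
    for that ESI in this service.
    """
    if esi is None or ecmp <= 1:
        return [advertising_pe]  # no aliasing: use the advertising PE only
    candidates = sorted(set(ad_per_evi.get(esi, [])))
    if not candidates:
        return [advertising_pe]
    return candidates[:ecmp]  # assumed: at most ecmp next-hops are kept


routes = {"ESI2": ["PE1", "PE2"]}
print(mac_next_hops("PE1", "ESI2", routes, ecmp=4))  # aliasing to both PEs
print(mac_next_hops("PE1", "ESI2", routes, ecmp=1))  # aliasing disabled
```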
All-active multihoming service model
The following shows an example PE1 configuration that provides all-active multihoming to the CE2 shown in Aliasing.
*A:PE1>config>lag(1)# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/2
lacp active administrative-key 1 system-id 00:00:00:00:00:22
no shutdown
*A:PE1>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 10.1.1.1:0
ethernet-segment "ESI2" create
esi 01:12:12:12:12:12:12:12:12:12
multi-homing all-active
service-carving
lag 1
no shutdown
*A:PE1>config>redundancy>evpn-multi-homing# info
----------------------------------------------
boot-timer 120
es-activation-timer 10
*A:PE1>config>service>vpls# info
----------------------------------------------
description "evpn-mpls-service with all-active multihoming"
bgp
bgp-evpn
evi 10
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
sap lag-1:1 create
exit
In the same way, PE2 is configured as follows:
*A:PE2>config>lag(1)# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/1
lacp active administrative-key 1 system-id 00:00:00:00:00:22
no shutdown
*A:PE2>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 10.1.1.1:0
ethernet-segment "ESI12" create
esi 01:12:12:12:12:12:12:12:12:12
multi-homing all-active
service-carving
lag 1
no shutdown
*A:PE2>config>redundancy>evpn-multi-homing# info
----------------------------------------------
boot-timer 120
es-activation-timer 10
*A:PE2>config>service>vpls# info
----------------------------------------------
description "evpn-mpls-service with all-active multihoming"
bgp
route-distinguisher 65001:60
route-target target:65000:60
bgp-evpn
evi 10
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
sap lag-1:1 create
exit
The preceding configuration enables the all-active multihoming procedures. The following must be considered:
The ethernet-segment must be configured with a name and a 10-byte esi:
config>service>system>bgp-evpn# ethernet-segment <es_name> create
config>service>system>bgp-evpn>ethernet-segment# esi <value>
When configuring the esi, the system enforces the 6 high-order octets after the type to be different from zero (so that the auto-derived route-target for the ES route is different from zero). Other than that, the entire esi value must be unique in the system.
Only a LAG or a PW port can be associated with the all-active ethernet-segment. This LAG is exclusively used for EVPN multihoming. Other LAG ports in the system can still be used for MC-LAG and other services.
When the LAG is configured on PE1 and PE2, the same admin-key, system-priority, and system-id must be configured on both PEs, so that CE2 responds as though it is connected to the same system.
The same ethernet-segment may be used for EVPN-MPLS, EVPN-VXLAN and PBB-EVPN services.
Note: The source-bmac-lsb attribute must be defined for PBB-EVPN (so that it is only used in PBB-EVPN, and ignored by EVPN). Other than EVPN-MPLS, EVPN-VXLAN and PBB-EVPN I-VPLS/Epipe services, no other Layer 2 services are allowed in the same ethernet-segment (regular VPLS defined on the ethernet-segment is kept operationally down).
Only one SAP per service can be part of the same ethernet-segment.
ES discovery and DF election procedures
The ES discovery and DF election is implemented in three logical steps, as shown in ES discovery and DF election.
Step 1 - ES advertisement and discovery
The ethernet-segment ESI-1 is configured as per the previous section, with all the required parameters. When ethernet-segment no shutdown is executed, PE1 and PE2 advertise an ES route for ESI-1. They both include the route-target auto-derived from the MAC portion of the configured ESI. If the route-target address family is configured in the network, this allows the RR to keep the dissemination of the ES routes under control.
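The relationship between the ESI and the auto-derived route-target can be illustrated with a short sketch. This is a hedged Python example, not the router's implementation: the function names are hypothetical, and it assumes that the route-target is formed from the 6 octets that follow the ESI type octet, which matches the ESI and Exp/Imp Route-Target values displayed for ESI-1 in the show output in this section.

```python
def parse_esi(esi: str) -> bytes:
    """Parse a colon-separated ESI and apply the two checks from the text."""
    octets = bytes(int(part, 16) for part in esi.split(":"))
    if len(octets) != 10:
        raise ValueError("an ESI must be exactly 10 octets long")
    if octets[1:7] == bytes(6):
        # The 6 high-order octets after the type must differ from zero so
        # that the auto-derived route-target is not zero.
        raise ValueError("the 6 octets after the type must not be all zero")
    return octets


def es_route_target(esi: str) -> str:
    """Derive the ES route-target from octets 1 to 6 of the ESI (assumed)."""
    octets = parse_esi(esi)
    return "target:" + ":".join(f"{b:02x}" for b in octets[1:7])


print(es_route_target("01:00:00:00:00:71:00:00:00:01"))
```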
In addition to the ES route, PE1 and PE2 advertise AD per-ESI routes and AD per-EVI routes.
-
AD per-ESI routes announce the Ethernet-Segment capabilities, including the mode (single-active or all-active) as well as the ESI label for split-horizon.
-
AD per-EVI routes are advertised so that PE3 knows what services (EVIs) are associated with the ESI. These routes are used by PE3 for its aliasing and backup procedures.
Step 2 - DF election
When ES routes exchange between PE1 and PE2 is complete, both run the DF election for all the services in the ethernet-segment.
PE1 and PE2 elect a Designated Forwarder (DF) per <ESI, service>. The default DF election mechanism in 7750 SR, 7450 ESS, and 7950 XRS SR OS is service-carving (as per RFC 7432). The following applies when enabled on a specified PE:
-
An ordered list of PE IPs where ESI-1 resides is built. The IPs are obtained from the Origin IP fields of all the ES routes received for ESI-1, as well as the local system address. The lowest IP is considered ordinal '0' in the list.
-
The local IP can only be considered a "candidate" after successful ethernet-segment no shutdown for a specified service.
Note: The remote PE IPs must be present in the local PE's RTM so that they can participate in the DF election.
-
A PE only considers a specified remote IP address as candidate for the DF election algorithm for a specified service if, as well as the ES route, the corresponding AD routes per-ESI and per-EVI for that PE have been received and properly activated.
-
All the remote PEs receiving the AD per-ES routes (for example, PE3), interpret that ESI-1 is all-active if all the PEs send their AD per-ES routes with the single-active bit = 0. Otherwise, if at least one PE sends an AD route per-ESI with the single-active flag set or the local ESI configuration is single-active, the ESI behaves as single-active.
-
An es-activation-timer can be configured at the redundancy>bgp-evpn-multi-homing>es-activation-timer level or at the service>system>bgp-evpn>eth-seg>es-activation-timer level. This timer, which is 3 seconds by default, delays the transition from non-DF to DF for a specified service, after the DF election has run.
-
Using an es-activation-timer different from zero minimizes the risks of loops and packet duplication caused by "transient" multiple DFs.
-
The same es-activation-timer should be configured in all the PEs that are part of the same ESI. It is up to the user to configure either a long timer to minimize the risks of loops/duplication or even es-activation-timer=0 to speed up the convergence for non-DF to DF transitions. When the user configures a specific value, the value configured at ES level supersedes the configured global value.
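Two of the rules above lend themselves to a compact sketch: how the operational multihoming mode is inferred from the AD per-ES routes, and which es-activation-timer value applies. The following Python functions are illustrative only; their names and parameters are hypothetical, and the 3-second default is the value stated in the text.

```python
from typing import Iterable, Optional

DEFAULT_ES_ACTIVATION_TIMER = 3  # seconds, default stated in the text


def oper_multihoming(local_mode: str,
                     single_active_bits: Iterable[int]) -> str:
    """Infer the operational mode of an ESI.

    single_active_bits holds the single-active bit of each AD per-ES route
    received for the ESI. The ESI is all-active only if every bit is 0 and
    the local configuration is not single-active.
    """
    if local_mode == "single-active" or any(single_active_bits):
        return "single-active"
    return "all-active"


def effective_activation_timer(es_level: Optional[int],
                               global_level: Optional[int]) -> int:
    """Return the timer in seconds; the ES-level value supersedes the global."""
    if es_level is not None:
        return es_level
    if global_level is not None:
        return global_level
    return DEFAULT_ES_ACTIVATION_TIMER


print(oper_multihoming("all-active", [0, 1]))
print(effective_activation_timer(es_level=0, global_level=10))
```

Using `is not None` rather than a truthiness test keeps an explicit ES-level value of 0 distinct from an unconfigured timer.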
-
-
The DF election is triggered by the following events:
-
config>service>system>bgp-evpn>eth-seg# no shutdown triggers the DF election for all the services in the ESI.
-
Reception of a new update/withdrawal of an ES route (containing an ESI configured locally) triggers the DF election for all the services in the ESI.
-
Reception of a new update/withdrawal of an AD per-ES route (containing an ESI configured locally) triggers the DF election for all the services associated with the list of route-targets received along with the route.
-
Reception of a new update of an AD per-ES route with a change in the ESI-label extended community (single-active bit or MPLS label) triggers the DF election for all the services associated with the list of route-targets received along with the route.
-
Reception of a new update/withdrawal of an AD route per-EVI (containing an ESI configured locally) triggers the DF election for that service.
-
-
When the PE boots up, the boot-timer allows the necessary time for the control plane protocols to come up before bringing up the Ethernet-Segment and running the DF algorithm. The boot-timer is configured at system level - config>redundancy>bgp-evpn-multi-homing# boot-timer - and should use a value long enough to allow the IOMs and BGP sessions to come up before exchanging ES routes and running the DF election for each EVI/ISID.
-
The system does not advertise ES routes until the boot timer expires. This guarantees that the peer ES PEs do not run the DF election either until the PE is ready to become the DF if it needs to.
-
The following show command displays the configured boot-timer as well as the remaining timer if the system is still in boot-stage.
A:PE1# show redundancy bgp-evpn-multi-homing
===============================================================================
Redundancy BGP EVPN Multi-homing Information
===============================================================================
Boot-Timer : 10 secs
Boot-Timer Remaining : 0 secs
ES Activation Timer : 3 secs
===============================================================================
-
-
When service-carving mode auto is configured (default mode), the DF election algorithm runs the function [V(evi) mod N(peers) = i(ordinal)] to identify the DF for a specified service and ESI, as described in the following example.
As shown in ES discovery and DF election, PE1 and PE2 are configured with ESI-1. Given that V(10) mod N(2) = 0, PE1 is elected DF for VPLS-10 (because its IP address is lower than PE2's and it is the first PE in the candidate list).
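The service-carving computation can be reproduced with a few lines of Python. This is a minimal sketch of the modulo function as described in the text, not the SR OS implementation; the function name is hypothetical, and the two addresses are assumed to be those of PE1 and PE2, taken from the DF Candidate list in the show output in this section.

```python
import ipaddress
from typing import List


def elect_df_auto(evi: int, candidate_ips: List[str]) -> str:
    """Return the DF address for an EVI using V(evi) mod N(peers).

    Candidates are ordered numerically, so the lowest IP is ordinal 0.
    """
    ordered = sorted(set(candidate_ips), key=ipaddress.ip_address)
    ordinal = evi % len(ordered)
    return ordered[ordinal]


candidates = ["192.0.2.72", "192.0.2.69"]
print(elect_df_auto(10, candidates))  # 10 mod 2 = 0: lowest IP is the DF
print(elect_df_auto(1, candidates))   # 1 mod 2 = 1: second IP is the DF
```

The second call is consistent with the show output in this section, where the PE with address 192.0.2.69 reports DF "no" for EVI 1.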
Note: The algorithm takes the configured evi in the service as opposed to the service-id itself. The evi for a service must match in all the PEs that are part of the ESI. This guarantees that the election algorithm is consistent across all the PEs of the ESI. The evi must always be configured in a service with SAPs/SDP bindings that are created in an ES.
-
A manual service-carving option is allowed so that the user can manually configure for which evi identifiers the PE is primary: service-carving mode manual / manual evi <start-evi> to <end-evi>
-
The system is the primary PE, forwarding and multicasting traffic, for the evi identifiers included in the configuration. The PE is secondary (non-DF) for the non-specified evi identifiers.
-
If a range is configured but the service-carving is not mode manual, then the range has no effect.
-
Only two PEs are supported when service-carving mode manual is configured. If a third PE is configured with service-carving mode manual for an ESI, the two non-primary PEs remain non-DF regardless of the primary status.
-
For example, as shown in ES discovery and DF election: if PE1 is configured with service-carving manual evi 1 to 100 and PE2 with service-carving manual evi 101 to 200, then PE1 is the primary PE for service VPLS 10 and PE2 the secondary PE.
-
-
When service-carving is disabled, the lowest originator IP wins the election for a specified service and ESI:
config>service>system>bgp-evpn>eth-seg>service-carving> mode off
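The manual and off modes can be sketched in the same spirit. The following Python functions are illustrative, with hypothetical names; the manual check only answers whether a PE is primary for an evi given its configured ranges, and does not model the interaction between peers.

```python
import ipaddress
from typing import List, Tuple


def is_primary_manual(evi: int, ranges: List[Tuple[int, int]]) -> bool:
    """Return True if the evi falls inside one of the configured ranges.

    Each tuple is an inclusive (start-evi, end-evi) pair, mirroring the
    manual evi <start-evi> to <end-evi> syntax.
    """
    return any(start <= evi <= end for start, end in ranges)


def elect_df_off(candidate_ips: List[str]) -> str:
    """With service-carving off, the lowest originator IP wins."""
    return min(candidate_ips, key=ipaddress.ip_address)


# PE1 carries evi 1 to 100 and PE2 carries evi 101 to 200, as in the example.
print(is_primary_manual(10, [(1, 100)]))    # PE1 is primary for VPLS 10
print(is_primary_manual(10, [(101, 200)]))  # PE2 is secondary for VPLS 10
print(elect_df_off(["192.0.2.72", "192.0.2.69"]))
```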
The following show command displays the ethernet-segment configuration and DF status for all the EVIs and ISIDs (if PBB-EVPN is enabled) configured in the ethernet-segment.
*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-1" all
===============================================================================
Service Ethernet Segment
===============================================================================
Name : ESI-1
Admin State : Up Oper State : Up
ESI : 01:00:00:00:00:71:00:00:00:01
Multi-homing : allActive Oper Multi-homing : allActive
Source BMAC LSB : 71-71
ES BMac Tbl Size : 8 ES BMac Entries : 1
Lag Id : 1
ES Activation Timer : 0 secs
Exp/Imp Route-Target : target:00:00:00:00:71:00
Svc Carving : auto
ES SHG Label : 262142
===============================================================================
===============================================================================
EVI Information
===============================================================================
EVI SvcId Actv Timer Rem DF
-------------------------------------------------------------------------------
1 1 0 no
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
-------------------------------------------------------------------------------
DF Candidate list
-------------------------------------------------------------------------------
EVI DF Address
-------------------------------------------------------------------------------
1 192.0.2.69
1 192.0.2.72
-------------------------------------------------------------------------------
Number of entries: 2
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
===============================================================================
ISID Information
===============================================================================
ISID SvcId Actv Timer Rem DF
-------------------------------------------------------------------------------
20001 20001 0 no
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
-------------------------------------------------------------------------------
DF Candidate list
-------------------------------------------------------------------------------
ISID DF Address
-------------------------------------------------------------------------------
20001 192.0.2.69
20001 192.0.2.72
-------------------------------------------------------------------------------
Number of entries: 2
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
===============================================================================
BMAC Information
===============================================================================
SvcId BMacAddress
-------------------------------------------------------------------------------
20000 00:00:00:00:71:71
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
Step 3 - DF and non-DF service behavior
Based on the result of the DF election or the manual service-carving, the control plane on the non-DF (PE1) instructs the data path to remove the LAG SAP (associated with the ESI) from the default flooding list for BM traffic (unknown unicast traffic may still be sent if the EVI label is a unicast label and the source MAC address is not associated with the ESI). On PE1 and PE2, both LAG SAPs learn the same MAC address (coming from the CE). For instance, in the following show commands, 00:ca:ca:ba:ce:03 is learned on both the PE1 and PE2 access LAG (on ESI-1). However, PE1 learns the MAC as 'Learned' whereas PE2 learns it as 'Evpn'. This is because CE2 hashes the traffic for that source MAC to PE1. PE2 learns the MAC through EVPN but associates the MAC with the ESI SAP, because the MAC belongs to the ESI.
*A:PE1# show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:ca:ca:ba:ce:03 sap:lag-1:1 L/0 06/11/15 00:14:47
1 00:ca:fe:ca:fe:70 eMpls: EvpnS 06/11/15 00:09:06
192.0.2.70:262140
1 00:ca:fe:ca:fe:72 eMpls: EvpnS 06/11/15 00:09:39
192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
*A:PE2# show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:ca:ca:ba:ce:03 sap:lag-1:1 Evpn 06/11/15 00:14:47
1 00:ca:fe:ca:fe:69 eMpls: EvpnS 06/11/15 00:09:40
192.0.2.69:262141
1 00:ca:fe:ca:fe:70 eMpls: EvpnS 06/11/15 00:09:40
192.0.2.70:262140
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
When PE1 (non-DF) and PE2 (DF) exchange BUM packets for evi 1, all those packets are sent including the ESI label at the bottom of the stack (in both directions). The ESI label advertised by each PE for ESI-1 can be displayed by the following command:
*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-1"
===============================================================================
Service Ethernet Segment
===============================================================================
Name : ESI-1
Admin State : Up Oper State : Up
ESI : 01:00:00:00:00:71:00:00:00:01
Multi-homing : allActive Oper Multi-homing : allActive
Source BMAC LSB : 71-71
ES BMac Tbl Size : 8 ES BMac Entries : 1
Lag Id : 1
ES Activation Timer : 0 secs
Exp/Imp Route-Target : target:00:00:00:00:71:00
Svc Carving : auto
ES SHG Label : 262142
===============================================================================
*A:PE2# show service system bgp-evpn ethernet-segment name "ESI-1"
===============================================================================
Service Ethernet Segment
===============================================================================
Name : ESI-1
Admin State : Up Oper State : Up
ESI : 01:00:00:00:00:71:00:00:00:01
Multi-homing : allActive Oper Multi-homing : allActive
Source BMAC LSB : 71-71
ES BMac Tbl Size : 8 ES BMac Entries : 0
Lag Id : 1
ES Activation Timer : 20 secs
Exp/Imp Route-Target : target:00:00:00:00:71:00
Svc Carving : auto
ES SHG Label : 262142
===============================================================================
Aliasing
Following the example in ES discovery and DF election, if the service configuration on PE3 has ECMP > 1, PE3 adds PE1 and PE2 to the list of next-hops for ESI-1. As soon as PE3 receives a MAC for ESI-1, it starts load-balancing the flows to the remote ESI CE between PE1 and PE2. The following command shows the FDB in PE3.
*A:PE3# show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:ca:ca:ba:ce:03 eES: Evpn 06/11/15 00:14:47
01:00:00:00:00:71:00:00:00:01
1 00:ca:fe:ca:fe:69 eMpls: EvpnS 06/11/15 00:09:18
192.0.2.69:262141
1 00:ca:fe:ca:fe:70 eMpls: EvpnS 06/11/15 00:09:18
192.0.2.70:262140
1 00:ca:fe:ca:fe:72 eMpls: EvpnS 06/11/15 00:09:39
192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 4
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
The following command shows all the EVPN-MPLS destination bindings on PE3, including the ES destination bindings.
The Ethernet-Segment eES:01:00:00:00:00:71:00:00:00:01 is resolved to PE1 and PE2 addresses:
*A:PE3# show service id 1 evpn-mpls
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address Egr Label Num. MACs Mcast Last Change
Transport
-------------------------------------------------------------------------------
192.0.2.69 262140 0 Yes 06/10/2015 14:33:30
ldp
192.0.2.69 262141 1 No 06/10/2015 14:33:30
ldp
192.0.2.70 262139 0 Yes 06/10/2015 14:33:30
ldp
192.0.2.70 262140 1 No 06/10/2015 14:33:30
ldp
192.0.2.72 262140 0 Yes 06/10/2015 14:33:30
ldp
192.0.2.72 262141 1 No 06/10/2015 14:33:30
ldp
192.0.2.73 262139 0 Yes 06/10/2015 14:33:30
ldp
192.0.2.254 262142 0 Yes 06/10/2015 14:33:30
bgp
-------------------------------------------------------------------------------
Number of entries : 8
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-MPLS Ethernet Segment Dest
===============================================================================
Eth SegId TEP Address Egr Label Last Change
Transport
-------------------------------------------------------------------------------
01:00:00:00:00:71:00:00:00:01 192.0.2.69 262141 06/10/2015 14:33:30
ldp
01:00:00:00:00:71:00:00:00:01 192.0.2.72 262141 06/10/2015 14:33:30
ldp
01:74:13:00:74:13:00:00:74:13 192.0.2.73 262140 06/10/2015 14:33:30
ldp
-------------------------------------------------------------------------------
Number of entries : 3
-------------------------------------------------------------------------------
===============================================================================
PE3 performs aliasing for all the MACs associated with that ESI. This is possible because PE3 is configured with an ECMP parameter greater than 1:
*A:PE3>config>service>vpls# info
----------------------------------------------
bgp
exit
bgp-evpn
evi 1
mpls bgp 1
ecmp 4
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
proxy-arp
shutdown
exit
stp
shutdown
exit
sap 1/1/1:2 create
exit
no shutdown
Network failures and convergence for all-active multihoming
All-active multihoming ES failure shows the behavior on the remote PEs (PE3) when there is an ethernet-segment failure.
The unicast traffic behavior on PE3 is as follows:
1. PE3 forwards MAC DA = CE2 to both PE1 and PE2 when the MAC advertisement route came from PE1 (or PE2) and the set of Ethernet AD per-ES routes and Ethernet AD per-EVI routes from PE1 and PE2 are active at PE3.
2. If there is a failure between CE2 and PE2, PE2 withdraws its set of Ethernet AD and ES routes, and PE3 then forwards traffic destined for CE2 to PE1 only. PE3 does not need to wait for the withdrawal of the individual MAC. The same behavior is followed if the failure is at PE1.
3. If, after step 2, PE2 withdraws its MAC advertisement route, PE3 treats traffic to MAC DA = CE2 as unknown unicast, unless the MAC was previously advertised by PE1.
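The three steps above can be sketched as a small model of the state that a remote PE keeps for one <ESI, EVI>. This is an illustrative sketch only, not the SR OS implementation; the class and function names (`RemoteEsState`, `unicast_next_hops`) are hypothetical, and the model ignores labels, ECMP limits, and timers.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteEsState:
    """Routes a remote PE (PE3 in the text) holds for one <ESI, EVI>."""
    ad_per_es: set = field(default_factory=set)     # PEs with an active AD per-ES route
    ad_per_evi: set = field(default_factory=set)    # PEs with an active AD per-EVI route
    mac_routes: dict = field(default_factory=dict)  # MAC -> set of advertising PEs


def unicast_next_hops(state, mac):
    """Return the PEs a frame for `mac` may be sent to.

    An empty list means the MAC has no MAC route (unknown unicast), or that
    no AD routes remain, a case the text does not describe.
    """
    if not state.mac_routes.get(mac):
        return []
    # Aliasing: every PE with both AD routes is a valid next hop, whether
    # or not that PE advertised the MAC itself.
    return sorted(state.ad_per_es & state.ad_per_evi)


state = RemoteEsState({"PE1", "PE2"}, {"PE1", "PE2"}, {"CE2": {"PE2"}})
print(unicast_next_hops(state, "CE2"))   # step 1: both PEs are next hops
state.ad_per_es.discard("PE2")
state.ad_per_evi.discard("PE2")
print(unicast_next_hops(state, "CE2"))   # step 2: PE1 only
state.mac_routes["CE2"].discard("PE2")
print(unicast_next_hops(state, "CE2"))   # step 3: unknown unicast
```

Keeping the AD routes and the MAC routes in separate structures mirrors the reason PE3 can react to the AD withdrawal without waiting for each MAC route to be withdrawn.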
For BUM traffic, the following events trigger a DF election on a PE. Only the DF forwards BUM traffic, and it does so after the es-activation-timer expires (if there was a transition from non-DF to DF).
- Reception of ES route update (local ES shutdown/no shutdown or remote route)
- New AD-ES route update/withdraw
- New AD-EVI route update/withdraw
- Local ES port/SAP/service shutdown
- Service carving range change (affecting the evi)
- Multihoming mode change (single-active to all-active, or all-active to single-active)
Logical failures on ESs and blackholes
Be aware of the effects triggered by specific 'failure scenarios'; some of these scenarios are shown in Blackhole caused by SAP/SVC shutdown:
If an individual VPLS service is shut down in PE1 (the example is also valid for PE2), the corresponding LAG SAP goes operationally down. This event triggers the withdrawal of the AD per-EVI route for that particular SAP. PE3 removes PE1 from its list of aliased next-hops and PE2 takes over as DF (if it was not the DF already). However, this does not prevent the network from black-holing the traffic that CE2 'hashes' to the link to PE1. Traffic sent from CE2 to PE2 or traffic from the rest of the CEs to CE2 is not affected, so this situation is not easily detected on the CE.
The same result occurs if the ES SAP is administratively shut down instead of the service.
Transient issues caused by MAC route delays
Some situations may cause transient issues. These are shown in Transient issues caused by ‟slow” MAC learning and described below.
Transient packet duplication caused by delay in PE3 to learn MAC1:
This scenario is illustrated by the diagram on the left in Transient issues caused by ‟slow” MAC learning. In an all-active multihoming scenario, if a specified MAC address (for example, MAC1) is not yet learned in a remote PE (for example, PE3), but is known in the two PEs of the ES (for example, PE1 and PE2), the latter PEs may send duplicated packets to the CE.
This issue is solved by the use of ingress-replication-bum-label in PE1 and PE2. When it is configured, PE1 and PE2 know that the received packet is an unknown unicast packet; therefore, the NDF (PE1) does not send the packet to the CE and there is no duplication.
Transient blackhole caused by delay in PE1 to learn MAC1:
This case is illustrated by the diagram on the right in Transient issues caused by ‟slow” MAC learning. In an all-active multihoming scenario, MAC1 is known in PE3 and aliasing is applied to MAC1. However, MAC1 is not known yet in PE1, the NDF for the ES. If PE3 hashing picks up PE1 as the destination of the aliased MAC1, the packets are blackholed. This case is solved on the NDF by not blocking unknown unicast traffic that arrives with a unicast label. If PE1 and PE2 are configured using ingress-replication-bum-label, PE3 sends unknown unicast with a BUM label and known unicast with a unicast label. In the latter case, PE1 considers it is safe to forward the frame to the CE, even if it is unknown unicast. It is important to note that this is a transient issue and as soon as PE1 learns MAC1 the frames are forwarded as known unicast.
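Both transient cases come down to how the NDF decides whether to send a received frame to the CE. The following sketch models that decision under stated assumptions: the function name and parameters are hypothetical, and the behavior without ingress-replication-bum-label (the NDF relying only on its own FDB lookup) is inferred from the two scenarios above rather than stated by the source.

```python
def ndf_forwards_to_ce(mac_known, bum_label_configured, arrived_with_bum_label=False):
    """Decide whether the non-DF PE sends a received EVPN frame to the CE.

    mac_known: the destination MAC is present in the NDF's FDB.
    bum_label_configured: ingress-replication-bum-label is configured, so
        BUM and known unicast frames arrive with different labels.
    arrived_with_bum_label: the frame carried the BUM label (only
        meaningful when bum_label_configured is True).
    """
    if bum_label_configured:
        # The label tells the NDF how the remote PE classified the frame.
        # BUM-labelled frames are blocked; unicast-labelled frames are
        # forwarded even if the MAC is not yet learned locally.
        return not arrived_with_bum_label
    # Assumption: with a single shared label the NDF can rely only on its
    # own lookup, so known MACs are forwarded and unknown MACs are treated
    # as BUM and blocked.
    return mac_known


# Duplication case: PE3 floods MAC1, the NDF already knows MAC1.
print(ndf_forwards_to_ce(mac_known=True, bum_label_configured=False))
print(ndf_forwards_to_ce(mac_known=True, bum_label_configured=True,
                         arrived_with_bum_label=True))
# Blackhole case: PE3 aliases MAC1 to the NDF, which has not learned MAC1.
print(ndf_forwards_to_ce(mac_known=False, bum_label_configured=False))
print(ndf_forwards_to_ce(mac_known=False, bum_label_configured=True))
```

In this model the separate BUM label removes both transient issues, because the NDF no longer has to guess the sender's classification from its own FDB state.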
EVPN single-active multihoming
The 7750 SR, 7450 ESS, and 7950 XRS SR OS supports single-active multihoming on access LAG SAPs, regular SAPs, and spoke SDPs for a specified VPLS service.
The following SR OS procedures support EVPN single-active multihoming for a specified Ethernet-Segment:
DF (Designated Forwarder) election
As in all-active multihoming, DF election in single-active multihoming determines the forwarding for BUM traffic from the EVPN network to the Ethernet-Segment CE. In single-active multihoming, DF election also determines the forwarding of any traffic (unicast/BUM) in any direction (to/from the CE).
backup PE
In single-active multihoming, the remote PEs do not perform aliasing to the PEs in the Ethernet-Segment. The remote PEs identify the DF based on the MAC routes, send the unicast flows for the Ethernet-Segment to the DF PE, and program a backup PE as an alternative next-hop for the remote ESI in case of failure.
This RFC 7432 procedure is known as 'Backup PE' and is shown in Backup PE for PE3.
Single-active multihoming service model
The following shows an example of PE1 configuration that provides single-active multihoming to CE2, as shown in Backup PE.
*A:PE1>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 10.1.1.1:0
ethernet-segment "ESI2" create
esi 01:12:12:12:12:12:12:12:12:12
multi-homing single-active
service-carving
sdp 1
no shutdown
*A:PE1>config>redundancy>evpn-multi-homing# info
----------------------------------------------
boot-timer 120
es-activation-timer 10
*A:PE1>config>service>vpls# info
----------------------------------------------
description "evpn-mpls-service with single-active multihoming"
bgp
bgp-evpn
evi 10
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
spoke-sdp 1:1 create
exit
The PE2 example configuration for this scenario is as follows:
*A:PE2>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 10.1.1.1:0
ethernet-segment "ESI2" create
esi 01:12:12:12:12:12:12:12:12:12
multi-homing single-active
service-carving
sdp 2
no shutdown
*A:PE2>config>redundancy>evpn-multi-homing# info
----------------------------------------------
boot-timer 120
es-activation-timer 10
*A:PE2>config>service>vpls# info
----------------------------------------------
description "evpn-mpls-service with single-active multihoming"
bgp
bgp-evpn
evi 10
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
spoke-sdp 2:1 create
exit
In single-active multihoming, the non-DF PEs for a specified ESI block unicast and BUM traffic in both directions (upstream and downstream) on the object associated with the ESI. Other than that, single-active multihoming is similar to all-active multihoming with the following differences:
The ethernet-segment is configured for single-active: service>system>bgp-evpn>eth-seg>multi-homing single-active.
The advertisement of the ESI label in an AD per-ESI route is optional for single-active Ethernet-Segments. The user can suppress the advertisement of the ESI label by using the service system bgp-evpn eth-seg multi-homing single-active no-esi-label command. By default, the ESI label is used for single-active ESs too.
For single-active multihoming, the Ethernet-Segment can be associated with a port and sdp, as well as a lag-id, as shown in Backup PE, where:
port is used for single-active SAP redundancy without the need for lag.
sdp is used for single-active spoke SDP redundancy.
lag is used for single-active LAG redundancy.
Note: In this case, key, system-id, and system-priority must be different on the PEs that are part of the Ethernet-Segment.
For single-active multihoming, when the PE is non-DF for the service, the SAPs/spoke SDPs on the Ethernet-Segment are down and show StandbyForMHProtocol as the reason.
From a service perspective, single-active multihoming can provide redundancy to CEs (MHD, Multi-Homed Devices) or networks (MHN, Multi-Homed Networks) with the following setup:
LAG with or without LACP
In this case, the multihomed ports on the CE are part of the different LAGs (a LAG per multihomed PE is used in the CE). The non-DF PE for each service can signal that the SAP is operationally down if eth-cfm fault-propagation-enable {use-if-tlv | suspend-ccm} is configured.
regular Ethernet 802.1q/ad ports
In this case, the multihomed ports on the CE/network are not part of any LAG. Eth-cfm can also be used for non-DF indication to the multihomed device/network.
active-standby PWs
In this case, the multihomed CE/network is connected to the PEs through an MPLS network and an active/standby spoke SDP per service. The non-DF PE for each service makes use of the LDP PW status bits to signal that the spoke SDP is operationally down on the PE side.
ES and DF election procedures
In all-active multihoming, the non-DF keeps the SAP up, although it removes it from the default flooding list. In the single-active multihoming implementation the non-DF brings the SAP or SDP binding operationally down. See ES discovery and DF election procedures.
The following show commands display the status of the single-active ESI-7413 in the non-DF. The associated spoke SDP is operationally down and it signals PW Status standby to the multihomed CE:
*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-7413"
===============================================================================
Service Ethernet Segment
===============================================================================
Name : ESI-7413
Admin State : Up Oper State : Up
ESI : 01:74:13:00:74:13:00:00:74:13
Multi-homing : singleActive Oper Multi-homing : singleActive
Source BMAC LSB : <none>
Sdp Id : 4
ES Activation Timer : 0 secs
Exp/Imp Route-Target : target:74:13:00:74:13:00
Svc Carving : auto
ES SHG Label : 262141
===============================================================================
*A:PE1# show service system bgp-evpn ethernet-segment name "ESI-7413" evi 1
===============================================================================
EVI DF and Candidate List
===============================================================================
EVI SvcId Actv Timer Rem DF DF Last Change
-------------------------------------------------------------------------------
1 1 0 no 06/11/2015 20:05:32
===============================================================================
===============================================================================
DF Candidates Time Added
-------------------------------------------------------------------------------
192.0.2.70 06/11/2015 20:05:20
192.0.2.73 06/11/2015 20:05:32
-------------------------------------------------------------------------------
Number of entries: 2
===============================================================================
*A:PE1# show service id 1 base
===============================================================================
Service Basic Information
===============================================================================
Service Id : 1 Vpn Id : 0
Service Type : VPLS
Name : (Not Specified)
Description : (Not Specified)
<snip>
-------------------------------------------------------------------------------
Service Access & Destination Points
-------------------------------------------------------------------------------
Identifier Type AdmMTU OprMTU Adm Opr
-------------------------------------------------------------------------------
sap:1/1/1:1 q-tag 9000 9000 Up Up
sdp:4:13 S(192.0.2.74) Spok 0 8978 Up Down
===============================================================================
* indicates that the corresponding row element may have been truncated.
*A:PE1# show service id 1 all | match Pw
Local Pw Bits : pwFwdingStandby
Peer Pw Bits : None
*A:PE1# show service id 1 all | match Flag
Flags : StandbyForMHProtocol
Flags : None
Backup PE function
A remote PE (PE3 in Backup PE) imports the AD routes per ESI, where the single-active flag is set. PE3 interprets that the Ethernet-Segment is single-active if at least one PE sends an AD route per-ESI with the single-active flag set. MACs for a specified service and ESI are learned from a single PE, that is, the DF for that <ESI, EVI>.
The remote PE installs a single EVPN-MPLS destination (TEP, label) for a received MAC address and a backup next-hop to the PE for which the AD routes per-ESI and per-EVI are received. For instance, in the following command, 00:ca:ca:ba:ca:06 is associated with the remote ethernet-segment eES 01:74:13:00:74:13:00:00:74:13. That eES is resolved to PE (192.0.2.73), which is the DF on the ES.
*A:PE3# show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:ca:ca:ba:ca:02 sap:1/1/1:2 L/0 06/12/15 00:33:39
1 00:ca:ca:ba:ca:06 eES: Evpn 06/12/15 00:33:39
01:74:13:00:74:13:00:00:74:13
1 00:ca:fe:ca:fe:69 eMpls: EvpnS 06/11/15 21:53:47
192.0.2.69:262118
1 00:ca:fe:ca:fe:70 eMpls: EvpnS 06/11/15 19:59:57
192.0.2.70:262140
1 00:ca:fe:ca:fe:72 eMpls: EvpnS 06/11/15 19:59:57
192.0.2.72:262141
-------------------------------------------------------------------------------
No. of MAC Entries: 5
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
*A:PE3# show service id 1 evpn-mpls
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address Egr Label Num. MACs Mcast Last Change
Transport
-------------------------------------------------------------------------------
192.0.2.69 262118 1 Yes 06/11/2015 19:59:03
ldp
192.0.2.70 262139 0 Yes 06/11/2015 19:59:03
ldp
192.0.2.70 262140 1 No 06/11/2015 19:59:03
ldp
192.0.2.72 262140 0 Yes 06/11/2015 19:59:03
ldp
192.0.2.72 262141 1 No 06/11/2015 19:59:03
ldp
192.0.2.73 262139 0 Yes 06/11/2015 19:59:03
ldp
192.0.2.254 262142 0 Yes 06/11/2015 19:59:03
bgp
-------------------------------------------------------------------------------
Number of entries : 7
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-MPLS Ethernet Segment Dest
===============================================================================
Eth SegId TEP Address Egr Label Last Change
Transport
-------------------------------------------------------------------------------
01:74:13:00:74:13:00:00:74:13 192.0.2.73 262140 06/11/2015 19:59:03
ldp
-------------------------------------------------------------------------------
Number of entries : 1
-------------------------------------------------------------------------------
===============================================================================
If PE3 sees only two single-active PEs in the same ESI, the second PE is the backup PE. Upon receiving an AD per-ES/per-EVI route withdrawal for the ESI from the primary PE, PE3 starts sending the unicast traffic to the backup PE immediately.
If PE3 receives AD routes for the same ESI and EVI from more than two PEs, the PE does not install any backup route in the data path. Upon receiving an AD per-ES/per-EVI route withdrawal for the ESI, it flushes the MACs associated with the ESI.
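The backup PE selection described above can be summarized in a short sketch. It is a simplified model under stated assumptions: the function names are hypothetical, PEs are identified by plain strings, and the DF is taken to be the PE that advertised the MAC routes.

```python
def resolve_es_destinations(ad_route_pes, df_pe):
    """Return (primary, backup) for a single-active ES as seen by a remote PE.

    ad_route_pes: PEs from which AD per-ES and per-EVI routes were received.
    df_pe: the PE that advertised the MAC routes (the DF for the <ESI, EVI>).
    """
    others = sorted(pe for pe in ad_route_pes if pe != df_pe)
    # A backup is installed only when exactly two PEs share the ESI.
    backup = others[0] if len(ad_route_pes) == 2 and len(others) == 1 else None
    return df_pe, backup


def on_primary_ad_withdrawal(backup):
    """Action taken when the primary PE withdraws its AD route for the ESI."""
    if backup is not None:
        return ("switch-to-backup", backup)
    # With more than two PEs there is no backup route; the MACs are flushed.
    return ("flush-macs", None)


primary, backup = resolve_es_destinations({"192.0.2.70", "192.0.2.73"}, "192.0.2.73")
print(primary, backup)
print(on_primary_ad_withdrawal(backup))
```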
Network failures and convergence for single-active multihoming
Single-active multihoming ES failure shows the remote PE (PE3) behavior when there is an Ethernet-Segment failure.
The PE3 behavior for unicast traffic is as follows:
1. PE3 forwards MAC DA = CE2 to PE2 when the MAC advertisement route came from PE2 and the set of Ethernet AD per-ES routes and Ethernet AD per-EVI routes from PE1 and PE2 are active at PE3.
2. If there is a failure between CE2 and PE2, PE2 withdraws its set of Ethernet AD and ES routes, and PE3 immediately forwards the traffic destined for CE2 to PE1 only (the backup PE). PE3 does not need to wait for the withdrawal of the individual MAC.
3. If, after step 2, PE2 withdraws its MAC advertisement route, PE3 treats traffic to MAC DA = CE2 as unknown unicast, unless the MAC has been previously advertised by PE1.
Also, a DF election on PE1 is triggered. In general, a DF election is triggered by the same events as for all-active multihoming. In this case, the DF forwards traffic to CE2 when the es-activation-timer expires (the timer starts when there is a transition from non-DF to DF).
EVPN ESI type 1 support
According to RFC 7432, specific Ethernet Segment Identifier (ESI) types support auto-derivation and the 10-byte ESI value does not need to be configured. SR OS supports the manual configuration of 10-byte ESI for the Ethernet segment, or alternatively, the auto-derivation of EVPN type 1 ESIs.
The auto-esi {none|type-1} command is supported in the Ethernet segment configuration. The default mode is none and it forces the user to configure a manual ESI. When type-1 is configured, a manual ESI cannot be configured in the ES and the ESI is auto-derived in accordance with the RFC 7432 ESI type 1 definition. An ESI type 1 encodes 0x01 in the ESI type octet (T=0x01) and indicates that IEEE 802.1AX LACP is used between the PEs and CEs.
The ESI is auto-derived from the CE's LACP PDUs by concatenating the following parameters:
CE LACP system MAC address (6 octets)
The CE LACP system MAC address is encoded in the high-order 6 octets of the ESI value field.
CE LACP port key (2 octets)
The CE LACP port key is encoded in the 2 octets next to the system MAC address.
The remaining octet is set to 0x00.
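The concatenation above can be illustrated with a short sketch. The function name is hypothetical, the example MAC address and port key are made-up values, and the port key is assumed to be encoded in network byte order (big-endian).

```python
def derive_esi_type1(ce_system_mac, ce_port_key):
    """Build a 10-octet type 1 ESI from the CE's LACP parameters.

    ce_system_mac: CE LACP system MAC, for example "00:ca:fe:00:00:01".
    ce_port_key: CE LACP port key (0..65535).
    """
    mac = bytes.fromhex(ce_system_mac.replace(":", "").replace("-", ""))
    if len(mac) != 6:
        raise ValueError("system MAC must be 6 octets")
    if not 0 <= ce_port_key <= 0xFFFF:
        raise ValueError("port key must fit in 2 octets")
    # Type octet (0x01) + system MAC + port key + one zero octet.
    esi = bytes([0x01]) + mac + ce_port_key.to_bytes(2, "big") + b"\x00"
    return ":".join(f"{octet:02x}" for octet in esi)


print(derive_esi_type1("00:ca:fe:00:00:01", 32768))
# 01:00:ca:fe:00:00:01:80:00:00
```

Because both inputs come from the CE's LACP PDUs, two PEs attached to the same CE LAG derive the same value without any ESI being configured.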
The following usage considerations apply to auto-ESI type 1:
ESI type 1 is only supported on non-virtual Ethernet segments associated with LAGs when LACP is enabled.
Single-active or all-active modes are supported. When used in single-active mode, the CE must be attached to the PEs by a single LAG, which allows the multihomed PEs to auto-derive the same ESI.
Changing the auto-esi command requires an ES shutdown.
When the ES is enabled but the ESI has not yet been auto-derived, no multihoming routes are advertised for the ES. ES and AD routes are advertised only after ESI type 1 is auto-derived and the ES is enabled.
When the ES LAG is operationally down as a result of the ports or LACP going down, the previously auto-derived ESI is retained. Consequently, convergence is not impacted when the LAG comes back up; if the CE's LACP information is changed, the ES goes down and a new auto-derived type 1 ESI is generated.
P2MP mLDP tunnels for BUM traffic in EVPN-MPLS services
P2MP mLDP tunnels for BUM traffic in EVPN-MPLS services are supported and enabled through the use of the provider-tunnel context. If EVPN-MPLS takes ownership over the provider-tunnel, bgp-ad is still supported in the service but it does not generate BGP updates, including the PMSI Tunnel Attribute. The following CLI example shows an EVPN-MPLS service that uses P2MP mLDP LSPs for BUM traffic.
*A:PE-1>config>service>vpls(vpls or b-vpls)# info
----------------------------------------------
description "evpn-mpls-service with p2mp mLDP"
bgp-evpn
evi 10
no ingress-repl-inc-mcast-advertisement
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
exit
provider-tunnel
inclusive
owner bgp-evpn-mpls
root-and-leaf
mldp
no shutdown
exit
exit
sap 1/1/1:1 create
exit
spoke-sdp 1:1 create
exit
When provider-tunnel inclusive is used in EVPN-MPLS services, the following commands can be used in the same way as for BGP-AD or BGP-VPLS services:
data-delay-interval
root-and-leaf
mldp
shutdown
The following commands are used by provider-tunnel in BGP-EVPN MPLS services:
[no] ingress-repl-inc-mcast-advertisement
This command allows you to control the advertisement of IMET-IR and IMET-P2MP-IR routes for the service. See BGP-EVPN control plane for MPLS tunnels for a description of the IMET routes. The following considerations apply:
If configured as no ingress-repl-inc-mcast-advertisement, the system does not send the IMET-IR or IMET-P2MP-IR routes, regardless of the service being enabled for BGP-EVPN MPLS or BGP-EVPN VXLAN.
If configured as ingress-repl-inc-mcast-advertisement and the PE is root-and-leaf, the system sends an IMET-P2MP-IR route.
If configured as ingress-repl-inc-mcast-advertisement and the PE is configured as no root-and-leaf, the system sends an IMET-IR route.
Default value is ingress-repl-inc-mcast-advertisement.
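The considerations above amount to a two-input decision, sketched below. The function name is hypothetical. The last branch, in which a root-and-leaf PE configured with no ingress-repl-inc-mcast-advertisement still advertises a plain IMET-P2MP route, is an assumption based on the example configuration earlier in this section and is not stated explicitly in the list.

```python
def imet_route_type(ingress_repl_inc_mcast_advertisement, root_and_leaf):
    """Return the IMET route type a PE advertises, or None if none is sent."""
    if ingress_repl_inc_mcast_advertisement:
        # The default setting: composite route on a root, IR route on a leaf.
        return "IMET-P2MP-IR" if root_and_leaf else "IMET-IR"
    # no ingress-repl-inc-mcast-advertisement suppresses IMET-IR and
    # IMET-P2MP-IR. Assumption: a root-and-leaf PE still sends IMET-P2MP.
    return "IMET-P2MP" if root_and_leaf else None


for adv in (True, False):
    for root in (True, False):
        print(adv, root, imet_route_type(adv, root))
```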
[no] owner {bgp-ad | bgp-vpls | bgp-evpn-mpls}
The owner of the provider tunnel must be configured. The default value is no owner. The following considerations apply:
Only one of the protocols supports a provider tunnel in the service and it must be explicitly configured.
bgp-vpls and bgp-evpn are mutually exclusive.
While bgp-ad and bgp-evpn can coexist in the same service, only bgp-evpn can be the provider-tunnel owner in such cases.
EVPN services with p2mp mLDP—control plane shows the use of P2MP mLDP tunnels in an EVI with a root node and a few leaf-only nodes.
Consider the use case of a root-and-leaf PE4 where the other nodes are configured as leaf-only nodes (no root-and-leaf). This scenario is handled as follows:
If ingress-repl-inc-mcast-advertisement is configured, then as soon as the bgp-evpn mpls option is enabled, PE4 sends an IMET-P2MP route (tunnel type mLDP), or optionally, an IMET-P2MP-IR route (tunnel type composite). IMET-P2MP-IR routes allow leaf-only nodes to create EVPN-MPLS multicast destinations and send BUM traffic to the root.
If ingress-repl-inc-mcast-advertisement is configured, PE1/2/3 do not send IMET-P2MP routes; only IMET-IR routes are sent.
The root-and-leaf node imports the IMET-IR routes from the leaf nodes but it only sends BUM traffic to the P2MP tunnel as long as it is active.
If the P2MP tunnel goes operationally down, the root-and-leaf node starts sending BUM traffic to the evpn-mpls multicast destinations.
When PE1/2/3 receive and import the IMET-P2MP or IMET-P2MP-IR from PE4, they join the mLDP P2MP tree signaled by PE4. They issue an LDP label-mapping message including the corresponding P2MP FEC.
As described in IETF Draft draft-ietf-bess-evpn-etree, mLDP and Ingress Replication (IR) can work in the same network for the same service; that is, EVI1 can have some nodes using mLDP (for example, PE1) and others using IR (for example, PE2). This is particularly important for scaling in services that consist of a pair of root nodes sending BUM in P2MP tunnels and hundreds of leaf nodes that only need to send BUM traffic to the roots. By using IMET-P2MP-IR routes from the roots, the operator ensures that the leaf-only nodes can send BUM traffic to the root nodes without the need to set up P2MP tunnels from the leaf nodes.
When both static and dynamic P2MP mLDP tunnels are used on the same router, Nokia recommends that the static tunnels use a tunnel ID lower than 8193. If a tunnel ID is statically configured with a value equal to or greater than 8193, BGP-EVPN may attempt to use the same tunnel ID for services with enabled provider-tunnel, and fail to set up an mLDP tunnel.
Inter-AS option C or seamless-MPLS models for non-segmented mLDP trees are supported with EVPN for BUM traffic. The leaf PE that joins an mLDP EVPN root PE supports Recursive and Basic Opaque FEC elements (types 7 and 1, respectively). Therefore, packet forwarding is handled as follows:
The ABR or ASBR may leak the root IP address into the leaf PE IGP, which allows the leaf PE to issue a Basic opaque FEC to join the root.
The ABR or ASBR may distribute the root IP using BGP label-ipv4, which results in the leaf PE issuing a Recursive opaque FEC to join the root.
For more information about mLDP opaque FECs, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 3 Services Guide: IES and VPRN and the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide.
All-active multihoming and single-active multihoming with an ESI label are supported in EVPN-MPLS services together with P2MP mLDP tunnels. Both use an upstream-allocated ESI label, as described in RFC 7432 section 8.3.1.2, which is popped at the leaf PEs. As a result, in addition to the root PE, all EVPN-MPLS P2MP leaf PEs must support this capability, including the PEs that are not connected to the multihoming ES.
PBB-EVPN
This section contains information about PBB-EVPN.
BGP-EVPN control plane for PBB-EVPN
PBB-EVPN uses a reduced subset of the routes and procedures described in RFC 7432. The supported routes are:
ES routes
MAC/IP routes
Inclusive Multicast Ethernet Tag routes
EVPN route type 3 - inclusive multicast Ethernet tag route
This route is used to advertise the ISIDs that belong to I-VPLS services as well as the default multicast tree. PBB-Epipe ISIDs are not advertised in Inclusive Multicast routes. The following fields are used:
Route Distinguisher is taken from the RD of the B-VPLS service within the BGP context. The RD can be configured or derived from the value of the bgp-evpn evi.
Ethernet Tag ID encodes the ISID for a specified I-VPLS.
IP address length is always 32.
Originating router's IP address carries an IPv4 or IPv6 address.
PMSI attribute:
Tunnel type = Ingress replication (6).
Flags = Leaf not required.
MPLS label carries the MPLS label allocated for the service in the high-order 20 bits of the label field.
Note: This label is the same label used in the B-MAC routes for the same B-VPLS service unless bgp-evpn mpls ingress-replication-bum-label is configured in the B-VPLS service.
Tunnel endpoint = equal to the originating IP address.
Note: The mLDP P2MP tunnel type is supported on PBB-EVPN services, but it can be used in the default multicast tree only.
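As an illustration of the label encoding mentioned above, the following sketch packs a 20-bit MPLS label into the high-order 20 bits of a 3-octet field and unpacks it again. The function names are hypothetical, and the low-order 4 bits are assumed to be sent as zero.

```python
def encode_label_field(label):
    """Pack a 20-bit MPLS label into the high-order 20 bits of 3 octets."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("MPLS label must fit in 20 bits")
    # Shift left by 4 so the label occupies bits 23..4 of the 24-bit field.
    return (label << 4).to_bytes(3, "big")


def decode_label_field(field):
    """Recover the 20-bit label from a 3-octet label field."""
    if len(field) != 3:
        raise ValueError("label field must be 3 octets")
    return int.from_bytes(field, "big") >> 4


encoded = encode_label_field(262140)
print(encoded.hex())                 # 3fffc0
print(decode_label_field(encoded))   # 262140
```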
EVPN route type 2 - MAC/IP advertisement route (or B-MAC routes)
The 7750 SR, 7450 ESS, or 7950 XRS generates this route type for advertising B-MAC addresses for the following:
Learned MACs on B-SAPs or B-SDP bindings (if mac-advertisement is enabled)
Conditional static MACs (if mac-advertisement is enabled)
B-VPLS shared B-MACs (source-bmacs) and dedicated B-MACs (es-bmacs).
The route type 2 generated by the router uses the following fields and values:
Route Distinguisher is taken from the RD of the VPLS service within the BGP context. The RD can be configured or derived from the bgp-evpn evi value.
Ethernet Segment Identifier (ESI):
ESI = 0 for the advertisement of source-bmac, es-bmacs, sap-bmacs, or sdp-bmacs if no multihoming or single-active multihoming is used.
ESI=MAX-ESI (0xFF..FF) in the advertisement of es-bmacs used for all-active multihoming.
ESI different from zero or MAX-ESI for learned B-MACs on B-SAPs/SDP bindings if EVPN multihoming is used on B-VPLS SAPs and SDP bindings.
Ethernet Tag ID is 0.
Note: A different Ethernet Tag value may be used only when send-bvpls-evpn-flush is enabled.
MAC address length is always 48.
B-MAC address (learned, configured, or system-generated).
IP address length zero and IP address omitted.
MPLS Label 1 carries the MPLS label allocated by the system to the B-VPLS service. The label value is encoded in the high-order 20 bits of the field and is the same label used in the routes type 3 for the same service unless BGP-EVPN MPLS ingress-replication-bum-label is configured in the service.
The MAC Mobility extended community:
The MAC mobility extended community is used in PBB-EVPN for C-MAC flush purposes if per ISID load balancing (single-active multihoming) is used and a source-bmac is used for traffic coming from the ESI.
If there is a failure in one of the ES links, C-MAC flush through the withdrawal of the B-MAC cannot be done (other ESIs are still working); therefore, the MAC mobility extended community is used to signal C-MAC flush to the remote PEs.
When a dedicated es-bmac per ESI is used, the MAC flush can be based on the withdrawal of the B-MAC from the failing node.
es-bmacs are advertised as static (sticky bit set).
Source-bmacs are advertised as static MACs (sticky bit set). In the case of an update, if advertised to indicate that C-MAC flush is needed, the MAC mobility extended community is added to the B-MAC route including a higher sequence number (than the one previously advertised) in addition to the sticky bit.
EVPN route type 4 - ES route
This route type is used for DF election as described in section BGP-EVPN control plane for MPLS tunnels.
PBB-EVPN for I-VPLS and PBB Epipe services
The 7750 SR, 7450 ESS, and 7950 XRS SR OS implementation of PBB-EVPN reuses the existing PBB-VPLS model, where N I-VPLS (or Epipe) services can be linked to a B-VPLS service. BGP-EVPN is enabled in the B-VPLS and the B-VPLS becomes an EVI (EVPN Instance). PBB-EVPN for I-VPLS and PBB Epipe services shows the PBB-EVPN model in the SR OS.
Each PE in the B-VPLS domain advertises its source-bmac as either configured in vpls>pbb>source-bmac or auto-derived from the chassis MAC. The remote PEs install the advertised B-MACs in the B-VPLS FDB. If a specified PE is configured with an ethernet-segment associated with an I-VPLS or PBB Epipe, it may also advertise an es-bmac for the Ethernet-Segment.
In the example shown in PBB-EVPN for I-VPLS and PBB Epipe services, when a frame with MAC DA = AA gets to PE1, a MAC lookup is performed on the I-VPLS FDB and B-MAC-34 is found. A B-MAC lookup on the B-VPLS FDB yields the next-hop (or next-hops if the destination is in an all-active Ethernet-Segment) to which the frame is sent. As in PBB-VPLS, the frame is encapsulated with the corresponding PBB header. A label specified by EVPN for the B-VPLS and the MPLS transport label are also added.
If the lookup on the I-VPLS FDB fails, the system sends the frame encapsulated into a PBB packet with B-MAC DA = Group B-MAC for the ISID. That packet is distributed to all the PEs where the ISID is defined and contains the EVPN label distributed by the Inclusive Multicast routes for that ISID, as well as the transport label.
For PBB-Epipes, all the traffic is sent in a unicast PBB packet to the B-MAC configured in the pbb-tunnel.
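The two-stage lookup for I-VPLS traffic described above can be sketched as follows. This is a simplified model, not the SR OS data path: the function names, dictionary-based FDBs, and PE names are hypothetical, and the group B-MAC format (01:1e:83 followed by the 24-bit ISID) follows the IEEE 802.1ah convention, which is an assumption not stated in this section.

```python
def group_bmac(isid):
    """Group B-MAC for an ISID (assumed format: 01:1e:83 + 24-bit ISID)."""
    if not 0 <= isid < 2 ** 24:
        raise ValueError("ISID must fit in 24 bits")
    octets = bytes([0x01, 0x1E, 0x83]) + isid.to_bytes(3, "big")
    return ":".join(f"{octet:02x}" for octet in octets)


def pbb_forward(cmac_da, isid, ivpls_fdb, bvpls_fdb, isid_flood_list):
    """Return (B-MAC DA, next hops) for a frame entering an I-VPLS.

    ivpls_fdb: C-MAC -> B-MAC learned in the I-VPLS.
    bvpls_fdb: B-MAC -> list of next-hop PEs in the B-VPLS.
    isid_flood_list: PEs where the ISID is defined.
    """
    bmac = ivpls_fdb.get(cmac_da)
    if bmac is not None:
        # Known C-MAC: unicast PBB frame towards the PEs behind the B-MAC
        # (more than one if the destination is in an all-active ES).
        return bmac, bvpls_fdb.get(bmac, [])
    # Unknown C-MAC: flood to the PEs of the ISID using the group B-MAC.
    return group_bmac(isid), isid_flood_list


ivpls = {"AA": "B-MAC-34"}
bvpls = {"B-MAC-34": ["PE3", "PE4"]}
print(pbb_forward("AA", 2001, ivpls, bvpls, ["PE2", "PE3"]))
print(pbb_forward("BB", 2001, ivpls, bvpls, ["PE2", "PE3"]))
```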
The following CLI output shows an example of the configuration of an I-VPLS, PBB-Epipe, and their corresponding B-VPLS.
*A:PE-1>config#
service vpls 1 name "b-vpls1" b-vpls create
description "pbb-evpn-service"
service-mtu 2000
pbb
source-bmac 00:00:00:00:00:03
bgp
bgp-evpn
evi 1
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
sap 1/1/1:1 create
exit
spoke-sdp 1:1 create
*A:PE-1>config#
service vpls 101 name "vpls101" i-vpls create
pbb
backbone-vpls 1
sap 1/2/1:101 create
spoke-sdp 1:102 create
*A:PE-1>config#
service epipe 102 name "epipe102" create
pbb
tunnel 1 backbone-dest-mac 00:00:00:00:00:01 isid 102
sap 1/2/1:102 create
Configure the bgp-evpn context as described in section EVPN for MPLS tunnels in VPLS services (EVPN-MPLS).
Some EVPN configuration options are not relevant to PBB-EVPN and are not supported when BGP-EVPN is configured in a B-VPLS; these are as follows:
bgp-evpn> [no] ip-route-advertisement
bgp-evpn> [no] unknown-mac-route
bgp-evpn> vxlan [no] shutdown
bgp-evpn>mpls>force-vlan-vc-forwarding
When bgp-evpn>mpls no shutdown is added to a specified B-VPLS instance, the following considerations apply:
BGP-AD is supported along with EVPN in the same B-VPLS instance.
The following B-VPLS and BGP-EVPN commands are fully supported:
vpls>backbone-vpls
vpls>backbone-vpls>send-flush-on-bvpls-failure
vpls>backbone-vpls>source-bmac
vpls>backbone-vpls>use-sap-bmac
vpls>backbone-vpls>use-es-bmac (For more information, see PBB-EVPN multihoming in I-VPLS and PBB Epipe services)
vpls>isid-policies
vpls>static-mac
vpls>SAP or SDP-binding>static-isid
bgp-evpn>mac-advertisement (this command affects only the B-MACs learned on SAPs or SDP bindings; it does not affect the advertisement of the system B-MAC or of SAP/es-bmacs)
bgp-evpn>mac-duplication and settings.
bgp-evpn>mpls>auto-bind-tunnel and options.
bgp-evpn>mpls>ecmp
bgp-evpn>mpls>control-word
bgp-evpn>evi
bgp-evpn>mpls>ingress-replication-bum-label
Flood containment for I-VPLS services
In general, PBB technologies in the 7750 SR, 7450 ESS, or 7950 XRS SR OS support a way to contain the flooding for a specified I-VPLS ISID, so that BUM traffic for that ISID only reaches the PEs where the ISID is locally defined. Each PE creates an MFIB per I-VPLS ISID on the B-VPLS instance. That MFIB supports SAP or SDP binding endpoints that can be populated by:
MMRP in regular PBB-VPLS
IS-IS in SPBM
In PBB-EVPN, B-VPLS EVPN endpoints can be added to the MFIBs using EVPN Inclusive Multicast Ethernet Tag routes.
The example in PBB-EVPN and I-VPLS flooding containment shows how the MFIBs are populated in PBB-EVPN.
When the B-VPLS 10 is enabled, PE1 advertises as follows:
A B-MAC route containing PE1's system B-MAC (00:01 as configured in pbb>source-bmac) along with an MPLS label.
An Inclusive Multicast Ethernet Tag route (IMET route) with Ethernet-tag = 0 that allows the remote B-VPLS 10 instances to add an entry for PE1 in the default multicast list.
When I-VPLS 2001 (ISID 2001) is enabled as per the CLI in the preceding section, PE1 advertises as follows:
An additional inclusive multicast route with Ethernet-tag = 2001. This allows the remote PEs to create an MFIB for the corresponding ISID 2001 and add the corresponding EVPN binding entry to the MFIB.
This default behavior can be modified by the configured isid-policy. For instance, for ISIDs 1-2000, configure as follows:
isid-policy
entry 10 create
no advertise-local
range 1 to 2000
use-def-mcast
This configuration has the following effect for the ISID range:
no advertise-local instructs the system to not advertise the local active ISIDs contained in the 1 to 2000 range.
use-def-mcast instructs the system to use the default flooding list as opposed to the MFIB.
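The effect of an isid-policy entry on a given ISID can be modeled with a short sketch. The IsidPolicyEntry class and the isid_behavior function are hypothetical names used only for illustration; the defaults (ISIDs are advertised and use the MFIB when no entry matches) are assumed from the default behavior described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsidPolicyEntry:
    """Hypothetical model of one isid-policy entry."""
    first: int
    last: int
    advertise_local: bool = True   # assumed default when "no advertise-local" is absent
    use_def_mcast: bool = False    # assumed default when "use-def-mcast" is absent

    def matches(self, isid):
        return self.first <= isid <= self.last


def isid_behavior(isid, entries):
    """Return how a local ISID is advertised and which flooding list it uses."""
    for entry in entries:
        if entry.matches(isid):
            return {
                "advertise_imet_isid": entry.advertise_local,
                "flood_list": "default-multicast" if entry.use_def_mcast else "mfib",
            }
    # No matching entry: default behavior (per-ISID IMET route and MFIB).
    return {"advertise_imet_isid": True, "flood_list": "mfib"}


policy = [IsidPolicyEntry(1, 2000, advertise_local=False, use_def_mcast=True)]
print(isid_behavior(1500, policy))
print(isid_behavior(2001, policy))
```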
The ISID flooding behavior on B-VPLS SAPs and SDP bindings is as follows:
B-VPLS SAPs and SDP bindings are only added to the TLS-multicast list and not to the MFIB list (unless static-isids are configured, which is only possible for SAPs/SDP bindings and not BGP-AD spoke SDPs).
As a result, if the system needs to flood ISID BUM traffic and the ISID is also defined in remote PEs connected through SAPs or spoke SDPs without static-isids, then an isid-policy must be configured for the ISID so that the ISID uses the default multicast list.
When an isid-policy is configured and a range of ISIDs use the default multicast list, the remote PBB-EVPN PEs are added to the default multicast list as long as they advertise an IMET route with an ISID included in the policy's ISID range. PEs advertising IMET routes with Ethernet-tag = 0 are also added to the default multicast list (7750 SR, 7450 ESS, or 7950 XRS SR OS behavior).
The B-VPLS 10 also allows the ISID flooding to legacy PBB networks via B-SAPs or B-SDPs. The legacy PBB network B-MACs are dynamically learned on those SAPs/binds or statically configured through the use of conditional static-macs. The use of static-isids is required so that non-local ISIDs are advertised.
sap 1/1/1:1 create
exit
spoke-sdp 1:1 create
static-mac
mac 00:fe:ca:fe:ca:fe create sap 1/1/1:1 monitor fwd-status
static-isid
range 1 isid 3000 to 5000 create
PBB-EVPN and PBB-VPLS integration
The 7750 SR, 7450 ESS, and 7950 XRS SR OS EVPN implementation supports RFC 8560 so that PBB-EVPN and PBB-VPLS can be integrated into the same network and within the same B-VPLS service.
All the concepts described in section EVPN and VPLS integration are also supported in B-VPLS services so that B-VPLS SAP or SDP bindings can be integrated with PBB-EVPN destination bindings. The features described in that section also facilitate a smooth migration from B-VPLS SDP bindings to PBB-EVPN destination bindings.
PBB-EVPN multihoming in I-VPLS and PBB Epipe services
The 7750 SR, 7450 ESS, and 7950 XRS SR OS PBB-EVPN implementation supports all-active and single-active multihoming for I-VPLS and PBB Epipe services.
PBB-EVPN multihoming reuses the ethernet-segment concept described in section EVPN multihoming in VPLS services. However, unlike EVPN-MPLS, PBB-EVPN does not use AD routes; it uses B-MACs for split-horizon checks and aliasing.
System B-MAC assignment in PBB-EVPN
RFC 7623 describes two types of B-MAC assignments that a PE can implement:
shared B-MAC addresses that can be used for single-homed CEs and a number of multihomed CEs connected to Ethernet-Segments
dedicated B-MAC addresses per Ethernet-Segment
In this document and in 7750 SR, 7450 ESS, and 7950 XRS terminology:
A shared-bmac (in IETF) is a source-bmac as configured in service>(b)vpls>pbb>source-bmac
A dedicated-bmac per ES (in IETF) is an es-bmac as configured in service>(b)vpls>pbb>use-es-bmac
B-MAC selection and use depend on the multihoming model; for single-active mode, the type of B-MAC impacts the flooding in the network as follows:
All-active multihoming requires es-bmacs.
Single-active multihoming can use es-bmacs or source-bmacs.
The use of source-bmacs minimizes the number of B-MACs being advertised but has a larger impact on C-MAC flush upon ES failures.
The use of es-bmacs optimizes the C-MAC flush upon ES failures at the expense of advertising more B-MACs.
PBB-EVPN all-active multihoming service model
PBB-EVPN all-active multihoming shows the use of all-active multihoming in the 7750 SR, 7450 ESS, and 7950 XRS SR OS PBB-EVPN implementation.
For example, the following shows the ESI-1 and all-active configuration in PE3 and PE4. As in EVPN-MPLS, all-active multihoming is only possible if a LAG is used at the CE. All-active multihoming uses es-bmacs, that is, each ESI is assigned a dedicated B-MAC. All the PEs part of the ES source traffic using the same es-bmac.
In PBB-EVPN all-active multihoming and the following configuration, the es-bmac used by PE3 and PE4 is B-MAC-34 (for example, 00:00:00:00:00:34). The es-bmac for a specified ethernet-segment is configured by the source-bmac-lsb along with the (b-)vpls>pbb>use-es-bmac command.
Configuration in PE3:
*A:PE3>config>lag(1)# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/1
lacp active administrative-key 32768
no shutdown
*A:PE3>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 10.3.3.3:0
ethernet-segment ESI-1 create
esi 00:34:34:34:34:34:34:34:34:34
multi-homing all-active
service-carving auto
lag 1
source-bmac-lsb 00:34 es-bmac-table-size 8
no shutdown
*A:PE3>config>service>vpls 1(b-vpls)# info
----------------------------------------------
bgp
exit
bgp-evpn
evi 1
mpls bgp 1
no shutdown
ecmp 2
auto-bind-tunnel resolution any
exit
pbb
source-bmac 00:00:00:00:00:03
use-es-bmac
*A:PE3>config>service>vpls (i-vpls)# info
----------------------------------------------
pbb
backbone-vpls 1
sap lag-1:101 create
*A:PE3>config>service>epipe (pbb)# info
----------------------------------------------
pbb
tunnel 1 backbone-dest-mac 00:00:00:00:00:01 isid 102
sap lag-1:102 create
Configuration in PE4:
*A:PE4>config>lag(1)# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/1
lacp active administrative-key 32768
no shutdown
*A:PE4>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 10.4.4.4:0
ethernet-segment ESI-1 create
esi 00:34:34:34:34:34:34:34:34:34
multi-homing all-active
service-carving auto
lag 1
source-bmac-lsb 00:34 es-bmac-table-size 8
no shutdown
*A:PE4>config>service>vpls 1(b-vpls)# info
----------------------------------------------
bgp
exit
bgp-evpn
evi 1
mpls bgp 1
no shutdown
ecmp 2
auto-bind-tunnel resolution any
exit
pbb
source-bmac 00:00:00:00:00:04
use-es-bmac
*A:PE4>config>service>vpls (i-vpls)# info
----------------------------------------------
pbb
backbone-vpls 1
sap lag-1:101 create
*A:PE4>config>service>epipe (pbb)# info
----------------------------------------------
pbb
tunnel 1 backbone-dest-mac 00:00:00:00:00:01 isid 102
sap lag-1:102 create
The above configuration enables the all-active multihoming procedures for PBB-EVPN.
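As a rough illustration of how the es-bmac in this configuration relates to the configured values, the following sketch combines the four most significant bytes of pbb>source-bmac with the 2-byte source-bmac-lsb. The derive_es_bmac helper is hypothetical and only restates the relationship described in this section; it is not an SR OS API.

```python
def derive_es_bmac(source_bmac, source_bmac_lsb):
    """Combine the 4 most significant bytes of source-bmac with the 2-byte LSB."""
    src = source_bmac.lower().split(":")
    lsb = source_bmac_lsb.lower().split(":")
    if len(src) != 6 or len(lsb) != 2:
        raise ValueError("expected a 6-byte source-bmac and a 2-byte source-bmac-lsb")
    return ":".join(src[:4] + lsb)


# PE3 and PE4 use different source-bmacs but share the 4 most significant bytes,
# so both derive the same es-bmac for ESI-1.
print(derive_es_bmac("00:00:00:00:00:03", "00:34"))
print(derive_es_bmac("00:00:00:00:00:04", "00:34"))
```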
The following considerations apply when the ESI is used for PBB-EVPN:
ESI association
Only LAG is supported for all-active multihoming. The following commands are used for the LAG to ESI association:
config>service>system>bgp-evpn>ethernet-segment# lag <id>
config>service>system>bgp-evpn>ethernet-segment# source-bmac-lsb <MAC-lsb> [es-bmac-table-size <size>]
Where:
The same ESI may be used for EVPN and PBB-EVPN services.
The source-bmac-lsb attribute is mandatory for PBB-EVPN services and is ignored for EVPN-MPLS services.
The source-bmac-lsb attribute must be set to a specific 2-byte value. The value must match on all the PEs part of the same ESI, for example, PE3 and PE4 for ESI-1. This means that the configured pbb>source-bmac on the two PEs for B-VPLS 1 must have the same 4 most significant bytes.
The es-bmac-table-size parameter modifies the default value (8) for the maximum number of virtual B-MACs (that is, es-bmacs) that can be associated with the ethernet-segment. When the source-bmac-lsb is configured, the associated es-bmac-table-size is reserved out of the total FDB space.
When multi-homing all-active is configured within the ethernet-segment, only a LAG can be associated with it. The CLI prevents the association of a port or an SDP.
service-carving
If service-carving is configured in the ESI, the DF election algorithm is a modulo function of the ISID and the number of PEs part of the ESI, as opposed to a modulo function of evi and number of PEs (used for EVPN-MPLS).
service-carving mode manual
A service-carving mode manual option is added so that the user can control what PE is DF for a specified ISID. The PE is DF for the configured ISIDs and non-DF for the non-configured ISIDs.
DF election
An all-active Designated Forwarder (DF) election is also carried out for PBB-EVPN. In this case, the DF election defines which of the PEs of the ESI for a specified I-VPLS is the one able to send the downstream BUM traffic to the CE. Only one DF per ESI is allowed in the I-VPLS service, and the non-DF blocks only BUM traffic, and only in the downstream direction.
split-horizon function
In PBB-EVPN, the split-horizon function to avoid echoed packets on the CE is based on an ingress lookup of the ES B-MAC (as opposed to the ESI label in EVPN-MPLS). In PBB-EVPN all-active multihoming, PE3 sends packets using B-MAC SA = B-MAC-34. PE4 does not send those packets back to the CE because B-MAC-34 is identified as the es-bmac for ESI-1.
aliasing
In PBB-EVPN, aliasing is based on the ES B-MAC sent by all the PEs part of the same ESI. See the following section for more information. In PBB-EVPN all-active multihoming, PE1 performs load balancing between PE3 and PE4 when sending unicast flows to B-MAC-34 (es-bmac for ESI-1).
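The service-carving behavior in the list above can be illustrated with a small sketch. It assumes that the candidate PEs are ordered by ascending IP address before the modulo is applied, as in the default RFC 7432 procedure; that ordering, and the function names elect_df and is_df_manual, are assumptions made for illustration rather than details taken from this section.

```python
import ipaddress


def elect_df(isid, candidate_pes):
    """Automatic service carving: pick the DF as (ISID mod number of PEs)."""
    if not candidate_pes:
        raise ValueError("at least one candidate PE is required")
    # Assumption: candidates are ranked by ascending IP address.
    ordered = sorted(candidate_pes, key=ipaddress.ip_address)
    return ordered[isid % len(ordered)]


def is_df_manual(isid, configured_ranges):
    """Manual service carving: the PE is DF only for the ISIDs configured on it."""
    return any(low <= isid <= high for low, high in configured_ranges)


pes = ["192.0.2.4", "192.0.2.3"]
print(elect_df(101, pes))                  # DF for I-VPLS ISID 101
print(elect_df(102, pes))                  # DF for PBB-Epipe ISID 102
print(is_df_manual(101, [(100, 200)]))
```

With two PEs in the ESI, consecutive ISIDs alternate between them, which spreads the DF role across the Ethernet-Segment.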
In the configuration above, a PBB-Epipe is configured in PE3 and PE4, both pointing at the same remote pbb tunnel backbone-dest-mac. On the remote PE, for example PE1, the configuration of the PBB-Epipe points at the es-bmac:
*A:PE1>config>service>epipe (pbb)# info
----------------------------------------------
pbb
tunnel 1 backbone-dest-mac 00:00:00:00:00:34 isid 102
sap 1/1/1:102 create
When PBB-Epipes are used in combination with all-active multihoming, Nokia recommends using bgp-evpn mpls ingress-replication-bum-label in the PEs where the ethernet-segment is created, that is in PE3 and PE4. This guarantees that in case of flooding in the B-VPLS service for the PBB Epipe, only the DF forwards the traffic to the CE.
Aliasing for PBB-Epipes with all-active multihoming only works if shared-queuing or ingress policing is enabled on the ingress PE Epipe. In any other case, the IOM sends the traffic to a single destination (no ECMP is used in spite of the bgp-evpn mpls ecmp setting).
All-active multihomed es-bmacs are treated by the remote PEs as eES:MAX-ESI BMACs. The following example shows the FDB in B-VPLS 1 in PE1 as shown in PBB-EVPN all-active multihoming:
*A:PE1# show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:00:00:00:00:03 eMpls: EvpnS 06/12/15 15:35:39
192.0.2.3:262138
1 00:00:00:00:00:04 eMpls: EvpnS 06/12/15 15:42:52
192.0.2.4:262130
1 00:00:00:00:00:34 eES: EvpnS 06/12/15 15:35:57
MAX-ESI
-------------------------------------------------------------------------------
No. of MAC Entries: 3
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
The show service id 1 evpn-mpls command on PE1 shows that the remote es-bmac (that is, 00:00:00:00:00:34) has two associated next-hops (that is, PE3 and PE4):
*A:PE1# show service id 1 evpn-mpls
===============================================================================
BGP EVPN-MPLS Dest
===============================================================================
TEP Address Egr Label Num. MACs Mcast Last Change
Transport
-------------------------------------------------------------------------------
192.0.2.3 262138 1 Yes 06/12/2015 15:34:48
ldp
192.0.2.4 262130 1 Yes 06/12/2015 15:34:48
ldp
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN-MPLS Ethernet Segment Dest
===============================================================================
Eth SegId TEP Address Egr Label Last Change
Transport
-------------------------------------------------------------------------------
No Matching Entries
===============================================================================
===============================================================================
BGP EVPN-MPLS ES BMAC Dest
===============================================================================
VBMacAddr TEP Address Egr Label Last Change
Transport
-------------------------------------------------------------------------------
00:00:00:00:00:34 192.0.2.3 262138 06/12/2015 15:34:48
ldp
00:00:00:00:00:34 192.0.2.4 262130 06/12/2015 15:34:48
ldp
-------------------------------------------------------------------------------
Number of entries : 2
-------------------------------------------------------------------------------
===============================================================================
Network failures and convergence for all-active multihoming
ES failures are resolved by the PEs withdrawing the es-bmac. The remote PEs withdraw the route and update their list of next-hops for a specified es-bmac.
No mac-flush of the I-VPLS FDB tables is required as long as the es-bmac is still in the FDB.
When the route corresponding to the last next-hop for a specified es-bmac is withdrawn, the es-bmac is flushed from the B-VPLS FDB and all the C-MACs associated with it are flushed too.
The following events trigger a withdrawal of the es-bmac and the corresponding next-hop update in the remote PEs:
B-VPLS transition to operationally down status.
Change of pbb>source-bmac.
Change of es-bmac (or removal of pbb use-es-bmac).
Ethernet-segment transition to operationally down status.
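The convergence behavior on a remote PE can be sketched as a small state model: each es-bmac keeps a set of next-hops, and C-MACs are flushed only when the last next-hop is withdrawn. The RemoteBvplsState class and its method names are hypothetical and serve only to illustrate the procedure described above.

```python
class RemoteBvplsState:
    """Hypothetical view of a remote PE tracking an all-active es-bmac."""

    def __init__(self):
        self.es_bmac_next_hops = {}  # es-bmac -> set of advertising PEs
        self.cmacs = {}              # C-MAC -> B-MAC it was learned behind

    def advertise(self, es_bmac, pe):
        self.es_bmac_next_hops.setdefault(es_bmac, set()).add(pe)

    def learn_cmac(self, cmac, bmac):
        self.cmacs[cmac] = bmac

    def withdraw(self, es_bmac, pe):
        """Process an es-bmac withdrawal; return the list of flushed C-MACs."""
        next_hops = self.es_bmac_next_hops.get(es_bmac)
        if next_hops is None:
            return []
        next_hops.discard(pe)
        if next_hops:
            # The es-bmac is still reachable through another PE: no C-MAC flush.
            return []
        # Last next-hop gone: remove the es-bmac and every C-MAC behind it.
        del self.es_bmac_next_hops[es_bmac]
        flushed = sorted(c for c, b in self.cmacs.items() if b == es_bmac)
        for cmac in flushed:
            del self.cmacs[cmac]
        return flushed


pe1 = RemoteBvplsState()
pe1.advertise("00:00:00:00:00:34", "192.0.2.3")
pe1.advertise("00:00:00:00:00:34", "192.0.2.4")
pe1.learn_cmac("AA", "00:00:00:00:00:34")
print(pe1.withdraw("00:00:00:00:00:34", "192.0.2.3"))
print(pe1.withdraw("00:00:00:00:00:34", "192.0.2.4"))
```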
PBB-EVPN single-active multihoming service model
In single-active multihoming, the non-DF PEs for a specified ESI block unicast and BUM traffic in both directions (upstream and downstream) on the object associated with the ESI. Other than that, single-active multihoming follows the same service model defined in the PBB-EVPN all-active multihoming service model section with the following differences:
The ethernet-segment is configured for single-active: service>system>bgp-evpn>eth-seg>multi-homing single-active.
For single-active multihoming, the ethernet-segment can be associated with a port or an SDP, as well as a LAG.
From a service perspective, single-active multihoming can provide redundancy to the following services and access types:
I-VPLS LAG and regular SAPs
I-VPLS active/standby spoke SDPs
EVPN single-active multihoming is supported for PBB-Epipes only in two-node scenarios with local switching.
While all-active multihoming only uses es-bmac assignment to the ES, single-active multihoming can use source-bmac or es-bmac assignment. The system allows the following user choices per B-VPLS and ES:
A dedicated es-bmac per ES can be used. In that case, the pbb use-es-bmac command is configured in the B-VPLS and the same procedures described in PBB-EVPN all-active multihoming service model follow with one difference. While in all-active multihoming all the PEs part of the ESI source the PBB packets with the same source es-bmac, single-active multihoming requires the use of a different es-bmac per PE.
A non-dedicated source-bmac can be used. In this case, the user does not configure pbb>use-es-bmac and the regular source-bmac is used for the traffic. A different source-bmac has to be advertised per PE.
The use of source-bmacs or es-bmacs for single-active multihomed ESIs has a different impact on C-MAC flushing, as shown in Source-bmac versus es-bmac C-MAC flushing.
If es-bmacs are used as shown in the representation on the right in Source-bmac versus es-bmac C-MAC flushing, a less-impacting C-MAC flush is achieved, therefore, minimizing the flooding after ESI failures. In case of ESI failure, PE1 withdraws the es-bmac 00:12 and the remote PE3 only flushes the C-MACs associated with that es-bmac (only the C-MACs behind the CE are flushed).
If source-bmacs are used, as shown on the left side of Source-bmac versus es-bmac C-MAC flushing, in case of ES failure, a BGP update with higher sequence number is issued by PE1 and the remote PE3 flushes all the C-MACs associated with the source-bmac. Therefore, all the C-MACs behind the PE's B-VPLS are flushed, as opposed to only the C-MACs behind the ESI's CE.
As in EVPN-MPLS, the non-DF status can be notified to the access CE or network:
LAG with or without LACP
In this case, the multihomed ports on the CE are not part of the same LAG. The non-DF PE for each service may signal that the LAG SAP is operationally down by using eth-cfm fault-propagation-enable {use-if-tlv|suspend-ccm}.
regular Ethernet 802.1q/ad ports
In this case, the multihomed ports on the CE/network are not part of any LAG. The non-DF PE for each service signals that the SAP is operationally down by using eth-cfm fault-propagation-enable {use-if-tlv|suspend-ccm}.
active-standby PWs
In this case, the multihomed CE/network is connected to the PEs through an MPLS network and an active/standby spoke SDP per service. The non-DF PE for each service makes use of the LDP PW status bits to signal that the spoke SDP is standby at the PE side. Nokia recommends that the CE suppresses the signaling of PW status standby.
Network failures and convergence for single-active multihoming
ESI failures are resolved depending on the B-MAC address assignment chosen by the user:
If the B-MAC address assignment is based on the use of es-bmacs, both the DF and the non-DF PEs advertise the es-bmac with ESI=0 for a specified ESI. Each PE has a different es-bmac for the same ESI (as opposed to the same es-bmac on all the PEs for all-active).
In case of an ESI failure, the PE withdraws its es-bmac route triggering a mac-flush of all the C-MACs associated with it in the remote PEs.
If the B-MAC address assignment is based on the use of source-bmac, DF and non-DFs advertise their respective source-bmacs. In case of an ES failure:
The PE re-advertises its source-bmac with a higher sequence number (the new DF does not re-advertise its source-bmac).
The far-end PEs interpret a source-bmac advertisement with a different sequence number as a flush-all-from-me message from the PE detecting the failure. They flush all the C-MACs associated with that B-MAC in all the ISID services.
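The far-end processing described above can be modeled as follows. This is a minimal sketch under stated assumptions: the SingleActiveFlushReceiver class is hypothetical, and it treats only an update with a higher sequence number than the last one seen as a flush trigger.

```python
class SingleActiveFlushReceiver:
    """Hypothetical remote-PE view of B-MAC routes and the C-MACs behind them."""

    def __init__(self):
        self.last_seq = {}  # B-MAC -> last accepted sequence number
        self.cmacs = {}     # (ISID, C-MAC) -> B-MAC

    def learn(self, isid, cmac, bmac):
        self.cmacs[(isid, cmac)] = bmac

    def _flush(self, bmac):
        """Flush the C-MACs behind a B-MAC in all ISID services; return the count."""
        doomed = [key for key, owner in self.cmacs.items() if owner == bmac]
        for key in doomed:
            del self.cmacs[key]
        return len(doomed)

    def on_update(self, bmac, seq):
        previous = self.last_seq.get(bmac)
        if previous is not None and seq <= previous:
            return 0  # not a newer route: nothing to flush
        self.last_seq[bmac] = seq
        # The first advertisement only installs the B-MAC; a higher sequence
        # number is treated as a flush-all-from-me message.
        return 0 if previous is None else self._flush(bmac)

    def on_withdraw(self, bmac):
        self.last_seq.pop(bmac, None)
        return self._flush(bmac)


pe3 = SingleActiveFlushReceiver()
pe3.on_update("00:00:00:00:00:01", 0)
pe3.learn(101, "c1", "00:00:00:00:00:01")
pe3.learn(102, "c2", "00:00:00:00:00:01")
print(pe3.on_update("00:00:00:00:00:01", 1))  # number of C-MACs flushed
```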
The following events trigger a C-MAC flush notification. A 'C-MAC flush notification' means the withdrawal of a specified B-MAC or the update of B-MAC with a higher sequence number (SQN). Both BGP messages make the remote PEs flush all the C-MACs associated with the indicated B-MAC:
B-VPLS transition to operationally down status. This triggers the withdrawal of the associated B-MACs, regardless of the use-es-bmac setting.
Change of pbb>source-bmac. This triggers the withdrawal and re-advertisement of the source-bmac, causing the corresponding C-MAC flush in the remote PEs.
Change of es-bmac (or removal of pbb use-es-bmac). This triggers the withdrawal of the es-bmac and re-advertisement of the new es-bmac.
Ethernet-Segment (ES) transition to operationally down or admin-down status. This triggers an es-bmac withdrawal (if use-es-bmac is used) or an update of the source-bmac with a higher SQN (if no use-es-bmac is used).
Service Carving Range change for the ES. This triggers an es-bmac update with higher SQN (if use-es-bmac is used) or an update of the source-bmac with a higher SQN (if no use-es-bmac is used).
Change in the number of candidate PEs for the ES. This triggers an es-bmac update with higher SQN (if use-es-bmac is used) or an update of the source-bmac with a higher SQN (if no use-es-bmac is used).
In an ESI, individual SAPs/SDP bindings or individual I-VPLS going operationally down do not generate any BGP withdrawal or indication so that the remote nodes can flush their C-MACs. This is solved in EVPN-MPLS by the use of AD routes per EVI; however, there is nothing similar in PBB-EVPN for indicating a partial failure in an ESI.
PBB-Epipes and EVPN multihoming
EVPN multihoming is supported with PBB-EVPN Epipes, but only in a limited number of scenarios. In general, the following applies to PBB-EVPN Epipes:
PBB-EVPN Epipes do not support spoke SDPs that are associated with EVPN ESs.
PBB-EVPN Epipes support all-active EVPN multihoming as long as no local-switching is required in the Epipe instance where the ES is defined.
PBB-EVPN Epipes support single-active EVPN multihoming only in a two-node case scenario.
PBB-EVPN MH in a three-node scenario shows the EVPN MH support in a three-node scenario.
EVPN MH support in a three-node scenario has the following characteristics:
All-active EVPN multihoming is fully supported (diagram on the left in PBB-EVPN MH in a three-node scenario). CE1 may also be multihomed to other PEs, as long as those PEs are not PE2 or PE3. In this case, PE1 Epipe's pbb-tunnel would be configured with the remote ES B-MAC.
Single-active EVPN multihoming is not supported in a three (or more)-node scenario (diagram on the right in PBB-EVPN MH in a three-node scenario). Because PE1's Epipe pbb-tunnel can only point at a single remote B-MAC and single-active multihoming requires the use of separate B-MACs on PE2 and PE3, the scenario is not possible and not supported regardless of the ES association to port/LAG/sdps.
Regardless of the EVPN multihoming type, the CLI prevents the user from adding a spoke SDP to an Epipe, if the corresponding SDP is part of an ES.
PBB-EVPN MH in a two-node scenario shows the EVPN MH support in a two-node scenario.
EVPN MH support in a two-node scenario has the following characteristics, as shown in PBB-EVPN MH in a two-node scenario:
All-active multihoming is not supported for redundancy in this scenario because PE1's pbb-tunnel cannot point at a locally defined ES B-MAC. This is represented in the left-most scenario in PBB-EVPN MH in a two-node scenario.
Single-active multihoming is supported for redundancy in a two-node three or four SAP scenario, as displayed by the two right-most scenarios in PBB-EVPN MH in a two-node scenario.
In these two cases, the Epipe pbb-tunnel is configured with the source B-MAC of the remote PE node.
When two SAPs are active in the same Epipe, local-switching is used to exchange frames between the CEs.
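The support rules in this section can be condensed into a small decision function. This is only a simplified restatement of the scenarios above; the function name and parameters are hypothetical, and the sketch does not model how the CLI enforces the restrictions.

```python
def pbb_epipe_mh_supported(mode, node_count, local_switching=False, es_on_spoke_sdp=False):
    """Return True if the PBB-EVPN Epipe multihoming scenario is described as supported."""
    if es_on_spoke_sdp:
        # Spoke SDPs associated with an ES cannot be added to an Epipe.
        return False
    if mode == "all-active":
        # Supported as long as the Epipe instance holding the ES needs no
        # local switching (the pbb-tunnel must point at a remote ES B-MAC).
        return not local_switching
    if mode == "single-active":
        # Supported only in the two-node scenario.
        return node_count == 2
    raise ValueError(f"unknown multihoming mode: {mode}")


print(pbb_epipe_mh_supported("all-active", node_count=3))
print(pbb_epipe_mh_supported("single-active", node_count=3))
print(pbb_epipe_mh_supported("single-active", node_count=2, local_switching=True))
```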
PBB-EVPN and use of P2MP mLDP tunnels for default multicast list
P2MP mLDP tunnels can also be used in PBB-EVPN services. The use of provider-tunnel inclusive MLDP is only supported in the B-VPLS default multicast list; that is, no per-ISID IMET-P2MP routes are supported. IMET-P2MP routes in a B-VPLS are always advertised with Ethernet tag zero. All-active EVPN multihoming is supported in PBB-EVPN services together with P2MP mLDP tunnels; however, single-active multihoming is not supported. This capability is only required on the P2MP root PEs within PBB-EVPN services using all-active multihoming.
B-VPLS supports the use of MFIBs for ISIDs using ingress replication. The following considerations apply when provider-tunnel is enabled in a B-VPLS service:
Local I-VPLS or static-ISIDs configured on the B-VPLS generate IMET-IR routes and MFIBs are created per ISID by default.
The default IMET-P2MP or IMET-P2MP-IR route sent with Ethernet-tag = 0 is issued depending on the ingress-repl-inc-mcast-advertisement command.
The following considerations apply if an isid-policy is configured in the B-VPLS.
A range of ISIDs configured with use-def-mcast make use of the P2MP tree, assuming the node is configured as root-and-leaf.
A range of ISIDs configured with advertise-local make the system advertise IMET-IR routes for the local ISIDs included in the range.
The following example CLI output shows a range of ISIDs (1000-2000) that use the P2MP tree and the system does not advertise the IMET-IR routes for those ISIDs. Other local ISIDs advertise the IMET-IR and use the MFIB to forward BUM packets to the EVPN-MPLS destinations created by the IMET-IR routes.
*A:PE-1>config>service>vpls(b-vpls)# info
----------------------------------------------
service-mtu 2000
bgp-evpn
evi 10
mpls bgp 1
no shutdown
auto-bind-tunnel resolution any
isid-policy
entry 10 create
use-def-mcast
no advertise-local
range 1000 to 2000
exit
exit
provider-tunnel
inclusive
owner bgp-evpn-mpls
root-and-leaf
mldp
no shutdown
exit
exit
sap 1/1/1:1 create
exit
spoke-sdp 1:1 create
exit
PBB-EVPN ISID-based C-MAC flush
SR OS supports ISID-based C-MAC flush procedures for PBB-EVPN I-VPLS services where no single-active ESs are used. SR OS also supports the C-MAC flush procedures in cases where other redundancy mechanisms, such as BGP-MH, need them to avoid blackholes caused by a SAP or spoke SDP failure.
The C-MAC flush procedures are enabled on the I-VPLS service using the config>service>vpls>pbb>send-bvpls-evpn-flush CLI command. The feature can be disabled on a per-SAP or per-spoke SDP basis by using the disable-send-bvpls-evpn-flush command in the config>service>vpls>sap or config>service>vpls>spoke-sdp context.
With the feature enabled on an I-VPLS service and a SAP or spoke SDP, if there is a SAP or spoke SDP failure, the router sends a C-MAC flush notification for the corresponding B-MAC and ISID. The router receiving the notification flushes all the C-MACs associated with the indicated B-MAC and ISID when the config>service>vpls>bgp-evpn>accept-ivpls-evpn-flush command is enabled for the B-VPLS service.
The C-MAC flush notification consists of an EVPN B-MAC route that is encoded as follows: the ISID to be flushed is encoded in the Ethernet Tag field and the sequence number is incremented with respect to the previously advertised route.
If send-bvpls-evpn-flush is configured on an I-VPLS with SAPs or spoke SDPs, one of the following rules must be observed:
The disable-send-bvpls-evpn-flush option is configured on the SAPs or spoke SDPs.
The SAPs or spoke SDPs are not on an ES.
The SAPs or spoke SDPs are on an ES or vES with no source-bmac-lsb configured.
The no use-es-bmac command is configured on the B-VPLS.
ISID-based C-MAC flush can be enabled in I-VPLS services with ES or vES. If enabled, the expected interaction between the RFC 7623-based C-MAC flush and the ISID-based C-MAC flush is as follows.
If send-bvpls-evpn-flush is enabled in an I-VPLS service, the ISID-based C-MAC flush overrides (replaces) the RFC 7623-based C-MAC flushing.
For each ES, vES, or B-VPLS, the system checks for at least one I-VPLS service that does not have send-bvpls-evpn-flush enabled.
If ISID-based C-MAC flush is enabled for all I-VPLS services, RFC 7623-based C-MAC flushing is not triggered; only ISID-based C-MAC flush notifications are generated.
If at least one I-VPLS service is found with no ISID-based C-MAC flush enabled, then RFC 7623-based C-MAC flushing notifications are triggered based on ES events.
ISID-based C-MAC flush notifications are also generated for I-VPLS services that have send-bvpls-evpn-flush enabled.
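The interaction between the two flush mechanisms can be expressed as a short check over the I-VPLS services attached to an ES, vES, or B-VPLS. The function name and the input format (a mapping of ISID to its send-bvpls-evpn-flush setting) are assumptions made for illustration.

```python
def flush_notifications(ivpls_send_flush):
    """Decide which C-MAC flush notifications are generated on ES events.

    ivpls_send_flush maps each I-VPLS ISID to True when send-bvpls-evpn-flush
    is enabled on that service.
    """
    isid_based = sorted(isid for isid, enabled in ivpls_send_flush.items() if enabled)
    # RFC 7623-based flushing is triggered only if at least one I-VPLS service
    # does not have the ISID-based C-MAC flush enabled.
    rfc7623_based = any(not enabled for enabled in ivpls_send_flush.values())
    return {"rfc7623_flush": rfc7623_based, "isid_flush_for": isid_based}


print(flush_notifications({3000: True, 3001: True}))
print(flush_notifications({3000: True, 3001: False}))
```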
Per-ISID C-MAC flush following a SAP failure shows an example where the ISID-based C-MAC flush prevents blackhole situations for a CE that is using BGP-MH as the redundancy mechanism in the I-VPLS with an ISID of 3000.
When send-bvpls-evpn-flush is enabled, the I-VPLS service is ready to send per-ISID C-MAC flush messages in the form of B-MAC/ISID routes. The first B-MAC/ISID route for an I-VPLS service is sent with sequence number zero; subsequent updates for the same route increment the sequence number. A B-MAC/ISID route for the I-VPLS is advertised or withdrawn during the following cases:
I-VPLS send-bvpls-evpn-flush configuration and deconfiguration
I-VPLS association and disassociation from the B-VPLS service
I-VPLS operational status change (up/down)
B-VPLS operational status change (up/down)
B-VPLS bgp-evpn mpls status change (no shutdown/shutdown)
B-VPLS operational source B-MAC change
If no disable-send-bvpls-evpn-flush is configured for a SAP or spoke SDP, upon a failure on that SAP or spoke SDP, the system sends a per-ISID C-MAC flush message; that is, a B-MAC/ISID route update with an incremented sequence number.
If the user explicitly configures disable-send-bvpls-evpn-flush for a SAP or spoke SDP, the system does not send per-ISID C-MAC flush messages for failures on that SAP or spoke SDP.
The B-VPLS on the receiving node must be configured with bgp-evpn>accept-ivpls-evpn-flush to accept and process C-MAC flush non-zero Ethernet-tag MAC routes. If the accept-ivpls-evpn-flush command is enabled (the command is disabled by default), the node accepts non-zero Ethernet-tag MAC routes (B-MAC/ISID routes) and processes them. When a new B-MAC/ISID update (with an incremented sequence number) for an existing route is received, the router flushes all the C-MACs associated with that B-MAC and ISID. The B-MAC/ISID route withdrawals also cause a C-MAC flush.
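The receive-side handling of B-MAC/ISID routes can be sketched as follows. The IsidFlushReceiver class is a hypothetical model, not SR OS code; it assumes that the first route for a B-MAC and ISID pair only installs state and that a later update with a higher sequence number, or a withdrawal, triggers the flush.

```python
class IsidFlushReceiver:
    """Hypothetical model of a B-VPLS processing B-MAC/ISID routes."""

    def __init__(self, accept_ivpls_evpn_flush=False):
        self.accept = accept_ivpls_evpn_flush  # disabled by default
        self.last_seq = {}  # (B-MAC, ISID) -> last accepted sequence number
        self.cmacs = {}     # (ISID, C-MAC) -> B-MAC

    def learn(self, isid, cmac, bmac):
        self.cmacs[(isid, cmac)] = bmac

    def _flush(self, bmac, isid):
        """Flush only the C-MACs behind this B-MAC in this ISID."""
        doomed = sorted(c for (i, c), b in self.cmacs.items() if i == isid and b == bmac)
        for cmac in doomed:
            del self.cmacs[(isid, cmac)]
        return doomed

    def on_bmac_isid_update(self, bmac, isid, seq):
        if not self.accept:
            return []  # non-zero Ethernet-tag MAC routes are not processed
        key = (bmac, isid)
        previous = self.last_seq.get(key)
        if previous is not None and seq <= previous:
            return []
        self.last_seq[key] = seq
        return [] if previous is None else self._flush(bmac, isid)

    def on_bmac_isid_withdraw(self, bmac, isid):
        if not self.accept:
            return []
        self.last_seq.pop((bmac, isid), None)
        return self._flush(bmac, isid)


pe3 = IsidFlushReceiver(accept_ivpls_evpn_flush=True)
pe3.learn(3000, "c1", "00:00:00:00:00:01")
pe3.learn(3001, "c2", "00:00:00:00:00:01")
pe3.on_bmac_isid_update("00:00:00:00:00:01", 3000, 0)
print(pe3.on_bmac_isid_update("00:00:00:00:00:01", 3000, 1))  # C-MACs flushed for ISID 3000
```

C-MACs learned behind the same B-MAC in other ISIDs are left untouched, which is the difference from the RFC 7623-based flush.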
The following CLI example shows the commands that enable the C-MAC flush feature on PE1 and PE3.
*A:PE-1>config>service>vpls(i-vpls)# info
----------------------------------------------
pbb
backbone-vpls 10
send-bvpls-evpn-flush
exit
exit
bgp
route-distinguisher 65000:1
vsi-export "vsi_export"
vsi-import "vsi_import"
exit
site "CE-1" create
site-id 1
sap lag-1:1
site-activation-timer 3
no shutdown
exit
sap lag-1:1 create
no disable-send-bvpls-evpn-flush
no shutdown
exit
<snip>
*A:PE-3>config>service>vpls(b-vpls 10)# info
----------------------------------------------
<snip>
bgp-evpn
accept-ivpls-evpn-flush
In the preceding example, with send-bvpls-evpn-flush enabled on the I-VPLS service of PE1, a B-MAC/ISID route (for pbb source-bmac address B-MAC 00:..:01 and ISID 3000) is advertised. If the SAP goes operationally down, PE1 sends an update of the source B-MAC address (00:..:01) for ISID 3000 with a higher sequence number.
With accept-ivpls-evpn-flush enabled on PE3's B-VPLS service, PE3 flushes all C-MACs associated with B-MAC 00:..:01 and ISID 3000. The C-MACs associated with other B-MACs or ISIDs are retained in PE3's FDB.
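The receive-side behavior described above can be sketched as a small Python model. This is only an illustration of the rules in the text (non-zero Ethernet tag, incremented sequence number, withdrawal), not the router's implementation; the class name `BVplsFlushReceiver`, its fields, and the placeholder B-MAC strings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BVplsFlushReceiver:
    """Toy model of a B-VPLS that processes B-MAC/ISID (C-MAC flush) routes."""

    accept_ivpls_evpn_flush: bool = False          # disabled by default, as in the text
    fdb: dict = field(default_factory=dict)        # C-MAC -> (B-MAC, ISID)
    last_seq: dict = field(default_factory=dict)   # (B-MAC, ISID) -> last sequence number

    def _flush(self, bmac, isid):
        # Remove only the C-MACs learned against this B-MAC and ISID.
        self.fdb = {c: k for c, k in self.fdb.items() if k != (bmac, isid)}

    def on_update(self, bmac, isid, seq):
        """Return True if the update triggered a C-MAC flush."""
        if not self.accept_ivpls_evpn_flush or isid == 0:
            return False                           # only non-zero Ethernet-tag routes flush
        key = (bmac, isid)
        prev = self.last_seq.get(key)
        if prev is None:
            self.last_seq[key] = seq               # first advertisement: nothing to flush
            return False
        if seq > prev:
            self.last_seq[key] = seq               # incremented sequence number: flush
            self._flush(bmac, isid)
            return True
        return False

    def on_withdraw(self, bmac, isid):
        """Return True if the withdrawal triggered a C-MAC flush."""
        if not self.accept_ivpls_evpn_flush or isid == 0:
            return False
        self.last_seq.pop((bmac, isid), None)
        self._flush(bmac, isid)
        return True
```

In this model, C-MACs learned against other B-MACs or ISIDs stay in the FDB, which matches the PE3 behavior in the example.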
PBB-EVPN ISID-based route targets
Routers with PBB-EVPN services use the following route types to advertise the ISID of a specific service:
Inclusive Multicast Ethernet Tag routes (IMET-ISID routes) are used to auto-discover ISIDs in the PBB-EVPN network. The routes encode the service ISID in the Ethernet Tag field.
BMAC-ISID routes are only used when ISID-based C-MAC flush is configured. The routes encode the ISID in the Ethernet Tag field.
Although the preceding routes are only relevant for routers where the advertised ISID is configured, they are sent with the B-VPLS route-target by default. As a result, the routes are unnecessarily disseminated to all the routers in the B-VPLS network.
SR OS supports the use of per-ISID or group of ISID route-targets, which limits the dissemination of IMET-ISID or BMAC-ISID routes for a specific ISID to the PEs where the ISID is configured.
The config>service>(b-)vpls>isid-route-target>isid-range from [to to] [auto-rt | route-target rt] command allows the user to determine whether the IMET-ISID and BMAC-ISID routes are sent with the B-VPLS route-target (default option, no command), or a route-target specific to the ISID or range of ISIDs.
The following configuration example shows how to configure ISID ranges as auto-rt or with a specific route-target.
*A:PE-3>config>service>(b-)vpls>bgp-evpn#
isid-route-target
[no] isid-range <from> [to <to>] {auto-rt|route-target <rt>}
/* For example: */
*A:PE-3>config>service>(b-)vpls>bgp-evpn#
isid-route-target
isid-range 1000 to 1999 auto-rt
isid-range 2000 route-target target:65000:2000
The auto-rt option auto-derives a route-target per ISID in the following format:
<2-byte-as-number>:<4-byte-value>
Where: 4-byte-value = 0x30+ISID, that is, the byte 0x30 followed by the 24-bit ISID, as described in RFC 7623. PBB-EVPN auto-rt ISID-based route target format shows the format of the auto-rt option.
Where:
The AS number is 2 bytes and is obtained from the config>router>autonomous-system command. If the configured AS number exceeds the 2-byte limit, the low-order 16-bit value is used.
A = 0 for auto-derivation
Type = 3, which corresponds to an ISID-type route-target
ISID is the 24-bit ISID
The type and sub-type are 0x00 and 0x02.
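The auto-derivation can be worked through with a short Python sketch. It assumes the bit layout implied by the list above (A bit, 3-bit Type, then four zero bits, which together give the leading byte 0x30, followed by the 24-bit ISID); the function name `isid_auto_rt` and the `target:` output string are illustrative choices, not SR OS output.

```python
def isid_auto_rt(as_number: int, isid: int) -> str:
    """Derive an ISID-based route-target string from an AS number and an ISID."""
    if as_number < 0:
        raise ValueError("AS number must be non-negative")
    if not 0 <= isid <= 0xFFFFFF:
        raise ValueError("ISID must fit in 24 bits")

    as2 = as_number & 0xFFFF               # low-order 16 bits if the AS exceeds 2 bytes
    a_bit, rt_type = 0, 3                  # A = 0 (auto-derived), Type = 3 (ISID)
    high = (a_bit << 7) | (rt_type << 4)   # 0x30; low four bits assumed to be zero
    value = (high << 24) | isid            # 4-byte value: 0x30 followed by the ISID
    return f"target:{as2}:{value}"


print(isid_auto_rt(65000, 1000))           # 4-byte value is 0x300003E8
```

For ISID 1000 the 4-byte value is 0x300003E8, so each ISID in a range configured with auto-rt maps to its own route-target.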
If isid-route-target is enabled, the export and import directions for IMET-ISID and BMAC-ISID route processing are modified as follows:
Exported IMET-ISID and BMAC-ISID routes
For local I-VPLS ISIDs and static ISIDs, IMET-ISID routes are sent individually with an ISID-based route-target (and without a B-VPLS route-target) unless the ISID is contained in an ISID policy for which no advertise-local is configured.
If both isid-route-target and send-bvpls-evpn-flush options are enabled for an I-VPLS, the BMAC-ISID route is also sent with the ISID-based route-target and no B-VPLS route-target.
The isid-route-target command affects the IMET-ISID and BMAC-ISID routes only. The BMAC-0, IMET-0 (B-MAC and IMET routes with Ethernet Tag == 0), and ES routes are not impacted by the command.
Imported IMET-ISID and BMAC-ISID routes
Upon enabling isid-route-target for a specific I-VPLS, BGP starts importing IMET-ISID routes with ISID-based route-targets, and (assuming the bgp-evpn accept-ivpls-evpn-flush option is enabled) BMAC-ISID routes with ISID-based route-targets.
The new ISID-based RTs are added for import operations when the I-VPLS is associated with the B-VPLS service (and not based on the I-VPLS operational status), or when the static-isid is added.
The system does not maintain a mapping of the route-targets and ISIDs for the imported routes. For example, if I-VPLS 1 and 2 are configured with the isid-route-target option and IMET-ISID=2 route is received with a route-target corresponding to ISID=1, then BGP imports the route and the router processes it.
The router does not check the format of the received auto-derived route-targets. The route is imported as long as the route-target is on the list of RTs for the B-VPLS.
If the isid-route-target option is configured for one or more I-VPLS services, the vsi-import and vsi-export policies are blocked in the B-VPLS. BGP peer import and export policies are still allowed. Matching on the export ISID-based route-target is supported.
EVPN-VPWS for MPLS tunnels
This section contains information about EVPN-VPWS for MPLS tunnels.
BGP-EVPN control plane for EVPN-VPWS
EVPN-VPWS for MPLS tunnels uses the RFC 8214 BGP extensions described in EVPN-VPWS for VXLAN tunnels, with the following differences for the Ethernet AD per-EVI routes:
- The MPLS field encodes an MPLS label as opposed to a VXLAN VNI.
- The C flag is set if the control word is configured in the service.
- The F flag is set if the hash label is configured in the service.
EVPN for MPLS tunnels in Epipe services (EVPN-VPWS)
The use and configuration of EVPN-VPWS services is described in EVPN-VPWS for VXLAN tunnels with the following differences when the EVPN-VPWS services use MPLS tunnels instead of VXLAN.
When MPLS tunnels are used, the bgp-evpn>mpls context must be configured in the Epipe. As an example, if Epipe 2 is an EVPN-VPWS service that uses MPLS tunnels between PE2 and PE4, this would be its configuration:
PE2>config>service>epipe(2)#
-----------------------
bgp
exit
bgp-evpn
evi 2
local-attachment-circuit "AC-1"
eth-tag 100
exit
remote-attachment-circuit "AC-2"
eth-tag 200
exit
mpls bgp 1
ecmp 2
no shutdown
exit
sap 1/1/1:1 create
PE4>config>service>epipe(2)#
-----------------------
bgp
exit
bgp-evpn
evi 2
local-attachment-circuit "AC-2"
eth-tag 200
exit
remote-attachment-circuit "AC-1"
eth-tag 100
exit
mpls bgp 1
ecmp 2
no shutdown
exit
spoke-sdp 1:1
Where the following BGP-EVPN commands, specific to MPLS tunnels, are supported in the same way as in VPLS services:
- mpls auto-bind-tunnel
- mpls control-word
- mpls hash-label
- mpls entropy-label
- mpls force-vlan-vc-forwarding
- mpls shutdown
EVPN-VPWS Epipes with MPLS tunnels can also be configured with the following characteristics:
- Access attachment circuits can be SAPs or spoke SDPs. Manually configured and BGP-VPWS spoke SDPs are supported. The VC switching configuration is not supported on BGP-EVPN-enabled pipes.
- EVPN-VPWS Epipes using null SAPs can be configured with sap>ethernet>llf. When enabled, upon removing the EVPN destination, the port is brought oper-down with the LinkLossFwd flag; however, the AD per-EVI route for the SAP is still advertised (the SAP is kept oper-up). When the EVPN destination is created, the port is brought oper-up and the flag is cleared.
- EVPN-VPWS Epipes for MPLS tunnels support endpoints. The endpoint endpoint-name parameter is configurable along with bgp-evpn>local-attachment-circuit and bgp-evpn>remote-attachment-circuit. The following conditions apply to endpoints on EVPN-VPWS Epipes with MPLS tunnels:
  - Up to two explicit endpoints are allowed per Epipe service with BGP-EVPN configured.
  - A limited endpoint configuration is allowed in Epipes with BGP-EVPN. Specifically, neither active-hold-delay nor revert-time is configurable.
  - When bgp-evpn>remote-attachment-circuit is added to an explicit endpoint with a spoke SDP, the spoke-sdp>precedence command is not allowed. The spoke SDP always has a precedence value of four, which is always numerically higher (and therefore less preferred) than the EVPN precedence. Therefore, the EVPN-MPLS destination is used for transmission if it is created, and the spoke SDP is only used when the EVPN-MPLS destination is removed.
- EVPN-VPWS Epipes for MPLS tunnels support control word, hash label, and entropy labels.
  - When the control word is configured, the PE sets the C bit in its AD per-EVI advertisement and sends the control word in the datapath. In this case, the PE expects the control word to be received. If there is a mismatch between the received control word and the configured control word, the system does not set up the EVPN destination and the service does not come up.
  - When the hash-label command is configured, the PE sets the F bit in its AD per-EVI routes and sends the hash label in the datapath. The PE expects the hash label to also be received. In case of a mismatch between the received F flag and the locally configured hash-label, the router does not create the EVPN destination and the service does not come up. For the service, the use of the hash label and the entropy label are mutually exclusive.
- EVPN-VPWS Epipes support the force-qinq-vc-forwarding [c-tag-c-tag | s-tag-c-tag] command under bgp-evpn mpls and the qinq-vlan-translation s-tag.c-tag command on ingress QinQ SAPs.
  When QinQ VLAN translation is configured at the ingress QinQ or dot1q SAP, the service-delimiting outer and inner VLAN values can be translated to the configured values. The force-qinq-vc-forwarding s-tag-c-tag command must be configured to preserve the translated QinQ tags in the payload when sending EVPN packets. This translation and preservation behavior is aligned with the “normalization” concept described in draft-ietf-bess-evpn-vpws-fxc. The VLAN tag processing described in Epipe service pseudowire VLAN tag processing applies to EVPN destinations in EVPN-VPWS services too.
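The control-word and hash-label rules above amount to a flag comparison between the local configuration and the received AD per-EVI route. The following Python sketch models only that comparison and the hash-label/entropy-label exclusivity; the names `EpipeMplsConfig`, `AdPerEviFlags`, `advertised_flags`, and `destination_allowed` are hypothetical and do not correspond to any SR OS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpipeMplsConfig:
    """Local bgp-evpn mpls settings that affect the AD per-EVI flags."""
    control_word: bool = False
    hash_label: bool = False
    entropy_label: bool = False

    def __post_init__(self):
        # Hash label and entropy label are mutually exclusive for the service.
        if self.hash_label and self.entropy_label:
            raise ValueError("hash-label and entropy-label are mutually exclusive")


@dataclass(frozen=True)
class AdPerEviFlags:
    """C and F flags carried in an AD per-EVI route."""
    c: bool = False
    f: bool = False


def advertised_flags(cfg: EpipeMplsConfig) -> AdPerEviFlags:
    """Flags the local PE sets in its own AD per-EVI route."""
    return AdPerEviFlags(c=cfg.control_word, f=cfg.hash_label)


def destination_allowed(cfg: EpipeMplsConfig, remote: AdPerEviFlags) -> bool:
    """The EVPN destination is created only if both flags match the local config."""
    return advertised_flags(cfg) == remote
```

A mismatch in either flag returns False, which corresponds to the EVPN destination not being created and the service staying down.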
The following features, described in EVPN-VPWS for VXLAN tunnels, are also supported for MPLS tunnels:
Advertisement of the Layer-2 MTU and consistency checking of the MTU of the AD per-EVI routes.
Use of A/S PW and MC-LAG at access.
EVPN multihoming, including:
Single-active and all-active
Regular or virtual ESs
All existing DF election modes
EVPN-VPWS services with local-switching support
Epipes with BGP-EVPN MPLS support the following configurations:
up to two endpoints
up to two SAPs associated with a different configured endpoint each
two pairs of local/remote attachment circuit Ethernet tags, also associated with different configured endpoints
EVPN destinations that can be used as Inter-Chassis Backup (ICB) links
The support of endpoints and up to two SAPs with local-switching allows two-node and three-node topologies for EVPN-VPWS. EVPN-VPWS endpoints example 1, EVPN-VPWS endpoints example 2, and EVPN-VPWS endpoints example 3 show examples of these topologies.
Example 1
The following figure shows an example of EVPN-VPWS endpoints.
In EVPN-VPWS endpoints example 1, PE1 is configured with the following Epipe services:
endpoint X create
exit
endpoint Y create
exit
bgp-evpn
evi 350
local-attachment-circuit "CE-1" endpoint "Y" create
eth-tag 1
exit
remote-attachment-circuit "ICB-1" endpoint "Y" create
eth-tag 2
exit
local-attachment-circuit "CE-2" endpoint "X" create
eth-tag 2
exit
remote-attachment-circuit "ICB-2" endpoint "X" create
eth-tag 1
exit
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 endpoint X create
exit
sap 1/1/1:1 endpoint Y create
exit
In EVPN-VPWS endpoints example 1, PE2 is configured with the following Epipe services:
bgp-evpn
evi 350
local-attachment-circuit "CE-1" create
eth-tag 1
exit
remote-attachment-circuit "ICB-1" create
eth-tag 2
exit
// implicit endpoint "Y"
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 create
exit
// implicit endpoint "X"
In this example, if we assume multihoming on CE1, the following applies:
PE1 advertises two AD per-EVI routes, for tags 1 and 2, respectively. PE2 advertises only the route for tag 1.
AD per-EVI routes for tag 1 are advertised based on CE1 SAPs' states
AD per-EVI route for tag 2 is advertised based on CE2 SAP state
PE1 creates endpoint X with sap lag-1:1 and ES-destination to tag 1 in PE2
PE2 creates the usual destination to tag 2 in PE1
In case of all-active MH:
traffic from CE1 to PE1 is forwarded to CE2 directly
traffic from CE1 to PE2 is forwarded to PE1 with the label that identifies CE2's SAP
traffic from CE2 is forwarded to CE1 directly because CE1's SAP is the endpoint Tx; in case of failure on CE1's SAP, PE1 changes the Tx object to the ES-destination to PE2
In case of single-active MH, traffic flows in the same way, except that a non-DF SAP is operationally down and therefore cannot be an endpoint Tx object.
Example 2
The following figure shows an example of EVPN-VPWS endpoints.
In EVPN-VPWS endpoints example 2, PE1 is configured with the following Epipe services.
endpoint X create
exit
endpoint Y create
exit
bgp-evpn
evi 350
local-attachment-circuit "CE-1" endpoint "Y" create
eth-tag 1
exit
remote-attachment-circuit "ICB-1" endpoint "Y" create
eth-tag 2
exit
local-attachment-circuit "CE-2" endpoint "X" create
eth-tag 2
exit
remote-attachment-circuit "ICB-2" endpoint "X" create
eth-tag 1
exit
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 endpoint X create
exit
sap lag-2:1 endpoint Y create
exit
In EVPN-VPWS endpoints example 2, PE2 is configured with the following Epipe services.
endpoint X create
exit
endpoint Y create
exit
bgp-evpn
evi 350
local-attachment-circuit "CE-1" endpoint "Y" create
eth-tag 1
exit
remote-attachment-circuit "ICB-1" endpoint "Y" create
eth-tag 2
exit
local-attachment-circuit "CE-2" endpoint "X" create
eth-tag 2
exit
remote-attachment-circuit "ICB-2" endpoint "X" create
eth-tag 1
exit
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 endpoint X create
exit
sap lag-2:1 endpoint Y create
exit
This example is similar to EVPN-VPWS endpoints example 1, except that both CEs are multihomed to the two PEs. In EVPN-VPWS endpoints example 2, if CE2 goes down, no traffic exists between the PEs because the two PEs lose all the objects in the endpoint connected to CE2. Traffic that arrives on EVPN is only forwarded to a SAP on a different endpoint.
Example 3
The following figure shows an example of EVPN-VPWS endpoints.
In EVPN-VPWS endpoints example 3, PE1 is configured with the following Epipe services.
bgp-evpn
evi 350
local-attachment-circuit "CE-1"
eth-tag 1
exit
remote-attachment-circuit "ICB-1"
eth-tag 2
exit
// implicit endpoint "Y"
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 create
// implicit endpoint "X"
exit
In EVPN-VPWS endpoints example 3, PE2 is configured with the following Epipe services.
endpoint X create
exit
endpoint Y create
exit
bgp-evpn
evi 350
local-attachment-circuit "CE-1" endpoint "Y"
eth-tag 1
exit
remote-attachment-circuit "ICB-1" endpoint "Y"
eth-tag 2
exit
local-attachment-circuit "CE-2" endpoint "X"
eth-tag 2
exit
remote-attachment-circuit "ICB-2" endpoint "X"
eth-tag 1
exit
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 endpoint X create
exit
sap lag-2:1 endpoint Y create
exit
In EVPN-VPWS endpoints example 3, PE3 is configured with the following Epipe services.
bgp-evpn
evi 350
local-attachment-circuit "CE-2"
eth-tag 2
exit
remote-attachment-circuit "ICB-2"
eth-tag 1
exit
// implicit endpoint "X"
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
sap lag-1:1 create
// implicit endpoint "Y"
exit
This example is similar to EVPN-VPWS endpoints example 2, except that a third node is added. Nodes PE1 and PE3 have implicit endpoints. Only node PE2 requires the configuration of endpoints.
EVPN for MPLS tunnels in routed VPLS services
EVPN-MPLS and IP-prefix advertisement (enabled by the ip-route-advertisement command) are fully supported in routed VPLS services and provide the same feature-set as EVPN-VXLAN. The following capabilities are supported in a service where bgp-evpn mpls is enabled:
R-VPLS with VRRP support on the VPRN or IES interfaces
R-VPLS support including ip-route-advertisement with regular interfaces
This includes the advertisement and processing of ip-prefix routes defined in IETF Draft draft-ietf-bess-evpn-prefix-advertisement with the appropriate encoding for EVPN-MPLS.
R-VPLS support including ip-route-advertisement with evpn-tunnel interfaces
R-VPLS with IPv6 support on the VPRN or IES IP interface
IES interfaces do not support either ip-route-advertisement or evpn-tunnel.
EVPN-MPLS multihoming and passive VRRP
SAP and spoke SDP based ESs are supported on R-VPLS services where bgp-evpn mpls is enabled.
EVPN-MPLS multihoming in R-VPLS services shows an example of EVPN-MPLS multihoming in R-VPLS services, with the following assumptions:
- There are two subnets for a specific customer (for example, EVI1 and EVI2 in EVPN-MPLS multihoming in R-VPLS services), and a VPRN is instantiated in all the PEs for efficient inter-subnet forwarding.
- A “backhaul” R-VPLS with evpn-tunnel mode enabled is used in the core to interconnect all the VPRNs. EVPN IP-prefix routes are used to exchange the prefixes corresponding to the two subnets.
- An all-active ES is configured for EVI1 on PE1 and PE2.
- A single-active ES is configured for EVI2 on PE3 and PE4.
In the example in EVPN-MPLS multihoming in R-VPLS services, the hosts connected to CE1 and CE4 could use regular VRRP for default gateway redundancy; however, this may not be the most efficient way to provide upstream routing.
For example, if PE1 and PE2 are using regular VRRP, the upstream traffic from CE1 may be hashed to the backup IRB VRRP interface instead of to the active interface. The same may occur for single-active multihoming and regular VRRP on PE3 and PE4: the traffic from CE4 is sent to PE3, while PE4 may be the active VRRP router. In that case, PE3 has to send the traffic to PE4 instead of routing it directly.
In both cases, unnecessary bandwidth between the PEs is used to get to the active IRB interface. In addition, VRRP scaling is limited if aggressive keepalive timers are used.
Because of these issues, passive VRRP is recommended as the best method when EVPN-MPLS multihoming is used in combination with R-VPLS redundant interfaces.
Passive VRRP is a VRRP setting in which the transmission and reception of keepalive messages is completely suppressed, and therefore the VPRN interface always behaves as the active router. Passive VRRP is enabled by adding the passive keyword to the VRRP instance at creation, as shown in the following examples:
- config service vprn 1 interface int-1 vrrp 1 passive
- config service vprn 1 interface int-1 ipv6 vrrp 1 passive
For example, if PE1, PE2, and PE5 in EVPN-MPLS multihoming in R-VPLS services use passive VRRP, the three PEs own the same virtual MAC and virtual IP address (for example, 00-00-5E-00-01-01 and 10.0.0.254), even though each individual R-VPLS interface has a different MAC/IP address, because they share the same VRRP instance 1 and the same backup IP. The virtual MAC is auto-derived as 00-00-5E-00-01-VRID per RFC 3768. The following is the expected behavior when passive VRRP is used in this example:
- All R-VPLS IRB interfaces for EVI1 have their own physical MAC/IP address; they also own the same default gateway virtual MAC and IP address.
- All EVI1 hosts have a unique configured default gateway; for example, 10.0.0.254.
- When CE1 or CE2 send upstream traffic to a remote subnet, the packets are routed by the closest PE because the virtual MAC is always local to the PE.
  For example, the packets from CE1 hashed to PE1 are routed at PE1. The packets from CE1 hashed to PE2 are routed directly at PE2.
- Downstream packets (for example, packets from CE3 to CE1), are routed directly by the PE to CE1, regardless of the PE to which PE5 routed the packets.
  For example, the packets from CE3 sent to PE1 are routed at PE1. The packets from CE3 sent to PE2 are routed at PE2.
- In case of ES failure in one of the PEs, the traffic is forwarded by the available PE.
  For example, if the packets routed by PE5 arrive at PE1 and the link to CE1 is down, then PE1 sends the packets to PE2. PE2 forwards the packets to CE1 even if the MAC source address of the packets matches PE2's virtual MAC address. Virtual MACs bypass the R-VPLS interface MAC protection.
The following list summarizes the advantages of using passive VRRP mode versus regular VRRP for EVPN-MPLS multihoming in R-VPLS services:
- Passive VRRP does not require multiple VRRP instances to achieve default gateway load-balancing. Only one instance per R-VPLS, therefore only one default gateway, is needed for all the hosts.
- The convergence time for link/node failures is not impacted by the VRRP convergence, as all the nodes in the VRRP instance are acting as active routers.
- Passive VRRP scales better than VRRP, as it does not use keepalive or BFD messages to detect failures and allow the backup to take over.
In EVPN all-active multi-homing scenarios with R-VPLS services where the advertisement of the ARP/ND entries is enabled, use the following command to avoid issues with MAC mobility caused by the MAC/IP Advertisement route for the ARP/ND entry being sent with ESI=0:
- MD-CLI
configure service vpls bgp-evpn routes mac-ip arp-nd-only-with-fdb-advertisement true
- classic CLI
configure service vpls bgp-evpn arp-nd-only-with-fdb-advertisement
When this command is enabled, the local ARP/ND entries of VPRN interfaces using this VPLS are advertised in this BGP-EVPN service only when the corresponding local MAC is programmed in the FDB.
In an EVPN multi-homing scenario, this command prevents the router from advertising a MAC/IP Advertisement route with the MAC and IP binding but without the correct ESI value (which is taken only when the MAC is properly programmed in the FDB against the ESI).
In addition, if an Ethernet segment SAP receives a frame, the MAC address can be re-programmed as type learned, even if the MAC was previously programmed as type EVPN.
Virtual Ethernet Segment for EVPN multihoming
SR OS supports virtual Ethernet Segments (vES) for EVPN multihoming in accordance with draft-ietf-bess-evpn-virtual-eth-segment.
Regular ESs can only be associated with ports, LAGs, and SDPs, which satisfies the redundancy requirements for CEs that are directly connected to the ES PEs by a port, LAG, or SDP. However, this implementation does not work when an aggregation network exists between the CEs and the ES PEs, which requires different ESs to be defined for the same port, LAG, or SDP.
All-active multihoming on vES shows an example of how CE1 and CE2 use all-active multihoming to the EVPN-MPLS network despite the third-party Ethernet aggregation network to which they are connected.
The ES association can be made in a more granular way by creating a vES. A vES can be associated with the following:
Q-tag ranges on dot1q ports or LAGs
S-tag ranges on QinQ ports or LAGs
C-tag ranges per S-tag on QinQ ports or LAGs
VC-ID ranges on SDPs
The following example displays the vES configuration options.
MD-CLI
[ex:/configure service system bgp evpn]
A:admin@node-2# info
ethernet-segment "vES-1" {
type virtual
association {
lag "lag-1" {
virtual-ranges {
dot1q {
q-tag 100 {
end 200
}
}
}
}
}
}
ethernet-segment "vES-2" {
type virtual
association {
port 1/1/1 {
virtual-ranges {
qinq {
s-tag-c-tag 1 c-tag-start 100 {
c-tag-end 200
}
s-tag 2 {
end 10
}
}
}
}
}
}
ethernet-segment "vES-3" {
type virtual
association {
sdp 1 {
virtual-ranges {
vc-id 1000 {
end 2000
}
}
}
}
}
classic CLI
A:node-2>config>service>system>bgp-evpn# info
----------------------------------------------
ethernet-segment "vES-1" virtual create
service-carving
mode auto
exit
lag 1
dot1q
q-tag-range 100 to 200
exit
shutdown
exit
ethernet-segment "vES-2" virtual create
service-carving
mode auto
exit
port 1/1/1
qinq
s-tag-range 2 to 10
s-tag 1 c-tag-range 100 to 200
exit
shutdown
exit
ethernet-segment "vES-3" virtual create
service-carving
mode auto
exit
sdp 1
vc-id-range 1000 to 2000
shutdown
exit
----------------------------------------------
Where:
- The virtual keyword creates an ES as defined in draft-ietf-bess-evpn-virtual-eth-segment. The configuration of the dot1q or QinQ nodes is allowed only when the ES is created as virtual.
- On the vES, the user must first create a port, LAG, or SDP before configuring a VLAN or VC-ID association. When added, the port/LAG type and encap-type are checked as follows:
  - If the encap-type is dot1q, only the dot1q context configuration is allowed; the qinq context cannot be configured.
  - If the encap-type is qinq, only the qinq context configuration is allowed; the dot1q context cannot be configured.
  - A dot1q, qinq, or VC-ID range is required for the vES to become operationally active.
- The dot1q Q-tag range determines which VIDs are associated with the vES on a specific dot1q port or LAG. The group of SAPs that match the configured port/LAG and VIDs is part of the vES.
- The QinQ S-tag range determines which outer VIDs are associated with the vES on the QinQ port or LAG.
- The QinQ S-tag C-tag range determines which inner C-tags per S-tag are associated with the vES on the QinQ port or LAG.
- The VC-ID range determines which VC IDs are associated with the vES on the configured SDP.
Although Q-tag values 0, * and 1 to 4094 are allowed, the following considerations must be taken into account when configuring a dot1q or qinq vES:
- Up to 8 dot1q or QinQ ranges may be configured in the same vES.
- When configuring a QinQ vES, a Q-tag included in an S-tag range cannot be included in the S-tag Q-tag of the s-tag qtag1 c-tag-range qtag2 [to qtag2] command. For example, the following combination is not supported in the same vES.
  s-tag-range 350 to 500
  s-tag 500 c-tag-range 100 to 200
  The following example shows a supported combination:
- MD-CLI
[ex:/configure service system bgp evpn]
A:admin@node-2# info
ethernet-segment "qinq" {
...
ethernet-segment "vES-4" {
type virtual
association {
port 1/1/1 {
virtual-ranges {
qinq {
s-tag-c-tag 500 c-tag-start 100 {
c-tag-end 200
}
s-tag-c-tag 600 c-tag-start 100 {
c-tag-end 200
}
s-tag-c-tag 600 c-tag-start 150 {
c-tag-end 200
}
s-tag 100 {
end 200
}
s-tag 300 {
end 400
}
}
}
}
}
}
- classic CLI
A:node-2>config>service>system>bgp-evpn>eth-seg>qinq# info
----------------------------------------------
s-tag-range 100 to 200
s-tag-range 300 to 400
s-tag 500 c-tag-range 100 to 200
s-tag 600 c-tag-range 100 to 200
s-tag 600 c-tag-range 150 to 200
Note: For more information about the contexts for this command, see:
- 7450 ESS, 7750 SR, 7950 XRS, and VSR MD-CLI Command Reference Guide
- 7450 ESS, 7750 SR, 7950 XRS, and VSR Classic CLI Command Reference Guide
- vES associations that contain Q-tags <0, *, null> are special and treated as follows:
  - When a special Q-tag value is configured in the from value of the range, the to value must be the same.
  - Q-tag values <0, *> are only supported for the Q-tag range and C-tag range; they are not supported in the S-tag range.
  - The Q-tag “null” value is only supported in the C-tag range if the s-tag is configured as “*”.
Examples of supported Q-tag values lists examples of the supported Q-tag values between 1 and 4094.
vES configuration for port 1/1/1 | SAP association |
---|---|
dot1q Q-tag range 100 | 1/1/1:100 |
dot1q Q-tag range 100 to 102 | 1/1/1:100, 1/1/1:101, 1/1/1:102 |
QinQ S-tag 100 C-tag-range 200 | 1/1/1:100.200 |
QinQ S-tag-range 100 | All the SAPs 1/1/1:100.x where x is a value between 1 to 4094, 0, * |
QinQ S-tag range 100 to 102 | All SAPs 1/1/1:100.x, 1/1/1:101.x, 1/1/1:102.x where x is a value between 1 to 4094, 0, * |
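The QinQ range semantics and the S-tag overlap restriction described above can be illustrated with a minimal Python sketch. It handles only numeric tag values from 1 to 4094 (the special values 0, *, and null are out of scope), and the `QinqVes` class with its `validate` and `matches` methods is a hypothetical helper, not an SR OS construct.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (from, to)


@dataclass
class QinqVes:
    """QinQ virtual-range association of a vES on one port or LAG."""
    s_tag_ranges: List[Range] = field(default_factory=list)
    s_tag_c_tag_ranges: List[Tuple[int, Range]] = field(default_factory=list)

    def validate(self) -> None:
        # An S-tag already covered by an S-tag range cannot also be used
        # as the S-tag of an "s-tag X c-tag-range ..." entry.
        for s_tag, _ in self.s_tag_c_tag_ranges:
            if any(lo <= s_tag <= hi for lo, hi in self.s_tag_ranges):
                raise ValueError(f"S-tag {s_tag} is already part of an S-tag range")

    def matches(self, s_tag: int, c_tag: int) -> bool:
        """Return True if SAP port:s_tag.c_tag belongs to this vES."""
        # S-tag range: every inner tag under a matching outer tag is included.
        if any(lo <= s_tag <= hi for lo, hi in self.s_tag_ranges):
            return True
        # S-tag plus C-tag range: both tags must match.
        return any(
            s == s_tag and lo <= c_tag <= hi
            for s, (lo, hi) in self.s_tag_c_tag_ranges
        )
```

For example, an association with S-tag range 100 to 102 and S-tag 500 C-tag range 100 to 200 matches SAPs 1/1/1:101.7 and 1/1/1:500.150 but not 1/1/1:500.201, while S-tag range 350 to 500 combined with S-tag 500 C-tag range 100 to 200 fails validation.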
Examples of supported combinations lists all the supported combinations that include Q-tag values <0, *, null>. Any other combination of these special values is not supported.
vES configuration for port 1/1/1 | SAP association |
---|---|
dot1q Q-tag range 0 | 1/1/1:0 |
dot1q Q-tag range * | 1/1/1:* |
QinQ S-tag 0 C-tag range * | 1/1/1:0.* |
QinQ S-tag * C-tag range * | 1/1/1:*.* |
QinQ S-tag * C-tag range null | 1/1/1:*.null |
QinQ S-tag x C-tag range 0 | 1/1/1:x.0 where x is a value between 1 to 4094 |
QinQ S-tag x C-tag range * | 1/1/1:x.* where x is a value between 1 to 4094 |
On vESs, the single-active and all-active modes are supported for EVPN-MPLS VPLS, Epipe, and PBB-EVPN services. Single-active multihoming is supported on port and SDP-based vESs, and all-active multihoming is only supported on LAG-based vESs.
The following considerations apply if the vES is used with PBB-EVPN services:
- B-MAC allocation procedures are the same as the regular ES procedures.
  Note: Two all-active vESs must use different ES B-MACs, even if they are defined in the same LAG.
- The vES implements C-MAC flush procedures described in RFC 7623. Optionally, the ISID-based C-MAC flush can be used for cases where the single-active vES does not use ES B-MAC allocation.
Preference-based and non-revertive DF election
In addition to the ES service-carving modes auto and off, the manual mode also supports the preference-based algorithm with the non-revertive option, as described in draft-rabadan-bess-evpn-pref-df.
When ES is configured to use the preference-based algorithm, the ES route is advertised with the Designated Forwarder (DF) election extended community (sub-type 0x06). DF election extended community shows the DF election extended community.
In the extended community, a DF type 2 preference algorithm is advertised with a 2-byte preference value (32767 by default) if the preference-based manual mode is configured. The Don't Preempt Me (DP) bit is set if the non-revertive option is enabled.
The following CLI excerpt shows the relevant commands to enable the preference-based DF election on a specific ES (regular or virtual):
config>service>system>bgp-evpn>ethernet-segment#
...
service-carving mode {manual|auto|off}
service-carving manual
[no] preference [create] [non-revertive]
value <value>
exit
[no] evi <evi> [to <evi>]
[no] isid <isid> [to <isid>]
# value 0..65535; default 32767
...
Where:
The preference value can be changed on an active ES without shutting down the ES, and therefore, a new DF can be forced for maintenance or other reasons.
The service-carving mode must be changed to manual mode to create the preference context.
The preference command is supported on regular or virtual ES, regardless of the multihoming mode (single-active or all-active) or the service type (VPLS, I-VPLS, or Epipe).
By default, the highest-preference PE in the ES becomes the DF for an EVI or ISID, using the DP bit as the tiebreaker first (DP=1 wins over DP=0) and the lowest PE-IP as the last-resort tiebreaker. All the explicitly configured EVI or ISID ranges select the lowest preference PE as the DF (whereas the non-configured EVI or ISID values select the highest preference PE).
This selection is displayed as Cfg Range Type: lowest-pref in the following show command example.
*A:PE-2# show service system bgp-evpn ethernet-segment name "vES-23"
===============================================================================
Service Ethernet Segment
===============================================================================
Name : vES-23
Eth Seg Type : Virtual
Admin State : Enabled
Oper State : Up
ESI : 01:23:23:23:23:23:23:23:23:23
Multi-homing : allActive
Oper Multi-homing : allActive
ES SHG Label : 262141
Source BMAC LSB : 00-23
ES BMac Tbl Size : 8
ES BMac Entries : 0
Lag Id : 1
ES Activation Timer : 3 secs (default)
Svc Carving : manual
Oper Svc Carving : manual
Cfg Range Type : lowest-pref
-------------------------------------------------------------------------------
DF Pref Election Information
-------------------------------------------------------------------------------
Preference     Preference    Last Admin Change        Oper Pref      Do No
Mode           Value                                  Value          Preempt
-------------------------------------------------------------------------------
non-revertive  100           12/21/2016 15:16:38      100            Enabled
-------------------------------------------------------------------------------
EVI Ranges: <none>
ISID Ranges: <none>
===============================================================================
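The selection rules for the preference-based algorithm can be sketched in Python as follows. This is a simplified model, not the router's implementation: the `Candidate` class and `elect_df` function are hypothetical names, and the sketch assumes that the same tiebreakers (DP bit first, then lowest PE IP address) also apply to the EVI or ISID values in configured ranges, which select the lowest preference.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """One PE in the ES, as seen through its ES route."""
    pe_ip: str
    preference: int = 32767   # default preference value
    dp: bool = False          # Don't Preempt Me bit


def elect_df(candidates: Iterable[Candidate], in_configured_range: bool = False) -> Candidate:
    """Pick the DF for one EVI or ISID.

    Non-configured EVI/ISID values select the highest preference;
    values inside a configured range select the lowest preference.
    Ties: DP=1 wins over DP=0, then the lowest PE IP address wins.
    """
    cands = list(candidates)
    if not cands:
        raise ValueError("no candidates in the ES")

    def sort_key(c: Candidate):
        pref = c.preference if in_configured_range else -c.preference
        return (pref, not c.dp, int(ip_address(c.pe_ip)))

    return min(cands, key=sort_key)
```

With two PEs at preferences 10000 and 20000, the second PE is the DF for non-configured EVIs and the first PE is the DF for EVIs in the configured range, so a single preference value per PE is enough to split the DF role across the two groups of services.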
The EVI and ISID ranges configured on the service-carving context are not required to be consistent with any ranges configured for vESs.
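The selection rules described above can be illustrated with a minimal Python sketch. It is not the router's implementation; the `Candidate` class, the function names, and the 192.0.2.x addresses are hypothetical and exist only for illustration. The sketch assumes that each candidate carries the preference and DP bit received in its ES route.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class Candidate:
    pe_ip: str   # PE IP address (last-resort tiebreaker)
    pref: int    # preference advertised in the ES route
    dp: int      # DP bit advertised in the ES route (1 or 0)


def elect_df(candidates, lowest_pref=False):
    """Return the candidate elected as DF for one EVI or ISID."""
    def rank(c):
        # min() picks the smallest tuple, so negate fields where "higher wins".
        pref_key = c.pref if lowest_pref else -c.pref
        return (pref_key, -c.dp, int(ip_address(c.pe_ip)))
    return min(candidates, key=rank)


def df_for_evi(candidates, evi, configured_ranges):
    """Configured EVI ranges use lowest-pref; all other EVIs use highest-pref."""
    in_range = any(low <= evi <= high for low, high in configured_ranges)
    return elect_df(candidates, lowest_pref=in_range)


pe1 = Candidate("192.0.2.1", pref=10000, dp=1)
pe2 = Candidate("192.0.2.2", pref=5000, dp=1)
ranges = [(2001, 4000)]
print(df_for_evi([pe1, pe2], 1, ranges).pe_ip)     # EVI outside the ranges
print(df_for_evi([pe1, pe2], 3000, ranges).pe_ip)  # EVI inside the ranges
```

Encoding the tiebreakers as one sort key keeps the order of comparison (preference, then DP bit, then PE IP) visible in a single place.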
If the non-revertive option is configured, when the former DF comes back up after a failure and checks existing ES routes, it advertises an operational preference and DP bit, which does not cause a DF switchover for the ES EVI/ISID values.
- The non-revertive option prevents an ES DF switchover in the following events:
  - ES port recovery after a failure
  - node recovery after a reboot or power-up event
- The non-revertive option does not prevent an ES DF switchover when the ES is administratively enabled or when any other event attempts to recover the ES when the ES routes from the ES peers are not present yet. An example of this is when the user executes the clear card command on all of the line cards in the router. When the ES is brought up, the BGP session is still recovering and therefore, there are no remote ES routes from the ES peers. Use the following command to prevent this situation for reboots or node power-up events (but not for any other events).
configure redundancy bgp-evpn-multi-homing boot-timer
The following configuration example shows the use of the preference-based algorithm and non-revertive option in an ES defined in PE1 and PE2.
*A:PE-1>config>service>system>bgp-evpn# info
----------------------------------------------
ethernet-segment "ES1" create
    esi 01:00:00:00:00:12:00:00:00:01
    service-carving manual
        preference non-revertive create
            value 10000
        exit
        evi 2001 to 4000
    exit
    multi-homing single-active
    port 1/1/1
    no shutdown
/* example of vpls 1 - similar config exists for evis 2-4000 */
*A:PE-1>config>service>vpls# info
----------------------------------------------
vxlan vni 1 create
exit
bgp-evpn
    evi 1
    mpls bgp 1
        ecmp 2
        auto-bind-tunnel
            resolution any
        exit
sap 1/1/1:1 create
no shutdown
----------------------------------------------
*A:PE-2>config>service>system>bgp-evpn# info
----------------------------------------------
ethernet-segment "ES1" create
    esi 01:00:00:00:00:12:00:00:00:01
    service-carving manual
        preference non-revertive create
            value 5000
        exit
        evi 2001 to 4000
    exit
    multi-homing single-active
    port 1/1/1
    no shutdown
*A:PE-2>config>service>vpls# info
----------------------------------------------
vxlan vni 1 create
exit
bgp-evpn
    evi 1
    mpls bgp 1
        ecmp 2
        auto-bind-tunnel
            resolution any
        exit
sap 1/1/1:1 create
no shutdown
----------------------------------------------
Based on the configuration in the preceding example, the PE behavior is as follows:
Assuming the ES is no shutdown on both PE1 and PE2, the PEs exchange ES routes, including the [Pref, DP-bit] in the DF election extended community.
For EVIs 1 to 2000, PE2 immediately becomes the NDF, whereas PE1 becomes the DF and, after the es-activation-timer expires, brings up its SAP in EVIs 1 to 2000.
For EVIs 2001 to 4000, the result is the opposite and PE2 becomes the DF.
If port 1/1/1 on PE1 goes down, PE1 withdraws its ES route and PE2 becomes the DF for EVIs 1 to 2000.
When port 1/1/1 on PE1 comes back up, PE1 compares its ES1 preference with the preferences advertised by the remote PEs in ES1. PE1 advertises the ES route with an "in-use operational" Pref = 5000 and DP=0. Because PE2's Pref is the same as PE1's operational value, but PE2's DP=1, PE2 continues to be the DF for EVIs 1 to 4000.
Note: The DP bit is the tiebreaker in case of equal Pref and regardless of the choice of highest or lowest preference algorithm.
PE1's "in-use" Pref and DP continue to be [5000,0] until one of the following conditions is true:
- PE2 withdraws its ES route, in which case PE1 re-advertises its admin Pref and DP [10000,DP=1]
- The user changes PE1's Pref configuration
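The non-revertive behavior in the preceding walk-through can be sketched in Python as follows. This is an illustrative model only, not the SR OS logic. The function names are hypothetical, the PE-IP tiebreaker is omitted, and adopting the highest remote preference as the "in-use" value is a simplifying assumption that matches the two-PE example above.

```python
def advertised_pref(admin_pref, remote_routes):
    """Return the (pref, dp) a non-revertive PE advertises when its ES recovers.

    remote_routes is a list of (pref, dp) tuples received from the ES peers.
    """
    if not remote_routes:
        # No peer ES routes are present: advertise the administrative values.
        return (admin_pref, 1)
    # Assumption: adopt the highest remote preference and clear the DP bit,
    # so that the current DF is not preempted.
    in_use = max(pref for pref, _dp in remote_routes)
    return (in_use, 0)


def elect(routes, lowest_pref=False):
    """routes maps a PE name to (pref, dp); returns the name of the DF."""
    def rank(item):
        _name, (pref, dp) = item
        return (pref if lowest_pref else -pref, -dp)
    return min(routes.items(), key=rank)[0]


pe2 = (5000, 1)                           # PE2 stayed up, admin Pref 5000, DP=1
pe1 = advertised_pref(10000, [pe2])       # PE1 port 1/1/1 comes back up
routes = {"PE1": pe1, "PE2": pe2}
print(pe1)                                # in-use values advertised by PE1
print(elect(routes))                      # EVIs 1 to 2000 (highest-pref)
print(elect(routes, lowest_pref=True))    # EVIs 2001 to 4000 (lowest-pref)
print(advertised_pref(10000, []))         # after PE2 withdraws its ES route
```

Because both PEs end up with the same preference, the DP bit alone decides the election in either algorithm, which is why PE2 keeps the DF role for all EVIs.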
EVPN-MPLS routed VPLS multicast routing support
IPv4 multicast routing is supported in an EVPN-MPLS VPRN routed VPLS service through its IP interface, when the source of the multicast stream is on one side of its IP interface and the receivers are on either side of the IP interface. For example, the source for multicast stream G1 could be on the IP side sending to receivers on both other regular IP interfaces and the VPLS of the routed VPLS service, while the source for group G2 could be on the VPLS side sending to receivers on both the VPLS and IP side of the routed VPLS service. See IPv4 and IPv6 multicast routing support for more details.
IGMP snooping in EVPN-MPLS and PBB EVPN services
IGMP snooping is supported in EVPN-MPLS VPLS and PBB-EVPN I-VPLS (where BGP EVPN is running in the associated B-VPLS service) services. It is also supported in EVPN-MPLS VPRN and IES R-VPLS services. It is required in scenarios where the operator does not want to flood all of the IP multicast traffic to the access nodes or CEs, and only wants to deliver IP multicast traffic for which IGMP reports have been received.
The following points apply when IGMP snooping is configured in EVPN-MPLS VPLS or PBB-EVPN I-VPLS services:
IGMP snooping is enabled using the configure service vpls igmp-snooping no shutdown command.
Queries and reports received on SAP or SDP bindings are snooped and properly handled; they are sent to SAP or SDP bindings as expected.
Queries and reports on EVPN-MPLS or PBB-EVPN B-VPLS destinations are handled as follows.
If received from SAP or SDP bindings, the queries and reports are sent to all EVPN-MPLS and PBB-EVPN B-VPLS destinations, regardless of whether the service is using an ingress replication or mLDP provider tunnel.
If received on an EVPN-MPLS or PBB-EVPN B-VPLS destination, the queries and reports are processed and propagated to access SAP or SDP bindings, regardless of whether the service is using an ingress replication or mLDP provider tunnel.
EVPN-MPLS and PBB-EVPN B-VPLS destinations are treated as a single IGMP snooping interface, which is always added as an mrouter.
The debug trace output displays one copy of messages being sent to all EVPN-MPLS and PBB-EVPN B-VPLS destinations (the trace does not show a copy for each destination) and displays messages received from all EVPN-MPLS and PBB-EVPN B-VPLS destinations as coming from a single EVPN-MPLS interface.
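The handling rules above can be summarized with a short Python sketch. It is a conceptual model only; the function names are hypothetical, and the logical interface name is taken from the show command output in this section. The sketch assumes that a message received on an EVPN destination is not copied to other EVPN destinations, which the text above does not state explicitly for IGMP.

```python
EVPN_INTERFACE = "evpn-mpls"  # single logical IGMP snooping interface


def snooping_interface(port, evpn_dests):
    """Map a receiving object to the interface seen by IGMP snooping."""
    return EVPN_INTERFACE if port in evpn_dests else port


def evpn_copies(ingress_port, evpn_dests):
    """Return the EVPN destinations that receive a copy of a query or report."""
    if ingress_port in evpn_dests:
        # Received from EVPN: propagated to access SAP or SDP bindings only
        # (assumption: no copy goes back to the EVPN destinations).
        return []
    # Received on a SAP or SDP binding: sent to all EVPN destinations.
    return list(evpn_dests)


def mrouter_interfaces(learned_mrouter_ports, evpn_dests):
    """The EVPN logical interface is always present in the mrouter set."""
    result = {snooping_interface(p, evpn_dests) for p in learned_mrouter_ports}
    result.add(EVPN_INTERFACE)
    return result


dests = ["eMpls:192.0.2.3:262132", "eMpls:192.0.2.4:262136"]
print(evpn_copies("sap:1/1/1:2000", dests))
print(evpn_copies(dests[0], dests))
print(mrouter_interfaces([], dests))
```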
In the following show command output, the EVPN-MPLS destinations are shown as part of the MFIB (when igmp-snooping is in a no shutdown state), and the EVPN-MPLS logical interface is shown as an mrouter.
*A:PE-2# show service id 2000 mfib
===============================================================================
Multicast FIB, Service 2000
===============================================================================
Source Address Group Address SAP or SDP Id Svc Id Fwd
Blk
-------------------------------------------------------------------------------
* * eMpls:192.0.2.3:262132 Local Fwd
eMpls:192.0.2.4:262136 Local Fwd
eMpls:192.0.2.5:262131 Local Fwd
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
*A:PE-2# show service id 2000 igmp-snooping base
===============================================================================
IGMP Snooping Base info for service 2000
===============================================================================
Admin State : Up
Querier : 10.0.0.3 on evpn-mpls
-------------------------------------------------------------------------------
SAP or SDP Oper MRtr Pim Send Max Max Max MVR Num
Id Stat Port Port Qrys Grps Srcs Grp From-VPLS Grps
Srcs
-------------------------------------------------------------------------------
sap:1/1/1:2000 Up No No No None None None Local 0
evpn-mpls Up Yes N/A N/A N/A N/A N/A N/A N/A
===============================================================================
*A:PE-4# show service id 2000 igmp-snooping mrouters
===============================================================================
IGMP Snooping Multicast Routers for service 2000
===============================================================================
MRouter SAP or SDP Id Up Time Expires Version
-------------------------------------------------------------------------------
10.0.0.3 evpn-mpls 0d 00:38:49 175s 3
-------------------------------------------------------------------------------
Number of mrouters: 1
===============================================================================
The equivalent output for PBB-EVPN services is similar to the output above for EVPN-MPLS services, with the exception that the EVPN destinations are named "b-EVPN-MPLS".
Data-driven IGMP snooping synchronization with EVPN multihoming
When single-active multihoming is used, the IGMP snooping state is learned on the active multihoming object. If a failover occurs, the system with the newly active multihoming object must wait for IGMP messages to be received to instantiate the IGMP snooping state after the ES activation timer expires; this could result in an increased outage.
The outage can be reduced by using MCS synchronization, which is supported for IGMP snooping in both EVPN-MPLS and PBB-EVPN services (see Multi-chassis synchronization for Layer 2 snooping states). However, MCS only supports synchronization between two PEs, whereas EVPN multihoming is supported between a maximum of four PEs. Also, IGMP snooping state can be synchronized only on a SAP.
An increased outage would also occur when using all-active EVPN multihoming. The IGMP snooping state on an ES LAG SAP or virtual ES to the attached CE must be synchronized between all the ES PEs, as the LAG link used by the DF PE may not be the same as that used by the attached CE. MCS synchronization is not applicable to all-active multihoming as MCS only supports active/standby synchronization.
To eliminate any additional outage on a multihoming failover, IGMP snooping messages can be synchronized between the PEs on an ES using data-driven IGMP snooping state synchronization, which is supported in EVPN-MPLS services, PBB-EVPN services, EVPN-MPLS VPRN and IES R-VPLS services. The IGMP messages received on an ES SAP or spoke SDP are sent to the peer ES PEs with an ESI label (for EVPN-MPLS) or ES B-MAC (for PBB-EVPN) and these are used to synchronize the IGMP snooping state on the ES SAP or spoke SDP on the receiving PE.
Data-driven IGMP snooping state synchronization is supported for both all-active multihoming and single-active with an ESI label multihoming in EVPN-MPLS, EVPN-MPLS VPRN and IES R-VPLS services, and for all-active multihoming in PBB-EVPN services. All PEs participating in a multihomed ES must be running an SR OS version supporting this capability. PBB-EVPN with IGMP snooping using single-active multihoming is not supported.
Data-driven IGMP snooping state synchronization is also supported with P2MP mLDP LSPs in both EVPN-MPLS and PBB-EVPN services. When P2MP mLDP LSPs are used in EVPN-MPLS services, all PEs (including the PEs not connected to a multihomed ES) in the EVPN-MPLS service must be running an SR OS version supporting this capability with IGMP snooping enabled and all network interfaces must be configured on FP3 or higher-based line cards.
Data-driven IGMP snooping synchronization with EVPN multihoming shows the processing of an IGMP message for EVPN-MPLS. In PBB-EVPN services, the ES B-MAC is used instead of the ESI label to synchronize the state.
Data-driven synchronization is enabled by default when IGMP snooping is enabled within an EVPN-MPLS service using all-active multihoming or single-active with an ESI label multihoming, or in a PBB-EVPN service using all-active multihoming. If IGMP snooping MCS synchronization is enabled on an EVPN-MPLS or PBB-EVPN (I-VPLS) multihoming SAP then MCS synchronization takes precedence over the data-driven synchronization and the MCS information is used. Mixing data-driven and MCS IGMP synchronization within the same ES is not supported.
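The support matrix and the precedence rule described above can be captured in a small Python sketch. The function names and string values are hypothetical and used only for illustration; the sketch does not validate combinations that the text marks as unsupported, and it is not a representation of any SR OS API.

```python
def data_driven_supported(service, multihoming, esi_label=True):
    """Return True if data-driven IGMP snooping synchronization applies.

    service is "evpn-mpls" or "pbb-evpn";
    multihoming is "all-active" or "single-active".
    """
    if multihoming == "all-active":
        return service in ("evpn-mpls", "pbb-evpn")
    if multihoming == "single-active":
        # Only EVPN-MPLS single-active with an ESI label is covered.
        return service == "evpn-mpls" and esi_label
    raise ValueError(f"unknown multihoming mode: {multihoming}")


def effective_sync(service, multihoming, esi_label=True, mcs_enabled=False):
    """MCS synchronization, when enabled on the SAP, takes precedence."""
    if mcs_enabled:
        return "mcs"
    if data_driven_supported(service, multihoming, esi_label):
        return "data-driven"
    return "none"


print(effective_sync("evpn-mpls", "all-active"))
print(effective_sync("evpn-mpls", "single-active", mcs_enabled=True))
print(effective_sync("pbb-evpn", "single-active"))
```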
When using EVPN-MPLS, the ES should be configured as non-revertive to avoid an outage when a PE takes over the DF role. The Ethernet A-D per ESI route update is withdrawn when the ES is down which prevents state synchronization to the PE with the ES down, as it does not advertise an ESI label. The lack of state synchronization means that if the ES comes up and that PE becomes DF after the ES activation timer expires, it may not have any IGMP snooping state until the next IGMP messages are received, potentially resulting in an additional outage. Configuring the ES as non-revertive can avoid this potential outage. Configuring the ES to be non-revertive would also avoid an outage when PBB-EVPN is used, but there is no outage related to the lack of the ESI label as it is not used in PBB-EVPN.
The following steps can be used when enabling IGMP snooping in EVPN-MPLS and PBB-EVPN services:
- Upgrade SR OS on all ES PEs to a version supporting data-driven IGMP snooping synchronization with EVPN multihoming.
- Enable IGMP snooping in the required services on all ES PEs. Traffic loss occurs until all ES PEs have IGMP snooping enabled and the first set of join/query messages are processed by the ES PEs.
Note: There is no action required on the non-ES PEs.
If P2MP mLDP LSPs are also configured, the following steps can be used when enabling IGMP snooping in EVPN-MPLS and PBB-EVPN services:
- Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven IGMP snooping synchronization with EVPN multihoming.
- Enable IGMP snooping in EVPN-MPLS and PBB-EVPN services.
  - Perform the following steps for EVPN-MPLS:
    - Enable IGMP snooping on all non-ES PEs. Traffic loss occurs until the first set of join/query messages are processed by the non-ES PEs.
    - Then enable IGMP snooping on all ES PEs. Traffic loss occurs until all PEs have IGMP snooping enabled and the first set of join/query messages are processed by the ES PEs.
  - Perform the following steps for PBB-EVPN:
    - Enable IGMP snooping on all ES PEs. Traffic loss occurs until all PEs have IGMP snooping enabled and the first set of join/query messages are processed by the ES PEs.
    - There is no action required on the non-ES PEs.
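The two enablement procedures above differ only in which PEs are touched and in what order. The following Python sketch encodes that ordering as data, for example as input to a change-planning script. The function name and step strings are hypothetical.

```python
def igmp_enable_plan(service, mldp=False):
    """Return the ordered steps for enabling IGMP snooping.

    service is "evpn-mpls" or "pbb-evpn"; mldp indicates whether
    P2MP mLDP LSPs are also configured in the service.
    """
    if service not in ("evpn-mpls", "pbb-evpn"):
        raise ValueError(f"unknown service type: {service}")
    if not mldp:
        # Without P2MP mLDP, only the ES PEs need any action.
        return ["upgrade ES PEs", "enable on ES PEs"]
    plan = ["upgrade all PEs"]
    if service == "evpn-mpls":
        # Non-ES PEs first, then ES PEs.
        plan += ["enable on non-ES PEs", "enable on ES PEs"]
    else:
        # PBB-EVPN: no action on the non-ES PEs.
        plan += ["enable on ES PEs"]
    return plan


for svc in ("evpn-mpls", "pbb-evpn"):
    print(svc, igmp_enable_plan(svc, mldp=True))
```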
To aid with troubleshooting, the debug packet output displays the IGMP packets used for the snooping state synchronization. An example of a join sent on ES esi-1 from one ES PE and the same join received on another ES PE follows.
6 2017/06/16 18:00:07.819 PDT MINOR: DEBUG #2001 Base IGMP
"IGMP: TX packet on svc 1
from chaddr 5e:00:00:16:d8:2e
send towards ES:esi-1
Port : evpn-mpls
SrcIp : 0.0.0.0
DstIp : 239.0.0.22
Type : V3 REPORT
Num Group Records: 1
Group Record Type: MODE_IS_EXCL (2), AuxDataLen 0, Num Sources 0
Group Addr: 239.0.0.1
4 2017/06/16 18:00:07.820 PDT MINOR: DEBUG #2001 Base IGMP
"IGMP: RX packet on svc 1
from chaddr d8:2e:ff:00:01:41
received via evpn-mpls on ES:esi-1
Port : sap lag-1:1
SrcIp : 0.0.0.0
DstIp : 239.0.0.22
Type : V3 REPORT
Num Group Records: 1
Group Record Type: MODE_IS_EXCL (2), AuxDataLen 0, Num Sources 0
Group Addr: 239.0.0.1
PIM snooping for IPv4 in EVPN-MPLS and PBB-EVPN services
PIM snooping for VPLS allows a VPLS PE router to build multicast states by snooping PIM protocol packets that are sent over the VPLS. The VPLS PE then forwards multicast traffic based on the multicast states. When all receivers in a VPLS are IP multicast routers running PIM, multicast forwarding in the VPLS is efficient when PIM snooping for VPLS is enabled.
PIM snooping for IPv4 is supported in EVPN-MPLS (for VPLS and R-VPLS) and PBB-EVPN I-VPLS (where BGP EVPN is running in the associated B-VPLS service) services. It is enabled using the following command (as IPv4 multicast is enabled by default):
configure service vpls <service-id> pim-snooping
PIM snooping on SAPs and spoke SDPs operates in the same way as in a plain VPLS service. However, EVPN-MPLS/PBB-EVPN B-VPLS destinations are treated as a single PIM interface, specifically:
Hellos and join/prune messages from SAPs or SDPs are always sent to all EVPN-MPLS or PBB-EVPN B-VPLS destinations.
As soon as a hello message is received from one PIM neighbor on an EVPN-MPLS or PBB-EVPN B-VPLS destination, the single interface representing all EVPN-MPLS or PBB-EVPN B-VPLS destinations has that PIM neighbor.
The EVPN-MPLS or PBB-EVPN B-VPLS destination split horizon logic ensures that IP multicast traffic and PIM messages received on an EVPN-MPLS or PBB-EVPN B-VPLS destination are not forwarded back to other EVPN-MPLS or PBB-EVPN B-VPLS destinations.
The debug trace output displays one copy of messages being sent to all EVPN-MPLS or PBB-EVPN B-VPLS destinations (the trace does not show a copy for each destination) and displays messages received from all EVPN-MPLS or PBB-EVPN B-VPLS destinations as coming from a single EVPN-MPLS interface.
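The single-interface treatment of EVPN destinations can be modeled with the following Python sketch. It is a conceptual illustration, not the system's data structure; the class and method names are hypothetical, and the interface name follows the show command output later in this section.

```python
from collections import defaultdict


class PimSnoopingNeighbors:
    """Tracks PIM neighbors per snooping interface."""

    EVPN_INTERFACE = "EVPN-MPLS"

    def __init__(self, evpn_dests):
        self.evpn_dests = set(evpn_dests)
        self.by_interface = defaultdict(set)

    def interface_for(self, port):
        # Every EVPN destination maps to the same logical PIM interface.
        return self.EVPN_INTERFACE if port in self.evpn_dests else port

    def on_hello(self, port, neighbor_ip):
        self.by_interface[self.interface_for(port)].add(neighbor_ip)

    def forward_targets(self, ingress_port, access_ports):
        """Where a snooped hello or join/prune is sent, by interface."""
        targets = {p for p in access_ports if p != ingress_port}
        if ingress_port not in self.evpn_dests:
            # From a SAP or SDP: always sent to all EVPN destinations.
            targets.add(self.EVPN_INTERFACE)
        # From an EVPN destination: split horizon, nothing back to EVPN.
        return targets


nbrs = PimSnoopingNeighbors(["eMpls:1.1.1.2:262141", "eMpls:1.1.1.3:262141"])
nbrs.on_hello("sap:1/1/9:1", "10.0.0.1")
nbrs.on_hello("eMpls:1.1.1.2:262141", "10.0.0.2")
nbrs.on_hello("eMpls:1.1.1.3:262141", "10.0.0.3")
print(sorted(nbrs.by_interface["EVPN-MPLS"]))
```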
PIM snooping for IPv4 is supported in EVPN-MPLS services using P2MP LSPs and PBB-EVPN I-VPLS services with P2MP LSPs in the associated B-VPLS service. When PIM snooping is enabled with P2MP LSPs, at least one EVPN-MPLS multicast destination is required to be established to enable the processing of PIM messages by the system.
Multi-chassis synchronization (MCS) of PIM snooping for IPv4 state is supported for both SAPs and spoke SDPs which can be used with single-active multihoming. Care should be taken when using *.null to define the range for a QinQ virtual ES if the associated SAPs are also being synchronized by MCS, as there is no equivalent MCS sync-tag support to the *.null range.
PBB-EVPN services operate in a similar way to regular PBB services, specifically:
The multicast flooding between the I-VPLS and the B-VPLS works in a similar way as for PIM snooping for IPv4 with an I-VPLS using a regular B-VPLS. The first PIM join message received over the local B-VPLS from a B-VPLS SAP or SDP or EVPN destination adds all of the B-VPLS SAP or SDP or EVPN components into the related multicast forwarding table associated with that I-VPLS context. The multicast packets are forwarded throughout the B-VPLS on the per ISID single tree.
When a PIM router is connected to a remote I-VPLS instance over the B-VPLS infrastructure, its location is identified by the B-VPLS SAP, SDP or by the set of all EVPN destinations on which its PIM hellos are received. The location is also identified by the source B-MAC address used in the PBB header for the PIM hello message (this is the B-MAC associated with the B-VPLS instance on the remote PBB PE).
In EVPN-MPLS services, the individual EVPN-MPLS destinations appear in the MFIB but the information for each EVPN-MPLS destination entry is always identical, as shown below:
*A:PE# show service id 1 mfib
===============================================================================
Multicast FIB, Service 1
===============================================================================
Source Address Group Address Port Id Svc Id Fwd
Blk
-------------------------------------------------------------------------------
* 239.252.0.1 sap:1/1/9:1 Local Fwd
eMpls:1.1.1.2:262141 Local Fwd
eMpls:1.1.1.3:262141 Local Fwd
-------------------------------------------------------------------------------
Number of entries: 1
===============================================================================
*A:PE#
Similarly for the PIM neighbors:
*A:PE# show service id 1 pim-snooping neighbor
===============================================================================
PIM Snooping Neighbors ipv4
===============================================================================
Port Id Nbr DR Prty Up Time Expiry Time Hold Time
Nbr Address
-------------------------------------------------------------------------------
SAP:1/1/9:1 1 0d 00:08:17 0d 00:01:29 105
10.0.0.1
EVPN-MPLS 1 0d 00:27:26 0d 00:01:19 105
10.0.0.2
EVPN-MPLS 1 0d 00:27:26 0d 00:01:19 105
10.0.0.3
-------------------------------------------------------------------------------
Neighbors : 3
===============================================================================
*A:PE#
A single EVPN-MPLS interface is shown in the outgoing interface, as can be seen in the following output:
*A:PE# show service id 1 pim-snooping group detail
===============================================================================
PIM Snooping Source Group ipv4
===============================================================================
Group Address : 239.252.0.1
Source Address : *
Up Time : 0d 00:07:07
Up JP State : Joined Up JP Expiry : 0d 00:00:37
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
RPF Neighbor : 10.0.0.1
Incoming Intf : SAP:1/1/9:1
Outgoing Intf List : EVPN-MPLS, SAP:1/1/9:1
Forwarded Packets : 0 Forwarded Octets : 0
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
*A:PE#
An example of the debug trace output for a join received on an EVPN-MPLS destination is shown below:
A:PE1# debug service id 1 pim-snooping packet jp
A:PE1#
32 2016/12/20 14:21:22.68 CET MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: Join/Prune
[000 02:16:02.460] PIM-RX ifId 1071394 ifName EVPN-MPLS 10.0.0.3 -> 224.0.0.13
Length: 34
PIM Version: 2 Msg Type: Join/Prune Checksum: 0xd3eb
Upstream Nbr IP : 10.0.0.1 Resvd: 0x0, Num Groups 1, HoldTime 210
Group: 239.252.0.1/32 Num Joined Srcs: 1, Num Pruned Srcs: 0
Joined Srcs:
10.0.0.1/32 Flag SWR <*,G>
The equivalent output for PBB-EVPN services is similar to that above for EVPN-MPLS services, with the exception that the EVPN destinations are named "b-EVPN-MPLS".
Data-driven PIM snooping for IPv4 synchronization with EVPN multihoming
When single-active multihoming is used, PIM snooping for IPv4 state is learned on the active multihoming object. If a failover occurs, the system with the newly active multihoming object must wait for IPv4 PIM messages to be received to instantiate the PIM snooping for IPv4 state after the ES activation timer expires, which could result in an increased outage.
This outage can be reduced by using MCS synchronization, which is supported for PIM snooping for IPv4 in both EVPN-MPLS and PBB-EVPN services (see Multi-chassis synchronization for Layer 2 snooping states). However, MCS only supports synchronization between two PEs, whereas EVPN multihoming is supported between a maximum of four PEs.
An increased outage would also occur when using all-active EVPN multihoming. The PIM snooping for IPv4 state on an all-active ES LAG SAP or virtual ES to the attached CE must be synchronized between all the ES PEs, as the LAG link used by the DF PE may not be the same as that used by the attached CE. MCS synchronization is not applicable to all-active multihoming as MCS only supports active/standby synchronization.
To eliminate any additional outage on a multihoming failover, snooped IPv4 PIM messages should be synchronized between the PEs on an ES using data-driven PIM snooping for IPv4 state synchronization, which is supported in both EVPN-MPLS and PBB-EVPN services. The IPv4 PIM messages received on an ES SAP or spoke SDP are sent to the peer ES PEs with an ESI label (for EVPN-MPLS) or ES B-MAC (for PBB-EVPN) and are used to synchronize the PIM snooping for IPv4 state on the ES SAP or spoke SDP on the receiving PE.
Data-driven PIM snooping state synchronization is supported for all-active multihoming and single-active with an ESI label multihoming in EVPN-MPLS services. All PEs participating in a multihomed ES must be running an SR OS version supporting this capability with PIM snooping for IPv4 enabled. It is also supported with P2MP mLDP LSPs in the EVPN-MPLS services, in which case all PEs (including the PEs not connected to a multihomed ES) must have PIM snooping for IPv4 enabled and all network interfaces must be configured on FP3 or higher-based line cards.
In addition, data-driven PIM snooping state synchronization is supported for all-active multihoming in PBB-EVPN services and with P2MP mLDP LSPs in PBB-EVPN services. All PEs participating in a multihomed ES, and all PEs using PIM proxy mode (including the PEs not connected to a multihomed ES) in the PBB-EVPN service must be running an SR OS version supporting this capability and must have PIM snooping for IPv4 enabled. PBB-EVPN with PIM snooping for IPv4 using single-active multihoming is not supported.
Data-driven PIM snooping for IPv4 synchronization with EVPN multihoming shows the processing of an IPv4 PIM message for EVPN-MPLS. In PBB-EVPN services, the ES B-MAC is used instead of the ESI label to synchronize the state.
Data-driven synchronization is enabled by default when PIM snooping for IPv4 is enabled within an EVPN-MPLS service using all-active multihoming or single-active with an ESI label multihoming, or in a PBB-EVPN service using all-active multihoming. If PIM snooping for IPv4 MCS synchronization is enabled on an EVPN-MPLS or PBB-EVPN (I-VPLS) multihoming SAP or spoke SDP, then MCS synchronization takes precedence over the data-driven synchronization and the MCS information is used. Mixing data-driven and MCS PIM synchronization within the same ES is not supported.
When using EVPN-MPLS, the ES should be configured as non-revertive to avoid an outage when a PE takes over the DF role. The Ethernet A-D per ESI route update is withdrawn when the ES is down, which prevents state synchronization to the PE with the ES down as it does not advertise an ESI label. The lack of state synchronization means that if the ES comes up and that PE becomes DF after the ES activation timer expires, it may not have any PIM snooping for IPv4 state until the next PIM messages are received, potentially resulting in an additional outage. Configuring the ES as non-revertive can avoid this potential outage. Configuring the ES to be non-revertive would also avoid an outage when PBB-EVPN is used, but there is no outage related to the lack of the ESI label as it is not used in PBB-EVPN.
The following steps can be used when enabling PIM snooping for IPv4 (using PIM snooping and PIM proxy modes) in EVPN-MPLS and PBB-EVPN services:
PIM snooping mode
- Upgrade SR OS on all ES PEs to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.
- Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.
Note: There is no action required on the non-ES PEs.
PIM proxy mode
EVPN-MPLS
- Upgrade SR OS on all ES PEs to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.
- Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.
Note: There is no action required on the non-ES PEs.
PBB-EVPN
- Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.
- Enable PIM snooping for IPv4 on all non-ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by each non-ES PE.
- Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by the ES PEs.
If P2MP mLDP LSPs are also configured, the following steps can be used when enabling PIM snooping for IPv4 (using PIM snooping and PIM proxy modes) in EVPN-MPLS and PBB-EVPN services.
PIM snooping mode
- Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.
- Then enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping enabled and the first set of join/hello messages are processed by the ES PEs.
Note: There is no action required on the non-ES PEs.
PIM proxy mode
- Upgrade SR OS on all PEs (both ES and non-ES) to a version supporting data-driven PIM snooping for IPv4 synchronization with EVPN multihoming.
- Enable PIM snooping for IPv4 on all non-ES PEs. Traffic loss occurs until all PEs have PIM snooping for IPv4 enabled and the first set of join/hello messages are processed by each non-ES PE.
- Enable PIM snooping for IPv4 on all ES PEs. Traffic loss occurs until all PEs have PIM snooping enabled and the first set of join/hello messages are processed by the ES PEs.
In the above steps, when PIM snooping for IPv4 is enabled, the traffic loss can be reduced or eliminated by configuring a larger hold-time (up to 300 seconds), during which multicast traffic is flooded.
To aid with troubleshooting, the debug packet output displays the PIM packets used for the snooping state synchronization. An example of a join sent on ES esi-1 from one ES PE and the same join received on another ES PE follows:
6 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: pimVplsFwdJPToEvpn
Forwarding to remote peer on bgp-evpn ethernet-segment esi-1"
7 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: Join/Prune
[000 00:19:37.040] PIM-TX ifId 1071394 ifName EVPN-MPLS-ES:esi-1 10.0.0.10 -> 224.0.0.13
Length: 34
PIM Version: 2 Msg Type: Join/Prune Checksum: 0xd2de
Upstream Nbr IP : 10.0.0.1 Resvd: 0x0, Num Groups 1, HoldTime 210
Group: 239.0.0.10/32 Num Joined Srcs: 1, Num Pruned Srcs: 0
Joined Srcs:
10.0.0.1/32 Flag SWR <*,G>
4 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: pimProcessPdu
Received from remote peer on bgp-evpn ethernet-segment esi-1, will be applied on lag-1:1"
5 2017/06/16 17:36:37.144 PDT MINOR: DEBUG #2001 Base PIM[vpls 1 ]
"PIM[vpls 1 ]: Join/Prune
[000 00:19:30.740] PIM-RX ifId 1071394 ifName EVPN-MPLS-ES:esi-1 10.0.0.10 -> 224.0.0.13
Length: 34
PIM Version: 2 Msg Type: Join/Prune Checksum: 0xd2de
Upstream Nbr IP : 10.0.0.1 Resvd: 0x0, Num Groups 1, HoldTime 210
Group: 239.0.0.10/32 Num Joined Srcs: 1, Num Pruned Srcs: 0
Joined Srcs:
10.0.0.1/32 Flag SWR <*,G>
EVPN E-Tree
This section contains information about EVPN E-Tree.
BGP EVPN control plane for EVPN E-Tree
The BGP EVPN control plane is extended and aligned with IETF RFC 8317 to support EVPN E-Tree services. EVPN E-Tree BGP routes shows the main EVPN extensions for the EVPN E-Tree information model.
The following BGP extensions are implemented for EVPN E-Tree services:
An EVPN E-Tree extended community (EC) sub-type 0x5 is defined. The following information is included:
The low-order bit of the Flags field contains the L bit (where L=1 indicates leaf AC).
The leaf label contains a 20-bit MPLS label in the high-order 20 bits of the label field. This leaf label is automatically allocated by the system or statically assigned by the evpn-etree-leaf-label <value> command.
The new E-Tree EC is sent with the following routes:
- AD per-ES per PE route for BUM egress filtering:
  Each EVPN E-Tree capable PE advertises an AD per-ES route with the E-Tree EC, and the following information:
  - Service RD and route-target; if ad-per-es-route-target evi-rt-set is configured, then non-zero ESI AD per-ES routes (used for multihoming) are sent per the evi-rt-set configuration, but E-Tree zero-ESI routes (used for E-Tree) are sent based on the default evi-rt configuration
  - ESI = 0
  - Eth Tag = MAX-ET
  - MPLS label = zero
- AD per-EVI route for root or leaf configuration consistency check as follows:
  - The E-Tree EC is sent with the AD per-EVI routes for a specific ES. In this case, no validation is performed by the implementation, and the leaf indication is only used for troubleshooting on the remote PEs.
  - The MPLS label value is zero.
  - All attachment circuits (ACs) in each EVI for a specific ES must be configured as either a root or leaf AC, but not a combination. In case of a configuration error, for example, where the AC in PE1 is configured as root and in PE2 as leaf AC, the remote PE3 receives the AD per-EVI routes with inconsistent leaf indication. However, the unicast filtering remains unaffected and is still handled by the FDB lookup information.
- MAC/IP routes for known unicast ingress filtering as follows:
  - An egress PE sends all MAC/IP routes learned over a leaf AC SAP or spoke SDP with this E-Tree EC indicating that the MAC/IP belongs to a leaf AC.
  - The MPLS label value in the EC is 0.
  - Upon receiving a route with E-Tree EC, the ingress PE imports the route and installs the MAC in the FDB with a leaf flag (if there is a leaf indication in the route). Any frame coming from a leaf AC for which the MAC destination address (DA) matches a leaf AC MAC is discarded at the ingress.
  - If two PEs send the same MAC with the same ESI but inconsistent root or leaf AC indication, the MAC is installed in the FDB as root.
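The E-Tree EC fields described above can be sketched in Python. This is only an illustration under stated assumptions, not the router's implementation: the 8-octet layout (type 0x06, sub-type 0x05, one Flags octet, two reserved octets, a 3-octet label field) is taken from RFC 8317 rather than from this section, and the function names are hypothetical.

```python
import struct
from typing import Tuple

ETREE_EC_TYPE = 0x06     # assumption: EVPN EC type per RFC 8317, not stated in this section
ETREE_EC_SUBTYPE = 0x05  # E-Tree sub-type, as stated in this section


def encode_etree_ec(leaf: bool, leaf_label: int = 0) -> bytes:
    """Build the 8-octet EC: type, sub-type, Flags, 2 reserved octets, 3-octet label field."""
    if not 0 <= leaf_label < (1 << 20):
        raise ValueError("an MPLS label must fit in 20 bits")
    flags = 0x01 if leaf else 0x00   # L bit is the low-order bit of Flags
    label_field = leaf_label << 4    # label occupies the high-order 20 of 24 bits
    header = struct.pack("!BBBH", ETREE_EC_TYPE, ETREE_EC_SUBTYPE, flags, 0)
    return header + label_field.to_bytes(3, "big")


def decode_etree_ec(ec: bytes) -> Tuple[bool, int]:
    """Return (leaf indication, leaf label) from an 8-octet E-Tree EC."""
    if len(ec) != 8:
        raise ValueError("an extended community is 8 octets long")
    ec_type, subtype, flags, _reserved = struct.unpack("!BBBH", ec[:5])
    if (ec_type, subtype) != (ETREE_EC_TYPE, ETREE_EC_SUBTYPE):
        raise ValueError("not an E-Tree extended community")
    return bool(flags & 0x01), int.from_bytes(ec[5:], "big") >> 4
```

For example, a MAC/IP route learned over a leaf AC would carry encode_etree_ec(True, 0), because the label value in the EC is 0 for those routes.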
EVPN for MPLS tunnels in E-Tree services
EVPN E-Tree services are modeled as VPLS services configured as E-Trees with the bgp-evpn mpls context enabled.
The following example shows a CLI configuration of a VPLS E-Tree service with EVPN E-Tree service enabled.
*A:PE1>config>service>system>bgp-evpn#
evpn-etree-leaf-label
*A:PE1>config>service# vpls 1 customer 1 etree create
*A:PE1>config>service>vpls(etree)# info
----------------------------------------------
    description "ETREE-enabled evpn-mpls-service"
    bgp-evpn
        evi 10
        mpls bgp 1
            no shutdown
            ecmp 2
            auto-bind-tunnel resolution any
            ingress-replication-bum-label
    sap lag-1:1 leaf-ac create
    exit
    sap 2/1/1:1 leaf-ac create
    exit
    sap 2/2/2:1 create
    exit
    spoke-sdp 3:1 leaf-ac create
    exit
The following configuration guidelines apply to the EVPN E-Tree service:
Before configuring an EVPN E-Tree service, the user must first run the evpn-etree-leaf-label <value> command. This is relevant for EVPN E-Tree services only. The command allocates an E-Tree leaf label on the system and, when a specific value is configured, the leaf label must match on all other PE nodes attached to the same EVPN service.
Optionally, the evpn-etree-leaf-label <value> command may be configured with a static label value (within the static label range configured in the system using the config>router>mpls>mpls-label>static-label-range context). The static label is used when global leaf labels are needed in the network, for example, when at least one 7250 IXR Gen 1 router is attached to the EVPN E-Tree service.
The configure service vpls create etree command is compatible with the bgp-evpn mpls context.
As in VPLS E-Tree services, an AC that is not configured as a leaf AC is treated as root AC.
MAC addresses learned over a leaf AC SAP or SDP binding are advertised as leaf MAC addresses.
Any PE with one or more bgp-evpn enabled VPLS E-Tree services advertises an AD per-ES per-PE route with the leaf indication and leaf label used for BUM egress filtering.
Any leaf AC SAP or SDP binding defined in an ES triggers the advertisement of an AD per-EVI route with the leaf indication.
EVPN E-Tree services use the following CLI commands:
- sap sap-id leaf-ac create command using the configure service vpls context
- mesh-sdp sdp-id:vc-id create leaf-ac command using the configure service vpls context
- spoke-sdp sdp-id:vc-id create leaf-ac command using the configure service vpls context
The root-leaf-tag command is blocked in VPLS E-Tree services where bgp-evpn mpls context is enabled.
EVPN E-Tree operation
EVPN E-Tree supports all operations related to flows among local root AC and leaf AC objects in accordance with IETF RFC 8317. This section describes the extensions required to forward traffic from (or to) root AC and leaf AC objects to (or from) BGP EVPN destinations.
EVPN E-Tree known unicast ingress filtering
Known unicast traffic forwarding is based on ingress PE filtering. EVPN E-Tree known unicast ingress filtering shows an example of EVPN-E-Tree forwarding behavior for known unicast.
MAC addresses learned on leaf-ac objects are advertised in EVPN with their corresponding leaf indication.
In EVPN E-Tree known unicast ingress filtering, PE1 advertises MAC1 using the E-Tree EC and leaf indication, and PE2 installs MAC1 with a leaf flag in the FDB.
Assuming MAC DA is present in the local FDB (MAC1 in the FDB of PE2) when PE2 receives a frame, it is handled as follows:
If the unicast frame enters a root-ac, the frame follows regular data plane procedures; that is, it is sent to the owner of the MAC DA (local SAP or SDP binding or remote BGP EVPN PE) without any filtering.
If the unicast frame enters a leaf-ac, it is handled as follows:
- A MAC DA lookup is performed on the FDB.
- If there is a hit and the MAC was learned as an EVPN leaf (or from a leaf-ac), then the frame is dropped at ingress.
- The source MAC (MAC2) is learned and marked as a leaf-learned MAC. It is advertised by the EVPN with the corresponding leaf indication.
A MAC received with a root and leaf indication from different PEs in the same ES is installed as root.
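The lookup rules above can be sketched as a small decision function. This is a simplified model under stated assumptions: the FdbEntry type, the function names, and the returned strings are hypothetical and do not correspond to any SR OS data structure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class FdbEntry:
    owner: str   # hypothetical owner name: local SAP/SDP binding or remote EVPN PE
    leaf: bool   # True if learned on a leaf-ac or received with a leaf indication


def ingress_unicast_action(ingress_is_leaf: bool, mac_da: str,
                           fdb: Dict[str, FdbEntry]) -> str:
    """Return 'drop', 'forward:<owner>', or 'unknown' when there is no FDB hit."""
    entry = fdb.get(mac_da)
    if entry is None:
        return "unknown"             # no hit: not a known unicast frame
    if ingress_is_leaf and entry.leaf:
        return "drop"                # leaf-to-leaf traffic is filtered at ingress
    return "forward:" + entry.owner  # root ingress or root destination


def resolve_leaf_flag(indications: Iterable[bool]) -> bool:
    """A MAC seen as both root and leaf from PEs in the same ES is installed as root."""
    seen = list(indications)
    return bool(seen) and all(seen)
```

With MAC1 installed as a leaf entry that points to PE1, a frame for MAC1 entering a leaf-ac is dropped, while the same frame entering a root-ac is forwarded to PE1.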
The ingress filtering for E-Tree leaf-to-leaf traffic requires the implementation of an extra leaf EVPN MPLS destination per remote PE (containing leaf objects) per E-Tree service. The ingress filtering for E-Tree leaf-to-leaf traffic is as follows:
A separate EVPN MPLS bind is created for unicast leaf traffic in the service. The internal EVPN MPLS destination is created for each remote PE that contains a leaf and advertises at least one leaf MAC.
The creation of the internal EVPN MPLS destination is triggered when a MAC route with L=1 in the E-Tree EC is received. Any EVPN E-Tree service can potentially use one extra EVPN MPLS destination for leaf unicast traffic per remote PE.
The extra destination in the EVPN E-Tree service is for unicast only and it is not part of the flooding list. It is resource-accounted and displayed in the tools dump service evpn usage command, as shown in the following example output.
A:PE-4# tools dump service evpn usage
vxlan-evpn-mpls usage statistics at 01/23/2017 00:53:14:
MPLS-TEP                                        :             3
VXLAN-TEP                                       :             0
Total-TEP                                       :      3/ 16383
Mpls Dests (TEP, Egress Label + ES + ES-BMAC)   :            10
Mpls Etree Leaf Dests                           :             1
Vxlan Dests (TEP, Egress VNI)                   :             0
Total-Dest                                      :     10/196607
Sdp Bind + Evpn Dests                           :     13/245759
ES L2/L3 PBR                                    :      0/ 32767
Evpn Etree Remote BUM Leaf Labels               :             3
MACs received with L=1 point to the leaf EVPN MPLS destination, whereas root MACs point to the "root" destination.
EVPN E-Tree BUM egress filtering
BUM traffic forwarding is based on egress PE filtering. EVPN E-Tree BUM egress filtering shows an example of EVPN E-Tree forwarding behavior for BUM traffic.
In EVPN E-Tree BUM egress filtering, BUM frames are handled as follows when they enter the ingress PE (PE2):
If the BUM frame enters a root-ac, the frame follows regular EVPN data plane procedures.
If the BUM frame enters a leaf-ac, the frame handling is as follows:
- The frame is marked as leaf and forwarded or replicated to the egress IOM.
- At the egress IOM, the frame is flooded in the default multicast list subject to the following:
  - Leaf entries are skipped when BUM traffic is forwarded. This prevents leaf-to-leaf BUM traffic forwarding.
  - Traffic to remote BGP EVPN PEs is encapsulated with the EVPN label stack. If a leaf ESI label is present for the far-end PE (L1 in EVPN E-Tree BUM egress filtering), the leaf ESI label is added at the bottom of the stack; the remaining stack follows (including the EVI label). If there is no leaf ESI label for the far-end egress PE, no additional label is added to the stack. This means that the egress PE does not have any E-Tree enabled service, but it can still work with the VPLS E-Tree service available in PE2.
The BUM-encapsulated packet is received on the network ingress interface at the egress PE (PE1). The packet is processed as follows:
- A normal ILM lookup is performed for each label (including the EVI label) in the stack.
- Further label lookups are performed when the EVI label ILM lookup is complete. If the lookup yields a leaf label, all the leaf-acs are skipped when flooding to the default-multicast list at the egress PE.
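The two halves of this procedure, label imposition at the ingress PE and flood-list pruning at the egress PE, can be sketched as follows. This is an illustrative model only: the function names, the single transport label, and the label values used in the usage note are assumptions.

```python
from typing import Dict, List, Optional


def bum_label_stack(from_leaf: bool, transport_label: int, evi_label: int,
                    leaf_label: Optional[int]) -> List[int]:
    """Label stack, top first, for a BUM copy sent to one remote PE."""
    stack = [transport_label, evi_label]
    if from_leaf and leaf_label is not None:
        stack.append(leaf_label)   # the leaf label sits at the bottom of the stack
    return stack


def egress_flood_list(acs: Dict[str, bool], leaf_label_seen: bool) -> List[str]:
    """ACs that receive the flooded frame; acs maps AC name to True for a leaf-ac."""
    return [name for name, is_leaf in acs.items()
            if not (leaf_label_seen and is_leaf)]
```

For a leaf-originated frame and hypothetical labels 100 (transport), 200 (EVI), and 300 (leaf), bum_label_stack(True, 100, 200, 300) yields [100, 200, 300]; when the far-end PE advertised no leaf label, the stack is just [100, 200].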
EVPN E-Tree egress filtering based on MAC source address
The egress PE checks the MAC Source Address (SA) for traffic received without the leaf MPLS label. This check covers corner cases where the ingress PE sends traffic originating from a leaf-ac but without a leaf indication.
In EVPN E-Tree BUM egress filtering, PE2 receives a frame with MAC DA = MAC3 and MAC SA = MAC2. Because MAC3 is a root MAC, MAC lookup at PE2 allows the system to unicast the packet to PE1 without the leaf label. If MAC3 were no longer in PE1's FDB, PE1 would flood the frame to all the root and leaf-acs, despite the frame having originated from a leaf-ac.
To minimize and prevent leaf traffic from leaking to other leaf-acs (as described in the preceding case), the egress PE always performs a MAC SA check for all types of traffic. The data path performs MAC SA-based egress filtering as follows:
- An Ethernet frame may be treated as originating from a leaf-ac for several reasons, which requires the system to set a flag to indicate leaf traffic. The flag is set if one of the following conditions is true:
  - The frame arrives on a leaf SAP.
  - EVPN traffic arrives with a leaf label.
  - A MAC SA is flagged as a leaf SA.
- After the flag is set, the action taken depends on the type of traffic:
  - unicast traffic
    An FDB lookup is performed, and if the MAC DA FDB entry is marked as a leaf type, the frame is dropped to prevent leaf-to-leaf forwarding.
  - BUM traffic
    The flag is considered at the egress IOM and leaf-to-leaf forwarding is suppressed.
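A minimal sketch of the flag logic above, assuming hypothetical function names and string results; it models only the decisions listed, not the actual data path.

```python
def is_leaf_traffic(arrived_on_leaf_sap: bool, has_leaf_label: bool,
                    mac_sa_is_leaf: bool) -> bool:
    """The leaf flag is set if any one of the three conditions is true."""
    return arrived_on_leaf_sap or has_leaf_label or mac_sa_is_leaf


def egress_action(leaf_flag: bool, is_unicast: bool,
                  mac_da_is_leaf: bool = False) -> str:
    """Action taken once the flag has been evaluated."""
    if not leaf_flag:
        return "forward"
    if is_unicast:
        # Known unicast: drop only when the destination is also a leaf
        return "drop" if mac_da_is_leaf else "forward"
    # BUM: flood, but suppress the copies toward leaf-acs at the egress IOM
    return "flood-skip-leaf-acs"
```

The MAC SA condition is what covers the corner case described above: a frame that arrives without a leaf label still sets the flag if its source MAC is known as a leaf MAC.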
EVPN E-Tree and EVPN multihoming
EVPN E-Tree procedures support all-active and single-active EVPN multihoming. Ingress filtering can handle MACs learned on ES leaf-ac SAP or SDP bindings. If a MAC associated with an ES leaf-ac is advertised with a different E-Tree indication or if the AD per-EVI routes have inconsistent leaf indications, then the remote PEs performing the aliasing treat the MAC as root.
EVPN E-Tree BUM egress filtering and multihoming shows the expected behavior for multihoming and egress BUM filtering.
Multihoming and egress BUM filtering in EVPN E-Tree BUM egress filtering and multihoming is handled as follows:
BUM frames received on an ES leaf-ac are flooded to the EVPN based on EVPN E-Tree procedures. The leaf ESI label is sent when flooding to other PEs in the same ES, and additional labels are not added to the stack.
When flooding in the default multicast list, the egress PE skips all the leaf-acs (including the ES leaf-acs) on the assumption that all ACs in a specific ES for a specified EVI have a consistent E-Tree configuration, and they send an AD per-EVI route with a consistent E-Tree indication.
BUM frames received on an ES root-ac are flooded to the EVPN based on regular EVPN procedures. The regular ES label is sent for split-horizon when packets are sent to the DF or NDF PEs in the same ES. When flooding in the default multicast list, the egress PE skips the ES SAPs based on the ES label lookup.
If the PE receives an ES MAC from a peer that shares the ES and decides to install it against the local ES SAP that is oper-up, it checks the E-Tree configuration (root or leaf) of the local ES SAP against the received MAC route. The MAC route is processed as follows:
If the E-Tree configuration does not match, then the MAC is not installed against any destination until the misconfiguration is resolved.
If the SAP is oper-down, the MAC is installed against the EVPN destination to the peer.
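The installation rule for an ES MAC received from a peer in the same ES can be summarized in a short sketch. The function name and the returned strings are hypothetical labels for the three outcomes described above.

```python
def install_es_mac(local_sap_oper_up: bool, local_sap_is_leaf: bool,
                   route_is_leaf: bool) -> str:
    """Where a MAC received from an ES peer is installed."""
    if not local_sap_oper_up:
        return "evpn-destination-to-peer"   # SAP is oper-down
    if local_sap_is_leaf != route_is_leaf:
        return "not-installed"              # root/leaf mismatch, wait for a fix
    return "local-es-sap"
```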
PBB-EVPN E-Tree services
SR OS supports PBB-EVPN E-Tree services in accordance with IETF RFC 8317. PBB-EVPN E-Tree services are modeled as PBB-EVPN services where some I-VPLS services are configured as etree and some of their SAP or spoke SDPs are configured as leaf-acs.
The procedures for the PBB-EVPN E-Tree are similar to those for the EVPN E-Tree, except that the egress leaf-to-leaf filtering for BUM traffic is based on the B-MAC source address. Also, the leaf label and the EVPN AD routes are not used.
The PBB-EVPN E-Tree operation is as follows:
When one or more I-VPLS E-Tree services are linked to a B-VPLS, the leaf backbone source MAC address (leaf-source-bmac parameter) is used for leaf-originated traffic in addition to the source B-VPLS MAC address (source-bmac parameter) that is used for sourcing root traffic.
The leaf backbone source MAC address for PBB must be configured using the command config>service>pbb>leaf-source-bmac ieee-address before the configuration of any I-VPLS E-Tree service.
The leaf-source-bmac address is advertised in a B-MAC route with a leaf indication.
Known unicast filtering occurs at the ingress PE. When a frame enters an I-VPLS leaf-ac, a MAC lookup is performed. If the C-MAC DA is associated with a leaf B-MAC, the frame is dropped.
Leaf-to-leaf BUM traffic filtering occurs at the egress PE. When flooding BUM traffic with the B-MAC SA matching a leaf B-MAC, the egress PE skips the I-VPLS leaf-acs.
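The B-MAC based behavior above can be sketched as three small functions. This is an illustrative model with hypothetical names; MAC addresses are treated as plain strings.

```python
from typing import Set


def pbb_source_bmac(ingress_is_leaf: bool, source_bmac: str,
                    leaf_source_bmac: str) -> str:
    """Leaf-originated traffic is sourced from the leaf-source-bmac address."""
    return leaf_source_bmac if ingress_is_leaf else source_bmac


def ingress_drops_unicast(ingress_is_leaf: bool, da_bmac: str,
                          leaf_bmacs: Set[str]) -> bool:
    """Ingress PE: drop when a leaf-ac frame targets a C-MAC behind a leaf B-MAC."""
    return ingress_is_leaf and da_bmac in leaf_bmacs


def egress_skips_leaf_acs(bmac_sa: str, leaf_bmacs: Set[str]) -> bool:
    """Egress PE: skip I-VPLS leaf-acs when the B-MAC SA is a leaf B-MAC."""
    return bmac_sa in leaf_bmacs
```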
The following CLI example shows an I-VPLS E-Tree service that uses PBB-EVPN E-Tree. The leaf-source-bmac address must be configured before the configuration of the I-VPLS E-Tree. As is the case in regular E-Tree services, SAP and spoke SDPs that are not explicitly configured as leaf-acs are considered root-ac objects.
A:PE-2>config>service# info
----------------------------------------------
        pbb
            leaf-source-bmac 00:00:00:00:00:22
        exit
        vpls 1000 customer 1 name "vpls1000" b-vpls create
            service-mtu 2000
            bgp
            exit
            bgp-evpn
                evi 1000
                exit
                mpls bgp 1
                    ingress-replication-bum-label
                    auto-bind-tunnel
                        resolution any
                    exit
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit
        vpls 1001 customer 1 i-vpls etree create
            pbb
                backbone-vpls 1000
                exit
            exit
            stp
                shutdown
            exit
            sap 1/1/1:1001 leaf-ac create
                no shutdown
            exit
            sap 1/1/1:1002 create
                no shutdown
            exit
            no shutdown
        exit
The following considerations apply to PBB-EVPN E-Trees and multihoming:
All-active multihoming is not supported on leaf-ac I-VPLS SAPs.
Single-active multihoming is supported on leaf-ac I-VPLS SAPs and spoke SDPs.
ISID- and RFC 7623-based C-MAC flush are supported in addition to PBB-EVPN E-Tree services and single-active multihoming.
MPLS entropy label and hash label
The router supports the MPLS entropy label (RFC 6790) for EVPN VPLS and Epipe services, and the Flow Aware Transport label, known as the hash label, (RFC 6391) on spoke SDPs bound to a VPLS EVPN service, as well as on EVPN unicast destinations (in Epipe and VPLS services) if enabled by the hash-label command. This label allows LSR nodes in a network to load-balance labeled packets in a much more granular fashion than allowed by simply hashing on the standard label stack. The entropy label can be enabled on BGP-EVPN services (VPLS and Epipe).
To configure insertion of the entropy label on a BGP-EVPN VPLS or Epipe, use the entropy-label command in the bgp-evpn>mpls context. Use the entropy-label command under the spoke-sdp context to configure insertion of the entropy label on spoke SDPs bound to a BGP-EVPN VPLS. Note that the entropy label is only inserted if the far end of the MPLS tunnel is also entropy-label-capable. For more information, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide.
The hash label is configured using the hash-label command in the spoke-sdp context. Either the hash label or the entropy label can be configured on one object, but not both.
Inter-AS Option B and Next-Hop-Self Route-Reflector for EVPN-MPLS
Inter-AS Option B and Next-Hop-Self Route-Reflector (VPN-NH-RR) functions are supported for the BGP-EVPN family in the same way both functions are supported for IP-VPN families.
A typical use case for EVPN Inter-AS Option B or EVPN VPN-NH-RR is Data Center Interconnect (DCI) networks, where cloud and service providers are looking for efficient ways to extend their Layer 2 and Layer 3 tenant services beyond the data center and provide a tighter DC-WAN integration. While the instantiation of EVPN services in the DGW to provide this DCI connectivity is a common model, some operators use Inter-AS Option B or VPN-NH-RR connectivity to allow the DGW to function as an ASBR or ABR respectively, and the services are only instantiated on the edge devices.
EVPN inter-AS Option B or VPN-NH-RR model shows a DCI example where the EVPN services in two DCs are interconnected without the need for instantiating services on the DC GWs.
The ASBRs or ABRs connect the DC to the WAN at the control plane and data plane levels where the following considerations apply:
From a control plane perspective, the ASBRs or ABRs perform the following tasks:
- accept EVPN-MPLS routes from a BGP peer
  EVPN-VXLAN routes are not supported.
- extract the MPLS label from the EVPN NLRI or attribute and program a label swap operation on the IOM
- re-advertise the EVPN-MPLS route to the BGP peer in the other Autonomous Systems (ASs) or IGP domains
  The re-advertised route has a Next-Hop-Self and a new label encoded for those routes that came with a label.
From a data plane perspective, the ASBRs and ABRs terminate the ingress transport tunnel, perform an EVPN label swap operation, and send the packets on to an interface (if E-BGP is used) or a new tunnel (if IBGP is used).
The ASBR or ABR resolves the EVPN routes based on the existing bgp next-hop-resolution command for family vpn, where vpn refers to EVPN, VPN-IPv4, and VPN-IPv6 families.
*A:ABR-1# configure router bgp next-hop-resolution labeled-routes transport-tunnel family vpn resolution-filter
  - resolution-filter
 [no] bgp             - Use BGP tunnelling for next hop resolution
 [no] ldp             - Use LDP tunnelling for next hop resolution
 [no] rsvp            - Use RSVP tunnelling for next hop resolution
 [no] sr-isis         - Use sr-isis tunnelling for next hop resolution
 [no] sr-ospf         - Use sr-ospf for next hop resolution
 [no] sr-te           - Use sr-te for next hop resolution
 [no] udp             - Use udp for next hop resolution
For more information about the next-hop resolution of BGP-labeled routes, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR Unicast Routing Protocols Guide.
Inter-AS Option B for EVPN services on ASBRs and VPN-NH-RR on ABRs re-use the existing commands enable-inter-as-vpn and enable-rr-vpn-forwarding respectively. The two commands enable the ASBR or ABR function for both EVPN and IP-VPN routes. These two features can be used with the following EVPN services:
EVPN-MPLS Epipe services (EVPN-VPWS)
EVPN-MPLS VPLS services
EVPN-MPLS R-VPLS services
PBB-EVPN and PBB-EVPN E-Tree services
EVPN-MPLS E-Tree services
PE and ABR functions (EVPN services and enable-rr-vpn-forwarding), which are both supported on the same router
PE and ASBR functions (EVPN services and enable-inter-as-vpn), which are both supported on the same router
The following sub-sections clarify some aspects of EVPN when used in an Inter-AS Option B or VPN-NH-RR network.
Inter-AS Option B and VPN-NH-RR procedures on EVPN routes
When enable-rr-vpn-forwarding or enable-inter-as-vpn is configured, only EVPN-MPLS routes are processed for label swap and the next hop is changed. EVPN-VXLAN routes are re-advertised without a change in the next hop.
The following list shows how the router processes and re-advertises the different EVPN route types. For more information about the route fields, see BGP-EVPN control plane for MPLS tunnels.
Auto-discovery (AD) routes (type 1)
For AD per EVI routes, the MPLS label is extracted from the route NLRI. The route is re-advertised with Next-Hop-Self (NHS) and a new label. No modifications are made for the remaining attributes.
For AD per ES routes, the MPLS label in the NLRI is zero. The route is re-advertised with NHS and the MPLS label remains zero. No modifications are made for the remaining attributes.
MAC/IP routes (type 2)
The MPLS label (Label-1) is extracted from the NLRI. The route is re-advertised with NHS and a new Label-1. No modifications are made for the remaining attributes.
Inclusive Multicast Ethernet Tag (IMET) routes (type 3)
Because there is no MPLS label present in the NLRI, the MPLS label is extracted from the PMSI Tunnel Attribute (PTA) if needed, and the route is then re-advertised with NHS, with the following considerations:
For IMET routes with tunnel-type Ingress Replication, the router extracts the IR label from the PTA. The router programs the label swap and re-advertises the route with a new label in the PTA.
For tunnel-type P2MP mLDP, the router re-advertises the route with NHS. No label is extracted; therefore, no swap operation occurs.
For tunnel-type Composite, the IR label is extracted from the PTA, the swap operation is programmed and the route re-advertised with NHS. A new label is encoded in the PTA’s IR label with no other changes in the remaining fields.
For tunnel-type AR, the routes are always considered VXLAN routes and are re-advertised with the next-hop unchanged.
Ethernet-Segment (ES) routes (type 4)
Because ES routes do not contain an MPLS label, the route is re-advertised with NHS and no modifications to the remaining attributes. Although an ASBR or ABR re-advertises ES routes, EVPN multihoming for ES PEs located in different ASs or IGP domains is not supported.
IP-Prefix routes (type 5)
The MPLS label is extracted from the NLRI and the route is re-advertised with NHS and a new label. No modifications are made to the remaining attributes.
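The per-route-type handling above can be condensed into a lookup table. This is a summary sketch only: the key strings and the function name are hypothetical, and each entry records whether the ASBR or ABR sets Next-Hop-Self and whether it programs a label swap.

```python
from typing import Dict, Optional, Tuple

# (route type, IMET tunnel type) -> (next-hop-self, label swap)
_RULES: Dict[Tuple[str, Optional[str]], Tuple[bool, bool]] = {
    ("ad-per-evi", None): (True, True),
    ("ad-per-es", None): (True, False),             # NLRI label stays zero
    ("mac-ip", None): (True, True),                 # Label-1 is replaced
    ("imet", "ingress-replication"): (True, True),  # IR label in the PTA
    ("imet", "p2mp-mldp"): (True, False),           # no label extracted
    ("imet", "composite"): (True, True),            # IR label in the PTA
    ("imet", "ar"): (False, False),                 # treated as a VXLAN route
    ("es", None): (True, False),                    # no MPLS label in the route
    ("ip-prefix", None): (True, True),
}


def readvertise(route_type: str, tunnel_type: Optional[str] = None) -> Tuple[bool, bool]:
    """Return (next_hop_self, label_swap) for an EVPN-MPLS route at the ASBR or ABR."""
    try:
        return _RULES[(route_type, tunnel_type)]
    except KeyError:
        raise ValueError(f"unhandled route: {route_type}/{tunnel_type}") from None
```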
BUM traffic in inter-AS Option B and VPN-NH-RR networks
Inter-AS Option B and VPN-NH-RR support the use of non-segmented trees for forwarding BUM traffic in EVPN.
For ingress replication and non-segmented trees, the ASBR or ABR performs an EVPN BUM label swap without any aggregation or further replication. This concept is shown in VPN-NH-RR and ingress replication for BUM traffic.
In VPN-NH-RR and ingress replication for BUM traffic, when PE2, PE3, and PE4 advertise their IMET routes, the ABRs re-advertise the routes with NHS and a different label. However, IMET routes are not aggregated; therefore, PE1 sets up three different EVPN multicast destinations and sends three copies of every BUM packet, even if they are sent to the same ABR. This example is also applicable to ASBRs and Inter-AS Option B.
P2MP mLDP may also be used with VPN-NH-RR, but not with Inter-AS Option B. The ABRs, however, do not aggregate or change the mLDP root IP addresses in the IMET routes. The root IP addresses must be leaked across IGP domains. For example, if PE2 advertises an IMET route with mLDP or composite tunnel type, PE1 is able to join the mLDP tree if the root IP is leaked into PE1’s IGP domain.
EVPN multihoming in inter-AS Option B and VPN-NH-RR networks
In general, EVPN multihoming is supported in Inter-AS Option B or VPN-NH-RR networks with the following limitations:
An ES PE can only process a remote ES route correctly if the received next hop and origination IP address match. EVPN multihoming is not supported when the ES PEs are in different ASs or IGP domains, or if there is an NH-RR peering the ES PEs and overriding the ES route next hops.
EVPN multihoming ESs are not supported on EVPN PEs that are also ABRs or ASBRs.
Mass-withdraw based on the AD per-ES routes is not supported for a PE that is in a different AS or IGP domain than the ES PEs. EVPN multihoming with inter-AS Option B or VPN-NH-RR shows an EVPN multihoming scenario where the ES PEs, PE2 and PE3, and the remote PE, PE1, are in different ASs or IGP domains.
In EVPN multihoming with inter-AS Option B or VPN-NH-RR, PE1’s aliasing and backup functions to the remote ES-1 are supported. However, PE1 cannot identify the originating PE for the received AD per-ES routes because they are both arriving with the same next hop (ASBR/ABR4) and RDs may not help to correlate each AD per-ES route to a specified PE. Therefore, if there is a failure on PE2’s ES link, PE1 cannot remove PE2 from the destinations list for ES-1 based on the AD per-ES route. PE1 must wait for the AD per-EVI route withdrawals to remove PE2 from the list. In summary, when the ES PEs and the remote PE are in different ASs or IGP domains, per-service withdrawal based on AD per-EVI routes is supported, but mass-withdrawal based on AD per-ES routes is not supported.
EVPN E-Tree in inter-AS Option B and VPN-NH-RR networks
Unicast procedures known to EVPN-MPLS E-Tree are supported in Inter-AS Option B or VPN-NH-RR scenarios; however, the BUM filtering procedures are affected.
As described in EVPN E-Tree, leaf-to-leaf BUM filtering is based on the Leaf Label identification at the egress PE. In a non-Inter-AS or non-VPN-NH-RR scenario, EVPN E-tree AD per-ES (ESI-0) routes carrying the Leaf Label are distinguished by the advertised next hop. In Inter-AS or VPN-NH-RR scenarios, all the AD per-ES routes are received with the ABR or ASBR next hop. Therefore, AD per-ES routes originating from different PEs would all have the same next hop, and the ingress PE would not be able to determine which leaf label to use for a specific EVPN multicast destination.
A simplified EVPN E-Tree solution is supported, where an E-Tree Leaf Label is not installed in the IOM if the PE receives more than one E-Tree AD per-ES route, with different RDs, for the same next hop. In this case, leaf BUM traffic is transmitted without a Leaf Label and the leaf-to-leaf traffic filtering depends on the egress source MAC filtering on the egress PE. See EVPN E-Tree egress filtering based on MAC source address.
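The simplified rule can be sketched as a grouping step over the received E-Tree AD per-ES routes. This is an illustration under stated assumptions: routes are modeled as (next hop, RD, leaf label) tuples, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def leaf_labels_to_install(routes: Iterable[Tuple[str, str, int]]) -> Dict[str, int]:
    """Map next hop -> leaf label, only where a single RD is seen for that next hop."""
    by_next_hop: Dict[str, Dict[str, int]] = defaultdict(dict)
    for next_hop, rd, leaf_label in routes:
        by_next_hop[next_hop][rd] = leaf_label
    # More than one RD behind the same next hop is ambiguous: install no label,
    # and rely on egress MAC SA filtering for that destination instead.
    return {nh: next(iter(rds.values()))
            for nh, rds in by_next_hop.items() if len(rds) == 1}
```

In the usage below (addresses and labels are made up), two PEs are reached through the same ABR next hop, so no leaf label is installed for it, while the directly reachable PE keeps its label.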
PBB-EVPN E-tree services are not affected by Inter-AS or VPN-NH-RR scenarios, as AD per-ES routes are not used.
ECMP for EVPN-MPLS destinations
ECMP is supported for EVPN route next hops that are resolved to EVPN-MPLS destinations as follows:
ECMP for Layer 2 unicast traffic on Epipe and VPLS services for EVPN-MPLS destinations
This is enabled by the configure service epipe bgp-evpn mpls auto-bind-tunnel ecmp number and configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp commands and allows the resolution of an EVPN-MPLS next hop to a group of ECMP tunnels of type RSVP-TE, SR-TE or BGP.
ECMP for Layer 3 unicast traffic on R-VPLS services with EVPN-MPLS destinations
This is enabled by the configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp and configure service vpls allow-ip-int-bind evpn-mpls-ecmp commands.
The VPRN unicast traffic (IPv4 and IPv6) is sprayed among "m" paths, where "m" is the lower of 16 and "n", and "n" is the number of ECMP paths configured in the configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp command.
CPM originated traffic is not sprayed and picks up the first tunnel in the set.
This feature is limited to FP3 and above systems.
ECMP for Layer 3 multicast traffic on R-VPLS services with EVPN-MPLS destinations
This is enabled by the configure service vpls allow-ip-int-bind ip-multicast-ecmp and configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp commands. The VPRN multicast traffic (IPv4 and IPv6) is sprayed among up to "m" paths, where "m" is the lower of 16 and "n", and "n" is the number of ECMP paths configured in the configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp command.
In all of these cases, the configure service epipe bgp-evpn mpls auto-bind-tunnel ecmp number and configure service vpls bgp-evpn mpls auto-bind-tunnel ecmp number commands determine the number of Traffic Engineering (TE) tunnels that an EVPN next hop can be resolved to. TE tunnels refer to RSVP-TE or SR-TE types. For shortest path tunnels, such as ldp, sr-isis, sr-ospf, udp, and so on, the number of tunnels in the ECMP group is determined by the configure router ecmp command.
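The two limits described above can be expressed as simple arithmetic. This sketch assumes hypothetical function names and uses the tunnel type keywords from the resolution-filter options; only rsvp and sr-te are treated as TE tunnels, as stated in the text.

```python
TE_TUNNEL_TYPES = {"rsvp", "sr-te"}


def l3_spray_paths(configured_ecmp: int) -> int:
    """Paths used for R-VPLS Layer 3 traffic: m = min(16, n)."""
    return min(16, configured_ecmp)


def ecmp_group_limit(tunnel_type: str, auto_bind_ecmp: int, router_ecmp: int) -> int:
    """Which knob bounds the ECMP group for a given tunnel type."""
    if tunnel_type in TE_TUNNEL_TYPES:
        return auto_bind_ecmp   # auto-bind-tunnel ecmp governs TE tunnels
    return router_ecmp          # configure router ecmp governs shortest path tunnels
```

For example, with auto-bind-tunnel ecmp set to 32, Layer 3 traffic is still sprayed over at most 16 paths.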
Weighted ECMP for Layer 2 unicast traffic on Epipe and VPLS services for EVPN-MPLS destinations is supported. Packets are sprayed across the LSPs according to the outcome of the hash algorithm and the configured load balancing weight of each LSP when both:
- the Epipe or VPLS service directly uses an ECMP set of RSVP or SR-TE LSPs with the configure router mpls lsp load-balancing-weight command configured
- the configure service epipe bgp-evpn mpls auto-bind-tunnel weighted-ecmp or configure service vpls bgp-evpn mpls auto-bind-tunnel weighted-ecmp commands are configured
If the service uses a BGP tunnel which uses an ECMP set of RSVP or SR-TE LSPs with a load-balancing-weight configured, the router performs weighted ECMP regardless of the setting of weighted-ecmp under the auto-bind-tunnel context.
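One way to picture weighted spraying is to map a flow hash onto slots sized by each LSP's weight. The sketch below is an assumption made for illustration: the router's actual hash algorithm and slot mapping are not described in this section, and the function name and LSP names are hypothetical.

```python
from typing import List, Tuple


def pick_lsp(flow_hash: int, lsps: List[Tuple[str, int]]) -> str:
    """Select an LSP for a flow; lsps is a list of (name, load-balancing weight)."""
    total = sum(weight for _, weight in lsps)
    if total <= 0:
        raise ValueError("at least one LSP needs a positive weight")
    slot = flow_hash % total           # position within the weighted range
    for name, weight in lsps:
        if slot < weight:
            return name
        slot -= weight
    raise AssertionError("unreachable: slot is always below the total weight")
```

With weights 1 and 3, roughly one quarter of evenly distributed hash values land on the first LSP and three quarters on the second.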
IPv6 tunnel resolution for EVPN MPLS services
EVPN MPLS services can be deployed in a pure IPv6 network infrastructure, where IPv6 addresses are used as next-hops of the advertised EVPN routes, and EVPN routes received with IPv6 next-hops are resolved to tunnels in the IPv6 tunnel-table.
To change the system-ipv4 address that is advertised as the next-hop for a local EVPN MPLS service by default, configure the config>service>vpls>bgp-evpn>mpls>route-next-hop {system-ipv4 | system-ipv6 | ip-address} command or the config>service>epipe>bgp-evpn>mpls>route-next-hop {system-ipv4 | system-ipv6 | ip-address} command.
The configured IP address is used as a next-hop for the MAC/IP, IMET, and AD per-EVI routes advertised for the service. Note that this configured next-hop can be overridden by a policy with the next-hop-self command.
In the case of Inter-AS model B or next-hop-self route-reflector scenarios, at the ASBR/ABR:
A route received with an IPv4 next-hop can be re-advertised to a neighbor with an IPv6 next-hop. The neighbor must be configured with the advertise-ipv6-next-hops evpn command.
A route received with an IPv6 next-hop can be re-advertised to a neighbor with an IPv4 next-hop. The no advertise-ipv6-next-hops evpn command must be configured on that neighbor.
EVPN multihoming support for MPLS tunnels resolved to non-system IPv4/IPv6 addresses
EVPN MPLS multihoming is supported on PEs that use non-system IPv4 or IPv6 addresses for tunnel resolution. Similar to multihoming in EVPN VXLAN networks (see Non-system IPv4 and IPv6 VXLAN termination for EVPN VXLAN multihoming), additional configuration steps are required.
The configure service system bgp-evpn eth-seg es-orig-ip ip-address command must be configured with the non-system IPv4 or IPv6 address used for the EVPN-MPLS service. This command modifies the originating IP field in the ES routes advertised for the Ethernet Segment, and makes the system use this IP address when adding the local PE as DF candidate.
The configure service system bgp-evpn eth-seg route-next-hop ip-address command must also be configured with the non-system IP address. This command changes the next-hop of the ES and AD per-ES routes to the configured address.
All the EVPN MPLS services that make use of the Ethernet Segment must be configured with the configure service vpls|epipe bgp-evpn mpls route-next-hop ip-address command.
When multihoming is used in the service, the same IP address should be configured in all three of the commands detailed above, so the DF Election candidate list is built correctly.
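A configuration audit for this requirement could look like the following sketch. It is an offline check written for illustration, not a router feature: the function name, the argument names, and the service names in the usage are hypothetical, and addresses are compared after normalization so that equivalent IPv6 spellings match.

```python
import ipaddress
from typing import Dict, List


def check_non_system_ip(es_orig_ip: str, es_route_next_hop: str,
                        service_route_next_hops: Dict[str, str]) -> List[str]:
    """Return a list of mismatches between the three configured addresses."""
    expected = ipaddress.ip_address(es_orig_ip)
    problems = []
    if ipaddress.ip_address(es_route_next_hop) != expected:
        problems.append("eth-seg route-next-hop differs from es-orig-ip")
    for service, next_hop in sorted(service_route_next_hops.items()):
        if ipaddress.ip_address(next_hop) != expected:
            problems.append(f"{service}: route-next-hop differs from es-orig-ip")
    return problems
```

An empty list means that the same address is used in all three commands, which is the condition for the DF Election candidate list to be built correctly.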
EVPN for SRv6 tunnels
EVPN-VPWS, EVPN on VPLS services, and EVPN on VPRN services (EVPN-IFL) are supported with SRv6 tunnels. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for more information about EVPN for SRv6 tunnels.
General EVPN topics
This section provides information about general topics related to EVPN.
ARP/ND snooping and proxy support
VPLS services support proxy-ARP (Address Resolution Protocol) and proxy-ND (Neighbor Discovery) functions that can be enabled or disabled independently per service. When enabled (proxy-ARP/proxy-ND no shutdown), the system populates the corresponding proxy-ARP/proxy-ND table with IP-MAC entries learned from the following sources:
EVPN-received IP-MAC entries
User-configured static IP-MAC entries
Snooped dynamic IP-MAC entries (learned from ARP/GARP/NA messages received on local SAPs/SDP bindings)
In addition, any ingress ARP or ND frame on a SAP or SDP binding is intercepted and processed. ARP requests and Neighbor Solicitations are answered by the system if the requested IP address is present in the proxy table.
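The table behavior above can be modeled with a small class. This is a deliberately simplified sketch: the class and method names are hypothetical, a later entry simply overwrites an earlier one for the same IP, and no precedence between the three sources, aging, or duplicate detection is modeled.

```python
from typing import Dict, Optional, Tuple

SOURCES = ("evpn", "static", "dynamic")   # the three entry sources listed above


class ProxyArpTable:
    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, str]] = {}   # ip -> (mac, source)

    def add(self, ip: str, mac: str, source: str) -> None:
        if source not in SOURCES:
            raise ValueError(f"unknown entry source: {source}")
        self._entries[ip] = (mac, source)

    def answer(self, requested_ip: str) -> Optional[str]:
        """Return the MAC to reply with, or None when the IP is not in the table."""
        entry = self._entries.get(requested_ip)
        return entry[0] if entry else None
```

A request for an IP address that is present is answered locally with the stored MAC; a None result corresponds to an unknown ARP request, whose flooding behavior is governed by separate configuration.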
Proxy-ARP example usage in an EVPN network shows an example of how proxy-ARP is used in an EVPN network. Proxy-ND would work in a similar way. The MAC address notation in the diagram is shortened for readability.
PE1 is configured as follows:
*A:PE1>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 600 create
exit
bgp
route-distinguisher 192.0.2.71:600
route-target export target:64500:600 import target:64500:600
exit
bgp-evpn
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
proxy-arp
age-time 600
send-refresh 200
dup-detect window 3 num-moves 3 hold-down max anti-spoof-mac 00:ca:ca:ca:ca:ca
dynamic-arp-populate
no shutdown
exit
sap 1/1/1:600 create
exit
no shutdown
----------------------------------------------
Proxy-ARP example usage in an EVPN network shows the following steps, assuming proxy-ARP is no shutdown on PE1 and PE2, and the tables are empty:
ISP-A sends ARP-request for (10.10.)10.3.
PE1 learns the MAC 00:01 in the FDB as usual and advertises it in EVPN without any IP. Optionally, the MAC can be configured as a CStatic mac, in which case it is advertised as protected. If the MAC is learned on a SAP or SDP binding where auto-learn-mac-protect is enabled, the MAC is also advertised as protected.
The ARP-request is sent to the CPM where:
An ARP entry (IP 10.1→MAC 00:01) is populated into the proxy-ARP table.
The system advertises MAC 00:01 and IP 10.1 in EVPN with the same SEQ number and Protected bit as the previous route-type 2 for MAC 00:01.
A GARP is also issued to other SAPs/SDP bindings (assuming they are not in the same split horizon group as the source). If garp-flood-evpn is enabled, the GARP message is also sent to the EVPN network.
The original ARP-request can still be flooded to the EVPN or not based on the unknown-arp-request-flood-evpn command.
Assuming PE1 was configured with unknown-arp-request-flood-evpn, the ARP-request is flooded to PE2 and delivered to ISP-B. ISP-B replies with its MAC in the ARP-reply. The ARP-reply is finally delivered to ISP-A.
PE2 learns MAC 00:01 in the FDB and the entry 10.1→00:01 in the proxy-ARP table, based on the EVPN advertisements.
When ISP-B replies with its MAC in the ARP-reply:
MAC 00:03 is learned in FDB at PE2 and advertised in EVPN.
MAC 00:03 and IP 10.3 are learned in the proxy-ARP table and advertised in EVPN with the same SEQ number as the previous MAC route.
ARP-reply is unicasted to MAC 00:01.
EVPN advertisements are used to populate PE1's FDB (MAC 00:03) and proxy-ARP (IP 10.3→MAC 00:03) tables, as mentioned in step 5.
From this point onward, the PEs reply to any ARP-request for 00:01 or 00:03, without the need for flooding the message in the EVPN network. By replying to known ARP-requests / Neighbor Solicitations, the PEs help to significantly reduce the flooding in the network.
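The steps above can be sketched with a toy model of one PE's proxy-ARP table. The class and method names and the returned action strings are hypothetical; the two flags mirror the dynamic-arp-populate and unknown-arp-request-flood-evpn commands, and the model ignores aging, duplicate detection, and FDB state.

```python
class ProxyArpTable:
    """Toy model of one PE's proxy-ARP table (not the SR OS implementation)."""

    def __init__(self, dynamic_arp_populate=True, unknown_arp_request_flood_evpn=True):
        self.entries = {}  # IP -> (MAC, type), where type is "dyn", "evpn" or "stat"
        self.dynamic_arp_populate = dynamic_arp_populate
        self.unknown_arp_request_flood_evpn = unknown_arp_request_flood_evpn

    def learn(self, ip, mac, entry_type):
        self.entries[ip] = (mac, entry_type)

    def on_arp_request(self, sender_ip, sender_mac, target_ip):
        """Snoop the sender, then reply from the table or report the flooding decision."""
        if self.dynamic_arp_populate:
            self.learn(sender_ip, sender_mac, "dyn")
        if target_ip in self.entries:
            return ("reply", self.entries[target_ip][0])
        if self.unknown_arp_request_flood_evpn:
            return ("flood-to-evpn", None)
        return ("not-flooded-to-evpn", None)

pe1 = ProxyArpTable()
# First request from ISP-A: 10.10.10.3 is unknown, so the request is flooded.
first = pe1.on_arp_request("10.10.10.1", "00:00:00:00:00:01", "10.10.10.3")
# PE2 advertises MAC 00:03 / IP 10.3 in EVPN; PE1 installs it as an EVPN entry.
pe1.learn("10.10.10.3", "00:00:00:00:00:03", "evpn")
# Any later request for 10.10.10.3 is answered locally by PE1.
second = pe1.on_arp_request("10.10.10.1", "00:00:00:00:00:01", "10.10.10.3")
```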
Use the following commands to customize proxy-ARP/proxy-ND behavior:
dynamic-arp-populate and dynamic-nd-populate
Enables the addition of dynamic entries to the proxy-ARP or proxy-ND table (disabled by default). When executed, the system populates proxy-ARP/proxy-ND entries from snooped GARP/ARP/NA messages on SAPs/SDP bindings in addition to the entries coming from EVPN (if EVPN is enabled). These entries are shown as dynamic.
static <IPv4-address> <mac-address> and static <ipv6-address> <mac-address> {host | router}
Configures static entries to be added to the table.
Note: A static IP-MAC entry requires the addition of the MAC address to the FDB as either learned or CStatic (conditional static MAC) to become active (Status → active).
age-time <60 to 86400> (seconds)
Specifies the aging timer per proxy-ARP/proxy-ND entry. When the aging expires, the entry is flushed. The age is reset when a new ARP/GARP/NA for the same IP MAC is received.
send-refresh <120 to 86400> (seconds)
If enabled, the system sends ARP-request/Neighbor Solicitation messages at the configured time, so that the owner of the IP can reply and therefore refresh its IP MAC (proxy-ARP entry) and MAC (FDB entry).
table-size [1 to 16384]
Enables the user to limit the number of entries learned on a specified service. By default, the table-size limit is 250.
The unknown ARP-requests, NS, or the unsolicited GARPs and NA messages can be configured to be flooded or not in an EVPN network with the following commands:
proxy-arp [no] unknown-arp-request-flood-evpn
proxy-arp [no] garp-flood-evpn
proxy-nd [no] unknown-ns-flood-evpn
proxy-nd [no] host-unsolicited-na-flood-evpn
proxy-nd [no] router-unsolicited-na-flood-evpn
dup-detect [anti-spoof-mac <mac-address>] window <minutes> num-moves <count> hold-down <minutes | max>
Enables a mechanism that detects duplicate IPs and ARP/ND spoofing attacks. The working of the dup-detect command can be summarized as follows:
Attempts (relevant to dynamic and EVPN entry types) to add the same IP (different MAC) are monitored for <window> minutes and when <count> is reached within that window, the proxy-ARP/proxy-ND entry for the IP is suspected and marked as duplicate. An alarm is also triggered.
The condition is cleared when hold-down time expires (max does not expire) or a clear command is issued.
If the anti-spoof-mac is configured, the proxy-ARP/proxy-ND offending entry's MAC is replaced by this <mac-address> and advertised in an unsolicited GARP/NA for local SAP or SDP bindings and in EVPN to remote PEs.
This mechanism assumes that the same anti-spoof-mac is configured in all the PEs for the same service and that traffic with destination anti-spoof-mac received on SAPs/SDP bindings is dropped. An ingress MAC filter has to be configured to drop traffic to the anti-spoof-mac.
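The dup-detect behavior described above can be approximated with the following sketch. The class name, the time unit (minutes passed in by the caller), and the exact window boundary handling are assumptions; hold-down expiry is modeled only as an explicit clear.

```python
class DupDetect:
    """Toy model of proxy-ARP/ND duplicate IP detection (not the SR OS implementation)."""

    def __init__(self, window=3, num_moves=5, anti_spoof_mac=None):
        self.window = window              # monitoring window, in minutes
        self.num_moves = num_moves        # MAC changes that mark the IP as duplicate
        self.anti_spoof_mac = anti_spoof_mac
        self.binding = {}                 # IP -> last MAC seen
        self.moves = {}                   # IP -> times of recent MAC changes
        self.duplicate = set()

    def observe(self, ip, mac, now):
        """Record an attempt to add ip->mac at time `now`; return the MAC the proxy uses."""
        if ip in self.duplicate:
            return self.resolved_mac(ip)
        previous = self.binding.get(ip)
        if previous is not None and previous != mac:
            recent = [t for t in self.moves.get(ip, []) if now - t < self.window]
            recent.append(now)
            self.moves[ip] = recent
            if len(recent) >= self.num_moves:
                self.duplicate.add(ip)    # an alarm would be raised here
        self.binding[ip] = mac
        return self.resolved_mac(ip)

    def resolved_mac(self, ip):
        """Duplicate entries resolve to the anti-spoof MAC when one is configured."""
        if ip in self.duplicate and self.anti_spoof_mac:
            return self.anti_spoof_mac
        return self.binding.get(ip)

    def clear(self, ip):
        """Models hold-down expiry or a clear command."""
        self.duplicate.discard(ip)
        self.moves.pop(ip, None)
```

With window 3 and num-moves 3, an IP that flips between two MACs three times within three minutes is marked as duplicate, and requests for it are then answered with the anti-spoof MAC until the condition is cleared.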
Proxy-arp entry combinations shows the combinations that produce a Status = Active proxy-arp entry in the table. The system replies to proxy-ARP requests for active entries. Any other combination results in a Status = inActv entry. If the service is not active, the proxy-arp entries are not active either, regardless of the FDB entries.
Proxy-arp entry type | FDB entry type (for the same MAC) |
---|---|
Dynamic | learned |
Static | learned |
Dynamic | CStatic/Static |
Static | CStatic/Static |
EVPN | EVPN, learned/CStatic/Static with matching ESI |
Duplicate | - |
When proxy-ARP/proxy-ND is enabled on services with all-active multihomed Ethernet Segments, a proxy-arp entry type evpn may be associated with learned/CStatic/Static FDB entries (because for example, the CE can send traffic for the same MAC to all the multihomed PEs in the ES). If this is the case, the entry is active if the ESI of the EVPN route and the FDB entry match, or inactive otherwise, as per Proxy-arp entry combinations.
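The combinations in the table can be expressed as a small lookup function. This is a sketch only: the type strings and parameter names are hypothetical, and the Duplicate row is read here as having no FDB requirement, which is an interpretation of the table rather than a documented rule.

```python
LOCAL_FDB_TYPES = {"learned", "cstatic", "static"}

def proxy_entry_status(entry_type, fdb_type=None, entry_esi=None, fdb_esi=None,
                       service_up=True):
    """Return "active" or "inActv" for a proxy-ARP entry and the FDB entry of the same MAC."""
    if not service_up:
        return "inActv"                     # inactive service: no active entries
    if entry_type in ("dyn", "stat"):
        active = fdb_type in LOCAL_FDB_TYPES
    elif entry_type == "evpn":
        # A local FDB entry is acceptable only when its ESI matches the EVPN route.
        active = fdb_type == "evpn" or (
            fdb_type in LOCAL_FDB_TYPES
            and entry_esi is not None
            and entry_esi == fdb_esi
        )
    elif entry_type == "dup":
        active = True                       # assumption: no FDB entry type is required
    else:
        raise ValueError(f"unknown proxy entry type: {entry_type}")
    return "active" if active else "inActv"
```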
Proxy-ARP/ND periodic refresh, unsolicited refresh and confirm-messages
When proxy-ARP/proxy-ND is enabled, the system starts populating the proxy table and responding to ARP-requests/NS messages. To keep the active IP-MAC entries alive and ensure that all the host/routers in the service update their ARP/ND caches, the system may generate the following three types of ARP/ND messages for a specified IP-MAC entry:
- periodic refresh messages (ARP-requests or NS for a specified IP)
  These messages are activated by the send-refresh command and their objective is to keep the existing FDB and Proxy-ARP/ND entries alive to minimize EVPN withdrawals and re-advertisements.
- unsolicited refresh messages (unsolicited GARP or NA messages)
  These messages are sent by the system when a new entry is learned or updated. Their objective is to update the attached host/router caches.
- confirm messages (unicast ARP-requests or unicast NS messages)
  These messages are sent by the system when a new MAC is learned for an existing IP. The objective of the confirm messages is to verify that a specified IP has really moved to a different part of the network and is associated with the new MAC. If the IP has not moved, it forces the owners of the duplicate IP to reply and cause dup-detect to kick in.
Advertisement of Proxy-ARP/ND flags in EVPN
- The Router flag (R) is used in IPv6 Neighbor Advertisement messages to indicate if the proxy-ND entry belongs to an IPv6 router or an IPv6 host.
- The Override flag (O) is used in IPv6 Neighbor Advertisement messages to indicate whether the resolved entry should override a potential ND entry that the solicitor may already have for the same IPv6 address.
- The Immutable flag (I) indicates that the proxy-ARP or proxy-ND entry cannot change its binding to a different MAC address. This flag is always set for static proxy-ARP/ND entries or configured dynamic IP addresses that are associated with a mac-list.
RFC 9047 describes how to convey the flags (R, O and I) in the EVPN ARP/ND extended community that is advertised with the EVPN MAC/IP Advertisement routes. This enables the ingress and egress EVPN PEs to install the proxy-ARP/ND entries with the same property flags. The following figure shows the format of the EVPN ARP/ND extended community.
By default, the router does not advertise the ARP/ND extended community. Use the following command to configure the router to advertise all the proxy ARP/ND MAC/IP Advertisement routes with the extended community:
configure service vpls bgp-evpn arp-nd-extended-community
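As an illustration of the extended community layout, the sketch below encodes and decodes the three flags. The type, sub-type, and bit positions follow a reading of RFC 9047 (type 0x06, sub-type 0x08, one flags octet, five reserved octets) and should be checked against the RFC; the function names are hypothetical.

```python
EVPN_TYPE = 0x06        # EVPN extended community type
ARP_ND_SUBTYPE = 0x08   # ARP/ND extended community sub-type
R_FLAG = 0x01           # Router flag (least significant bit of the flags octet)
O_FLAG = 0x02           # Override flag
I_FLAG = 0x08           # Immutable ARP/ND binding flag

def encode_arp_nd_ext_community(router=False, override=False, immutable=False):
    """Build the 8-octet ARP/ND extended community."""
    flags = ((R_FLAG if router else 0)
             | (O_FLAG if override else 0)
             | (I_FLAG if immutable else 0))
    return bytes([EVPN_TYPE, ARP_ND_SUBTYPE, flags]) + bytes(5)  # 5 reserved octets

def decode_arp_nd_ext_community(community):
    """Return the R, O and I flags carried in an ARP/ND extended community."""
    if len(community) != 8 or community[0] != EVPN_TYPE or community[1] != ARP_ND_SUBTYPE:
        raise ValueError("not an EVPN ARP/ND extended community")
    flags = community[2]
    return {
        "router": bool(flags & R_FLAG),
        "override": bool(flags & O_FLAG),
        "immutable": bool(flags & I_FLAG),
    }
```

For example, a static proxy-ND entry of type router would be encoded with all three flags set.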
Proxy-ARP/ND and flag processing
Proxy-ND and the Router Flag
RFC 4861 describes the use of the (R) or Router flag in NA messages as follows:
- A node capable of routing IPv6 packets must reply to NS messages with NA messages where the R flag is set (R=1).
- Hosts must reply with NA messages where R=0.
The R flag in NA messages impacts how the hosts select their default gateways when sending packets off-link. The proxy-ND function on the router does one of the following, depending on whether it can provide the appropriate R flag information:
- provides the appropriate R flag information in the proxy-ND NA replies, if possible
- floods the received NA messages, if it cannot provide the appropriate R flag when replying
The use of the R flag (only present in NA messages and not in NS messages) makes the procedure for learning proxy-ND entries and replying to NS messages different from the procedures for proxy-ARP in IPv4. The NA messages snooping determines the router or host flag to add to each entry, and that determines the flag to use when responding to an NS message.
The procedure to add the R flag to a specified entry is as follows:
- Dynamic entries are learned based on received NA messages. The R flag is also learned and added to the proxy-ND entry so that the appropriate R flag is used in response to NS requests for a specified IP.
- Static entries are configured as host or router using the following command.
  - MD-CLI
    configure service vpls proxy-nd static-neighbor ip-address type
  - classic CLI
    configure service vpls proxy-nd static
- EVPN entries are learned from BGP and the following commands determine the R flag added to them:
  - MD-CLI
    configure service vpls proxy-nd evpn advertise-neighbor-type
  - classic CLI
    configure service vpls proxy-nd evpn-nd-advertise
  - MD-CLI
    configure service vpls bgp-evpn routes mac-ip arp-nd-extended-community
  - classic CLI
    configure service vpls bgp-evpn arp-nd-extended-community-advertisement
- In addition, the EVPN ND advertisement indicates what static and dynamic IP → MAC entries the system advertises in EVPN.
- If you specify the router option for EVPN ND advertisement, the system should flood the received unsolicited NA messages for hosts. This is controlled by the following command:
  - MD-CLI
    configure service vpls proxy-nd evpn flood unknown-neighbor-advertise-host
  - classic CLI
    configure service vpls proxy-nd host-unsolicited-na-flood-evpn
- The opposite is also true: if you specify the host option for EVPN ND advertisement, the system should flood the received unsolicited NA messages for routers. This is controlled by the following command:
  - MD-CLI
    configure service vpls proxy-nd evpn flood unknown-neighbor-advertise-router
  - classic CLI
    configure service vpls proxy-nd router-unsolicited-na-flood-evpn
- The router-host option for EVPN ND advertisement allows the router to advertise both types of entries in EVPN at the same time. That is, static and dynamic entries with the router or host flag are advertised in EVPN with the corresponding flag in the ARP/ND extended community. This option can be enabled only if the ARP/ND extended community is configured.
- EVPN proxy-ND MAC/IP Advertisement routes received without the EVPN ARP/ND extended communities create an entry with type Router (which is the default value). Entries created as duplicate are advertised in EVPN with an R flag value that depends on the configuration of the EVPN ND advertisement command. If the host option is configured for the EVPN ND advertisement, the duplicate entry is treated as a host. If the router or router-host option is configured for the EVPN ND advertisement, the duplicate entry behaves as a router.
Proxy-ARP/ND and the Immutable Flag
The I bit or Immutable flag in the ARP/ND extended community is advertised and used as follows:
- Any static proxy-ARP/ND entry is advertised with I=1 if you enable ARP/ND extended community advertisement.
- Any configured dynamic IP address (associated with a mac-list) proxy-ARP/ND entry is advertised with I=1 if you enable ARP/ND extended community advertisement.
- Duplicate entries are advertised with I=1 as well (in addition to O=1 and R=0 or 1 based on the configuration).
- The setting of the I bit is independent of the static bit associated with the FDB entry, and it is only used with proxy-ARP/ND advertisements.
The I bit in the ARP/ND extended community is processed on reception as follows:
- A PE receiving an EVPN MAC/IP Advertisement route containing an IP-MAC and the I flag set, installs the IP-MAC entry in the ARP/ND or proxy-ARP/ND table as an immutable binding.
- This immutable binding entry overrides an existing non-immutable binding for the same IP-MAC. In general, the ARP/ND extended community command changes the selection of ARP/ND entries when multiple routes with the same IP address exist. The preferred order of ARP/ND entries selection is as follows:
- Local immutable ARP/ND entries (static and dynamic)
- EVPN immutable ARP/ND entries
- Remaining ARP/ND entries
- The absence of the EVPN ARP/ND Extended Community in a MAC/IP Advertisement route indicates that the IP→MAC entry is not an immutable binding.
- Receiving multiple EVPN MAC/IP Advertisement routes with the I flag set to 1 for the same IP but a different MAC address is considered a misconfiguration or a transient error condition. If this happens in the network, a PE receiving multiple routes (with the I flag set to 1 for the same IP and a different MAC address) selects one of them based on the previously described selection rules.
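The selection order above can be sketched as a ranking function. The dictionary keys (origin, immutable) are hypothetical names, and ties within the same rank are resolved here by list order only, which is an assumption; the source does not state a tie-breaker within a rank.

```python
def selection_rank(entry):
    """Lower rank is preferred: local immutable, then EVPN immutable, then the rest."""
    if entry["immutable"] and entry["origin"] in ("static", "dynamic"):
        return 0
    if entry["immutable"] and entry["origin"] == "evpn":
        return 1
    return 2

def select_entry(candidates):
    """Pick the preferred ARP/ND entry among candidates for the same IP address."""
    return min(candidates, key=selection_rank)

candidates = [
    {"origin": "evpn", "immutable": False, "mac": "00:00:00:00:00:01"},
    {"origin": "evpn", "immutable": True, "mac": "00:00:00:00:00:02"},
    {"origin": "static", "immutable": True, "mac": "00:00:00:00:00:03"},
]
best = select_entry(candidates)  # the local immutable (static) entry
```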
Proxy-ND and the Override Flag
- The O flag is learned for dynamic entries (being 0 or 1) and added to the proxy-ND table. If the ARP/ND extended community is configured, the O flag associated with the entry is advertised along with the EVPN MAC/IP Advertisement route. Static and duplicate entries are always advertised with O=1.
- Upon receiving an EVPN MAC/IP Advertisement route, the received O flag is stored in the entry created in the proxy-ND table, and used when replying to local NS messages for the IP address.
Proxy-ARP/ND MAC list for dynamic entries
SR OS supports the association of configured MAC lists with a configured dynamic proxy-ARP or proxy-ND IP address. The actual proxy-ARP or proxy-ND entry is not created until an ARP or Neighbor Advertisement message is received for the IP and one of the MACs in the associated MAC-list. This is in accordance with IETF RFC 9161, which states that a proxy-ARP or proxy-ND IP entry can be associated with one MAC among a list of allowed MACs.
The following example shows the use of MAC lists for dynamic entries.
A:PE-2>config>service#
proxy-arp-nd
mac-list ISP-1 create
mac 00:de:ad:be:ef:01
mac 00:de:ad:be:ef:02
mac 00:de:ad:be:ef:03
A:PE-2>config>service>vpls>proxy-arp#
dynamic 1.1.1.1 create
mac-list ISP-1
resolve 30
A:PE-2>config>service>vpls>proxy-nd#
dynamic 2001:db8:1000::1 create
mac-list ISP-1
resolve 30
where:
- A dynamic IP (dynamic ip create) is configured and associated with a MAC list (mac-list name).
- The MAC list is created in the config>service context and can be reused by multiple configured dynamic IPs as follows:
  - in different services
  - in the same service, for proxy-ARP and proxy-ND entries
- If the MAC list is empty, the proxy-ARP or proxy-ND entry is not created for the configured IP.
- The same MAC list can be applied to multiple configured dynamic entries even within the same service.
- The new proxy-ARP and proxy-ND entries behave as dynamic entries and are displayed as type dyn in the show commands.
The following output example displays the entry corresponding to the configured dynamic IP.
show service id 1 proxy-arp detail
-------------------------------------------------------------------------------
Proxy Arp
-------------------------------------------------------------------------------
Admin State : enabled
Dyn Populate : enabled
Age Time : 900 secs Send Refresh : 300 secs
Table Size : 250 Total : 1
Static Count : 0 EVPN Count : 0
Dynamic Count : 1 Duplicate Count : 0
Dup Detect
-------------------------------------------------------------------------------
Detect Window : 3 mins Num Moves : 5
Hold down : 9 mins
Anti Spoof MAC : None
EVPN
-------------------------------------------------------------------------------
Garp Flood : enabled Req Flood : enabled
Static Black Hole : disabled
-------------------------------------------------------------------------------
===============================================================================
VPLS Proxy Arp Entries
===============================================================================
IP Address Mac Address Type Status Last Update
-------------------------------------------------------------------------------
1.1.1.1 00:de:ad:be:ef:01 dyn active 02/23/2016 09:05:49
-------------------------------------------------------------------------------
Number of entries : 1
===============================================================================
show service proxy-arp-nd mac-list "ISP-1" associations
===============================================================================
MAC List Associations
===============================================================================
Service Id IP Addr
-------------------------------------------------------------------------------
1 1.1.1.1
1 2001:db8:1000::1
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================
Although no new proxy-ARP or proxy-ND entries are created when a dynamic IP is configured, the router triggers the following resolve procedure:
- The router sends a resolve message with a configurable frequency of 1 to 60 minutes; the default value is five minutes.
  Note: The resolve message is an ARP-request or NS message flooded to all the non-EVPN endpoints in the service.
- The router sends resolve messages at the configured frequency until a dynamic entry for the IP is created.
  Note: The dynamic entry is created only if an ARP, GARP, or NA message is received for the configured IP, and the associated MAC belongs to the configured MAC list of the IP. If the MAC list is empty, the proxy-ARP or proxy-ND entry is not created for the configured IP.
After a dynamic entry (with a MAC address included in the list) is successfully created, its behavior (for send-refresh, age-time, and other activities) is the same as a configured dynamic entry with the following exceptions.
- Regular dynamic entries may override configured dynamic entries, but static or EVPN entries cannot override configured dynamic entries.
- If the corresponding MAC is flushed from the FDB after the entry is successfully created, the entry becomes inactive in the proxy-ARP or proxy-ND table and the resolve process is restarted.
- If the MAC list is changed, the proxy entries of all the IPs that point to the list are deleted and the resolve process is restarted.
- If there is an existing configured dynamic entry and the router receives a GARP, ARP, or NA for the IP with a MAC that is not contained in the MAC list, the message is discarded and the proxy-ARP or proxy-ND entry is deleted. The resolve process is restarted.
- If there is an existing configured dynamic entry and the router receives a GARP, ARP, or NA for the IP with a MAC contained in the MAC list, the existing entry is overridden by the IP and new MAC, assuming the confirm procedure passes.
- The dup-detect and confirm procedures work for the configured dynamic entries when the MAC changes are between MACs in the MAC list. Changes to an off-list MAC cause the entry to be deleted and the resolve process is restarted.
- The CPM drops received dynamic ARP/ND messages without learning them, if they match a dynamic (immutable) entry.
- If there is a local configured dynamic address (irrespective of whether there is an entry for it or not), a received EVPN immutable entry for the same IP address is not installed. Therefore the IP duplication mechanisms do not apply to immutable entries.
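The life cycle of a configured dynamic IP can be sketched as a small state holder. The class, attribute, and return value names are hypothetical, and the confirm and dup-detect procedures are deliberately left out, so this only illustrates the on-list and off-list rules described above.

```python
class ConfiguredDynamicIp:
    """Toy model of a configured dynamic proxy-ARP/ND IP bound to a MAC list."""

    def __init__(self, ip, mac_list):
        self.ip = ip
        self.mac_list = set(mac_list)
        self.entry_mac = None     # MAC of the proxy entry, if one exists
        self.active = False
        self.resolving = True     # resolve messages are sent until an entry exists

    def on_arp_or_na(self, mac):
        """Handle a snooped ARP, GARP or NA for this IP."""
        if mac in self.mac_list:
            # On-list MAC: create or override the entry (confirm procedure not modeled).
            self.entry_mac, self.active, self.resolving = mac, True, False
            return "installed"
        # Off-list MAC (or empty list): discard, delete the entry, restart resolve.
        self.entry_mac, self.active, self.resolving = None, False, True
        return "discarded"

    def on_fdb_flush(self):
        """The MAC left the FDB: the entry becomes inactive and resolve restarts."""
        if self.entry_mac is not None:
            self.active = False
            self.resolving = True

    def on_mac_list_change(self, new_mac_list):
        """A changed MAC list deletes the entry and restarts the resolve process."""
        self.mac_list = set(new_mac_list)
        self.entry_mac, self.active, self.resolving = None, False, True
```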
BGP-EVPN MAC-mobility
EVPN defines a mechanism to allow the smooth mobility of MAC addresses from an NVE to another NVE. The 7750 SR, 7450 ESS, and 7950 XRS support this procedure as well as the MAC-mobility extended community in MAC advertisement routes as follows:
The router honors and generates the SEQ (Sequence) number in the MAC mobility extended community for MAC moves.
When an EVPN-learned MAC is subsequently learned locally, a BGP update is sent with the SEQ number set to "previous SEQ" + 1 (exception: the MAC duplication num-moves value is reached).
A SEQ number of zero or the absence of the MAC mobility extended community is interpreted as sequence zero.
In case of mobility, the following MAC selection procedure is followed:
If a PE has two or more active remote EVPN routes for the same MAC (VNI can be the same or different), the highest SEQ number is selected. The tie-breaker is the lowest IP (BGP NH IP).
If a PE has two or more active EVPN routes and it is the originator of one of them, the highest SEQ number is selected. The tie-breaker is the lowest IP (BGP NH IP of the remote route is compared to the local system address).
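The MAC selection procedure can be sketched as follows. The route representation (a dictionary with seq and next_hop keys) is hypothetical; for a locally originated route, next_hop is assumed to hold the local system address, in line with the tie-breaker described above.

```python
import ipaddress

def best_mac_route(routes):
    """Select the route with the highest SEQ number; break ties on the lowest IP.

    A missing or None "seq" (no MAC mobility extended community) counts as zero.
    """
    def sort_key(route):
        seq = route.get("seq") or 0
        next_hop = int(ipaddress.ip_address(route["next_hop"]))
        return (-seq, next_hop)
    return min(routes, key=sort_key)

routes = [
    {"next_hop": "192.0.2.2", "seq": 2},
    {"next_hop": "192.0.2.3", "seq": 3},
    {"next_hop": "192.0.2.1"},            # no MAC mobility community: SEQ 0
]
winner = best_mac_route(routes)           # the route with SEQ 3
```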
BGP-EVPN MAC-duplication
EVPN defines a mechanism to protect the EVPN service from control plane churn as a result of loops or accidental duplicated MAC addresses. The 7750 SR, 7450 ESS, and 7950 XRS support an enhanced version of this procedure as described in this section.
A situation may arise where the same MAC address is learned by different PEs in the same VPLS because two (or more) hosts are misconfigured with the same (duplicate) MAC address. In such a situation, the traffic originating from these hosts would trigger continuous MAC moves among the PEs attached to these hosts. It is important to recognize such a situation and avoid incrementing the sequence number (in the MAC Mobility attribute) to infinity.
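To detect this situation, the router counts the MAC moves within a configurable window; if the configured number of moves is reached before the window expires, the router concludes that a duplicate MAC situation has occurred. The window and the number of moves are configured with the following commands:
configure service vpls bgp-evpn mac-duplication detect window
configure service vpls bgp-evpn mac-duplication detect num-moves
The router then alerts the operator with a trap message when a duplicate MAC situation occurs.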
10 2014/01/14 01:00:22.91 UTC MINOR: SVCMGR #2331 Base
"VPLS Service 1 has MAC(s) detected as duplicates by EVPN mac-duplication detection."
Use the following command to display the offending MAC address in the BGP EVPN Table section:
show service id svc-id bgp-evpn
===============================================================================
BGP EVPN Table
===============================================================================
EVI : 1000
Creation Origin : manual
Adv L2 Attributes : Disabled
Ignore Mtu Mismatch: Disabled
MAC/IP Routes
MAC Advertisement : Enabled Unknown MAC Route : Disabled
CFM MAC Advertise : Disabled
ARP/ND Ext Comm Adv: Disabled
Multicast Routes
Sel Mcast Advert : Disabled
Ing Rep Inc McastAd: Enabled
IP Prefix Routes
IP Route Advert : Disabled
MAC Duplication Detection
Num. Moves : 5 Window : 3
Retry : 9 Number of Dup MACs : 1
Black Hole : Enabled
Local Learned Trusted MAC
MAC time : 1 MAC move factor : 3
-------------------------------------------------------------------------------
Detected Duplicate MAC Addresses Time Detected
-------------------------------------------------------------------------------
00:de:fe:ca:da:04 05/18/2023 09:55:22
-------------------------------------------------------------------------------
===============================================================================
-------------------------------------------------------------------------------
Local Learned Trusted MAC
-------------------------------------------------------------------------------
MAC Address Time Detected
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
After detecting the duplicate, the router stops sending and processing any BGP MAC advertisement routes for that MAC address until one of the following occurs:
- The MAC is flushed because of a local event (SAP or SDP binding associated with the MAC fails) or the reception of a remote update with a better SEQ number (because of a MAC flush at the remote router).
- The retry <in-minutes> timer expires, which flushes the MAC and restarts the process. The retry timer is configured using the following command.
  configure service vpls bgp-evpn mac-duplication retry
The values of num-moves and window are configurable to allow for the required flexibility in different environments. In scenarios where BGP rapid update is configured for EVPN (configure router bgp rapid-update evpn), the operator may want to configure a shorter window timer than in scenarios where BGP updates are sent every (default) min-route-advertisement interval.
configure service vpls bgp-evpn mac-duplication
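The interaction between num-moves, window, and retry can be sketched with a toy detector. The class and method names are hypothetical, time is passed in as minutes, and only the retry expiry is modeled as a way to leave the duplicate state (flushes caused by local events or remote updates are not modeled).

```python
class MacDuplicationDetect:
    """Toy model of BGP-EVPN MAC duplication detection (not the SR OS implementation)."""

    def __init__(self, num_moves=5, window=3, retry=9):
        self.num_moves = num_moves
        self.window = window      # minutes
        self.retry = retry        # minutes
        self.moves = {}           # MAC -> times of recent moves
        self.held = {}            # MAC -> time the duplicate was detected

    def on_local_move(self, mac, now):
        """Return True if a MAC route update would be advertised for this move."""
        if mac in self.held:
            if now - self.held[mac] < self.retry:
                return False              # duplicate: no routes sent or processed
            del self.held[mac]            # retry expired: flush and restart
            self.moves.pop(mac, None)
        recent = [t for t in self.moves.get(mac, []) if now - t < self.window]
        recent.append(now)
        self.moves[mac] = recent
        if len(recent) >= self.num_moves:
            self.held[mac] = now          # a trap would be raised here
            return False
        return True

detector = MacDuplicationDetect(num_moves=5, window=3, retry=9)
mac = "00:de:fe:ca:da:04"
advertised = [detector.on_local_move(mac, t) for t in (0, 0.5, 1, 1.5, 2)]
```

In this run the first four moves are advertised and the fifth one, which reaches num-moves inside the window, puts the MAC in the duplicate state until the retry timer expires.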
The following example shows a MAC duplication detection configuration.
MD-CLI
[ex:/configure service vpls "bd-1000-mac-dup-mpls" bgp-evpn mac-duplication]
A:admin@node-2# info detail
retry 9
detect {
num-moves 5
window 3
}
classic CLI
A:node-2>config>service>vpls>bgp-evpn>mac-duplication# info detail
----------------------------------------------
detect num-moves 5 window 3 trusted-mac-move-factor 3
retry 9
Conditional static MAC and protection
RFC 7432 defines the use of the sticky bit in the MAC mobility extended community to signal static MAC addresses. These addresses must be protected in case there is an attempt to dynamically learn them in a different place in the EVPN-VXLAN VPLS service.
In the 7750 SR, 7450 ESS, and 7950 XRS, any conditional static MAC defined in an EVPN-VXLAN VPLS service is advertised by BGP-EVPN as a static address, that is, with the sticky bit set. The following example shows the configuration of a conditional static MAC.
A:node-2>config>service>vpls# info
----------------------------------------------
description "vxlan-service"
...
sap 1/1/1:1000 create
exit
static-mac
mac 00:ca:ca:ca:ca:00 create sap 1/1/1:1000 monitor fwd-status
exit
no shutdown
A:node-2# show router bgp routes evpn mac hunt mac-address 00:ca:ca:ca:ca:00
...
===============================================================================
BGP EVPN Mac Routes
===============================================================================
Network : 0.0.0.0/0
Nexthop : 192.0.2.63
From : 192.0.2.63
Res. Nexthop : 192.168.19.1
Local Pref. : 100 Interface Name : NotAvailable
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : 0
AIGP Metric : None
Connector : None
Community : target:65000:1000 mac-mobility:Seq: 0/Static
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 192.0.2.63
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : No As-Path
EVPN type : MAC
ESI : 0:0:0:0:0:0:0:0:0:0 Tag : 1063
IP Address : :: RD : 65063:1000
Mac Address : 00:ca:ca:ca:ca:00 Mac Mobility : Seq:0
Neighbor-AS : N/A
Source Class : 0 Dest Class : 0
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
Local static MACs or remote MACs with the sticky bit set are considered "protected". A packet entering a SAP or SDP binding is discarded if its source MAC address matches one of these protected MACs.
Auto-learn MAC protect and restricting protected source MACs
Auto-learn MAC protect, together with the ability to restrict where the protected source MACs are allowed to enter the service, can be enabled within an EVPN-MPLS and EVPN-VXLAN VPLS and routed VPLS services, but not in PBB-EVPN services. The protection, using the auto-learn-mac-protect command (described in Auto-learn MAC protect), and the restrictions, using the restrict-protected-src [discard-frame] command, operate in the same way as in a non-EVPN VPLS service.
When auto-learn-mac-protect is enabled on an object, source MAC addresses learned on that object are marked as protected within the FDB.
When restrict-protected-src is enabled on an object and a protected source MAC is received on that object, the object is automatically shut down (requiring the operator to issue shutdown and then no shutdown on the object to make it operational again).
When restrict-protected-src discard-frame is enabled on an object and a frame with a protected source MAC is received on that object, that frame is discarded.
In addition, the following behavioral differences are specific to EVPN services:
An implicit restrict-protected-src discard-frame command is enabled by default on SAPs, mesh-SDPs and spoke SDPs. As this is the default, it is not possible to configure this command in an EVPN service. This default state can be seen in the show output for these objects, for example on a SAP:
*A:PE# show service id 1 sap 1/1/9:1 detail
===============================================================================
Service Access Points(SAP)
===============================================================================
Service Id : 1
SAP : 1/1/9:1 Encap : q-tag
...
RestMacProtSrc Act : none (oper: Discard-frame)
A restrict-protected-src discard-frame can be optionally enabled on EVPN-MPLS/VXLAN destinations within EVPN services. When enabled, frames that have a protected source MAC address are discarded if received on any EVPN-MPLS/VXLAN destination in this service, unless the MAC address is learned and protected on an EVPN-MPLS/VXLAN destination in this service. This is enabled as follows:
configure
    service
        vpls <service id>
            bgp-evpn
                mpls bgp <instance>
                    [no] restrict-protected-src discard-frame
            vxlan instance <instance> vni <vni-id>
                [no] restrict-protected-src discard-frame
Auto-learned protected MACs are advertised to remote PEs in an EVPN MAC/IP advertisement route with the sticky bit set.
The source MAC protection action relating to the restrict-protected-src [discard-frame] commands also applies to MAC addresses learned by receiving an EVPN MAC/IP advertisement route with the sticky bit set from remote PEs. This causes remotely configured conditional static MACs and auto-learned protected MACs to be protected locally.
In all-active multihoming scenarios, if auto-learn-mac-protect is configured on all-active SAPs and restrict-protected-src discard-frame is enabled on EVPN-MPLS/VXLAN destinations, traffic from the CE that enters one multihoming PE and needs to be switched through the other multihoming PE is discarded on the second multihoming PE. Each multihoming PE protects the CE's MAC on its local all-active SAP, which results in any frames with the CE's MAC address as the source MAC being discarded as they are received on the EVPN-MPLS/VXLAN destination from the other multihoming PE.
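The discard rules for EVPN services can be sketched as a decision function. The object identifiers, parameter names, and returned strings are hypothetical. The sketch also assumes that a frame arriving on the same object where its source MAC is protected is forwarded, which the text above implies but does not state.

```python
def frame_action(src_mac, ingress, protected_on, restrict_on_evpn_destinations=False):
    """Decide whether a frame is forwarded or discarded in an EVPN VPLS service.

    ingress and the values of protected_on are (kind, identifier) tuples, where
    kind is "sap", "sdp" or "evpn" (an EVPN-MPLS/VXLAN destination).
    """
    owner = protected_on.get(src_mac)
    if owner is None:
        return "forward"                      # source MAC is not protected
    if ingress[0] in ("sap", "sdp"):
        # Implicit restrict-protected-src discard-frame on SAPs and SDP bindings.
        return "forward" if owner == ingress else "discard"
    # EVPN destination: discard only if the optional command is enabled and the
    # MAC is not learned and protected on an EVPN destination.
    if restrict_on_evpn_destinations and owner[0] != "evpn":
        return "discard"
    return "forward"

# All-active multihoming: PE2 protects the CE MAC on its local all-active SAP.
protected = {"00:00:00:00:00:01": ("sap", "1/1/9:1")}
via_peer_pe = frame_action("00:00:00:00:00:01", ("evpn", "10.1.1.2:1"),
                           protected, restrict_on_evpn_destinations=True)
```

Here via_peer_pe evaluates to "discard", which corresponds to the all-active multihoming case described above.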
Conditional static MACs, EVPN static MACs and locally protected MACs are marked as protected within the FDB, as shown in the example output.
*A:PE# show service fdb-mac
===============================================================================
Service Forwarding Database
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:00:00:00:00:01 sap:1/1/9:1 LP/30 01/05/16 11:58:22
1 00:00:00:00:00:02 vxlan-1: EvpnS:P 01/05/16 11:58:23
10.1.1.2:1
1 00:00:00:00:01:01 sap:1/1/9:1 CStatic: 01/04/16 20:05:02
P
1 00:00:00:00:01:02 vxlan-1: EvpnS:P 01/04/16 20:18:02
10.1.1.2:1
-------------------------------------------------------------------------------
No. of Entries: 4
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
In this output:
the first MAC is locally protected using the auto-learn-mac-protect command
the second MAC has been protected using the auto-learn-mac-protect command on a remote PE
the third MAC is a locally configured conditional static MAC
the fourth MAC is a remotely configured conditional static MAC
The command auto-learn-mac-protect can be optionally extended with an exclude-list by using the following command:
auto-learn-mac-protect [exclude-list name]
This list refers to a mac-list <name> created under the config>service context and contains a list of MACs and associated masks.
When auto-learn-mac-protect [exclude-list name] is configured on a service object, dynamically learned MACs are excluded from being learned as protected if they match a MAC entry in the MAC list. Dynamically learned MAC SAs are protected only if they are learned on an object with auto-learn-mac-protect (ALMP) configured and one of the following conditions is true:
there is no exclude list associated with the same object
there is an exclude-list but the MAC does not match any entry
A MAC list can be used in multiple objects of the same service or of different services. When the MAC list is empty, ALMP does not exclude any learned MAC from protection on the object. This extension allows the mobility of specific MACs in objects where MACs are learned as protected.
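The exclude-list decision can be illustrated with a short sketch. It assumes that each MAC list entry is a MAC address plus a bitwise mask in which set bits must match; the function names and the tuple layout are hypothetical and chosen only for this example.

```python
def mac_to_int(mac):
    """Convert a colon- or dash-separated MAC string to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def matches(mac, entry_mac, entry_mask):
    """True if mac equals entry_mac on every bit set in entry_mask (assumed semantics)."""
    mask = mac_to_int(entry_mask)
    return (mac_to_int(mac) & mask) == (mac_to_int(entry_mac) & mask)


def learn_as_protected(mac, almp_enabled, exclude_list=None):
    """Decide whether a dynamically learned MAC SA is learned as protected."""
    if not almp_enabled:
        return False
    if not exclude_list:
        # No exclude list, or an empty MAC list: nothing is excluded.
        return True
    return not any(matches(mac, m, k) for m, k in exclude_list)


# Hypothetical MAC list with one entry covering 00:00:5e:00:01:00/40.
exclude = [("00:00:5e:00:01:00", "ff:ff:ff:ff:ff:00")]
print(learn_as_protected("00:00:5e:00:01:0a", True, exclude))
print(learn_as_protected("00:00:00:00:00:01", True, exclude))
```

A MAC that matches the list stays unprotected and can therefore move between objects, which is the mobility case the extension is intended for.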
Blackhole MAC and its application to proxy-ARP/proxy-ND duplicate detection
A blackhole MAC is a local FDB record. It is similar to a conditional static MAC, but it is associated with a black-hole (similar to a blackhole static route in a VPRN) instead of a SAP or SDP binding. A blackhole MAC can be added by using the following command:
config>service>vpls# static-mac mac
mac <ieee-address> [create] black-hole
The static blackhole MAC can have security applications (for example, replacement of MAC filters) for specific MACs. When used in combination with restrict-protected-src, the static blackhole MAC provides a simple and scalable way to filter MAC DA or SA in the data plane, regardless of how the frame arrived at the system (using SAP or SDP bindings or EVPN endpoints).
For example, when static-mac mac 00:ca:fe:ca:fe:00 create black-hole is added to a service, the following behavior occurs:
- The configured MAC is created as a static MAC with a black-hole source identifier.
*A:PE1# show service id 1 fdb detail
===============================================================================
Forwarding Database, Service 1
===============================================================================
ServId MAC Source-Identifier Type Last Change
Age
-------------------------------------------------------------------------------
1 00:ca:ca:ba:ca:01 eES: Evpn 06/29/15 23:21:34
01:00:00:00:00:71:00:00:00:01
1 00:ca:ca:ba:ca:06 eES: Evpn 06/29/15 23:21:34
01:74:13:00:74:13:00:00:74:13
1 00:ca:00:00:00:00 sap:1/1/1:2 CStatic:P 06/29/15 23:20:58
1 00:ca:fe:ca:fe:00 black-hole CStatic:P 06/29/15 23:20:00
1 00:ca:fe:ca:fe:69 eMpls: EvpnS:P 06/29/15 20:40:13
192.0.2.69:262133
-------------------------------------------------------------------------------
No. of MAC Entries: 5
-------------------------------------------------------------------------------
Legend: L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
===============================================================================
- After it has been successfully added to the FDB, the blackhole MAC is treated like any other protected MAC, as follows:
  - The blackhole MAC is added as protected (CStatic:P) and advertised in EVPN as static.
  - SAP or SDP bindings or EVPN endpoints where restrict-protected-src discard-frame is enabled discard frames whose MAC SA is equal to the blackhole MAC.
  - SAP or SDP bindings where restrict-protected-src (no discard-frame) is enabled go operationally down if a frame with MAC SA equal to the blackhole MAC is received.
- After the blackhole MAC has been successfully added to the FDB, any frame arriving at any SAP or SDP binding or EVPN endpoint with MAC DA equal to the blackhole MAC is discarded.
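The blackhole MAC handling listed above can be summarized as a small decision function. This is a sketch under stated assumptions: the function name, the endpoint kinds, and the returned strings are hypothetical, and the case of a frame with a blackhole MAC SA on an object without restrict-protected-src is not described in this section, so the sketch only reports that no protection action is taken.

```python
def blackhole_frame_action(src_mac, dst_mac, blackhole_macs,
                           endpoint_kind, restrict_protected_src=None):
    """Model the handling of a frame in a service with blackhole MACs.

    endpoint_kind: "sap", "sdp" or "evpn" (hypothetical labels).
    restrict_protected_src: None (not configured), "default"
    (restrict-protected-src without discard-frame) or "discard-frame".
    """
    # Destination lookup: frames to a blackhole MAC are always discarded,
    # regardless of the ingress endpoint type.
    if dst_mac in blackhole_macs:
        return "discard"
    # Source check: depends on the restrict-protected-src setting.
    if src_mac in blackhole_macs:
        if restrict_protected_src == "discard-frame":
            return "discard"
        if restrict_protected_src == "default" and endpoint_kind in ("sap", "sdp"):
            return "endpoint-oper-down"
        return "no-protection-action"  # assumption: not covered by this section
    return "forward"


blackholes = {"00:ca:fe:ca:fe:00"}
print(blackhole_frame_action("00:00:00:00:00:01", "00:ca:fe:ca:fe:00", blackholes, "evpn"))
print(blackhole_frame_action("00:ca:fe:ca:fe:00", "00:00:00:00:00:01", blackholes, "sap", "default"))
```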
Blackhole MACs can also be used in services with proxy-ARP/proxy-ND enabled to filter traffic destined for anti-spoof-macs. The anti-spoof-mac provides a way to attract traffic for a specified IP address when a duplicate condition is detected for that IP address (see section ARP/ND snooping and proxy support for more information); however, the system still needs to drop the traffic addressed to the anti-spoof-mac by using either a MAC filter or a blackhole MAC.
The user does not need to configure MAC filters when configuring a static-black-hole MAC address for the anti-spoof-mac function. To use a blackhole MAC entry for the anti-spoof-mac function in a proxy-ARP/proxy-ND service, the user needs to configure:
the static-black-hole option for the anti-spoof-mac
*A:PE1# config>service>vpls>proxy-arp# dup-detect window 3 num-moves 5 hold-down max anti-spoof-mac 00:66:66:66:66:00 static-black-hole
a static blackhole MAC using the same MAC address used for the anti-spoof-mac
*A:PE1# config>service>vpls# static-mac mac 00:66:66:66:66:00 create black-hole
When this configuration is complete, the behavior of the anti-spoof-mac function changes as follows:
In EVPN, the MAC is advertised as static. Locally, the MAC is shown in the FDB as "CStatic" and associated with a black-hole.
The combination of the anti-spoof-mac and the static-black-hole ensures that any frame that arrives at the system with MAC DA = anti-spoof-mac is discarded, regardless of the ingress endpoint type (SAP or SDP binding or EVPN) and without the need for a filter.
If the user wants to redirect traffic with MAC DA equal to the anti-spoof-mac instead of discarding it, redirect filters should be configured on SAPs or SDP bindings (instead of the static-black-hole option).
When the static-black-hole option is not configured with the anti-spoof-mac, the behavior of the anti-spoof-mac function, as described in ARP/ND snooping and proxy support, remains unchanged. In particular:
the anti-spoof-mac is not programmed in the FDB
any attempt to add a static MAC (or any other MAC) with the anti-spoof-mac value is rejected by the system
a MAC filter is needed to discard traffic with MAC DA = anti-spoof-mac.
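The two anti-spoof-mac modes can be contrasted with a short consistency check. It is a sketch only: the function name and the returned labels are hypothetical, and the inputs stand for configuration values that an operator would read from the service.

```python
def anti_spoof_drop_method(anti_spoof_mac, static_black_hole, blackhole_static_macs):
    """Report how traffic with MAC DA = anti-spoof-mac is dropped.

    static_black_hole: True if the static-black-hole option is set on the
    anti-spoof-mac; blackhole_static_macs: static MACs created with black-hole.
    """
    mac = anti_spoof_mac.lower()
    has_blackhole = mac in {m.lower() for m in blackhole_static_macs}
    if static_black_hole:
        # Both configuration steps are needed: the option and a static
        # blackhole MAC that uses the same address.
        return "blackhole-mac" if has_blackhole else "missing-static-blackhole-mac"
    # Without the option the anti-spoof-mac is not programmed in the FDB,
    # so a MAC filter is needed to discard the traffic.
    return "mac-filter-required"


print(anti_spoof_drop_method("00:66:66:66:66:00", True, ["00:66:66:66:66:00"]))
```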
Blackhole MAC for EVPN loop detection
SR OS can combine a blackhole MAC address concept and the EVPN MAC duplication procedures to provide loop protection in EVPN networks. The feature is compliant with the MAC mobility and multihoming functionality in RFC 7432, and the Loop Protection section in draft-ietf-bess-rfc7432bis. Use the following command to enable the feature:
- MD-CLI
configure service vpls bgp-evpn mac-duplication blackhole enable
- classic CLI
configure service vpls bgp-evpn mac-duplication black-hole-dup-mac
If enabled, there are no apparent changes in the MAC duplication; however, if a duplicated MAC is detected (for example, M1), then the router performs the following:
- adds M1 to the duplicate MAC list
- programs M1 in the FDB as a Protected MAC associated with a blackhole endpoint (where type is set to EvpnD:P and Source-Identifier is black-hole)
While the MAC type value remains EvpnD:P, the following additional operational details apply.
- Incoming frames with MAC DA = M1 are discarded by the ingress IOM, regardless of the ingress endpoint type (SAP, SDP, or EVPN), based on an FDB MAC lookup.
- Incoming frames with MAC SA = M1 are discarded by the ingress IOM or cause the router to bring down the SAP or SDP binding, depending on the restrict-protected-src setting on the SAP, SDP, or EVPN endpoint.
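The detection step can be modeled as a sliding-window move counter. This is an illustrative sketch rather than the router algorithm: the `MacDupDetector` class and its method names are hypothetical, the window is assumed to be expressed in minutes, and the retry timer is left out.

```python
from collections import deque


class MacDupDetector:
    """Count MAC moves in a time window and blackhole duplicate MACs."""

    def __init__(self, num_moves=5, window_min=3):
        self.num_moves = num_moves
        self.window_sec = window_min * 60   # assumption: window is in minutes
        self.moves = {}                     # mac -> deque of move timestamps (s)
        self.duplicates = set()             # duplicate MAC list
        self.fdb = {}                       # mac -> (source identifier, type)

    def record_move(self, mac, now, source_id, entry_type):
        """Register a move at time `now`; return True if mac is a duplicate."""
        if mac in self.duplicates:
            return True
        stamps = self.moves.setdefault(mac, deque())
        stamps.append(now)
        # Drop moves that fell out of the detection window.
        while stamps and now - stamps[0] > self.window_sec:
            stamps.popleft()
        if len(stamps) >= self.num_moves:
            self.duplicates.add(mac)
            # Duplicate MACs are programmed as protected blackhole entries.
            self.fdb[mac] = ("black-hole", "EvpnD:P")
            return True
        self.fdb[mac] = (source_id, entry_type)
        return False


detector = MacDupDetector(num_moves=5, window_min=3)
for t in (0, 10, 20, 30, 40):
    duplicate = detector.record_move("00:de:fe:da:da:04", t, "sap:1/1/1:1000", "L")
print(duplicate, detector.fdb["00:de:fe:da:da:04"])
```

Once a MAC is in the duplicate list, the model keeps the blackhole entry and ignores further moves, which mirrors the behavior while the type remains EvpnD:P.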
The following example shows an EVPN-MPLS service where blackhole is enabled and MAC duplication programs the duplicate MAC as a blackhole.
19 2016/12/20 19:45:59.69 UTC MINOR: SVCMGR #2331 Base
"VPLS Service 1000 has MAC(s) detected as duplicates by EVPN mac-duplication
detection."
MD-CLI
[ex:/configure service vpls "bd-1000"]
A:admin@node-2# info
admin-state enable
service-id 1000
customer "1"
bgp 1 {
}
bgp-evpn {
evi 1000
mac-duplication {
blackhole true
detect {
num-moves 5
window 3
}
}
mpls 1 {
admin-state enable
ingress-replication-bum-label true
auto-bind-tunnel {
resolution any
}
}
}
sap 1/1/1:1000 {
}
spoke-sdp 56:1000 {
}
classic CLI
A:node-2# configure service vpls 1000
A:node-2>config>service>vpls# info
----------------------------------------------
bgp
exit
bgp-evpn
evi 1000
mac-duplication
detect num-moves 5 window 3
retry 6
black-hole-dup-mac
exit
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
sap 1/1/1:1000 create
no shutdown
exit
spoke-sdp 56:1000 create
no shutdown
exit
no shutdown
----------------------------------------------
The following command displays BGP EVPN table values.
show service id 1000 bgp-evpn
===============================================================================
BGP EVPN Table
===============================================================================
EVI : 1000
Creation Origin : manual
Adv L2 Attributes : Disabled
Ignore Mtu Mismatch: Disabled
MAC/IP Routes
MAC Advertisement : Enabled Unknown MAC Route : Disabled
CFM MAC Advertise : Disabled
ARP/ND Ext Comm Adv: Disabled
Multicast Routes
Sel Mcast Advert : Disabled
Ing Rep Inc McastAd: Enabled
IP Prefix Routes
IP Route Advert : Disabled
MAC Duplication Detection
Num. Moves : 5 Window : 3
Retry : 9 Number of Dup MACs : 1
Black Hole : Enabled
Local Learned Trusted MAC
MAC time : 1 MAC move factor : 3
-------------------------------------------------------------------------------
Detected Duplicate MAC Addresses Time Detected
-------------------------------------------------------------------------------
00:de:fe:ca:da:04 05/18/2023 09:55:22
-------------------------------------------------------------------------------
===============================================================================
-------------------------------------------------------------------------------
Local Learned Trusted MAC
-------------------------------------------------------------------------------
MAC Address Time Detected
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
===============================================================================
BGP EVPN MPLS Information
===============================================================================
Admin Status : Enabled Bgp Instance : 1
Force Vlan Fwding : Disabled
Force Qinq Fwding : none
Route NextHop Type : system-ipv4
Control Word : Disabled
Max Ecmp Routes : 1
Entropy Label : Disabled
Default Route Tag : none
Split Horizon Group: (Not Specified)
Ingress Rep BUM Lbl: Enabled
Ingress Ucast Lbl : 524262 Ingress Mcast Lbl : 524261
RestProtSrcMacAct : none
Evpn Mpls Encap : Enabled Evpn MplsoUdp : Disabled
Oper Group :
MH Mode : network
Evi 3-byte Auto-RT : Disabled
Dyn Egr Lbl Limit : Disabled
Hash Label : Disabled
-------------------------------------------------------------------------------
===============================================================================
===============================================================================
BGP EVPN MPLS Auto Bind Tunnel Information
===============================================================================
Allow-Flex-Algo-Fallback : false
Resolution : any Strict Tnl Tag : false
Max Ecmp Routes : 1
Bgp Instance : 1
Filter Tunnel Types : (Not Specified)
Weighted Ecmp : false
-------------------------------------------------------------------------------
===============================================================================
The following command displays Forwarding Database details.
show service id 1000 fdb detail
===============================================================================
Forwarding Database, Service 1000
===============================================================================
ServId MAC Source-Identifier Type Last Change
Transport:Tnl-Id Age
-------------------------------------------------------------------------------
1000 00:de:fe:da:da:04 black-hole EvpnD:P 05/18/23 10:04:49
-------------------------------------------------------------------------------
No. of MAC Entries: 1
-------------------------------------------------------------------------------
Legend:L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf T=Trusted
===============================================================================
If the retry time expires, the MAC is flushed from the FDB and the process starts again. The following command clears the duplicate blackhole MAC address.
clear service id evpn mac-dup-detect
Support for the blackhole enable and black-hole-dup-mac commands and the preceding associated loop detection procedures is as follows:
- not supported on B-VPLS, I-VPLS, or M-VPLS services
- fully supported on EVPN-VXLAN VPLS/R-VPLS services, EVPN-MPLS VPLS/R-VPLS services (including EVPN E-Tree) and EVPN-SRv6 VPLS services
- fully supported with EVPN MAC mobility and EVPN multihoming
Deterministic EVPN loop detection with trusted MACs
The EVPN loop detection procedure described in the preceding section is compliant with draft-ietf-bess-rfc7432bis and is an efficient way of detecting and blocking loops in EVPN networks. In contrast to intrusive methods, which inject Ethernet beacons into the customer network and detect loops depending on whether the beacon messages return to the PEs, the EVPN loop detection mechanism is non-intrusive because it relies entirely on the learning of the same MAC on different nodes. However, the mechanism lacks determinism, as shown in EVPN non-intrusive loop detection mechanism.
Suppose PE1, PE2, and PE3 are attached to the same EVPN VPLS service, and there is an accidental backdoor link between the Base Stations connected to PE2 and PE3. When the Controller with MAC M1 issues a broadcast frame, PE1 forwards it to PE2 and PE3, and the frame is looped back through the backdoor link. The mac-duplication procedure is triggered, and M1 is detected as duplicate and turned into a blackhole MAC in the FDB, which effectively breaks the loop. However, the user does not know beforehand whether M1 is blackholed in PE1, PE2, PE3, or several of them at the same time. If M1 is blackholed in PE1, this represents an issue for the hosts connected to other PEs (not shown) attached to the same service. Therefore, in this example, the goal is to influence the mac-duplication procedure so that M1 gets blackholed in PE2, PE3, or both, but not in PE1. To make the procedure more deterministic, the Trusted MAC concept is used.
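Use the following command to configure the time after which a dynamically learned MAC is considered trusted.
configure service vpls bgp-evpn mac-duplication trusted-mac-time
If the MAC moves from a SAP to another SAP in the same service and PE, the MAC does not reset its trusted MAC timer.
Use the following command to configure the detection parameters, including the trusted-mac-move-factor.
configure service vpls bgp-evpn mac-duplication detect
While non-trusted MACs are detected as duplicate after num-moves, trusted MACs need more moves (num-moves multiplied by trusted-mac-move-factor) to be declared as duplicate. The following example shows the configuration of three PEs as shown in EVPN non-intrusive loop detection mechanism.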
MD-CLI
// Applicable to PE1, PE2 and PE3
[ex:/configure service vpls "bd-1000" bgp-evpn mac-duplication]
A:admin@node-2# info
blackhole true
trusted-mac-time 5 // value 1..15, default: 5
detect {
num-moves 5
window 3
trusted-mac-move-factor 3 // value 1..10, default: 1
}
classic CLI
// Applicable to PE1, PE2 and PE3
A:node-2>config>service>vpls>bgp-evpn>mac-duplication# info
----------------------------------------------
detect num-moves 5 window 3 trusted-mac-move-factor 3 // value 1..10, default: 1
black-hole-dup-mac
trusted-mac-time 5 // value 1..15, default: 5
Based on the preceding configuration, recall the example described at the beginning of this section and assume M1 is a trusted MAC in PE1 (it has been dynamically learned for 5 minutes). M1 then requires 15 moves (num-moves 5 multiplied by trusted-mac-move-factor 3) to be declared as duplicate (and therefore programmed as a blackhole MAC) in PE1, whereas M1 only needs 5 moves to be declared as duplicate in PE2 and PE3. This procedure guarantees that M1 does not get blackholed in the PE where it is located (PE1).
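The arithmetic behind this example can be written as a one-line rule. The function below is a hypothetical helper, not a router API; it assumes a MAC becomes trusted once it has been learned for at least trusted-mac-time minutes.

```python
def moves_to_duplicate(learned_minutes, trusted_mac_time, num_moves,
                       trusted_mac_move_factor):
    """Number of moves needed before a MAC is declared duplicate."""
    # Assumption: learned for at least trusted-mac-time minutes means trusted.
    trusted = learned_minutes >= trusted_mac_time
    return num_moves * trusted_mac_move_factor if trusted else num_moves


# PE1: M1 has been learned for 5 minutes, so it is trusted.
print(moves_to_duplicate(5, trusted_mac_time=5, num_moves=5, trusted_mac_move_factor=3))
# PE2 and PE3: M1 is not trusted there.
print(moves_to_duplicate(0, trusted_mac_time=5, num_moves=5, trusted_mac_move_factor=3))
```

Because PE2 and PE3 reach their lower threshold first, they blackhole M1 and break the loop before PE1 accumulates enough moves.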
The trusted MACs are displayed in the following FDB show command with a "T" in the Type field:
show service id 1000 fdb detail
===============================================================================
Forwarding Database, Service 1000
===============================================================================
ServId MAC Source-Identifier Type Last Change
Transport:Tnl-Id Age
-------------------------------------------------------------------------------
1000 00:de:fe:da:da:04 sap:1/1/1:1000 LT/0 05/18/23 10:54:54
-------------------------------------------------------------------------------
No. of MAC Entries: 1
-------------------------------------------------------------------------------
Legend:L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf T=Trusted
===============================================================================
CFM interaction with EVPN services
Ethernet Connectivity and Fault Management (ETH-CFM) allows the operator to validate and measure Ethernet Layer 2 services using standard IEEE 802.1ag and ITU-T Y.1731 protocols. Each tool performs a unique function and adheres to that tool's specific PDU and frame format and the associated rules governing the transmission, interception, and processing of the PDU. Detailed information describing the ETH-CFM architecture, the tools, and various functions is located in the various OAM and Diagnostics guides and is not repeated here.
EVPN provides powerful solution architectures. ETH-CFM is supported in the various Layer 2 EVPN architectures. Because the destination Layer 2 MAC address, unicast or multicast, is ETH-CFM tool dependent (for example, ETH-CC is sent as an L2 multicast and ETH-DM is sent as an L2 unicast), the ETH-CFM function is allowed to multicast and broadcast to the virtual EVPN connections. The Maintenance Endpoint (MEP) and Maintenance Intermediate Point (MIP) do not populate the local Layer 2 MAC address forwarding database (FDB) with the MAC related to the MEP and MIP. This means that the 48-bit IEEE MAC address is not exchanged with peers and all ETH-CFM frames are broadcast across all virtual connections. To prevent the flooding of unicast packets and allow the remote forwarding databases to learn the remote MEP and MIP Layer 2 MAC addresses, the command cfm-mac-advertisement must be configured under the config>service>vpls>bgp-evpn context. This allows the MEP and MIP Layer 2 IEEE MAC addresses to be exchanged with peers. This command tracks configuration changes and sends the required updates through the EVPN notification process.
Up MEP, Down MEP, and MIP creation is supported on the SAP, spoke, and mesh connections within the EVPN service. There is no support for the creation of ETH-CFM Management Points (MPs) on the virtual connection. Virtual MEP (vMEP) is supported within a VPLS context and the applicable EVPN Layer 2 VPLS solution architectures. The vMEP follows the same rules as the general MPs. When a vMEP is configured within the supported EVPN service, the ETH-CFM extraction routines are installed on the SAP, Binding, and EVPN connections within an EVPN VPLS Service. The vMEP extraction within the EVPN-PBB context requires the vmep-extensions parameter to install the extraction on the EVPN connections.
When MPs are used in combination with EVPN multihoming, the following must be considered:
Behavior of operationally down MEPs on SAPs/SDP bindings with EVPN multihoming:
all-active multihoming
No ETH-CFM is expected to be used in this case, because the two (or more) SAPs/SDP bindings on the PEs are oper-up and active; however, the CE has a single LAG and responds as though it is connected to a single system. In addition, cfm-mac-advertisement can lead to traffic loops in all-active multihoming.
single-active multihoming
Operationally down MEPs defined on single-active Ethernet-Segment SAPs/SDP bindings do not send any CCMs when the PE is non-DF for the ES and fault-propagation is configured. For single-active multihoming, the behavior is equivalent to MEPs defined on BGP-MH SAPs/binds.
Behavior for operationally up MEPs on ES SAPs/SDP bindings with EVPN multihoming:
all-active multihoming
Operationally up MEPs defined on non-DF ES SAPs can send CFM packets. However, they cannot receive CCMs (the SAP is removed from the default multicast list) or unicast CFM packets (because the MEP MAC is not installed locally in the FDB; unicast CFM packets are treated as unknown, and not sent to the non-DF SAP MEP).
single-active multihoming
Operationally up MEPs should be able to send or receive CFM packets normally.
operationally up MEPs defined on LAG SAPs
Operationally up MEPs defined on LAG SAPs require the command process-cpm-traffic-on-sap-down so that they can process CFM when the LAG is down and act as regular Ethernet ports.
Because of the above considerations, the use of ETH-CFM in EVPN multihomed SAPs/SDP bindings is only recommended on operationally down MEPs and single-active multihoming. ETH-CFM is used in this case to notify the CE of the DF or non-DF status.
Multi-instance EVPN: Two instances of different encapsulation in the same VPLS/R-VPLS/Epipe service
SR OS supports a maximum of two BGP instances in the same VPLS or R-VPLS, where the two instances can be:
- one EVPN-VXLAN instance and one EVPN-MPLS instance in the same VPLS or R-VPLS service
- two EVPN-VXLAN instances in the same VPLS or R-VPLS service
- two EVPN-MPLS instances in the same VPLS or R-VPLS service
- one EVPN-MPLS instance and one EVPN-SRv6 instance in the same VPLS service
- one EVPN-VXLAN instance and one EVPN-SRv6 instance in the same VPLS service
In all the preceding cases, the procedures are compliant with RFC 9014.
SR OS also supports up to two BGP instances in the same Epipe. These two instances can be an EVPN-MPLS instance and an EVPN-SRv6 instance in the same Epipe service.
The procedures to support two BGP instances in the same Epipe adhere to draft-sr-bess-evpn-vpws-gateway.
EVPN-VXLAN to EVPN-MPLS interworking
This section describes the configuration aspects of a VPLS/R-VPLS with EVPN-VXLAN and EVPN-MPLS.
In a service where EVPN-VXLAN and EVPN-MPLS are configured together, the configure service vpls bgp-evpn vxlan bgp 1 and configure service vpls bgp-evpn mpls bgp 2 commands allow the user to associate EVPN-MPLS to a different instance from that associated with EVPN-VXLAN, and have both encapsulations simultaneously enabled in the same service. At the control plane level, EVPN MAC/IP advertisement routes received in one instance are consumed and readvertised in the other instance as long as the route is the best route for a specific MAC. Inclusive multicast routes are independently generated for each BGP instance. In the data plane, the EVPN-MPLS and EVPN-VXLAN destinations are instantiated in different implicit Split Horizon Groups (SHGs) so that traffic can be forwarded between them.
The following example shows a VPLS service with two BGP instances and both VXLAN and MPLS encapsulations configured for the same BGP-EVPN service.
*A:PE-1>config>service>vpls# info
----------------------------------------------
description "evpn-mpls and evpn-vxlan in the same service"
vxlan instance 1 vni 7000 create
exit
bgp
route-distinguisher 10:2
route-target target:64500:1
exit
bgp 2
route-distinguisher 10:1
route-target target:64500:1
exit
bgp-evpn
evi 7000
incl-mcast-orig-ip 10.12.12.12
vxlan bgp 1 vxlan-instance 1
no shutdown
mpls bgp 2
control-word
auto-bind-tunnel
resolution any
exit
force-vlan-vc-forwarding
no shutdown
exit
exit
no shutdown
The following list describes the preceding example:
- bgp 1 or bgp is the default BGP instance
- bgp 2 is the additional instance required when both bgp-evpn vxlan and bgp-evpn mpls are enabled in the service
- The commands supported in instance 1 are also available in instance 2 with the following considerations:
  - pw-template-binding
    The pw-template-binding can only exist in instance 1; it is not supported in instance 2.
  - route-distinguisher
    The operating route-distinguisher in both BGP instances must be different.
  - route-target
    The route target in both instances can be the same or different.
  - vsi-import and vsi-export
    Import and export policies can also be defined for either BGP instance.
- MPLS and VXLAN can use either BGP instance, and the instance is associated when bgp-evpn mpls or bgp-evpn vxlan is created. The bgp-evpn vxlan command must include not only the association to a BGP instance, but also to a vxlan-instance (because the VPLS services support two VXLAN instances).
Note: The bgp-evpn vxlan no shutdown command is only allowed if bgp-evpn mpls shutdown is configured, or if the BGP instance associated with the MPLS has a different route distinguisher than the VXLAN instance.
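The per-instance rules in the preceding list can be expressed as a small pre-check, for example in a provisioning script. This is a sketch with hypothetical names: the dictionary layout and `validate_two_instances` are assumptions for illustration and do not correspond to any SR OS API.

```python
def validate_two_instances(bgp):
    """Check a two-instance BGP definition against the rules listed above.

    bgp: {1: {...}, 2: {...}} where each value may hold the keys
    "rd", "rt" and "pw_template_binding" (hypothetical layout).
    """
    errors = []
    # The operating route-distinguisher must differ between instances.
    if bgp[1]["rd"] == bgp[2]["rd"]:
        errors.append("route-distinguisher must differ between BGP instances")
    # pw-template-binding can only exist in instance 1.
    if bgp[2].get("pw_template_binding"):
        errors.append("pw-template-binding is only supported in instance 1")
    # The route target may be the same or different, so it is not checked.
    return errors


# Values taken from the preceding example service.
example = {
    1: {"rd": "10:2", "rt": "target:64500:1"},
    2: {"rd": "10:1", "rt": "target:64500:1"},
}
print(validate_two_instances(example))
```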
The following features are not supported when two BGP instances are enabled on the same VPLS/R-VPLS service:
- SDP bindings
- M-VPLS, I-VPLS, B-VPLS, or E-Tree VPLS
- Proxy-ARP and proxy-ND
- BGP Multihoming
- IGMP, MLD, and PIM snooping
- BGP-VPLS or BGP-AD (SDP bindings are not created)
The service>vpls>bgp-evpn>ip-route-advertisement command is not supported on R-VPLS services with two BGP instances.
EVPN-SRv6 to EVPN-MPLS or EVPN-VXLAN interworking
EVPN-SRv6 and EVPN-MPLS or EVPN-VXLAN can be simultaneously configured in the same VPLS service (but not R-VPLS), in different instances. In addition, EVPN-SRv6 and EVPN-MPLS can be simultaneously configured in the same Epipe service, so that border routers can stitch SRv6 and MPLS domains for point-to-point services.
VPLS services
EVPN-SRv6 and EVPN-VXLAN instances in the same VPLS service follow the same configuration rules as described in EVPN-VXLAN to EVPN-MPLS interworking, and the same processing of MAC/IP Advertisement routes and Inclusive Multicast Ethernet Tag routes is applied.
The following example shows a VPLS service with two BGP instances, with both VXLAN and SRv6 encapsulations configured under BGP-EVPN.
MD-CLI
A:node-2>config>service>vpls "evpn-srv6-vxlan-1"> info
admin-state enable
description "evpn-srv6 and evpn-vxlan in the same service"
vxlan {
instance 1 {
vni 12340
}
}
segment-routing-v6 1 {
locator "loc-1" {
function {
end-dt2u {
}
end-dt2m {
}
}
}
}
bgp 1 {
route-distinguisher "12340:1"
route-target {
export "target:64500:12340"
import "target:64500:12340"
}
}
bgp 2 {
route-distinguisher "12340:2"
route-target {
export "target:64500:12341"
import "target:64500:12341"
}
}
bgp-evpn {
evi 12340
incl-mcast-orig-ip 10.12.12.12
segment-routing-v6 2 {
admin-state enable
ecmp 4
force-vc-forwarding vlan
srv6 {
default-locator "loc-1"
}
}
vxlan 1 {
admin-state enable
vxlan-instance 1
}
}
classic CLI
A:node-2>config>service>vpls# info
----------------------------------------------
description "evpn-srv6 and evpn-vxlan in the same service"
vxlan instance 1 vni 12340 create
exit
segment-routing-v6 1 create
locator "loc-1"
function
end-dt2u
end-dt2m
exit
exit
exit
bgp
route-distinguisher 12340:1
route-target export target:64500:12340 import target:64500:12340
exit
bgp 2
route-distinguisher 12340:2
route-target export target:64500:12341 import target:64500:12341
exit
bgp-evpn
incl-mcast-orig-ip 10.12.12.12
evi 12340
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
segment-routing-v6 bgp 2 srv6-instance 1 default-locator "loc-1" create
ecmp 4
force-vlan-vc-forwarding
route-next-hop 2001:db8::76
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
When an EVPN-SRv6 instance and an EVPN-MPLS instance are both configured in the same VPLS service, each instance can be configured in a different or the same split horizon group. The former option allows the interconnection of domains of different encapsulation, and the rules of configuration and route processing described in EVPN-VXLAN to EVPN-MPLS interworking apply. The latter option is used for domains where MPLS and SRv6 PEs are attached to the same service, typically for migration purposes.
When the EVPN-SRv6 and the EVPN-MPLS instances are configured in the same split horizon group:
- MAC/IP Advertisement routes are not redistributed between the two instances
- Two BUM EVPN destinations to the same far-end PE (identified by the originator IP of the Inclusive Multicast Ethernet Tag routes) cannot be created. An EVPN-MPLS BUM destination is removed if there is another BUM destination to the same far end with an SRv6 encapsulation. This is to prevent BUM traffic duplication between multi-instance nodes
- SAPs are supported, but SDP binds are not supported
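The BUM destination rule for instances that share a split horizon group can be sketched as a filter over the received Inclusive Multicast Ethernet Tag routes. The function name and the `(originator_ip, encap)` tuple format are hypothetical; only the rule itself (the MPLS destination yields to an SRv6 destination toward the same far end) comes from the preceding list.

```python
def bum_destinations(imet_routes):
    """Return the BUM destinations created from (originator_ip, encap) pairs.

    encap is "mpls" or "srv6". An EVPN-MPLS BUM destination is dropped when
    an SRv6 BUM destination exists for the same originator IP.
    """
    routes = list(imet_routes)
    srv6_far_ends = {ip for ip, encap in routes if encap == "srv6"}
    result = []
    for ip, encap in routes:
        if encap == "mpls" and ip in srv6_far_ends:
            continue  # avoid duplicating BUM traffic to a multi-instance node
        if (ip, encap) not in result:
            result.append((ip, encap))
    return result


routes = [("192.0.2.1", "mpls"), ("192.0.2.1", "srv6"), ("192.0.2.2", "mpls")]
print(bum_destinations(routes))
```

In this example 192.0.2.1 is a multi-instance node that advertises both encapsulations, so only its SRv6 destination is kept, while the MPLS-only PE 192.0.2.2 keeps its MPLS destination.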
The following example shows a VPLS service with two BGP instances, with both MPLS and SRv6 encapsulations configured under BGP-EVPN, with the same split horizon group.
MD-CLI
configure service vpls "evpn-srv6-mpls-1" >info
admin-state enable
description "evpn-srv6 and evpn-mpls in the same service"
segment-routing-v6 1 {
locator "loc-1" {
function {
end-dt2u {
}
end-dt2m {
}
}
}
}
bgp 1 {
route-distinguisher "12341:1"
route-target {
export "target:64500:12342"
import "target:64500:12342"
}
}
bgp 2 {
route-distinguisher "12341:2"
route-target {
export "target:64500:12343"
import "target:64500:12343"
}
}
bgp-evpn {
evi 12340
incl-mcast-orig-ip 10.12.12.12
segment-routing-v6 2 {
admin-state enable
ecmp 4
force-vc-forwarding vlan
srv6 {
default-locator "loc-1"
}
route-next-hop {
ip-address 2001:db8::76
}
}
mpls 1 {
admin-state enable
force-vc-forwarding vlan
split-horizon-group "SHG-1"
ingress-replication-bum-label true
ecmp 4
mh-mode access
auto-bind-tunnel {
resolution any
}
}
}
split-horizon-group "SHG-1" {
}
classic CLI
A:node-2>config>service>vpls# info
----------------------------------------------
description "evpn-srv6 and evpn-mpls in the same service"
split-horizon-group "SHG-1" create
exit
segment-routing-v6 1 create
locator "loc-1"
function
end-dt2u
end-dt2m
exit
exit
exit
bgp
route-distinguisher 12341:1
route-target export target:64500:12342 import target:64500:12342
exit
bgp 2
route-distinguisher 12341:2
route-target export target:64500:12343 import target:64500:12343
exit
bgp-evpn
incl-mcast-orig-ip 10.12.12.12
evi 12341
mpls bgp 1
mh-mode access
force-vlan-vc-forwarding
split-horizon-group "SHG-1"
ingress-replication-bum-label
ecmp 4
auto-bind-tunnel
resolution any
exit
no shutdown
exit
segment-routing-v6 bgp 2 srv6-instance 1 default-locator "loc-1" create
ecmp 4
force-vlan-vc-forwarding
route-next-hop 2001:db8::76
split-horizon-group "SHG-1"
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
Epipe services
The following example shows an Epipe service with two BGP instances, with both MPLS and SRv6 encapsulations configured under BGP-EVPN. This is the gateway configuration when it is stitching MPLS and SRv6 domains for E-Line services:
MD-CLI
[ex:/configure service epipe "multi-inst-evpn-vpws-100"]
A:admin@node-2# info
admin-state enable
service-id 100
customer "1"
segment-routing-v6 1 {
locator "LOC-2-16bits" {
function {
end-dx2 {
}
}
}
}
bgp 1 {
route-distinguisher "23.23.23.1:100"
}
bgp 2 {
route-distinguisher "23.23.23.2:100"
}
endpoint "mpls" {
}
endpoint "srv6" {
}
bgp-evpn {
evi 100
local-attachment-circuit "mpls" {
endpoint "mpls"
eth-tag 1
}
local-attachment-circuit "srv6" {
endpoint "srv6"
eth-tag 1
bgp 2
}
remote-attachment-circuit "mpls" {
endpoint "mpls"
eth-tag 1
}
remote-attachment-circuit "srv6" {
endpoint "srv6"
eth-tag 1
bgp 2
}
mpls 1 {
admin-state enable
ecmp 2
domain-id "64500:1"
auto-bind-tunnel {
resolution any
}
}
segment-routing-v6 2 {
admin-state enable
source-address 2001:db8::2
mh-mode access
domain-id "64500:2"
srv6 {
instance 1
default-locator "LOC-2-16bits"
}
}
}
classic CLI
A:node-2# configure service epipe 100
A:node-2>config>service>epipe# info
----------------------------------------------
endpoint "mpls" create
exit
endpoint "srv6" create
exit
segment-routing-v6 1 create
locator "LOC-2-16bits"
function
end-dx2
exit
exit
exit
bgp 1
route-distinguisher 23.23.23.1:100
exit
bgp 2
route-distinguisher 23.23.23.2:100
exit
bgp-evpn
local-attachment-circuit mpls bgp 1 endpoint mpls create
eth-tag 1
exit
local-attachment-circuit srv6 bgp 2 endpoint srv6 create
eth-tag 1
exit
remote-attachment-circuit mpls bgp 1 endpoint mpls create
eth-tag 1
exit
remote-attachment-circuit srv6 bgp 2 endpoint srv6 create
eth-tag 1
exit
evi 100
mpls bgp 1
domain-id 64500:1
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
segment-routing-v6 bgp 2 srv6-instance 1 default-locator "LOC-2-16bits" create
domain-id 64500:2
mh-mode access
source-address 2001:db8::2
no shutdown
exit
exit
no shutdown
----------------------------------------------
- The epipe bgp command supports up to two instances. The default value is 1, and the accepted values are in the range 1..2.
- The bgp-instance of 1 or 2 can be matched under the following contexts:
  configure service epipe bgp-evpn mpls
  configure service epipe bgp-evpn segment-routing-v6
  MPLS and SRv6 can be configured in Epipes with one or two instances, and either encapsulation can use instance 1 or 2. The preceding example shows an Epipe service with MPLS configured in bgp-instance 1 and segment-routing-v6 configured in bgp-instance 2.
- The bgp-instance 2 requires explicit route distinguisher (RD) configuration because the EVI-based autoderivation of the RD only applies to bgp-instance 1. The route target EVI-based autoderivation applies to both instances.
- The following command can also be associated with a BGP instance:
  configure service epipe bgp-evpn local-attachment-circuit
  By default, all local attachment circuits in the service are associated with bgp-instance 1. When the local attachment circuits are associated with different BGP instances, no local SAPs or spoke SDPs are supported in the service (this is blocked by the CLI).
- The BGP instances are configured with a D-PATH domain-id. The D-PATH attribute is described in section BGP D-PATH attribute for Layer 3 loop protection and can also be used in multi-instance Epipe services. D-PATH is used in the EVPN-VPWS AD per-EVI routes for best-path selection and loop avoidance in case of redundant gateways. In the preceding example, configuring segment-routing-v6 bgp 2 domain-id 64500:2 means that the received EVPN AD per-EVI route in the SRv6 instance is redistributed to the MPLS instance with a D-PATH attribute where domain 64500:2 is prepended.
- When configuring an SRv6 and an MPLS instance in an Epipe service, one of the two instances must be configured as mh-mode access, with the other one configured as mh-mode network (default value for SRv6 and MPLS instances). The command is added under the MPLS and SRv6 instances (not in VXLAN instances).
As in the case of any Epipe service, two explicit or implicit endpoints exist, where traffic always flows from one endpoint to the other endpoint. The preceding example uses the configuration of two explicit endpoints; however, one implicit endpoint plus one explicit endpoint can also be configured, and the behavior would be identical. In other words, the preceding configuration is also valid if the endpoint "mpls" is not configured. In this case, the local-attachment-circuit and remote-attachment-circuit associated with bgp 1 would be part of an implicit endpoint.
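The constraints listed above for a multi-instance Epipe can be sketched as a small checker. This is an illustrative model only, not router code: the EvpnInstance class and validate_epipe function are hypothetical names, and the sketch covers only the rules stated in this section (instance range 1..2, explicit RD for instance 2, and one access plus one network mh-mode).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvpnInstance:
    # Hypothetical model of one bgp-evpn instance in an Epipe.
    encap: str                  # "mpls" or "srv6"
    bgp_instance: int           # expected to be 1 or 2
    mh_mode: str = "network"    # "network" is the default per the text
    rd: Optional[str] = None    # explicit route distinguisher, if configured


def validate_epipe(instances: list[EvpnInstance]) -> list[str]:
    """Return the rule violations found (empty list when none are found)."""
    errors = []
    for inst in instances:
        if inst.bgp_instance not in (1, 2):
            errors.append(f"{inst.encap}: bgp instance must be 1 or 2")
        # RD autoderivation from the EVI only applies to bgp-instance 1.
        if inst.bgp_instance == 2 and inst.rd is None:
            errors.append(f"{inst.encap}: bgp instance 2 needs an explicit RD")
    if len(instances) == 2:
        # One instance must be mh-mode access and the other mh-mode network.
        modes = sorted(i.mh_mode for i in instances)
        if modes != ["access", "network"]:
            errors.append("one instance must be mh-mode access, the other network")
    return errors
```

For the gateway example in this section, validate_epipe([EvpnInstance("mpls", 1), EvpnInstance("srv6", 2, mh_mode="access", rd="23.23.23.2:100")]) reports no violations.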
BGP-EVPN routes in services configured with two BGP instances
The following sections describe BGP-EVPN routes in VPLS and Epipe services configured with two BGP instances.
VPLS services
From a BGP perspective, the two BGP instances configured in the service are independent of each other. The redistribution of routes between the BGP instances is resolved at the EVPN application layer.
By default, if EVPN-VXLAN and EVPN-MPLS are both enabled in the same service, BGP sends the generated EVPN routes twice: with the RFC 9012 BGP encapsulation extended community set to VXLAN and a second time with the encapsulation type set to MPLS.
Usually, a DCGW peers with a pair of Route Reflectors (RRs) in the DC and a pair of RRs in the WAN. For this reason, the user needs to add router policies so that EVPN-MPLS routes are only sent to the WAN RRs and EVPN-VXLAN routes are only sent to the DC RRs. The following examples show how to configure router policies.
MD-CLI
[ex:/configure router "Base" bgp]
A:admin@node-2# info
vpn-apply-export true
vpn-apply-import true
group "WAN" {
type internal
family {
evpn true
}
export {
policy ["allow only mpls"]
}
}
group "DC" {
type internal
family {
evpn true
}
export {
policy ["allow only vxlan"]
}
}
neighbor "192.0.2.2" {
group "WAN"
}
neighbor "192.0.2.75" {
group "DC"
}
*[ex:/configure policy-options]
A:admin@PE-76# info
community "mpls" {
member "bgp-tunnel-encap:MPLS" { }
}
community "vxlan" {
member "bgp-tunnel-encap:VXLAN" { }
}
policy-statement "allow only mpls" {
entry 10 {
from {
family [evpn]
community {
name "vxlan"
}
}
action {
action-type reject
}
}
}
policy-statement "allow only vxlan" {
entry 10 {
from {
family [evpn]
community {
name "mpls"
}
}
action {
action-type reject
}
}
}
classic CLI
config>router>bgp#
vpn-apply-import
vpn-apply-export
group "WAN"
family evpn
type internal
export "allow only mpls"
neighbor 192.0.2.6
group "DC"
family evpn
type internal
export "allow only vxlan"
neighbor 192.0.2.2
A:node-2>config>router>policy-options# info
----------------------------------------------
community "vxlan" members "bgp-tunnel-encap:VXLAN"
community "mpls" members "bgp-tunnel-encap:MPLS"
policy-statement "allow only mpls"
entry 10
from
family evpn
community vxlan
action drop
exit
exit
exit
policy-statement "allow only vxlan"
entry 10
from
family evpn
community mpls
action drop
exit
exit
exit
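The effect of the two export policies above can be sketched as follows. This is a simplified model, not the policy engine: routes are reduced to hypothetical (family, encapsulation) tuples, and the sketch assumes that a route matching no entry is accepted.

```python
# Encapsulation rejected by each policy in the preceding configuration.
REJECTED_ENCAP = {
    "allow only mpls": "VXLAN",
    "allow only vxlan": "MPLS",
}


def export_allowed(policy: str, family: str, encap: str) -> bool:
    """Entry 10 of each policy rejects EVPN routes with the unwanted encapsulation."""
    if family == "evpn" and REJECTED_ENCAP.get(policy) == encap:
        return False
    # Assumption: routes that match no entry are accepted.
    return True


def routes_sent(policy: str, routes: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Filter a list of (family, encap) routes through the export policy."""
    return [r for r in routes if export_allowed(policy, *r)]
```

With the service generating each EVPN route twice (once per encapsulation), the WAN group ends up with only the MPLS copy and the DC group with only the VXLAN copy.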
In a BGP instance, the EVPN routes are imported based on the route-targets and regular BGP selection procedures, regardless of their encapsulation.
The BGP-EVPN routes are generated and redistributed between BGP instances based on the following rules:
- Auto-discovery (AD) routes (type 1) are not generated by services with two BGP EVPN instances, unless a local Ethernet segment is present on the service. However, AD routes received from the EVPN-MPLS peers are processed for aliasing and backup functions as usual.
- MAC/IP routes (type 2) received in one of the two BGP instances are imported and the MACs added to the FDB according to the existing selection rules. If the MAC is installed in the FDB, it is readvertised in the other BGP instance with the new BGP attributes corresponding to the BGP instance (route target, route distinguisher, and so on). The following considerations apply to these routes:
  - The mac-advertisement command governs the advertisement of any MACs (even those learned from BGP).
  - A MAC route is redistributed only if it is the best route based on the EVPN selection rules.
  - If a MAC route is the best route and has to be redistributed, the MAC/IP information, along with the MAC mobility extended community, is propagated in the redistribution.
  - The router redistributes any MAC route update for which any attribute has changed. For example, a change in the SEQ or sticky bit in one instance is updated in the other instance for a route that is selected as the best MAC route.
- EVPN inclusive multicast routes are generated independently for each BGP instance with the corresponding BGP encapsulation extended community (VXLAN or MPLS). Also, the following considerations apply to these routes:
  - Ingress Replication (IR) and Assisted Replication (AR) routes are supported in the EVPN-VXLAN instance. If AR is configured, the AR IP address must be a loopback address different from the system-ip and the configured originating-ip address.
  - IR, P2MP mLDP, and composite inclusive multicast routes are supported in the EVPN-MPLS instance.
  - The modification of the incl-mcast-orig-ip command is supported, subject to the following considerations:
    - The configured IP in the incl-mcast-orig-ip command is encoded in the originating-ip field of the inclusive multicast routes for IR, P2MP, and composite tunnels.
    - The originating-ip field of the AR routes is still derived from the service>system>vxlan>assisted-replication-ip configured value.
  - EVPN handles the inclusive multicast routes in a service based on the following rules:
    - For IR routes, the EVPN destination is set up based on the NLRI next hop.
    - For P2MP mLDP routes, the PMSI Tunnel Attribute tunnel-id is used to join the mLDP tree.
    - For composite P2MP-IR routes, the PMSI Tunnel Attribute tunnel-id is used to join the tree and create the P2MP bind. The NLRI next hop is used to build the IR destination.
    - For AR routes, the NLRI next hop is used to build the destination.
    - The following applies if a router receives two inclusive multicast routes in the same instance:
      - If the routes have the same originating-ip but different route distinguishers and next hops, the router processes both routes. In the case of IR routes, it sets up two destinations: one to each next hop.
      - If the routes have the same originating-ip, different route distinguishers, but the same next hops, the router sets up only one binding for IR routes.
      - The router ignores inclusive multicast routes received with its own originating-ip, regardless of the route distinguisher.
- IP-Prefix routes (type 5) are not generated or imported by a service with two BGP instances.
The rules in this section can be extrapolated to VPLS services where SRv6 and MPLS or VXLAN are configured in different instances of the same VPLS with different split horizon groups.
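The MAC/IP redistribution rules above can be sketched as a single decision function. This is a minimal illustration under stated assumptions: MacRoute, redistribute, and the instance_attrs mapping are hypothetical names, the best-route and FDB checks are passed in as flags rather than computed, and the SEQ and sticky fields stand in for the MAC mobility extended community.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class MacRoute:
    # Hypothetical, simplified view of an EVPN MAC/IP route.
    mac: str
    bgp_instance: int   # instance in which the route was received (1 or 2)
    rd: str
    rt: str
    seq: int = 0        # MAC mobility sequence number
    sticky: bool = False


def redistribute(route: MacRoute, is_best: bool, in_fdb: bool,
                 mac_advertisement: bool,
                 instance_attrs: dict[int, tuple[str, str]]) -> Optional[MacRoute]:
    """Return the route readvertised in the other instance, or None."""
    # Only the best route, installed in the FDB, is readvertised, and only
    # when mac-advertisement allows it.
    if not (is_best and in_fdb and mac_advertisement):
        return None
    other = 2 if route.bgp_instance == 1 else 1
    rd, rt = instance_attrs[other]
    # RD and RT come from the target instance; SEQ and sticky are propagated.
    return replace(route, bgp_instance=other, rd=rd, rt=rt)
```

Because the whole route is re-evaluated on every call, a change in SEQ or sticky in one instance naturally shows up in the copy produced for the other instance.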
Epipe services
Epipe services configured with two BGP instances define local-attachment-circuit eth-tags for each instance. These services "redistribute" AD per-EVI routes received in one instance into the other. The redistribution rules follow draft-sr-bess-evpn-vpws-gateway as follows:
- Upon receiving an AD per-EVI route in bgp-instance 1, if the route is selected to be installed and the route does not contain a local domain-id in its D-PATH attribute (local means the domain-id is configured in the Epipe), an AD per-EVI route is triggered in bgp-instance 2, using the eth-tag, RD, RT, and properties of bgp-instance 2.
- The EVPN Layer 2 attributes extended community is regenerated for the redistributed route. The value of the P and B flags is set to 0 when redistributing routes.
- The encapsulation-specific attributes of the redistributed route are regenerated based on the encapsulation of the BGP instance in which the route is advertised.
- The redistributed route carries the Communities, Extended Communities, and Large Communities of the source route when the following command is configured:
  MD-CLI
  configure service system bgp evpn ad-per-evi-routes attribute-propagation true
  classic CLI
  configure service system bgp-evpn ad-per-evi-routes attribute-propagation
  The exceptions are RTs (which are re-originated), EVPN Extended Communities, and BGP Encapsulation Extended Communities [RFC9012]. EVPN Extended Communities and BGP Encapsulation Extended Communities are not propagated across domains.
- The redistributed AD per-EVI route updates the D-PATH attribute of the received route or adds the D-PATH attribute if the received route did not contain a D-PATH.
- The ESI of the redistributed AD per-EVI route is always zero.
- AD per-ES and ES routes are never redistributed.
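The redistribution rules above can be sketched as follows. This is an illustrative model, not the router implementation: AdPerEviRoute and redistribute_ad_per_evi are hypothetical names, communities are omitted, and the D-PATH is modeled as a list with the left-most domain first.

```python
from dataclasses import dataclass, field
from typing import Optional

ZERO_ESI = "00:00:00:00:00:00:00:00:00:00"


@dataclass
class AdPerEviRoute:
    # Hypothetical, simplified AD per-EVI route.
    eth_tag: int
    rd: str
    esi: str
    d_path: list = field(default_factory=list)  # left-most domain first
    p_flag: int = 0
    b_flag: int = 0


def redistribute_ad_per_evi(route: AdPerEviRoute, selected: bool,
                            source_domain: str, local_domains: set,
                            target_rd: str, target_eth_tag: int
                            ) -> Optional[AdPerEviRoute]:
    """Build the route advertised in the other BGP instance, or None."""
    if not selected:
        return None
    # Loop protection: do not redistribute if a domain-id configured in this
    # Epipe already appears in the received D-PATH.
    if any(domain in local_domains for domain in route.d_path):
        return None
    return AdPerEviRoute(
        eth_tag=target_eth_tag,                  # eth-tag of the target instance
        rd=target_rd,                            # RD of the target instance
        esi=ZERO_ESI,                            # ESI is always zero
        d_path=[source_domain] + route.d_path,   # prepend the source domain-id
        p_flag=0,                                # P and B flags are reset
        b_flag=0,
    )
```

In the gateway example, a route received in the SRv6 instance (domain-id 64500:2) without a D-PATH would be advertised in the MPLS instance with D-PATH [64500:2].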
Route selection of AD per-EVI routes
The redistribution of attributes, as well as the BGP best-path selection for AD per-EVI routes is controlled by the following commands:
MD-CLI
[ex:/configure service system bgp evpn ad-per-evi-routes]
A:admin@PE-2# tree detail
+-- attribute-propagation <boolean>
+-- bgp-path-selection <boolean>
+-- d-path-ignore <boolean>
classic CLI
*A:PE-2>config>service>system>bgp-evpn>ad-per-evi-routes# tree detail
ad-per-evi-routes
|
+---attribute-propagation
| no attribute-propagation
|
+---bgp-path-selection
| no bgp-path-selection
|
+---d-path-ignore
| no d-path-ignore
Both bgp-path-selection and attribute-propagation are disabled by default, and the router enforces that bgp-path-selection can be enabled only if attribute-propagation is already enabled.
If bgp-path-selection false (default) is configured and multiple AD per-EVI routes for the same Ethernet tag are received in the same Epipe BGP instance, the route with the lowest IP address is selected. Those routes may have a zero ESI or different non-zero ESIs.
When multiple non-zero ESI AD per-EVI routes are received and the ESI matches on all of them, the bgp-path-selection command impacts the following procedures:
- The command influences the selection of AD per-EVI routes to create the ES destination. If disabled, the lowest IP address routes are selected, up to the number of configured ECMP paths. If enabled, the routes are selected based on BGP best-path selection.
- The command influences the selection of the best AD per-EVI route of the ES for the purpose of attribute propagation. If enabled, the attributes of the best-path route are propagated.
The BGP best-path selection for AD per-EVI routes applies the following criteria:
- High Local Pref wins
- Shortest D-PATH wins (if d-path-ignore false)
- Lowest left-most D-PATH domain-id wins (if d-path-ignore false)
- Shortest AS_PATH wins
- Lowest Origin wins
- Lowest MED wins
- EBGP wins
- Lowest tunnel-table cost to the next-hop
- Lowest next-hop type wins (resolution in TTM wins vs RTM)
- Lowest next-hop type wins
- Lowest router ID wins (applicable to ibgp peers only)
- Shortest cluster_list length wins (applicable to ibgp peers only)
- Lowest IP address
- Next-hop check (IPv4 NH wins, then lowest NH wins)
- RD check (lowest RD wins)
- Path-Id (add path)
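The selection of AD per-EVI routes for the ES destination can be sketched as follows. This is a rough illustration under stated assumptions: select_es_next_hops is a hypothetical name, "lowest IP address" is taken to mean the numerically lowest next-hop address, all next hops are assumed to belong to one address family, and the full best-path comparison is not modeled (the caller supplies an ordering key when bgp-path-selection is enabled).

```python
import ipaddress


def select_es_next_hops(next_hops, ecmp, best_path_key=None):
    """Pick the AD per-EVI next hops used to build the ES destination."""
    if best_path_key is None:
        # bgp-path-selection false: numerically lowest IP addresses first.
        ranked = sorted(next_hops, key=ipaddress.ip_address)
    else:
        # bgp-path-selection true: caller-supplied best-path ordering.
        ranked = sorted(next_hops, key=best_path_key)
    # Keep at most the configured number of ECMP paths.
    return ranked[:ecmp]
```

Comparing parsed addresses rather than strings matters here: as strings, "192.0.2.100" would sort before "192.0.2.9".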
Anycast redundant solution for dual BGP-instance services
The following sections describe the anycast redundant solution for dual BGP instances in VPLS and Epipe services.
VPLS services
The following figure shows the anycast mechanism used to support gateway redundancy for dual BGP-instance services. The example shows two redundant DC gateways (DCGWs) where the VPLS services contain two BGP instances: one each for EVPN-VXLAN and EVPN-MPLS.
The example shown in Multihomed anycast solution depends on the ability of the two DCGWs to send the same inclusive multicast route to the remote PE or NVEs, such that:
- The remote PE or NVEs create a single BUM destination to one of the DCGWs (because the BGP selects only the best route to the DCGWs).
- The DCGWs do not create a destination between each other.
This solution avoids loops for BUM traffic, and known unicast traffic can use either DCGW router, depending on the BGP selection. The following CLI example output shows the configuration of each DCGW.
MD-CLI
/* bgp configuration on DCGW1 and DCGW2 */
[ex:/configure router "Base" bgp]
A:admin@DCGW# info
vpn-apply-export true
vpn-apply-import true
group "DC" {
type internal
family {
evpn true
}
}
group "WAN" {
type internal
family {
evpn true
}
}
neighbor "192.0.2.2" {
group "DC"
}
neighbor "192.0.2.6" {
group "WAN"
}
/* vpls service configuration in DCGW1 */
[ex:/configure service vpls "1"]
A:admin@DCGW1# info
admin-state enable
customer "1"
vxlan {
instance 1 {
vni 1
}
}
bgp 1 {
route-distinguisher "64501:12"
route-target {
export "target:64500:1"
import "target:64500:1"
}
}
bgp 2 {
route-distinguisher "64502:12"
route-target {
export "target:64500:1"
import "target:64500:1"
}
}
bgp-evpn {
evi 1
incl-mcast-orig-ip 10.12.12.12
vxlan 1 {
admin-state enable
vxlan-instance 1
}
mpls 2 {
admin-state enable
auto-bind-tunnel {
resolution any
}
}
}
/* vpls service configuration in DCGW2 */
[ex:/configure service vpls "1"]
A:admin@DCGW2# info
admin-state enable
customer "1"
vxlan {
instance 1 {
vni 1
}
}
bgp 1 {
route-distinguisher "64501:12"
route-target {
export "target:64500:1"
import "target:64500:1"
}
}
bgp 2 {
route-distinguisher "64502:12"
route-target {
export "target:64500:1"
import "target:64500:1"
}
}
bgp-evpn {
evi 1
incl-mcast-orig-ip 10.12.12.12
vxlan 1 {
admin-state enable
vxlan-instance 1
}
mpls 2 {
admin-state enable
auto-bind-tunnel {
resolution any
}
}
}
classic CLI
/* bgp configuration on DCGW1 and DCGW2 */
config>router>bgp#
group "WAN"
family evpn
type internal
neighbor 192.0.2.6
group "DC"
family evpn
type internal
neighbor 192.0.2.2
/* vpls service configuration */
DCGW-1# config>service>vpls(1)#
-----------------------
bgp
route-distinguisher 64501:12
route-target target:64500:1
exit
bgp 2
route-distinguisher 64502:12
route-target target:64500:1
exit
vxlan instance 1 vni 1 create
exit
bgp-evpn
evi 1
incl-mcast-orig-ip 10.12.12.12
vxlan bgp 1 vxlan-instance 1
no shutdown
mpls bgp 2
no shutdown
auto-bind-tunnel
resolution any
exit
<snip>
DCGW-2# config>service>vpls(1)#
-----------------------
bgp
route-distinguisher 64501:12
route-target target:64500:1
exit
bgp 2
route-distinguisher 64502:12
route-target target:64500:1
exit
vxlan instance 1 vni 1 create
exit
bgp-evpn
evi 1
incl-mcast-orig-ip 10.12.12.12
vxlan bgp 1 vxlan-instance 1
no shutdown
mpls bgp 2
no shutdown
auto-bind-tunnel
resolution any
<snip>
Based on the preceding configuration example, the behavior of the DCGWs in this scenario is as follows:
- DCGW-1 and DCGW-2 send inclusive multicast routes to the DC RR and WAN RR with the same route key. For example:
  - DCGW-1 and DCGW-2 both send an IR route to DC RR with RD=64501:12, orig-ip=10.12.12.12, and a different next hop and tunnel ID
  - DCGW-1 and DCGW-2 both send an IR route to WAN RR with RD=64502:12, orig-ip=10.12.12.12, and a different next hop and tunnel ID
- DCGW-1 and DCGW-2 both receive MAC/IP routes from DC and WAN that are redistributed to the other BGP instances, assuming that the route is selected as best route and the MAC is installed in the FDB.
  As described in section BGP-EVPN routes in services configured with two BGP instances, router peer policies are required so that only VXLAN or MPLS routes are sent or received for a specific peer.
- Configuration of the same incl-mcast-orig-ip address in both DCGWs enables the anycast solution for BUM traffic for all the following reasons:
  - The configured originating-ip is not required to be a reachable IP address and this forces the remote PE or NVEs to select only one of the two DCGWs.
  - The BGP next hops are allowed to be the system-ip or even a loopback address. In both cases, the BGP next hops are not required to be reachable in their respective networks.
  - In the example shown in Multihomed anycast solution, PE-1 picks up DCGW-1's inclusive multicast route (because of its lower BGP next hop) and creates a BUM destination to 192.0.2.4. When sending BUM traffic for VPLS-1, it only sends the traffic to DCGW-1. In the same way, the DCGWs do not set up BUM destinations between each other as they use the same originating-ip in their inclusive multicast routes.
The remote PE or NVEs perform a similar BGP selection for MAC/IP routes, as a specific MAC is sent by the two DCGWs with the same route key. A PE or NVE sends known unicast traffic for a specific MAC to only one DCGW.
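The anycast behavior for BUM destinations can be sketched as follows. This is a simplified model: bum_destinations is a hypothetical name, the BGP selection is reduced to the lowest-next-hop comparison that decides the example in this section, and the next hop 192.0.2.5 used for DCGW-2 in the usage note is an assumed value.

```python
import ipaddress
from collections import defaultdict


def bum_destinations(imet_routes, own_orig_ip=None):
    """Return the next hops to which a node builds BUM destinations.

    imet_routes: iterable of (rd, orig_ip, next_hop) inclusive multicast routes.
    """
    by_key = defaultdict(list)
    for rd, orig_ip, next_hop in imet_routes:
        # A router ignores routes that carry its own originating-ip.
        if own_orig_ip is not None and orig_ip == own_orig_ip:
            continue
        # Routes with the same route key compete in BGP selection.
        by_key[(rd, orig_ip)].append(next_hop)
    # Simplification: the lowest next hop wins among identical route keys.
    winners = {min(nhs, key=ipaddress.ip_address) for nhs in by_key.values()}
    return sorted(winners, key=ipaddress.ip_address)
```

A remote PE that receives both gateway routes, for example ("64502:12", "10.12.12.12", "192.0.2.4") and ("64502:12", "10.12.12.12", "192.0.2.5"), ends up with the single destination 192.0.2.4, while a DCGW calling the function with own_orig_ip="10.12.12.12" ends up with none.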
Epipe services
The Anycast redundancy solution can also be used for gateways that stitch SRv6 to MPLS domains for EVPN-VPWS services. The principle is similar to the one explained in VPLS services. The following figure shows an example.
The configuration on the two gateways (BR-2 and BR-3 in the preceding example) must generate AD per-EVI routes with the same route key (including the same RD) from both gateways so that the ingress PEs select one of the two gateways based on BGP best-path selection.
The following is an example of the (identical) configuration in BR-2 and BR-3.
MD-CLI
[ex:/configure service epipe "1"]
A:admin@BR-2/BR-3# info
admin-state enable
service-id 1
customer "1"
segment-routing-v6 1 {
locator "LOC-1" {
function {
end-dx2 {
}
}
}
}
bgp 1 {
route-distinguisher 2323:1
}
bgp 2 {
route-distinguisher 2323:2
}
endpoint "MPLS" {
}
endpoint "SRv6" {
}
bgp-evpn {
evi 1
local-attachment-circuit "gw-mpls" { // implicitly associated to bgp 1
eth-tag 1
endpoint "MPLS"
}
remote-attachment-circuit "ac-1-mpls" {
eth-tag 1
endpoint "MPLS"
}
local-attachment-circuit "gw-srv6" { // associated to bgp 2
eth-tag 1
endpoint "SRv6"
bgp 2
}
remote-attachment-circuit "ac-2-srv6" {
eth-tag 1
endpoint "SRv6"
bgp 2
}
mpls 1 {
admin-state enable
ecmp 2
domain-id "64500:1"
mh-mode access
auto-bind-tunnel {
resolution any
}
route-next-hop {
ip-address 2001:db8::2
}
}
segment-routing-v6 2 {
admin-state enable
source-address 2001:db8::2
ecmp 2
domain-id "64500:2"
mh-mode network // default
srv6 {
instance 1
default-locator "LOC-1"
}
route-next-hop {
system-ipv6
}
}
}
classic CLI
*A:BR-2/BR-3# configure service epipe 1
*A:BR-2/BR-3>config>service>epipe# info
----------------------------------------------
endpoint "MPLS" create
exit
endpoint "SRv6" create
exit
segment-routing-v6 1 create
locator "LOC-1"
function
end-dx2
exit
exit
exit
bgp 1
route-distinguisher 2323:1
exit
bgp 2
route-distinguisher 2323:2
exit
bgp-evpn
local-attachment-circuit "gw-mpls" bgp 1 endpoint "MPLS" create
eth-tag 1
exit
local-attachment-circuit "gw-srv6" bgp 2 endpoint "SRv6" create
eth-tag 1
exit
remote-attachment-circuit "ac-1-mpls" bgp 1 endpoint "MPLS" create
eth-tag 1
exit
remote-attachment-circuit "ac-2-srv6" bgp 2 endpoint "SRv6" create
eth-tag 1
exit
evi 1
mpls bgp 1
domain-id 64500:1
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
segment-routing-v6 bgp 2 srv6-instance 1 default-locator "LOC-1" create
domain-id 64500:2
mh-mode access
source-address 2001:db8::3
no shutdown
exit
exit
no shutdown
----------------------------------------------
In this example:
- The Anycast gateways attached to the same two domains redistribute the EVPN AD per-EVI routes between domains, where ESI is always reset to zero.
- The redundant gateways may set the same Ethernet Tag ID in the redistributed AD per-EVI route (the example shows the same eth-tag values, but the gateways could also use other values).
- The Anycast gateways process the received D-PATH attribute and update the D-PATH (with the source domain-id) when redistributing the AD per-EVI route to the next domain. The D-PATH attribute avoids control plane loops.
The following considerations related to the use of D-PATH in this configuration apply:
- Based on the domain configuration, when an AD per-EVI route is imported in domain X and redistributed into domain Y, the domain ID of X is prepended to the D-PATH in the redistributed AD per-EVI route.
- When two AD per-EVI routes for the same route distinguisher and Ethernet tag (same route key) are received in the same service from different next hops, D-PATH is considered in the BGP best-path selection, unless d-path-ignore true is configured.
- When two AD per-EVI routes for the same service are received with different route distinguishers and same Ethernet tag, from different next hops, D-PATH is considered in the BGP best-path selection, unless d-path-ignore true is configured, and assuming bgp-path-selection true is configured.
- If d-path-ignore false is configured, the router compares the D-PATH attribute received in VPWS AD per-EVI routes with the same key (same or different RDs) as follows:
  - The routes with the shortest D-PATH are preferred, therefore routes not tied to the shortest D-PATH are removed. Routes without D-PATH are considered zero-length D-PATH.
  - The routes with the numerically lowest left-most domain ID are preferred, therefore routes not tied to the numerically lowest left-most domain ID are removed from consideration.
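The two D-PATH comparison steps above can be sketched as a filter over candidate routes. This is a minimal sketch under stated assumptions: d_path_filter is a hypothetical name, domain IDs are written as "global:local" strings and compared as pairs of integers, and the input list is assumed to be non-empty.

```python
def d_path_filter(routes):
    """Apply the D-PATH steps to a list of (name, d_path) candidates.

    d_path is a list of "global:local" domain IDs, left-most first.
    A route without D-PATH is passed as an empty list (zero length).
    """
    # Step 1: keep only the routes tied for the shortest D-PATH.
    shortest = min(len(d_path) for _, d_path in routes)
    survivors = [(n, dp) for n, dp in routes if len(dp) == shortest]
    if shortest == 0:
        # No domain IDs left to compare.
        return survivors

    def domain_key(domain):
        # Assumption: compare "global:local" numerically, field by field.
        global_part, local_part = domain.split(":")
        return (int(global_part), int(local_part))

    # Step 2: keep only the routes tied for the lowest left-most domain ID.
    lowest = min(domain_key(dp[0]) for _, dp in survivors)
    return [(n, dp) for n, dp in survivors if domain_key(dp[0]) == lowest]
```

Routes that survive both steps remain tied and would be decided by the later criteria of the best-path selection.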
Using P2MP mLDP in redundant anycast DCGWs
Anycast multihoming and mLDP shows an example of a common BGP EVPN service configured in redundant anycast DCGWs and mLDP used in the MPLS instance.
When mLDP is used with multiple anycast multihoming DCGWs, the same originating IP address must be used by all the DCGWs. Failure to do so may result in packet duplication.
In the example shown in Anycast multihoming and mLDP, each pair of DCGWs (DCGW1/DCGW2 and DCGW3/DCGW4) is configured with a different originating IP address, which causes the following behavior:
- DCGW3 and DCGW4 receive the inclusive multicast routes with the same route key from DCGW1 and DCGW2.
- Both DCGWs (DCGW3 and DCGW4) select only one route, which is generally the same, for example, DCGW1's inclusive multicast route.
- As a result, DCGW3 and DCGW4 join the mLDP tree with root in DCGW1, creating packet duplication when DCGW1 sends BUM traffic.
- Remote PE nodes with a single BGP-EVPN instance join the mLDP tree without any problem.
To avoid the packet duplication shown in Anycast multihoming and mLDP, Nokia recommends configuring the same originating IP address in all four DCGWs (DCGW1/DCGW2 and DCGW3/DCGW4). However, the route distinguishers can be different per pair.
The following behavior occurs if the same originating IP address is configured on the DCGW pairs shown in Anycast multihoming and mLDP.
- DCGW3 and DCGW4 do not join any mLDP tree sourced from DCGW1 or DCGW2, which prevents any packet duplication. This is because a router ignores inclusive multicast routes received with its own originating-ip, regardless of the route distinguisher.
- PE1 joins the mLDP trees from the two DCs.
I-ES solution for dual BGP instance services
SR OS supports Interconnect ESs (I-ES) for VXLAN as per RFC 9014. An I-ES is a virtual ES that allows DCGWs with two BGP instances to handle VXLAN access networks as any other type of ES. I-ESs support the RFC 7432 multihoming functions, including single-active and all-active, ESI-based split-horizon filtering, DF election, and aliasing and backup on remote EVPN-MPLS PEs.
In addition to the EVPN multihoming features, the main advantages of the I-ES redundant solution compared to the redundant solution described in Anycast redundant solution for dual BGP-instance services are as follows:
- The use of I-ES for redundancy in dual BGP-instance services allows local SAPs on the DCGWs.
- P2MP mLDP can be used to transport BUM traffic between DCs that use I-ES without any risk of packet duplication. As described in Using P2MP mLDP in redundant anycast DCGWs, packet duplication may occur in the anycast DCGW solution when mLDP is used in the WAN.
Where EVPN-MPLS networks are interconnected to EVPN-VXLAN networks, the I-ES concept applies only to the access VXLAN network; the EVPN-MPLS network does not modify its existing behavior.
The Interconnect ES concept shows the use of I-ES for Layer 2 EVPN DCI between VXLAN and MPLS networks.
The following example shows how I-ES-1 would be provisioned on DCGW1 and the association between I-ES to a specified VPLS service. A similar configuration would occur on DCGW2 in the I-ES.
New I-ES configuration:
DCGW1#config>service>system>bgp-evpn#
ethernet-segment I-ES-1 virtual create
esi 01:00:00:00:12:12:12:12:12:00
service-carving
mode auto
multi-homing all-active
network-interconnect-vxlan 1
service-id
service-range 1 to 1000
no shutdown
Service configuration:
DCGW1#config>service>vpls(1)#
vxlan instance 1 vni 1 create
exit
bgp
route-distinguisher 1:1
bgp 2
route-distinguisher 2:2
bgp-evpn
evi 1
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
mpls bgp 2
auto-bind-tunnel resolution any
no shutdown
...
DCGW1#config>service>vpls(2)#
vxlan instance 1 vni 2 create
exit
bgp
route-distinguisher 3:3
bgp 2
route-distinguisher 4:4
bgp-evpn
evi 2
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
mpls bgp 2
auto-bind-tunnel resolution any
no shutdown
sap 1/1/1:1 create
exit
The preceding configuration associates I-ES-1 with the VXLAN instance in services VPLS 1 and VPLS 2. The I-ES is modeled as a virtual ES, with the following considerations:
- The commands network-interconnect-vxlan and service-id service-range svc-id [to svc-id] are required within the ES.
  - The network-interconnect-vxlan parameter identifies the VXLAN instance associated with the virtual ES. The value of the parameter must be set to 1. This command is rejected in a non-virtual ES.
  - The service-range parameter associates the specific service range to the ES. The ES must be configured as network-interconnect-vxlan before any service range can be added.
  - The ES parameters port, lag, sdp, vc-id-range, dot1q, and qinq cannot be configured in the ES when a network-interconnect-vxlan instance is configured. The source-bmac-lsb option is blocked, as the I-ES cannot be associated with an I-VPLS or PBB-Epipe service. The remaining ES configuration options are supported.
  - All services with two BGP instances associate the VXLAN destinations and ingress VXLAN instances to the ES.
- Multiple services can be associated with the same ES, with the following considerations:
  - In a DC with two DCGWs (as in The Interconnect ES concept), only two I-ESs are needed to load-balance, where one half of the dual BGP-instance services would be associated with one I-ES (for example, I-ES-1, in the preceding configuration) and one half to another I-ES.
  - Up to eight service ranges per VXLAN instance can be configured. Ranges may overlap within the same ES, but not between different ESs.
  - The service range can be configured before the service.
- After the I-ES is configured using network-interconnect-vxlan, the ES operational state depends exclusively on the ES administrative state. Because the I-ES is not associated with a physical port or SDP, when testing the non-revertive service carving manual mode, an ES shutdown and no shutdown event results in the node sending its own administrative preference and DP bit and taking control if the preference and DP bit are higher than the current DF. This is because the peer ES routes are not present at the EVPN application layer when the ES is configured for no shutdown; therefore, the PE sends its own administrative preference and DP values. For I-ESs, the non-revertive mode works only for node failures.
- A VXLAN instance may be placed in MhStandby under any of the following situations:
  - if the PE is single-active NDF for that I-ES
  - if the VXLAN service is added to the I-ES and either the ES or BGP-EVPN MPLS is shut down in all the services included in the ES
  The following example shows the change of the MhStandby flag from false to true when BGP-EVPN MPLS is shut down for all the services in the I-ES.
  A:PE-4# show service id 500 vxlan instance 1 oper-flags
  ===============================================================================
  VPLS VXLAN oper flags
  ===============================================================================
  MhStandby : false
  ===============================================================================
  A:PE-4# configure service vpls 500 bgp-evpn vxlan shutdown
  *A:PE-4# show service id 500 vxlan instance 1 oper-flags
  ===============================================================================
  VPLS VXLAN oper flags
  ===============================================================================
  MhStandby : true
  ===============================================================================
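The service-range rules above (at most eight ranges, overlap allowed inside one ES but not between ESs) can be sketched as a small checker. This is an illustration only: validate_service_ranges is a hypothetical name, ranges are inclusive (start, end) pairs, and each I-ES is treated as holding the ranges of its single VXLAN instance.

```python
def validate_service_ranges(es_ranges):
    """Check I-ES service ranges; return a list of violations.

    es_ranges: dict mapping ES name to a list of inclusive (start, end) ranges.
    """
    errors = []
    for name, ranges in es_ranges.items():
        # Up to eight service ranges can be configured.
        if len(ranges) > 8:
            errors.append(f"{name}: more than eight service ranges")
    names = list(es_ranges)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            for s1, e1 in es_ranges[first]:
                for s2, e2 in es_ranges[second]:
                    # Overlap is only rejected between different ESs.
                    if s1 <= e2 and s2 <= e1:
                        errors.append(
                            f"{first} {s1}-{e1} overlaps {second} {s2}-{e2}")
    return errors
```

For example, I-ES-1 may hold both 1 to 1000 and 500 to 600, but a second I-ES (here the assumed name I-ES-2) cannot hold a range that includes service 1000.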
BGP-EVPN routes on dual BGP-instance services with I-ES
The configuration of an I-ES on DCGWs with two BGP instances has the following impact on the advertisement and processing of BGP-EVPN routes.
-
For EVPN MAC/IP routes, the following considerations apply:
If bgp-evpn>vxlan>no auto-disc-route-advertisement and mh-mode access are configured on the access instance:
-
MAC/IP routes received in the EVPN-MPLS BGP instance are readvertised in the EVPN-VXLAN BGP instance with the ESI set to zero.
-
EVPN-VXLAN PEs and NVEs in the DC receive the same MAC from two or more different MAC/IP routes from the DCGWs, which perform regular EVPN MAC/IP route selection.
-
MAC/IP routes received in the EVPN-VXLAN BGP instance are readvertised in the EVPN-MPLS BGP instance with the configured non-zero I-ESI value, assuming the VXLAN instance is not in an MhStandby operational state; otherwise the MAC/IP routes are dropped.
-
EVPN-MPLS PEs in the WAN receive the same MAC from two or more DCGWs set with the same ESI. In this case, regular aliasing and backup functions occur as usual.
-
-
If bgp-evpn>vxlan>auto-disc-route-advertisement and mh-mode access are configured, the following differences apply to the above:
-
MAC/IP routes received in the EVPN-MPLS BGP instance are readvertised in the EVPN-VXLAN BGP instance with the ESI set to the I-ESI.
-
In this case, EVPN-VXLAN PEs and NVEs in the DC receive the same MAC from two or more different MAC/IP routes from the DCGWs, with the same ESI, therefore they can perform aliasing.
-
-
ES routes are exchanged for the I-ES. The routes should be sent only to the MPLS network and not to the VXLAN network. This can be achieved by using router policies.
-
AD per-ES and AD per-EVI routes are also advertised for the I-ES, and are sent only to the MPLS network and not to the VXLAN network if bgp-evpn>vxlan>no auto-disc-route-advertisement is configured. As with ES routes, router policies can be used to prevent these routes from being sent to VXLAN peers. If bgp-evpn>vxlan>auto-disc-route-advertisement is configured, AD routes must be sent to the VXLAN peers so that they can apply backup or aliasing functions.
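The ESI handling for readvertised MAC/IP routes described above can be sketched as follows. This is a minimal Python model under stated assumptions: the function name, its arguments, and the string values are hypothetical and only mirror the rules in the preceding list.

```python
ZERO_ESI = "00:00:00:00:00:00:00:00:00:00"

def readvertised_esi(received_in, i_esi, auto_disc_route_advertisement, mh_standby=False):
    """Return the ESI of a MAC/IP route readvertised into the other BGP instance.

    received_in: "evpn-mpls" or "evpn-vxlan" (instance where the route arrived)
    Returns None when the route is dropped instead of being readvertised.
    """
    if received_in == "evpn-mpls":
        # MPLS -> VXLAN: ESI 0 by default, I-ESI when AD routes are
        # also advertised to the VXLAN peers (aliasing in the DC).
        return i_esi if auto_disc_route_advertisement else ZERO_ESI
    if received_in == "evpn-vxlan":
        # VXLAN -> MPLS: always the non-zero I-ESI, unless the VXLAN
        # instance is in MhStandby, in which case the route is dropped.
        return None if mh_standby else i_esi
    raise ValueError("unknown BGP instance: %r" % received_in)
```

With the I-ESI used later in this section, `readvertised_esi("evpn-mpls", "00:23:23:23:23:23:23:00:00:01", False)` yields the zero ESI, so DC nodes fall back to regular MAC/IP route selection rather than aliasing.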
In general, when I-ESs are used for redundancy, the use of router policies is needed to avoid control plane loops with MAC/IP routes. Consider the following to avoid control plane loops:
-
loops created by remote MACs
Remote EVPN-MPLS MAC/IP routes are readvertised into EVPN-VXLAN routes with an SOO (Site Of Origin) EC added by a BGP peer or VSI export policy identifying the DCGW pair. The other DCGW in the pair drops EVPN-VXLAN MAC/IP routes tagged with the pair SOO. Router policies are needed to add SOO and drop routes received with self SOO.
When remote EVPN-VXLAN MAC/IP routes are readvertised into EVPN-MPLS, the DCGWs automatically drop EVPN-MPLS MAC/IP routes received with their own non-zero I-ESI.
-
loops created by local SAP MACs
Local SAP MACs are learned and MAC/IP routes are advertised into both BGP instances. The MAC/IP routes advertised in the EVPN-VXLAN instance are dropped by the peer based on the SOO router policies as described above for loops created by remote MACs. The DCGW local MACs are always learned over the EVPN-MPLS destinations between the DCGWs.
The following describes the considerations for BGP peer policies on DCGW1 to avoid control plane loops. Similar policies would be configured on DCGW2.
-
Avoid sending service VXLAN routes to MPLS peers and service MPLS routes to VXLAN peers.
-
Avoid sending AD and ES routes to VXLAN peers. If bgp-evpn>vxlan>auto-disc-route-advertisement is configured, AD routes must be sent to the VXLAN peers.
-
Add SOO to VXLAN routes sent to the ES peer.
-
Drop VXLAN routes received from the ES peer.
The following shows the CLI configuration:
A:DCGW1# configure router bgp
A:DCGW1>config>router>bgp# info
----------------------------------------------
family vpn-ipv4 evpn
vpn-apply-import
vpn-apply-export
rapid-withdrawal
rapid-update vpn-ipv4 evpn
group "wan"
type internal
export "allow only mpls"
neighbor 192.0.2.4
exit
neighbor 192.0.2.5
exit
exit
group "internal"
type internal
neighbor 192.0.2.1
export "allow only vxlan"
exit
neighbor 192.0.2.3
import "drop SOO-DCGW-23"
export "add SOO to vxlan routes"
exit
exit
no shutdown
----------------------------------------------
A:DCGW1>config>router>bgp# /configure router policy-options
A:DCGW1>config>router>policy-options# info
----------------------------------------------
community "mpls" members "bgp-tunnel-encap:MPLS"
community "vxlan" members "bgp-tunnel-encap:VXLAN"
community "SOO-DCGW-23" members "origin:64500:23"
This policy prevents the router from sending service VXLAN routes to MPLS peers.
policy-statement "allow only mpls"
entry 10
from
community "vxlan"
family evpn
exit
action drop
exit
exit
exit
This policy ensures the router only exports routes that include the VXLAN encapsulation.
policy-statement "allow only vxlan"
entry 10
from
community "vxlan"
family evpn
exit
action accept
exit
exit
default-action drop
exit
exit
This import policy avoids importing routes with a self SOO.
policy-statement "drop SOO-DCGW-23"
entry 10
from
community "SOO-DCGW-23"
family evpn
exit
action drop
exit
exit
exit
This export policy adds SOO only to VXLAN routes. This allows the peer to drop routes based on the SOO, without affecting the MPLS routes.
policy-statement "add SOO to vxlan routes"
entry 10
from
community "vxlan"
family evpn
exit
action accept
community add "SOO-DCGW-23"
exit
exit
default-action accept
exit
exit
----------------------------------------------
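To make the combined effect of the four policies easier to follow, the sketch below models them in Python. It is not router code: routes are plain dictionaries, the function names are hypothetical, and it assumes that a route matching no entry is accepted when the policy has no explicit default-action.

```python
MPLS = "bgp-tunnel-encap:MPLS"
VXLAN = "bgp-tunnel-encap:VXLAN"
SOO = "origin:64500:23"

def _is_evpn_with(route, community):
    return route["family"] == "evpn" and community in route["communities"]

def allow_only_mpls(route):
    # Export to WAN peers: drop EVPN routes with VXLAN encapsulation.
    return None if _is_evpn_with(route, VXLAN) else route

def allow_only_vxlan(route):
    # Export to DC peers: accept EVPN VXLAN routes, default-action drop.
    return route if _is_evpn_with(route, VXLAN) else None

def drop_soo(route):
    # Import from the ES peer: drop routes tagged with the pair SOO.
    return None if _is_evpn_with(route, SOO) else route

def add_soo_to_vxlan_routes(route):
    # Export to the ES peer: tag VXLAN routes only, accept everything.
    if _is_evpn_with(route, VXLAN):
        return {**route, "communities": route["communities"] | {SOO}}
    return route
```

In this model, a VXLAN route exported by one DCGW to its ES peer picks up the SOO and is then discarded by the import policy of the peer, while an MPLS route passes through both policies unchanged.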
Single-active multihoming on I-ES
When an I-ES is configured as single-active and configured as no shutdown with at least one associated service, the DCGWs send ES and AD routes as for any ES. They also run DF election as normal, based on the ES routes, with the candidate list being pruned by the AD routes.
I-ES — single-active shows the expected behavior for a single-active I-ES.
As shown in I-ES — single-active, the Non-Designated Forwarder (NDF) for a specified service carries out the following tasks:
-
From a data path perspective, the VXLAN instance on the NDF goes into an MhStandby operational state and blocks ingress and egress traffic on the VXLAN destinations associated with the I-ES.
-
The MAC/IP routes and the FDB process
-
MAC/IP routes associated with the VXLAN instance and readvertised to EVPN-MPLS peers are withdrawn.
-
MAC/IP routes corresponding to local SAP MACs or EVPN-MPLS binding MACs are withdrawn if they were advertised to the EVPN-VXLAN instance.
-
Received MAC/IP routes associated with the VXLAN instance are not installed in the FDB. MAC/IP routes show as "used" in BGP; however, only the MAC/IP route received from MPLS (from the ES peer) is programmed.
-
-
The Inclusive Multicast Ethernet Tag (IMET) routes process
-
IMET-AR-R routes (IMET-AR with replicator role) must be withdrawn if the VXLAN instance goes into an MhStandby operational state. Only the DF advertises the IMET-AR-R routes.
-
IMET-IR advertisements in the case of the NDF (or MhStandby) are controlled by the command config>service>vpls>bgp-evpn>vxlan [no] send-imet-ir-on-ndf.
By default, the command is enabled and the router advertises IMET-IR routes, even if the PE is NDF (MhStandby). This attracts BUM traffic, but also speeds up convergence in the case of a DF switchover. The command is supported for single-active and all-active.
If the command is disabled, the router withdraws the IMET-IR routes when the PE is NDF and does not attract BUM traffic.
-
The I-ES DF PE for the service continues advertising IMET and MAC/IP routes for the associated VXLAN instance as usual, as well as forwarding on the DF VXLAN bindings. When the DF DCGW receives BUM traffic, it sends the traffic with the egress ESI label if needed.
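The IMET advertisement rules above can be condensed into a short Python sketch. The function and argument names are hypothetical, and the model assumes that a DF configured as AR replicator advertises both an IMET-IR and an IMET-AR-R route.

```python
def imet_routes_advertised(is_df, send_imet_ir_on_ndf=True, ar_replicator=False):
    """Return the set of IMET route types advertised for the VXLAN instance."""
    routes = set()
    if is_df:
        routes.add("IMET-IR")
        if ar_replicator:
            # Only the DF advertises the replicator route.
            routes.add("IMET-AR-R")
    elif send_imet_ir_on_ndf:
        # Default behavior: the NDF keeps advertising IMET-IR, which
        # attracts BUM traffic but speeds up a DF switchover.
        routes.add("IMET-IR")
    return routes
```

With `send_imet_ir_on_ndf=False`, an NDF returns an empty set in this model, which corresponds to the router withdrawing its IMET-IR route.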
All-active multihoming on I-ES
The same considerations for ES and AD routes, and DF election apply for all-active multihoming as for single-active multihoming; the difference is in the behavior on the NDF DCGW. The NDF for a specified service performs the following tasks:
-
From a data path perspective, the NDF blocks ingress and egress paths for broadcast and multicast traffic on the VXLAN instance bindings associated with the I-ES, while unknown and known unicast traffic is still allowed. The unknown unicast traffic is transmitted on the NDF if there is no risk of duplication. For example, unknown unicast packets are transmitted on the NDF if they do not have an ESI label, do not have an EVPN BUM label, and pass MAC SA suppression. In the example in All-active multihoming and unknown unicast on the NDF, the NDF transmits unknown unicast traffic. Regardless of whether DCGW1 is a DF or NDF, it accepts the unknown unicast packets and floods to local SAPs and EVPN destinations. When sending to DCGW2, the router sends the ESI label identifying the I-ES. Because of the ESI-label suppression, DCGW2 does not send unknown traffic back to the DC.
-
The MAC/IP routes and the FDB process
-
MAC/IP routes associated with the VXLAN instance are advertised normally.
-
MACs are installed as normal in the FDB for received MAC/IP routes associated with the VXLAN instance.
-
-
The IMET routes process
-
As with single-active multihoming, IMET-AR-R routes must be withdrawn on the NDF (MhStandby state). Only the DF advertises the IMET-AR-R routes.
-
The IMET-IR advertisements in the case of the NDF (or MhStandby) are controlled by the command config>service>vpls>bgp-evpn>vxlan [no] send-imet-ir-on-ndf, as in single-active multihoming.
-
The behavior on the non-DF for BUM traffic can also be controlled by the command config>service>vpls>vxlan>rx-discard-on-ndf {bm | bum | none}, where the default option is bm. The user can change this option to discard all BUM traffic (bum) or to forward all BUM traffic (none).
The I-ES DF PE for the service continues advertising IMET and MAC/IP routes for the associated VXLAN instance as usual. When the DF DCGW receives BUM traffic, it sends the traffic with the egress ESI label if needed.
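The transmit behavior of the all-active NDF toward the VXLAN bindings of the I-ES can be expressed as a simple predicate. The Python sketch below is a simplified model with hypothetical names; it only encodes the conditions stated above and is not a description of the actual forwarding implementation.

```python
def all_active_ndf_tx_allowed(traffic, has_esi_label=False, has_bum_label=False,
                              passes_mac_sa_suppression=True):
    """Decide whether the all-active NDF transmits a frame to the I-ES bindings.

    traffic: "broadcast", "multicast", "known-unicast" or "unknown-unicast"
    """
    if traffic in ("broadcast", "multicast"):
        return False  # always blocked on the NDF
    if traffic == "known-unicast":
        return True   # always allowed
    if traffic == "unknown-unicast":
        # Allowed only when there is no risk of duplication.
        return (not has_esi_label
                and not has_bum_label
                and passes_mac_sa_suppression)
    raise ValueError("unknown traffic type: %r" % traffic)
```

For instance, an unknown unicast frame that arrives with the ESI label of the I-ES is not sent back to the DC, which matches the ESI-label suppression described for DCGW2.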
Multi-instance EVPN: Two instances of the same encapsulation in the same VPLS/R-VPLS service
As described in Multi-instance EVPN: Two instances of different encapsulation in the same VPLS/R-VPLS/Epipe service, two BGP instances are supported in VPLS services, where one instance can be associated with the EVPN-VXLAN and the other instance with the EVPN-MPLS. In addition, both BGP instances in a VPLS/R-VPLS service can also be associated with EVPN-VXLAN, or both instances can be associated with EVPN-MPLS.
For example, a VPLS service can be configured with two VXLAN instances that use VNI 500 and 501 respectively, and those instances can be associated with different BGP instances:
*A:PE-2# configure service vpls 500
*A:PE-2>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 500 create
exit
vxlan instance 2 vni 501 create
exit
bgp
route-distinguisher 192.0.2.2:500
vsi-export "vsi-500-export"
vsi-import "vsi-500-import"
exit
bgp 2
route-distinguisher 192.0.2.2:501
vsi-export "vsi-501-export"
vsi-import "vsi-501-import"
exit
bgp-evpn
incl-mcast-orig-ip 23.23.23.23
evi 500
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
vxlan bgp 2 vxlan-instance 2
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
From a data plane perspective, each VXLAN instance is instantiated in a different implicit SHG, so that traffic can be forwarded between them.
In addition, multi-instance EVPN-VXLAN services support:
-
assisted-replication for IPv4 VTEPs in both VXLAN instances, where a single assisted-replication IPv4 address can be used for both instances
-
non-system IP and IPv6 termination, where a single vxlan-src-vtep ip-address can be configured for each service, and therefore used for the two instances
Similarly, a VPLS service can be configured with two EVPN-MPLS instances, as shown in the following example:
*A:PE-2# configure service vpls 700
*A:PE-2>config>service>vpls# info
----------------------------------------------
description "two bgp-evpn mpls instances"
bgp
route-distinguisher auto-rd
vsi-export "vsi-700-export"
vsi-import "vsi-700-import"
exit
bgp 2
route-distinguisher auto-rd
vsi-export "vsi-701-export"
vsi-import "vsi-701-import"
exit
bgp-evpn
evi 700
mpls bgp 1
mh-mode access
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
mpls bgp 2
mh-mode network
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
Multi-instance EVPN-MPLS VPLS/R-VPLS services have the same limitations as any multi-instance service, as described in Multi-Instance EVPN: EVPN-VXLAN and EVPN-MPLS in the same VPLS/R-VPLS service. In addition, services with two EVPN-MPLS instances do not support SAPs.
The mh-mode {network|access} command in the vpls>bgp-evpn>mpls context determines which instance is considered access and which instance is considered network.
- The default value is mh-mode network, and only one instance can be configured as mh-mode network. The other instance must be configured as mh-mode access.
- The use of provider-tunnel is supported if there is one instance configured as network, and the P2MP tunnel is implicitly associated with the network instance.
Multi-instance EVPN-MPLS VPLS/R-VPLS services support:
-
all of the auto-bind-tunnel resolution options in each of the two instances. This includes resolution of IPv4 next-hops to TTMv4 entries and resolution of IPv6 next-hops to TTMv6 entries.
-
different address families in different instances. For instance, mpls bgp 1 may resolve routes to TTMv4 and mpls bgp 2 to TTMv6, or the reverse. In a VPLS service with two EVPN-VXLAN instances, it is not possible to have an instance with routes resolved to IPv4 VXLAN tunnels and the other instance with routes resolved to IPv6 VXLAN tunnels.
-
an explicit split-horizon-group in each instance; however, the same split-horizon-group cannot be configured on the two instances of the same VPLS service
-
a restrict-protected-src discard-frame per instance. If a MAC is protected in one instance and a frame arrives at the other instance with the protected MAC as source MAC, the frame is discarded if restrict-protected-src discard-frame is configured.
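Some of the constraints above lend themselves to a pre-deployment check. The following Python sketch validates a simplified, hypothetical representation of a service with two EVPN-MPLS instances; the dictionary keys are chosen for illustration and the checks cover only the rules stated in this section.

```python
def validate_dual_evpn_mpls(instances, saps=()):
    """Return a list of violations for a service with two EVPN-MPLS instances.

    instances: list of two dicts with optional keys "mh-mode" and
               "split-horizon-group"
    saps: SAPs configured in the service
    """
    if len(instances) != 2:
        return ["expected exactly two EVPN-MPLS instances"]
    errors = []
    # mh-mode defaults to network; exactly one instance may be network.
    modes = sorted(inst.get("mh-mode", "network") for inst in instances)
    if modes != ["access", "network"]:
        errors.append("one instance must be mh-mode network and the other mh-mode access")
    # The same explicit split-horizon-group cannot be used in both instances.
    shgs = [inst.get("split-horizon-group") for inst in instances]
    if shgs[0] is not None and shgs[0] == shgs[1]:
        errors.append("the same split-horizon-group is configured on both instances")
    # Services with two EVPN-MPLS instances do not support SAPs.
    if saps:
        errors.append("SAPs are not supported with two EVPN-MPLS instances")
    return errors
```

An empty list means that none of the modeled rules is violated; it does not imply that the router accepts the configuration.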
BGP-EVPN routes in multi-instance EVPN services with the same encapsulation
If two BGP instances with the same encapsulation (VXLAN or MPLS) are configured in the same VPLS/R-VPLS service, different import route targets in each BGP instance are mandatory (although this is not enforced).
BGP-EVPN routes in services configured with two BGP instances describes the use of policies to avoid sending WAN routes (routes meant to be redistributed from DC to WAN) to the DC again and DC routes (routes meant to be redistributed from WAN to DC) to the WAN again. Those policies are based on export policy statements that match on the RFC 9012 BGP encapsulation extended community (MPLS and VXLAN respectively).
When the two BGP instances are of the same encapsulation (VXLAN or MPLS), policies that match on different BGP encapsulation extended communities are not feasible because both instances advertise routes with the same encapsulation value. Because the export route targets in the two BGP instances must be different, the policies that avoid sending WAN routes back to the WAN and DC routes back to the DC can instead be based on export policies that prevent routes with a DC route target from being sent to the WAN peers (and the opposite for routes with a WAN route target).
In scaled scenarios, matching based on route targets does not scale well. An alternative and preferred solution is to configure a default-route-tag that identifies all the EVPN instances connected to the DC (or one domain), and a different default-route-tag in all the EVPN instances connected to the WAN (or the other domain). Anycast redundant solution for multi-instance EVPN services with the same encapsulation shows an example that demonstrates the use of default-route-tags.
Other than the specifications described in this section, the processing of MAC/IP routes and inclusive multicast Ethernet tag routes in multi-instance EVPN services of the same encapsulation follows the rules described in BGP-EVPN routes in services configured with two BGP instances.
Anycast redundant solution for multi-instance EVPN services with the same encapsulation
The solution described in Anycast redundant solution for dual BGP-instance services is also supported in multi-instance EVPN VPLS/R-VPLS services with the same encapsulation.
The following CLI example output shows the configuration of DCGW-1 and DCGW-2 in Multihomed anycast solution where VPLS 500 is a multi-instance EVPN-VXLAN service and BGP instance 2 is associated with VXLAN instead of MPLS.
Different default-route-tags are used in BGP instance 1 and instance 2, so that in the export route policies, DC routes are not advertised to the WAN, and WAN routes are not advertised to the DC, respectively.
*A:DCGW-1(and DCGW-2)>config>service>vpls(500)# info
----------------------------------------------
vxlan instance 1 vni 500 create
exit
vxlan instance 2 vni 501 create
exit
bgp
route-distinguisher 192.0.2.2:500
route-target target:64500:500
exit
bgp 2
route-distinguisher 192.0.2.2:501
route-target target:64500:501
exit
bgp-evpn
incl-mcast-orig-ip 23.23.23.23
evi 500
vxlan bgp 1 vxlan-instance 1
default-route-tag 500
no shutdown
exit
vxlan bgp 2 vxlan-instance 2
default-route-tag 501
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
config>router>bgp#
vpn-apply-import
vpn-apply-export
group "WAN"
family evpn
type internal
export "allow only mpls"
neighbor 192.0.2.6
group "DC"
family evpn
type internal
export "allow only vxlan"
neighbor 192.0.2.2
config>router>policy-options# info
----------------------------------------------
policy-statement "allow only mpls"
entry 10
from
family evpn
tag 500
exit
action drop
exit
exit
exit
policy-statement "allow only vxlan"
entry 10
from
family evpn
tag 501
exit
action drop
exit
exit
exit
The same Anycast redundant solution can be applied to VPLS/R-VPLS with two instances of EVPN-MPLS encapsulation. The configuration would be identical, other than replacing the VXLAN aspects with the EVPN-MPLS-specific parameters.
For a full description of this solution, see the Anycast redundant solution for dual BGP-instance services.
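The effect of the two tag-based export policies in the example can be sketched in Python. This is an illustrative model with hypothetical names; it assumes that the default-route-tag identifies the BGP instance advertising the route, as the policies in the example suggest.

```python
# default-route-tag configured per BGP instance in VPLS 500
INSTANCE_TAG = {1: 500, 2: 501}
# tag dropped by the export policy applied to each BGP peer group
EXPORT_DROP_TAG = {"WAN": 500, "DC": 501}

def peer_groups_receiving(bgp_instance):
    """Return the peer groups to which routes of the given instance are exported."""
    tag = INSTANCE_TAG[bgp_instance]
    return sorted(group for group, dropped in EXPORT_DROP_TAG.items()
                  if dropped != tag)
```

In this model, routes of instance 1 (tag 500) are exported only to the "DC" group and routes of instance 2 (tag 501) only to the "WAN" group, without any route-target matching.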
I-ES solution for dual BGP EVPN instance services with the same encapsulation
The I-ES, or network-interconnect VXLAN Ethernet segment, is described in I-ES solution for dual BGP instance services. I-ESs are also supported on VPLS and R-VPLS services with two EVPN-VXLAN instances.
I-ES in dual EVPN-VXLAN services shows the use of an I-ES in a dual EVPN-VXLAN instance service.
Similar to (single-instance) EVPN-VXLAN all-active multihoming, the BUM forwarding procedures follow the "Local Bias" behavior.
At the ingress PE, the forwarding rules for EVPN-VXLAN services are as follows:
-
The no send-imet-ir-on-ndf or rx-discard-on-ndf bum command must be configured so that the NDF does not forward any BUM traffic.
-
BUM frames received on any SAP or I-ES VXLAN binding are flooded to:
-
local non-ES and single-active DF ES SAPs
-
local all-active ES SAPs (DF and NDF)
-
EVPN-VXLAN destinations
BUM received on an I-ES VXLAN binding follows SHG rules; for example, it can only be forwarded to EVPN-VXLAN destinations that belong to the other VXLAN instance (instance 2), which is a different SHG.
-
-
As an example, in I-ES in dual EVPN-VXLAN services:
-
GW1 and GW2 are configured with no send-imet-ir-on-ndf.
-
TOR1 generates BUM traffic that only reaches GW1 (DF).
-
GW1 forwards to CE1 and EVPN-VXLAN destinations.
-
The forwarding rules at the egress PE are as follows:
-
The source VTEP is looked up for BUM frames received on EVPN-VXLAN.
-
If the source VTEP matches one of the PEs with which the local PE shares both an ES and a VXLAN service:
-
Then the local PE does not forward to the shared local ESs (this includes port, lag, or network-interconnect-vxlan ESs). It does, however, forward to non-shared ES SAPs unless they are in NDF state.
-
Else, the local PE forwards normally to local ESs unless they are in NDF state.
-
-
Because there is no multicast label or multicast B-MAC in VXLAN, the only way the egress PE can identify BUM traffic is by looking at the customer MAC DA. Therefore, BM or unknown MAC DAs identify BUM traffic.
-
As an example, in I-ES in dual EVPN-VXLAN services:
-
GW2 receives BUM on EVPN-VXLAN. GW2 identifies the source VTEP as a PE with which I-ES-1 is shared; therefore, it does not forward the BUM frames to the local I-ES. It does, however, forward to the non-shared ES and local SAPs (CE2).
-
GW3 receives BUM on EVPN-VXLAN; however, the source VTEP does not match any PE with which GW3 shares an ES. Therefore, GW3 forwards to all local ESs that are DF, in other words, CE3.
-
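The egress "Local Bias" rules can be sketched as a filter over the local Ethernet segments. The Python model below is a simplification with hypothetical names and data structures; the VTEP addresses in the usage are placeholders, not values from the figure.

```python
def egress_flood_targets(source_vtep, local_ess, plain_saps=()):
    """Return the local targets to which a BUM frame received on EVPN-VXLAN is flooded.

    local_ess: list of dicts with keys
        "name"        - label of the local ES
        "df"          - True if the local PE is DF for that ES
        "shared_with" - set of VTEPs of PEs sharing the ES and a VXLAN service
    plain_saps: local SAPs that are not part of any ES
    """
    targets = list(plain_saps)
    for es in local_ess:
        if source_vtep in es["shared_with"]:
            # Local bias: the ingress PE already forwarded to this shared ES.
            continue
        if not es["df"]:
            # ES in NDF state: do not forward.
            continue
        targets.append(es["name"])
    return targets
```

As a usage example, a gateway that shares its I-ES with the source VTEP and has one plain SAP floods only to that SAP, while a gateway that shares no ES with the source VTEP floods to every local ES for which it is DF.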
The following configuration example shows how I-ES-1 would be provisioned on DCGW1 and how the I-ES is associated with a specified VPLS service. A similar configuration would be applied on DCGW2, the other DCGW in the I-ES.
I-ES configuration:
*A:GW1>config>service>system>bgp-evpn>eth-seg# info
----------------------------------------------
esi 00:23:23:23:23:23:23:00:00:01
service-carving
mode manual
manual
preference non-revertive create
value 150
exit
evi 101 to 200
exit
exit
multi-homing all-active
network-interconnect-vxlan 1
service-id
service-range 1
service-range 1000 to 1002
service-range 2000
exit
no shutdown
Service configuration:
*A:GW1>config>service>vpls# info
----------------------------------------------
vxlan instance 1 vni 1000 create
rx-discard-on-ndf bum
exit
vxlan instance 2 vni 1002 create
exit
bgp
route-target export target:64500:1000 import target:64500:1000
exit
bgp 2
route-distinguisher auto-rd
route-target export target:64500:1002 import target:64500:1002
exit
bgp-evpn
evi 1000
vxlan bgp 1 vxlan-instance 1
ecmp 2
default-route-tag 100
auto-disc-route-advertisement
no shutdown
exit
vxlan bgp 2 vxlan-instance 2
ecmp 2
default-route-tag 200
auto-disc-route-advertisement
mh-mode network
no shutdown
exit
exit
no shutdown
Multi-instance EVPN VPLS/R-VPLS services with two EVPN-MPLS instances do not support I-ESs.
For information about how the EVPN routes are processed and advertised in an I-ES, see the I-ES solution for dual BGP instance services.
Configuring static VXLAN and EVPN in the same VPLS/R-VPLS service
In some DCGW use cases, static VXLAN must be used to connect DC switches that do not support EVPN to the WAN so that a tenant subnet can be extended to the WAN. For those cases, the DC Gateway is configured with VPLS services that include a static VXLAN instance and a BGP-EVPN instance in the same service. The following combinations are supported in the same VPLS/R-VPLS service:
-
two static VXLAN instances
-
one static VXLAN instance and one EVPN-VXLAN instance
-
one static VXLAN instance and one EVPN-MPLS instance
When a static VXLAN instance coexists with EVPN-MPLS in the same VPLS/R-VPLS service, the VXLAN instance can be associated with a network-interconnect-vxlan ES if VXLAN uses instance 1. Both single-active and all-active multihoming modes are supported as follows:
-
In single-active mode, the following behavior applies to a VXLAN binding associated with the ES on the NDF:
-
TX (transmission to VXLAN)
No MACs are learned against the binding, and the binding is removed from the default multicast list.
-
RX (reception from VXLAN)
The RX state is down for the binding.
-
-
In all-active mode, the following behavior applies to the NDF:
-
on TX
The binding is kept in the default multicast list, but only forwards the unknown-unicast traffic.
-
on RX
The behavior is determined by the command rx-discard-on-ndf {bm | bum | none} where:
-
The option bm, which is the default, discards broadcast and multicast traffic and allows unicast traffic (known and unknown).
-
The option bum discards any BUM frame on the NDF reception.
-
The option none does not discard any BUM frame on the NDF reception.
-
-
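The three rx-discard-on-ndf options map directly to the set of frame types that the all-active NDF discards on reception. The following Python sketch shows that mapping; the function name and the frame type strings are hypothetical.

```python
# Frame types discarded on NDF reception for each rx-discard-on-ndf option.
DISCARDED_ON_NDF = {
    "bm": {"broadcast", "multicast"},
    "bum": {"broadcast", "unknown-unicast", "multicast"},
    "none": set(),
}

def ndf_rx_accepts(frame_type, rx_discard_on_ndf="bm"):
    """Return True if the all-active NDF accepts the frame from the VXLAN binding."""
    return frame_type not in DISCARDED_ON_NDF[rx_discard_on_ndf]
```

Known unicast is accepted with every option in this model, because the command only affects BUM traffic.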
The use of the rx-discard-on-ndf options is shown in the following cases.
Use case 1: Static VXLAN with anycast VTEPs and all-active ES
This use case, which is illustrated in I-ES multihoming – static VXLAN with anycast VTEPs, works only for all-active I-ESs.
In this use case, the DCGWs use anycast VTEPs, that is, PE1 has a single egress VTEP configured to the DCGWs, for example, 12.12.12.12. Normally, PE1 finds ECMP paths to send the traffic to both DCGWs. However, because a specified BUM flow can be sent to either the DF or the NDF (but not to both at the same time), the DCGWs must be configured with the following option so that BUM is not discarded on the NDF:
rx-discard-on-ndf none
Similar to any LAG-like scenario at the access, the access CE load balances the traffic to the multihomed PEs, but a specific flow is only sent to one of these PEs. With the option none, the BUM traffic on RX is accepted, and there are no duplicate packets or black-holed packets.
Use case 2: Static VXLAN with non-anycast VTEPs
This use case, which is shown in the following figure, works with single or all-active multihoming.
In this case, the DCGWs use different VTEPs, for example 1.1.1.1 and 2.2.2.2 respectively. PE1 has two separate egress VTEPs to the DCGWs. Therefore, PE1 sends BUM flows to both DCGWs at the same time. Concerning all-active multihoming, if the default option for rx-discard-on-ndf is configured, PE2 and PE3 receive duplicate unknown unicast packets from PE1 (because the default option accepts unknown unicast on the RX of the NDF). Therefore, the DCGWs must be configured with rx-discard-on-ndf bum.
Any use case in which the access PE sends BUM flows to all multihomed PEs, including the NDF, is similar to I-ES multihoming - static VXLAN with non-anycast VTEPs. BUM traffic must be blocked on the NDF’s RX to avoid duplicate unicast packets.
For single-active multihoming, the rx-discard-on-ndf option is irrelevant because BUM and known unicast are always discarded on the NDF.
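The difference between the two all-active use cases can be illustrated by counting how many copies of a BUM frame the DCGW pair accepts from PE1. The Python sketch below is a simplified model with hypothetical names; it assumes that the DF always accepts BUM traffic and that an anycast flow is hashed to exactly one DCGW.

```python
DISCARDED_ON_NDF = {
    "bm": {"broadcast", "multicast"},
    "bum": {"broadcast", "unknown-unicast", "multicast"},
    "none": set(),
}

def copies_accepted(frame_type, vtep_mode, rx_discard_on_ndf, hashed_to="ndf"):
    """Count the copies of one BUM frame accepted by the DCGW pair.

    vtep_mode: "anycast" (flow reaches one DCGW) or "non-anycast" (both)
    hashed_to: DCGW chosen by ECMP in the anycast case, "df" or "ndf"
    """
    ndf_accepts = frame_type not in DISCARDED_ON_NDF[rx_discard_on_ndf]
    if vtep_mode == "anycast":
        # 0 means the flow is black-holed on the NDF.
        return 1 if (hashed_to == "df" or ndf_accepts) else 0
    if vtep_mode == "non-anycast":
        # 2 means the WAN receives a duplicate.
        return 1 + int(ndf_accepts)
    raise ValueError("unknown VTEP mode: %r" % vtep_mode)
```

In this model, only rx-discard-on-ndf none gives exactly one copy for every BUM type with anycast VTEPs, and only rx-discard-on-ndf bum does so with non-anycast VTEPs, which is consistent with the recommendations in both use cases.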
Also, when non-anycast VTEPs are used on DCGWs, the following can be stated:
-
MAC addresses learned on one DCGW and advertised in EVPN are not learned on the redundant DCGW through EVPN, based on the presence of a local ES in the route. I-ES multihoming - static VXLAN with non-anycast VTEPs shows a scenario in which the MAC of the VM can be advertised by DCGW1, but not learned by DCGW2.
-
As a result of the above behavior, and because PE2 known unicast to M1 can be aliased to DCGW2, traffic to M1 is flooded when it gets to DCGW2 because M1 is unknown there. DCGW2 floods to all the static bindings, as well as local SAPs.
-
ESI-label filtering, and no VXLAN binding between DCGWs, avoid loops for BUM traffic sent from the DF.
When a static VXLAN instance coexists with EVPN-VXLAN in the same VPLS or R-VPLS service, no VXLAN instance should be associated with an all-active network-interconnect-vxlan ES. This is because when multihoming is used with an EVPN-VXLAN core, the non-DF PE always discards unknown unicast traffic to the static VXLAN instance (this is not the case with EVPN-MPLS if the unknown traffic has a BUM label) and traffic blackholes may occur. This is discussed in the following example:
-
Consider the example in I-ES multihoming - static VXLAN with non-anycast VTEPs, only replacing EVPN-MPLS with EVPN-VXLAN in the WAN network.
-
Consider that PE2 has learned the VM’s MAC via the ES-1 EVPN destination. Because of the regular aliasing procedures, PE2 may send unicast traffic with destination VM to DCGW1, which is the non-DF for I-ES 1.
-
Because EVPN-VXLAN is used in the WAN instead of EVPN-MPLS, when the traffic gets to DCGW1, it is dropped if the VM’s MAC is not learned on DCGW1, creating a blackhole for the flow. If the I-ES had used EVPN-MPLS in the WAN, DCGW1 would have flooded to the static VXLAN binds and no blackhole would have occurred.
Because of the behavior illustrated above, when a static VXLAN instance coexists with an EVPN-VXLAN instance in the same VPLS/R-VPLS service, redundancy based on all-active I-ES is not recommended and single-active or an anycast solution without I-ES should be used instead. Anycast solutions are discussed in Anycast redundant solution for multi-instance EVPN services with the same encapsulation, only with a static VXLAN instance in instance 1 instead of EVPN-VXLAN in this case.
EVPN IP-prefix route interoperability
SR OS supports the three IP-VRF-to-IP-VRF models defined in draft-ietf-bess-evpn-prefix-advertisement for EVPN-VXLAN and EVPN-MPLS R-VPLS services. Those three models are known as:
interface-less IP-VRF-to-IP-VRF
interface-ful IP-VRF-to-IP-VRF with SBD IRB (Supplementary Broadcast Domain Integrated Routing and Bridging)
interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB
SR OS supports all three models for IPv4 and IPv6 prefixes. The three models have pros and cons, and different vendors have chosen different models depending on the use cases that they intend to address. When a third-party vendor is connected to an SR OS router, it is important to know which of the three models the third-party vendor implements. The following sections describe the models and the required configuration in each of them.
Interface-ful IP-VRF-to-IP-VRF with SBD IRB model
The SBD is equivalent to an R-VPLS that connects all the PEs that are attached to the same tenant VPRN. Interface-ful refers to the fact that there is a full IRB interface between the VPRN and the SBD (an interface object with MAC and IP addresses, over which interface parameters can be configured).
Interface-ful IP-VRF-to-IP-VRF with SBD IRB model illustrates this model.
Interface-ful IP-VRF-to-IP-VRF with SBD IRB model shows a 7750 SR and a third-party router using interface-ful IP-VRF-to-IP-VRF with SBD IRB model. The two routers are attached to a VPRN for the same tenant, and those VPRNs are connected by R-VPLS-2, or SBD. Both routers exchange IP prefix routes with a non-zero gateway IP (this is the IP address of the SBD IRB). The SBD IRB MAC and IP are advertised in a MAC/IP route. On reception, the IP prefix route creates a route-table entry in the VPRN, where the gateway IP must be recursively resolved to the information provided by the MAC/IP route and installed in the ARP and FDB tables.
This model is described in detail in EVPN for VXLAN in IRB backhaul R-VPLS services and IP prefixes. As an example, and based on Interface-ful IP-VRF-to-IP-VRF with SBD IRB model above, the following CLI output shows the configuration of a 7750 SR SBD and VPRN using this interface-ful with SBD IRB model:
7750SR#config>service#
vpls 2 customer 1 name "sbd" create
allow-ip-int-bind
exit
bgp
exit
bgp-evpn
evi 2
ip-route-advertisement
mpls bgp 1
auto-bind-tunnel resolution any
no shutdown
vprn 1 customer 1 name "vprn1" create
route-distinguisher auto-rd
interface "sbd" create
address 192.168.0.1/16
ipv6
address 30::3/64
exit
vpls "sbd"
The model is also supported for IPv6 prefixes. There are no configuration differences except the ability to configure an IPv6 address and interface.
Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model
Interface-ful refers to the fact that there is a full IRB interface between the VPRN and the SBD. However, the SBD IRB is unnumbered in this model, which means no IP address is configured on it. In SR OS, an unnumbered SBD IRB is equivalent to an R-VPLS linked to a VPRN interface through an EVPN tunnel. See EVPN for VXLAN in EVPN tunnel R-VPLS services for more information.
Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model illustrates this model.
Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model shows a 7750 SR and a third-party router running interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model. The IP prefix routes are now expected to have a zero gateway IP, and the MAC in the router's MAC extended community is used for the recursive resolution to a MAC/IP route.
The corresponding configuration of the 7750 SR VPRN and SBD in the example could be:
7750SR#config>service#
vpls 2 customer 1 name "sbd" create
allow-ip-int-bind
exit
bgp
exit
bgp-evpn
evi 2
ip-route-advertisement
mpls bgp 1
auto-bind-tunnel resolution any
no shutdown
vprn 1 customer 1 create
route-distinguisher auto-rd
interface "sbd" create
ipv6
exit
vpls "sbd"
evpn-tunnel ipv6-gateway-address mac
Note that the evpn-tunnel command controls the use of the Router's MAC extended community and the zero gateway IP in the IPv4-prefix route. For IPv6, the ipv6-gateway-address mac option makes the router advertise the IPv6-prefix routes with a Router's MAC extended community and zero gateway IP.
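The recursive resolution used by the two interface-ful models differs only in the lookup key: the gateway IP for the numbered SBD IRB and the router's MAC for the unnumbered SBD IRB. The Python sketch below illustrates that difference; the route representation, field names, and addresses in the usage are hypothetical.

```python
def resolve_ip_prefix_route(prefix_route, mac_ip_routes):
    """Resolve an IP prefix route recursively to a MAC/IP route.

    prefix_route: dict with "gw_ip" (None when zero) and "router_mac"
                  (None when the extended community is absent)
    mac_ip_routes: list of dicts with "mac", "ip" and "next_hop"
    Returns a dict with the resolved "mac" and "next_hop", or None.
    """
    if prefix_route.get("gw_ip"):
        # SBD IRB model: non-zero gateway IP, matched against the IP
        # of the SBD IRB MAC/IP route.
        key, value = "ip", prefix_route["gw_ip"]
    elif prefix_route.get("router_mac"):
        # Unnumbered SBD IRB model: zero gateway IP, matched on the MAC
        # carried in the router's MAC extended community.
        key, value = "mac", prefix_route["router_mac"]
    else:
        return None
    for route in mac_ip_routes:
        if route.get(key) == value:
            return {"mac": route["mac"], "next_hop": route["next_hop"]}
    return None
```

If no matching MAC/IP route is present, the function returns None, which corresponds to a prefix route that cannot be resolved yet.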
Interoperable interface-less IP-VRF-to-IP-VRF model (Ethernet encapsulation)
This model is interface-less because no Supplementary Broadcast Domain (SBD) is required to connect the VPRNs of the tenant, and no recursive resolution is required upon receiving an IP prefix route. In other words, the next-hop of the IP prefix route is directly resolved to an EVPN tunnel, without the need for any other route. The standard specification draft-ietf-bess-evpn-ip-prefix supports two variants of this model that are not interoperable with each other:
EVPN IFL for Ethernet NVO (Network Virtualization Overlay) tunnels
Ethernet NVO indicates that the EVPN packets contain an inner Ethernet header. This is the case for tunnels such as VXLAN.
In the Ethernet NVO option, the ingress PE uses the received Router's MAC extended community address (received along with the route type 5) as the inner destination MAC address for the EVPN packets sent to the prefix.
EVPN IFL for IP NVO tunnels
IP NVO indicates that the EVPN packets contain an inner IP packet, but without an Ethernet header. This is similar to the IP-VPN packets exchanged between PEs.
Interface-less IP-VRF-to-IP-VRF model illustrates the Interface-less IP-VRF-to-IP-VRF model.
SR OS supports the interoperable Interface-less IP-VRF-to-IP-VRF Model for Ethernet NVO tunnels. This interoperable model is shown on the left-side PE router in Interface-less IP-VRF-to-IP-VRF model. The model is implemented as follows:
There is no data path difference between this model and the existing R-VPLS EVPN tunnel model or the model described in Interface-ful IP-VRF-to-IP-VRF with unnumbered SBD IRB model.
This model is enabled by configuring config>service>vprn>if>vpls>evpn-tunnel (with ipv6-gateway-address mac for IPv6), and bgp-evpn>ip-route-advertisement. In addition, because the SBD IRB MAC/IP route is no longer needed, the bgp-evpn no mac-advertisement command prevents the advertisement of the MAC/IP route.
The IP prefix routes are processed as follows:
On transmission, there is no change in the IP prefix route processing compared to the configuration of the Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB Model.
IPv4/IPv6 prefix routes are advertised based on the information in the route-table for IPv4 and IPv6, with GW-IP=0 and the corresponding MAC extended community.
If bgp-evpn no mac-advertisement is configured, no MAC/IP route is sent for the R-VPLS.
The received IPv4/IPv6 prefix routes are processed as follows:
-
Upon receiving an IPv4/IPv6 prefix route with a MAC extended community for the router, an internal MAC/IP route is generated with the encoded MAC and the RD, Ethernet tag, ESI, Label/VNI and next hop derived from the IP prefix route itself.
-
If no competing received MAC/IP routes exist for the same MAC, this IP prefix-derived MAC/IP route is selected and the MAC is installed in the R-VPLS FDB with type ‟Evpn”.
-
After the MAC is installed in FDB, there are no differences between this interoperable interface-less model and the interface-ful with unnumbered SBD IRB model. Therefore, SR OS is compatible with the received IP prefix routes for both models.
-
The following is an example of a typical configuration of a PE's SBD and VPRN that work in interface-less model for IPv4 and IPv6:
7750SR#config>service#
vpls 2 customer 1 name "sbd" create
allow-ip-int-bind
exit
bgp
exit
bgp-evpn
evi 2
no mac-advertisement
ip-route-advertisement
mpls bgp 1
auto-bind-tunnel resolution any
no shutdown
vprn 1 customer 1 create
route-distinguisher auto-rd
interface "sbd" create
ipv6
exit
vpls "sbd"
evpn-tunnel ipv6-gateway-address mac
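The processing of received IP prefix routes described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the SR OS implementation: the class names, field names, and the `install_in_fdb` helper are hypothetical, and the case where a received MAC/IP route competes for the same MAC is reduced to "do not install the derived route".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IpPrefixRoute:
    """Simplified EVPN IP prefix route (route type 5); field names are illustrative."""
    rd: str
    prefix: str
    gw_ip: str
    eth_tag: int
    esi: str
    label: int                  # MPLS label or VNI
    next_hop: str
    router_mac: Optional[str]   # Router's MAC extended community, if present


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified MAC/IP route; derived_from_rt5 marks internally generated routes."""
    rd: str
    mac: str
    eth_tag: int
    esi: str
    label: int
    next_hop: str
    derived_from_rt5: bool


def derive_mac_route(rt5: IpPrefixRoute) -> Optional[MacIpRoute]:
    """Build the internal MAC/IP route from the fields of the IP prefix route itself."""
    if rt5.router_mac is None:
        return None  # no Router's MAC extended community, nothing to derive
    return MacIpRoute(rd=rt5.rd, mac=rt5.router_mac, eth_tag=rt5.eth_tag,
                      esi=rt5.esi, label=rt5.label, next_hop=rt5.next_hop,
                      derived_from_rt5=True)


def install_in_fdb(fdb: dict, derived: MacIpRoute, received: list) -> bool:
    """Install the derived MAC with type 'Evpn' when no received MAC/IP route competes.

    Assumption: when a competing received route exists, route selection decides,
    which this sketch models simply by leaving the FDB untouched.
    """
    if any(r.mac == derived.mac for r in received):
        return False
    fdb[derived.mac] = ("Evpn", derived.next_hop, derived.label)
    return True
```

For example, a route carrying a Router's MAC of `00:00:5e:00:53:01` and next hop `192.0.2.2` yields an FDB entry of type `Evpn` pointing at `192.0.2.2`, which is why the data path matches the unnumbered SBD IRB model once the MAC is installed.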
Interface-less IP-VRF-to-IP-VRF model (IP encapsulation) for MPLS tunnels
In addition to the Interface-ful and interoperable Interface-less models described in the previous sections, SR OS also supports the Interface-less Model (EVPN IFL) with IP encapsulation for MPLS tunnels. In the standard specification draft-ietf-bess-evpn-ip-prefix, this corresponds to the EVPN IFL model for IP NVO tunnels.
Compared to the Ethernet NVO option, the ingress PE no longer pushes an inner Ethernet header, but the IP packet is directly encapsulated with an EVPN service label and the transport labels.
Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels illustrates the Interface-less Model (EVPN IFL) with IP encapsulation for MPLS tunnels.
EVPN IFL uses EVPN IP Prefix routes to exchange prefixes between PEs without the need for an R-VPLS service, termed Supplementary Broadcast Domain (SBD) in the standards, or any destination MAC lookup. The data path used in EVPN IFL is the same as the one used for IP-VPN services in the VPRN.
In the example of Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels:
-
PE2 advertises IP Prefix 20.0/24 (shorthand for 20.0.0.0/24) in an EVPN IP Prefix route that does not contain a Router's MAC extended community anymore. As usual, and depicted in step 1, arriving frames with IP destination of 20.0.0.1 on PE1's R-VPLS-1 are processed for a route lookup on VPRN-1.
-
However, in step 2 and as opposed to the previous models, the lookup yields a route-table entry that does not point at an SBD R-VPLS, but rather to an MPLS tunnel terminated on PE2. PE1 then pushes the EVPN service label that was received on the IP Prefix route at the top of the IP packet, and the packet is sent on the wire without any inner Ethernet header.
-
In step 3, the MPLS tunnel is terminated on PE2 and the EVPN label identifies the VPRN-1 service for a route lookup.
-
Step 4 corresponds to the regular R-VPLS forwarding that happens in the other EVPN L3 models.
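The difference between the Ethernet NVO and IP NVO encapsulations in the steps above can be illustrated with a small Python sketch. It is a simplified model, not router code: the `build_stack` function, the tuple layout, and the label values are hypothetical and chosen only for illustration.

```python
def build_stack(ip_payload, transport_labels, evpn_label, router_mac=None):
    """Return the header stack, outermost first, as (type, value) tuples.

    Assumption: passing router_mac models the Ethernet NVO option, where an
    inner Ethernet header addressed to the received Router's MAC is pushed.
    Omitting it models EVPN IFL with IP encapsulation (IP NVO).
    """
    stack = [("mpls", label) for label in transport_labels]
    stack.append(("mpls", evpn_label))      # EVPN service label from the IP prefix route
    if router_mac is not None:
        stack.append(("eth", router_mac))   # inner Ethernet header (Ethernet NVO only)
    stack.append(("ip", ip_payload))
    return stack


# EVPN IFL (IP NVO): the IP packet directly follows the EVPN service label.
ifl = build_stack(b"ip-packet", [100], 200)

# Ethernet NVO: same labels, plus an inner Ethernet header before the IP packet.
eth_nvo = build_stack(b"ip-packet", [100], 200, router_mac="00:00:5e:00:53:01")
```

In this model, the only difference between the two results is the `("eth", ...)` element, which reflects why the egress PE in EVPN IFL can perform a route lookup on the VPRN directly from the EVPN label.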
A new vprn>bgp-evpn>mpls context has been added to configure a VPRN service for EVPN IFL. This context is similar to the one that exists in VPLS and Epipe services and enables the use of EVPN IFL in the VPRN service. When it is configured, no R-VPLS with evpn-tunnel should be added to the VPRN; that is, no SBD is configured. As an example, the VPRN-1 service on PE1 and PE2 in Interface-less IP-VRF-to-IP-VRF model for IP encapsulation in MPLS tunnels is configured as follows:
[ex:configure service vprn "vprn-1"]
A:admin@PE1# info
admin-state enable
ecmp 2
bgp-evpn {
mpls 1 {
admin-state enable
route-distinguisher "192.0.2.1:12"
vrf-target {
community "target:64500:2"
}
auto-bind-tunnel {
resolution any
}
}
}
interface "irb-1" {
ipv4 {
primary {
address 10.0.0.254
prefix-length 24
}
}
vpls "r-vpls-1" {
}
}
[ex:configure service vprn "vprn-1"]
A:admin@PE2# info
admin-state enable
ecmp 2
bgp-evpn {
mpls 1 {
admin-state enable
route-distinguisher "192.0.2.2:21"
vrf-target {
community "target:64500:2"
}
auto-bind-tunnel {
resolution any
}
}
}
interface "irb-2" {
ipv4 {
primary {
address 20.0.0.254
prefix-length 24
}
}
vpls "r-vpls-1" {
}
}
ARP-ND host routes for extended Layer 2 Data Centers
SR OS supports the creation of host routes for IP addresses that are present in the ARP or neighbor tables of a routing context. These host routes are referred to as ARP-ND routes and can be advertised using EVPN or IP-VPN families. A typical use case where ARP-ND routes are needed is the extension of Layer 2 Data Centers (DCs). Extended Layer-2 Data Centers illustrates this use case.
Subnet 10.0.0.0/16 in Extended Layer-2 Data Centers is extended throughout two DCs. The DC gateways are connected to the users of subnet 20.0.0.0/24 on PE1 using IP-VPN (or EVPN). If the virtual machine VM 10.0.0.1 is connected to DC1, when PE1 needs to send traffic to host 10.0.0.1, it performs a Longest Prefix Match (LPM) lookup on the VPRN’s route table. If the only IP prefix advertised by the four DC GWs was 10.0.0.0/16, PE1 could send the packets to the DC where the VM is not present.
To provide efficient downstream routing to the DC where the VM is located, DGW1 and DGW2 must generate host routes for the VMs to which they connect. When the VM moves to the other DC, DGW3 and DGW4 must be able to learn the VM’s host route and advertise it to PE1. DGW1 and DGW2 must withdraw the route for 10.0.0.1, because the VM is no longer in the local DC.
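The effect of the host route on PE1's lookup can be shown with a minimal Python sketch of a Longest Prefix Match. The route table is modeled as a plain dictionary, and the next hops are reduced to the labels "DC1" and "DC2"; both are simplifying assumptions made for illustration, as is the `lpm` helper itself.

```python
import ipaddress


def lpm(route_table, destination):
    """Return the next hop of the longest matching prefix, or None if no route matches."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in route_table.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Only the subnet route is known: PE1 may pick the DC where the VM is not present.
pe1_routes = {"10.0.0.0/16": "DC2"}
before = lpm(pe1_routes, "10.0.0.1")

# DGW1 and DGW2 advertise a host route for the VM: the /32 wins the lookup.
pe1_routes["10.0.0.1/32"] = "DC1"
with_host_route = lpm(pe1_routes, "10.0.0.1")

# The VM moves: DC1 withdraws the host route and DC2 advertises it.
pe1_routes["10.0.0.1/32"] = "DC2"
after_move = lpm(pe1_routes, "10.0.0.1")
```

Other hosts in 10.0.0.0/16 continue to follow the subnet route, because only the moved VM has a more specific entry.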
In this case, the SR OS is able to learn the VM’s host route from the generated ARP or ND messages when the VM boots or when the VM moves.
A route owner type called ‟ARP-ND” is supported in the base or VPRN route table. The ARP-ND host routes have a preference of 1 in the route table and are automatically created out of the ARP or ND neighbor entries in the router instance.
The following commands enable ARP-ND host routes to be created in the applicable route tables:
configure service vprn/ies interface arp-host-route populate {evpn | dynamic | static}
configure service vprn/ies interface ipv6 nd-host-route populate {evpn | dynamic | static}
When the command is enabled, the EVPN, dynamic and static ARP entries of the routing context create ARP-ND host routes in the route table. Similarly, ARP-ND host routes are created in the IPv6 route table out of static, dynamic, and EVPN neighbor entries if the command is enabled.
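As a rough illustration of the populate option, the following Python sketch derives host routes from an ARP table, keeping only the entry types selected by the command. The table layout and the `host_routes` function are hypothetical, and only IPv4 entries are modeled.

```python
def host_routes(arp_table, populate):
    """Return the /32 ARP-ND host routes created from the ARP entries.

    arp_table maps an IPv4 address to a (mac, entry_type) tuple, where
    entry_type is "evpn", "dynamic", or "static". populate is the set of
    entry types allowed to create host routes (an illustrative stand-in
    for the populate {evpn | dynamic | static} option).
    """
    return sorted(
        f"{ip}/32"
        for ip, (_mac, entry_type) in arp_table.items()
        if entry_type in populate
    )


arp_table = {
    "10.0.0.1": ("00:00:5e:00:53:01", "dynamic"),  # learned from a local host
    "10.0.0.2": ("00:00:5e:00:53:02", "evpn"),     # learned from a MAC/IP route
}

# With populate dynamic, only the locally learned entry creates a host route.
dynamic_only = host_routes(arp_table, {"dynamic"})

# If the first entry changes to type evpn (the host moved away), its host route disappears.
arp_table["10.0.0.1"] = ("00:00:5e:00:53:01", "evpn")
after_move = host_routes(arp_table, {"dynamic"})
```

Restricting the option to dynamic entries is what keeps a PE from advertising a host route for a host that it only knows through EVPN.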
The arp-host-route populate and nd-host-route populate commands are used with the following features:
adding ARP-ND hosts
A route tag can be added to ARP-ND hosts using the route-tag command. This tag can be matched on BGP VRF export and peer export policies.
keeping entries active
The ARP-ND host routes are kept in the route table as long as the corresponding ARP or neighbor entry is active. Even if there is no traffic destined for them, the arp-proactive-refresh and nd-proactive-refresh commands configure the node to keep the entries active by sending an ARP refresh message 30 seconds before the arp-timeout or starting NUD when the stale time expires.
speeding up learning
To speed up the learning of the ARP-ND host routes, the arp-learn-unsolicited and nd-learn-unsolicited commands can be configured. When arp-learn-unsolicited is enabled, received unsolicited ARP messages (typically GARPs) create an ARP entry and, consequently, an ARP-ND route if arp-host-route populate is enabled. Similarly, unsolicited Neighbor Advertisement messages create a stale neighbor. If nd-host-route populate is enabled, a confirmation message (NUD) is sent for all the neighbor entries created as stale, and if confirmed, the corresponding ARP-ND routes are added to the route table.
In Extended Layer-2 Data Centers, enabling arp-host-route populate on the DGWs allows them to learn and advertise the ARP-ND host route 10.0.0.1/32 when the VM is locally connected, and to remove and withdraw the host route when the VM is no longer present in the local DC.
ARP-ND host routes installed in the route table can be exported to VPN IPv4, VPN IPv6, or EVPN routes. No other BGP families or routing protocols are supported.
EVPN host mobility procedures within the same R-VPLS service
EVPN host mobility is supported in SR OS as in Section 4 of draft-ietf-bess-evpn-inter-subnet-forwarding. When a host moves from a source PE to a target PE, it can behave in one of the following ways:
-
The host initiates an ARP request or GARP upon moving to the target PE.
-
The host sends a data packet without first initiating an ARP request or GARP.
-
The host is silent.
The SR OS supports the above scenarios as follows.
EVPN host mobility configuration
Host mobility within the same R-VPLS – initial phase shows an example of a host connected to a source PE, PE1, that moved to the target, PE2. The figure shows the expected configuration on the VPRN interface, where R-VPLS 1 is attached (for both PE1 and PE2). PE1 and PE2 are configured with an ‟anycast gateway”, that is, a VRRP passive instance with the same backup MAC and IP in both PEs.
In this initial phase:
-
PE1 learns Host-1 IP to MAC (10.1-M1) in the ARP table and generates a host route (RT5) for 10.1/32, because Host-1 is locally connected to PE1. In particular:
-
arp-learn-unsolicited triggers the learning of 10.1-M1 upon receiving a GARP from Host-1 or any other ARP
-
arp-proactive-refresh triggers the refresh of host-1’s ARP entry 30 seconds before the entry ages out
-
local-proxy-arp makes sure PE1 replies to any received ARP request on behalf of other hosts in the R-VPLS
-
arp-host-route populate dynamic ensures that only the dynamically learned ARP entries create a host route, for example, 10.1
-
no flood-garp-and-unknown-req suppresses ARP flooding (from CPM) within the R-VPLS1 context and significantly reduces unnecessary ARP flooding, because the ARP entries are synchronized through EVPN
-
advertise dynamic triggers the advertisement of MAC/IP routes for the dynamic ARP entries, including the IP and MAC addresses, for example, 10.1-M1; a MAC/IP route for M1 only has already been advertised, because M1 is learned in the FDB as local or dynamic
-
-
PE2 learns Host-1 10.1-M1 in the ARP and FDB tables as EVPN type. PE2 must not learn 10.1-M1 as dynamic, so that PE2 is prevented from advertising an RT5 for 10.1/32. If PE2 advertises 10.1/32, then PE3 could select PE2 as the next-hop to reach Host-1, creating an unwanted hair-pinning forwarding behavior. PE2 is expected to have the same configuration as PE1, including the following commands, as well as those described for PE1:
-
no learn-dynamic prevents PE2 from learning ARP entries from ARP traffic received on an EVPN tunnel.
-
populate dynamic, as in PE1, makes sure PE2 only creates route-table ARP-ND host routes for dynamic entries. Hence, 10.1-M1 does not create a host route as long as it is learned via EVPN only.
-
The configuration described in this section and the cases in the following sections are for IPv4 hosts; however, the functionality is also supported for IPv6 hosts. The IPv6 configuration requires equivalent commands that use the prefix "nd-" instead of "arp-". The only exception is the flood-garp-and-unknown-req command, which does not have an equivalent command for ND.
Host initiates an ARP/GARP upon moving to the target PE
An example is illustrated in Host mobility within the same R-VPLS – move with GARP. This is the expected behavior based on the configuration described in EVPN host mobility configuration.
Host-1 moves from PE1 to PE2 and issues a GARP with 10.1-M1.
Upon receiving the GARP, PE2 updates its FDB and ARP table.
The route-table entry for 10.1/32 changes from EVPN to type arp-nd (based on populate dynamic); therefore, PE2 advertises an RT5 with 10.1/32. Also, M1 is now learned in the FDB and ARP table as local; therefore, MAC/IP routes with a higher sequence number are advertised (one MAC/IP route with M1 only and another one with 10.1-M1).
Upon receiving the routes, PE1:
Updates its FDB and withdraws its RT2(M1) based on the higher SEQ number.
Updates its ARP entry 10.1-M1 from dynamic to type evpn.
Removes its arp-nd host from the route-table and withdraws its RT5 for 10.1/32 (based on populate dynamic).
The move of 10.1-M1 from dynamic to evpn triggers an ARP request from PE1 asking for 10.1. The no flood-garp-and-unknown-req command prevents PE1 from flooding the ARP request to PE2.
After step 5, no one replies to PE1’s ARP request and the procedure is over. If a host replied to the ARP for 10.1, the process starts again.
Host sends a data packet upon a move to target PE
In this case, the host does not send a GARP/ARP packet when moving to the target PE. Only regular data packets are sent. The steps are illustrated in Host mobility within the same R-VPLS – move with data packet.
Host-1 moves from PE1 to PE2 and issues a (non-ARP) frame with MAC SA=M1.
When receiving the frame, PE2 updates its FDB and starts the mobility procedures for M1 (because it was previously learned from EVPN). At the same time, PE2 also creates a short-lived dynamic ARP entry for the host, and triggers an ARP request for it.
PE2 advertises an RT2 with M1 only, and a higher sequence number.
PE1 receives the RT2, updates its FDB and withdraws its RT2s for M1 (this includes the RT2 with M1-only and the RT2 with 10.1-M1).
PE1 issues an ARP request for 10.1, triggered by the update on M1.
In this case, the PEs are configured with flood-garp-and-unknown-req and therefore, the generated ARP request is flooded to local SAP and SDP-binds and EVPN destinations. When the ARP request gets to PE2, it is flooded to PE2’s SAP and SDP-binds and received by Host-1.
Host-1 sends an ARP reply that is snooped by PE2 and triggers a process similar to the one described in Host initiates an ARP/GARP upon moving to the target PE (this is illustrated in the following).
Because passive VRRP is used in this scenario, the ARP reply uses the anycast backup MAC that is consumed by PE2.
Upon receiving the ARP reply, PE2 updates its ARP table to dynamic.
Because the route-table entry for 10.1/32 now changes from EVPN to type arp-nd (based on populate dynamic), PE2 advertises an RT5 with 10.1/32. Also, M1 is now learned in ARP as local; therefore, an RT2 for 10.1-M1 is sent (the sequence number follows the RT2 with M1 only).
Upon receiving the route, PE1:
Updates the ARP entry 10.1-M1, from type local to type evpn.
Removes its arp-nd host from the route-table and withdraws its RT5 for 10.1/32 (based on populate dynamic).
Silent host upon a move to the target PE
This case assumes the host moves but it stays silent after the move. The steps are illustrated in Host mobility within the same R-VPLS – silent host.
Host-1 moves from PE1 to PE2 but remains silent.
Eventually M1 ages out in PE1’s FDB and the RT2s for M1 are withdrawn. This update on M1 triggers PE1 to issue an ARP request for 10.1.
The flood-garp-and-unknown-req command is configured, so the ARP request reaches PE2 and Host-1.
Host-1 sends an ARP reply that is consumed by PE2. FDB and ARP tables are updated.
The FDB and ARP updates trigger RT2s with M1-only and with 10.1-M1. Because an arp-nd dynamic host route is also created in the route-table, an RT5 with 10.1/32 is triggered.
Upon receiving the routes, PE1 updates FDB and ARP tables. The update on the ARP table from dynamic to evpn removes the host route from the route-table and withdraws the RT5 route.
BGP and EVPN route selection for EVPN routes
When two or more EVPN routes are received at a PE, BGP route selection typically takes place when the route key or the routes are equal. When the route key is different, but the PE has to make a selection (for instance, the same MAC is advertised in two routes with different RDs), BGP hands over the routes to EVPN and the EVPN application performs the selection.
EVPN and BGP selection criteria are described below:
-
EVPN route selection for MAC routes
When two or more routes are received with the same mac-length/mac but different route key, BGP hands the routes over to EVPN. EVPN selects the route based on the following tiebreaking order:
-
Conditional static MACs (local protected MACs)
-
Auto-learned protected MACs (locally learned MACs on SAPs or mesh or spoke SDPs because of the configuration of auto-learn-mac-protect)
-
EVPN ES PBR MACs (see ES PBR MAC routes below)
-
EVPN static MACs (remote protected MACs)
-
Data plane learned MACs (regular learning on SAPs or SDP bindings) and EVPN MACs with higher SEQ numbers. Learned MACs and EVPN MACs are considered equal if they have the same SEQ number.
-
EVPN MACs with higher SEQ number
-
EVPN E-tree root MACs
-
EVPN non-RT-5 MACs (this tie-breaking rule is only observed if the selection algorithm is comparing received MAC routes and internal MAC routes derived from the MACs in IP-Prefix routes, for example, RT-5 MACs)
-
Lowest IP (next-hop IP of the EVPN NLRI)
-
Lowest Ethernet tag (that is zero for MPLS and may be different from zero for VXLAN)
-
Lowest RD
-
Lowest BGP instance (this tie-breaking rule is only considered if the above rules fail to select a unique MAC and the service has two BGP instances of the same encapsulation)
-
-
ES PBR MAC routes
When a PBR filter with a forward action to an ESI and SF-IP (Service Function IP) exists, a MAC route is created by the system. This MAC route is compared to other MAC routes received from BGP.
-
When ARP resolves (it can be static, EVPN, or dynamic) for a SF-IP and the system has an AD EVI route for the ESI, a ‟MAC route” is created by ES PBR with the <MAC Address = ARPed MAC Address, VTEP = AD EVI VTEP, VNI = AD EVI VNI, RD = ES PBR RD (special RD), Static = 1> and installed in EVPN.
-
This MAC route does not add anything (back) to ARP; however, it goes through the MAC route selection in EVPN and triggers the FDB addition if it is the best route.
-
In terms of priority, this route ranks lower than local static and auto-learned protected MACs but higher than remote EVPN static MACs (number 3 in the tiebreaking order above).
-
If there are two competing ES PBR MAC routes, then the selection goes through the rest of checks (Lowest IP > Lowest RD).
-
-
EVPN route selection for IP-prefix and IPv6-prefix routes
See Route selection across EVPN-IFL and other owners in the VPRN service.
- EVPN route selection for EVPN AD per-EVI routes
-
BGP route selection
BGP route selection for MAC routes with the same route key uses the following priority order:
-
EVPN static MACs (remote protected MACs).
-
EVPN MACs with higher sequence number.
-
Regular BGP selection (local-pref, AIGP metric, shortest AS path, lowest IP).
Regular BGP selection is followed for the rest of the EVPN routes.
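The EVPN tiebreaking order for MAC routes listed earlier in this section can be approximated as a sort key. The Python sketch below is a simplification under stated assumptions, not the SR OS algorithm: the `MacRoute` fields are hypothetical, the two sequence-number rules are collapsed into a single "higher SEQ wins" comparison, and RDs are compared as plain strings.

```python
import ipaddress
from dataclasses import dataclass

# Rank of the protected and static classes; every other kind shares the next rank.
CLASS_RANK = {
    "conditional-static": 0,
    "auto-learn-protected": 1,
    "es-pbr": 2,
    "evpn-static": 3,
}


@dataclass(frozen=True)
class MacRoute:
    """Illustrative MAC route; kind is a CLASS_RANK key, "learned", or "evpn"."""
    mac: str
    kind: str = "evpn"
    seq: int = 0
    etree_root: bool = False
    from_rt5: bool = False       # internal route derived from an IP prefix route
    next_hop: str = "0.0.0.0"
    eth_tag: int = 0
    rd: str = ""
    bgp_instance: int = 1


def selection_key(route):
    """Smaller tuples win; each element mirrors one tiebreaking rule, in order."""
    return (
        CLASS_RANK.get(route.kind, len(CLASS_RANK)),
        -route.seq,                                   # higher sequence number first
        not route.etree_root,                         # E-tree root MACs first
        route.from_rt5,                               # non-RT-5 MACs first
        int(ipaddress.ip_address(route.next_hop)),    # lowest next-hop IP
        route.eth_tag,                                # lowest Ethernet tag
        route.rd,                                     # lowest RD (string comparison here)
        route.bgp_instance,                           # lowest BGP instance
    )


def select_best(routes):
    """Return the preferred route among routes advertised for the same MAC."""
    return min(routes, key=selection_key)
```

For instance, between two EVPN routes for the same MAC, the one with the higher sequence number is returned regardless of next hop, while a remote static MAC is preferred over both.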
-
LSP tagging for BGP next-hops or prefixes and BGP-LU
It is possible to constrain the tunnels used by the system for resolution of BGP next-hops or prefixes and BGP labeled unicast routes using LSP administrative tags. For more information, see the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide, "LSP Tagging and Auto-Bind Using Tag Information".
Oper-groups interaction with EVPN services
Operational groups, also referred to as oper-groups, are supported in EVPN services. In addition to supporting SAP and SDP-binds, oper-groups can also be configured under the following objects:
EVPN-VXLAN instances (except on Epipe services)
EVPN-MPLS instances
Ethernet segments
These oper-groups can be monitored in LAGs or service objects. Oper-groups are particularly useful for the following applications:
Link Loss Forwarding (LLF) for EVPN VPWS services
core isolation blackhole avoidance
LAG standby signaling to CE on non-DF EVPN PEs (single-active)
LAG-based LLF for EVPN-VPWS services
SR OS uses Eth-CFM fault-propagation to support CE-to-CE fault propagation in EVPN-VPWS services. That is, upon detecting a CE failure, an EVPN-VPWS PE withdraws the corresponding Auto-Discovery per-EVI route, which then triggers a down MEP on the remote PE that signals the fault to the connected CE. In cases where the CE connected to EVPN-VPWS services does not support Eth-CFM, the fault can be propagated to the remote CE by using LAG standby-signaling, which can be LACP-based or simply power-off.
Link loss forwarding for EVPN-VPWS shows an example of link loss forwarding for EVPN-VPWS.
In this example, PE1 is configured as follows:
A:PE1>config>lag(1)# info
----------------------------------------------
mode access
encap-type null
port 1/1/1
port 1/1/2
standby-signaling power-off
monitor-oper-group "llf-1"
no shutdown
----------------------------------------------
*A:PE1>config>service>epipe# info
----------------------------------------------
bgp
exit
bgp-evpn
evi 1
local-attachment-circuit ac-1
eth-tag 1
exit
remote-attachment-circuit ac-2
eth-tag 2
exit
mpls bgp 1
oper-group "llf-1"
auto-bind-tunnel
resolution any
exit
no shutdown
exit
sap lag-1 create
no shutdown
exit
no shutdown
The following applies to the PE1 configuration:
The EVPN Epipe service is configured on PE1 with a null LAG SAP and the oper-group ‟llf-1” under bgp-evpn>mpls. This is the only member of oper-group ‟llf-1”.
Note: Do not configure the oper-group under config>service>epipe, because circular dependencies are created when the access SAPs go down because of the LAG monitor-oper-group command.
The operational group monitors the status of the BGP-EVPN instance in the Epipe service. The status of the BGP-EVPN instance is determined by the existence of an EVPN destination at the Epipe.
The LAG, in access mode and encap-type null, is configured with the command monitor-oper-group ‟llf-1”.
Note: The configure>lag>monitor-oper-group name command is only supported in access mode. Any encap-type can be used.
As shown in Link loss forwarding for EVPN-VPWS, upon failure on CE2, the following events occur:
-
PE2 withdraws the EVPN route.
-
The EVPN destination is removed in PE1 and oper-group ‟llf-1” also goes down.
-
Because lag-1 is monitoring ‟llf-1”, the oper-group becoming inactive triggers standby signaling on the LAG; that is, power-off or LACP out-of-sync signaling to CE1.
When the SAP or port is down because of the LAG monitoring of the oper-group, PE1 does not trigger an AD per-EVI route withdrawal, even if the SAP is brought operationally down.
-
After CE2 recovers and PE2 re-advertises the AD per-EVI route, PE1 creates the EVPN destination and oper-group ‟llf-1” comes up. As a result, the monitoring LAG stops signaling standby and the LAG is brought up.
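The event sequence above can be modeled with a small Python sketch in which the oper-group follows the presence of EVPN destinations and the LAG reacts to the oper-group state. The classes and method names are hypothetical, and hold timers are ignored; the sketch only illustrates the dependency chain, not the router behavior in detail.

```python
class OperGroup:
    """Tracks an up/down state and notifies the objects that monitor it."""

    def __init__(self, name):
        self.name = name
        self.up = False
        self._monitors = []

    def add_monitor(self, callback):
        self._monitors.append(callback)

    def set_state(self, up):
        if up != self.up:
            self.up = up
            for callback in self._monitors:
                callback(up)


class EvpnInstance:
    """BGP-EVPN instance whose oper-group is up while an EVPN destination exists."""

    def __init__(self, oper_group):
        self.destinations = set()
        self.oper_group = oper_group

    def add_destination(self, pe):
        self.destinations.add(pe)
        self.oper_group.set_state(bool(self.destinations))

    def remove_destination(self, pe):
        self.destinations.discard(pe)
        self.oper_group.set_state(bool(self.destinations))


class Lag:
    """Access LAG that signals standby to the CE while the monitored group is down."""

    def __init__(self, name, standby_signaling="power-off"):
        self.name = name
        self.standby_signaling = standby_signaling
        self.signaling_standby = False

    def monitor(self, oper_group):
        oper_group.add_monitor(self._on_change)
        self._on_change(oper_group.up)

    def _on_change(self, group_up):
        self.signaling_standby = not group_up


llf = OperGroup("llf-1")
epipe_evpn = EvpnInstance(llf)
lag1 = Lag("lag-1")
lag1.monitor(llf)

epipe_evpn.add_destination("PE2")      # PE2 advertises its AD per-EVI route
state_when_up = lag1.signaling_standby
epipe_evpn.remove_destination("PE2")   # CE2 fails and PE2 withdraws the route
state_after_failure = lag1.signaling_standby
```

Because the oper-group has the BGP-EVPN instance as its only member, the LAG going to standby does not feed back into the oper-group state, which mirrors the note about avoiding circular dependencies.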
Core isolation blackhole avoidance
Core isolation blackhole avoidance shows how blackholes can be avoided when a PE becomes isolated from the core.
In this example, consider that PE2 and PE1 are single-active multihomed to CE1. If PE2 loses all its core links, PE2 must somehow notify CE1 so that PE2 does not continue attracting traffic and so that PE1 can take over. This notification is achieved by using oper-groups under the BGP-EVPN instance in the service. The following is an example output of the PE2 configuration.
*[ex:configure service vpls "evi1"]
A:admin@PE-2# info
admin-state enable
bgp-evpn {
evi 1
mpls 1 {
admin-state enable
oper-group "evpn-mesh"
auto-bind-tunnel {
resolution any
}
}
}
sap lag-1:351 {
monitor-oper-group "evpn-mesh"
}
*[ex:configure service oper-group "evpn-mesh"]
A:admin@PE-2# info detail
hold-time {
up 4
}
With the PE2 configuration and Core isolation blackhole avoidance example, the following steps occur:
PE2 loses all its core links, therefore, it removes its EVPN-MPLS destinations. This causes oper-group ‟evpn-mesh” to go down.
Because PE2 is the DF in the Ethernet Segment (ES) ES-1 and sap lag-1:351 is monitoring the oper-group, the SAP becomes operationally down. If ETH-CFM fault propagation is enabled on a down MEP configured on the SAP, CE1 is notified of the failure.
PE1 takes over as the DF based on the withdrawal of the ES (and AD) routes from PE2, and CE1 begins sending traffic immediately to PE1 only, thereby avoiding a traffic blackhole.
Generally, when oper-groups are associated with EVPN instances:
-
The oper-group state is determined by the existence of at least one EVPN destination in the EVPN instance.
-
The oper-group that is configured under a BGP EVPN instance cannot be configured under any other object (for example, SAP, SDP binding, and so on) of the same or different service.
- The status of an oper-group associated with an EVPN instance does not go down if all the EVPN destinations are operationally down due to a control-word or MTU mismatch.
-
The status of an oper-group associated with an EVPN instance goes down in the following cases:
-
the service admin-state is disabled (only for VPLS services, not for Epipes)
-
the BGP EVPN VXLAN or MPLS admin-state is disabled
-
there are no EVPN destinations associated with the instance
-
LAG or port standby signaling to the CE on non-DF EVPN PEs (single-active)
As described in EVPN for MPLS tunnels, EVPN single-active multihoming PEs that are elected as non-DF must notify their attached CEs so that the CEs do not send traffic to the non-DF PE. This can be done on a per-service basis using ETH-CFM and fault propagation. However, ETH-CFM is sometimes not supported on multihomed CEs, and other notification mechanisms are needed, such as LACP standby or power-off. This scenario is shown in the following figure.
As shown in the preceding figure, the multihomed PEs are configured with multiple EVPN services that use ES-1. ES-1 and its associated LAG is configured as follows:
*[ex:configure lag 1]
A:admin@PE-2# info
admin-state enable
standby-signaling {power-off|lacp}
monitor-oper-group "DF-signal-1"
mode access
port 1/1/c2/1 {
}
<snip>
*[ex:configure service system bgp evpn]
A:admin@PE-2# info
ethernet-segment "ES-1" {
admin-state enable
esi 0x01010000000000000000
multi-homing-mode single-active
oper-group "DF-signal-1"
association {
lag 1 {
}
<snip>
When the operational group is configured on the ES and monitored on the associated LAG:
The operational group status is driven by the ES DF status (defined by the number of DF SAPs or oper-up SAPs owned by the ES).
The operational group goes down if all the SAPs in the ES go down (this happens in PE2 in LACP standby signaling from the non-DF). The ES operational group goes up when at least one SAP in the ES goes up.
As a result, if PE2 becomes non-DF on all the SAPs in the ES, they all go operationally down, including the ES-1 operational group.
Because LAG-1 is monitoring the operational group, when its status goes down, LAG-1 signals LAG standby state to the CE. The standby signaling can be configured as LACP or power-off.
The ES and AD routes for the ES are not withdrawn because the router recognizes that the LAG becomes standby for the ES operational group.
If the Single-Active ES is associated with a port instead of a LAG, the config>port>monitor-oper-group DF-signal-1 command can be configured. In this case, the port monitors the ES operational group and the following rules apply:
- As in the case of the LAG, if the ES goes non-DF, its operational group also goes down.
- The port that is monitoring the ES operational group signals standby state by powering off the port itself.
- As in the case of the LAG, the ES and AD routes for the ES are not withdrawn because the router recognizes that the port is in standby state because of the ES operational group.
Operational groups cannot be assigned to ESs that are configured as virtual, all-active or service-carving mode auto.
AC-Influenced DF Election Capability on an ES with oper-group
The Attachment Circuit Influenced (AC-Influenced) Designated Forwarder Election Capability (AC-DF), as described in RFC 8584, is supported in SR OS. By default, the ac-df-capability command is set to the include option. With this setting, the EVPN Auto-discovery per EVI/ES (AD per EVI/ES) routes of a specific PE are considered when deciding whether that PE is included in the candidate DF list.
Configuring ac-df-capability to exclude disables the AC-DF capability. When ac-df-capability exclude is configured on a specific ES, the presence or absence of the AD per EVI/ES routes from the ES peers does not modify the DF Election candidate list for the ES. The exclude option is recommended in ESs that use an oper-group monitored by the access LAG to signal standby through LACP or power-off, as described in LAG or port standby signaling to the CE on non-DF EVPN PEs (single-active). All PE routers attached to the same ES must be configured with the same ac-df-capability setting.
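The effect of the two ac-df-capability options on the candidate list can be sketched as follows. This is a simplified Python illustration, assuming the candidate list starts from the PEs that advertised an ES route for the segment; the `df_candidates` function and its arguments are hypothetical and do not model the DF election algorithm itself.

```python
def df_candidates(es_route_pes, ad_route_pes, ac_df_capability="include"):
    """Return the sorted DF election candidate list for an ES.

    es_route_pes: PEs that advertised an ES route for the segment.
    ad_route_pes: PEs that advertised the AD per EVI/ES routes.
    With "include", a PE without AD routes is pruned from the candidates;
    with "exclude", the AD routes do not influence the list.
    """
    if ac_df_capability not in ("include", "exclude"):
        raise ValueError("ac_df_capability must be 'include' or 'exclude'")
    candidates = set(es_route_pes)
    if ac_df_capability == "include":
        candidates &= set(ad_route_pes)
    return sorted(candidates)


# PE2 is attached to the ES but currently advertises no AD per EVI/ES routes.
with_include = df_candidates(["PE1", "PE2"], ["PE1"], "include")
with_exclude = df_candidates(["PE1", "PE2"], ["PE1"], "exclude")
```

In this model, `with_include` contains only PE1 while `with_exclude` still lists both PEs, which shows why all PEs on the same ES need a consistent setting to compute the same candidate list.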
EVPN Layer 3 OISM
Optimized Inter-Subnet Multicast (OISM) is an EVPN-based solution that optimizes the forwarding of IP multicast across R-VPLS of the same or a different subnet. EVPN OISM is supported for EVPN-MPLS and EVPN-VXLAN services, IPv4 and IPv6 multicast groups, and is described in this section.
Introduction and terminology
EVPN OISM is similar to Multicast VPNs (MVPN) in some aspects, because it does IP multicast routing in VPNs, uses MP-BGP to signal the interest of a PE in a specified multicast group and uses Provider Multicast Service Interface (PMSI) trees among the PEs to send and receive the IP multicast traffic.
However, OISM is simpler than MVPN and allows efficient multicast in networks that integrate Layer 2 and Layer 3; that is, networks where PEs may be attached to different subnets, but could also be attached to the same subnet.
OISM is simpler than MVPN in some aspects:
-
it does not need to set up shared trees (that need to switch over to shortest path trees)
-
it does not require the complex MVPN Any Source Multicast (ASM) procedures or the Rendezvous Point (RP) function
-
it does not require Upstream Multicast Hop (UMH) selection and therefore does not have the potential UMH issues and limitations described in RFC 6513 and RFC 6514
-
multiple PEs can be attached to the same Receiver subnet or Source subnet, which provides full flexibility when designing the multicast network
EVPN OISM is defined by draft-ietf-bess-evpn-irb-mcast and uses the following terminology that is also used in the rest of this section:
- BD with IRB
- Broadcast Domain with an Integrated Routing and Bridging interface. It is an R-VPLS service in SR OS.
- Ordinary BD
- refers to an R-VPLS where sources or receivers, or both, are connected
- SBD
- Supplementary Broadcast Domain. It is a backhaul R-VPLS that connects the PEs' VPRN services and is configured as an evpn-tunnel interface in the VPRN services. The SBD is mandatory in OISM and is needed to receive multicast traffic on the PEs that are not attached to the source ordinary BD.
- EVPN Tenant Domain
- refers to the group of BDs and IP-VRFs (VPRNs) of the same tenant
- SMET route or EVPN route type 6
- the EVPN route that the PEs use to signal interest for a specific multicast group (S,G) or (*,G)
- IIF and OIF
- refers to Incoming Interface and Outgoing Interface. A multicast enabled VPRN has Layer 3 IIF and OIFs. A multicast enabled R-VPLS has Layer 2 OIFs.
- Upstream and Downstream PEs
- refers to the PEs that are connected to sources and receivers respectively
- I-PMSIs and S-PMSIs
- refers to Inclusive and Selective (Provider Multicast Service Interface) trees. The inclusive trees are signaled via IMET routes and include all the PEs attached to the service. The selective trees are signaled via S-PMSI A-D routes, and only the downstream PEs with receivers for the group signaled by the S-PMSI A-D route join the tree.
- S-PMSI A-D route or EVPN route type 10
- Selective Provider Multicast Service Interface (S-PMSI) Auto-Discovery route, the EVPN route that the root PEs use to signal S-PMSI trees, when the root PE decides that setting up a specific tree for a specific (S,G) or (*,G) is needed.
OISM forwarding plane
In an EVPN OISM network, it is assumed that the sources and receivers are connected to ordinary BDs and EVPN is the only multicast control plane protocol used among the PEs. Also, the subnets (and optionally hosts) are advertised normally by the EVPN IP Prefix routes. The IP-Prefix routes are installed in the PEs' VPRN route tables and are used for multicast RPF checks when routing multicast packets. EVPN OISM forwarding plane illustrates a simple EVPN OISM network.
In EVPN OISM forwarding plane, and from the perspective of the multicast flow (S1,G1), PE1 is considered an upstream PE, whereas PE2 and PE3 are downstream PEs. The OISM forwarding rules are as follows.
On the upstream PE (PE1), the multicast traffic is sent to local receivers irrespective of the receivers being attached to the source BD (BD1) or not (BD2).
Note: OISM does not use any multicast Designated Router (DR) concept, therefore the upstream PE always routes locally as long as it has local receivers.
On downstream PEs that are attached to the source BD (PE2), the multicast traffic is always received on the source BD (BD1) and forwarded locally to receivers in the same or different ordinary BD (as in the case of Receiver-22 or Receiver-21). Multicast traffic received on this PE is never sent back to the SBD or remote EVPN PEs.
On downstream PEs that are not attached to the source BD (PE3), the multicast traffic is always received on the SBD and sent to local receivers. Multicast received on this PE is never sent to remote EVPN PEs.
Note: In order for PE3 to receive the multicast traffic on the SBD, the source PE, PE1, forms an EVPN destination from BD1 to PE3's SBD. This EVPN destination on PE1 is referred to as an SBD destination.
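The forwarding rules above can be summarized in a short Python sketch. This is an illustrative model only, not SR OS behavior as implemented: the function name oism_forwarding and its arguments are hypothetical, and the BD names follow the example in this section.

```python
def oism_forwarding(attached_bds, receiver_bds, source_bd, is_upstream):
    """Model the OISM forwarding rules for one PE and one multicast flow.

    attached_bds: ordinary BDs configured on this PE
    receiver_bds: BDs where this PE has local receivers for the flow
    source_bd:    the ordinary BD where the source is connected
    is_upstream:  True if the source is locally attached to this PE

    Returns (incoming_domain, local_delivery_bds, may_send_to_remote_pes).
    """
    # Local receivers are always served, in the source BD or any other BD.
    local = sorted(set(receiver_bds) & set(attached_bds))
    if is_upstream:
        # Upstream PE: traffic arrives on the source BD and may also be
        # sent to remote EVPN PEs.
        return source_bd, local, True
    if source_bd in attached_bds:
        # Downstream PE attached to the source BD: received on the source
        # BD, never sent back to the SBD or to remote PEs.
        return source_bd, local, False
    # Downstream PE not attached to the source BD: received on the SBD,
    # never sent to remote PEs.
    return "SBD", local, False


if __name__ == "__main__":
    print(oism_forwarding({"BD1", "BD2"}, {"BD2"}, "BD1", True))
    print(oism_forwarding({"BD1", "BD2"}, {"BD1", "BD2"}, "BD1", False))
    print(oism_forwarding({"BD3"}, {"BD3"}, "BD1", False))
```

The third call models PE3, which is only reachable through the SBD destination that PE1 creates toward it.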
OISM control plane
OISM uses the Selective Multicast Ethernet Tag (SMET) route or route type 6 to signal interest on a specific (S,G) or (*,G). Use of the SMET route provides an example.
As shown in Use of the SMET route, a PE with local receivers interested in a multicast group G1 issues an SMET route encoding the source and group information (upon receiving local IGMP join messages for that group). EVPN OISM uses the SMET route in the following way:
A route type-6 (SMET) can carry information for IPv4 or IPv6 multicast groups, for (S,G) or (*,G) or even wildcard groups (*,*).
Note: MVPN uses different route types or even families to address the different multicast group types.
The SMET routes are advertised with the route target of the SBD, which guarantees that the SMET routes are imported by all the PEs of the tenant.
The SMET routes also help minimize the control plane overhead because they aggregate the multicast state created on the downstream PEs. This is illustrated in Use of the SMET route, where PE2 sends the minimum number of SMET routes to pull multicast traffic for G1. That is, if PE2 has state for (S1,G1) and (*,G1), the SMET route for (*,G1) is enough to attract the multicast traffic required by the local receivers. There is no need to send an SMET route for (S1,G1) and a different route for (*,G1). Only (*,G1) SMET route is advertised.
-
The SMET routes also provide an implicit S-PMSI (Selective Provider Multicast Service Interface) tree in case Ingress Replication is used to transport IP multicast. That is, PE1 sends the multicast traffic only to the PEs requesting it, for example, PE2 and not to PE3. In MVPN, even for Ingress Replication, a separate S-PMSI tree is set up to prevent PE1 from sending multicast to PE3.
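As a rough illustration of the SMET aggregation and the implicit selective tree described above, the following Python sketch computes the minimum set of SMET routes for a PE and the ingress replication targets on the upstream PE. The function names and the tuple representation of (S,G) state are assumptions made for this example and do not correspond to any SR OS interface.

```python
def minimal_smet_routes(state):
    """Return the minimum SMET routes needed for the given multicast state.

    state: iterable of (source, group) tuples, where source "*" means any
    source. A (*,G) route is enough to attract all traffic for G, so the
    (S,G) entries of the same group are suppressed.
    """
    state = list(state)
    wildcard_groups = {group for source, group in state if source == "*"}
    routes = set()
    for source, group in state:
        if source == "*" or group not in wildcard_groups:
            routes.add((source, group))
    return sorted(routes)


def ingress_replication_targets(smet_by_pe, source, group):
    """Return the PEs that requested (source, group) through SMET routes.

    smet_by_pe: dict mapping a PE name to the SMET routes it advertised.
    With ingress replication, only these PEs receive a copy of the flow.
    """
    targets = []
    for pe, routes in smet_by_pe.items():
        if ("*", group) in routes or (source, group) in routes:
            targets.append(pe)
    return sorted(targets)


if __name__ == "__main__":
    pe2_state = [("S1", "G1"), ("*", "G1")]
    pe2_routes = minimal_smet_routes(pe2_state)
    print(pe2_routes)
    print(ingress_replication_targets({"PE2": pe2_routes, "PE3": []}, "S1", "G1"))
```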
EVPN OISM and multihoming
EVPN OISM supports multihomed multicast sources and receivers.
While MVPN requires complex UMH (Upstream Multicast Hop) selection procedures to provide multihoming for sources, EVPN simply reuses the existing EVPN multihoming procedures. EVPN OISM and multihomed sources illustrates an example of a multihomed source that makes use of EVPN all-active multihoming.
The source S1 is attached to a switch SW1 that is connected via a single LAG to PE1 and PE3, a pair of EVPN OISM PEs. PE1 and PE3 define Ethernet Segment ES-1 for SW1, where ES-1 is all-active in this case (single-active multihoming is also supported). Even in the all-active case, the multicast flow for (S1,G1) is only sent to one OISM PE, and the regular all-active multihoming procedures (Split-Horizon) make sure that PE3 does not send the multicast traffic back to SW1. This is true for EVPN-MPLS and EVPN-VXLAN BDs.
Convergence, in case of failure, is very fast because the downstream PEs, for example, PE2, advertise the SMET route for (*,G1) with the SBD route target and it is imported by both PE1 and PE3. In case of failure on PE1, PE3 already has state for (*,G1) and can forward the multicast traffic immediately.
EVPN OISM also supports multihomed receivers. EVPN OISM and multihomed receivers illustrates an example of multihomed receivers.
Multihomed receivers, as depicted in EVPN OISM and multihomed receivers, require the support of multicast state synchronization on the multihoming PEs to avoid blackholes. As an example, consider that SW1 hashes an IGMP join (*,G1) to PE2, and PE2 adds the ES-1 SAP to the OIF list for (*,G1). Consider PE1 is the ES-1 DF. Unless the (*,G1) state is synchronized on PE1, the multicast traffic is pulled to PE2 only and then discarded. The state synchronization on PE1 pulls the multicast traffic to PE1 too, and PE1 forwards to the receiver using its DF SAP.
In SR OS, the IGMP/MLD-snooping state is synchronized across ES peers using EVPN Multicast Synch routes, as specified in RFC 9251.
The same mechanism must be used in all the PEs attached to the same Ethernet Segment. MCS takes precedence when both mechanisms are simultaneously used.
EVPN Multicast Synch routes are supported as specified in RFC 9251 for OISM services too. They use EVPN route types 7 and 8, and are known as the Multicast Join Synch and Multicast Leave Synch routes, respectively.
When a PE that is attached to an EVPN Ethernet Segment receives an IGMP or MLD join, it creates multicast state and advertises a Multicast Join Synch route so that the peer ES PEs can synchronize the state. Similarly, when a PE in the Ethernet Segment receives a leave message, it advertises a Multicast Leave Synch route so that all the PEs in the Ethernet Segment can synchronize the Last Member Query procedures.
The Multicast Join Synch route or EVPN route type 7 is similar to the SMET route, but also includes the ESI. The Multicast Join Synch route indicates the multicast group that must be synchronized in all objects of the Ethernet Segment. Multicast join synch route depicts the format of the Multicast Join Synch route.
In accordance with RFC 9251, the following rules pertain:
-
All fields except for the Flags are part of the route key for BGP processing purposes.
-
Synch routes are resolved by BGP auto-bind resolution, as any other service route.
-
The Flags are advertised and processed based on the received IGMP or MLD report that triggered the advertisement of the route (this includes the versions for IGMP or MLD and Include/Exclude bit for IGMPv3).
-
The Route Distinguisher (RD) is the service RD.
-
This route is only distributed to the ES peers - it is advertised with the ES-import route target, which limits its distribution to ES peers only.
-
In addition, the route is sent with one EVI-RT extended community. The EVI-RT EC does not use a route target type/sub-type, therefore, it does not affect the distribution of the route, for example, it is not considered for route target constraint filtering; only the ES-import route target is. However, its value is still taken from the configured service route target or EVI auto-derived route target.
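The route key rule for the Multicast Join Synch route can be illustrated with a small Python model. This is a simplified sketch: the class name and field names are hypothetical, the length fields of the route are omitted, and the flag values are arbitrary placeholders. It only shows that two routes differing in Flags alone are treated as the same route for BGP processing purposes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MulticastJoinSynch:
    """Simplified model of an EVPN route type 7 (length fields omitted)."""
    rd: str            # service Route Distinguisher
    esi: str           # Ethernet Segment Identifier
    ethernet_tag: int
    source: str        # multicast source address
    group: str         # multicast group address
    originator: str    # originator router address
    # Flags are excluded from comparison and hashing, so they are not
    # part of the route key.
    flags: int = field(default=0, compare=False)


if __name__ == "__main__":
    esi = "01:00:00:00:00:00:01:00:00:00"
    first = MulticastJoinSynch("192.0.2.2:2001", esi, 0, "0.0.0.0",
                               "239.0.0.4", "192.0.2.2", flags=0x02)
    update = MulticastJoinSynch("192.0.2.2:2001", esi, 0, "0.0.0.0",
                                "239.0.0.4", "192.0.2.2", flags=0x04)
    # Same key: the second advertisement replaces the first one.
    print(first == update, len({first, update}))
```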
The Multicast Leave Synch route or EVPN route type 8 indicates the multicast group Leave states that must be synchronized in all objects of the Ethernet Segment. Multicast leave synch route depicts the format of the Multicast Leave Synch route.
In accordance with RFC 9251, the following rules pertain:
-
All fields except for the Flags, the Maximum Response Time, and the "reserved" field are part of the route key for BGP processing purposes.
-
Synch routes are resolved by BGP auto-bind resolution, as any other service route.
-
The Flags are generated based on the version of the leave message that triggered the advertisement of the route.
-
As with the Multicast Join Synch route, this is a service level route sent with one ES-import route target and one EVI-RT EC. RD, Flags, ES-import and EVI-RT EC are advertised and processed in the same way as for the Multicast Join Synch route.
The EVI-RT is automatically added to the routes type 7 and 8, depending on the type of route target being configured on the service.
-
If the service is configured with target:2byte-asnumber:ext-comm-val as route target, an EVI-RT type 0 is automatically added to routes type 7 and 8. No route target (other than the ES-import route target) is added to the route.
-
If the service is configured with target:ip-addr:comm-val as route target, an EVI-RT type 1 is automatically added to routes type 7 and 8. No route target (other than the ES-import route target) is added to the route.
-
If the service is configured with target:4byte-asnumber:comm-val as route target, an EVI-RT type 2 is automatically added to routes type 7 and 8. No route target (other than the ES-import route target) is added to the route.
-
If auto-derived service RTs are used in the service, the corresponding operating route target is used as the EVI-RT.
-
EVI-RT type 3 is not supported (type 3 is specified in RFC 9251).
-
In general, vsi-import and vsi-export must not be used in OISM mode services or when the Multicast Synch routes are used. Using vsi-import or vsi-export policies instead of the route target command or the EVI-derived route target leads to issues when advertising and processing the Multicast Synch routes.
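The mapping between the configured route target format and the EVI-RT type listed above can be sketched as follows. This is a minimal Python illustration, assuming the route target is written as target:admin:value with a plain decimal AS number; the function name evi_rt_type is hypothetical and other AS number notations are not handled.

```python
import ipaddress


def evi_rt_type(route_target):
    """Return the EVI-RT type (0, 1 or 2) for a configured route target.

    0: target:2byte-asnumber:ext-comm-val
    1: target:ip-addr:comm-val
    2: target:4byte-asnumber:comm-val
    """
    prefix, admin, _value = route_target.split(":")
    if prefix != "target":
        raise ValueError("not a route target: " + route_target)
    try:
        # An IPv4 administrator field selects EVI-RT type 1.
        ipaddress.IPv4Address(admin)
        return 1
    except ValueError:
        pass
    asn = int(admin)
    # AS numbers that fit in 16 bits select type 0, larger ones type 2.
    return 0 if asn <= 0xFFFF else 2


if __name__ == "__main__":
    for rt in ("target:64500:2002", "target:192.0.2.2:2002",
               "target:4200000000:2002"):
        print(rt, evi_rt_type(rt))
```

EVI-RT type 3 is deliberately not produced by this sketch, in line with the statement above that it is not supported.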
The following are additional considerations about the Multicast Synch routes:
-
The routes are advertised without the need to configure any command as long as igmp-snooping or mld-snooping are enabled on an R-VPLS in OISM mode attached to a regular or virtual Ethernet Segment.
-
The reception of Multicast Join or Leave Synch routes triggers the synchronization of states and the associated procedures in RFC 9251.
-
Upon receiving a Leave message, the triggered Multicast Synch route encodes the configured Last Member Query interval times robust-count (LMQ ✕ robust-count) in the Maximum Response Time field. The local PE expires the multicast state after the usual time plus an additional time that accounts for the BGP propagation to the remote ES peers and can be configured with the following command.
configure service system bgp-evpn multicast-leave-sync-propagation
This timer value should be configured the same in all the PEs attached to the same ES.
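The timer arithmetic can be written out as a short Python sketch. It assumes that the "usual time" equals LMQ ✕ robust-count and that all values use the same time unit; both are assumptions for illustration, and the function and parameter names are hypothetical.

```python
def leave_sync_timers(lmq_interval, robust_count, propagation):
    """Return (max_response_time, local_expiry) for a received Leave.

    max_response_time: value encoded in the Multicast Leave Synch route
    local_expiry:      time after which the local PE expires the state,
                       assuming the usual time is LMQ x robust-count
    """
    max_response_time = lmq_interval * robust_count
    local_expiry = max_response_time + propagation
    return max_response_time, local_expiry


if __name__ == "__main__":
    # Example values only: LMQ interval 1, robust-count 2, propagation 5.
    print(leave_sync_timers(1, 2, 5))
```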
EVPN OISM configuration guidelines
This section shows a configuration example for the network illustrated in EVPN OISM example.
The following CLI excerpt shows the configuration required on PE4 for services 2000 (VPRN), BD-2003 and BD-2004 (ordinary BDs) and BD-2002 (SBD).
vprn 2000 name "tenant-2k" customer 1 create
route-distinguisher auto-rd
interface "bd-2003" create
address 10.41.0.1/24
vpls "bd-2003"
exit
exit
interface "bd-2004" create
address 10.42.0.1/24
vpls "bd-2004"
exit
exit
interface "bd-2002" create
vpls "bd-2002"
evpn-tunnel supplementary-broadcast-domain <------
exit
exit
igmp <------
interface "bd-2003" <------
no shutdown
exit
interface "bd-2004" <------
no shutdown
exit
no shutdown
exit
pim <------
rpf-table both <------
interface "bd-2002" <------
multicast-senders always <------
exit
apply-to all <------
no shutdown
exit
no shutdown
exit
As shown in the previous configuration commands, the VPRN must be configured as follows:
-
The SBD interface in the VPRN must be configured using the following command so that the OISM forwarding mode is enabled.
configure service vprn interface vpls evpn-tunnel supplementary-broadcast-domain
-
IGMP must be enabled on the ordinary BD (R-VPLS) interfaces so that the PEs can process the received IGMP messages from the receivers.
-
Even though the protocol itself is not used, PIM is enabled in the VPRN on all the IRB interfaces so that the multicast source addresses can be resolved. Also, the following command must be enabled on the SBD interface.
configure service vprn pim interface multicast-senders always
This is because the SBD interface is unnumbered (it does not have an IP address associated) and the multicast traffic source RPF-check would discard the multicast traffic arriving at the SBD interface unless the system is informed that legal multicast traffic may be expected on the SBD. The multicast-senders always command allows the system to process multicast on the unnumbered SBD interface. The following command is needed in case sources are added to the VPRN route-table as ARP-ND host routes (which is typical in Data Centers).
- MD-CLI
configure service vprn pim ipv4 rpf-table both
configure service vprn pim ipv6 rpf-table both
- classic CLI
configure service vprn pim rpf-table both
Besides the VPRN, BD-2003, BD-2004 and BD-2002 (SBD) must be configured as follows.
vpls 2003 name "bd-2003" customer 1 create
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int <------
exit
bgp
exit
bgp-evpn
evi 2003
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
igmp-snooping <------
no shutdown <------
exit
sap 1/1/1:2003 create
igmp-snooping
mrouter-port
exit
no shutdown
exit
no shutdown
exit
vpls 2004 name "bd-2004" customer 1 create
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int <------
exit
bgp
exit
bgp-evpn
evi 2004
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
igmp-snooping <------
no shutdown <------
exit
sap 1/1/1:2004 create
igmp-snooping
fast-leave
exit
no shutdown
exit
no shutdown
exit
vpls 2002 name "bd-2002" customer 1 create
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int <------
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement
sel-mcast-advertisement <------
evi 2002
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
igmp-snooping <------
no shutdown <------
exit
no shutdown
exit
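The lines flagged with arrows in the previous BD configuration correspond to the following commands.
IGMP snooping is enabled on the ordinary BDs and on the SBD.
- MD-CLI
configure service vpls igmp-snooping admin-state enable
- classic CLI
configure service vpls igmp-snooping no shutdown
Forwarding of IPv4 multicast to the IP interface is enabled on the ordinary BDs and on the SBD.
- MD-CLI
configure service vpls routed-vpls multicast ipv4 forward-to-ip-interface
- classic CLI
configure service vpls allow-ip-int-bind forward-ipv4-multicast-to-ip-int
The advertisement of SMET routes is enabled on the SBD.
- MD-CLI
configure service vpls bgp-evpn routes sel-mcast advertise true
- classic CLI
configure service vpls bgp-evpn sel-mcast-advertisement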
PE2 and PE3 are configured with the VPRN (2000), ordinary BD (BD-2001) and SBD (BD-2002) as above. In addition, PE2 and PE3 are attached to ES-1 where a receiver is connected. Multicast state synchronization through BGP Multicast Synch routes is automatically enabled in R-VPLS services in OISM mode and no additional configuration is needed:
/* Example of ES-1 configuration and MCS on PE3. Similar configuration is needed in PE2. */
bgp-evpn
ethernet-segment "ES-1" virtual create
esi 01:00:00:00:00:00:01:00:00:00
service-carving
mode manual
manual
preference non-revertive create
value 30
exit
exit
exit
multi-homing single-active
lag 1
dot1q
q-tag-range 2001
exit
no shutdown
exit
When the previous configuration is executed in the three nodes, the EVPN routes are exchanged. BD-2003 in PE4 receives IMET routes from the remote SBD PEs and creates "SBD" destinations to PE2 and PE3. Those SBD destinations are used to forward multicast traffic to PE2 and PE3, following the OISM forwarding procedures described in OISM forwarding plane. The following commands show an example of an IMET route (flagged as an SBD route working in OISM mode) and an SMET route received on PE4 from PE2.
IMET route received from PE2 on PE4.
show router bgp routes evpn incl-mcast community target:64500:2002 hunt
<snip>
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Nexthop : 192.0.2.2
From : 192.0.2.2
Res. Nexthop : 192.168.24.1
Local Pref. : 100 Interface Name : int-PE-4-PE-2
<snip>
Community : target:64500:2002
mcast-flags:SBD/NO-MEG/NO-PEG/OISM/NO-MLD-Proxy/NO-IGMP-Proxy <---
bgp-tunnel-encap:MPLS
<snip>
EVPN type : INCL-MCAST
Tag : 0
Originator IP : 192.0.2.2 <------
Route Dist. : 192.0.2.2:2002
<snip>
-------------------------------------------------------------------------------
PMSI Tunnel Attributes :
Tunnel-type : Ingress Replication
Flags : Type: RNVE(0) BM: 0 U: 0 Leaf: not required
MPLS Label : LABEL 524241
Tunnel-Endpoint: 192.0.2.2
-------------------------------------------------------------------------------
SMET route from PE2 received on PE4.
show router bgp routes evpn smet community target:64500:2002 hunt
<snip>
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Nexthop : 192.0.2.2
From : 192.0.2.2
Res. Nexthop : 192.168.24.1
Local Pref. : 100 Interface Name : int-PE-4-PE-2
<snip>
Community : target:64500:2002 bgp-tunnel-encap:MPLS
<snip>
EVPN type : SMET
Tag : 0
Src IP : 0.0.0.0 <------
Grp IP : 239.0.0.4 <------
Originator IP : 192.0.2.2 <------
Route Dist. : 192.0.2.2:2002
<snip>
When PE4 receives the IMET routes from PE2 and PE3 SBDs, it identifies the routes as SBD routes in OISM mode, and PE4 creates special EVPN destinations on the BD-2003 service that are used to forward the multicast traffic. The SBD destinations are shown as Sup BCast Domain in the show commands output.
show service id 2003 evpn-mpls
===============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address Transport:Tnl Egr Label Oper Mcast Num
State MACs
-------------------------------------------------------------------------------
192.0.2.2 ldp:65551 524266 Up m 0
192.0.2.3 ldp:65537 524266 Up m 0
-------------------------------------------------------------------------------
Number of entries : 2
===============================================================================
*A:PE-4#
show service id 2003 evpn-mpls detail
===============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address Transport:Tnl Egr Label Oper Mcast Num
State MACs
-------------------------------------------------------------------------------
192.0.2.2 ldp:65551 524266 Up m 0
Oper Flags : None
Sup BCast Domain : Yes
Last Update : 02/07/2023 14:59:03
192.0.2.3 ldp:65537 524266 Up m 0
Oper Flags : None
Sup BCast Domain : Yes
Last Update : 02/07/2023 13:21:09
-------------------------------------------------------------------------------
Number of entries : 2
===============================================================================
Based on the reception of the SMET routes from PE2 and PE3, PE4 adds the SBD EVPN destinations to its MFIB on BD-2003.
show service id 2003 igmp-snooping base
===============================================================================
IGMP Snooping Base info for service 2003
===============================================================================
Admin State : Up
Querier : 10.41.0.1 on rvpls bd-2003
SBD service : 2002
-------------------------------------------------------------------------------
Port Oper MRtr Pim Send Max Max Max MVR Num
Id Stat Port Port Qrys Grps Srcs Grp From-VPLS Grps
Srcs
-------------------------------------------------------------------------------
sap:1/1/1:2003 Up Yes No No None None None Local 0
rvpls Up Yes No N/A N/A N/A N/A N/A N/A
sbd-mpls:192.0.2.2:524241 Up No No N/A N/A N/A N/A N/A 1 <------
sbd-mpls:192.0.2.3:524253 Up No No N/A N/A N/A N/A N/A 1 <------
===============================================================================
*A:PE-4#
show service id 2003 igmp-snooping statistics
===============================================================================
IGMP Snooping Statistics for service 2003
===============================================================================
Message Type Received Transmitted Forwarded
-------------------------------------------------------------------------------
<snip>
EVPN SMET Routes 2 0 N/A <------
-------------------------------------------------------------------------------
*A:PE-4# show service id 2003 mfib
<snip>
-------------------------------------------------------------------------------
* * sap:1/1/1:2003 Local Fwd
* 239.0.0.4 sap:1/1/1:2003 Local Fwd
sbd-eMpls:192.0.2.2:524241 Local Fwd
sbd-eMpls:192.0.2.3:524253 Local Fwd
PE2 and PE3 also create regular destinations and SBD destinations based on the reception of IMET routes. As an example, the following command shows the destinations created by PE3 in the ordinary BD-2001.
show service id 2001 evpn-mpls
==============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address Transport:Tnl Egr Label Oper Mcast Num
State MACs
-------------------------------------------------------------------------------
192.0.2.2 ldp:65551 524266 Up m 0
192.0.2.2 ldp:65551 524267 Up bum 0
192.0.2.2 ldp:65551 524268 Up none 1
192.0.2.4 ldp:65539 524269 Up m 0
-------------------------------------------------------------------------------
Number of entries : 4
===============================================================================
show service id 2001 evpn-mpls detail
===============================================================================
BGP EVPN-MPLS Dest (Instance 1)
===============================================================================
TEP Address Transport:Tnl Egr Label Oper Mcast Num
State MACs
-------------------------------------------------------------------------------
192.0.2.2 ldp:65551 524266 Up m 0
Oper Flags : None
Sup BCast Domain : Yes
Last Update : 02/07/2023 14:59:04
192.0.2.2 ldp:65551 524267 Up bum 0
Oper Flags : None
Sup BCast Domain : No
Last Update : 02/07/2023 14:59:04
192.0.2.2 ldp:65551 524268 Up none 1
Oper Flags : None
Sup BCast Domain : No
Last Update : 02/07/2023 14:59:04
192.0.2.4 ldp:65539 524269 Up m 0
Oper Flags : None
Sup BCast Domain : Yes
Last Update : 02/07/2023 13:21:10
-------------------------------------------------------------------------------
Number of entries : 4
===============================================================================
In case of an SBD destination and a non-SBD destination to the same PE (PE2), IGMP only uses the non-SBD one in the MFIB. The non-SBD destination always has priority over the SBD destination. This can be seen in the following command in PE3, where the SBD destination to PE2 is down as long as the non-SBD destination is up.
show service id 2001 igmp-snooping base
==============================================================================
IGMP Snooping Base info for service 2001
===============================================================================
Admin State : Up
Querier : 10.0.0.3 on rvpls bd-2001
SBD service : 2002
-------------------------------------------------------------------------------
Port Oper MRtr Pim Send Max Max Max MVR Num
Id Stat Port Port Qrys Grps Srcs Grp From-VPLS Grps
Srcs
-------------------------------------------------------------------------------
sap:lag-1:2001 Down No No No None None None Local 1
rvpls Up Yes No N/A N/A N/A N/A N/A N/A
sbd-mpls:192.0.2.2:524241 Down No No N/A N/A N/A N/A N/A 0 <------
mpls:192.0.2.2:524242 Up No No N/A N/A N/A N/A N/A 1 <------
sbd-mpls:192.0.2.4:524245 Up No No N/A N/A N/A N/A N/A 0
===============================================================================
show service id 2001 mfib
==============================================================================
Multicast FIB, Service 2001
===============================================================================
Source Address Group Address Port Id Svc Id Fwd
Blk
-------------------------------------------------------------------------------
* 239.0.0.4 sap:lag-1:2001 Local Fwd
eMpls:192.0.2.2:524242 Local Fwd <---
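The priority rule between SBD and non-SBD destinations shown in the previous outputs can be modeled with a few lines of Python. This is an illustrative sketch only; the function name and the (pe, is_sbd) tuple representation are assumptions made for this example.

```python
def igmp_destination_states(destinations):
    """Return the IGMP snooping state of each EVPN destination.

    destinations: list of (pe, is_sbd) tuples for one ordinary BD.
    An SBD destination is kept Down while a non-SBD destination to the
    same PE exists, because the non-SBD destination has priority.
    """
    pes_with_non_sbd = {pe for pe, is_sbd in destinations if not is_sbd}
    states = {}
    for pe, is_sbd in destinations:
        suppressed = is_sbd and pe in pes_with_non_sbd
        states[(pe, is_sbd)] = "Down" if suppressed else "Up"
    return states


if __name__ == "__main__":
    # Destinations of BD-2001 on PE3, as in the example output.
    bd_2001 = [("192.0.2.2", True), ("192.0.2.2", False), ("192.0.2.4", True)]
    for key, state in igmp_destination_states(bd_2001).items():
        print(key, state)
```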
Finally, to check the Layer 3 IIF and OIF entries on the VPRN services, enter the following command. As an example, the command is executed in PE2:
show router 2000 pim group detail
==============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.4
Source Address : *
<snip>
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.4
Source Address : 10.41.0.41
<snip>
Up Time : 0d 00:13:20 Resolved By : rtable-u
Up JP State : Joined Up JP Expiry : 0d 00:00:00
Up JP Rpt : Pruned Up JP Rpt Override : 0d 00:00:00
Rpf Neighbor : 10.41.0.41
Incoming Intf : bd-2002
Outgoing Intf List : bd-2001
Curr Fwding Rate : 0.000 kbps
Forwarded Packets : 1000 Discarded Packets : 0
Forwarded Octets : 84000 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 2
===============================================================================
Inclusive Provider mLDP Tunnels in OISM
Inclusive provider tunnels of type mLDP are supported in OISM PEs. These tunnels can be used to transport multicast flows from root PEs to leaf PEs while preventing multiple copies of the same multicast packet on the same link.
OISM with IR versus inclusive mLDP illustrates the difference between using Ingress Replication (IR) and inclusive mLDP provider tunnels in OISM. With a source S1 connected to BD1 and sending a flow to G1, if IR is used, the multicast traffic is only sent to PEs with receivers for (S1,G1). However, if an inclusive mLDP tunnel on PE1 is used (right side of OISM with IR versus inclusive mLDP) the multicast flow is sent to all the PEs in the tenant domain. For example, PE3 receives the flow only to drop it because there are no local receivers.
mLDP tunnels are referred to as Inclusive BUM tunnels, because, although IP multicast traffic uses these tunnels, any BUM frame is also distributed to all PEs in the tenant. For example, in OISM with IR versus inclusive mLDP (right hand side), any BUM frame generated by any host connected to BD1 in PE1 uses the mLDP tunnel and is also sent to PE3.
The use of mLDP-inclusive provider tunnels in OISM requires the following configuration and procedures to be enabled on the PEs:
-
All the PEs in the OISM tenant domain that need to transmit or receive multicast traffic on an mLDP tree in a BD, are configured with the following commands:
configure service vpls provider-tunnel inclusive owner bgp-evpn-mpls configure service vpls provider-tunnel inclusive mldp
-
The PEs attached to the sources (root PEs) should be configured with the following command on the ordinary BDs, and the PEs attached to the receivers should be configured as root-and-leaf or leaf-only.
configure service vpls provider-tunnel inclusive root-and-leaf
-
The PEs attached to the receivers (leaf PEs) need to be configured using the following command on the BDs or SBDs.
- MD-CLI
configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
- classic CLI
configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
-
The SBD must always be configured as leaf-only in all PEs, because the SBD mLDP tree is not used to transmit IP multicast.
-
For the IMET and SMET routes to be exported and imported with the correct route targets, no vsi-import or vsi-export policies should be configured on the ordinary BDs and the SBDs.
Assuming the above guidelines are followed, and as illustrated in OISM with IR versus inclusive mLDP (right side), the root PE (PE1) that is attached to the source in BD1 sends the multicast traffic in an mLDP tree that is joined by leaf PEs either on BD1 (if BD1 exists on the leaf PE) or on the SBD (if BD1 does not exist on the leaf PE).
Example of Inclusive Provider Tunnels in OISM
OISM with inclusive mLDP example illustrates an example of the OISM procedures with mLDP trees.
Consider three PEs, PE1, PE2, and PE3, attached to BD1/BD2, BD1, and BD3 respectively, as in OISM with inclusive mLDP example. Assume that the source S1 is connected to BD1 in PE1. PE2 and PE3 are leaf PEs, because they have receivers but no sources. In this example:
-
BD and SBD services must be configured for provider tunnel as follows:
-
To have PE1 sending multicast traffic in P2MP mLDP tunnels on BD1 and BD2, both BDs are configured using the following command.
configure service vpls provider-tunnel inclusive root-and-leaf
They are also configured with the following command.
- MD-CLI
configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
- classic CLI
configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
*A:PE-1>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
bgp
exit
bgp-evpn
evi 1
ingress-repl-inc-mcast-advertisement // default value
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
provider-tunnel
inclusive
owner bgp-evpn-mpls
root-and-leaf
data-delay-interval 10
mldp
no shutdown
exit
exit
igmp-snooping / mld-snooping
no shutdown
exit
<snip>
-
PE2 and PE3 BDs are configured as leaf as they must be able to join mLDP trees but not set up an mLDP tree themselves.
- MD-CLI
Do not configure root-and-leaf. When root-and-leaf is not configured, the BD functions as a leaf-only node. If root-and-leaf is configured, use the following command to delete the configuration.
configure groups group service vpls provider-tunnel inclusive delete root-and-leaf
- classic CLI
configure service vpls provider-tunnel inclusive no root-and-leaf
- MD-CLI
configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
- classic CLI
configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
Multicast traffic cannot use the mLDP tree unless there is an EVPN-MPLS destination in the MFIB for the multicast stream.
-
The SBDs in all PEs must be configured as leaf-only, consistent with the preceding guideline; that is, the following command must not be configured on the SBD.
configure service vpls provider-tunnel inclusive root-and-leaf
The SBDs must also be configured with the following command.
- MD-CLI
configure service vpls bgp-evpn routes incl-mcast advertise-ingress-replication
- classic CLI
configure service vpls bgp-evpn ingress-repl-inc-mcast-advertisement
-
-
When the configuration is added, the PEs create EVPN-MPLS destinations as follows, where a destination is represented as {pe, label} with "pe" being the IP address of the remote PE and "label" being the EVPN label advertised by the remote PE.
-
PE1 creates the following EVPN-MPLS destinations:
-
On BD1: {pe2,bd1-L21}, {pe2,sbd-L22}, {pe3,sbd-L32}
-
On BD2: {pe2,sbd-L22}, {pe3,sbd-L32}
-
On SBD: {pe2,sbd-L22}, {pe3,sbd-L32}
-
-
PE2 creates destinations as follows:
-
On BD1: {pe1,bd1-L11}, {pe1,sbd-L13}, {pe3,sbd-L32}
-
On SBD: {pe1,sbd-L13}, {pe3,sbd-L32}
-
-
PE3 creates destinations as follows:
-
On BD3: {pe1,sbd-L13}, {pe2,sbd-L22}
-
On SBD: {pe1,sbd-L13}, {pe2,sbd-L22}
-
-
PE2's BD1 and PE3's BD3 do not create an EVPN-MPLS destination to PE1's BD2. Also, PE3's BD3 does not create a destination to PE1's BD1. This is in spite of receiving IMET-Composite routes for those BDs with the SBD-RT, which is imported in the PE2 and PE3 ordinary BDs.
-
-
As an example, on BD1, PE1's IGMP process adds the EVPN-MPLS destinations {pe2,bd1-L21}, {pe3,sbd-L32} to the MFIB. The third destination {pe2,sbd-L22} is kept down because the EVPN-MPLS destination in BD1 has higher priority.
-
Upon receiving the SMET route from PE2, PE1 adds {pe2,bd1-L21} as OIF for the MFIB (*,G1).
-
In the meantime, PE2 and PE3 have joined the mLDP tree with tunnel-id 1.
-
When multicast to G1 is received from S1, because there is an MFIB EVPN OIF entry, the multicast traffic is forwarded. At the IOM level, PE1 replaces the MFIB EVPN destination with the P2MP tunnel with tunnel-id 1, as long as the P2MP tree is operationally up.
-
The multicast traffic is sent along the mLDP tree and arrives at PE2/BD1 and PE3/SBD. Then local forwarding or routing is performed in PE2 and PE3, as normally in OISM.
-
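The destination lists in this example follow a regular pattern, which the following Python sketch reproduces. It is an illustrative model only, not SR OS behavior: the rule encoded in `evpn_destinations` is inferred from the lists above, the function names are hypothetical, and the labels L12 (PE1's BD2) and L33 (PE3's BD3) are invented placeholders because the text does not name them.

```python
# Illustrative model of the three-PE example. Labels L11, L13, L21, L22 and
# L32 come from the text; L12 and L33 are invented placeholders (assumption).
SBD = "sbd"

LABELS = {
    "pe1": {"bd1": "L11", "bd2": "L12", SBD: "L13"},
    "pe2": {"bd1": "L21", SBD: "L22"},
    "pe3": {"bd3": "L33", SBD: "L32"},
}


def evpn_destinations(local_pe, local_bd, labels=LABELS):
    """Return the (pe, label) destinations created on local_bd of local_pe."""
    dests = []
    for pe in sorted(labels):
        if pe == local_pe:
            continue
        remote = labels[pe]
        # A destination to a remote ordinary BD exists only if the BD is shared.
        if local_bd != SBD and local_bd in remote:
            dests.append((pe, f"{local_bd}-{remote[local_bd]}"))
        # Every BD, ordinary or SBD, also gets a destination to the remote SBD.
        dests.append((pe, f"{SBD}-{remote[SBD]}"))
    return dests


def mfib_candidates(dests):
    """Keep one destination per remote PE, preferring the ordinary-BD one."""
    best = {}
    for pe, label in dests:
        is_sbd = label.startswith(SBD + "-")
        if pe not in best or (best[pe].startswith(SBD + "-") and not is_sbd):
            best[pe] = label
    return sorted(best.items())
```

For PE1's BD1, `evpn_destinations("pe1", "bd1")` yields the three destinations listed for that BD, and `mfib_candidates` keeps {pe2,bd1-L21} and {pe3,sbd-L32} while leaving {pe2,sbd-L22} out, matching the MFIB description above.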
OISM interworking with MVPN and PIM for MEG or PEG gateways
For EVPN OISM to successfully interwork with MVPN and PIM, it is important to ensure that the MVPN/PIM procedures in the IPVPN network are not modified. In this interworking scenario, two (or more) OISM PEs act as the gateway between the EVPN and the MVPN/PIM network to ensure the OISM procedures are transparent to MVPN/PIM, and vice versa.
SR OS supports the MVPN-to-EVPN Gateway (MEG) and PIM-to-EVPN Gateway (PEG) functions in accordance with draft-ietf-bess-evpn-irb-mcast. Both Ingress Replication (IR) and mLDP trees are supported on the SBD so that multicast traffic can be received from or transmitted to OISM PEs.
When more than one MEG or PEG is present per EVPN tenant (that is, per SBD), one of them acts as the MEG or PEG designated router (DR). The DRs have the following special functions.
-
The DRs behave as a First Hop Router (FHR) from the MVPN/PIM network perspective and register sources in the OISM domain with the RP in the MVPN/PIM domain.
-
The DRs behave as Last Hop Router (LHR) from the MVPN/PIM network perspective, and join the shared or source tree. The non-DR PEs remove the SBD R-VPLS interface from the VPRN’s Layer 3 multicast OIF list, which prevents the PEs from sending multicast traffic to the OISM receivers.
The MEG or PEG DR election occurs in each PE attached to the SBD configured as MEG or PEG. Each PE builds a DR candidate list based on the reception of the Inclusive Multicast Ethernet Tag (IMET) routes for the SBD that include the MEG flag, the PEG flag, or both. After the timer set using the dr-activation-timer command expires, the PE runs the DR election based on the default algorithm used for EVPN DF election (modulo function of the EVI and number of PEs). The dr-activation-timer command is configured in the following context:
-
MD-CLI
configure service vpls routed-vpls multicast evpn-gateway
-
Classic CLI
configure service vpls allow-ip-int-bind evpn-mcast-gateway
-
MD-CLI
configure service vpls routed-vpls multicast evpn-gateway advertise
-
classic CLI
configure service vpls allow-ip-int-bind evpn-mcast-gateway advertise
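The modulo-based DR election can be sketched in Python as follows. The sketch assumes the ordering used by the default EVPN DF election in RFC 7432 (candidates sorted by ascending IP address, ordinals starting at 0); the function name `elect_dr` is hypothetical, and the exact SR OS implementation details are not stated in this section.

```python
import ipaddress


def elect_dr(candidate_ips, evi):
    """Pick the DR with the modulo rule: ordered candidates, index = EVI mod N.

    candidate_ips are the originator IPs learned from IMET routes that carry
    the MEG or PEG flag (the local PE includes itself in the list).
    """
    if not candidate_ips:
        raise ValueError("candidate list is empty")
    # Assumption: ascending numeric order, lowest address has ordinal 0.
    ordered = sorted(set(candidate_ips), key=ipaddress.ip_address)
    return ordered[evi % len(ordered)]
```

With candidates 192.0.2.2 and 192.0.2.3 and EVI 6002, 6002 mod 2 is 0, so the sketch selects 192.0.2.2. Because every PE runs the same function over the same candidate list, all PEs arrive at the same DR without extra signaling.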
Procedures for sources in MVPN and PIM and receivers in OISM, Procedures for ASM sources in OISM and receivers in MVPN, and Procedures for SSM sources in OISM and receivers in MVPN describe the MVPN and PIM procedures depending on whether the sources and receivers are attached to the OISM or MVPN network.
Procedures for sources in MVPN and PIM and receivers in OISM
The MEG DR for the SBD generates C-multicast source/shared tree join routes for receivers in the OISM domain. The following information applies to this procedure:
-
This is similar to a PIM DR and its Last Hop Router (LHR) function.
-
For receivers that are directly connected to the MEG DR, the MEG DR creates a Layer 3 multicast state upon receiving an IGMP or MLD message and generates the corresponding C-multicast routes. This handling applies to MEG and PEG PEs.
-
For receivers not directly connected, the MEG DR creates a Layer 3 multicast state upon receiving an SMET route from the PE connected to the receiver. Based on this newly created state, the MEG generates the corresponding C-multicast routes. This scenario is shown in Sources in the MVPN and PIM network.
-
Use one of the following commands to trigger the non-DR MEG to also create the Layer 3 multicast state and advertise the C-multicast routes to attract the multicast traffic. The attracted multicast traffic is dropped at the non-DR MEG; however, configuring the command enables faster convergence if the MEG DR fails.
-
MD-CLI
configure service vpls routed-vpls multicast evpn-gateway non-dr-attract-traffic from-pim-mvpn
-
Classic CLI
configure service vpls allow-ip-int-bind evpn-mcast-gateway non-dr-attract-traffic from-pim-mvpn
-
The following figure displays sources in the MVPN and PIM network.
Procedures for ASM sources in OISM and receivers in MVPN
When Any-Source Multicast (ASM) group sources are located in the EVPN OISM domain, the MEG DR for the SBD needs to attract the ASM traffic from the EVPN sources and initiate the MVPN register and source discovery procedure. This is analogous to a PIM DR and its First Hop Router (FHR) function. The following figure displays ASM sources in the EVPN network.
To attract ASM source traffic and act as the FHR, the MEG DR performs the following steps:
-
The MEG DR generates a wildcard SMET route.
The wildcard SMET route is automatically generated as soon as the MEG is elected as DR. The wildcard SMET route is formatted in accordance with RFC 6625, with address and length equal to zero.
In addition, to attract multicast traffic from ASM sources on the non-DR routers, the user can configure the following command in the configure service vpls context in the SBD. This command triggers the advertisement of the wildcard SMET route from the non-DR routers:
-
MD-CLI
routed-vpls multicast evpn-gateway non-dr-attract-traffic from-evpn-pim-mvpn
-
Classic CLI
allow-ip-int-bind evpn-mcast-gateway non-dr-attract-traffic from-evpn
-
-
When the MEG DR (for example, PE3 in ASM sources in the EVPN network) receives the ASM multicast traffic, it is handled as follows:
- Assuming the MEG DR does not have the Layer 3 multicast state for it, the multicast traffic, (S1,G1) in the example shown in ASM sources in the EVPN network, is sent to the CPM.
- The CPM encapsulates the multicast traffic into unicast register messages to the RP. For example, PE5 decapsulates the traffic and sends the multicast traffic down the shared tree.
- In MVPN, PE5 triggers Source A-D routes and a C-multicast route for (S1,G1), and the SPT switchover occurs.
-
If the MEG non-DR (for example, PE4) receives the ASM multicast traffic, it is handled as follows:
- Assuming the MEG non-DR does not have the Layer 3 multicast state for it, the multicast traffic, (S1,G1) in the example shown in ASM sources in the EVPN network, is sent to the CPM and discarded.
- The multicast traffic is heavily rate limited in the CPM.
- On the remote OISM PEs attached to the ASM source (for example, PE1), the PE creates an MFIB for (*,*) with OIFs for the MEGs that sent the wildcard SMET route. For example:
- (*,*) OIF: evpn-dest-PE3 (assuming non-dr-attract-traffic false)
- (*,*) OIF: evpn-dest-PE3, evpn-dest-PE4 (assuming non-dr-attract-traffic true)
- Any multicast traffic is forwarded based on the MFIB for (*,*).
- The preceding handling also applies to ASM sources attached to the non-DR MEG or PEG. The non-DR creates an EVPN destination (on the BD attached to the source) to the DR as OIF for (*,*).
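The relationship between the wildcard SMET senders and the (*,*) OIF list on the source-attached PE can be sketched as follows. This is a simplified Python model under stated assumptions: the function names and the `dr` and `attract_from_evpn` keys are hypothetical, and `attract_from_evpn` stands for a non-dr-attract-traffic setting that includes EVPN.

```python
def wildcard_smet_senders(megs):
    """Return the MEGs that advertise the wildcard SMET route.

    megs maps a PE name to {"dr": bool, "attract_from_evpn": bool}.
    The DR always advertises it; a non-DR does so only when configured to
    attract traffic from EVPN.
    """
    return sorted(
        pe for pe, cfg in megs.items()
        if cfg["dr"] or cfg["attract_from_evpn"]
    )


def star_star_oifs(megs):
    """OIF list that the source-attached PE installs for the (*,*) entry."""
    return [f"evpn-dest-{pe}" for pe in wildcard_smet_senders(megs)]
```

With PE3 as DR and PE4 as non-DR, the sketch returns only evpn-dest-PE3 when PE4 does not attract traffic, and both destinations when it does, which corresponds to the two (*,*) examples above.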
Procedures for SSM sources in OISM and receivers in MVPN
Irrespective of the DR election and source discovery process, when a MEG receives an MVPN C-multicast join route, it creates the Layer 3 multicast state and generates an SMET route for the S,G. This is shown in the following figure.
-
PE6 may pick PE4 as the Upstream Multicast Hop (UMH) PE for S1,G1 following regular MVPN procedures. In this case, PE4 adds the SBD interface as IIF, the MVPN tunnel as OIF member, and generates an SMET (S1,G1) route to draw the multicast traffic.
-
After PE4 creates the state for (S1,G1), traffic to (S1,G1) is no longer sent to the CPM to be discarded, but it is forwarded in the datapath based on the Layer 3 MFIB state.
-
PE1 creates an MFIB for (S1,G1) and starts sending traffic to PE4. The following are two potential scenarios in this case.
- If PE4 is configured in the classic CLI as no non-dr-attract-traffic (or in the MD-CLI as non-dr-attract-traffic none), it does not send the wildcard SMET. PE1 creates the following entries in the MFIB and sends traffic to both MEGs:
- (*,*) oif: evpn-dest-PE3
- (S1,G1) oif: evpn-dest-PE3, evpn-dest-PE4
- If PE4 is configured in the classic CLI and MD-CLI as non-dr-attract-traffic from-evpn, PE4 and PE3 both send the wildcard SMET. PE1 ignores any SMET (S/*,G) routes from a PE when an SMET (*,*) route is received from the same PE. If the (*,*) route is removed, PE1 reverts to handling (S/*,G) entries. For this reason, PE1 in this case creates only (*,*) OIFs and sends the traffic to both MEGs.
The following OIF entry is created: (*,*) oif: evpn-dest-PE3, evpn-dest-PE4.
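The SMET handling on PE1 in the two scenarios can be sketched in Python as follows. The sketch is a model built from the entries shown in this section, not the SR OS implementation: `build_mfib` is a hypothetical name, and the step that copies the (*,*) OIFs into each specific entry is an assumption inferred from the (S1,G1) entry listing evpn-dest-PE3 in the first scenario.

```python
WILDCARD = ("*", "*")


def build_mfib(smet_routes):
    """Build MFIB OIF lists from SMET routes given as (pe, source, group).

    A "*" source and "*" group denote the wildcard SMET route.
    """
    routes = list(smet_routes)
    wildcard_pes = {pe for pe, s, g in routes if (s, g) == WILDCARD}
    mfib = {}
    if wildcard_pes:
        mfib[WILDCARD] = set(wildcard_pes)
    for pe, s, g in routes:
        if (s, g) == WILDCARD or pe in wildcard_pes:
            # Specific routes from a PE that also sent (*,*) are ignored.
            continue
        # Assumption: a specific entry also carries the (*,*) OIFs so that
        # wildcard senders keep receiving the flow.
        mfib.setdefault((s, g), set(wildcard_pes)).add(pe)
    return {key: sorted(f"evpn-dest-{pe}" for pe in pes)
            for key, pes in mfib.items()}
```

Rebuilding the table after a (*,*) route is withdrawn brings the specific entries from that PE back, which models PE1 reverting to (S/*,G) handling.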
MEG or PEG gateways and local receivers or sources
This section uses examples to describe the applicable considerations for local receivers and sources on MEG and PEG PEs.
Local singlehomed Receiver-2 on a MEG or PEG PE1, BD2
Local singlehomed receivers shows an initial situation where PE1 and PE2 are MEG/PEGs and PE2 is elected as the MEG or PEG DR. PE1 and PE2 do not have an EVPN destination between their SBDs.
The following shows a local singlehomed receiver.
The following workflow applies to the example shown in the preceding graphic:
-
PE1 learns via IGMP/MLD that Receiver-2 is interested in (S1,G1).
As shown in Local singlehomed receivers, Receiver-1, which is connected to a remote OISM non-MEG PE, issues an IGMP join for the same group. This triggers the corresponding SMET route from PE3 and PE4.
-
PE1 determines from its route table that there is a route to S1 via IP-VPN.
PE1 originates an MVPN C-multicast source tree join (S,G) route or a PIM (S,G) join, via normal MVPN or PIM procedures.
- PE1 adds the MVPN tunnel or PIM interface as the Layer 3 IIF. The BD2 IRB is added to the Layer 3 OIF list.
- PE1 also issues an SMET route as usual.
- Since PE2 is the SBD’s MEG DR, PE2 also sends a PIM/C-multicast join route upon receiving the SMET route from PE3 and PE4.
- PE1 or PE2 receives the multicast traffic from the appropriate tunnel or interface, and the traffic passes the RPF check. PE1 sends multicast down the BD2 IRB to the receiver. Since PE1 is non-DR for the SBD, the SBD IRB is not in the Layer 3 OIF list.
PE2’s SBD does not send the multicast flow to PE1, because there are no EVPN multicast destinations between MEG or PEG PEs of the same SBD.
- Only PE2, the SBD’s MEG or PEG DR, sends the multicast down the SBD’s IRB to the remote OISM PEs and regular OISM forwarding follows on PE3 and PE4.
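The way the DR role affects the Layer 3 OIF list in this workflow can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch simplifies by treating any SMET-signaled interest from remote OISM PEs as a single boolean.

```python
def l3_oif_list(is_sbd_dr, local_bds_with_receivers, remote_oism_interest):
    """Return the Layer 3 OIF list of a MEG or PEG PE for one (S,G).

    Local BD IRBs with receivers are always added; the SBD IRB is added
    only on the SBD DR, and only if remote OISM PEs signaled interest.
    """
    oifs = [f"{bd}-IRB" for bd in local_bds_with_receivers]
    if is_sbd_dr and remote_oism_interest:
        oifs.append("SBD-IRB")
    return oifs
```

In the example, PE1 (non-DR, with Receiver-2 on BD2) ends up with only the BD2 IRB, while PE2 (DR, no local receivers) ends up with only the SBD IRB.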
Local multihomed Receiver-1 on a pair of MEG or PEG PE1 and PE2, BD2
Local multihomed receivers shows an initial situation where MEG or PEG routers PE1 and PE2 are multihomed to a local receiver in BD2. PE2 is the DR for the SBD. As both PEs are MEG or PEG for the same SBD, no EVPN multicast destination exists between the PEs for the SBD.
The following figure shows local multihomed receivers.
The following workflow applies to the example shown in the preceding graphic.
-
PE2 learns, via IGMP/MLD, that Receiver-1 is interested in (S1,G1) and adds the ES SAP to the OIF list.
-
PE2 synchronizes the (S1,G1) state with PE1 via BGP multicast synch routes, and PE1 also adds the ES SAP to its OIF list.
-
Both PE1 and PE2 originate an SMET (S1,G1) following normal OISM procedures.
Both PEs generate the corresponding MVPN/PIM join route for (S1,G1). This is because the MEG or PEG DR election occurs only in the SBD and the state is created in BD2. Consequently, both PEs send the MVPN/PIM join route in this case.
- Step 3 causes traffic from the source to flow to both the DF and NDF, although only the DF forwards the traffic.
- The MEG or PEG DR and non-DR states only impact the addition of the SBD interface to the Layer 3 OIF.
- The datapath extensions prevent MVPN traffic from being sent to EVPN destinations other than an SBD EVPN destination.
- PE2’s SBD is added to the Layer 3 OIF list. However, since there is no EVPN multicast destination between the MEG/PEGs of the same SBD, multicast is not sent from PE2 to PE1.
Local-bias behavior only applies to Layer 2 multicast (BUM in general) and not to Layer 3 multicast. That is, in Local multihomed receivers, the following applies to Layer 3 multicast traffic arriving at PE1 and PE2:
- can be forwarded to single-homed and DF SAPs in BD2
- cannot be forwarded to non-DF SAPs in BD2
- cannot be forwarded to EVPN destinations in BD2, in accordance with the OISM rules
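The three forwarding rules above amount to a simple egress filter, sketched here in Python. The port representation (`kind`, `es_role`) and the function name are hypothetical and exist only for illustration.

```python
def l3_mcast_egress(ports):
    """Return the names of BD ports that may receive Layer 3 multicast.

    Each port is a dict with "name", "kind" ("sap" or "evpn-dest") and, for
    SAPs on an Ethernet segment, "es_role" ("DF" or "NDF"). Single-homed
    SAPs use None for "es_role".
    """
    allowed = []
    for port in ports:
        if port["kind"] != "sap":
            continue  # never to EVPN destinations, per the OISM rules
        if port.get("es_role") in (None, "DF"):
            allowed.append(port["name"])  # single-homed or DF SAP
    return allowed
```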
Local multihomed source S2 on a pair of MEG or PEG PE1 and PE2, BD2
Local multihomed sources shows a scenario where PE1 and PE2 are multihomed to a local source. A local receiver is also using multihoming to the same MEG or PEG pair. PE2 is the SBD-DR.
When the source sends multicast traffic for S2,G2, VXLAN local-bias or regular ESI-label filtering ensures the multihomed local receiver does not get duplicate traffic. The following applies in this scenario:
- The MEG SBD-DR (PE2) still performs the FHR functionality in this case (sends register/Source A-D routes), even if the source was singlehomed to the non-DR.
- If S2 was singlehomed to PE2 only, to avoid tromboning, the source S2 would be learned via ARP/ND as a host route and advertised in a VPN-IP route to attract the join route on PE2.
- If S2 is multihomed, as shown in Local multihomed sources, tromboning may occur, but traffic still flows correctly. For example:
- the remote PE performs UMH selection and picks PE1
- PE1 generates an SMET route in the SBD as usual
- the SMET is imported and the state added to BD2 in PE2
- traffic is received by PE2, forwarded to PE1 via BD2, and then forwarded to the remote MVPN/PIM PE
MEG or PEG configuration example for Ingress Replication on the SBD
This section shows a configuration example for a pair of redundant MEGs. For a PEG example, replace the MVPN configuration with PIM interfaces in the VPRN service.
Each MEG in the pair is configured with a VPRN that contains the MVPN configuration and an SBD R-VPLS. It is assumed that there are no local sources or receivers in this example. The use of domain-id in the VPRN and the SBD R-VPLS prevents control plane loops for unicast routes reinjected from the IP-VPN domain into the EVPN domain, and the other way around. Preventing these loops guarantees the correct installation of unicast routes in the MEGs' route tables, and therefore ensures the C-multicast routes are correctly advertised and processed. See BGP D-PATH attribute for Layer 3 loop protection for more information about the configuration of domain ID. The following CLI shows the configuration in MEG1.
// MEG1’s VPRN service
*A:MEG1# configure service vprn 6000
*A:MEG1>config>service>vprn# info
----------------------------------------------
interface "SBD-6002" create
vpls "SBD-6002"
evpn-tunnel supplementary-broadcast-domain
exit
exit
bgp-ipvpn
mpls
auto-bind-tunnel
resolution any
exit
domain-id 64500:6000
route-distinguisher 192.0.2.2:6000
vrf-target target:64500:6000
no shutdown
exit
exit
igmp
interface "SBD-6002"
no shutdown
exit
no shutdown
exit
pim
interface "SBD-6002"
multicast-senders always
exit
apply-to all
rp
static
address 2.2.2.2
group-prefix 239.0.0.0/8
exit
exit
bsr-candidate
shutdown
exit
rp-candidate
shutdown
exit
exit
no shutdown
exit
mvpn
auto-discovery default
c-mcast-signaling bgp
intersite-shared persistent-type5-adv
provider-tunnel
inclusive
mldp
no shutdown
exit
exit
exit
vrf-target unicast
exit
exit
no shutdown
----------------------------------------------
// MEG1’s SBD service
*A:MEG1>config>service>vprn# /configure service vpls 6002
*A:MEG1>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int
forward-ipv6-multicast-to-ip-int
evpn-mcast-gateway create
non-dr-attract-traffic from-evpn from-pim-mvpn
no shutdown
exit
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement domain-id 64500:6002
sel-mcast-advertisement
evi 6002
mpls bgp 1
ingress-replication-bum-label
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
igmp-snooping
no shutdown
exit
mld-snooping
no shutdown
exit
no shutdown
----------------------------------------------
The configuration of the redundant MEG2 is as follows:
// MEG2’s VPRN configuration
*A:MEG2# configure service vprn 6000
*A:MEG2>config>service>vprn# info
----------------------------------------------
interface "SBD-6002" create
vpls "SBD-6002"
evpn-tunnel supplementary-broadcast-domain
exit
exit
bgp-ipvpn
mpls
auto-bind-tunnel
resolution any
exit
domain-id 64500:6000
route-distinguisher 192.0.2.3:6000
vrf-target target:64500:6000
no shutdown
exit
exit
igmp
interface "SBD-6002"
no shutdown
exit
no shutdown
exit
pim
interface "SBD-6002"
multicast-senders always
exit
apply-to all
rp
static
address 3.3.3.3
group-prefix 239.0.0.0/8
exit
exit
bsr-candidate
shutdown
exit
rp-candidate
shutdown
exit
exit
no shutdown
exit
mvpn
auto-discovery default
c-mcast-signaling bgp
intersite-shared persistent-type5-adv
provider-tunnel
inclusive
mldp
no shutdown
exit
exit
exit
vrf-target unicast
exit
exit
no shutdown
----------------------------------------------
// MEG2 SBD configuration
*A:MEG2>config>service>vprn# /configure service vpls 6002
*A:MEG2>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int
forward-ipv6-multicast-to-ip-int
evpn-mcast-gateway create
non-dr-attract-traffic from-evpn from-pim-mvpn
no shutdown
exit
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement domain-id 64500:6002
sel-mcast-advertisement
evi 6002
mpls bgp 1
ingress-replication-bum-label
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
igmp-snooping
no shutdown
exit
mld-snooping
no shutdown
exit
no shutdown
----------------------------------------------
After the preceding configuration is added, MEG1 and MEG2 run the DR election. In the following example, which displays a sample DR election result, MEG1 is the DR:
*A:MEG1# show service id "SBD-6002" evpn-mcast-gateway all
===============================================================================
Service Evpn Multicast Gateway
===============================================================================
Type : mvpn-pim
Admin State : Enabled
DR Activation Timer : 3 secs
Mvpn Evpn Gateway DR : Yes
Pim Evpn Gateway DR : Yes
===============================================================================
===============================================================================
Mvpn Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs
DR : Yes
DR Last Change : 09/27/2021 08:50:32
===============================================================================
===============================================================================
Candidate list
===============================================================================
Orig-Ip Time Added
-------------------------------------------------------------------------------
192.0.2.2 09/27/2021 08:50:29
192.0.2.3 09/27/2021 08:51:20
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================
===============================================================================
Pim Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs
DR : Yes
DR Last Change : 09/27/2021 08:50:32
===============================================================================
===============================================================================
Candidate list
===============================================================================
Orig-Ip Time Added
-------------------------------------------------------------------------------
192.0.2.2 09/27/2021 08:50:29
192.0.2.3 09/27/2021 08:51:20
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================
*A:MEG2# show service id "SBD-6002" evpn-mcast-gateway all
===============================================================================
Service Evpn Multicast Gateway
===============================================================================
Type : mvpn-pim
Admin State : Enabled
DR Activation Timer : 3 secs
Mvpn Evpn Gateway DR : No
Pim Evpn Gateway DR : No
===============================================================================
===============================================================================
Mvpn Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs
DR : No
DR Last Change : 09/27/2021 08:51:24
===============================================================================
===============================================================================
Candidate list
===============================================================================
Orig-Ip Time Added
-------------------------------------------------------------------------------
192.0.2.2 09/27/2021 08:51:21
192.0.2.3 09/27/2021 08:50:37
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================
===============================================================================
Pim Evpn Gateway
===============================================================================
DR Activation Timer Remaining: 3 secs
DR : No
DR Last Change : 09/27/2021 08:51:24
===============================================================================
===============================================================================
Candidate list
===============================================================================
Orig-Ip Time Added
-------------------------------------------------------------------------------
192.0.2.2 09/27/2021 08:51:21
192.0.2.3 09/27/2021 08:50:37
-------------------------------------------------------------------------------
Number of Entries: 2
===============================================================================
If a source 40.0.0.1 is located in a remote PE of the MVPN network, and it is streaming group 239.0.0.44, the DR (for example, MEG1) attracts the traffic (by sending a C-multicast source join route) and forwards it to the SBD. The non-DR MEG2 does not add the SBD to the OIF list, and therefore it does not forward the multicast traffic to the OISM domain. The following is a sample output for this scenario.
// On the DR, MEG1, the SBD-6002 is added to the OIF list
*A:MEG1# show router 6000 pim group 239.0.0.44 detail
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.44
Source Address : 40.0.0.1
RP Address : 2.2.2.2
Advt Router : 192.0.2.4
Flags : spt Type : (S,G)
Mode : sparse
MRIB Next Hop : 192.0.2.4
MRIB Src Flags : remote
Keepalive Timer : Not Running
Up Time : 2d 04:42:53 Resolved By : rtable-u
Up JP State : Joined Up JP Expiry : 0d 00:00:06
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
Register State : No Info
Reg From Anycast RP: No
Rpf Neighbor : 192.0.2.4
Incoming Intf : mpls-if-73731
Outgoing Intf List : SBD-6002
Curr Fwding Rate : 0.000 kbps
Forwarded Packets : 9999 Discarded Packets : 0
Forwarded Octets : 839916 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
// SBD-6002 is not added to the OIF list on the non-DR MEG2
*A:PE-3# show router 6000 pim group 239.0.0.44 detail
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.44
Source Address : 40.0.0.1
RP Address : 3.3.3.3
Advt Router : 192.0.2.4
Flags : spt Type : (S,G)
Mode : sparse
MRIB Next Hop : 192.0.2.4
MRIB Src Flags : remote
Keepalive Timer : Not Running
Up Time : 2d 04:43:02 Resolved By : rtable-u
Up JP State : Joined Up JP Expiry : 0d 00:00:58
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
Register State : No Info
Reg From Anycast RP: No
Rpf Neighbor : 192.0.2.4
Incoming Intf : mpls-if-73733
Outgoing Intf List :
Curr Fwding Rate : 0.000 kbps
Forwarded Packets : 0 Discarded Packets : 0
Forwarded Octets : 0 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
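The OIF comparison shown in the two outputs can also be scripted. The following Python sketch extracts the Outgoing Intf List field from the captured text of a show router pim group detail output. It assumes the field sits on a single line and that multiple interfaces, if present, are separated by commas or spaces; neither is confirmed by this section, and `outgoing_interfaces` is a hypothetical helper.

```python
import re


def outgoing_interfaces(show_output):
    """Return the interface names in the 'Outgoing Intf List' field."""
    match = re.search(
        r"^[ \t]*Outgoing Intf List[ \t]*:[ \t]*(.*)$",
        show_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("Outgoing Intf List field not found")
    # Assumption: names are separated by commas and/or whitespace.
    return [name for name in re.split(r"[,\s]+", match.group(1).strip()) if name]
```

Applied to the outputs above, the DR output would yield a list containing SBD-6002 and the non-DR output an empty list.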
MEG or PEG configuration example for mLDP on the SBD
This section shows a configuration example for a pair of redundant MEGs that use mLDP in the SBD to transmit and receive multicast traffic.
As in the previous example, each MEG in the pair is configured with a VPRN that contains the MVPN configuration and an SBD R-VPLS. Local sources/receivers are supported in this example and they are attached to local BDs or local interfaces in the VPRN. A receiver connected to a local BD (BD-6023) is multihomed to MEG1 and MEG2. Also, as in the previous example, the use of domain-id in the VPRN and the SBD R-VPLS prevents control plane loops for unicast routes. The following CLI shows the configuration in MEG1.
// MEG1’s VPRN service
*A:MEG1# configure service vprn 6000
*A:MEG1>config>service>vprn# info
----------------------------------------------
local-routes-domain-id 64500:2 // avoids loops for local routes
interface "BD-6023" create // local BD
address 11.0.0.2/24
vrrp 1 passive
backup 11.0.0.254
exit
vpls "BD-6023"
evpn
arp
no learn-dynamic
advertise dynamic
exit
exit
exit
exit
interface "SBD-6002" create
vpls "SBD-6002"
evpn-tunnel supplementary-broadcast-domain
exit
exit
interface "local" create // local interface
address 20.0.0.254/24
sap pxc-6.a:600 create
exit
exit
bgp-ipvpn
mpls
auto-bind-tunnel
resolution any
exit
domain-id 64500:6000
route-distinguisher 192.0.2.2:6000
vrf-target target:64500:6000
no shutdown
exit
exit
igmp
interface "BD-6023"
no shutdown
exit
interface "SBD-6002"
no shutdown
exit
interface "local"
no shutdown
exit
exit
pim
interface "SBD-6002"
multicast-senders always
exit
apply-to all
rp
static
address 4.4.4.4
group-prefix 224.0.0.0/4
exit
exit
bsr-candidate
shutdown
exit
rp-candidate
shutdown
exit
exit
no shutdown
exit
mvpn
auto-discovery default
c-mcast-signaling bgp
intersite-shared persistent-type5-adv
provider-tunnel
inclusive
mldp
no shutdown
exit
exit
exit
vrf-target unicast
exit
exit
no shutdown
----------------------------------------------
// MEG1’s SBD service
*A:MEG1>config>service>vprn# /configure service vpls 6002
*A:MEG1>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int
forward-ipv6-multicast-to-ip-int
evpn-mcast-gateway create
non-dr-attract-traffic from-evpn from-pim-mvpn
no shutdown
exit
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement domain-id 64500:6002
sel-mcast-advertisement
evi 6002
mpls bgp 1
ingress-replication-bum-label
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
provider-tunnel // mldp is enabled on the SBD
inclusive
owner bgp-evpn-mpls
data-delay-interval 10
root-and-leaf
mldp
no shutdown
exit
exit
igmp-snooping
no shutdown
exit
mld-snooping
no shutdown
exit
no shutdown
----------------------------------------------
// MEG1’s local BD-6023 service
*A:MEG1>config>service>vprn# /configure service vpls 6023
*A:MEG1>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int
forward-ipv6-multicast-to-ip-int
igmp-snooping
mrouter-port
exit
mld-snooping
mrouter-port
exit
exit
bgp
exit
bgp-evpn
evi 623
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
provider-tunnel
inclusive
owner bgp-evpn-mpls
data-delay-interval 10
root-and-leaf
mldp
no shutdown
exit
exit
stp
shutdown
exit
igmp-snooping
no shutdown
exit
mld-snooping
no shutdown
exit
sap lag-1:623 create
igmp-snooping
send-queries
exit
no shutdown
exit
no shutdown
The configuration of the redundant MEG2 is as follows:
// MEG2’s VPRN configuration
*A:MEG2# configure service vprn 6000
*A:MEG2>config>service>vprn# info
----------------------------------------------
local-routes-domain-id 64500:3
interface "BD-6023" create
address 11.0.0.3/24
vrrp 1 passive
backup 11.0.0.254
exit
vpls "BD-6023"
evpn
arp
no learn-dynamic
advertise dynamic
exit
exit
exit
exit
interface "SBD-6002" create
vpls "SBD-6002"
evpn-tunnel supplementary-broadcast-domain
exit
exit
interface "local" create
address 30.0.0.254/24
sap pxc-6.a:600 create
exit
exit
bgp-ipvpn
mpls
auto-bind-tunnel
resolution any
exit
domain-id 64500:6000
route-distinguisher 192.0.2.3:6000
vrf-target target:64500:6000
no shutdown
exit
exit
igmp
interface "BD-6023"
no shutdown
exit
interface "SBD-6002"
no shutdown
exit
interface "local"
no shutdown
exit
no shutdown
exit
pim
interface "SBD-6002"
multicast-senders always
exit
apply-to all
rp
static
address 4.4.4.4
group-prefix 224.0.0.0/4
exit
exit
bsr-candidate
shutdown
exit
rp-candidate
shutdown
exit
exit
no shutdown
exit
mvpn
auto-discovery default
c-mcast-signaling bgp
intersite-shared persistent-type5-adv
provider-tunnel
inclusive
mldp
no shutdown
exit
exit
exit
vrf-target unicast
exit
exit
no shutdown
----------------------------------------------
// MEG2 SBD configuration
*A:MEG2>config>service>vprn# /configure service vpls 6002
*A:MEG2>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int
forward-ipv6-multicast-to-ip-int
evpn-mcast-gateway create
non-dr-attract-traffic from-evpn from-pim-mvpn
no shutdown
exit
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement domain-id 64500:6002
sel-mcast-advertisement
evi 6002
mpls bgp 1
ingress-replication-bum-label
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
provider-tunnel
inclusive
owner bgp-evpn-mpls
data-delay-interval 10
root-and-leaf
mldp
no shutdown
exit
exit
igmp-snooping
no shutdown
exit
mld-snooping
no shutdown
exit
no shutdown
----------------------------------------------
// MEG2’s local BD-6023 service
A:MEG2>config>service>vpls# /configure service vpls 6023
A:MEG2>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
forward-ipv4-multicast-to-ip-int
forward-ipv6-multicast-to-ip-int
igmp-snooping
mrouter-port
exit
mld-snooping
mrouter-port
exit
exit
bgp
exit
bgp-evpn
evi 623
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
provider-tunnel
inclusive
owner bgp-evpn-mpls
data-delay-interval 10
root-and-leaf
mldp
no shutdown
exit
exit
stp
shutdown
exit
igmp-snooping
no shutdown
exit
mld-snooping
no shutdown
exit
sap lag-1:623 create
igmp-snooping
send-queries
exit
no shutdown
exit
no shutdown
----------------------------------------------
After the preceding configuration is added, MEG1 and MEG2 run the DR election. As in the previous example, MEG1 is elected as DR and MEG2 as non-DR. In this example, the SBD is using mLDP instead of ingress replication to transmit and receive multicast traffic. The following sample output shows the status of the provider-tunnel in MEG1 and MEG2.
A:MEG1# show service id "SBD-6002" provider-tunnel
===============================================================================
Service Provider Tunnel Information
===============================================================================
Type : inclusive Root and Leaf : enabled
Admin State : enabled Data Delay Intvl : 10 secs
PMSI Type : ldp LSP Template :
Remain Delay Intvl : 0 secs LSP Name used : 8195
PMSI Owner : bgpEvpnMpls
Oper State : up Root Bind Id : 32767
===============================================================================
A:MEG1# tools dump service id "SBD-6002" provider-tunnels
===============================================================================
VPLS 6002 Inclusive Provider Tunnels Originating
===============================================================================
ipmsi (LDP) P2MP-ID Root-Addr
-------------------------------------------------------------------------------
8195 8195 192.0.2.2
-------------------------------------------------------------------------------
===============================================================================
VPLS 6002 Inclusive Provider Tunnels Terminating
===============================================================================
ipmsi (LDP) P2MP-ID Root-Addr
-------------------------------------------------------------------------------
8193 192.0.2.1
-------------------------------------------------------------------------------
A:MEG2# show service id "SBD-6002" provider-tunnel
===============================================================================
Service Provider Tunnel Information
===============================================================================
Type : inclusive Root and Leaf : enabled
Admin State : enabled Data Delay Intvl : 10 secs
PMSI Type : ldp LSP Template :
Remain Delay Intvl : 0 secs LSP Name used : 8195
PMSI Owner : bgpEvpnMpls
Oper State : up Root Bind Id : 32767
===============================================================================
A:MEG2# tools dump service id "SBD-6002" provider-tunnels
===============================================================================
VPLS 6002 Inclusive Provider Tunnels Originating
===============================================================================
ipmsi (LDP) P2MP-ID Root-Addr
-------------------------------------------------------------------------------
8195 8195 192.0.2.3
-------------------------------------------------------------------------------
===============================================================================
VPLS 6002 Inclusive Provider Tunnels Terminating
===============================================================================
ipmsi (LDP) P2MP-ID Root-Addr
-------------------------------------------------------------------------------
8193 192.0.2.1
-------------------------------------------------------------------------------
The example in MEG or PEG configuration example for Ingress Replication on the SBD showed the IIF and OIF lists on the MEGs for a source 40.0.0.1 that was connected to a remote MVPN PE and was streaming group 239.0.0.44. In this example, a source 10.0.0.1 is connected to a remote OISM PE and streams group 239.0.0.1. The group has local receivers on the local BD-6023, which is multihomed to MEG1 and MEG2, and has receivers on a remote MVPN PE. The remote MVPN PE is configured with the default mvpn umh-selection highest-ip; therefore, a local join triggers a C-multicast source-join route that is imported only by MEG2 (because it has a higher IP address than MEG1).
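The highest-ip selection described above can be sketched as follows. This is only an illustration of the comparison, not the router's implementation; the function name select_umh and the candidate list are assumptions, and the addresses are the MEG1 and MEG2 router IDs shown in the sample outputs.

```python
import ipaddress


def select_umh(candidate_pes):
    """Return the candidate upstream PE with the numerically highest IP address.

    Hypothetical helper: models only the 'highest-ip' comparison.
    """
    # Compare as IP addresses, not as strings ("192.0.2.10" > "192.0.2.9").
    return max(candidate_pes, key=ipaddress.ip_address)


# MEG1 and MEG2 router IDs taken from the sample outputs in this section.
megs = {"192.0.2.2": "MEG1", "192.0.2.3": "MEG2"}
print(megs[select_umh(megs)])  # MEG2
```

Comparing parsed addresses instead of strings avoids ordering mistakes when the last octets have a different number of digits.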
The following shows a sample output for this scenario.
// The source-join route for 239.0.0.1 is only imported by MEG2
A:MEG1# show router bgp routes mvpn-ipv4 type source-join group-ip 239.0.0.1 source-ip 10.0.0.1
===============================================================================
BGP Router ID:192.0.2.2 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP MVPN-IPv4 Routes
===============================================================================
Flag RouteType OriginatorIP LocalPref MED
RD SourceAS Path-Id IGP Cost
Nexthop SourceIP Label
As-Path GroupIP
-------------------------------------------------------------------------------
No Matching Entries Found.
===============================================================================
A:PE-3# show router bgp routes mvpn-ipv4 type source-join group-ip 239.0.0.1 source-ip 10.0.0.1
===============================================================================
BGP Router ID:192.0.2.3 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP MVPN-IPv4 Routes
===============================================================================
Flag RouteType OriginatorIP LocalPref MED
RD SourceAS Path-Id IGP Cost
Nexthop SourceIP Label
As-Path GroupIP
-------------------------------------------------------------------------------
u*>i Source-Join - 100 0
192.0.2.3:6000 64500 None -
192.0.2.4 10.0.0.1
No As-Path 239.0.0.1
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
// Therefore, only MEG2 will add the MVPN tunnel to the OIF list for the group
// MEG1 only adds the local BD-6023 to the OIF list
A:MEG1# show router 6000 pim group 239.0.0.1 detail
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.1
Source Address : 10.0.0.1
RP Address : 4.4.4.4
Advt Router :
Flags : spt Type : (S,G)
Mode : sparse
MRIB Next Hop : 10.0.0.1
MRIB Src Flags : direct
Keepalive Timer Exp: 0d 00:02:43
Up Time : 1d 17:27:02 Resolved By : rtable-u
Up JP State : Joined Up JP Expiry : 0d 00:00:00
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
Register State : Pruned Register Stop Exp : 0d 00:00:09
Reg From Anycast RP: No
Rpf Neighbor : 10.0.0.1
Incoming Intf : SBD-6002
Outgoing Intf List : BD-6023, SBD-6002
Curr Fwding Rate : 67.200 kbps
Forwarded Packets : 15258 Discarded Packets : 0
Forwarded Octets : 1281672 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
// on MEG1's local BD-6023 there is a receiver on sap:lag-1:623
A:MEG1# show service id "BD-6023" mfib
===============================================================================
Multicast FIB, Service 6023
===============================================================================
Source Address Group Address Port Id Svc Id Fwd
Blk
-------------------------------------------------------------------------------
* * mpls:192.0.2.3:524258 Local Fwd
10.0.0.1 239.0.0.1 sap:lag-1:623 Local Fwd
mpls:192.0.2.3:524258 Local Fwd
* * (mac) mpls:192.0.2.3:524258 Local Fwd
-------------------------------------------------------------------------------
Number of entries: 3
===============================================================================
// MEG2 adds the local BD-6023 and the MVPN tunnel to the OIF list
A:PE-3# show router 6000 pim tunnel-interface
===============================================================================
PIM Interfaces ipv4
===============================================================================
Interface Originator Address Adm Opr Transport Type
-------------------------------------------------------------------------------
mpls-if-73729 192.0.2.3 Up Up Tx-IPMSI
mpls-if-73733 192.0.2.4 Up Up Rx-IPMSI
mpls-if-73736 192.0.2.2 Up Up Rx-IPMSI
-------------------------------------------------------------------------------
Interfaces : 3
===============================================================================
A:PE-3# show router 6000 pim group 239.0.0.1 detail
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.1
Source Address : 10.0.0.1
RP Address : 4.4.4.4
Advt Router :
Flags : spt Type : (S,G)
Mode : sparse
MRIB Next Hop : 10.0.0.1
MRIB Src Flags : direct
Keepalive Timer : Not Running
Up Time : 1d 17:32:43 Resolved By : rtable-u
Up JP State : Joined Up JP Expiry : 0d 00:00:00
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
Register State : No Info
Reg From Anycast RP: No
Rpf Neighbor : 10.0.0.1
Incoming Intf : SBD-6002
Outgoing Intf List : BD-6023, mpls-if-73729
Curr Fwding Rate : 66.864 kbps
Forwarded Packets : 27221 Discarded Packets : 0
Forwarded Octets : 2286564 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
EVPN Layer-2 multicast (IGMP/MLD proxy)
SR OS supports EVPN Layer-2 multicast as described in the EVPN IGMP/MLD Proxy specification, RFC 9251. When this is enabled in a VPLS service with active IGMP or MLD snooping, IGMP or MLD messages are no longer sent to EVPN destinations. SMET routes (EVPN route type 6) are advertised instead, so that the interest in a specific (S,G) can be signaled to the rest of the PEs attached to the same VPLS (also known as a Broadcast Domain (BD)). See SMET routes replace IGMP/MLD reports.
A VPLS service supporting EVPN-based proxy-IGMP/MLD functionality is configured as follows:
vpls 1 name "evi-1" customer 1 create
bgp
exit
bgp-evpn
evi 1
sel-mcast-advertisement
vxlan
shutdown
exit
mpls
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
igmp/mld-snooping
evpn-proxy
no shutdown
exit
sap lag-1:101 create
igmp-snooping
send-queries
exit
no shutdown
exit
Where:
- The sel-mcast-advertisement command allows the advertisement of SMET routes. The received SMET routes are processed regardless of the command.
- The evpn-proxy command in either the igmp-snooping or mld-snooping contexts:
  - triggers an IMET route update with the multicast flags EC and the proxy bits set. The multicast flags extended community carries a flag for IGMP proxy, which is set if igmp-snooping>evpn-proxy no shutdown is configured. Similarly, the MLD proxy flag is set if mld-snooping>evpn-proxy no shutdown is configured.
  - no longer turns EVPN MPLS into an Mrouter port, when used in an EVPN MPLS service
  - enables EVPN proxy (IGMP or MLD snooping must be shut down)
- When the VPLS service is configured as an EVPN proxy service, IGMP or MLD queries or reports are no longer forwarded to EVPN destinations of PEs that support EVPN proxy. The reports are also no longer processed when received from PEs that support EVPN proxy.
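The forwarding rule above can be illustrated with a minimal sketch. The ImetRoute class and its field names are hypothetical and only model the presence of the multicast flags EC and its IGMP and MLD proxy flags; they do not reflect the encoding of the extended community.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImetRoute:
    """Hypothetical view of a received IMET route (illustration only)."""
    originator: str
    has_mcast_flags_ec: bool = False  # multicast flags EC present
    igmp_proxy: bool = False          # IGMP proxy flag set in the EC
    mld_proxy: bool = False           # MLD proxy flag set in the EC


def peer_supports_proxy(imet, family):
    """True if the remote PE signals EVPN proxy for 'igmp' or 'mld'."""
    if not imet.has_mcast_flags_ec:
        return False
    return imet.igmp_proxy if family == "igmp" else imet.mld_proxy


def forward_report_to_peer(imet, family):
    """Reports and queries go only to PEs that do not support EVPN proxy."""
    return not peer_supports_proxy(imet, family)


legacy = ImetRoute("192.0.2.5")
proxy = ImetRoute("192.0.2.3", has_mcast_flags_ec=True, igmp_proxy=True)
print(forward_report_to_peer(legacy, "igmp"))  # True
print(forward_report_to_peer(proxy, "igmp"))   # False
```

In this model a PE that sets only the IGMP proxy flag still receives MLD messages, because the decision is taken per protocol family.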
The IGMP or MLD snooping function works in the following manner when the evpn-proxy command is enabled:
- IGMP or MLD works in proxy mode despite its configuration as IGMP or MLD snooping.
- Received IGMP or MLD join or leave messages on SAP or SDP bindings are processed by the proxy database to summarize the IGMP or MLD state in the service based on the group joined (each join for a group lists all sources to join). The proxy database can be displayed as follows.
# show service id 4000 igmp-snooping proxy-db
===============================================================================
IGMP Snooping Proxy-reporting DB for service 4000
===============================================================================
Group Address Mode Up Time Num Sources
-------------------------------------------------------------------------------
239.0.0.1 exclude 0d 00:53:00 0
239.0.0.2 include 0d 00:53:01 1
-------------------------------------------------------------------------------
Number of groups: 2
===============================================================================
- When evpn-proxy is enabled, an additional EVPN proxy database is created to hand the version flags over to BGP and generate the SMET routes with the proper IGMP or MLD version flags. This EVPN proxy database is populated with local reports received on SAP or SDP binds but not with received SMET routes (the regular proxy database includes reports from SMETs too, without the version). The EVPN proxy database can be displayed as follows:
# show service id 4000 igmp-snooping evpn-proxy-db
===============================================================================
IGMP Snooping Evpn-Proxy-reporting DB for service 4000
===============================================================================
Group Address Mode Up Time Num Sources V1 V2 V3
-------------------------------------------------------------------------------
239.0.0.1 exclude 0d 00:53:55 0 V3
239.0.0.2 include 0d 00:53:55 1 V3
-------------------------------------------------------------------------------
Number of groups: 2
===============================================================================
- The EVPN proxy database and the regular proxy database process IGMP or MLD reports as follows:
  - The EVPN proxy database result is communicated to the EVPN layer so that the corresponding SMET routes and flags are sent to the BGP peers. If multiple versions exist on the EVPN proxy database, multiple flags are set in the SMET routes.
  - The regular proxy database result is conveyed to the local Mrouter ports on SAP or SDP binds by IGMP or MLD reports and they are never sent to EVPN destinations of PEs with evpn-proxy configured.
- IGMP or MLD messages received on local SAP or SDP bind Mrouter ports (which have a default *.* entry) and queries are not processed by the proxy database. Instead, they are forwarded to local SAP or SDP binds but never to EVPN destinations of PEs with evpn-proxy configured (they are, however, still sent to non-EVPN proxy PEs).
- IGMP or MLD reports or queries are not received from EVPN PEs with evpn-proxy configured, but they are received and processed from EVPN PEs with no evpn-proxy configured. A PE determines if a specified remote PE, in the same BD, supports EVPN proxy based on the igmp-proxy and mld-proxy flags received along with the IMET routes.
- The Layer-2 MFIB OIF list for an (S,G) is built out of the local IGMP or MLD reports and remote SMET routes.
  - For backwards compatibility, PEs that advertise IMET routes without the multicast flags EC, or with the EC but without the proxy bit set, are considered as Mrouters. That is, their EVPN binds are added to all OIF lists and reports are sent to them.
  - Even if MLD snooping is shut down and only IGMP snooping is enabled, the MFIB shows the EVPN binds added to *,* for MAC scope. If MLD snooping is enabled, the EVPN binds are not added as Mrouter ports for MAC scope.
- When SMET routes are received for a specific (S,G), the corresponding reports are sent to local SAP or SDP binds connected to queriers. The report version is set based on the local version of the querier.
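The way the Layer-2 MFIB OIF list is assembled from the three inputs above (local reports, remote SMET routes, and remote PEs treated as Mrouters) can be sketched as follows. This is a simplified model under assumed data structures; the function name, the dictionary layout, and the "evpn:" label used for EVPN destinations are all hypothetical.

```python
def build_oif_list(group_key, local_reports, smet_routes, mrouter_pes):
    """Return the sorted OIF list for one (source, group) key.

    local_reports: {(source, group): [local SAP or SDP-bind names]}
    smet_routes:   {(source, group): [originator IPs of received SMET routes]}
    mrouter_pes:   originator IPs of PEs that do not signal EVPN proxy
    """
    oifs = set(local_reports.get(group_key, ()))
    # Remote PEs that signaled interest with an SMET route.
    oifs |= {f"evpn:{pe}" for pe in smet_routes.get(group_key, ())}
    # PEs without the proxy flag are Mrouters and are added to every OIF list.
    oifs |= {f"evpn:{pe}" for pe in mrouter_pes}
    return sorted(oifs)


key = ("10.0.0.1", "239.0.0.1")
local_reports = {key: ["sap:lag-1:623"]}
smet_routes = {key: ["192.0.2.3"]}
print(build_oif_list(key, local_reports, smet_routes, ["192.0.2.5"]))
# ['evpn:192.0.2.3', 'evpn:192.0.2.5', 'sap:lag-1:623']
```

A group with no local report and no SMET route still lists the Mrouter PEs, which mirrors the backwards-compatibility behavior described in the list.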
The IGMP or MLD EVPN proxy functionality is supported in VPLS services with EVPN-VXLAN or EVPN-MPLS, together with ingress replication or mLDP provider-tunnel trees.
In addition, EVPN proxy VPLS services support EVPN multihoming with multicast state synchronization using EVPN routes type 7 and 8. No additional command is needed to trigger the advertisement and processing of the multicast sync routes. In VPLS services, BGP sync routes are advertised or processed whenever the evpn-proxy command is enabled and there is a local Ethernet segment in the service. See EVPN OISM and multihoming for more information about the EVPN multicast synchronization routes and state synchronization in Ethernet segments.
Selective Provider Tunnels in OISM and EVPN-proxy services
Selective Provider Tunnels (S-PMSI)
Selective Provider Tunnels or Selective Provider Multicast Service Interface (S-PMSI) tunnels are supported in R-VPLS services configured in Optimized Inter-Subnet Multicast (OISM) mode or VPLS services configured in evpn-proxy mode.
Selective Provider Tunnels are signaled using the EVPN Selective Provider Multicast Service Interface Auto-Discovery (S-PMSI A-D) route, or EVPN route type 10. SR OS supports two types of Selective Provider Tunnels:
- mLDP wildcard S-PMSI trees, which are used to optimize the delivery of multicast and forward it to only PEs with IP Multicast sources or receivers. Wildcard S-PMSIs are enabled by the following command.
configure service vpls provider-tunnel selective wildcard-spmsi
- mLDP specific S-PMSI trees for (S,G) and/or (*,G) groups, which are used to optimize the delivery of some multicast groups that have receivers only in a limited number of PEs. (S,G) and (*,G) S-PMSIs are enabled by configuring the following command.
configure service vpls provider-tunnel selective data-threshold
The configuration of mLDP S-PMSIs for EVPN is similar to the mLDP S-PMSIs for MVPN. A data-threshold for a group-address and mask is configured. When the threshold (configured in kbps) for a group contained in the group-address and mask is exceeded, the router sets up a selective provider tunnel and the PEs with receivers for that group join the mLDP selective tree. Options to set up the S-PMSIs based on the number of interested PEs are also supported, as well as a maximum-p2mp-spmsi parameter that limits the number of S-PMSI trees created per service.
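As a rough illustration of the data-threshold check, the sketch below looks up the threshold of the most specific configured range that contains a group (the longest prefix match rule described for the group-address/mask parameter) and compares it with a measured rate. The function names and the dictionary of ranges are assumptions; the ranges and thresholds are the ones used in the overlapping-range example of this section.

```python
import ipaddress


def matching_threshold(group, thresholds):
    """Return the kbps threshold of the longest matching range, or None."""
    addr = ipaddress.ip_address(group)
    best = None
    for prefix, kbps in thresholds.items():
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, kbps)
    return None if best is None else best[1]


def should_signal_spmsi(group, rate_kbps, thresholds):
    """True if the group matches a range and its rate exceeds the threshold.

    Threshold 0 is treated as 'switch as soon as the group is learned'.
    """
    threshold = matching_threshold(group, thresholds)
    if threshold is None:
        return False
    return threshold == 0 or rate_kbps > threshold


ranges = {"232.0.0.0/16": 0, "232.0.1.0/24": 10}
print(matching_threshold("232.0.1.1", ranges))       # 10
print(should_signal_spmsi("232.0.1.1", 5, ranges))   # False
print(should_signal_spmsi("232.0.1.1", 12, ranges))  # True
print(should_signal_spmsi("232.0.2.1", 0, ranges))   # True
```

The PE-count options and the maximum-p2mp-spmsi limit are deliberately left out of this sketch.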
BGP-EVPN S-PMSI A-D route
The EVPN Selective Provider Multicast Service Interface Auto-Discovery route or simply S-PMSI A-D route or route type 10 is required to advertise:
- Wildcard S-PMSI routes to set up mLDP IP multicast trees
- Selective S-PMSI routes to set up mLDP selective IP multicast trees
The S-PMSI A-D route is specified in draft-ietf-bess-evpn-bum-procedure-updates and the format is depicted in S-PMSI A-D route format:
Where:
- All the fields are considered part of the route key for BGP processing.
- When a service is configured to advertise wildcard spmsi routes, a route type 10 is advertised with Source and Group being all zeros. Otherwise the Source and Groups are populated as in the case of the other multicast routes.
- The S-PMSI A-D route above is only supported along with tunnel type mLDP (in the Provider Tunnel Attribute). No other tunnel types are supported.
- While in VPLS evpn-proxy services the S-PMSI A-D route is advertised using the route distinguisher and the route target of the service, in OISM mode, the S-PMSI A-D routes are advertised from the SBD or from an ordinary BD:
- When advertised from an ordinary BD, the route includes the BD route-target (and route distinguisher) where the selective tree is configured plus the SBD route target
- When advertised from the SBD, the route includes the SBD route target only. This is only required in cases where the PE is a MEG/PEG.
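The route-target and route-key rules listed above can be summarized in a short sketch. The service kinds ("evpn-proxy", "oism-bd", "oism-sbd"), the function names, and the route-target strings are hypothetical labels chosen for illustration, and the all-zeros wildcard is shown for IPv4 only.

```python
def spmsi_route_targets(service_kind, service_rt, sbd_rt=None):
    """Return the route targets carried by an S-PMSI A-D route.

    service_kind is a hypothetical label:
      'evpn-proxy' -> VPLS in evpn-proxy mode (service RT only)
      'oism-bd'    -> ordinary BD in OISM mode (BD RT plus SBD RT)
      'oism-sbd'   -> SBD in OISM mode (SBD RT only, passed as service_rt)
    """
    if service_kind in ("evpn-proxy", "oism-sbd"):
        return [service_rt]
    if service_kind == "oism-bd":
        if sbd_rt is None:
            raise ValueError("an ordinary BD also needs the SBD route target")
        return [service_rt, sbd_rt]
    raise ValueError(f"unknown service kind: {service_kind}")


def spmsi_source_group(source, group, wildcard):
    """Wildcard S-PMSI routes carry all-zeros Source and Group."""
    return ("0.0.0.0", "0.0.0.0") if wildcard else (source, group)


# Route-target values below are illustrative only.
print(spmsi_route_targets("oism-bd", "target:64500:623", "target:64500:6002"))
print(spmsi_source_group("10.0.0.1", "239.0.0.1", wildcard=True))
```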
Wildcard Selective Provider tunnels
The wildcard S-PMSI A-D route is supported in OISM and VPLS evpn-proxy modes.
- An mLDP tree can be configured as selective wildcard-spmsi in all the R-VPLS services of the tenant, including the SBD.
- An mLDP tree can be configured as selective wildcard-spmsi in VPLS services as long as evpn-proxy is enabled.
- The selective provider-tunnel configuration is blocked on services where evpn-proxy or OISM are not enabled.
- Based on the configuration of the following command, PE1 signals a wildcard S-PMSI A-D route for BD1 (in addition to the IMET routes as in the regular OISM case or the EVPN proxy case). The route contains the SBD-RT (SBD's route target) in addition to the BD1-RT (BD1's route target).
configure service vpls provider-tunnel selective wildcard-spmsi
- PE2 and PE3 import the route as they would do for BD1 IMET in OISM mode. PE2 and PE3 join the wildcard S-PMSI mLDP tree if they have been enabled using the following command and they have any local receivers that issued an IGMP/MLD join. A PE will not join the wildcard S-PMSI if no local receivers are joined.
  - MD-CLI
    configure service vpls provider-tunnel selective admin-state enable
  - classic CLI
    configure service vpls provider-tunnel selective no shutdown
- The impact of this procedure is twofold:
- PE1 now uses the wildcard S-PMSI mLDP tree for IP Multicast traffic. The IP multicast traffic is delivered to only those downstream PEs that joined the wildcard S-PMSI tree, and not the rest of the PEs of the tenant. Note that, in MVPN, the wildcard-spmsi does not carry traffic (the route does not even contain PTA). In EVPN, the wildcard-spmsi carries IP multicast and the route is advertised with an mLDP PTA.
- PE1 now sends BUM traffic to only the PEs attached to the source BD (for example, PE2, and not PE3), while still using mLDP for multicast traffic. Without wildcard S-PMSIs, using mLDP for multicast would require using it for BUM traffic too, which would mean BUM traffic is also attracted by PE3 (in the example above).
- The wildcard-spmsi is used for multicast and the BUM EVPN destinations can be used for BUM. Note that PE1’s EVPN SBD destination bind to PE3 is of type multicast (‘m’), so it is not used for BUM.
Inclusive and Selective mLDP Provider Tunnels are not simultaneously supported in the same service.
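The join condition for the wildcard S-PMSI tree described in this section (selective provider tunnel enabled, and at least one local receiver joined) can be sketched as below. The dictionary layout, the PE names, and the receiver states are hypothetical and do not reproduce the example topology.

```python
def wildcard_tree_leaves(root, pes):
    """Return the PEs that join the root's wildcard S-PMSI mLDP tree.

    pes: {name: {"selective_enabled": bool, "local_receivers": int}}
    A PE joins only if the selective tunnel is enabled and it has at
    least one local receiver that issued an IGMP/MLD join.
    """
    return sorted(
        name
        for name, state in pes.items()
        if name != root
        and state["selective_enabled"]
        and state["local_receivers"] > 0
    )


# Hypothetical state: PE-B has a local receiver, PE-C has none.
pes = {
    "PE-A": {"selective_enabled": True, "local_receivers": 0},
    "PE-B": {"selective_enabled": True, "local_receivers": 1},
    "PE-C": {"selective_enabled": True, "local_receivers": 0},
}
print(wildcard_tree_leaves("PE-A", pes))  # ['PE-B']
```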
(S,G) and (*,G) Selective Provider Tunnels
- PE1 may use wildcard-spmsi or regular inclusive forwarding for IP multicast traffic. In the example, PE1 uses wildcard-spmsi.
- PE2 and PE3 are configured with the following command and therefore join the wildcard-spmsi tree.
  - MD-CLI
    configure service vpls provider-tunnel selective admin-state enable
  - classic CLI
    configure service vpls provider-tunnel selective no shutdown
- Since PE2 receives a local IGMP join (*,G1), PE2 triggers an SMET (*,G1) that creates an MFIB entry for (*,G1) on PE1.
- PE1 is configured with a threshold in ‘kbps’ units for G1 and starts polling stats for all MFIB entries that include G1. When the configured kbps threshold (and optionally, the number of PEs for an (S,G)) is exceeded, PE1 signals an S-PMSI A-D route for the (*,G), and after the delay-interval it starts using the new tree for the group.
- If PE1 receives an SMET (S,G), then it generates an S-PMSI A-D route for (S,G) instead.
- If both SMETs are received, for example, (*,G1) and (S,G1), both S-PMSI types are generated, with the different mLDP tree information (in that way, a receiver only interested in (S,G) would not attract (*,G) traffic).
- Interested PEs with local receivers for the (S,G) join the new tree. In the example, only PE2 joins the spmsi tree, because it is the only PE with a local MFIB entry for (*,G1).
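The relationship between received SMET routes and the S-PMSI A-D routes generated by the root PE can be sketched as follows. It is a simplified model that assumes the threshold check has already been done elsewhere; the tuple layout and the function name are hypothetical.

```python
def spmsi_routes_to_signal(smets, groups_over_threshold):
    """Return the (source, group) keys for which an S-PMSI A-D route is signaled.

    smets: list of (originator PE, source, group) tuples from received SMET
           routes, with "*" as source for (*,G) state.
    groups_over_threshold: groups whose configured threshold is exceeded.
    """
    keys = {(src, grp) for _pe, src, grp in smets if grp in groups_over_threshold}
    return sorted(keys)


# Only a (*,G1) SMET is received: one S-PMSI A-D route for (*,G1).
print(spmsi_routes_to_signal([("PE2", "*", "G1")], {"G1"}))
# Both (*,G1) and (S1,G1) SMETs are received: both S-PMSI types are signaled.
print(spmsi_routes_to_signal([("PE2", "*", "G1"), ("PE3", "S1", "G1")], {"G1"}))
```

Keeping separate keys for (*,G1) and (S1,G1) reflects the point made above: each key gets its own mLDP tree information.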
*A:PE-2>config>service>vpls>provider-tunnel# tree detail
selective
|
+---data-delay-interval <seconds>
| no data-delay-interval
|
+---mldp
| no mldp
|
+---wildcard-spmsi
| no wildcard-spmsi
|
+---data-threshold {<c-grp-ip-addr/mask>|<c-grp-ip-addr> <netmask>} <s-pmsi-threshold> [pe-threshold-add <pe-threshold-add>] [pe-threshold-delete <pe-threshold-delete>]
data-threshold <c-grp-ipv6-addr/prefix-length> <s-pmsi-threshold> [pe-threshold-add <pe-threshold-add>] [pe-threshold-delete <pe-threshold-delete>]
no data-threshold {<c-grp-ip-addr/mask>|<c-grp-ip-addr> <netmask>}
no data-threshold <c-grp-ipv6-addr/prefix-length>|
|
+---maximum-p2mp-spmsi <range>
| no maximum-p2mp-spmsi
|
+---no shutdown
| shutdown
Where:
- The selective container and the commands above are supported in VPLS services in evpn-proxy mode and R-VPLS services in OISM mode, in particular, in all ordinary BDs and the SBD of MEG/PEG nodes.
- group-address/mask - specifies an IPv4 or IPv6 multicast group address and netmask length. Multiple group-address/masks can be specified. In case of overlapping ranges, for a specific group, only the longest prefix match is used. For instance, if the following two overlapping ranges are configured, and an SMET route for (*,232.0.1.1) is received, the S-PMSI tree for (*,232.0.1.1) is created only when the bandwidth exceeds the 10 kbps threshold.
*A:PE-4>config>service>vpls>pt>selective$ info
----------------------------------------------
data-threshold 232.0.0.0/16 0
data-threshold 232.0.1.0/24 10
----------------------------------------------
- s-pmsi-threshold - rate in kbps. If the rate for a given (S,G) or (*,G) within the specified group range exceeds the threshold, traffic for the (S,G) or (*,G) included in the group range is switched to the selective provider tunnel. Threshold 0 is also supported for mLDP. When threshold 0 is configured, the (S,G) or (*,G) switches to S-PMSI as soon as it is learned in the SBD/BD.
- pe-threshold-add - specifies the number of receiver PEs for creating the S-PMSI. When the number of receiver PEs for a given multicast group configuration is non-zero and below this value, and the bandwidth threshold is satisfied, the S-PMSI is created. The number of receiver PEs is derived from the SMET count (of routes included in the group range) for the SBD/BD. The originator-IP of the SMET route is checked so that the same PE is not counted multiple times. For example, for a (*,G1) S-PMSI set up by PE1, if PE2 has a local receiver for (S1,G1) and another one for (S2,G1), PE2 issues two SMET routes. However, those are received by PE1 with the same originator-IP and therefore they count as one PE. The pe-threshold-add value dictates when to bring back the S-PMSI tunnel after the receiver PE count has hit pe-threshold-delete and the S-PMSI tunnel for the group has been deleted. It has no implication on when to set up the S-PMSI tunnel initially, because the router always waits for the s-pmsi-threshold to be exceeded.
- pe-threshold-delete - specifies the number of receiver PEs needed to delete the S-PMSI. When the number of receiver PEs for a given multicast group configuration is above the threshold, the S-PMSI is deleted and the multicast group is moved to ingress replication EVPN destinations or a wildcard-spmsi if configured, or potentially to a (*,G) P2MP if the MFIB was previously using an (S,G) PMSI. It is recommended that the delete threshold is significantly larger than the add threshold, to avoid re-signaling of the S-PMSI as the receiver PE count fluctuates.
- Note that the threshold add/delete commands are based on SMET route counts, which do not always match the number of receivers in the network for a specific (*,G) or (S,G). For instance:
- SMETs may be received from non-spmsi enabled PEs. These routes are counted; however, the receivers on these PEs do not get the traffic because they do not support spmsi trees.
- SMETs from a PE can be aggregated, for example, for local (*,G), (S1..Sn,G) state, SMETs are aggregated into a single (*,G) SMET. That does not provide a clear indication of the number of receivers for a specific group on the root PE.
- Examples of how these thresholds work are shown in the following tables, assuming pe-threshold-add 2 pe-threshold-delete 5:

Table 6. Receiver PE count rising thresholds
| Receiver PE count (rising) (based on SMET routes) | PMSI used by the root PE | Effect |
| --- | --- | --- |
| 0→1 | Selective | PE count < pe-threshold-add. S-PMSI used to carry traffic |
| 1→2 | Selective | PE count < pe-threshold-delete. Traffic remains on S-PMSI |
| 2→3 | Selective | PE count < pe-threshold-delete. Traffic remains on S-PMSI |
| 3→4 | Selective | PE count < pe-threshold-delete. Traffic remains on S-PMSI |
| 4→5 | wildcard-spmsi if exists or EVPN destinations (Ingress Replication) | PE count = pe-threshold-delete. Traffic switched back to wildcard-spmsi if exists or IR otherwise. Or potentially a (*,G) SPMSI if the MFIB was previously using it before moving to (S,G) SPMSI |

Table 7. Receiver PE count falling thresholds
| Receiver PE count (falling) (based on SMET routes) | PMSI used by root PE | Effect |
| --- | --- | --- |
| 5 | wildcard-spmsi if exists or EVPN destinations (Ingress Replication) | Traffic flows on wildcard-spmsi if exists or IR. Or potentially a (*,G) SPMSI if the SMETs are for (S,G) |
| 5→4 | wildcard-spmsi if exists or EVPN destinations (Ingress Replication) | PE count > pe-threshold-add. Traffic remains on wildcard-spmsi if exists or IR. Or potentially a (*,G) SPMSI if the SMETs are for (S,G) |
| 4→3 | wildcard-spmsi if exists or EVPN destinations (Ingress Replication) | PE count > pe-threshold-add. Traffic remains on wildcard-spmsi if exists or IR. Or potentially a (*,G) SPMSI if the SMETs are for (S,G) |
| 3→2 | Selective | PE count = pe-threshold-add. S-PMSI re-signaled. Traffic switched to S-PMSI. |
| 2→1 | Selective | Traffic flows on S-PMSI |

- maximum-p2mp-spmsi - determines the maximum number of originating spmsi tunnels in the service (including the wildcard-spmsi). This limit is not validated against the total number of p2mp tunnels supported in the system.
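The receiver PE counting and the add/delete hysteresis shown in Table 6 and Table 7 can be modeled with the following sketch. It assumes the bandwidth threshold is already satisfied, follows the table rows for the boundary cases (the S-PMSI is deleted when the count reaches pe-threshold-delete and re-signaled when it falls to pe-threshold-add), and treats a count of zero as "no S-PMSI", which is an assumption. All function names are hypothetical.

```python
def receiver_pe_count(smets):
    """Count receiver PEs by unique SMET originator IP.

    smets: list of (originator IP, source, group) tuples within the group range.
    """
    return len({originator for originator, _src, _grp in smets})


def next_spmsi_state(using_spmsi, pe_count, pe_add=2, pe_delete=5):
    """Return True if the root PE uses the S-PMSI after a count change."""
    if pe_count == 0:
        return False  # assumption: nothing to signal without receiver PEs
    if using_spmsi:
        return pe_count < pe_delete  # deleted once the delete threshold is hit
    return pe_count <= pe_add        # brought back at or below the add threshold


def replay(counts, using_spmsi=False):
    """Apply a sequence of receiver PE counts and record the resulting states."""
    states = []
    for count in counts:
        using_spmsi = next_spmsi_state(using_spmsi, count)
        states.append(using_spmsi)
    return states


# Two SMET routes from the same originator count as one receiver PE.
print(receiver_pe_count([("192.0.2.2", "S1", "G1"), ("192.0.2.2", "S2", "G1")]))  # 1
print(replay([1, 2, 3, 4, 5]))                # rising:  [True, True, True, True, False]
print(replay([4, 3, 2, 1], using_spmsi=False))  # falling: [False, False, True, True]
```

The gap between the two thresholds is what prevents the S-PMSI from being re-signaled on every small change in the receiver PE count.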
Other parameters are configured as in the provider-tunnel inclusive context.
Selective Provider Tunnels are also supported in MEG/PEG gateways. In a MEG/PEG scenario, when the source is attached to an OISM PE, the PE may not use the S-PMSI tree for a given (x,G) if the only OIF is the MEG/PEG Designated Router (DR). This is because of the way the implementation handles the wildcard SMET versus specific SMET routes in the MFIB (a specific SMET does not create an entry if there is a wildcard SMET from the same PE). Suppose MEG1 and MEG2 are the two MEGs between an OISM and an MVPN network, where the receivers are in the MVPN network and the source is in the OISM domain. In that case:
- An OISM PE would create only a (*,*) entry if it received a wildcard SMET and a (S,G) SMET from the MEG DR. Therefore even if the threshold for (S,G) is exceeded, the OISM PE still uses the wildcard S-PMSI as opposed to the more specific S-PMSI.
- In addition, the same OISM PE would create an OIF to the non-DR MEG and a (S,G) entry (with OIFs to the two MEGs) if it received a (S,G) SMET from the non-DR MEG.
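The two cases above can be modeled with a small Python sketch. It is a simplified illustration under stated assumptions, not the MFIB implementation: the build_mfib function and the tuple layout are hypothetical, MEG1 is assumed to be the DR, and the (S,G) values are example addresses.

```python
from collections import defaultdict

WILDCARD = ("*", "*")


def build_mfib(smets):
    """Map (source, group) to the set of PEs in the OIF list.

    smets is an iterable of (pe, source, group) tuples, one per received SMET route.
    """
    smets = list(smets)
    wildcard_pes = {pe for pe, src, grp in smets if (src, grp) == WILDCARD}
    mfib = defaultdict(set)
    for pe in wildcard_pes:
        mfib[WILDCARD].add(pe)
    for pe, src, grp in smets:
        if (src, grp) == WILDCARD or pe in wildcard_pes:
            continue  # a specific SMET creates no entry if the same PE sent a wildcard SMET
        mfib[(src, grp)].add(pe)
    for key, oifs in mfib.items():
        if key != WILDCARD:
            oifs.update(wildcard_pes)  # PEs that asked for everything also receive (S,G)
    return dict(mfib)


if __name__ == "__main__":
    dr_only = [("MEG1", "*", "*"), ("MEG1", "10.0.0.4", "239.0.0.4")]
    print(build_mfib(dr_only))
    print(build_mfib(dr_only + [("MEG2", "10.0.0.4", "239.0.0.4")]))
```

With SMET routes from the DR only, the sketch yields a single (*,*) entry, so the wildcard S-PMSI keeps being used. Adding the (S,G) SMET from the non-DR MEG yields an (S,G) entry whose OIF list contains both MEGs.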
Configuration examples for selective provider tunnels
The following sections provide example configurations for selective provider tunnels in EVPN proxy and OISM services.
Use of Selective Provider Tunnels in EVPN proxy services
// PE2
[ex:/configure service vpls "evpn-proxy-bd-10k"]
A:admin@PE-2# info
admin-state enable
service-id 10000
customer "1"
bgp 1 {
}
igmp-snooping {
admin-state enable
evpn-proxy {
admin-state enable
}
}
bgp-evpn {
evi 10000
routes {
sel-mcast {
advertise true
}
}
mpls 1 {
admin-state enable
ingress-replication-bum-label true
ecmp 2
auto-bind-tunnel {
resolution any
}
}
}
sap lag-1:100 {
igmp-snooping {
send-queries true
}
}
provider-tunnel {
selective {
admin-state enable
owner bgp-evpn-mpls
wildcard-spmsi true
mldp true
data-threshold {
group-prefix 224.0.0.0/4 {
threshold 0
}
}
}
}
// PE3
[ex:/configure service vpls "evpn-proxy-bd-10k"]
A:admin@PE-3# info
admin-state enable
service-id 10000
customer "1"
bgp 1 {
}
igmp-snooping {
admin-state enable
evpn-proxy {
admin-state enable
}
}
bgp-evpn {
evi 10000
routes {
sel-mcast {
advertise true
}
}
mpls 1 {
admin-state enable
ingress-replication-bum-label true
ecmp 2
auto-bind-tunnel {
resolution any
}
}
}
sap lag-1:100 {
igmp-snooping {
send-queries true
}
}
provider-tunnel {
selective {
admin-state enable
owner bgp-evpn-mpls
wildcard-spmsi true
mldp true
data-threshold {
group-prefix 224.0.0.0/4 {
threshold 0
}
}
}
}
// PE4
[ex:/configure service vpls "evpn-proxy-bd-10k"]
A:admin@PE-4# info
admin-state enable
service-id 10000
customer "1"
bgp 1 {
}
igmp-snooping {
admin-state enable
evpn-proxy {
admin-state enable
}
}
bgp-evpn {
evi 10000
routes {
sel-mcast {
advertise true
}
}
mpls 1 {
admin-state enable
ingress-replication-bum-label true
ecmp 2
auto-bind-tunnel {
resolution any
}
}
}
sap pxc-6.a:100 {
}
provider-tunnel {
selective {
admin-state enable
data-delay-interval 5
owner bgp-evpn-mpls
wildcard-spmsi true
mldp true
data-threshold {
group-prefix 224.0.0.0/4 {
threshold 0
}
group-prefix 239.0.0.0/8 {
threshold 1
}
}
}
}
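The PE4 configuration above contains two overlapping group-prefix entries under data-threshold. The following Python sketch assumes that the most specific matching prefix determines the threshold for a group, which is consistent with the 1 kbps value that this example applies to group 239.0.0.4. The threshold_for function and the PE4_THRESHOLDS dictionary are hypothetical names used only for illustration.

```python
import ipaddress

# group-prefix -> threshold in kbps, copied from the PE4 data-threshold context
PE4_THRESHOLDS = {"224.0.0.0/4": 0, "239.0.0.0/8": 1}


def threshold_for(group, thresholds):
    """Return the threshold of the longest matching group-prefix, or None."""
    addr = ipaddress.ip_address(group)
    matches = []
    for prefix, kbps in thresholds.items():
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network.prefixlen, kbps))
    if not matches:
        return None          # no group-prefix covers this group
    return max(matches)[1]   # highest prefix length wins


if __name__ == "__main__":
    print(threshold_for("239.0.0.4", PE4_THRESHOLDS))
    print(threshold_for("232.1.1.1", PE4_THRESHOLDS))
```

Under this assumption, group 239.0.0.4 uses the 239.0.0.0/8 threshold, while a group such as 232.1.1.1 falls back to the 224.0.0.0/4 threshold.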
Assuming a source with IP address 10.0.0.4 connected to PE4 starts sending multicast traffic to 239.0.0.4, PE4 detects the stream and, as soon as it exceeds the configured threshold (1 kbps), advertises an S-PMSI A-D route. Because PE2 and PE3 receive an IGMP join for (10.0.0.4,239.0.0.4), they advertise the corresponding SMET routes and the S-PMSI trees are set up:
// PE4 advertises the S-PMSI A-D route since the received stream exceeds 1 kbps:
A:PE-4#
439 2023/02/07 19:25:13.370 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.6
"Peer 1: 192.0.2.6: UPDATE
Peer 1: 192.0.2.6 - Send BGP UPDATE:
Withdrawn Length = 0
Total Path Attr Length = 100
Flag: 0x90 Type: 14 Len: 38 Multiprotocol Reachable NLRI:
Address Family EVPN
NextHop len 4 NextHop 192.0.2.4
Type: EVPN-SPMSI-AD Len: 27 RD: 192.0.2.4:10000, tag: 0, Mcast-Src-Len:
32, Mcast-Src-Addr: 10.0.0.4, Mcast-Grp-Len: 32, Mcast-Grp-Addr: 239.0.0.4, Orig Addr: 192.0.2.4/32
Flag: 0x40 Type: 1 Len: 1 Origin: 0
Flag: 0x40 Type: 2 Len: 0 AS Path:
Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
Flag: 0xc0 Type: 16 Len: 16 Extended Community:
target:64500:10000
bgp-tunnel-encap:MPLS
Flag: 0xc0 Type: 22 Len: 22 PMSI:
Tunnel-type LDP P2MP LSP (2)
Flags: (0x0)[Type: None BM: 0 U: 0 Leaf: not required]
MPLS Label 0
Root-Node 192.0.2.4, LSP-ID 0x2008
"
3 2023/02/07 19:25:13.368 UTC MAJOR: SVCMGR #2320 Base
"Service Id 10000, Dynamic vplsPmsi SDP Bind Id 32767:4294967285 was created."
Output of the MFIB and S-PMSIs in PE2 and PE4
show service id "10000" mfib statistics
===============================================================================
Multicast FIB Statistics, Service 10000
===============================================================================
Source Address Group Address Matched Pkts Matched Octets
Forwarding Rate
-------------------------------------------------------------------------------
10.0.0.4 239.0.0.4 11190 1096620
77.616 kbps
* * (mac) 0 0
0.000 kbps
-------------------------------------------------------------------------------
Number of entries: 2
===============================================================================
show service id "10000" provider-tunnel spmsi-tunnels
===============================================================================
LDP Spmsi Tunnels
===============================================================================
LSP ID : 8199
Root Address : 192.0.2.2
S-PMSI If Index : 73750
Num. Leaf PEs : 1
Uptime : 0d 04:32:59
Group Address : 239.0.0.4
Source Address : 10.0.0.4
Origin IP Address : 192.0.2.2
State : TX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
LSP ID : 8200
Root Address : 192.0.2.3
S-PMSI If Index : 73748
Num. Leaf PEs : 1
Uptime : 0d 04:33:02
Group Address : 239.0.0.4
Source Address : 10.0.0.4
Origin IP Address : 192.0.2.3
State : RX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
LSP ID : 8200
Root Address : 192.0.2.4
S-PMSI If Index : 73754
Num. Leaf PEs : 1
Uptime : 0d 00:00:32
Group Address : 239.0.0.4
Source Address : 10.0.0.4
Origin IP Address : 192.0.2.4
State : RX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
LSP ID : 8197
Root Address : 192.0.2.2
S-PMSI If Index : 73733
Uptime : 0d 04:32:59
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.2
State : TX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
LSP ID : 8198
Root Address : 192.0.2.3
S-PMSI If Index : 73747
Uptime : 0d 04:33:02
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.3
State : RX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
LSP ID : 8197
Root Address : 192.0.2.4
S-PMSI If Index : 73746
Uptime : 0d 04:33:02
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.4
State : RX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
===============================================================================
tools dump service id "10000" provider-tunnels type terminating
===============================================================================
VPLS 10000 Inclusive Provider Tunnels Terminating
===============================================================================
ipmsi (LDP) P2MP-ID Root-Addr
-------------------------------------------------------------------------------
8197 192.0.2.4
8198 192.0.2.3
8200 192.0.2.3
8200 192.0.2.4
-------------------------------------------------------------------------------
===============================================================================
VPLS 10000 Selective Provider Tunnels Terminating
===============================================================================
spmsi (LDP) Source-Addr Group-Addr Root-Addr LSP-ID Lsp-Name
-------------------------------------------------------------------------------
10.0.0.4 239.0.0.4 192.0.2.3 8200
10.0.0.4 239.0.0.4 192.0.2.4 8200
* * 192.0.2.3 8198
* * 192.0.2.4 8197
-------------------------------------------------------------------------------
Outputs in PE4
===============================================================================
Multicast FIB Statistics, Service 10000
===============================================================================
Source Address Group Address Matched Pkts Matched Octets
Forwarding Rate
-------------------------------------------------------------------------------
* * 0 0
0.000 kbps
10.0.0.4 239.0.0.4 31484 3211368
57.691 kbps
* * (mac) 0 0
0.000 kbps
-------------------------------------------------------------------------------
Number of entries: 3
===============================================================================
tools dump service id 10000 provider-tunnels type originating
===============================================================================
VPLS 10000 Inclusive Provider Tunnels Originating
===============================================================================
No Tunnels Found
-------------------------------------------------------------------------------
===============================================================================
VPLS 10000 Selective Provider Tunnels Originating
===============================================================================
spmsi (LDP) Source-Addr Group-Addr Root-Addr LSP-ID Lsp-Name
-------------------------------------------------------------------------------
10.0.0.4 239.0.0.4 192.0.2.4 8200 8200
* * 192.0.2.4 8197 8197
-------------------------------------------------------------------------------
Use of Selective Provider Tunnel in OISM service
// PE2's relevant configuration for OISM
[ex:/configure service]
A:admin@PE-2# info
vpls "BD20023" {
admin-state enable
service-id 20023
customer "1"
routed-vpls {
multicast {
ipv4 {
forward-to-ip-interface true
}
}
}
bgp 1 {
}
igmp-snooping {
admin-state enable
}
bgp-evpn {
evi 20023
mpls 1 {
admin-state enable
ingress-replication-bum-label true
auto-bind-tunnel {
resolution any
}
}
}
sap lag-1:200 {
}
provider-tunnel {
selective {
admin-state enable
owner bgp-evpn-mpls
wildcard-spmsi true
mldp true
}
}
}
vpls "SBD20001" {
admin-state enable
service-id 20001
customer "1"
routed-vpls {
multicast {
ipv4 {
forward-to-ip-interface true
}
}
}
bgp 1 {
}
igmp-snooping {
admin-state enable
}
bgp-evpn {
evi 20001
routes {
ip-prefix {
advertise true
}
sel-mcast {
advertise true
}
}
mpls 1 {
admin-state enable
auto-bind-tunnel {
resolution any
}
}
}
provider-tunnel {
selective {
admin-state enable
owner bgp-evpn-mpls
mldp true
}
}
}
vprn "oism-vprn-20000" {
admin-state enable
service-id 20000
customer "1"
ecmp 2
igmp {
interface "BD20023" {
}
}
pim {
apply-to all
ipv4 {
rpf-table both
}
interface "SBD20001" {
multicast-senders always
}
}
interface "BD20023" {
ipv4 {
primary {
address 10.0.0.2
prefix-length 24
}
neighbor-discovery {
learn-unsolicited true
proactive-refresh true
host-route {
populate dynamic {
}
}
}
vrrp 1 {
backup [10.0.0.254]
passive true
}
}
}
}
// PE4 relevant configuration for OISM
[ex:/configure service]
A:admin@PE-4# info
vpls "BD20004" {
admin-state enable
service-id 20004
customer "1"
routed-vpls {
multicast {
ipv4 {
forward-to-ip-interface true
}
}
}
bgp 1 {
}
igmp-snooping {
admin-state enable
}
bgp-evpn {
evi 20004
mpls 1 {
admin-state enable
ingress-replication-bum-label true
auto-bind-tunnel {
resolution any
}
}
}
sap pxc-6.a:200 {
}
provider-tunnel {
selective {
admin-state enable
owner bgp-evpn-mpls
wildcard-spmsi true
mldp true
data-threshold {
group-prefix 239.0.0.0/8 {
threshold 0
}
}
}
}
}
vpls "SBD20001" {
admin-state enable
service-id 20001
customer "1"
routed-vpls {
multicast {
ipv4 {
forward-to-ip-interface true
}
}
}
bgp 1 {
}
igmp-snooping {
admin-state enable
}
bgp-evpn {
evi 20001
routes {
ip-prefix {
advertise true
}
sel-mcast {
advertise true
}
}
mpls 1 {
admin-state enable
auto-bind-tunnel {
resolution any
}
}
}
provider-tunnel {
selective {
admin-state enable
owner bgp-evpn-mpls
mldp true
}
}
}
vprn "oism-vprn-20000" {
admin-state enable
service-id 20000
customer "1"
ecmp 2
igmp {
interface "BD20004" {
}
}
pim {
apply-to all
ipv4 {
rpf-table both
}
interface "SBD20001" {
multicast-senders always
}
}
interface "BD20004" {
ipv4 {
primary {
address 40.0.0.1
prefix-length 24
}
neighbor-discovery {
learn-unsolicited true
proactive-refresh true
host-route {
populate dynamic {
}
}
}
}
vpls "BD20004" {
evpn {
arp {
learn-dynamic false
advertise dynamic {
}
}
}
}
}
interface "SBD20001" {
mac 00:00:00:00:00:04
vpls "SBD20001" {
evpn-tunnel {
supplementary-broadcast-domain true
}
}
}
}
With the above configuration on PE2, PE3 and PE4, PE4 advertises an S-PMSI A-D route for group (40.0.0.4,239.0.0.4), in addition to the wildcard-spmsi route:
show router bgp routes evpn spmsi-ad rd 192.0.2.4:20004
===============================================================================
BGP Router ID:192.0.2.2 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
========================================================
BGP EVPN SPMSI AD Routes
========================================================
Flag Route Dist. Src Address
Tag Grp Address
Orig Address
--------------------------------------------------------
u*>i 192.0.2.4:20004 0.0.0.0
0 0.0.0.0
192.0.2.4
u*>i 192.0.2.4:20004 40.0.0.4
0 239.0.0.4
192.0.2.4
--------------------------------------------------------
Routes : 2
========================================================
The S-PMSI A-D route for 239.0.0.4 makes PE2 and PE3 join the Selective mLDP tree set up by PE4. The multicast group is delivered over the S-PMSI tree. The following commands show the established S-PMSI trees (which are modeled as SDP-binds at the service level and therefore consume SDP-bind resources). PE2 and PE3 join the S-PMSI tree for 239.0.0.4 on the SBD because they are not attached to the source's ordinary BD. The traffic is received at Layer 3; therefore, the statistics are seen at the VPRN level and not at the MFIB level (as in the case of EVPN proxy):
show service id "20001" provider-tunnel spmsi-tunnels detail
===============================================================================
LDP Spmsi Tunnels
===============================================================================
LSP ID : 8199
Root Address : 192.0.2.4
S-PMSI If Index : 73752
Num. Leaf PEs : 1
Uptime : 0d 14:45:11
Group Address : 239.0.0.4
Source Address : 40.0.0.4
Origin IP Address : 192.0.2.4
State : RX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
LSP ID : 8198
Root Address : 192.0.2.4
S-PMSI If Index : 73751
Uptime : 0d 14:45:11
Group Address : * (wildcard)
Source Address : *
Origin IP Address : 192.0.2.4
State : RX Joined
Remain Delay Intvl : 0
-------------------------------------------------------------------------------
===============================================================================
tools dump service id "20001" provider-tunnels type terminating
===============================================================================
VPLS 20001 Inclusive Provider Tunnels Terminating
===============================================================================
No Tunnels Found
-------------------------------------------------------------------------------
===============================================================================
VPLS 20001 Selective Provider Tunnels Terminating
===============================================================================
spmsi (LDP) Source-Addr Group-Addr Root-Addr LSP-ID Lsp-Name
-------------------------------------------------------------------------------
40.0.0.4 239.0.0.4 192.0.2.4 8199
* * 192.0.2.4 8198
-------------------------------------------------------------------------------
show router "20000" pim group detail
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.0.0.4
Source Address : 40.0.0.4
RP Address : 0
Advt Router :
Flags : Type : (S,G)
Mode : sparse
MRIB Next Hop : 40.0.0.4
MRIB Src Flags : direct
Keepalive Timer : Not Running
Up Time : 0d 15:42:23 Resolved By : rtable-u
Up JP State : Joined Up JP Expiry : 0d 00:00:14
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
Register State : No Info
Reg From Anycast RP: No
Rpf Neighbor : 40.0.0.4
Incoming Intf : SBD20001
Outgoing Intf List : BD20023
Curr Fwding Rate : 67.200 kbps
Forwarded Packets : 29945 Discarded Packets : 0
Forwarded Octets : 2515380 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 1
===============================================================================
EVPN-VPWS PW headend functionality
EVPN-VPWS is often used as an aggregation technology to connect access devices to the residential or business PE in the service provider network. The PE receives tagged traffic inside EVPN-VPWS circuits and maps each tag to a different service in the core, such as ESM services, Epipe services, or VPRN services.
SR OS implements this PW headend functionality by using PW ports that use multihomed Ethernet Segments (ESs) for redundancy. ESs can be associated with PW ports in two different modes of operation.
- PW port-based ESs with multihoming procedures on PW SAPs
- PW port-based ESs with multihoming procedures on stitching Epipe
PW port-based ESs with multihoming procedures on PW SAPs
PW ports in ESs and virtual ESs (vESs) are supported for EVPN-VPWS MPLS services. In addition to LAG, port, and SDP objects, PW port ID can be configured in an Ethernet Segment. In this mode of operation, PW port-based ESs only support all-active configuration mode, and not single-active configuration mode.
The following requirements apply:
- Port-based or FPE-based PW ports can be used in ESs
- PW port scenarios supported along with ESs are as follows:
- port-based PW port
- FPE-based PW port, where the stitching service uses a spoke SDP to the access CE
- FPE-based PW port, where the stitching service uses EVPN-VPWS (MPLS) to the access CE
For all the preceding scenarios, fault-propagation to the access CE only works in the case of physical failures. Administrative shutdown of individual Epipes, PW SAPs, ESs or BGP-EVPN may result in traffic black holes.
The following figure shows the use of PW ports in ESs. In this example, an FPE-based PW port is associated with the ES, where the stitching service itself also uses EVPN-VPWS.
In this example, the following conditions apply:
- Redundancy is driven by EVPN all-active multihoming. ES-1 is a virtual ES configured on the FPE-based PW port on PE-1 and PE-2.
- The access network between the access PE (PE-A) and the network PEs (PE-1 and PE-2) uses EVPN-VPWS to backhaul the traffic. Therefore, PE-1 and PE-2 use EVPN-VPWS in the PW port stitching service, where:
- PE-1 and PE-2 apply the same Ethernet tag configuration on the stitching service (Epipe 10)
- Optionally PE-1 and PE-2 can use the same RD on the stitching service
- AD per-EVI routes for the stitching service Ethernet tags are advertised with ESI=0
- Forwarding in the CE-1 to CE-2 or CE-3 direction works as follows:
- PE-A forwards traffic based on the selection of the best AD per-EVI route advertised by PE-1 and PE-2 for the stitching Epipe 10. This selection can be either BGP-based if PE-1 and PE-2 use the same RD in the stitching service, or EVPN-based if different RDs are used.
- When the PE-1 route is selected, PE-1 receives the traffic on the local PW-SAP for Epipe 1 or Epipe 2, and forwards it based on the customer EVPN-VPWS rules in the core.
- Forwarding in the CE-2 or CE-3 to CE-1 direction works as follows:
- PE-3 forwards the traffic based on the configuration of ECMP and aliasing rules for Epipe 1 and Epipe 2.
- PE-3 can send the traffic to PE-2, and PE-2 forwards it to PE-A, so the two directions may follow different paths.
- If the user needs the traffic to follow a symmetric path in both directions, then the AD per-EVI route selection on PE-A and PE-3 can be handled so that the same PE (PE-1 or PE-2) is selected for both directions.
- For this example, the solution provides redundancy in case of node failures in PE-1 or PE-2. However, administrative shutdowns configured on some objects are not propagated to PE-A, which leads to traffic black holes. As a result, black holes may be caused by the following events in PE-1 or PE-2:
- Epipe 1 or Epipe 2 service shutdown
- Epipe 1 or Epipe 2 BGP-EVPN MPLS shutdown
- vES-1 shutdown
- BGP shutdown
PW port-based ESs with multihoming on stitching Epipe
The solution described in PW port-based ESs with multihoming procedures on PW SAPs provides PW-headend redundancy where the access PE selects one of the PW-headend PE devices based on BGP best path selection, and the traffic from the core to the access may follow an asymmetric path. This is because the multihoming procedures are actually run on the PW SAPs of the core services, and the AD per-EVI routes advertised in the context of the stitching Epipe use an ESI=0.
SR OS also supports a different mode of operation called pw-port headend which allows running the multihoming procedures in the stitching Epipe and, therefore, use regular EVPN-VPWS primary or backup signaling to the access PE. The mode of operation is supported in a single-active mode shown in the following figure.
The following configuration triggers the needed behavior:
// ES and stitching Epipe config
PE-1/2>config>service# info
    system
        bgp-evpn
            ethernet-segment "ES-1" create
                esi 00:12:12:12:12:12:12:12:12:12
                multi-homing single-active
                pw-port 1 pw-headend
                no shutdown
    epipe 300 name "stitching-300" customer 1 create
        pw-port 1 fpe 1 create
            no shutdown
        bgp-evpn
            local-attachment-circuit ac-23 eth-tag 23
            remote-attachment-circuit ac-1 eth-tag 1
            mpls bgp 1
                auto-bind-tunnel resolution any
// Services config
    epipe 10
        sap pw-1:10 create
        bgp-evpn
            mpls bgp 1
    epipe 11
        sap pw-1:11 create
        bgp-evpn
            mpls bgp 1
The configuration and functionality are divided into four aspects.
Configuration of single-active multihoming on ESs associated with PW ports of type pw-headend
In this mode, PW Ports are associated with single-active non-virtual Ethernet Segments. The pw-headend keyword is needed when associating the PW port.
PE-1/2>config>service# info
    system
        bgp-evpn
            ethernet-segment "ES-1" create
                esi 00:12:12:12:12:12:12:12:12:12
                multi-homing single-active
                pw-port 1 pw-headend
                no shutdown
The pw-port id pw-headend command indicates to the system that the multihoming procedures are run in the PW port stitching Epipe and that the routes advertised in the context of the stitching Epipe contain the ESI of the ES.
Configuration of the PW port stitching Epipe
A configuration example of the stitching Epipe follows.
epipe 300 name "stitching-300" customer 1 create
    pw-port 1 fpe 1 create
        no shutdown
    bgp-evpn
        local-attachment-circuit ac-23 eth-tag 23
        remote-attachment-circuit ac-1 eth-tag 1
        mpls bgp 1
            auto-bind-tunnel resolution any
The preceding example shows the configuration of a stitching EVPN-VPWS Epipe with MPLS transport; however, SRv6 transport is also supported.
When the ES is configured with a PW port in pw-headend mode, the stitching Epipe associated with the PW port is now running the ES and DF election procedures. Therefore, the following actions apply:
- an AD per-ES route is advertised with:
- the RD or RT of the stitching Epipe
- the configured ESI of the ES associated with the PW port
- the ESI-label extended community with the multihomed mode indication and ESI label
- an AD per EVI route is advertised with:
- the RD or RT of the stitching Epipe
- the configured ESI where the PW port resides
- the P/B bits according to the DF election procedures
- the non-DF drives the PW port operationally down with a flag MHStandby. As a result, all the PW SAPs contained in the PW port are brought operationally down. Optionally, the config>service>epipe>pw-port>oper-up-on-mhstandby command can be configured so that the PW port stays operationally up even if it is in MHStandby state (that is, the PE is non-DF). This command may speed up convergence in case a significant number of PW SAPs are configured in the same PW port.
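The effect of the DF election on the PW port, as described in the last item above, can be summarized with the following Python sketch. It is an illustration only: the pw_port_status function is hypothetical, and it assumes that no other failure (for example, a stitching service failure) affects the PW port.

```python
def pw_port_status(is_df: bool, oper_up_on_mhstandby: bool = False):
    """Return (oper_up, flags) for a PW port in pw-headend mode on one PE."""
    # The non-DF PE raises the MHStandby flag on the PW port.
    flags = set() if is_df else {"MHStandby"}
    # MHStandby brings the port down unless oper-up-on-mhstandby is configured.
    oper_up = is_df or oper_up_on_mhstandby
    return oper_up, flags


if __name__ == "__main__":
    print(pw_port_status(is_df=True))
    print(pw_port_status(is_df=False))
    print(pw_port_status(is_df=False, oper_up_on_mhstandby=True))
```

In the default case the non-DF PW port is operationally down, and with it all contained PW SAPs. With oper-up-on-mhstandby the port stays up while still flagged as MHStandby.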
Configuration of the PW port-contained PW SAPs and edge services
The edge services that contain the PW SAPs of a PW port in pw-headend mode are configured without any additional commands. These PW SAPs can be configured on Epipes, VPRN interfaces, subscriber interfaces, or VPLS services (capture SAPs). As an example, the PW SAP can be configured on an Epipe EVPN-VPWS service as follows:
epipe 10
    sap pw-1:10 create
    bgp-evpn
        mpls bgp 1
- The PW SAP is brought operationally down if the PW port is down. The PW port goes down with the reason MHStandby if the PE is a non-DF, or with reason stitching-svc-down if the EVPN destination is removed from the stitching Epipe.
- If the PW SAP is configured in an EVPN-VPWS edge service as in the preceding example, the following actions are performed:
- An AD per ES route is advertised for the EVPN-VPWS service with the RD or RT of the service Epipe, the configured ESI of the ES associated with the PW port, and the ESI-label extended community with the multihomed mode indication of the ES and ESI label (label is the same value as in the AD per ES for the stitching Epipe). If the PW port is only down because of the MHStandby flag, the AD per ES route for the Epipe service is still advertised.
- In addition, an AD per EVI route is advertised with the RD or RT of the service Epipe, the configured ESI of the ES associated with the PW port, and the P/B flags of the ES:
- P=1/B=0 on the DF
- P=0/B=1 on backup
- P=0/B=0 on non-DFs and non-backup
- If the PW port is down only because of MHStandby, the AD per EVI route for the service Epipe is still advertised.
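The P/B flag values listed above map directly to the role of the PE in the ES. The following minimal Python sketch restates that mapping; the pb_flags function and the role names ("df", "backup", and any other string for the remaining PEs) are hypothetical labels chosen for illustration.

```python
def pb_flags(role: str):
    """Return the (P, B) flags advertised in the AD per EVI route for a given role."""
    flags_by_role = {
        "df": (1, 0),      # Designated Forwarder: primary
        "backup": (0, 1),  # backup PE
    }
    # Any other PE in the ES is neither primary nor backup.
    return flags_by_role.get(role, (0, 0))


if __name__ == "__main__":
    for role in ("df", "backup", "non-df"):
        print(role, pb_flags(role))
```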
Some considerations and dependencies between the PW port and the service Epipe PW SAPs
- If all the PW SAPs associated with the FPE PW port are brought down, the following rules apply:
- the state of the PW port does not change
- no AD per-ES/EVI or ES route withdrawal toward the CE is triggered from the stitching Epipe
- Any event that brings down the PW port (except for MHStandby) triggers:
- an AD per-EVI/ES route withdrawal within the context of the stitching Epipe
- an ES route withdrawal
- an AD per-EVI/ES route withdrawal within the context of the service Epipes
- the pw-port>monitoring-oper-group command can also modify the state of the PW port driven by the state of the operational group
- When an individual PW SAP goes administratively or operationally down while the PW port is still operationally up, the following applies:
- black holes may be created for that particular service
- the withdrawal of the AD per-EVI routes for the service Epipe is triggered (not the AD per-ES route, which remains advertised if the PW port is up)
- if the PW SAP is not administratively shut down, the service Epipe AD per-ES/EVI routes mirror the AD per-ES/EVI routes of the stitching service, and they are advertised if the routes for the stitching Epipe are advertised
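The dependencies listed above can be condensed into a lookup of which routes are withdrawn for each event. The following Python sketch is only a reading aid built from those rules: the event names, the withdrawn function, and the route labels are hypothetical, and the sketch does not cover the effect of the monitoring-oper-group command.

```python
STITCHING = "stitching Epipe"
SERVICE = "service Epipe"

# event -> set of (route type, context) that are withdrawn
WITHDRAWALS = {
    # The PW port state does not change and nothing is withdrawn.
    "all-pw-saps-down": set(),
    # MHStandby alone does not trigger any withdrawal.
    "pw-port-down-mhstandby": set(),
    # Any other event that brings down the PW port.
    "pw-port-down-other": {
        ("AD per-EVI", STITCHING),
        ("AD per-ES", STITCHING),
        ("ES", None),
        ("AD per-EVI", SERVICE),
        ("AD per-ES", SERVICE),
    },
    # One PW SAP down while the PW port stays operationally up.
    "single-pw-sap-down": {("AD per-EVI", SERVICE)},
}


def withdrawn(event: str):
    """Return the routes withdrawn for a given (hypothetical) event name."""
    return WITHDRAWALS[event]


if __name__ == "__main__":
    for event in WITHDRAWALS:
        print(event, sorted(withdrawn(event), key=str))
```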
The PW SAP can also be configured on VPRN services (under regular interfaces or subscriber interfaces) and works without any special consideration, other than that a PW port in non-DF state brings down the PW SAP and, therefore, the interface. Similarly, VPLS services with capture PW SAPs support this mode of operation too.
Interaction of EVPN and other features
This section contains information about EVPN and how it interacts with other features.
Interaction of EVPN-VXLAN and EVPN-MPLS with existing VPLS features
When enabling existing VPLS features in an EVPN-VXLAN or an EVPN-MPLS enabled service, the following must be considered:
EVPN-VXLAN services are not supported on I-VPLS/B-VPLS. VXLAN cannot be enabled on those services. EVPN-MPLS is only supported in regular VPLS and B-VPLS. Other VPLS types, such as m-vpls, are not supported with either EVPN-VXLAN or EVPN-MPLS.
VPLS etree services are supported with EVPN-MPLS.
In general, no router-generated control packets are sent to the EVPN destination bindings, except for ARP, VRRP, ping, BFD and Eth-CFM for EVPN-VXLAN, and proxy-ARP/proxy-ND confirm messages and Eth-CFM for EVPN-MPLS.
The following rules apply to xSTP and M-VPLS services:
- xSTP can be configured in BGP-EVPN services. BPDUs are not sent over the EVPN bindings.
- bgp-evpn is blocked in m-vpls services; however, a different m-vpls service can manage a SAP or spoke SDP in a bgp-evpn-enabled service.
- xSTP is not supported in BGP-EVPN services that use Ethernet segments for multihoming, and an M-VPLS must not drive the state of a BGP-EVPN service that uses Ethernet segments.
In bgp-evpn enabled VPLS services, mac-move can be used in SAPs/SDP bindings; however, the MACs being learned through BGP-EVPN are not considered.
Note: MAC duplication already provides protection against mac-moves between EVPN and SAPs/SDP bindings.
disable-learning and other fdb-related tools only work for data plane learned MAC addresses.
mac-protect cannot be used in conjunction with EVPN.
Note: EVPN provides its own protection mechanism for static MAC addresses.
MAC OAM tools are not supported for bgp-evpn services, that is: mac-ping, mac-trace, mac-populate, mac-purge, and cpe-ping.
EVPN multihoming and BGP-MH can be enabled in the same VPLS service, as long as they are not enabled in the same SAP-SDP or spoke SDP. There is no limitation on the number of BGP-MH sites supported per EVPN-MPLS service.
Note: The number of BGP-MH sites per EVPN-VXLAN service is limited to 1.
SAPs/SDP bindings that belong to a specified ES but are configured on non-BGP-EVPN-MPLS-enabled VPLS or Epipe services are kept down with the StandByForMHProtocol flag.
CPE-ping is not supported on EVPN services but it is in PBB-EVPN services (including I-VPLS and PBB-Epipe). CPE-ping packets are not sent over EVPN destinations. CPE-ping only works on local active SAP or SDP bindings in I-VPLS and PBB-Epipe services.
Other commands not supported in conjunction with bgp-evpn are:
- Subscriber management commands under service, SAP, and SDP binding interfaces
- BPDU translation
- L2PT termination
- MAC-pinning
Other commands not supported in conjunction with bgp-evpn mpls are:
- SPB configuration and attributes
Interaction of PBB-EVPN with existing VPLS features
In addition to the B-VPLS considerations described in section Interaction of EVPN-VXLAN and EVPN-MPLS with existing VPLS features, the following specific interactions for PBB-EVPN should also be considered:
When bgp-evpn mpls is enabled in a b-vpls service, an i-vpls service linked to that b-vpls cannot be an R-VPLS (the allow-ip-int-bind command is not supported).
The ISID value of 0 is not allowed for PBB-EVPN services (I-VPLS and Epipes).
The ethernet-segments can be associated with b-vpls SAPs/SDP bindings and i-vpls/epipe SAPs/SDP bindings; however, the same ES cannot be associated with b-vpls and i-vpls/epipe SAP or SDP bindings at the same time.
When PBB-Epipes are used with PBB-EVPN multihoming, spoke SDPs are not supported on ethernet-segments.
When bgp-evpn mpls is enabled, eth-tunnels are not supported in the b-vpls instance.
Interaction of VXLAN, EVPN-VXLAN and EVPN-MPLS with existing VPRN or IES features
When enabling existing VPRN features on interfaces linked to VXLAN R-VPLS (static or BGP-EVPN based), or EVPN-MPLS R-VPLS interfaces, consider that the following are not supported:
- the commands arp-populate and authentication-policy
- dynamic routing protocols such as IS-IS, RIP, and OSPF
When enabling existing IES features on interfaces linked to VXLAN R-VPLS or EVPN-MPLS R-VPLS interfaces, the following commands are not supported:
- if>vpls>evpn-tunnel
- bgp-evpn>ip-route-advertisement
- arp-populate
- authentication-policy
Dynamic routing protocols such as IS-IS, RIP, and OSPF are also not supported.
Interaction of EVPN with BGP owners in the same VPRN service
SR OS allows multiple BGP owners in the same VPRN service to receive or advertise IP prefixes contained in the VPRN's route table. Specifically, the same VPRN route table can simultaneously install and process IPv4 or IPv6 prefixes for the following owners:
EVPN-IFL (EVPN Interface-less IP prefix routes)
EVPN-IFF (EVPN Interface-ful IP prefix routes)
VPN-IP (also referred to as IPVPN routes)
IP (also referred to as BGP PE-CE routes)
Different owners supported on the same VPRN shows the service architecture and the concept of different owners supported on the same VPRN.
In the example shown in Different owners supported on the same VPRN, VPRN 10 is configured with regular interfaces and R-VPLS interfaces and receives the same prefix 10.0.0.0/24 via the four owners.
EVPN-IFL routes are EVPN IP-Prefix (or type 5) routes that are imported and exported based on the VPRN bgp-evpn>mpls configuration, as described in Interface-less IP-VRF-to-IP-VRF model (IP encapsulation) for MPLS tunnels.
EVPN-IFF routes are EVPN IP-Prefix (or type 5) routes that are imported and exported based on the configuration of the R-VPLS services attached to the VPRN. EVPN-IFF routes are advertised and processed if the R-VPLS services are configured with the configure>service>vpls>bgp-evpn>ip-route-advertisement command. Although installed in the VPRN service, EVPN-IFF routes use the route distinguisher and route targets determined by the configuration in the R-VPLS, and are supported in R-VPLS services with VXLAN or MPLS encapsulations. See Interface-ful IP-VRF-to-IP-VRF with SBD IRB model for more information about EVPN-IFF routes.
In addition to EVPN-IFL and EVPN-IFF routes, BGP IP and VPN-IP families are supported on the same VPRN.
Interworking of EVPN-IFL and IPVPN in the same VPRN
This section describes the SR OS interworking details for BGP owners in the same VPRN. The behavior is compliant with draft-ietf-bess-evpn-ipvpn-interworking.
A VPRN service can be configured to support EVPN-IFL and IPVPN simultaneously. For example, the following MD CLI excerpt shows a VPRN service configured for EVPN-IFL (vprn>bgp-evpn context) and IPVPN (vprn>bgp-ipvpn context):
[ex:/configure service vprn "vprn-ipvpn-evpnifl-AL-80"]
A:admin@PE-2# info
admin-state enable
service-id 80
customer "1"
bgp-evpn {
mpls 1 {
admin-state enable
route-distinguisher "192.0.2.2:80"
vrf-target {
community "target:64500:80"
}
auto-bind-tunnel {
resolution any
}
}
}
bgp-ipvpn {
mpls {
admin-state enable
route-distinguisher "192.0.2.2:80"
vrf-target {
community "target:64500:80"
}
auto-bind-tunnel {
resolution any
}
}
}
interface "lo0" {
loopback true
ipv4 {
primary {
address 2.2.2.2
prefix-length 32
}
}
}
When EVPN-IFL and IPVPN are both enabled on the same VPRN, the following rules apply:
IPVPN and EVPN-IFL routes are treated by BGP as separate routes; that is, the selection is done at route table level and not at the BGP level.
At the route table level, IPVPN and EVPN-IFL routes may have the same route table preference (by default, this is 170 for both routes). In that case, route selection between IPVPN and EVPN-IFL routes is based on regular BGP path selection.
ECMP across IPVPN and EVPN-IFL routes for the same prefix is not supported. When vprn>ecmp is configured to 2 or greater, installing multiple equal cost next hops for the same prefix in the VPRN route table is only supported within the same route owner, IPVPN or EVPN-IFL.
When EVPN-IFL and IPVPN are both enabled in the same VPRN, by default, EVPN-IFL routes are exported into IPVPN and the other way around (CLI configuration is not required).
The configure>service>vprn>allow-export-bgp-vpn command is relevant within the same owner (either IPVPN or EVPN-IFL) and works as follows:
The command re-exports a received EVPN-IFL route into an EVPN-IFL route to a different peer.
The command also re-exports a received IPVPN route into an IPVPN route.
If EVPN-IFL and IPVPN are both configured in the same VPRN, an EVPN-IFL route is automatically re-exported into an IPVPN route. Conversely, an IPVPN route is re-exported into an EVPN-IFL route. This is true unless export policies prevent the automatic re-export function.
Route selection across EVPN-IFL and other owners in the VPRN service
This section describes the rules for route selection among EVPN-IFL, VPN-IP, and IP route table owners.
A PE may receive an IPv4 or IPv6 prefix in routes from the same or different owners, and from the same or different BGP peers. For example, prefix 10.0.0.0/24 can be received as an EVPN-IFL route and also received as a VPN-IPv4 route. Or prefix 2001:db8:1::/64 can be received in two EVPN-IFL routes with different route distinguishers from different peers. In all these examples, the router selects the best route in a deterministic way.
For EVPN-IFF route selection rules, see Route selection for EVPN-IFF routes in the VPRN service. In SR OS, the VPRN route table route selection for all BGP routes, excluding EVPN-IFF, is performed using the following ordered, tie-breaking rules:
1. valid route wins over invalid route
2. lowest origin validation state (valid<not found<invalid) wins
3. lowest RTM (route table) preference wins
4. highest local preference wins
5. shortest D-PATH wins (skipped if d-path-length-ignore is configured)
6. lowest AIGP metric wins
7. shortest AS_PATH wins (skipped if the as-path-ignore command is configured for the route owner)
8. lowest origin wins
9. lowest MED wins
10. lowest owner type wins (BGP<BGP-LABEL<BGP-VPN)
Note: BGP-VPN refers to VPN-IP and EVPN-IFL in this context.
11. EBGP wins
12. lowest route table or tunnel-table cost to the next hop (skipped if the ignore-nh-metric command is configured)
13. lowest next-hop type wins (resolution of next hop in TTM wins vs RTM) (skipped if the ignore-nh-metric command is configured)
14. lowest router ID wins (skipped if the ignore-router-id command is configured)
15. shortest cluster_list length wins
16. lowest IP address
Note: The IP address refers to the peer that advertised the route.
17. EVPN-IFL wins over IPVPN routes
18. next-hop check (IPv4 next hop wins over IPv6 next hop, and then lowest next hop wins)
Note: This is a tiebreaker if BGP receives the same prefix for VPN-IPv6 and IFL. An IPv6-Prefix received as VPN-IPv6 is mapped as IPv6 next hop, whereas the same IPv6 prefix received as IFL could have an IPv4 next hop.
19. RD check for RTM (lowest RD wins)
ECMP is not supported across EVPN-IFL and other owners, but it is supported within the EVPN-IFL owner for multiple EVPN-IFL routes received with the same IP prefix. When ECMP is configured with N paths in the VPRN, BGP orders the routes based on the previously described tie-breaking criteria, stopping after step 13 (lowest next-hop type). At that point, BGP creates an ECMP set with the best N routes.
Example:
In a scenario in which two EVPN-IFL routes are received on the same VPRN with the same prefix, 10.0.0.0/24; different RDs, 192.0.2.1:1 and 192.0.2.2; and different router IDs, 192.0.2.1 and 192.0.2.2; the following tie-breaking criteria are considered.
Assuming everything else is the same, BGP orders the routes based on the preceding criteria and prefers the route with the lowest router ID.
If vprn>ecmp=2, the two routes are treated as equal in the route table and added to the same ECMP set.
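The ordering and ECMP behavior in this example can be sketched as follows. This is a minimal illustration, not the SR OS implementation: the BgpRoute class, its field names, and its default values are hypothetical, and only a subset of the tie-breakers is modeled (RTM preference, local preference, D-PATH length, AS_PATH length, MED, next-hop cost, and router ID).

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class BgpRoute:
    # Hypothetical, reduced view of a BGP route in the VPRN route table.
    name: str
    router_id: str
    rtm_preference: int = 170
    local_pref: int = 100
    d_path_len: int = 0
    as_path_len: int = 0
    med: int = 0
    nh_cost: int = 10


def ecmp_key(route):
    # Tie-breakers modeled up to the next-hop steps of the list.
    # Local preference is negated because the highest value wins.
    return (route.rtm_preference, -route.local_pref, route.d_path_len,
            route.as_path_len, route.med, route.nh_cost)


def full_key(route):
    # Adds the first tie-breaker after the ECMP cut-off: lowest router ID.
    return ecmp_key(route) + (int(ip_address(route.router_id)),)


def select(routes, ecmp=1):
    ordered = sorted(routes, key=full_key)
    best = ordered[0]
    # Only routes tied with the best route on the ECMP key are eligible.
    candidates = [r for r in ordered if ecmp_key(r) == ecmp_key(best)]
    return candidates[:ecmp]


r1 = BgpRoute(name="route-1", router_id="192.0.2.1")
r2 = BgpRoute(name="route-2", router_id="192.0.2.2")

print([r.name for r in select([r2, r1], ecmp=1)])  # ['route-1']
print([r.name for r in select([r2, r1], ecmp=2)])  # ['route-1', 'route-2']
```

With ecmp=1 the lower router ID decides the winner; with ecmp=2 both routes are tied on every criterion up to the cut-off, so both are placed in the same ECMP set.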
Route selection for EVPN-IFF routes in the VPRN service
While the route selection in the VPRN for other BGP owners follows the criteria described in Route selection across EVPN-IFL and other owners in the VPRN service, the default selection for EVPN-IFF routes in the VPRN route table follows different rules:
By default, EVPN-IFF routes have a VPRN route table preference of 169; therefore, EVPN-IFF routes are preferred over EVPN-IFL, VPN-IP, or IP owners that have a preference of 170.
When two or more EVPN-IFF routes with the same IPv4 or IPv6 prefix and length, but with different route keys are received (for example, two routes with the same prefix and length but with different RDs), BGP hands the EVPN-IFF routes over to the EVPN application for selection. In this case, EVPN orders the routes based on their {R-VPLS Ifindex, RD, Ethernet Tag} and considers the top one for installing in the route table if ecmp is 1. If ecmp is N, the top N routes for the prefix are selected.
Example:
Consider the following two IP-Prefix routes that are received on the same R-VPLS service:
Route 1: (RD=192.0.0.1:30, Ethernet Tag=0, Prefix=10.0.0.0/24, next-hop 192.0.0.1)
Route 2: (RD=192.0.0.2:30, Ethernet Tag=0, Prefix=10.0.0.0/24, next-hop 192.0.0.2)
Because their route key is different (their RDs do not match), EVPN orders them based on R-VPLS Ifindex first, then RD, and then Ethernet Tag. Because they are received on the same R-VPLS, the Ifindex is the same on both. The top route on the priority list is Route 1, based on its lower RD. If the VPRN's ecmp command has a value of 1, only Route 1 is installed in the VPRN's route table.
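The default ordering for this example can be sketched as follows. This is an illustration under stated assumptions: the dictionary layout and the Ifindex value 5 are hypothetical, a lower Ifindex and a lower Ethernet Tag are assumed to sort first, and the RD is compared numerically (IP address, then assigned number).

```python
from ipaddress import ip_address


def rd_key(rd):
    # Turn an "ip:number" RD into a tuple of integers for comparison.
    admin, assigned = rd.rsplit(":", 1)
    return (int(ip_address(admin)), int(assigned))


def iff_select(routes, ecmp=1):
    # Order by {R-VPLS Ifindex, RD, Ethernet Tag} and keep the top N routes.
    ordered = sorted(
        routes,
        key=lambda r: (r["ifindex"], rd_key(r["rd"]), r["eth_tag"]),
    )
    return ordered[:ecmp]


routes = [
    {"name": "Route 2", "ifindex": 5, "rd": "192.0.0.2:30", "eth_tag": 0},
    {"name": "Route 1", "ifindex": 5, "rd": "192.0.0.1:30", "eth_tag": 0},
]

print([r["name"] for r in iff_select(routes, ecmp=1)])  # ['Route 1']
print([r["name"] for r in iff_select(routes, ecmp=2)])  # ['Route 1', 'Route 2']
```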
If the previously described way of selecting EVPN-IFF routes in the VPRN does not satisfy the user requirements, the configure service system bgp evpn ip-prefix-routes iff-bgp-path-selection command enables a BGP-based path selection for EVPN-IFF routes, which is equivalent to the selection followed for EVPN-IFL or IPVPN routes with the following considerations:
All the tie breakers in Route selection across EVPN-IFL and other owners in the VPRN service are valid for EVPN-IFF, except for the lowest owner type tie breaker, which does not affect EVPN-IFF routes.
Note: The ignore-nh-metric command does not exist for EVPN-IFF routes.
While the tie breakers in Route selection across EVPN-IFL and other owners in the VPRN service are also valid for comparing an EVPN-IFL and an IPVPN route for the same prefix, an EVPN-IFF and IPVPN route are never compared based on those tie breakers. They are only used to compare multiple EVPN-IFF routes for the same prefix, and only when the iff-bgp-path-selection command is configured.
If the iff-bgp-path-selection command is configured, the EVPN-IFF path selection for the N routes that form the ECMP set follows the same rules as in Route selection across EVPN-IFL and other owners in the VPRN service for EVPN-IFL or IPVPN routes.
Upon receiving the same IP prefix 10.0.0.0/24 for the same VPRN in EVPN-IFF routes with different RDs, as shown in EVPN-IFF path selection for N routes with iff-bgp-path-selection configured, the following selection criteria are used:
If the iff-bgp-path-selection command is configured, the selection is based on BGP path selection, and the selected route is the top route, based on the highest Local Preference (LP)(500>300>200).
However, if the iff-bgp-path-selection command is not configured, the bottom route is selected assuming the three routes are received on the same R-VPLS, and based on the lower RD (1.1.1.1:1<1.1.1.5:1<1.1.1.10:1).
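The effect of the command on this example can be sketched as follows. The pairing of RDs and Local Preference values for the top and middle routes is an assumption, because only the top route (highest LP) and the bottom route (lowest RD) are identified in the text; the function and field names are hypothetical.

```python
from ipaddress import ip_address


def rd_key(rd):
    # Compare RDs numerically: IP address first, then the assigned number.
    admin, assigned = rd.rsplit(":", 1)
    return (int(ip_address(admin)), int(assigned))


# Assumed pairing of RD and LP for the three routes to 10.0.0.0/24.
ROUTES = [
    {"name": "top", "rd": "1.1.1.10:1", "local_pref": 500},
    {"name": "middle", "rd": "1.1.1.5:1", "local_pref": 300},
    {"name": "bottom", "rd": "1.1.1.1:1", "local_pref": 200},
]


def best_iff_route(routes, iff_bgp_path_selection):
    if iff_bgp_path_selection:
        # BGP-based selection: highest Local Preference wins.
        return max(routes, key=lambda r: r["local_pref"])
    # Default selection on the same R-VPLS: lowest RD wins.
    return min(routes, key=lambda r: rd_key(r["rd"]))


print(best_iff_route(ROUTES, True)["name"])   # top
print(best_iff_route(ROUTES, False)["name"])  # bottom
```

The RD is parsed into integers because a plain string comparison would not reproduce the order 1.1.1.1:1<1.1.1.5:1<1.1.1.10:1 given above.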
Although, by default, EVPN-IFF routes have an RTM preference of 169 and they are preferred over the RTM preference of 170 used for the other BGP route owners, a selection across EVPN-IFF and the other owners may result if the RTM preference is changed and made equal (via import policy or config>router>bgp>preference). Note that the route table preference for EVPN-IFF routes can be changed from the default value 169 if the iff-attribute-uniform-propagation command is enabled and an import policy or the config>router>bgp>preference command is configured to change it.
If the RTM preference is changed and made equal to that of the EVPN-IFF routes, and multiple routes with the same key (different RD) are received for EVPN-IFF and another owner in the same VPRN, the selection order is as follows:
BGP (IPv4 or IPv6)
BGP-IPVPN
EVPN-IFF
EVPN-IFL
BGP path attribute propagation
A VPRN can receive and install routes for a specific BGP owner. The routes may be re-exported in the context of the same VPRN and to the same BGP owner or a different one. For example, an EVPN-IFL route can be received from peer N, installed in VPRN 1, and re-exported to peer M using family VPN-IPv4.
When re-exporting BGP routes, the original BGP path attributes are preserved without any configuration in the following cases:
EVPN-IFL route re-exported into an IPVPN route, and the other way around
EVPN-IFL route re-exported into a BGP IP route (PE-CE), and the other way around
IPVPN route re-exported into a BGP IP route (PE-CE), and the other way around
EVPN-IFL, IPVPN or BGP IP routes re-exported into a route of the same owner. For example, EVPN-IFL to EVPN-IFL, when the allow-export-bgp-vpn command is configured.
BGP path attributes to or from EVPN-IFF are not preserved by default. If BGP Path Attribute propagation is required, the configure service system bgp-evpn ip-prefix-routes iff-attribute-uniform-propagation command must be configured. BGP path attribute propagation when iff-attribute-uniform-propagation is configured shows an example of BGP Path Attribute propagation from EVPN-IFF to the other BGP owners in the VPRN when the iff-attribute-uniform-propagation command is configured.
In the example in BGP path attribute propagation when iff-attribute-uniform-propagation is configured, DGW1 propagates the received LP and communities on an EVPN-IFF route, when advertising the same prefix into any type of BGP owner route, including VPN-IPv4/6, EVPN-IFL, EVPN-IFF, IPv4, or IPv6. If the iff-attribute-uniform-propagation command is not configured on DGW1, no BGP path attributes are propagated, but are re-originated instead. The propagation in the opposite direction follows the same rules; configuration of the iff-attribute-uniform-propagation command is required.
When propagating BGP path attributes, the following criteria are considered:
The propagation is compliant with the uniform propagation described in draft-ietf-bess-evpn-ipvpn-interworking.
The following extended communities are filtered or excluded when propagating attributes:
all extended communities of type 0x06 (EVPN type). In particular, all those that are supported by routes type 5:
MAC Mobility extended community (sub-type 0x00)
EVPN Router's MAC extended community (sub-type 0x03)
BGP encapsulation extended community
all Route Target extended communities
The BGP Path Attribute propagation within the same owner is supported in the following cases:
EVPN-IFF to EVPN-IFF (route received on R-VPLS and advertised in a different R-VPLS context), assuming the iff-attribute-uniform-propagation command is configured
EVPN-IFL to EVPN-IFL (route received on a VPRN and re-advertised based on the configuration of vprn>allow-export-bgp-vpn)
VPN-IPv4/6 to VPN-IPv4/6 (route received on a VPRN and re-advertised based on the configuration of vprn>allow-export-bgp-vpn)
The propagation is supported for iBGP and eBGP as follows:
iBGP-only attributes can only be propagated to iBGP peers
non-transitive attributes are propagated based on existing rules
when peering with an eBGP neighbor, the AS_PATH is prepended by the VPRN ASN
If ECMP is enabled in the VPRN and multiple routes of the same BGP owner with different Route Distinguishers are installed in the route table, only the BGP path attributes of the best route are subject to propagation.
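The extended community filtering listed above can be sketched as follows. This is a simplified illustration, not the SR OS implementation: it operates on raw 8-byte extended communities, the function names are hypothetical, and the type and sub-type code points are taken from the IANA BGP extended communities registry (EVPN type 0x06, Route Target sub-type 0x02 under types 0x00 to 0x02, Encapsulation as type 0x03 sub-type 0x0c).

```python
EVPN_TYPE = 0x06
ROUTE_TARGET_TYPES = {0x00, 0x01, 0x02}
ROUTE_TARGET_SUBTYPE = 0x02
ENCAP_TYPE, ENCAP_SUBTYPE = 0x03, 0x0C


def is_propagated(ext_comm: bytes) -> bool:
    # Return False for extended communities that are re-originated
    # instead of being propagated to the other BGP owner.
    if len(ext_comm) != 8:
        raise ValueError("an extended community is 8 bytes long")
    ec_type, sub_type = ext_comm[0], ext_comm[1]
    if ec_type == EVPN_TYPE:
        return False  # for example MAC Mobility, EVPN Router's MAC
    if ec_type == ENCAP_TYPE and sub_type == ENCAP_SUBTYPE:
        return False  # BGP encapsulation extended community
    if ec_type in ROUTE_TARGET_TYPES and sub_type == ROUTE_TARGET_SUBTYPE:
        return False  # Route Targets
    return True


def propagate(ext_comms):
    return [ec for ec in ext_comms if is_propagated(ec)]


route_target = bytes([0x00, 0x02]) + (64500).to_bytes(2, "big") + (80).to_bytes(4, "big")
router_mac = bytes([0x06, 0x03]) + bytes.fromhex("00005e005301")
encap_vxlan = bytes([0x03, 0x0C, 0, 0, 0, 0, 0x00, 0x08])
color_51 = bytes([0x03, 0x0B, 0, 0]) + (51).to_bytes(4, "big")

kept = propagate([route_target, router_mac, encap_vxlan, color_51])
print(kept == [color_51])  # True
```

In this sketch only the Color extended community survives; the Route Target, Router's MAC, and encapsulation communities are dropped so that the re-advertising owner can originate its own values.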
BGP D-PATH attribute for Layer 3 loop protection
SR OS has a full implementation of the D-PATH attribute as described in draft-ietf-bess-evpn-ipvpn-interworking.
D-PATH is composed of a sequence of domain segments (similar to AS_PATH). Each domain segment is graphically represented as shown in the following figure.
Where:
- Each domain segment comprises <domain_segment_length, domain_segment_value>, where the domain segment value is a sequence of one or more domains.
- Each domain is represented by <DOMAIN-ID:ISF_SAFI_TYPE>, where a new domain, added by a gateway, is always prepended to the left of the existing last domain.
- The supported ISF_SAFI_TYPE values are:
  - 0 = Local ISF route
  - 1 = safi 1 (typically identifies PE-CE BGP domains)
  - 70 = evpn
  - 128 = safi 128 (IPVPN domains)
- Labeled unicast IP routes do not support D-PATH.
- The D-PATH attribute is only modified by a gateway and not by an ABR/ASBR or RR. A gateway is defined as a PE where a VPRN is instantiated, and that VPRN advertises or receives routes from multiple BGP owners (for example, EVPN-IFL and BGP-IPVPN) or multiple instances of the same owner (for example, VPRN with two BGP-IPVPN instances).
Suppose a router receives prefix P in an EVPN-IFL instance with the following D-PATH from neighbor N.
If the router imports the route in VPRN-1, BGP-EVPN SRv6 instance with domain 65000:2, the router readvertises the route to its BGP-IPVPN MPLS instance as follows.
If the router imports the route in VPRN-1, BGP-EVPN SRv6 instance with domain 65000:3, the router readvertises the route to its BGP-EVPN MPLS instance as follows.
If the router imports the route in VPRN-1, BGP-EVPN MPLS instance with domain 65000:4, the router readvertises the route to its PE-CE BGP neighbor as follows.
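The prepending behavior in these cases can be sketched as follows. This is an illustration only: the received D-PATH value 65000:1:128 is a hypothetical starting point, the function and dictionary names are assumptions, and the D-PATH is modeled as a simple list of DOMAIN-ID:ISF_SAFI_TYPE strings with the most recently added domain on the left.

```python
# ISF_SAFI_TYPE values keyed by the type of instance that imported the route.
ISF_SAFI_TYPE = {"local": 0, "pe-ce": 1, "evpn": 70, "ipvpn": 128}


def readvertise(d_path, importing_domain_id, importing_family):
    # Return the D-PATH sent when the route is readvertised to another instance.
    if importing_domain_id is None:
        # No domain ID on the importing instance: D-PATH is left unmodified.
        return list(d_path)
    new_domain = f"{importing_domain_id}:{ISF_SAFI_TYPE[importing_family]}"
    return [new_domain] + list(d_path)


received = ["65000:1:128"]  # hypothetical D-PATH received from neighbor N

# Imported in a BGP-EVPN instance with domain 65000:2, readvertised elsewhere.
print(readvertise(received, "65000:2", "evpn"))
# ['65000:2:70', '65000:1:128']

# Imported in an instance without a configured domain ID.
print(readvertise(received, None, "evpn"))
# ['65000:1:128']
```

The ISF_SAFI_TYPE that is prepended reflects the instance that imported the route, not the instance to which the route is readvertised.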
When a BGP route of families that support D-PATH is received and must be imported in a VPRN, the following rules apply:
- All domain IDs included in the D-PATH are compared with the local domain IDs configured in the VPRN. The local domain IDs for the VPRN include a list of (up to four) domain IDs configured at the vprn or vprn bgp instance level, including the domain IDs in local attached R-VPLS instances.
- If one or more D-PATH domain IDs match any local domain IDs for the VPRN, the route is not installed in the VPRN’s route table.
- In the case where the IP-VPN or EVPN route matches the import route target in multiple VRFs, the D-PATH loop detection works per VPRN. For example, for each VPRN, BGP checks if the received domain IDs match any locally configured (maximum 4) domain IDs for that VPRN. A route may have a looped domain for one VPRN and not the other. In this case, BGP installs a route only in the VPRN route table that does not have a loop; the route is not installed in the VPRN that has the loop.
- A route that is not installed in any VPRN RTM (due to the domain ID matching any of the local domain IDs in the importing VPRNs) is still kept in the RIB-IN. The route is displayed in the show router bgp routes command with a DPath Loop VRFs field, indicating the VPRN in which the route is not installed due to a loop.
- Route target-based leaking between VPRNs and D-PATH loop detection is described in the following example.
Consider an EVPN-IFL route to prefix P that is imported in VPRN 20 (configured with domain 65000:20) and leaked into VPRN 30.
When the route to prefix P is readvertised in the context of VPRN 30, which is enabled for BGP-IPVPN MPLS and BGP-EVPN MPLS, the readvertised BGP-IPVPN and BGP-EVPN routes have a D-PATH with a prepended domain 65000:20:0. That is, leaked routes are readvertised with the domain ID of the VPRN of origin and an ISF_SAFI_TYPE = 0, as described in draft-ietf-bess-evpn-ipvpn-interworking.
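The per-VPRN loop check described in the preceding rules can be sketched as follows. This is a simplified model: the function names are hypothetical, the domain 65000:30 assigned to VPRN 30 and the received D-PATH are assumed values, and the D-PATH is modeled as a list of DOMAIN-ID:ISF_SAFI_TYPE strings.

```python
MAX_DOMAIN_IDS_PER_VPRN = 4


def received_domain_ids(d_path):
    # Strip the ISF_SAFI_TYPE and keep only the domain IDs.
    return {entry.rsplit(":", 1)[0] for entry in d_path}


def check_import(d_path, local_domains_by_vprn):
    # Return (VPRNs that install the route, VPRNs that detect a loop).
    received = received_domain_ids(d_path)
    installed, looped = [], []
    for vprn, local_domains in sorted(local_domains_by_vprn.items()):
        if len(local_domains) > MAX_DOMAIN_IDS_PER_VPRN:
            raise ValueError(f"VPRN {vprn}: more than four domain IDs")
        if received & set(local_domains):
            looped.append(vprn)      # reported as DPath Loop VRFs
        else:
            installed.append(vprn)
    return installed, looped


# The route matches the import route target of both VPRNs.
local_domains = {20: ["65000:20"], 30: ["65000:30"]}
d_path = ["65000:20:0", "65000:5:128"]

print(check_import(d_path, local_domains))  # ([30], [20])
```

In this sketch the route is installed only in VPRN 30. VPRN 20 finds its own domain ID in the received D-PATH and does not install the route, although the route would still be kept in the RIB-IN.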
In the D-PATH example shown in the following figure, the different gateway PEs along the domains modify the D-PATH attribute by adding the source domain and family. If PE4 receives a route for the prefix with the domain of PE4 included in the D-PATH, PE4 does not install the route to avoid control plane loops.
In the D-PATH example shown in the following figure, DGW1 and DGW2 rely on the D-PATH attribute to automatically discard the prefixes received from the peer gateway in IPVPN and avoid loops by reinjecting the route back into the EVPN domain.
BGP D-PATH configuration
The D-PATH attribute is modified on transmission or processed on reception based on the local VPRN or R-VPLS configuration. The domain ID is configured per-BGP instance and the ISF_SAFI_TYPE is automatically derived from the instance type that imported the original route.
The domain-id is configured at service bgp instance level as a six-byte value that includes a global admin value and a local admin value, for example, 65000:1. Domain ID configuration is supported on:
- VPRN BGP-EVPN MPLS and SRv6 instances (EVPN-IFL)
- VPRN BGP-IPVPN MPLS and SRv6 instance
- R-VPLS BGP-EVPN MPLS and VXLAN instances (EVPN-IFF only – the R-VPLS is configured with the evpn-tunnel command)
- VPRN BGP neighbors (PE-CE)
- VPRN level (for local routes)
The following is an example CLI configuration:
// domain-id configuration
*[ex:configure service vprn "blue" bgp-evpn mpls 1]
*[ex:configure service vprn "blue" bgp-evpn segment-routing-v6 1]
*[ex:configure service vprn "blue" bgp-ipvpn mpls 1]
*[ex:configure service vprn "blue" bgp-ipvpn segment-routing-v6 1]
*[ex:configure service vprn "blue" bgp]
*[ex:configure service vpls "blue" bgp-evpn routes ip-prefix]
+-- domain-id <global-field:local-field>
*[ex:configure service vprn "blue"]
A:admin@PE-2#
+-- local-routes-domain-id <global-field:local-field>
// used as the domain-id for non-bgp routes in the VPRN.
// Example ‘a’
*[ex:configure service vprn "blue" bgp-ipvpn mpls 1]
domain-id 65000:1
In the preceding "example 'a'", if a VPN-IPv4 route is received from a neighbor, imported in VPRN "blue" and exported to another neighbor as EVPN, the router prepends a D-PATH segment <65000:1:IPVPN> to the advertised EVPN RT5.
// Example ‘b’
*[ex:configure service vprn "blue"]
local-routes-domain-id 65000:10
In the preceding "example 'b'", the local-routes-domain-id is configured at the vprn level. When configured, local routes (direct, static, IGP routes) are advertised with a D-PATH that contains the vprn>local-routes-domain-id.
The following additional considerations apply:
- If vprn>local-routes-domain-id is not configured, the local routes are advertised into the BGP instances with no D-PATH.
- If a VPRN BGP instance is not configured with a domain ID, the following handling applies:
- Routes imported in the VPRN BGP instance are readvertised in a different instance without modifying the D-PATH.
- Routes exported in the VPRN BGP instance are advertised with the D-PATH modified to include the domain ID of the instance that imported the route in the first place.
- Up to a maximum of four domain IDs per VPRN are supported. This includes domain IDs configured in the associated R-VPLS services.
- Modifying the domain IDs list initiates a route refresh for all address families associated with the VPRN.
BGP D-PATH and BGP best path selection
D-PATH is also considered for the BGP best path selection, as described in draft-ietf-bess-evpn-ipvpn-interworking.
As D-PATH is introduced in networks, not all the PEs may support D-PATH for BGP path selection. To guarantee compatibility in networks with PEs that do not support D-PATH, the following command determines if the D-PATH should be considered for BGP best-path selection.
[ex:/configure]
A:admin@PE-3#
router "Base" bgp best-path-selection d-path-length-ignore <boolean> // default: false
service vprn <string> bgp best-path-selection d-path-length-ignore <boolean> // default: false
service vprn <string> d-path-length-ignore <boolean> // default: false
configure service system bgp evpn ip-prefix-routes d-path-length-ignore <boolean> // default: false
The following conditions apply to the d-path-length-ignore command usage:
- When d-path-length-ignore is configured at the base router level (or vprn>bgp level for PE-CE routes), BGP ignores the D-PATH domain segment length for best path selection purposes. This ignores the D-PATH length when comparing two VPN routes or two IFL routes within the same RD. These VPN or IFL routes are processed in the main BGP instance.
- When d-path-length-ignore is configured at the VPRN router level, the VPRN RTM ignores the D-PATH domain segment length for best path selection purposes (for routes in VPRN).
- When d-path-length-ignore is configured at the service system bgp evpn ip-prefix-routes context, EVPN ignores the D-PATH length when iff-bgp-path-selection is enabled.
- When d-path-length-ignore is not configured, the D-PATH length is considered in the BGP best path selection process (at the BGP, the RTM, and IFF levels, respectively).
Configuration examples
This section describes configuration examples for stitching IPVPN and EVPN-IFL domains and the propagation of BGP path attributes for EVPN-IFF.
Example 1 - stitching IPVPN and EVPN-IFL domains
In this configuration example, IPVPN and EVPN-IFL are simultaneously configured in VPRN 80 of PE2. This allows the stitching of IPVPN and EVPN-IFL domains, as shown in Stitching IPVPN and EVPN-IFL domains.
The following is an example configuration of PE1, PE2, and PE4 for VPRN 80.
// PE1's VPRN 80
A:PE-1# configure service vprn 80
A:PE-1>config>service>vprn# info
----------------------------------------------
router-id 192.0.2.1
autonomous-system 64500
interface "lo0" create
address 1.1.1.1/32
loopback
exit
interface "local" create
address 10.0.0.254/24
sap 1/1/c1/1:80 create
exit
exit
bgp-ipvpn
mpls
auto-bind-tunnel
resolution any
exit
route-distinguisher 192.0.2.1:80
vrf-target target:64500:80
no shutdown
exit
exit
bgp
min-route-advertisement 1
group "pe-ce"
family ipv4
type external
export "export-al-to-vnf"
neighbor 10.0.0.1
local-as 64500
peer-as 81
exit
exit
no shutdown
exit
no shutdown
// PE2's VPRN 80
A:PE-2# configure service vprn 80
A:PE-2>config>service>vprn# info
----------------------------------------------
interface "lo0" create
address 2.2.2.2/32
loopback
exit
bgp-ipvpn
mpls
auto-bind-tunnel
resolution any
exit
route-distinguisher 192.0.2.2:80
vrf-target target:64500:80
no shutdown
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
route-distinguisher 192.0.2.2:80
vrf-target target:64500:80
no shutdown
exit
exit
no shutdown
----------------------------------------------
// PE4's VPRN 80
A:PE-4# configure service vprn 80
A:PE-4>config>service>vprn# info
----------------------------------------------
router-id 192.0.2.4
autonomous-system 64500
interface "lo0" create
address 4.4.4.4/32
loopback
exit
interface "local" create
address 40.0.0.254/24
sap 1/1/c1/1:80 create
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
route-distinguisher 192.0.2.4:80
vrf-target target:64500:80
no shutdown
exit
exit
bgp
min-route-advertisement 1
group "pe-ce"
family ipv4
type external
export "export-bl-to-pe"
neighbor 40.0.0.1
local-as 64500
peer-as 84
exit
exit
no shutdown
exit
no shutdown
Example 2 - propagation of BGP path attributes for EVPN-IFF
In this configuration example, the DCGW PE2 re-exports EVPN-IFF routes into EVPN-IFF (leaked) routes and EVPN-IFL routes. The BGP path attributes are propagated as shown in Propagation of BGP path attributes for EVPN-IFF. As described in BGP path attribute propagation, EVPN extended communities, BGP encapsulation extended community and route targets are not propagated but instead, re-originated.
The following is an example configuration for PE4 and PE2 (PE1 has equivalent configuration as PE4).
// PE4 services for EVPN-IFF
A:PE-4>config>service>vprn# /configure service vprn 93
A:PE-4>config>service>vprn# info
----------------------------------------------
router-id 4.4.4.4
autonomous-system 64500
interface "evi-95" create
address 94.0.0.254/24
vrrp 1 owner passive
backup 94.0.0.254
exit
vpls "evi-95"
exit
exit
interface "evi-94" create
vpls "evi-94"
evpn-tunnel
exit
exit
bgp
min-route-advertisement 1
group "pe-ce"
family ipv4
type external
export "export-al-to-vnf"
neighbor 94.0.0.1
local-as 64500
peer-as 94
exit
exit
no shutdown
exit
no shutdown
----------------------------------------------
A:PE-4>config>service>vprn# /configure service vpls 95
A:PE-4>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
stp
shutdown
exit
sap 1/1/c1/1:90 create
no shutdown
exit
no shutdown
----------------------------------------------
A:PE-4>config>service>vpls# /configure service vpls 94
A:PE-4>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
vxlan instance 1 vni 94 create
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement
evi 94
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
// PE2 config
A:PE-2# configure service vprn 90
A:PE-2>config>service>vprn# info
----------------------------------------------
interface "evi-91" create
vpls "evi-91"
evpn-tunnel
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
route-distinguisher 192.0.2.2:90
vrf-export "leak-color-51-into-93"
vrf-target import target:64500:90
no shutdown
exit
exit
no shutdown
----------------------------------------------
A:PE-2>config>service>vprn# /configure service vpls 91
A:PE-2>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
vxlan instance 1 vni 91 create
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement
evi 91
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
A:PE-2>config>service>vpls# /configure service vprn 93
A:PE-2>config>service>vprn# info
----------------------------------------------
interface "evi-94" create
vpls "evi-94"
evpn-tunnel
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
route-distinguisher 192.0.2.2:93
vrf-export "leak-color-51-into-90"
vrf-target import target:64500:93
no shutdown
exit
exit
no shutdown
----------------------------------------------
A:PE-2>config>service>vprn# /configure service vpls 94
A:PE-2>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
vxlan instance 1 vni 94 create
exit
bgp
exit
bgp-evpn
no mac-advertisement
ip-route-advertisement
evi 94
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
A:PE-2>config>service>vpls# /show router policy "leak-color-51-into-90"
entry 10
from
community "color-51"
exit
action accept
community add "RT64500:90" "RT64500:93"
exit
exit
default-action accept
community add "RT64500:93"
exit
A:PE-2>config>service>vpls# /show router policy "leak-color-51-into-93"
entry 10
from
community "color-51"
exit
action accept
community add "RT64500:90" "RT64500:93"
exit
exit
default-action accept
community add "RT64500:90"
exit
Example 3 - D-PATH configuration
The example in the following figure shows a typical Layer 3 EVPN DC gateway scenario where EVPN-IFF routes are translated into IPVPN routes, and vice versa. Because redundant gateways are used, this scenario is subject to Layer 3 routing loops, and the D-PATH attribute helps prevent these loops automatically, without the need for extra routing policies to tag or drop routes.
The following is the configuration of the VPRN or R-VPLS services in DGW1 and DGW2 in the preceding figure.
A:DGW1# configure service vprn 20
A:DGW1>config>service>vprn# info
----------------------------------------------
interface "sbd-1" create
vpls "sbd-1"
evpn-tunnel
exit
exit
segment-routing-v6 1 create
locator "LOC-1"
function
end-dt46
exit
exit
exit
bgp-ipvpn
segment-routing-v6
route-distinguisher 192.0.2.1:20
srv6-instance 1 default-locator "LOC-1"
source-address 2001:db8::1
vrf-target target:64500:20
domain-id 65000:2
no shutdown
exit
exit
no shutdown
*A:DGW1# configure service vpls "sbd-1"
*A:DGW1>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
vxlan instance 1 vni 1 create
exit
bgp
exit
bgp-evpn
evi 1
ip-route-advertisement domain-id 65000:1
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
A:DGW2# configure service vprn 20
A:DGW2>config>service>vprn# info
----------------------------------------------
interface "sbd-1" create
vpls "sbd-1"
evpn-tunnel
exit
exit
segment-routing-v6 1 create
locator "LOC-1"
function
end-dt46
exit
exit
exit
bgp-ipvpn
segment-routing-v6
route-distinguisher 192.0.2.2:20
srv6-instance 1 default-locator "LOC-1"
source-address 2001:db8::2
vrf-target target:64500:20
domain-id 65000:2
no shutdown
exit
exit
no shutdown
*A:DGW2# configure service vpls "sbd-1"
*A:DGW2>config>service>vpls# info
----------------------------------------------
allow-ip-int-bind
exit
vxlan instance 1 vni 1 create
exit
bgp
exit
bgp-evpn
evi 1
ip-route-advertisement domain-id 65000:1
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
The following considerations apply to the example configuration shown in Use of D-PATH for Layer 3 DC gateway redundancy.
- Imported VPN-IP SRv6 routes are readvertised as EVPN-IFF VXLAN routes with a prepended D-PATH domain 65000:2:128.
- Imported EVPN-IFF VXLAN routes are readvertised as VPN-IP SRv6 routes with a prepended D-PATH domain 65000:1:70.
If PE1 sends an EVPN-IFF route 10.0.0.0/24 that is imported by both DGW1 and DGW2, then, when DGW1 and DGW2 receive each other’s routes, they identify the D-PATH attribute and compare the list of domains with the locally configured domains in the VPRN. Because the domain matches one of the local domains, the route is not installed in the VPRN route table and is flagged as a looped route (the show router bgp routes detail or hunt commands show DPath Loop VRFs: 20). This prevents the loop.
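The domain comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the SR OS implementation: the names `DPathDomain`, `parse_domain`, and `is_dpath_loop` are hypothetical, and the `global:local:safi` text form simply mirrors the 65000:1:70 and 65000:2:128 values used in this example.

```python
from typing import NamedTuple


class DPathDomain(NamedTuple):
    domain_id: str  # for example "65000:1"
    isf_safi: int   # for example 70 or 128, as in the example above


def parse_domain(text: str) -> DPathDomain:
    """Parse a 'global:local:safi' string such as '65000:1:70' (assumed text form)."""
    global_admin, local_admin, safi = text.split(":")
    return DPathDomain(f"{global_admin}:{local_admin}", int(safi))


def is_dpath_loop(d_path, local_domain_ids) -> bool:
    """Return True if any domain in the received D-PATH matches a local domain-id."""
    return any(parse_domain(entry).domain_id in local_domain_ids for entry in d_path)


# Domain-ids configured on DGW1 and DGW2 in the example (65000:1 and 65000:2).
LOCAL_DOMAINS = {"65000:1", "65000:2"}

# Route readvertised by the peer gateway with the prepended domain 65000:1:70.
looped = is_dpath_loop(["65000:1:70"], LOCAL_DOMAINS)
# Route that carries no D-PATH domains at all.
clean = is_dpath_loop([], LOCAL_DOMAINS)
```

In this sketch, a route for which `is_dpath_loop` returns True corresponds to the route that the gateway flags as looped and does not install.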
Routing policies for BGP EVPN routes
Routing policies match on specific fields when EVPN routes are imported or exported. These matching fields (excluding route table evpn ip-prefix routes, unless explicitly mentioned), are:
communities, extended-communities, and large-communities
well-known communities (no-export | no-export-subconfed | no-advertise)
family EVPN
protocol BGP-VPN (this term also matches VPN-IPv4 and VPN-IPv6 routes)
prefix lists for routes type 2 when they contain an IP address, and for type 5
route tags that can be passed by EVPN to BGP from:
service>epipe/vpls>bgp-evpn>mpls/vxlan>default-route-tag (this route-tag can be matched on export only)
service>vpls>proxy-arp/nd>evpn-route-tag (this route tag can be matched on export only)
route table route-tags when exporting EVPN IP-prefix routes
EVPN type
BGP attributes that are applicable to EVPN routes (such as AS-path, local-preference, next-hop)
Additionally, the route tags can be used on export policies to match EVPN routes that belong to a service and BGP instance, routes that are created by the proxy-arp or proxy-nd application, or IP-Prefix routes that are added to the route table with a route tag.
EVPN can pass only one route tag to BGP to achieve matching on export policies. In case of a conflict, the default-route-tag has the least priority of the three potential tags added by EVPN.
For instance, if VPLS 10 is configured with proxy-arp>evpn-route-tag 20 and bgp-evpn>mpls>default-route-tag 10, all MAC/IP routes that are generated by the proxy-arp application use route tag 20. Export policies can then use "from tag 20" to match all those routes. In this case, Inclusive Multicast routes are matched by using "from tag 10".
Routing policies for BGP EVPN IP prefixes
BGP routing policies are supported for IP prefixes imported or exported through BGP-EVPN in R-VPLS services (EVPN-IFF routes) or VPRN services (EVPN-IFL routes).
When applying routing policies to control the distribution of prefixes between EVPN-IFF and IP-VPN (or EVPN-IFL), the user must consider that these owners are completely separate as far as BGP is concerned. When prefixes are imported in the VPRN routing table, the BGP attributes are lost to the other owner, unless the iff-attribute-uniform-propagation command is configured on the router.
If the iff-attribute-uniform-propagation command is disabled, the use of route tags allows the controlled distribution of prefixes across the two families.
IP-VPN import and EVPN export BGP workflow shows an example of how VPN-IPv4 routes are imported into the RTM (Routing Table Manager) and then passed to EVPN for its own process.
Policy tags can be used to match EVPN IP prefixes that were learned not only from BGP VPN-IPv4 but also from other routing protocols. The tag range supported for each protocol is different, as follows:
<tag> : accepts in decimal or hex
[0x1..0xFFFFFFFF]H (for OSPF and IS-IS)
[0x1..0xFFFF]H (for RIP)
[0x1..0xFF]H (for BGP)
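As an illustration of the ranges listed above, the following Python sketch validates a tag given in decimal or hex against the per-protocol upper bound. The helper name `parse_tag` and the protocol keys are hypothetical and chosen only for this example; the bounds are the ones listed above.

```python
# Upper bounds per protocol, taken from the ranges listed above.
TAG_MAX = {
    "ospf": 0xFFFFFFFF,
    "isis": 0xFFFFFFFF,
    "rip": 0xFFFF,
    "bgp": 0xFF,
}


def parse_tag(text: str, protocol: str) -> int:
    """Parse a decimal ('20') or hex ('0x14') tag and check the protocol range."""
    value = int(text, 0)  # base 0 accepts both decimal and 0x-prefixed hex
    upper = TAG_MAX[protocol]
    if not 1 <= value <= upper:
        raise ValueError(f"tag {value} is outside 1..{upper:#x} for {protocol}")
    return value


bgp_tag = parse_tag("0x14", "bgp")
rip_tag = parse_tag("65535", "rip")
```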
EVPN import and IP-VPN export BGP workflow shows an example of the reverse workflow where routes are imported from EVPN and exported from RTM to BGP VPN-IPv4.
The behavior described above and the use of tags are also valid for vsi-import and vsi-export policies in the R-VPLS.
The following is a summary of the policy behavior for EVPN-IFF IP-prefixes when iff-attribute-uniform-propagation is disabled.
For EVPN-IFF routes received and imported in RTM, policy entries (peer or vsi-import) match on communities or any of the following fields, and can add tags (as action):
communities, extended-communities or large communities
well-known communities
family EVPN
protocol bgp-vpn
prefix-lists
EVPN route type
BGP attributes (as-path, local-preference, next-hop)
For exporting RTM to EVPN-IFF prefix routes, policy entries only match on tags, and based on this matching, add communities, accept, or reject. This applies to the peer level or on the VSI export level. Policy entries can also add tags for static routes, RIP, OSPF, IS-IS, BGP, and ARP-ND routes, which can then be matched on the BGP peer export policy, or on the VSI export policy for EVPN-IFF routes.
The following applies if the iff-attribute-uniform-propagation command is enabled.
For exporting RTM to EVPN-IFF prefix routes, in addition to matching on tags, matching path attributes on EVPN-IFF routes is supported in the following:
- vrf-export (when exporting the prefixes in VPN-IP or EVPN IFL or IP routes)
- vsi-export policies (when exporting the prefixes in EVPN-IFF routes)
- for non-BGP route-owners (RIP, OSPF, IS-IS, static, ARP-ND), there are no changes and the only match criterion in vsi-export for EVPN-IFF routes is tags
EVPN Weighted ECMP for IP prefix routes
SR OS supports weighted ECMP for EVPN IP prefix routes (IPv4 and IPv6), in the EVPN Interface-less (EVPN-IFL) and EVPN Interface-ful (EVPN-IFF) models.
Based on draft-ietf-bess-evpn-unequal-lb, the EVPN Link Bandwidth extended community is used in the IP Prefix routes to indicate a weight that the receiver PE must consider when load balancing traffic to multiple EVPN, CE, or both next hops. The supported weight in the extended community is of type Generalized weight and encodes the count of CEs that advertised prefix N to a PE in a BGP PE-CE route. The following figure shows the use of EVPN weighted ECMP.
In the preceding figure, some multi-rack Container Network Functions (CNFs) are connected to a few TORs in the EVPN network. Each CNF advertises the same anycast service network 10.1.1.0/24 using a single PE-CE BGP session. Without Weighted ECMP, TOR2, TOR3, and TOR4 would re-advertise the prefix in an EVPN IP-Prefix route, and flows to 10.1.1.0/24 from the Border Leaf-1 would be equally distributed among TOR2, TOR3, and TOR4. However, the needed load balancing distribution is based on the count of CNFs that are attached to each TOR. That is, out of five flows to 10.1.1.0/24, three should be directed to TOR3 (because it has three CNFs attached), one to TOR4, and one to either TOR2 or TOR1 (because CNF1 is dual-homed to both).
Weighted ECMP achieves the needed unequal load balancing based on the CNF count on each TOR. In the Weighted ECMP for IP prefix routes use case example, if Weighted ECMP is enabled, the TORs add a weight encoded in the EVPN IP Prefix route, where the weight matches the count of CNFs that each TOR has locally. The Border Leaf creates an ECMP set for prefix 10.1.1.0/24 where the weights are considered when distributing the load to the prefix.
The procedures associated with EVPN Weighted ECMP for IP Prefix routes can be divided into advertising and receiving procedures:
- Use the following commands to configure the advertising procedures for EVPN-IFL.
configure service vprn bgp-evpn mpls evpn-link-bandwidth advertise
configure service vprn bgp-evpn segment-routing-v6 evpn-link-bandwidth advertise
Use the following command to configure the advertising procedures for EVPN-IFF.
configure service vpls bgp-evpn ip-route-link-bandwidth advertise
The advertise command triggers the advertisement of the EVPN Link Bandwidth extended community with a weight that matches the CE count advertised by the route. The dynamic weight can, optionally, be overridden by configuring the advertise weight value.
- Use the following commands to configure the receiving procedures for EVPN-IFL.
configure service vprn bgp-evpn mpls evpn-link-bandwidth weighted-ecmp
configure service vprn bgp-evpn segment-routing-v6 evpn-link-bandwidth weighted-ecmp
Use the following command to configure the receiving procedures for EVPN-IFF.
configure service vpls bgp-evpn ip-route-link-bandwidth weighted-ecmp
When the weighted-ecmp command is enabled, the receiving PE installs IP Prefix routes in the VPRN route-table associated with a normalized weight that is derived from the signaled weight.
- For EVPN-IFL, for weighted ECMP across EVPN next hops and CE next hops, the following commands must be configured.
configure service vprn bgp group evpn-link-bandwidth add-to-received-bgp
configure service vprn bgp eibgp-loadbalance
- For EVPN-IFF, weighted ECMP can be applied only to EVPN next hops; it does not apply to the eibgp-loadbalance command.
EVPN-IFL MPLS service configuration
The following example shows the configuration of the EVPN Weighted ECMP feature for EVPN IFL routes with MPLS transport. A similar configuration applies to EVPN IFL routes with SRv6 transport.
Suppose PE2, PE4, and PE5 are attached to the same EVPN-IFL service on VPRN 2000. PE4 is connected to two CEs (CE-41 and CE-42) and PE5 to one CE (CE-51). The three CEs advertise the same prefix 192.168.1.0/24 using PE-CE BGP, and the goal is for PE2 to send twice as many flows for 192.168.1.0/24 to PE4 as to PE5.
The configuration of PE4 and PE5 follows:
*A:PE-4# configure service vprn 2000
*A:PE-4>config>service>vprn# info
----------------------------------------------
ecmp 10
autonomous-system 64500
interface "to-CE41" create
address 10.41.0.1/24
sap pxc-3.a:401 create
exit
exit
interface "to-CE42" create
address 10.42.0.1/24
sap pxc-3.a:402 create
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
evi 2000
evpn-link-bandwidth
advertise
weighted-ecmp
exit
route-distinguisher 192.0.2.4:2000
vrf-target target:64500:2000
no shutdown
exit
exit
bgp
multi-path
ipv4 10
exit
eibgp-loadbalance
router-id 4.4.4.4
rapid-withdrawal
group "pe-ce"
family ipv4 ipv6
neighbor 10.41.0.2
peer-as 64541
evpn-link-bandwidth
add-to-received-bgp 1
exit
exit
neighbor 10.42.0.2
peer-as 64542
evpn-link-bandwidth
add-to-received-bgp 1
exit
exit
exit
no shutdown
exit
no shutdown
A:PE-5# configure service vprn 2000
A:PE-5>config>service>vprn# info
----------------------------------------------
autonomous-system 64500
interface "to-CE51" create
address 10.51.0.1/24
sap pxc-3.a:501 create
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
evi 2000
evpn-link-bandwidth
advertise
weighted-ecmp
exit
route-distinguisher 192.0.2.5:2000
vrf-target target:64500:2000
no shutdown
exit
exit
bgp
multi-path
ipv4 10
exit
eibgp-loadbalance
router-id 5.5.5.5
rapid-withdrawal
group "pe-ce"
family ipv4 ipv6
neighbor 10.51.0.2
peer-as 64551
evpn-link-bandwidth
add-to-received-bgp 1
exit
exit
exit
no shutdown
exit
no shutdown
The configuration on PE2 follows:
*A:PE-2# configure service vprn 2000
*A:PE-2>config>service>vprn# info
----------------------------------------------
ecmp 10
interface "to-PE" create
address 20.10.0.1/24
sap pxc-3.a:2000 create
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
evi 2000
evpn-link-bandwidth
advertise
weighted-ecmp
exit
route-distinguisher 192.0.2.2:2000
vrf-target target:64500:2000
no shutdown
exit
exit
no shutdown
PE4 and PE5 IP Prefix route advertisement
As a result of the preceding configuration, PE4 (next-hop 2001:db8::4) and PE5 (next-hop 2001:db8::5) advertise the IP Prefix route from the CEs with weights 2 and 1 respectively:
show router bgp routes evpn ip-prefix prefix 192.168.1.0/24 community target:64500:2000 hunt
===============================================================================
BGP Router ID:192.0.2.2 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network : n/a
Nexthop : 2001:db8::4
Path Id : None
From : 2001:db8::4
Res. Nexthop : fe80::b446:ffff:fe00:142
Local Pref. : 100 Interface Name : int-PE-2-PE-4
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : None
AIGP Metric : None IGP Cost : 10
Connector : None
Community : target:64500:2000 evpn-bandwidth:1:2
bgp-tunnel-encap:MPLS
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 192.0.2.4
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : 64541
EVPN type : IP-PREFIX
ESI : ESI-0
Tag : 0
Gateway Address: 00:00:00:00:00:00
Prefix : 192.168.1.0/24
Route Dist. : 192.0.2.4:2000
MPLS Label : LABEL 524283
Route Tag : 0
Neighbor-AS : 64541
Orig Validation: N/A
Source Class : 0 Dest Class : 0
Add Paths Send : Default
Last Modified : 01h19m43s
Network : n/a
Nexthop : 2001:db8::5
Path Id : None
From : 2001:db8::5
Res. Nexthop : fe80::b449:1ff:fe01:1f
Local Pref. : 100 Interface Name : int-PE-2-PE-5
Aggregator AS : None Aggregator : None
Atomic Aggr. : Not Atomic MED : None
AIGP Metric : None IGP Cost : 10
Connector : None
Community : target:64500:2000 evpn-bandwidth:1:1
bgp-tunnel-encap:MPLS
Cluster : No Cluster Members
Originator Id : None Peer Router Id : 192.0.2.5
Flags : Used Valid Best IGP
Route Source : Internal
AS-Path : 64551
EVPN type : IP-PREFIX
ESI : ESI-0
Tag : 0
Gateway Address: 00:00:00:00:00:00
Prefix : 192.168.1.0/24
Route Dist. : 192.0.2.5:2000
MPLS Label : LABEL 524285
Route Tag : 0
Neighbor-AS : 64551
Orig Validation: N/A
Source Class : 0 Dest Class : 0
Add Paths Send : Default
Last Modified : 00h08m45s
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
PE2 prefix installation
The show router id route-table extensive command, when run on PE2, shows that PE2 installs the prefix with weights 2 and 1 for PE4 and PE5, respectively:
show router 2000 route-table 192.168.1.0/24 extensive
===============================================================================
Route Table (Service: 2000)
===============================================================================
Dest Prefix : 192.168.1.0/24
Protocol : EVPN-IFL
Age : 01h22m47s
Preference : 170
Indirect Next-Hop : 2001:db8::4
Label : 524283
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 2
Resolving Next-Hop : 2001:db8::4 (LDP tunnel)
Metric : 10
ECMP-Weight : N/A
Indirect Next-Hop : 2001:db8::5
Label : 524285
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 1
Resolving Next-Hop : 2001:db8::5 (LDP tunnel)
Metric : 10
ECMP-Weight : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================
*A:PE-2# show router 2000 fib 1 192.168.1.0/24 extensive
===============================================================================
FIB Display (Service: 2000)
===============================================================================
Dest Prefix : 192.168.1.0/24
Protocol : EVPN-IFL
Installed : Y
Indirect Next-Hop : 2001:db8::4
Label : 524283
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 2
Resolving Next-Hop : 2001:db8::4 (LDP tunnel)
ECMP-Weight : 1
Indirect Next-Hop : 2001:db8::5
Label : 524285
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 1
Resolving Next-Hop : 2001:db8::5 (LDP tunnel)
ECMP-Weight : 1
===============================================================================
Total Entries : 1
===============================================================================
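The effect of the 2:1 weights installed by PE2 can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the forwarding-plane algorithm used by the router: the helpers `build_buckets` and `pick_next_hop`, the flow-key string, and the choice of CRC32 as the hash are all hypothetical. The next-hop addresses and weights are the ones shown in the preceding output.

```python
import zlib


def build_buckets(weights):
    """Repeat each next hop once per unit of weight."""
    buckets = []
    for next_hop, weight in weights.items():
        buckets.extend([next_hop] * weight)
    return buckets


def pick_next_hop(flow_key, buckets):
    """Map a flow to a bucket with a deterministic hash (CRC32 is an arbitrary choice)."""
    return buckets[zlib.crc32(flow_key.encode()) % len(buckets)]


# ECMP weights from the route-table output: PE4 has weight 2, PE5 has weight 1.
weights = {"2001:db8::4": 2, "2001:db8::5": 1}
buckets = build_buckets(weights)

# Fraction of hash buckets that point to PE4.
share_pe4 = buckets.count("2001:db8::4") / len(buckets)

# A given flow always maps to the same next hop.
next_hop = pick_next_hop("10.0.0.1:40000->192.168.1.10:443", buckets)
```

With a uniform hash over many flows, about two thirds of the flows land on the buckets owned by PE4, which is the intended distribution in this example.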
EVPN-IFL handling
In case of EVPN-IFL, Weighted ECMP is also supported for EIBGP load balancing among EVPN and CE next hops. For example, PE4 installs the same prefix with an EVPN-IFL next hop and two CE next hops, and each one with its normalized weight:
show router 2000 route-table 192.168.1.0/24 extensive
===============================================================================
Route Table (Service: 2000)
===============================================================================
Dest Prefix : 192.168.1.0/24
Protocol : BGP
Age : 00h02m27s
Preference : 170
Indirect Next-Hop : 10.41.0.2
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 1
Resolving Next-Hop : 10.41.0.2
Interface : to-CE41
Metric : 0
ECMP-Weight : N/A
Indirect Next-Hop : 10.42.0.2
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 1
Resolving Next-Hop : 10.42.0.2
Interface : to-CE42
Metric : 0
ECMP-Weight : N/A
Indirect Next-Hop : 2001:db8::5
Label : 524285
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : 1
Resolving Next-Hop : 2001:db8::5 (LDP tunnel)
Metric : 10
ECMP-Weight : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================
EVPN IP aliasing for IP prefix routes
SR OS supports IP aliasing for EVPN IP prefix routes in the EVPN IFL (Interface-less) or EVPN IFF (Interface-ful) models, as described in draft-sajassi-bess-evpn-ip-aliasing.
IP aliasing allows PEs to load-balance flows to multiple PEs attached to the same prefix, even if not all of them advertise reachability to the prefix in IP prefix routes. IP aliasing works based on the following principles:
- It requires the configuration of a virtual Ethernet Segment (ES), for example, ES-1, that is associated with a vprn-next-hop and an evi configured in the vprn context. All PEs with reachability to the vprn-next-hop, via the non-EVPN route, advertise their attachment to the ES using EVPN Auto-Discovery per ES and per EVI routes in the VPRN service context.
- Any PE that receives a BGP PE-CE route for a prefix P via next-hop N, where N matches the active vprn-next-hop, advertises an IP prefix route for P with the ESI of the ES; for example, ESI-1.
- On reception, PEs importing IP prefix routes with ESI-1 install the prefix P in the route table using the next hops of the AD per-EVI routes for ESI-1, instead of the next hop of the IP prefix route.
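The resolution principle in the list above can be sketched in Python. This is a minimal model, not the router's implementation: the classes `AdPerEviRoute` and `IpPrefixRoute` and the function `resolve_next_hops` are hypothetical, and returning an empty list when no AD per-EVI route exists for the ESI is an assumption made only for this sketch. The addresses and the ESI are taken from the example later in this section.

```python
from dataclasses import dataclass

ZERO_ESI = "00:00:00:00:00:00:00:00:00:00"


@dataclass(frozen=True)
class AdPerEviRoute:
    esi: str
    next_hop: str


@dataclass(frozen=True)
class IpPrefixRoute:
    prefix: str
    esi: str
    next_hop: str


def resolve_next_hops(route, ad_routes):
    """Return the next hops to install for an imported IP prefix route."""
    if route.esi == ZERO_ESI:
        # No aliasing: use the next hop of the IP prefix route itself.
        return [route.next_hop]
    # Aliasing: use the next hops of the AD per-EVI routes for the same ESI.
    # Assumption for this sketch: an empty result means the route is unresolved.
    return sorted({ad.next_hop for ad in ad_routes if ad.esi == route.esi})


ESI_1 = "01:01:01:01:01:00:00:00:00:00"
ad_routes = [
    AdPerEviRoute(ESI_1, "192.168.0.1"),  # advertised by Leaf-1
    AdPerEviRoute(ESI_1, "192.168.0.2"),  # advertised by Leaf-2
]
prefix_route = IpPrefixRoute("11.11.11.11/32", ESI_1, "192.168.0.1")

ecmp_set = resolve_next_hops(prefix_route, ad_routes)
```

In this sketch, the ECMP set contains both leaf nodes even though only one of them advertises the IP prefix route.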
The following figure, EVPN IP aliasing in an EVPN-IFL model, shows an example of the use of IP aliasing.
In the EVPN IP aliasing in an EVPN-IFL model example shown in the preceding figure, a multi-rack Virtual Network Function (VNF) is attached to Leaf-1 and Leaf-2. Although the VNF supports a single PE-CE eBGP session to Leaf-1, the preferred behavior is for the Border-Leaf-1 to load balance the traffic toward the VNF using both Leaf-1 and Leaf-2 as next hops. EVPN IP aliasing achieves that preferred behavior based on the following configuration.
An ES L3-ES-1 is configured in Leaf-1 and Leaf-2. The ES is configured for all-active mode and is associated with the vprn-next-hop of the VNF. The association with the evi of the VPRN where the next hop is installed is also required.
Leaf-1 and Leaf-2 ES configuration (MD-CLI)
[ex:/configure service system bgp evpn]
A:admin@node2# info
ethernet-segment "L3-ES-1" {
admin-state enable
type virtual
esi 0x01010101010000000000
multi-homing-mode all-active
association {
vprn-next-hop 1.1.1.1 {
virtual-ranges {
evi 2500 { }
}
}
}
}
Leaf-1 and Leaf-2 ES configuration (classic CLI)
config>service>system>bgp-evpn# info
----------------------------------------------
ethernet-segment "L3-ES-1" virtual create
esi 01:01:01:01:01:00:00:00:00:00
service-carving
mode auto
exit
multi-homing all-active
vprn-next-hop 1.1.1.1
evi
evi-range 2500
exit
no shutdown
exit
The VPRN service configuration in Leaf-1 and Leaf-2 requires the configuration of the evi so that the ES is active on the service.
Leaf-1 VPRN configuration (MD-CLI)
[ex:/configure service vprn "2500"]
A:admin@node2# info
admin-state enable
customer "1"
autonomous-system 64502
bgp-evpn {
mpls 1 {
admin-state enable
route-distinguisher "192.168.0.1:2500"
evi 2500
vrf-target {
community "target:64500:2500"
}
auto-bind-tunnel {
resolution any
}
}
}
bgp {
min-route-advertisement 1
router-id 2.2.2.2
rapid-withdrawal true
ebgp-default-reject-policy {
import false
export false
}
next-hop-resolution {
use-bgp-routes true
}
group "pe-ce" {
multihop 10
family {
ipv4 true
ipv6 true
}
}
neighbor "1.1.1.1" {
group "pe-ce"
peer-as 64501
}
}
interface "irb1" {
ipv4 {
primary {
address 10.10.10.254
prefix-length 24
}
vrrp 1 {
backup [10.10.10.254]
owner true
passive true
}
}
vpls "BD2501" {
evpn {
arp {
learn-dynamic false
advertise dynamic {
}
}
}
}
}
interface "lo1" {
loopback true
ipv4 {
primary {
address 2.2.2.2
prefix-length 32
}
}
}
static-routes {
route 1.1.1.1/32 route-type unicast {
next-hop "10.10.10.1" {
admin-state enable
}
}
}
Leaf-1 VPRN configuration (classic CLI)
config>service>vprn 2500 # info
----------------------------------------------
autonomous-system 64502
interface "irb1" create
address 10.10.10.254/24
vrrp 1 owner passive
backup 10.10.10.254
exit
vpls "BD2501"
evpn
arp
no learn-dynamic
advertise dynamic
exit
exit
exit
exit
interface "lo1" create
address 2.2.2.2/32
loopback
exit
static-route-entry 1.1.1.1/32
next-hop 10.10.10.1
no shutdown
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
evi 2500
route-distinguisher 192.168.0.1:2500
vrf-target target:64500:2500
no shutdown
exit
exit
bgp
min-route-advertisement 1
router-id 2.2.2.2
rapid-withdrawal
next-hop-resolution
use-bgp-routes
exit
group "pe-ce"
family ipv4 ipv6
multihop 10
neighbor 1.1.1.1
peer-as 64501
exit
exit
no shutdown
exit
no shutdown
Leaf-2 VPRN configuration (MD-CLI)
[ex:/configure service vprn "2500"]
A:admin@node2# info
admin-state enable
customer "1"
bgp-evpn {
mpls 1 {
admin-state enable
route-distinguisher "192.168.0.2:2500"
evi 2500
vrf-target {
community "target:64500:2500"
}
auto-bind-tunnel {
resolution any
}
}
}
interface "irb1" {
ipv4 {
primary {
address 10.10.10.254
prefix-length 24
}
vrrp 1 {
backup [10.10.10.254]
owner true
passive true
}
}
vpls "BD2501" {
evpn {
arp {
learn-dynamic false
advertise dynamic {
}
}
}
}
}
static-routes {
route 1.1.1.1/32 route-type unicast {
next-hop "10.10.10.1" {
admin-state enable
}
}
}
Leaf-2 VPRN configuration (classic CLI)
config>service>vprn 2500 # info
----------------------------------------------
interface "irb1" create
address 10.10.10.254/24
vrrp 1 owner passive
backup 10.10.10.254
exit
vpls "BD2501"
evpn
arp
no learn-dynamic
advertise dynamic
exit
exit
exit
exit
static-route-entry 1.1.1.1/32
next-hop 10.10.10.1
no shutdown
exit
exit
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
evi 2500
route-distinguisher 192.168.0.2:2500
vrf-target target:64500:2500
no shutdown
exit
exit
no shutdown
The Border-Leaf-1 configuration also needs the addition of the evi in the VPRN. This allows the creation of ECMP-sets where the next hops of the received IP prefixes are linked to the AD per-EVI routes next hops.
Border-Leaf-1 VPRN configuration (MD-CLI)
[ex:/configure service vprn "2500"]
A:admin@node2# info
admin-state enable
customer "1"
ecmp 4
bgp-evpn {
mpls 1 {
admin-state enable
route-distinguisher "192.168.0.3:2500"
evi 2500
vrf-target {
community "target:64500:2500"
}
auto-bind-tunnel {
resolution any
}
}
}
Border-Leaf-1 VPRN configuration (classic CLI)
config>service>vprn 2500 # info
----------------------------------------------
ecmp 4
bgp-evpn
mpls
auto-bind-tunnel
resolution any
exit
evi 2500
route-distinguisher 192.168.0.3:2500
vrf-target target:64500:2500
no shutdown
exit
exit
no shutdown
Based on the preceding configuration and the reachability of next-hop 1.1.1.1 via a non-EVPN route, the two leaf nodes advertise their attachment to the ES via AD per-ES or EVI routes. Use the following command to display the advertisement status for ESI routes.
show router bgp routes evpn auto-disc esi 01:01:01:01:01:00:00:00:00:00
Advertisement of Auto-Discovery per-ES routes
===============================================================================
BGP Router ID:192.0.2.3 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP EVPN Auto-Disc Routes
===============================================================================
Flag Route Dist. ESI NextHop
Tag Label
-------------------------------------------------------------------------------
u*>i 192.168.0.1:2500 01:01:01:01:01:00:00:00:00:00 192.168.0.1
0 LABEL 524282
u*>i 192.168.0.1:2500 01:01:01:01:01:00:00:00:00:00 192.168.0.1
MAX-ET LABEL 0
u*>i 192.168.0.2:2500 01:01:01:01:01:00:00:00:00:00 192.168.0.2
0 LABEL 524283
u*>i 192.168.0.2:2500 01:01:01:01:01:00:00:00:00:00 192.168.0.2
MAX-ET LABEL 0
-------------------------------------------------------------------------------
Routes : 4
===============================================================================
At the same time, upon reception of the BGP PE-CE route from the VNF with prefix 11.11.11.11/32 (with next-hop 1.1.1.1, matching the vprn-next-hop), Leaf-1 readvertises the route in an IP prefix route with the ESI of the IP aliasing ES. Use the following command.
show router bgp routes evpn ip-prefix prefix 11.11.11.11/32
Leaf-1 readvertises the route in an IP Prefix route with the ESI of the IP Aliasing ES
===============================================================================
BGP Router ID:192.0.2.3 AS:64500 Local AS:64500
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
Flag Route Dist. Prefix
Tag Gw Address
NextHop
Label
ESI
-------------------------------------------------------------------------------
u*>i 192.168.0.3:2500 11.11.11.11/32
0 00:00:00:00:00:00
192.168.0.1
LABEL 524279
01:01:01:01:01:00:00:00:00:00
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
The IP prefix routes with non-zero ESI are also processed and recursively resolved on the PEs that are part of the ES. In the EVPN IP aliasing in an EVPN-IFL model example, Leaf-2 installs the prefix with the next hop associated with the ES instead of the next hop of the IP Prefix route; that is, the resolved next-hop is 1.1.1.1 instead of the IP prefix route next-hop 192.168.0.1. Use the following command to display the prefix with next hop association.
show router 2500 route-table 11.11.11.11/32 extensive
Prefix with next hop associated with ES
===============================================================================
Route Table (Service: 2500)
===============================================================================
Dest Prefix : 11.11.11.11/32
Protocol : EVPN-IFL
Age : 22h41m44s
Preference : 170
Indirect Next-Hop : 1.1.1.1
QoS : Priority=n/c, FC=n/c
Source-Class : 0
Dest-Class : 0
ECMP-Weight : N/A
Resolving Next-Hop : 1.1.1.1
Interface : irb1
Metric : 0
ECMP-Weight : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================
Although the preceding example is based on an EVPN IFL model, vprn-next-hop ES can also be associated with VPRNs that use the EVPN IFF model to exchange IP Prefix routes. In an EVPN IFF model, the vprn-next-hop ES is associated with the VPRN if its R-VPLS connected to the EVPN tunnel contains the evi configured in the ES.
The following considerations about vprn-next-hop ES apply:
- The ES is operationally up as long as it is administratively enabled. The operational state does not reflect the presence of the VPRN next hop in the VPRN’s route table.
- The AD per-ES or EVI routes for the ES are advertised as long as the VPRN next hop is installed in the route table (as a non-EVPN route). If the vprn-next-hop is installed in the VPRN’s route table as an EVPN IP prefix route, the AD per-ES or EVI routes are not advertised.
- A node can generate an IP prefix route with the ESI of a vprn-next-hop as long as the node has the vprn-next-hop installed in its VPRN’s route table, even as an EVPN IP prefix route.
- The AD per-ES or EVI routes are advertised with the RD and route target of the VPRN instance associated with the evi configured for the vprn-next-hop.
- ES routes are also advertised for the ES and are responsible for the DF Election in the ES in the case of single-active mode.
- All the non-DF PEs in the ES advertise their AD per-EVI route with bit P=0 and bit B=1, whereas the DF PE advertises its AD per-EVI with P=1 and B=0. When creating the ECMP-set for a prefix associated with an ESI, the remote PEs exclude those PEs for which their AD per-EVI routes indicate P=0.
EVPN Sticky ECMP for IP prefix routes
SR OS supports sticky ECMP for EVPN-IFL and EVPN-IFF IP prefix routes. Non-sticky ECMP, or just ECMP, for a specific IP prefix with n next hops requires the router to rehash the flows when one of the next hops is removed or added. This may impact flows that are now sent to a different next hop. Sticky ECMP limits that impact. For example, for a prefix with four next hops:
- Upon withdrawal of one of the next hops, only the affected flows are redistributed into the remaining three next hops, as equally as possible.
- Upon addition of the fifth next hop, the router minimizes the impact on existing flows.
The implementation of sticky ECMP is based on software. The router emulates the behavior by repeating each ECMP next hop of the sticky route a number of times (according to the next-hop normalized weight) in different hash buckets, to create a fill pattern of size N for the incoming flows. In general, the closer the number of next hops gets to the maximum number of ECMP paths, the worse the distribution algorithm works. For detailed information about the general implementation of sticky ECMP in SR OS, see section BGP support for sticky ECMP in the 7450 ESS, 7750 SR, 7950 XRS, and VSR Unicast Routing Protocols Guide.
An IP prefix is made sticky by configuring the sticky-ecmp policy action on an import policy (at the peer or VPRN level). Sticky ECMP for EVPN IP-prefix routes is supported in combination with other ECMP features such as EVPN unequal ECMP or IP aliasing.
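The fill-pattern idea behind sticky ECMP can be sketched in Python. This is a simplified model under stated assumptions, not the SR OS algorithm: the helpers `build_pattern` and `remove_next_hop` are hypothetical, the pattern size of 16 is arbitrary, all next hops have equal weight, and only the withdrawal case is modeled.

```python
from collections import Counter


def build_pattern(next_hops, size):
    """Fill a pattern of hash buckets by repeating the next hops in turn."""
    return [next_hops[i % len(next_hops)] for i in range(size)]


def remove_next_hop(pattern, removed):
    """Reassign only the buckets of the removed next hop, as evenly as possible."""
    counts = Counter(nh for nh in pattern if nh != removed)
    if not counts:
        raise ValueError("no remaining next hops")
    result = list(pattern)
    for index, next_hop in enumerate(result):
        if next_hop == removed:
            # Give the bucket to the remaining next hop that owns the fewest buckets.
            target = min(counts, key=lambda nh: (counts[nh], nh))
            result[index] = target
            counts[target] += 1
    return result


before = build_pattern(["NH-A", "NH-B", "NH-C", "NH-D"], 16)
after = remove_next_hop(before, "NH-D")

# Buckets that did not point to the withdrawn next hop keep their assignment.
unchanged = all(b == a for b, a in zip(before, after) if b != "NH-D")
```

Because flows hash to a bucket and the untouched buckets keep their next hop, only the flows that were using the withdrawn next hop move in this model.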
Configuring an EVPN service with CLI
This section provides information to configure VPLS using the command line interface.
EVPN-VXLAN configuration examples
Layer 2 PE example
This section shows a configuration example for three PEs in a Data Center, based on the following assumptions:
PE-1 is a Data Center Network Virtualization Edge device (NVE) where service VPLS 2000 is configured.
PE-2 and PE-3 are redundant Data Center Gateways providing Layer 2 connectivity to the WAN for service VPLS 2000.
DC PE-1 configuration for service VPLS 2000
DC PE-2 and PE-3 configuration with SAPs at the WAN side (advertisement of all MACs and the unknown-mac-route):
vpls 2000 name "2000" customer 1 create
vxlan instance 1 vni 2000 create
exit
bgp
route-distinguisher 65001:2000
route-target export target:65000:2000 import target:65000:2000
exit
bgp-evpn
unknown-mac-route
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
site "site-1" create
site-id 1
sap 1/1/1:1
no shutdown
exit
sap 1/1/1:1 create
no shutdown
exit
no shutdown
exit
DC PE-2 and PE-3 configuration with BGP-AD spoke-SDPs at the WAN side (mac-advertisement disable, only unknown-mac-route advertised):
service vpls 2000 name "vpls2000" customer 1 create
vxlan instance 1 vni 2000 create
bgp
pw-template-binding 1 split-horizon-group "to-WAN" import-rt target:65000:2500
vsi-export "export-policy-1" # policy exporting the WAN and DC RTs
vsi-import "import-policy-1" # policy importing the WAN and DC RTs
route-distinguisher 65001:2000
bgp-ad
no shutdown
vpls-id 65000:2000
bgp-evpn
mac-advertisement disable
unknown-mac-route
vxlan bgp 1 vxlan-instance 1
no shutdown
site site-1 create
split-horizon-group "to-WAN"
no shutdown
site-id 1
EVPN for VXLAN in R-VPLS services example
This section shows a configuration example for three 7750 SR, 7450 ESS, or 7950 XRS PEs in a Data Center, based on the following assumptions:
PE-1 is a Data Center Network Virtualization Edge device (NVE) where the following services are configured:
- R-VPLS 2001 and R-VPLS 2002 are subnets where Tenant Systems are connected
- VPRN 500 is a VPRN instance providing inter-subnet forwarding between the local subnets and from local subnets to the WAN subnets
- R-VPLS 501 is an IRB backhaul R-VPLS service that provides EVPN-VXLAN connectivity to the VPRNs in PE-2 and PE-3
*A:PE-1>config>service# info
vprn 500 name "vprn500" customer 1 create
ecmp 4
route-distinguisher 65071:500
vrf-target target:65000:500
interface "evi-501" create
address 10.30.30.1/24
vpls "evpn-vxlan-501"
exit
exit
interface "subnet-2001" create
address 10.10.10.1/24
vpls "r-vpls 2001"
exit
exit
interface "subnet-2002" create
address 10.20.20.1/24
vpls "r-vpls 2002"
exit
exit
no shutdown
exit
vpls 501 name "evpn-vxlan-501" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 501 create
exit
bgp
route-distinguisher 65071:501
route-target export target:65000:501 import target:65000:501
exit
bgp-evpn
ip-route-advertisement incl-host
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
exit
vpls 2001 name "r-vpls 2001" customer 1 create
allow-ip-int-bind
sap 1/1/1:21 create
exit
sap 1/1/1:501 create
exit
no shutdown
exit
vpls 2002 name "r-vpls 2002" customer 1 create
allow-ip-int-bind
sap 1/1/1:22 create
exit
sap 1/1/1:502 create
exit
no shutdown
exit
PE-2 and PE-3 are redundant Data Center Gateways providing Layer 3 connectivity to the WAN for subnets "subnet-2001" and "subnet-2002". The following configuration excerpt shows an example for PE-2. PE-3 would have an equivalent configuration.
*A:PE-2>config>service# info
vprn 500 name "vprn500" customer 1 create
ecmp 4
route-distinguisher 65072:500
auto-bind-tunnel
resolution-filter
gre
ldp
rsvp
exit
resolution filter
exit
vrf-target target:65000:500
interface "evi-501" create
address 10.30.30.2/24
vpls "evpn-vxlan-501"
exit
exit
no shutdown
exit
vpls 501 name "evpn-vxlan-501" customer 1 create
allow-ip-int-bind
vxlan instance 1 vni 501 create
exit
bgp
route-distinguisher 65072:501
route-target export target:65000:501 import target:65000:501
exit
bgp-evpn
ip-route-advertisement incl-host
vxlan bgp 1 vxlan-instance 1
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
exit
EVPN for VXLAN in EVPN tunnel R-VPLS services example
The example in EVPN for VXLAN in R-VPLS services example can be optimized by using EVPN tunnel R-VPLS services instead of regular IRB backhaul R-VPLS services. If EVPN tunnels are used, the corresponding R-VPLS services cannot contain SAPs or SDP-bindings and the VPRN interfaces do not need IP addresses.
The following excerpt shows the configuration in PE-1 for the VPRN 500. The R-VPLS 501, 2001 and 2002 can keep the same configuration as shown in the previous section.
*A:PE-1>config>service# info
vprn 500 name "vprn500" customer 1 create
ecmp 4
route-distinguisher 65071:500
vrf-target target:65000:500
interface "evi-501" create
vpls "evpn-vxlan-501"
evpn-tunnel # no need to configure an IP address
exit
exit
interface "subnet-2001" create
address 10.10.10.1/24
vpls "r-vpls 2001"
exit
exit
interface "subnet-2002" create
address 10.20.20.1/24
vpls "r-vpls 2002"
exit
exit
no shutdown
exit
The VPRN 500 configuration in PE-2 and PE-3 is changed in the same way, by adding the evpn-tunnel command and removing the IP address of the EVPN-tunnel R-VPLS interface. No other changes are required.
*A:PE-2>config>service# info
vprn 500 name "vprn500" customer 1 create
ecmp 4
route-distinguisher 65072:500
auto-bind-tunnel
resolution-filter
gre
ldp
rsvp
exit
resolution filter
exit
vrf-target target:65000:500
interface "evi-501" create
vpls "evpn-vxlan-501"
evpn-tunnel # no need to configure an IP address
exit
exit
no shutdown
exit
EVPN for VXLAN in R-VPLS services with IPv6 interfaces and prefixes example
In the following configuration example, PE1 is connected to CE1 in VPRN 30 through a dual-stack IP interface. VPRN 30 is connected to an EVPN-tunnel R-VPLS interface enabled for IPv6.
In the following configuration excerpt, PE1 advertises the 172.16.0.0/24 and 2001:db8:1000::/64 prefixes in BGP EVPN, in two separate NLRIs. The NLRI for the IPv4 prefix uses gateway IP = 0 and a non-zero gateway MAC, whereas the NLRI for the IPv6 prefix is sent with gateway IP = Link-Local Address for interface "int-evi-301" and no gateway MAC.
*A:PE1>config>service# info
vprn 30 name "vprn30" customer 1 create
route-distinguisher 192.0.2.1:30
vrf-target target:64500:30
interface "int-PE-1-CE-1" create
enable-ingress-stats
address 172.16.0.254/24
ipv6
address 2001:db8:1000::1/64
exit
sap 1/1/1:30 create
exit
exit
interface "int-evi-301" create
ipv6
exit
vpls "evi-301"
evpn-tunnel
exit
exit
no shutdown
----------------------------------------------
EVPN-MPLS configuration examples
EVPN all-active multihoming example
This section shows a configuration example for three 7750 SR, 7450 ESS, or 7950 XRS PEs, based on the following assumptions:
CE-12 is multihomed to PE-1 and PE-2 and uses a LAG to connect to the network. CE-12 is connected to LAG SAPs configured in an all-active multihoming Ethernet segment.
PE-3 is a remote PE that performs aliasing for traffic destined for the CE-12.
The following configuration excerpt shows VPLS 1 on PE-1 and PE-2, as well as the corresponding Ethernet-segment and LAG commands.
A:PE1# configure lag 1
A:PE1>config>lag# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/2
lacp active administrative-key 1 system-id 00:00:00:00:69:72
no shutdown
----------------------------------------------
A:PE1>config>lag# /configure service system bgp-evpn
A:PE1>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.69:0
ethernet-segment "ESI-71" create
esi 0x01000000007100000001
es-activation-timer 10
service-carving
mode auto
exit
multi-homing all-active
lag 1
no shutdown
exit
----------------------------------------------
A:PE1>config>service>system>bgp-evpn# /configure service vpls 1
A:PE1>config>service>vpls# info
----------------------------------------------
bgp
exit
bgp-evpn
cfm-mac-advertisement
evi 1
vxlan
shutdown
exit
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
sap lag-1:1 create
exit
no shutdown
----------------------------------------------
A:PE2# configure lag 1
A:PE2>config>lag# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/3
lacp active administrative-key 1 system-id 00:00:00:00:69:72
no shutdown
----------------------------------------------
A:PE2>config>lag# /configure service system bgp-evpn
A:PE2>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.72:0
ethernet-segment "ESI-71" create
esi 0x01000000007100000001
es-activation-timer 10
service-carving
mode auto
exit
multi-homing all-active
lag 1
no shutdown
exit
----------------------------------------------
A:PE2>config>service>system>bgp-evpn# /configure service vpls 1
A:PE2>config>service>vpls# info
----------------------------------------------
bgp
exit
bgp-evpn
cfm-mac-advertisement
evi 1
vxlan
shutdown
exit
mpls bgp 1
ingress-replication-bum-label
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
sap lag-1:1 create
exit
no shutdown
----------------------------------------------
The following excerpt shows the configuration on the remote PE (PE-3), which supports aliasing to PE-1 and PE-2. PE-3 does not have any Ethernet segment configured. It only requires the VPLS 1 configuration and an ecmp value greater than 1 to perform aliasing.
*A:PE3>config>service>vpls# info
----------------------------------------------
bgp
exit
bgp-evpn
cfm-mac-advertisement
evi 1
mpls bgp 1
ingress-replication-bum-label
ecmp 4
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
sap 1/1/1:1 create
exit
spoke-sdp 4:13 create
no shutdown
exit
no shutdown
----------------------------------------------
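As a rough illustration of why PE-3 needs an ecmp value greater than 1 to perform aliasing, the following Python sketch models the next-hop selection for a MAC learned behind an Ethernet segment. The function names, the route representation, and the use of CRC32 as the flow hash are assumptions made for this example and do not describe the SR OS implementation.

```python
import zlib


def aliasing_next_hops(mac_pe, mac_esi, ad_per_evi, ecmp):
    """Return the PEs that may receive unicast traffic for a MAC tied to `mac_esi`.

    mac_pe     -- PE that advertised the MAC/IP route
    ad_per_evi -- iterable of (pe, esi) pairs taken from AD per-EVI routes
    ecmp       -- configured ecmp value on the remote PE
    """
    if ecmp <= 1:
        return [mac_pe]  # no aliasing: only the PE that advertised the MAC
    peers = sorted(pe for pe, esi in ad_per_evi if esi == mac_esi)
    return peers[:ecmp] or [mac_pe]


def pick_next_hop(flow_key, next_hops):
    """Hash a flow key onto one of the next hops (CRC32 is an arbitrary choice)."""
    return next_hops[zlib.crc32(flow_key.encode()) % len(next_hops)]


ESI_71 = "01:00:00:00:00:71:00:00:00:01"
ad_routes = [("PE-1", ESI_71), ("PE-2", ESI_71)]

with_aliasing = aliasing_next_hops("PE-1", ESI_71, ad_routes, ecmp=4)
without_aliasing = aliasing_next_hops("PE-1", ESI_71, ad_routes, ecmp=1)
print(with_aliasing, without_aliasing)
```

With ecmp 4, both PE-1 and PE-2 are candidates even though only PE-1 advertised the MAC in this sketch; with ecmp 1, traffic follows the advertising PE only.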
EVPN single-active multihoming example
To use single-active multihoming on PE-1 and PE-2 instead of all-active multihoming, only the following changes are needed:
- Change the LAG configuration to single-active. CE-12 is now configured with two different LAGs; therefore, the combination of key, system-id, and system-priority must be different on PE-1 and PE-2.
- Change the Ethernet-segment configuration to single-active.
No changes are needed at service level on any of the three PEs.
The following example excerpts show the differences from all-active multihoming: the LACP system-id on each PE and the multi-homing mode of the Ethernet segment.
A:PE1# configure lag 1
A:PE1>config>lag# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/2
lacp active administrative-key 1 system-id 00:00:00:00:69:69
no shutdown
----------------------------------------------
A:PE1>config>lag# /configure service system bgp-evpn
A:PE1>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.69:0
ethernet-segment "ESI-71" create
esi 0x01000000007100000001
es-activation-timer 10
service-carving
mode auto
exit
multi-homing single-active
lag 1
no shutdown
exit
----------------------------------------------
A:PE2# configure lag 1
A:PE2>config>lag# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/3
lacp active administrative-key 1 system-id 00:00:00:00:72:72
no shutdown
----------------------------------------------
A:PE2>config>lag# /configure service system bgp-evpn
A:PE2>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.72:0
ethernet-segment "ESI-71" create
esi 0x01000000007100000001
es-activation-timer 10
service-carving
mode auto
exit
multi-homing single-active
lag 1
no shutdown
exit
----------------------------------------------
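In single-active mode only the Designated Forwarder (DF) forwards traffic for a service on the Ethernet segment. The following Python sketch shows the default modulo-based DF election described in RFC 7432, assuming that service-carving mode auto follows that default. The `elect_df` function is hypothetical, and the PE addresses are assumed to match the addresses used in the route distinguishers above.

```python
import ipaddress


def elect_df(pe_addresses, evi):
    """Pick the DF among the PEs attached to an Ethernet segment.

    The PEs are ordered by IP address in ascending numeric order and given
    ordinals 0..N-1; the PE with ordinal (evi mod N) is the DF for that EVI.
    """
    ordered = sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    return ordered[evi % len(ordered)]


# Assumed addresses for PE-1 and PE-2 (illustration only).
pes = ["192.0.2.72", "192.0.2.69"]
print(elect_df(pes, evi=1))
print(elect_df(pes, evi=2))
```

Because the result depends on the EVI value, different services on the same Ethernet segment can elect different DFs, which spreads the load across the multihomed PEs.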
PBB-EVPN configuration examples
PBB-EVPN all-active multihoming example
As in the EVPN all-active multihoming example, this section shows a configuration example for three 7750 SR, 7450 ESS, or 7950 XRS PEs; however, PBB-EVPN is used here, based on the following assumptions:
CE-12 is multihomed to PE-1 and PE-2 and uses a LAG to connect to I-VPLS 20001. CE-12 is connected to LAG SAPs configured in an all-active multihoming Ethernet segment.
PE-3 is a remote PE that performs aliasing for traffic destined for the CE-12.
The three PEs are connected through B-VPLS 20000, a Backbone VPLS where EVPN is enabled.
The following excerpt shows the example configuration for I-VPLS 20001 and B-VPLS 20000 on PE-1 and PE-2, as well as the corresponding Ethernet-segment and LAG commands:
*A:PE1# configure lag 1
*A:PE1>config>lag# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/2
lacp active administrative-key 1 system-id 00:00:00:00:69:72
no shutdown
----------------------------------------------
*A:PE1>config>lag# /configure service system bgp-evpn
*A:PE1>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.69:0
ethernet-segment "ESI-71" create
esi 01:00:00:00:00:71:00:00:00:01
source-bmac-lsb 71-71 es-bmac-table-size 8
es-activation-timer 5
service-carving
mode auto
exit
multi-homing all-active
lag 1
no shutdown
exit
----------------------------------------------
*A:PE1>config>service>system>bgp-evpn# /configure service vpls 20001
*A:PE1>config>service>vpls# info
----------------------------------------------
pbb
backbone-vpls 20000
exit
exit
stp
shutdown
exit
sap lag-1:71 create
exit
no shutdown
----------------------------------------------
*A:PE1>config>service>vpls# /configure service vpls 20000
*A:PE1>config>service>vpls# info
----------------------------------------------
service-mtu 2000
pbb
source-bmac 00:00:00:00:00:69
use-es-bmac
exit
bgp-evpn
evi 20000
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
*A:PE2# configure lag 1
*A:PE2>config>lag# info
----------------------------------------------
mode access
encap-type dot1q
port 1/1/3
lacp active administrative-key 1 system-id 00:00:00:00:69:72
no shutdown
----------------------------------------------
*A:PE2>config>lag# /configure service system bgp-evpn
*A:PE2>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.72:0
ethernet-segment "ESI-71" create
esi 01:00:00:00:00:71:00:00:00:01
source-bmac-lsb 71-71 es-bmac-table-size 8
es-activation-timer 5
service-carving
mode auto
exit
multi-homing all-active
lag 1
no shutdown
exit
----------------------------------------------
*A:PE2>config>service>system>bgp-evpn# /configure service vpls 20001
*A:PE2>config>service>vpls# info
----------------------------------------------
pbb
backbone-vpls 20000
exit
exit
stp
shutdown
exit
sap lag-1:71 create
exit
no shutdown
----------------------------------------------
*A:PE2>config>service>vpls# /configure service vpls 20000
*A:PE2>config>service>vpls# info
----------------------------------------------
service-mtu 2000
pbb
source-bmac 00:00:00:00:00:72
use-es-bmac
exit
bgp-evpn
evi 20000
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
*A:PE2>config>service>vpls#
The combination of the pbb source-bmac and the Ethernet-segment source-bmac-lsb creates the same BMAC for all the packets sourced from both PE-1 and PE-2 for Ethernet segment "ESI-71".
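The derivation can be sketched in Python, assuming that the 16 bits given by source-bmac-lsb replace the 16 least significant bits of the pbb source-bmac. The `es_bmac` helper is hypothetical and only illustrates that assumption with the values from this example.

```python
def es_bmac(source_bmac, source_bmac_lsb):
    """Combine the upper 32 bits of source-bmac with the 16-bit source-bmac-lsb.

    source_bmac     -- colon-separated MAC, for example "00:00:00:00:00:69"
    source_bmac_lsb -- hyphen-separated 16-bit value, for example "71-71"
    """
    octets = source_bmac.lower().split(":")
    lsb = source_bmac_lsb.lower().split("-")
    if len(octets) != 6 or len(lsb) != 2:
        raise ValueError("expected a 6-octet MAC and a 2-octet LSB value")
    return ":".join(octets[:4] + lsb)


pe1 = es_bmac("00:00:00:00:00:69", "71-71")  # PE-1 values from the excerpt
pe2 = es_bmac("00:00:00:00:00:72", "71-71")  # PE-2 values from the excerpt
print(pe1, pe2)
```

Under this assumption, both PEs derive the same value even though their source-bmac values differ, because the differing octets fall in the 16 bits that source-bmac-lsb overrides.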
PBB-EVPN single-active multihoming example
In the following configuration example, PE-70 and PE-73 are part of the same single-active multihoming Ethernet segment, ESI-7413. In this case, the CE is connected to PE-70 and PE-73 through spoke-SDPs 4:74 and 34:74, respectively.
In this example, PE-70 and PE-73 each use a different source-bmac for packets coming from ESI-7413. This source-bmac is not an es-bmac, unlike in the PBB-EVPN all-active multihoming example.
*A:PE70# configure service system bgp-evpn
*A:PE70>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.70:0
ethernet-segment "ESI-7413" create
esi 01:74:13:00:74:13:00:00:74:13
es-activation-timer 0
service-carving
mode auto
exit
multi-homing single-active
sdp 4
no shutdown
exit
----------------------------------------------
*A:PE70>config>service>system>bgp-evpn# /configure service vpls 20001
*A:PE70>config>service>vpls# info
----------------------------------------------
pbb
backbone-vpls 20000
exit
exit
stp
shutdown
exit
spoke-sdp 4:74 create
no shutdown
exit
no shutdown
----------------------------------------------
*A:PE70>config>service>vpls# /configure service vpls 20000
*A:PE70>config>service>vpls# info
----------------------------------------------
service-mtu 2000
pbb
source-bmac 00:00:00:00:00:70
exit
bgp-evpn
evi 20000
mpls bgp 1
ecmp 2
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
*A:PE70>config>service>vpls#
A:PE73>config>service>system>bgp-evpn# info
----------------------------------------------
route-distinguisher 192.0.2.73:0
ethernet-segment "ESI-7413" create
esi 01:74:13:00:74:13:00:00:74:13
es-activation-timer 0
service-carving
mode auto
exit
multi-homing single-active
sdp 34
no shutdown
exit
----------------------------------------------
A:PE73>config>service>system>bgp-evpn# /configure service vpls 20001
A:PE73>config>service>vpls# info
----------------------------------------------
pbb
backbone-vpls 20000
exit
exit
stp
shutdown
exit
spoke-sdp 34:74 create
no shutdown
exit
no shutdown
----------------------------------------------
A:PE73>config>service>vpls# /configure service vpls 20000
A:PE73>config>service>vpls# info
----------------------------------------------
service-mtu 2000
pbb
source-bmac 00:00:00:00:00:73
exit
bgp-evpn
evi 20000
mpls bgp 1
auto-bind-tunnel
resolution any
exit
no shutdown
exit
exit
stp
shutdown
exit
no shutdown
----------------------------------------------
A:PE73>config>service>vpls#