ARP-ND Host Routes in Data Centers

This chapter provides information about ARP-ND Host Routes in Data Centers.

Applicability

This chapter was initially written based on SR OS Release 16.0.R1, but the CLI in the current edition is based on SR OS Release 21.10.R3. Address Resolution Protocol - Neighbor Discovery (ARP-ND) host routes in VPRN and base router interfaces are supported in SR OS Release 15.0.R6 and later, but Nokia recommends using the feature in SR OS Release 15.0.R9, or later.

Chapters EVPN for MPLS Tunnels, EVPN for VXLAN Tunnels (Layer 2), EVPN for VXLAN Tunnels (Layer 3), and EVPN for MPLS Tunnels in Routed VPLS are prerequisite reading.

Overview

Inter-subnet forwarding (or simply routing) for a tenant domain in a Data Center (DC) must be efficient and must avoid sending traffic back over the path on which it arrived, which is known as tromboning or hairpinning. L2 broadcast domain extension across DCs shows an L2 broadcast domain (VPLS 1) extended across two DCs. This example is used to explain the requirements for upstream and downstream efficiency.

Figure 1. L2 broadcast domain extension across DCs

In L2 broadcast domain extension across DCs, subnet 10.0.0.0/16 is extended across two DCs and four DC Gateways (DCGWs), using VPLS 1 or R-VPLS 1 in the network nodes. The DCGWs are connected to the users of subnet 10.0.20.0/24 on PE1 via IP-VPN (or EVPN). In this scenario, two network characteristics allow efficient upstream and downstream routing:

  • Anycast gateways

  • ARP-ND host routes

Anycast Gateways provide upstream routing efficiency for the hosts connected to subnet 10.0.0.0/16, regardless of the DCGW to which they are connected. For example, if host 10.0.0.1 is in DC-1 and needs to forward traffic to subnet 10.0.20.0, DCGW1 and DCGW2 should be able to route the traffic upstream, without the need to go to DCGW3 or DCGW4. In the same way, if host 10.0.0.1 moves to DC-2, the upstream traffic to subnet 10.0.20.0 must be routed by the local DCGWs without changing the existing host default gateway IP and MAC configuration. To achieve this local default gateway routing, all the DCGWs of the extended broadcast domain need to have the same IP and MAC addresses in the R-VPLS interface (Integrated Routing and Bridging (IRB) interface in industry-standard terminology).

Anycast Gateways are implemented in SR OS by using passive VRRP. See the EVPN for MPLS Tunnels in Routed VPLS chapter for more information about passive VRRP.

Learning and advertising ARP-ND host routes is required to provide efficient downstream routing from remote subnets to the hosts in the extended broadcast domain. Assuming virtual machine VM 10.0.0.1 (in L2 broadcast domain extension across DCs) is connected to DC-1 (left-side DC), when PE1 needs to send traffic to host 10.0.0.1, it performs a Longest Prefix Match (LPM) lookup on the VPRN route table. If the only IP prefix advertised by the four DCGWs were 10.0.0.0/16, PE1 could send the packets to a DC where the VM is not present. This would result in unnecessary tromboning; for example, PE1 could send the traffic to DCGW3, and DCGW3 would then send it to DCGW2 to reach VM 10.0.0.1, even though PE1 could have forwarded the traffic directly to DCGW2.

To provide efficient downstream routing to the DC where the VM is located, DCGW1 and DCGW2 need to generate host routes for the VMs to which they are attached. Furthermore, when the VM moves to the other DC, DCGW3 and DCGW4 must be able to learn the VM host route and advertise it to PE1. Also, DCGW1 and DCGW2 will have to withdraw the route for 10.0.0.1, because the VM is no longer in the local DC.
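The effect of a host route on the LPM lookup at PE1 can be illustrated with a short Python sketch. It is a simplified model, not SR OS code: the function name longest_prefix_match and the route table layout are assumptions made for illustration, and each prefix is given a single next hop to keep the example small.

```python
import ipaddress


def longest_prefix_match(route_table, destination):
    """Return the (prefix, next_hop) entry with the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    best_len = -1
    for prefix, next_hop in route_table:
        network = ipaddress.ip_network(prefix)
        # Keep the entry only if it contains the destination and is more specific
        if dest in network and network.prefixlen > best_len:
            best = (prefix, next_hop)
            best_len = network.prefixlen
    return best


# Only the extended subnet is advertised; PE1 may pick a DCGW in the wrong DC
aggregate_only = [("10.0.0.0/16", "DCGW3")]

# DCGW2 also advertises a host route for the locally attached VM
with_host_route = aggregate_only + [("10.0.0.1/32", "DCGW2")]

print(longest_prefix_match(aggregate_only, "10.0.0.1"))   # ('10.0.0.0/16', 'DCGW3')
print(longest_prefix_match(with_host_route, "10.0.0.1"))  # ('10.0.0.1/32', 'DCGW2')
```

With the /32 present, the lookup selects the DCGW in the DC where the VM is located; when the VM moves and the route is withdrawn, the lookup falls back to the subnet route.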

To address this and other use cases, SR OS can learn the VM host route from the ARP or ND messages that the VM generates when it boots or when it moves. The host route can also be learned from EVPN type 2 routes (MAC/IP routes) that are installed in the ARP/ND caches. In general, any ARP/ND entry can generate an ARP-ND host route.

A route owner type called "ARP-ND" is supported in the base router or a VPRN route table. The ARP-ND host routes have a preference of 1 and they are automatically created out of the ARP or ND Neighbor entries in the router instance. ARP-ND module and generated ARP-ND host routes shows how the ARP/ND software modules can generate ARP-ND host routes in the route table.

Figure 2. ARP-ND module and generated ARP-ND host routes

When config>service>vprn/ies>interface>arp-host-route>populate [static | dynamic | evpn] is enabled, the static, dynamic, and EVPN ARP entries of the routing context will create ARP-ND host routes in the route table. In the same way, ARP-ND host routes are created in the IPv6 route table out of static, dynamic, and EVPN neighbor entries, if config>service>vprn/ies>interface>ipv6>nd-host-route>populate [static | dynamic | evpn] is enabled.

ARP-ND module and generated ARP-ND host routes shows how the ARP/ND module populates its database from the usual dynamic and static entries, as well as from EVPN routes type 2 that include an IP address. Through the host-route learning action, ARP-ND host routes are handed over to the route table.
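The relationship between the populate options and the generated host routes can be sketched in Python as a toy model. The ArpEntry class and the arp_host_routes function are hypothetical names used only for illustration, and the static and EVPN entries are made-up values; the sketch does not describe the SR OS implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpEntry:
    ip: str
    mac: str
    entry_type: str  # "static", "dynamic", or "evpn"


def arp_host_routes(arp_entries, populate):
    """Return the /32 prefixes for the ARP entries whose type is enabled in populate."""
    return [f"{entry.ip}/32" for entry in arp_entries if entry.entry_type in populate]


# Illustrative ARP cache: one entry per type (the static and evpn entries are invented)
arp_cache = [
    ArpEntry("10.0.0.10", "00:00:00:00:00:10", "static"),
    ArpEntry("10.0.0.181", "00:00:00:00:01:81", "dynamic"),
    ArpEntry("10.0.0.20", "00:00:00:00:00:20", "evpn"),
]

# Only dynamic entries are handed over to the route table
print(arp_host_routes(arp_cache, {"dynamic"}))

# All three entry types are handed over
print(arp_host_routes(arp_cache, {"static", "dynamic", "evpn"}))
```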

ARP-ND module and generated ARP-ND host routes also shows that the preference assigned to ARP-ND host routes is 1, which means that ARP-ND routes will be preferred over any other route owner, except for direct routes. For example, if the same host route gets to the route table from ARP-ND and VPN-IPv4 or EVPN, the ARP-ND host route will be preferred and added to the route table. Although they are added to the route table and advertised to routing protocols, ARP-ND host routes are never installed in the FIB. That helps preserve the FIB scale in the router.
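The role of the preference value can be illustrated with a minimal Python sketch in which the route with the lowest preference wins. The preference values 0 (local), 1 (ARP-ND), and 169 (EVPN-IFF) are the ones that appear in this chapter; the function name select_active_route is an assumption, and real route selection involves further tie-breakers that are not modeled here.

```python
def select_active_route(candidates):
    """Pick the candidate route with the lowest preference value (lower is preferred)."""
    return min(candidates, key=lambda route: route["preference"])


# The same host prefix offered to the route table by two different owners
candidates = [
    {"prefix": "10.0.0.181/32", "owner": "EVPN-IFF", "preference": 169},
    {"prefix": "10.0.0.181/32", "owner": "ARP-ND", "preference": 1},
]

print(select_active_route(candidates)["owner"])  # ARP-ND

# A direct (local) route with preference 0 would still win over ARP-ND
candidates_with_local = candidates + [
    {"prefix": "10.0.0.181/32", "owner": "Local", "preference": 0},
]
print(select_active_route(candidates_with_local)["owner"])  # Local
```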

The arp/nd-host-route populate [static | dynamic | evpn] commands are typically used along with other features:

  • A route tag can be added to ARP-ND host routes with the route-tag command. This tag can be matched in BGP vrf-export and peer export policies.

  • The ARP-ND host route will be kept in the route table while the corresponding ARP or Neighbor entry is active. The commands arp-proactive-refresh and nd-proactive-refresh help keep the entries active (even if there is no traffic destined to them) by sending an ARP refresh 30 seconds before the arp-timeout or starting Neighbor Unreachable Detection (NUD) when the stale-time expires.

  • To speed up the learning of the ARP-ND host routes, the commands arp-learn-unsolicited and nd-learn-unsolicited can be configured. When arp-learn-unsolicited is enabled, received unsolicited ARP messages (typically, Gratuitous ARP (GARP) messages) create an ARP entry, and therefore an ARP-ND route if arp-host-route>populate [static | dynamic | evpn] is configured. Similarly, unsolicited Neighbor Advertisement messages create a "stale" neighbor. If nd-host-route>populate [static | dynamic | evpn] is enabled, a confirmation message (NUD) is sent for all the neighbor entries created as stale, and, if confirmed, the corresponding ARP-ND routes are added to the route table.
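As a worked example of the proactive refresh timing described in the list above, the following Python sketch computes when the refresh is sent relative to the creation or last refresh of an ARP entry. The helper name arp_refresh_offset is hypothetical, and rejecting timeouts of 30 seconds or less is an assumption of this sketch, not documented router behavior.

```python
REFRESH_LEAD_SECONDS = 30  # the refresh is sent 30 seconds before arp-timeout


def arp_refresh_offset(arp_timeout_seconds):
    """Return the number of seconds after which the proactive ARP refresh is sent."""
    if arp_timeout_seconds <= REFRESH_LEAD_SECONDS:
        # Assumption of this sketch: such short timeouts are simply not modeled
        raise ValueError("arp-timeout must be greater than 30 seconds in this model")
    return arp_timeout_seconds - REFRESH_LEAD_SECONDS


print(arp_refresh_offset(300))  # 270
print(arp_refresh_offset(600))  # 570
```

With arp-timeout 300, as used in the configuration later in this chapter, the refresh is sent 270 seconds after the entry was last confirmed.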

In the example of L2 broadcast domain extension across DCs, arp-host-route>populate [static | dynamic | evpn] on the DCGWs allows them to learn/advertise the ARP-ND host route 10.0.0.1/32 when the VM is locally connected, and remove/withdraw it when the VM is no longer present in the local DC.

The following sections describe three typical DC scenarios in which the use of Anycast gateways and ARP-ND host routes is needed. The examples are focused on IPv4 and ARP; however, there is equivalent functionality for IPv6 and ND.

Configuration

The initial configuration includes the following:

  • Cards, MDAs, ports

  • Router interfaces

  • IS-IS as an IGP

The following three scenarios are configured and presented in this document:

  • DC inter-subnet forwarding with Anycast GWs (and no ARP-ND host routes)

  • DC inter-subnet forwarding with Anycast GWs and ARP-ND host routes

  • Data Center Interconnect (DCI) inter-subnet forwarding with Anycast GWs and ARP-ND host routes

DC inter-subnet forwarding with Anycast GWs

DC inter-subnet forwarding with Anycast GWs shows a typical DC network, where PE-1, PE-2, and PE-3 are leaf switches that use EVPN-VXLAN services to provide connectivity between two subnets of a tenant domain. Those two subnets are 10.0.0.0/24 and 10.0.20.0/24, respectively, and while the three PEs are attached to hosts in the 10.0.0.0/24 subnet, only PE-1 is attached to the 10.0.20.0/24 subnet. Subnet 10.0.0.0/24 uses R-VPLS 17 in the three PEs and subnet 10.0.20.0/24 uses R-VPLS 22 in PE-1. The distribution of the R-VPLS services does not have to be uniform in all the PEs, and those R-VPLS services are only created if there are hosts attached to them.

To provide inter-subnet forwarding for the tenant, each PE must be configured with a VPRN instance (VPRN 16) that has an interface to the subnet R-VPLS. In industry-standard terms, VPRN 16 represents the IP-VRF for the tenant, and R-VPLS 17 and R-VPLS 22 are user Broadcast Domains (BDs). R-VPLS 15 is not a user BD, but rather a backhaul R-VPLS that provides EVPN connectivity among the VPRN instances.

Figure 3. DC inter-subnet forwarding with Anycast GWs

The BGP configuration in the PEs is similar. As an example, the BGP configuration in PE-1 is as follows:

# on PE-1:
configure
    router Base
        bgp
            family evpn
            vpn-apply-import
            vpn-apply-export 
            rapid-withdrawal
            rapid-update evpn
            group "dc"
                type internal
                neighbor 192.0.2.2
                exit
                neighbor 192.0.2.3
                exit
            exit
            no shutdown
        exit

PE-2 has the following service configuration. The service configuration on PE-3 is similar.

# on PE-2:
configure
    service
        vpls 15 name "sbd-15" customer 1 create
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 15 create
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement
                evi 15
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            no shutdown
        exit
        vprn 16 name "ip-vrf-16" customer 1 create
            ecmp 2
            interface "evi-15" create
                mac 00:00:00:00:00:02
                vpls "sbd-15"
                    evpn-tunnel
                exit
            exit
            interface "evi-17" create
                address 10.0.0.2/24
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                vpls "evi-17"
                exit
            exit
            no shutdown
        exit
        vpls 17 name "evi-17" customer 1 create
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 17 create
            exit
            bgp
            exit
            bgp-evpn
                evi 17
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            sap pxc-10.a:17 create
                no shutdown
            exit
            no shutdown
        exit

R-VPLS 17, "evi-17" in the configuration, is the BD used by subnet 10.0.0.0/24 in all the PEs. On the evi-17 interface in VPRN 16, a real IP address as well as a virtual (passive VRRP) IP address are configured. The real IP address is a unique address across the three PEs in R-VPLS 17 (10.0.0.2 in PE-2). This IP address will not be used by the R-VPLS 17 hosts as a default gateway, but rather will be used for troubleshooting purposes (ICMP or similar).

The backup IP address in the passive VRRP instance (10.0.0.254) is the Anycast GW IP address, and the same IP address is configured in all the PEs attached to R-VPLS 17. Because the virtual MAC is auto-derived from the VRRP instance, all the PEs will also have the same virtual MAC for this Anycast GW:

*A:PE-2# show router 16 vrrp instance interface "evi-17"
 
===============================================================================
VRRP Instances for interface "evi-17"
===============================================================================
-------------------------------------------------------------------------------
VRID 1
-------------------------------------------------------------------------------
Owner               : No                  VRRP State        : Master
Passive             : Yes
Primary IP of Master: 10.0.0.2 (Self)
Primary IP          : 10.0.0.2            Standby-Forwarding: Disabled
VRRP Backup Addr    : 10.0.0.254
Admin State         : Up                  Oper State        : Up
Up Time             : 02/18/2022 14:52:45 Virt MAC Addr     : 00:00:5e:00:01:01
---snip---
*A:PE-3# show router 16 vrrp instance interface "evi-17"
 
===============================================================================
VRRP Instances for interface "evi-17"
===============================================================================
-------------------------------------------------------------------------------
VRID 1
-------------------------------------------------------------------------------
Owner               : No                  VRRP State        : Master
Passive             : Yes
Primary IP of Master: 10.0.0.3 (Self)
Primary IP          : 10.0.0.3            Standby-Forwarding: Disabled
VRRP Backup Addr    : 10.0.0.254
Admin State         : Up                  Oper State        : Up
Up Time             : 02/18/2022 14:52:53 Virt MAC Addr     : 00:00:5e:00:01:01
---snip---
*A:PE-1# show router 16 vrrp instance interface "evi-17"
 
===============================================================================
VRRP Instances for interface "evi-17"
===============================================================================
-------------------------------------------------------------------------------
VRID 1
-------------------------------------------------------------------------------
Owner               : No                  VRRP State        : Master
Passive             : Yes
Primary IP of Master: 10.0.0.1 (Self)
Primary IP          : 10.0.0.1            Standby-Forwarding: Disabled
VRRP Backup Addr    : 10.0.0.254
Admin State         : Up                  Oper State        : Up
Up Time             : 02/18/2022 14:52:38 Virt MAC Addr     : 00:00:5e:00:01:01
---snip---
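The virtual MAC shown in the preceding output follows the standard VRRP rule for IPv4, where the last octet of 00:00:5e:00:01:xx is the VRID. A minimal Python sketch of that derivation follows; the function name vrrp_ipv4_virtual_mac is an assumption used for illustration.

```python
def vrrp_ipv4_virtual_mac(vrid):
    """Return the IPv4 VRRP virtual MAC address for a VRID in the range 1..255."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    # Fixed prefix 00:00:5e:00:01 followed by the VRID as a two-digit hex octet
    return f"00:00:5e:00:01:{vrid:02x}"


print(vrrp_ipv4_virtual_mac(1))  # 00:00:5e:00:01:01
```

Because every PE uses VRID 1 on the evi-17 interface, every PE derives the same virtual MAC, which is what allows a host to keep its ARP entry for the Anycast GW when it moves.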

All the hosts attached to R-VPLS 17, such as host-181, host-182, and host-183, are configured with the Anycast GW as default gateway (10.0.0.254). The use of passive VRRP (or Anycast GW in standard terminology) has the following benefits:

  • All the hosts use the same default gateway configuration, regardless of which PE they are attached to.

  • When the hosts send traffic destined to a remote subnet, the local PE can route it directly, without any tromboning.

  • In the case of a host moving to a different leaf switch, the host does not need to change its IP or default gateway, or even its ARP cache.

For completeness, the service configuration in PE-1 follows:

# on PE-1:
configure
    service
        vpls 15 name "sbd-15" customer 1 create
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 15 create
            exit
            bgp
            exit
            bgp-evpn
                no mac-advertisement
                ip-route-advertisement
                evi 15
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit
        vprn 16 name "ip-vrf-16" customer 1 create
            ecmp 2
            interface "evi-15" create
                mac 00:00:00:00:00:01
                vpls "sbd-15"
                    evpn-tunnel
                exit
            exit
            interface "evi-17" create
                address 10.0.0.1/24
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                vpls "evi-17"
                exit
            exit
            interface "evi-22" create
                address 10.0.20.1/24
                vrrp 1 passive
                    backup 10.0.20.254
                    ping-reply
                    traceroute-reply
                exit
                vpls "evi-22"
                exit
            exit
            no shutdown
        exit
        vpls 17 name "evi-17" customer 1 create
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 17 create
            exit
            bgp
            exit
            bgp-evpn
                evi 17
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap pxc-10.a:17 create
                no shutdown
            exit
            sap pxc-10.b:20 create
                no shutdown
            exit
            no shutdown
        exit
        vpls 22 name "evi-22" customer 1 create
            allow-ip-int-bind
            exit
            stp
                shutdown
            exit
            sap pxc-10.b:19 create
                no shutdown
            exit
            no shutdown
        exit 

See the EVPN for VXLAN Tunnels (Layer 3) chapter for more information about the EVPN-related configuration in the R-VPLS services. Unlike the examples in EVPN for VXLAN Tunnels (Layer 3), R-VPLS 15 is configured with no mac-advertisement, because the EVPN IP prefix routes do not need a recursive resolution to a MAC/IP route.

With the described configuration, the intra-subnet and inter-subnet forwarding connectivity from host-182 is tested as an example (host-182 is simulated with VPRN 18, which is connected to R-VPLS 17 via a PXC SAP):

*A:PE-2# traceroute router 18 10.0.0.183 source 10.0.0.182 
traceroute to 10.0.0.183 from 10.0.0.182, 30 hops max, 40 byte packets
  1  10.0.0.183 (10.0.0.183)    2.49 ms  2.31 ms  2.41 ms
*A:PE-2# traceroute router 18 10.0.20.191 source 10.0.0.182 
traceroute to 10.0.20.191 from 10.0.0.182, 30 hops max, 40 byte packets
  1  10.0.0.2 (10.0.0.2)    0.979 ms  0.890 ms  0.875 ms
  2  10.0.20.1 (10.0.20.1)    2.02 ms  1.98 ms  1.92 ms
  3  10.0.20.191 (10.0.20.191)    2.61 ms  2.73 ms  2.74 ms

When host-182 sends traffic to host-191, it sends an ARP request for the Anycast GW IP address and receives the virtual MAC as a reply. The virtual MAC is always associated with the CPM on the local PE; therefore, the local PE can always route the traffic directly, as long as it has a route for the IP destination.

Host-182 (VPRN 18) resolves the Anycast GW to the virtual MAC:

*A:PE-2# show router 18 arp 10.0.0.254
 
===============================================================================
ARP Table (Service: 18)
===============================================================================
IP Address      MAC Address       Expiry    Type   Interface
-------------------------------------------------------------------------------
10.0.0.254      00:00:5e:00:01:01 03h59m18s Dyn[I] local
===============================================================================

In PE-2, the virtual MAC is associated with a local IP interface:

*A:PE-2# show service id 17 fdb mac 00:00:5e:00:01:01
 
===============================================================================
Forwarding Database, Service 17
===============================================================================
ServId     MAC               Source-Identifier       Type     Last Change
            Transport:Tnl-Id                         Age
-------------------------------------------------------------------------------
17         00:00:5e:00:01:01 cpm                     Intf     02/18/22 14:52:45
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf
===============================================================================

The following route table of VPRN 16 on PE-3 shows that subnet 10.0.20.0/24, where host-191 resides, is learned via EVPN:

*A:PE-3# show router 16 route-table
 
===============================================================================
Route Table (Service: 16)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.0.0.0/24                                   Local   Local     00h03m08s  0
       evi-17                                                       0
10.0.20.0/24                                  Remote  EVPN-IFF  00h03m07s  169
       evi-15 (ET-00:00:00:00:00:01)                                0
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

DC inter-subnet forwarding with Anycast GWs and ARP-ND host routes

While the configuration shown in the preceding section is common in DCs, there is a variation that eliminates the flooding among PEs that are attached to the same BD, which is typically caused by ARP and ND messages. The configuration described in this section is recommended only if all the following conditions are met:

  • All the hosts are directly connected to the leaf switches (PEs in DC inter-subnet forwarding with Anycast GWs).

  • All the hosts announce themselves by issuing a GARP (or unsolicited NA for IPv6) whenever they boot up or move to a different leaf switch.

    Note: This is the case for virtual machines.

  • All the traffic among hosts is IP unicast or non-IP unicast (if the hosts are in the same BD), and there is no Broadcast, Unknown unicast, or Multicast (BUM) traffic from the hosts in the tenant domain, other than ARP/ND.

If the preceding conditions are true, the ARP-ND host route feature can help eliminate BUM traffic completely.

DC inter-subnet forwarding with Anycast GWs and ARP-ND host routes shows the scenario used in this section.

Figure 4. DC inter-subnet forwarding with Anycast GWs and ARP-ND host routes

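Compared to the configuration used in the preceding section, VPRN 16 is modified in the three PEs as follows. The added commands are arp-host-route with its populate options, arp-timeout, arp-learn-unsolicited, arp-proactive-refresh, and remote-proxy-arp: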

# on PE-2:
configure
    service
        vprn "ip-vrf-16" 
            ecmp 2
            interface "evi-15" create
                mac 00:00:00:00:00:02
                vpls "sbd-15"
                    evpn-tunnel
                exit
            exit
            interface "evi-17" create
                address 10.0.0.2/24
                arp-host-route
                    populate static
                    populate dynamic
                    populate evpn
                exit
                arp-timeout 300
                arp-learn-unsolicited
                arp-proactive-refresh
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                remote-proxy-arp
                vpls "evi-17"
                exit
            exit
            no shutdown
# on PE-3:
configure
    service
        vprn "ip-vrf-16" 
            ecmp 2
            interface "evi-15" create
                mac 00:00:00:00:00:03
                vpls "sbd-15"
                    evpn-tunnel
                exit
            exit
            interface "evi-17" create
                address 10.0.0.3/24
                arp-host-route
                    populate static
                    populate dynamic
                    populate evpn
                exit
                arp-timeout 300
                arp-learn-unsolicited
                arp-proactive-refresh
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                remote-proxy-arp
                vpls "evi-17"
                exit
            exit
            no shutdown
# on PE-1:
configure
    service
        vprn "ip-vrf-16" 
            ecmp 2
            interface "evi-15" create
                mac 00:00:00:00:00:01
                vpls "sbd-15"
                    evpn-tunnel
                exit
            exit
            interface "evi-17" create
                address 10.0.0.1/24
                arp-host-route
                    populate static
                    populate dynamic
                    populate evpn
                exit
                arp-timeout 300
                arp-learn-unsolicited
                arp-proactive-refresh
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                remote-proxy-arp
                vpls "evi-17"
                exit
            exit
            interface "evi-22" create
                address 10.0.20.1/24
                arp-host-route
                    populate static
                    populate dynamic
                    populate evpn
                exit
                arp-timeout 300
                arp-learn-unsolicited
                arp-proactive-refresh
                vrrp 1 passive
                    backup 10.0.20.254
                    ping-reply
                    traceroute-reply
                exit
                remote-proxy-arp
                vpls "evi-22"
                exit
            exit
            no shutdown

The behavior due to the newly added commands is as follows:

  • arp-host-route>populate [static | dynamic | evpn] makes the router create an ARP-ND host route per ARP entry in the route table of VPRN "ip-vrf-16".

  • arp-learn-unsolicited makes the router learn ARP entries for the hosts out of the GARP messages that they send when they boot up or move. Without this command, ARP entries are only created after the router receives packets with the host as the destination, issues an ARP request, and the host replies to this solicited ARP request.

  • arp-proactive-refresh makes the router refresh every dynamic ARP entry even if there is no traffic destined to the owner. Without the command, host IP addresses will not be maintained in the ARP cache unless they receive traffic from remote hosts.

  • arp-timeout 300 is the timeout selected in this example (in seconds). The ARP timeout determines how often the router tries to refresh an entry (30 seconds before the timeout expires). In environments where the hosts are subject to mobility (VMs moving between leaves), a shorter ARP timeout speeds up the removal of the old ARP entry, and therefore of the old ARP-ND host route. However, in scaled environments with tens of thousands of ARP entries, Nokia does not recommend lowering the ARP timeout below 10 minutes (600 seconds).

  • remote-proxy-arp allows the router to reply to any ARP request looking for an IP address in the same subnet as the source, with its virtual MAC (00:00:5e:00:01:01), and route the traffic, as long as there is a route for the destination in the route table.
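The remote-proxy-arp behavior in the last item of the list above can be approximated with a toy Python decision function that follows only the wording of that item. The function name proxy_arp_reply is hypothetical, and leaving a gratuitous ARP (sender equal to target) unanswered is an assumption of this sketch; the code is not a description of the SR OS implementation.

```python
import ipaddress

VIRTUAL_MAC = "00:00:5e:00:01:01"  # virtual MAC of VRRP instance 1 in this chapter


def proxy_arp_reply(interface_subnet, sender_ip, target_ip):
    """Return the MAC used in the proxy ARP reply, or None when no reply is sent."""
    subnet = ipaddress.ip_network(interface_subnet)
    sender = ipaddress.ip_address(sender_ip)
    target = ipaddress.ip_address(target_ip)
    if sender == target:
        # Assumption: a gratuitous ARP is an announcement and is not answered
        return None
    if sender in subnet and target in subnet:
        # Target is in the same subnet as the source: answer with the virtual MAC
        return VIRTUAL_MAC
    return None


# host-182 asks for host-183, both in subnet 10.0.0.0/24
print(proxy_arp_reply("10.0.0.0/24", "10.0.0.182", "10.0.0.183"))
```

After such a reply, the host sends the traffic to the virtual MAC and the PE routes it, provided that the route table contains a route (for example, an ARP-ND host route) for the destination.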

In addition, the following commands will be executed in the three PEs:

# on PE-1, PE-2, PE-3:
configure 
    service 
        vpls "evi-17" 
            bgp-evpn 
                vxlan 
                    shutdown
                exit
                no ingress-repl-inc-mcast-advertisement
                vxlan 
                    no shutdown
                exit
            exit

By disabling the advertisement of the Inclusive Multicast Ethernet Tag (IMET) route in R-VPLS 17, the PEs do not create VXLAN BUM destinations toward each other, which prevents the exchange of BUM traffic. Only known unicast traffic can now be exchanged in the context of R-VPLS 17. The three PEs show VXLAN destinations that have Mcast "-", as opposed to "BUM":

*A:PE-3# show service id 17 vxlan destinations
 
===============================================================================
Egress VTEP, VNI
===============================================================================
Instance    VTEP Address                            Egress VNI  EvpnStatic Num
 Mcast       Oper State                              L2 PBR     SupBcasDom MACs
-------------------------------------------------------------------------------
1           192.0.2.1                               17          evpn       3
 -           Up                                      No          No
1           192.0.2.2                               17          evpn       2
 -           Up                                      No          No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 2
-------------------------------------------------------------------------------
===============================================================================
 
===============================================================================
BGP EVPN-VXLAN Ethernet Segment Dest
===============================================================================
Instance  Eth SegId                       Num. Macs     Last Change
-------------------------------------------------------------------------------
No Matching Entries
===============================================================================

With the described configuration, when the hosts boot up and generate a GARP message, the ARP entries are created, followed by the ARP-ND host routes and the EVPN IP-prefix advertisements for them. The host bootup is simulated by disabling and re-enabling the VPRN that emulates the host. As an example, some debug commands are used to see the behavior when host-181 boots up and sends a GARP:

*A:PE-1# configure service vprn 18 shutdown 
*A:PE-1# configure service vprn 18 no shutdown 

1 2022/02/18 14:58:30.128 UTC MINOR: DEBUG #2001 vprn18 PIP
"PIP: ARP
instance 3 (18), interface index 7 (local),
ARP egressing on local
   Who has 10.0.0.181 ? Tell 10.0.0.181
"
 
2 2022/02/18 14:58:30.129 UTC MINOR: DEBUG #2001 vprn16 PIP
"PIP: ARP
instance 2 (16), interface index 5 (evi-17),
ARP ingressing on evi-17
   Who has 10.0.0.181 ? Tell 10.0.0.181
"
 
4 2022/02/18 14:58:30.129 UTC MINOR: DEBUG #2001 vprn21 PIP
"PIP: ARP
instance 5 (21), interface index 9 (local),
ARP ingressing on local
   Who has 10.0.0.181 ? Tell 10.0.0.181
"

The GARP creates an ARP entry and, subsequently, an ARP-ND host route in the route table of VPRN 16. Host-181 MAC/IP and IP-prefix routes are advertised too:

3 2022/02/18 14:58:30.129 UTC MINOR: DEBUG #2001 vprn16 PIP
"PIP: ROUTE
instance 2 (16), RTM ADD event
   New Route Info
      prefix: 10.0.0.181/32 (0x11952c690)  preference: 1   metric: 0  
              backup metric: 0  owner: ARP-ND ownerId: 0
      1 ecmp hops  0 backup hops:
         hop 0:  10.0.0.181 @ if 5, weight 0
"

5 2022/02/18 14:58:30.129 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.3
"Peer 1: 192.0.2.3: UPDATE
Peer 1: 192.0.2.3 - Send BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 81
    Flag: 0x90 Type: 14 Len: 44 Multiprotocol Reachable NLRI:
        Address Family EVPN
        NextHop len 4 NextHop 192.0.2.1
        Type: EVPN-MAC Len: 33 RD: 192.0.2.1:17 ESI: ESI-0, tag: 0, mac len: 48 
              mac: 00:00:00:00:01:81, IP len: 0, IP: NULL, label1: 17
    Flag: 0x40 Type: 1 Len: 1 Origin: 0
    Flag: 0x40 Type: 2 Len: 0 AS Path:
    Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
    Flag: 0xc0 Type: 16 Len: 16 Extended Community:
        target:64500:17
        bgp-tunnel-encap:VXLAN
"
 
6 2022/02/18 14:58:30.129 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.3
"Peer 1: 192.0.2.3: UPDATE
Peer 1: 192.0.2.3 - Send BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 90
    Flag: 0x90 Type: 14 Len: 45 Multiprotocol Reachable NLRI:
        Address Family EVPN
        NextHop len 4 NextHop 192.0.2.1
        Type: EVPN-IP-PREFIX Len: 34 RD: 192.0.2.1:15, tag: 0, 
                   ip_prefix: 10.0.0.181/32 gw_ip 0.0.0.0 Label: 15 (Raw Label: 0xf)
    Flag: 0x40 Type: 1 Len: 1 Origin: 0
    Flag: 0x40 Type: 2 Len: 0 AS Path:
    Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
    Flag: 0xc0 Type: 16 Len: 24 Extended Community:
        target:64500:15
        mac-nh:00:00:00:00:00:01
        bgp-tunnel-encap:VXLAN
"

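The sequence visible in the debug output above (GARP received, ARP entry created, /32 ARP-ND host route added) can be sketched in a few lines of Python. This is only an illustrative model, not SR OS behavior in full: the class and function names are hypothetical, and the sketch assumes that a GARP is recognized by the sender IP ("Tell") being equal to the target IP ("Who has").

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ArpRequest:
    sender_ip: str   # the "Tell" address in the debug output
    sender_mac: str
    target_ip: str   # the "Who has" address in the debug output


def is_garp(pkt: ArpRequest) -> bool:
    """A gratuitous ARP asks for the sender's own IP address."""
    return pkt.sender_ip == pkt.target_ip


def host_route_from_garp(pkt: ArpRequest,
                         arp_table: Dict[str, str]) -> Optional[str]:
    """Learn the sender from a GARP and return its /32 host prefix.

    Returns None for anything that is not a GARP (a simplification
    made for this sketch).
    """
    if not is_garp(pkt):
        return None
    arp_table[pkt.sender_ip] = pkt.sender_mac
    return str(ipaddress.ip_network(f"{pkt.sender_ip}/32"))
```

Feeding the GARP from host-181 (10.0.0.181, MAC 00:00:00:00:01:81) into host_route_from_garp yields the prefix 10.0.0.181/32, which matches the RTM ADD event shown in the debug output.
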
As an example, following are the ARP and route tables in PE-1:

*A:PE-1# show router 16 arp
 
===============================================================================
ARP Table (Service: 16)
===============================================================================
IP Address      MAC Address       Expiry    Type   Interface
-------------------------------------------------------------------------------
10.0.0.1        00:00:00:00:1e:17 00h00m00s Oth[I] evi-17
10.0.0.2        00:00:00:00:2e:17 00h00m00s Evp[I] evi-17
10.0.0.3        00:00:00:00:3e:17 00h00m00s Evp[I] evi-17
10.0.0.181      00:00:00:00:01:81 00h03m09s Dyn[I] evi-17
10.0.0.254      00:00:5e:00:01:01 00h00m00s Oth[I] evi-17
10.0.20.1       00:00:00:00:1e:22 00h00m00s Oth[I] evi-22
10.0.20.254     00:00:5e:00:01:01 00h00m00s Oth[I] evi-22
-------------------------------------------------------------------------------
No. of ARP Entries: 7
===============================================================================
*A:PE-1# show router 16 route-table
 
===============================================================================
Route Table (Service: 16)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.0.0.0/24                                   Local   Local     00h03m09s  0
       evi-17                                                       0
10.0.0.2/32                                   Remote  ARP-ND    00h03m09s  1
       10.0.0.2                                                     0
10.0.0.3/32                                   Remote  ARP-ND    00h03m09s  1
       10.0.0.3                                                     0
10.0.0.181/32                                 Remote  ARP-ND    00h02m55s  1
       10.0.0.181                                                   0
10.0.0.182/32                                 Remote  EVPN-IFF  00h02m58s  169
       evi-15 (ET-00:00:00:00:00:02)                                0
10.0.0.183/32                                 Remote  EVPN-IFF  00h02m56s  169
       evi-15 (ET-00:00:00:00:00:03)                                0
10.0.20.0/24                                  Local   Local     00h03m09s  0
       evi-22                                                       0
-------------------------------------------------------------------------------
No. of Routes: 7
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

As discussed, the ARP-ND host routes are installed in the route table, but not in the FIB:

*A:PE-1# show router 16 fib 1
 
===============================================================================
FIB Display
===============================================================================
Prefix [Flags]                                              Protocol
  NextHop
-------------------------------------------------------------------------------
10.0.0.0/24                                                 LOCAL
  10.0.0.0 (evi-17)
10.0.0.182/32                                               EVPN-IFF
  (evi-15 (ET-00:00:00:00:00:02))
10.0.0.183/32                                               EVPN-IFF
  (evi-15 (ET-00:00:00:00:00:03))
10.0.20.0/24                                                LOCAL
  10.0.20.0 (evi-22)
-------------------------------------------------------------------------------
Total Entries : 4
-------------------------------------------------------------------------------
===============================================================================
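
The relationship between the two outputs above can be expressed as a minimal Python sketch, assuming the FIB view is modeled as the route table minus the entries owned by ARP-ND. The route list is copied from the show router 16 route-table output; the helper names are hypothetical and do not correspond to any SR OS API.

```python
import ipaddress

# (prefix, protocol) pairs taken from "show router 16 route-table" on PE-1.
ROUTE_TABLE = [
    ("10.0.0.0/24", "Local"),
    ("10.0.0.2/32", "ARP-ND"),
    ("10.0.0.3/32", "ARP-ND"),
    ("10.0.0.181/32", "ARP-ND"),
    ("10.0.0.182/32", "EVPN-IFF"),
    ("10.0.0.183/32", "EVPN-IFF"),
    ("10.0.20.0/24", "Local"),
]


def fib_view(route_table):
    """Return the entries that remain once ARP-ND host routes are left out."""
    return [(prefix, proto) for prefix, proto in route_table
            if proto != "ARP-ND"]


def covered_by_local(prefix, fib):
    """True if a Local prefix in the FIB view contains the given prefix."""
    host = ipaddress.ip_network(prefix)
    return any(
        proto == "Local" and host.subnet_of(ipaddress.ip_network(local))
        for local, proto in fib
    )
```

Applying fib_view to ROUTE_TABLE leaves four entries, the same count as in the FIB display, and each omitted /32 is still covered by the Local prefix 10.0.0.0/24.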

A side effect of this scenario is that traffic between hosts in the same BD (R-VPLS 17) is routed instead of switched. This can be shown on the traceroute from host-181 to host-182 (there are three hops instead of two), or the TTL on the ping packets (62 instead of 64):

*A:PE-1# traceroute router 18 10.0.0.182
traceroute to 10.0.0.182, 30 hops max, 40 byte packets
  1  10.0.0.1 (10.0.0.1)    2.02 ms  2.29 ms  2.26 ms
  2  10.0.0.2 (10.0.0.2)    3.35 ms  3.39 ms  3.17 ms
  3  10.0.0.182 (10.0.0.182)    4.10 ms  3.95 ms  3.56 ms
*A:PE-1# ping router 18 10.0.0.182 source 10.0.0.181
PING 10.0.0.182 56 data bytes
64 bytes from 10.0.0.182: icmp_seq=1 ttl=62 time=3.25ms.
64 bytes from 10.0.0.182: icmp_seq=2 ttl=62 time=3.49ms.
64 bytes from 10.0.0.182: icmp_seq=3 ttl=62 time=3.41ms.
64 bytes from 10.0.0.182: icmp_seq=4 ttl=62 time=3.49ms.
64 bytes from 10.0.0.182: icmp_seq=5 ttl=62 time=3.54ms.
 
---- 10.0.0.182 PING Statistics ----
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min = 3.25ms, avg = 3.43ms, max = 3.54ms, stddev = 0.101ms

This extension of a subnet across a pure routing domain is compliant with the virtual subnet concept described in RFC 7814.
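
The TTL values in the ping output above follow from simple arithmetic, sketched here in Python. The sketch assumes that the echo reply leaves host-182 with an initial TTL of 64 (the value the text compares against) and that every routed hop decrements the TTL by one; the function name is illustrative.

```python
def ttl_at_receiver(initial_ttl: int, routed_hops: int) -> int:
    """Return the TTL seen by the receiver after the given number of routed hops."""
    ttl = initial_ttl - routed_hops
    if ttl <= 0:
        # A router discards the packet once the TTL is exhausted.
        raise ValueError("packet would be discarded before reaching the receiver")
    return ttl
```

With two routed hops, ttl_at_receiver(64, 2) returns 62, the value shown in the ping replies; with pure L2 switching (zero routed hops), the TTL would remain 64.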

DCI inter-subnet forwarding with Anycast GWs and ARP-ND hosts

DCI inter-subnet forwarding with Anycast GWs and ARP-ND host routes shows a DCI scenario where the use of Anycast GWs, ARP-ND host routes, and some additional configuration provides efficient inter-subnet forwarding within the tenant domain.

Figure 5. DCI inter-subnet forwarding with Anycast GWs and ARP-ND host routes

In this example, VPLS 10 is extended across DC-1 and DC-2, via PE-4 and PE-5 (which are DCGWs). PE-4 and PE-5 are also connected to the WAN and use IP-VPN for inter-subnet forwarding connectivity to the remote host-6. In this network, PE-4 and PE-5 provide the Anycast GW functionality to host-2 and host-3, so that the hosts can move between the two DCs without having to change their IP/MAC/default GW or ARP cache, and efficient upstream forwarding is provided.

PE-4 and PE-5 learn the ARP-ND host route of their respective host and advertise it to the WAN, so that downstream routing from PE-6 can be efficient and without tromboning.

To avoid unnecessary ARP flooding between DCs, proxy-ARP is used on PE-2 and PE-3. The configuration of VPLS 10 on PE-2 and PE-3 is as follows:

# on PE-2:
configure
    service
        vpls 10 name "centralized-gw-bd" customer 1 create
            vxlan instance 1 vni 10 create
            exit
            bgp
            exit
            bgp-evpn
                evi 10
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap pxc-10.a:10 create
                no shutdown
            exit
            proxy-arp
                send-refresh 120
                no unknown-arp-request-flood-evpn
                dynamic-arp-populate
                no garp-flood-evpn
                evpn-route-tag 1
                no shutdown
            exit
            no shutdown
        exit
# on PE-3:
configure
    service
        vpls 10 name "centralized-gw-bd" customer 1 create
            vxlan instance 1 vni 10 create
            exit
            bgp
            exit
            bgp-evpn
                evi 10
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap pxc-10.a:10 create
                no shutdown
            exit
            proxy-arp
                send-refresh 120
                no unknown-arp-request-flood-evpn
                dynamic-arp-populate
                no garp-flood-evpn
                evpn-route-tag 1
                no shutdown
            exit
            no shutdown
        exit

Because the hosts are directly connected to PE-2 and PE-3, and they announce themselves to the network through a GARP when they boot up or move, the proxy-ARP configuration includes the parameters no unknown-arp-request-flood-evpn and no garp-flood-evpn. Those two commands prevent unnecessary ARP flooding between DCs.
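
The effect of these two parameters can be illustrated with a toy Python model of the proxy-ARP flooding decision. This is a sketch under stated assumptions, not the SR OS implementation: the class and method names are hypothetical, a flooded frame is assumed to always reach the local SAPs, and the two flags only decide whether it is also sent to EVPN destinations.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ProxyArpModel:
    """Toy model of the proxy-ARP flooding decisions (hypothetical class)."""
    unknown_arp_request_flood_evpn: bool = False  # no unknown-arp-request-flood-evpn
    garp_flood_evpn: bool = False                 # no garp-flood-evpn
    table: Dict[str, str] = field(default_factory=dict)  # IP -> MAC

    @staticmethod
    def _flood(to_evpn: bool) -> Tuple[str, ...]:
        # Local SAPs always receive the flooded frame in this model.
        return ("sap", "evpn") if to_evpn else ("sap",)

    def handle_request(self, sender_ip: str, sender_mac: str,
                       target_ip: str) -> Tuple[str, object]:
        """Return (action, detail) for an ARP request received on a SAP."""
        # dynamic-arp-populate: learn the sender from the snooped message.
        self.table[sender_ip] = sender_mac
        if sender_ip == target_ip:
            # GARP: flooded, to EVPN only if garp-flood-evpn is enabled.
            return ("flood", self._flood(self.garp_flood_evpn))
        if target_ip in self.table:
            # Known target: answered locally from the proxy-ARP table.
            return ("reply", self.table[target_ip])
        # Unknown target: flooded, to EVPN only if the flag allows it.
        return ("flood", self._flood(self.unknown_arp_request_flood_evpn))
```

With both flags disabled, as on PE-2 and PE-3, a GARP or a request for an unknown IP address stays within the local SAPs, while a request for a known address such as 10.0.0.254 is answered from the table.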

The two PEs also include the proxy-arp evpn-route-tag 1 command. This command allows the proxy-ARP module to tag MAC/IP routes with a non-zero IP address when they are passed to BGP for advertisement. In this example, an export policy matches the tag and adds a Site-Of-Origin (SOO) extended community to those routes. This allows PE-4, for example, to accept MAC/IP routes from its own DC-1 and drop MAC/IP routes from DC-2, so that PE-4 only advertises ARP-ND host routes for hosts attached to DC-1; the reverse applies to PE-5. The MAC/IP routes with zero IP (which are also sent for every MAC) are not tagged with the SOO and are, therefore, imported by all the PEs in VPLS 10. This allows normal L2 connectivity among the four PEs, while the ARP-ND routes are only generated for the local hosts.
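
The tagging and filtering logic described above can be summarized in a short Python sketch. It is a simplified model with hypothetical names: routes are plain objects, the export step stands in for the export-add-SOO policy on PE-2 and PE-3, and the import step stands in for a DCGW policy that drops the non-local SOO.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

SOO_DC1 = "origin:64500:1"
SOO_DC2 = "origin:64500:2"


@dataclass
class MacIpRoute:
    mac: str
    ip: Optional[str]            # None models a MAC/IP route with zero IP
    tag: int = 0
    communities: Set[str] = field(default_factory=set)


def originate(mac: str, ip: Optional[str], evpn_route_tag: int = 1) -> MacIpRoute:
    """Proxy-ARP tags only the MAC/IP routes that carry an IP address."""
    return MacIpRoute(mac, ip, tag=evpn_route_tag if ip else 0)


def export_add_soo(route: MacIpRoute, soo: str) -> MacIpRoute:
    """Model of the export policy: add the SOO when the route tag is 1."""
    if route.tag == 1:
        route.communities.add(soo)
    return route


def import_accepts(route: MacIpRoute, dropped_soo: str) -> bool:
    """Model of the DCGW import policy: drop routes carrying the non-local SOO."""
    return dropped_soo not in route.communities
```

In this model, a route for host-3 exported by PE-3 with SOO_DC2 is rejected by import_accepts(route, SOO_DC2) on PE-4, whereas the zero-IP route for the same MAC carries no SOO and is accepted.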

On PE-2, BGP is configured as follows:

# on PE-2:
configure
    router Base
        autonomous-system 64500
        policy-options
            begin
            community "SOO-DC-1" 
                members "origin:64500:1"
            exit
            policy-statement "export-add-SOO"
                entry 10
                    from
                        tag 1
                    exit
                    action accept
                        community add "SOO-DC-1"
                    exit
                exit
            exit
            policy-statement "import-prefer-DC-1"
                entry 10
                    from
                        community "SOO-DC-1"
                    exit
                    action accept
                        local-preference 200
                    exit
                exit
            exit
            commit
        exit
        bgp
            family vpn-ipv4 vpn-ipv6 evpn
            vpn-apply-import
            vpn-apply-export
            import "import-prefer-DC-1" 
            export "export-add-SOO" 
            rapid-withdrawal
            rapid-update evpn
            group "dc"
                type internal
                neighbor 192.0.2.3
                exit
            exit
            group "dcgws"
                type internal
                neighbor 192.0.2.4
                exit
                neighbor 192.0.2.5
                exit
            exit
        exit

On PE-3, BGP is configured as follows:

# on PE-3:
configure
    router
        autonomous-system 64500
        policy-options
            begin
            community "SOO-DC-2" 
                members "origin:64500:2"
            exit
            policy-statement "export-add-SOO"
                entry 10
                    from
                        tag 1
                    exit
                    action accept
                        community add "SOO-DC-2"
                    exit
                exit
            exit
            policy-statement "import-prefer-DC-2"
                entry 10
                    from
                        community "SOO-DC-2"
                    exit
                    action accept
                        local-preference 200
                    exit
                exit
            exit
            commit
        exit
        bgp
            family vpn-ipv4 vpn-ipv6 evpn
            vpn-apply-import
            vpn-apply-export
            import "import-prefer-DC-2" 
            export "export-add-SOO" 
            rapid-withdrawal
            rapid-update evpn
            group "dc"
                type internal
                neighbor 192.0.2.2
                exit
            exit
            group "dcgws"
                type internal
                neighbor 192.0.2.4
                exit
                neighbor 192.0.2.5
                exit
            exit
        exit

As an example, the following show commands prove that PE-2 does not add an SOO to MAC/IP routes with zero-IP, but it does add SOO-DC-1 for MAC/IP routes with non-zero IP:

*A:PE-2# show router bgp routes evpn mac rd 192.0.2.2:10 hunt 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN MAC Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
---snip---
 
Network        : n/a
Nexthop        : 192.0.2.2
Path Id        : None
To             : 192.0.2.3
Res. Nexthop   : n/a
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : n/a
Connector      : None
Community      : origin:64500:1 target:64500:10
                 bgp-tunnel-encap:VXLAN
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.3
Origin         : IGP
AS-Path        : No As-Path
EVPN type      : MAC
ESI            : ESI-0
Tag            : 0
IP Address     : 10.0.0.2
Route Dist.    : 192.0.2.2:10
Mac Address    : 00:00:00:00:00:02
MPLS Label1    : VNI 10                 MPLS Label2    : n/a
Route Tag      : 0
Neighbor-AS    : n/a
Orig Validation: N/A
Source Class   : 0                      Dest Class     : 0
 
Network        : n/a
Nexthop        : 192.0.2.2
Path Id        : None
To             : 192.0.2.3
Res. Nexthop   : n/a
Local Pref.    : 100                    Interface Name : NotAvailable
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : n/a
Connector      : None
Community      : target:64500:10 bgp-tunnel-encap:VXLAN
                 mac-mobility:Seq:0/Static
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.3
Origin         : IGP
AS-Path        : No As-Path
EVPN type      : MAC
ESI            : ESI-0
Tag            : 0
IP Address     : n/a
Route Dist.    : 192.0.2.2:10
Mac Address    : 02:13:ff:00:03:3a
MPLS Label1    : VNI 10                 MPLS Label2    : n/a
Route Tag      : 0
Neighbor-AS    : n/a
Orig Validation: N/A
Source Class   : 0                      Dest Class     : 0

---snip---

The VPLS 10 configuration on PE-4 and the corresponding import policy to drop non-local SOO follow. PE-5 has a similar configuration (not shown), including the same RD 64500:10 in VPLS 10 as PE-4. On PE-5, the import policy drops routes tagged with SOO-DC-1 instead of SOO-DC-2.

# on PE-4:
configure
    service
        vpls 10 name "centralized-gw-bd" customer 1 create
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 10 create
            exit
            bgp
                route-distinguisher 64500:10
            exit
            bgp-evpn
                evi 10
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit 

On PE-4, the BGP configuration is as follows:

# on PE-4:
configure
    router Base
        autonomous-system 64500
        policy-options
            begin
            community "SOO-DC-1"
                members "origin:64500:1"
            exit
            community "SOO-DC-2" 
                members "origin:64500:2"
            exit
            policy-statement "export-add-SOO"
                entry 10
                    from
                    exit
                    action accept
                        community add "SOO-DC-1"
                    exit
                exit
            exit
            policy-statement "import-drop-DC-2"
                entry 10
                    from
                        community "SOO-DC-2"
                    exit
                    action drop
                    exit
                exit
            exit
            commit
        exit
        bgp
            family vpn-ipv4 vpn-ipv6 evpn
            vpn-apply-import
            vpn-apply-export
            import "import-drop-DC-2" 
            export "export-add-SOO" 
            rapid-withdrawal
            rapid-update evpn
            group "dc"
                type internal
                neighbor 192.0.2.2
                exit
                neighbor 192.0.2.3
                exit
            exit
            group "wan"
                type internal
                neighbor 192.0.2.5
                exit
                neighbor 192.0.2.6
                exit
            exit
            no shutdown
        exit

There is another aspect for which policies are used: on PE-2 and PE-3, two MAC/IP routes with the Anycast GW virtual MAC are received (one from PE-4 and another from PE-5). To provide efficient upstream routing with no tromboning, it is important that PE-2 prefers the PE-4 virtual MAC route (its own DCGW) over that of PE-5, and vice versa for PE-3. This is achieved by:

  • Configuring the same RD on PE-4 and PE-5 for VPLS 10.

  • Configuring an import policy on PE-2 and PE-3 that modifies the local preference of the routes, so that each one prefers the local DCGW.

PE-2 and PE-3 could have dropped the routes from the non-local DCGW, but with this configuration, DCGW redundancy is provided in case of failure:

*A:PE-2# show router policy "import-prefer-DC-1" 
    entry 10
        from
            community "SOO-DC-1"
        exit
        action accept
            local-preference 200
        exit
    exit
*A:PE-2# show router bgp routes evpn mac community target:64500:10 
                                                  mac-address 00:00:5e:00:01:01
===============================================================================
 BGP Router ID:192.0.2.2        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete
 
===============================================================================
BGP EVPN MAC Routes
===============================================================================
Flag  Route Dist.         MacAddr           ESI
      Tag                 Mac Mobility      Label1
                          Ip Address
                          NextHop
-------------------------------------------------------------------------------
u*>i  64500:10            00:00:5e:00:01:01 ESI-0
      0                   Static            VNI 10
                          10.0.0.254
                          192.0.2.4
 
*i    64500:10            00:00:5e:00:01:01 ESI-0
      0                   Static            VNI 10
                          10.0.0.254
                          192.0.2.5
 
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
*A:PE-3# show router policy "import-prefer-DC-2" 
    entry 10
        from
            community "SOO-DC-2"
        exit
        action accept
            local-preference 200
        exit
    exit
*A:PE-3# show router bgp routes evpn mac community target:64500:10 
                                                  mac-address 00:00:5e:00:01:01
===============================================================================
 BGP Router ID:192.0.2.3        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete
 
===============================================================================
BGP EVPN MAC Routes
===============================================================================
Flag  Route Dist.         MacAddr           ESI
      Tag                 Mac Mobility      Label1
                          Ip Address
                          NextHop
-------------------------------------------------------------------------------
u*>i  64500:10            00:00:5e:00:01:01 ESI-0
      0                   Static            VNI 10
                          10.0.0.254
                          192.0.2.5
 
*i    64500:10            00:00:5e:00:01:01 ESI-0
      0                   Static            VNI 10
                          10.0.0.254
                          192.0.2.4
 
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
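
The selection shown in the outputs above can be modeled with a small Python sketch. It assumes that the two virtual MAC routes compete as paths for the same route (same RD and MAC), that the import policy raises the local preference to 200 for the local SOO and leaves the default of 100 otherwise, and that the highest local preference wins; further BGP tie-breakers are outside the scope of the sketch. All names are hypothetical.

```python
DEFAULT_LOCAL_PREF = 100
PREFERRED_LOCAL_PREF = 200


def apply_import(path, preferred_soo):
    """Return a copy of the path with the local preference set by the import policy."""
    preferred = preferred_soo in path["communities"]
    local_pref = PREFERRED_LOCAL_PREF if preferred else DEFAULT_LOCAL_PREF
    return {**path, "local_pref": local_pref}


def best_path(paths, preferred_soo):
    """Pick the path with the highest local preference, or None if there is none."""
    candidates = [apply_import(p, preferred_soo) for p in paths]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["local_pref"])


# The two Anycast GW virtual MAC routes received by PE-2 and PE-3.
VMAC_PATHS = [
    {"rd": "64500:10", "mac": "00:00:5e:00:01:01",
     "next_hop": "192.0.2.4", "communities": {"origin:64500:1"}},
    {"rd": "64500:10", "mac": "00:00:5e:00:01:01",
     "next_hop": "192.0.2.5", "communities": {"origin:64500:2"}},
]
```

On PE-2 (preferring origin:64500:1) the best path points to 192.0.2.4, and on PE-3 (preferring origin:64500:2) it points to 192.0.2.5. If the preferred path is withdrawn, the remaining path is selected, which is the redundancy that dropping the non-local routes would have removed.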

Finally, the VPRN 11 configuration on PE-4 and PE-5 is as follows:

# on PE-4:
configure
    service
        vprn 11 name "wan-ip-vpn" customer 1 create
            interface "evi-10" create
                address 10.0.0.4/16
                mac 00:00:00:00:00:04
                arp-host-route
                    populate static route-tag 1
                    populate dynamic route-tag 1
                    populate evpn route-tag 1
                exit
                arp-timeout 600
                arp-learn-unsolicited
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                vpls "centralized-gw-bd"
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher auto-rd
                    vrf-target target:64500:11
                    no shutdown
                exit
            exit
            no shutdown
        exit
# on PE-5:
configure
    service
        vprn 11 name "wan-ip-vpn" customer 1 create
            interface "evi-10" create
                address 10.0.0.5/16
                mac 00:00:00:00:00:05
                arp-host-route
                    populate static route-tag 1
                    populate dynamic route-tag 1
                    populate evpn route-tag 1
                exit
                arp-timeout 600
                arp-learn-unsolicited
                vrrp 1 passive
                    backup 10.0.0.254
                    ping-reply
                    traceroute-reply
                exit
                vpls "centralized-gw-bd"
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher auto-rd
                    vrf-target target:64500:11
                    no shutdown
                exit
            exit
            no shutdown
        exit

The passive VRRP commands, as well as the ARP commands, have already been discussed in preceding sections. The only new element in the configuration is the route-tag 1 parameter of the populate commands. This parameter tags all the ARP-ND host routes learned on the interface, so that export policies can match on that tag and modify the routes before they are advertised. The parameter is included for completeness; in this configuration, no export policy uses this tag.

When the configuration is in place and the hosts are connected, the FDBs, proxy-ARP, ARP caches, and route tables are checked with the following commands (example for host-2 and host-6).

When host-2 ARPs for its default gateway (10.0.0.254), PE-2 will reply with the information from its proxy-ARP table:

*A:PE-2# show service id 10 proxy-arp detail 10.0.0.254
-------------------------------------------------------------------------------
Proxy Arp
-------------------------------------------------------------------------------
Admin State       : enabled
Dyn Populate      : enabled
Age Time          : disabled            Send Refresh      : 120 secs
Table Size        : 250                 Total             : 5
Static Count      : 0                   EVPN Count        : 4
Dynamic Count     : 1                   Duplicate Count   : 0
 
Dup Detect
-------------------------------------------------------------------------------
Detect Window     : 3 mins              Num Moves         : 5
Hold down         : 9 mins
Anti Spoof MAC    : None
 
EVPN
-------------------------------------------------------------------------------
Garp Flood        : disabled            Req Flood         : disabled
Static Black Hole : disabled
EVPN Route Tag    : 1
-------------------------------------------------------------------------------
 
===============================================================================
VPLS Proxy Arp Entries
===============================================================================
IP Address          Mac Address         Type      Status    Last Update
-------------------------------------------------------------------------------
10.0.0.254          00:00:5e:00:01:01   evpn      active    02/18/2022 15:06:57
-------------------------------------------------------------------------------
Number of entries : 1
===============================================================================

When host-2 sends traffic to the virtual MAC, PE-2 forwards it to PE-4 based on a lookup in the FDB:

*A:PE-2# show service id 10 fdb mac 00:00:5e:00:01:01
 
===============================================================================
Forwarding Database, Service 10
===============================================================================
ServId     MAC               Source-Identifier       Type     Last Change
            Transport:Tnl-Id                         Age
-------------------------------------------------------------------------------
10         00:00:5e:00:01:01 vxlan-1:                EvpnS:P  02/18/22 15:06:57
                             192.0.2.4:10
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf
===============================================================================

If PE-4 receives packets with MAC Destination Address (DA) equal to the virtual MAC and IP DA of host-6 (172.16.0.6), the forwarding is based on the information in the R-VPLS FDB first, and afterward on the VPRN 11 route table, as follows.

*A:PE-4# show service id 10 fdb mac 00:00:5e:00:01:01
 
===============================================================================
Forwarding Database, Service 10
===============================================================================
ServId     MAC               Source-Identifier       Type     Last Change
            Transport:Tnl-Id                         Age
-------------------------------------------------------------------------------
10         00:00:5e:00:01:01 cpm                     Intf     02/18/22 15:06:57
-------------------------------------------------------------------------------
Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static Lf=Leaf
===============================================================================
*A:PE-4# show router 11 route-table
 
===============================================================================
Route Table (Service: 11)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.0.0.0/16                                   Local   Local     00h06m07s  0
       evi-10                                                       0
10.0.0.2/32                                   Remote  ARP-ND    00h06m06s  1
       10.0.0.2                                                     0
172.16.0.0/24                                 Remote  BGP VPN   00h05m33s  170
       192.0.2.6 (tunneled)                                         10
-------------------------------------------------------------------------------
No. of Routes: 3
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

When the traffic goes back from host-6 to host-2, PE-6 forwards it to PE-4 based on a Longest Prefix Match (LPM) lookup in the VPRN route table. The advertisement of the ARP-ND routes by PE-4 and PE-5 ensures that PE-6 can forward downstream traffic to the correct PE:

*A:PE-6# show router 11 route-table
 
===============================================================================
Route Table (Service: 11)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.0.0.0/16                                   Remote  BGP VPN   00h06m57s  170
       192.0.2.4 (tunneled)                                         10
10.0.0.0/16                                   Remote  BGP VPN   00h06m57s  170
       192.0.2.5 (tunneled)                                         10
10.0.0.2/32                                   Remote  BGP VPN   00h06m57s  170
       192.0.2.4 (tunneled)                                         10
10.0.0.3/32                                   Remote  BGP VPN   00h06m57s  170
       192.0.2.5 (tunneled)                                         10
172.16.0.0/24                                 Local   Local     00h07m01s  0
       local                                                        0
-------------------------------------------------------------------------------
No. of Routes: 5
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
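
The LPM decision on PE-6 can be reproduced with the Python standard library. The sketch below copies the prefixes and next hops from the route table above; the function name is illustrative, and returning all next hops of the longest match is an assumption made to show that the /16 remains reachable through both DCGWs.

```python
import ipaddress

# (prefix, next hop) pairs taken from "show router 11 route-table" on PE-6.
PE6_ROUTES = [
    ("10.0.0.0/16", "192.0.2.4"),
    ("10.0.0.0/16", "192.0.2.5"),
    ("10.0.0.2/32", "192.0.2.4"),
    ("10.0.0.3/32", "192.0.2.5"),
    ("172.16.0.0/24", "local"),
]


def next_hops(dest, routes=PE6_ROUTES):
    """Return the sorted next hops of the longest prefix that contains dest."""
    addr = ipaddress.ip_address(dest)
    matching = [
        (ipaddress.ip_network(prefix), nh)
        for prefix, nh in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matching:
        return []
    longest = max(net.prefixlen for net, _ in matching)
    return sorted(nh for net, nh in matching if net.prefixlen == longest)
```

For 10.0.0.2, the /32 host route wins and only 192.0.2.4 (PE-4) is returned; for 10.0.0.3, only 192.0.2.5 (PE-5). An address in the subnet without a host route falls back to the /16 and both DCGWs, which is where tromboning could occur.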

Traceroute commands from host-6 provide information about the path to each remote host (VPRN 12 on PE-6 simulates host-6):

*A:PE-6# traceroute router 12 10.0.0.2
traceroute to 10.0.0.2, 30 hops max, 40 byte packets
  1  172.16.0.254 (172.16.0.254)    3.09 ms  2.23 ms  2.31 ms
  2  10.0.0.4 (10.0.0.4)    3.27 ms  3.24 ms  3.28 ms
  3  10.0.0.2 (10.0.0.2)    5.64 ms  5.77 ms  5.86 ms
*A:PE-6# traceroute router 12 10.0.0.3
traceroute to 10.0.0.3, 30 hops max, 40 byte packets
  1  172.16.0.254 (172.16.0.254)    1.96 ms  2.19 ms  2.20 ms
  2  10.0.0.5 (10.0.0.5)    3.44 ms  3.27 ms  3.04 ms
  3  10.0.0.3 (10.0.0.3)    8.40 ms  5.63 ms  5.48 ms
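
When many hosts need to be checked, the hop lists can be extracted from the traceroute output with a short script. The following Python sketch assumes the output format shown above; the regular expression and the function names `hop_addresses` and `transit_hop` are illustrative and not part of any SR OS tooling.

```python
import re

# Traceroute output copied from PE-6 (VPRN 12 simulates host-6).
TRACE_TO_HOST_2 = """\
traceroute to 10.0.0.2, 30 hops max, 40 byte packets
  1  172.16.0.254 (172.16.0.254)    3.09 ms  2.23 ms  2.31 ms
  2  10.0.0.4 (10.0.0.4)    3.27 ms  3.24 ms  3.28 ms
  3  10.0.0.2 (10.0.0.2)    5.64 ms  5.77 ms  5.86 ms
"""

TRACE_TO_HOST_3 = """\
traceroute to 10.0.0.3, 30 hops max, 40 byte packets
  1  172.16.0.254 (172.16.0.254)    1.96 ms  2.19 ms  2.20 ms
  2  10.0.0.5 (10.0.0.5)    3.44 ms  3.27 ms  3.04 ms
  3  10.0.0.3 (10.0.0.3)    8.40 ms  5.63 ms  5.48 ms
"""

# A hop line starts with the hop number, followed by the responding address.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(")


def hop_addresses(output):
    """Return the responding address of every hop, in order."""
    hops = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hops.append(match.group(2))
    return hops


def transit_hop(output):
    """Return the last hop before the destination, or None if there is none."""
    hops = hop_addresses(output)
    return hops[-2] if len(hops) >= 2 else None


print(transit_hop(TRACE_TO_HOST_2))
print(transit_hop(TRACE_TO_HOST_3))
```

The hop before the destination differs per host (10.0.0.4 for host-2 and 10.0.0.5 for host-3), which is consistent with the /32 next hops 192.0.2.4 and 192.0.2.5 in the route table of VPRN 11.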

Communication between host-2 and host-3 uses regular L2 switching, as expected: both hosts are in subnet 10.0.0.0/16 and EVPN-VXLAN destinations are created between PE-2 and PE-3 for VPLS 10:

*A:PE-2# show service id 10 vxlan destinations
 
===============================================================================
Egress VTEP, VNI
===============================================================================
Instance    VTEP Address                            Egress VNI  EvpnStatic Num
 Mcast       Oper State                              L2 PBR     SupBcasDom MACs
-------------------------------------------------------------------------------
1           192.0.2.3                               10          evpn       2
 BUM         Up                                      No          No
1           192.0.2.4                               10          evpn       2
 BUM         Up                                      No          No
1           192.0.2.5                               10          evpn       1
 BUM         Up                                      No          No
-------------------------------------------------------------------------------
Number of Egress VTEP, VNI : 3
-------------------------------------------------------------------------------
===============================================================================
 
===============================================================================
BGP EVPN-VXLAN Ethernet Segment Dest
===============================================================================
Instance  Eth SegId                       Num. Macs     Last Change
-------------------------------------------------------------------------------
No Matching Entries
===============================================================================
*A:PE-2# ping router 12 10.0.0.3
PING 10.0.0.3 56 data bytes
64 bytes from 10.0.0.3: icmp_seq=1 ttl=64 time=9.23ms.
64 bytes from 10.0.0.3: icmp_seq=2 ttl=64 time=3.69ms.
64 bytes from 10.0.0.3: icmp_seq=3 ttl=64 time=3.46ms.
64 bytes from 10.0.0.3: icmp_seq=4 ttl=64 time=3.42ms.
64 bytes from 10.0.0.3: icmp_seq=5 ttl=64 time=3.48ms.
 
---- 10.0.0.3 PING Statistics ----
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min = 3.42ms, avg = 4.65ms, max = 9.23ms, stddev = 2.29ms
*A:PE-2# traceroute router 12 10.0.0.3
traceroute to 10.0.0.3, 30 hops max, 40 byte packets
  1  10.0.0.3 (10.0.0.3)    3.76 ms  3.69 ms  3.67 ms
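
The single hop in this traceroute follows from the host's own forwarding decision: a destination inside the local subnet is reached directly at L2, while any other destination is sent to the default gateway. The sketch below illustrates that decision in Python. The function name `forwarding_mode` is illustrative, and the host-6 address 172.16.0.6/24 is an assumption made for this example (only its subnet, 172.16.0.0/24, appears in the outputs).

```python
import ipaddress


def forwarding_mode(source_interface, destination):
    """Return "bridged" if the destination is on-link for the source, else "routed"."""
    interface = ipaddress.ip_interface(source_interface)
    if ipaddress.ip_address(destination) in interface.network:
        return "bridged"
    return "routed"


# host-2 (10.0.0.2/16) to host-3: same subnet, so the traffic is bridged in VPLS 10.
print(forwarding_mode("10.0.0.2/16", "10.0.0.3"))

# host-6 (address assumed to be 172.16.0.6/24) to host-3: different subnet, so routed.
print(forwarding_mode("172.16.0.6/24", "10.0.0.3"))
```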

Troubleshooting and debugging

The following commands can be used when troubleshooting these scenarios:

  • show router <id> route-table and show router <id> fib <slot> (and their corresponding commands for IPv6)

  • show router <id> arp and show router <id> neighbor

  • show service id <id> fdb detail

  • show service id <id> proxy-arp detail and show service id <id> proxy-nd detail

  • show router bgp routes evpn / vpn-ipv4 / vpn-ipv6

The following debug commands are also useful for analyzing these scenarios:

debug
    router "Base"
        bgp
            update
        exit
    exit
    router service-name "ip-vrf-16"
        ip
            arp
            route-table
        exit
    exit
    router service-name "VM-test-anycast-gw"
        ip
            arp
        exit
    exit
    service
        id 10
            proxy-arp
                all
            exit
        exit
    exit
exit

Conclusion

ARP-ND host routes are generated from the ARP-ND entries in a router context. Together with passive VRRP (for Anycast GWs), these host routes provide efficient inter-subnet forwarding in DCs and DCI networks.