EVPN for VXLAN tunnels (Layer 2)

EVPN-VXLAN L2 basic configuration

Basic configuration of EVPN-VXLAN L2 on SR Linux consists of the following:

  • A vxlan-interface, which contains the ingress VNI of the incoming VXLAN packets associated with the vxlan-interface.

  • A MAC-VRF network-instance, where the vxlan-interface is attached. Only one vxlan-interface can be attached to a MAC-VRF network-instance.

  • BGP-EVPN is also enabled in the same MAC-VRF with a minimum configuration of the EVI and the network-instance vxlan-interface associated with it.

    The BGP instance under BGP-EVPN has an encapsulation-type leaf, which is VXLAN by default.

    For EVPN, this determines that the BGP encapsulation extended community is advertised with value VXLAN and the value encoded in the label fields of the advertised NLRIs is a VNI.

    If the route-distinguisher or route-target/policies are not configured, the required values are automatically derived from the configured EVI as follows:

    • The route-distinguisher is derived as <ip-address:evi>, where the ip-address is the IPv4 address of the default network-instance sub-interface system0.0.

    • The route-target is derived as <asn:evi>, where the asn is the autonomous system configured in the default network-instance.
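The auto-derivation rules above can be illustrated with a minimal Python sketch. The function names and the sample values (system0.0 address 10.0.0.1, autonomous system 64490) are hypothetical and chosen only for illustration; the "target:" prefix follows the notation used in the route-target configuration shown in this section.

```python
import ipaddress


def derive_rd(system_ipv4: str, evi: int) -> str:
    """Return the auto-derived route-distinguisher in <ip-address:evi> form."""
    # Validate the address first; raises ValueError if it is not IPv4.
    ip = ipaddress.IPv4Address(system_ipv4)
    return f"{ip}:{evi}"


def derive_rt(asn: int, evi: int) -> str:
    """Return the auto-derived route-target in <asn:evi> form."""
    return f"target:{asn}:{evi}"


if __name__ == "__main__":
    # Hypothetical values: system0.0 is 10.0.0.1, the AS is 64490, the EVI is 10.
    print(derive_rd("10.0.0.1", 10))   # 10.0.0.1:10
    print(derive_rt(64490, 10))        # target:64490:10
```

The sketch only formats the values; it does not model how SR Linux handles explicitly configured policies, which take the place of the derived values.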

The following example shows a basic EVPN-VXLAN L2 configuration consisting of a vxlan-interface, MAC-VRF network-instance, and BGP-EVPN configuration:

--{ candidate shared default }--[ ]--
# info
...
    tunnel-interface vxlan1 {
        vxlan-interface 1 {
            type bridged
            ingress {
                vni 10
            }
            egress {
                source-ip use-system-ipv4-address
            }
        }
    }
  
// In the network-instance:
  
A:dut2#  network-instance blue
--{ candidate shared default }--[ network-instance blue ]--
# info
    type mac-vrf
    admin-state enable
    description "Blue network instance"
    interface ethernet-1/2.1 {
    }
    vxlan-interface vxlan1.1 {
    }
    protocols {
        bgp-evpn {
            bgp-instance 1 {
                admin-state enable
                vxlan-interface vxlan1.1
                evi 10
            }
        }
        bgp-vpn {
            bgp-instance 1 {
                // rd and rt are auto-derived from evi if this context is not configured
                export-policy pol-def-1
                import-policy pol-def-1
                route-distinguisher {
                    route-distinguisher 64490:200
                }
                route-target {
                    export-rt target:64490:200
                    import-rt target:64490:100
                }
            }
        }
    }

EVPN L2 basic routes

EVPN Layer 2 (without multi-homing) includes the implementation of the BGP-EVPN address family and support for the following route types:

  • EVPN MAC/IP route (or type 2, RT2)

  • EVPN Inclusive Multicast Ethernet Tag route (IMET or type 3, RT3)

The MAC/IP route is used to convey the MAC and IP information of hosts connected to subinterfaces in the MAC-VRF. The IMET route is advertised as soon as bgp-evpn is enabled in the MAC-VRF; it serves the following purposes:

  • Auto-discovery of the remote VTEPs attached to the same EVI

  • Creation of a default flooding list in the MAC-VRF so that BUM frames are replicated

Advertisement of the MAC/IP and IMET routes is configured on a per-MAC-VRF basis. The following example shows the default setting advertise true, which advertises MAC/IP and IMET routes.

Note that changing the setting of the advertise parameter and committing the change internally flaps the BGP instance.

--{ candidate shared default }--[ network-instance blue protocols bgp-evpn bgp-instance 1 ]--
# info detail
    admin-state enable
    vxlan-interface vxlan1.1
    evi 1
    ecmp 1
    default-admin-tag 0
    routes {
        next-hop use-system-ipv4-address
        mac-ip {
            advertise true
        }
        inclusive-mcast {
            advertise true
        }
    }

Creation of VXLAN destinations based on received EVPN routes

The creation of VXLAN destinations of type unicast, unicast ES (Ethernet Segment), and multicast for each vxlan-interface is driven by the reception of EVPN routes.

The created unicast, unicast ES, and multicast VXLAN destinations are visible in state. Each destination is allocated a system-wide unique destination index, which is an internal NHG-ID (next-hop group ID). The following example shows the destination index for destination 10.44.44.4, vni 1:

--{ [FACTORY] + candidate shared default }--[  ]--
# info from state tunnel-interface vxlan1 vxlan-interface 1 bridge-table unicast-destinations destination * vni *
    tunnel-interface vxlan1 {
        vxlan-interface 1 {
            bridge-table {
                unicast-destinations {
                    destination 10.44.44.4 vni 1 {
                        destination-index 677716962904 // destination index
                        statistics {
                        }
                        mac-table {
                            mac 00:00:00:01:01:04 {
                                type evpn-static
                                last-update "16 hours ago"
                            }
                        }
                    }
                }
            }
        }
    }
--{ [FACTORY] + candidate shared default }--[  ]--
# info from state network-instance blue bridge-table mac-table mac 00:00:00:01:01:04
    network-instance blue {
        bridge-table {
            mac-table {
                mac 00:00:00:01:01:04 {
                    destination-type vxlan
                    destination-index 677716962904 // destination index
                    type evpn-static
                    last-update "16 hours ago"
                    destination "vxlan-interface:vxlan1.1 vtep:10.44.44.4 vni:1"
                }
            }
        }
    }

The following is an example of dynamically created multicast destinations for a vxlan-interface:

--{ [FACTORY] + candidate shared default }--[  ]--
A:dut1# info from state tunnel-interface vxlan1 vxlan-interface 1 bridge-table multicast-destinations
    tunnel-interface vxlan1 {
        vxlan-interface 1 {
            bridge-table {
                multicast-destinations {
                    destination 40.1.1.2 vni 1 {
                        multicast-forwarding BUM
                        destination-index 46428593833
                    }
                    destination 40.1.1.3 vni 1 {
                        multicast-forwarding BUM
                        destination-index 46428593835
                    }
                    destination 40.1.1.4 vni 1 {
                        multicast-forwarding BUM
                        destination-index 46428593829
                    }
                }
            }
        }
    }
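As a rough illustration of how received EVPN routes drive the creation of destinations, the following Python sketch groups routes into unicast, ES, and multicast sets. It is a simplification under stated assumptions: the EvpnRoute class and build_destinations function are hypothetical, and real ES destinations also depend on the AD routes, which are not modeled here.

```python
from dataclasses import dataclass

ZERO_ESI = "00:00:00:00:00:00:00:00:00:00"


@dataclass(frozen=True)
class EvpnRoute:
    """Hypothetical, reduced view of a received EVPN route."""
    route_type: int      # 2 = MAC/IP, 3 = IMET
    next_hop: str        # remote VTEP address
    vni: int
    esi: str = ZERO_ESI  # only meaningful for MAC/IP routes


def build_destinations(routes):
    """Group received routes into unicast, ES, and multicast destinations."""
    dests = {"unicast": set(), "es": set(), "multicast": set()}
    for r in routes:
        if r.route_type == 3:
            # IMET route: the remote VTEP joins the flooding list.
            dests["multicast"].add((r.next_hop, r.vni))
        elif r.route_type == 2 and r.esi == ZERO_ESI:
            # MAC/IP route with zero ESI: plain unicast destination.
            dests["unicast"].add((r.next_hop, r.vni))
        elif r.route_type == 2:
            # MAC/IP route with non-zero ESI: keyed by ESI in this sketch.
            dests["es"].add((r.esi, r.vni))
    return dests


if __name__ == "__main__":
    received = [
        EvpnRoute(3, "40.1.1.2", 1),
        EvpnRoute(3, "40.1.1.3", 1),
        EvpnRoute(2, "10.44.44.4", 1),
    ]
    print(build_destinations(received))
```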

EVPN route selection

When a MAC is received from multiple sources, the route is selected based on the priority listed in MAC selection. Learned and EVPN-learned routes have equal priority; the latest received route is selected.

When multiple EVPN-learned MAC/IP routes arrive for the same MAC but with a different key (for example, two routes for MAC M1 with different route-distinguishers), a selection is made based on the following priority:

  1. EVPN MACs with higher SEQ number

  2. EVPN MACs with lower IP next-hop

  3. EVPN MACs with lower Ethernet Tag

  4. EVPN MACs with lower RD
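The four tie-breakers above can be expressed as a single sort key. The following Python sketch is illustrative only: the MacIpRoute class and select_route function are hypothetical, and the sketch assumes that every route-distinguisher uses the <ip-address:number> form so that RDs can be compared numerically.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class MacIpRoute:
    """Hypothetical, reduced view of an EVPN-learned MAC/IP route."""
    mac: str
    seq: int        # MAC mobility sequence number
    next_hop: str   # IPv4 BGP next hop
    eth_tag: int
    rd: str         # assumed "<ipv4>:<number>" form in this sketch


def _rd_key(rd: str):
    # Split "<ipv4>:<number>" and compare both parts numerically.
    ip, number = rd.rsplit(":", 1)
    return (int(ipaddress.IPv4Address(ip)), int(number))


def select_route(routes):
    """Pick the preferred route among EVPN-learned routes for one MAC."""
    return min(
        routes,
        key=lambda r: (
            -r.seq,                                  # 1. higher SEQ wins
            int(ipaddress.IPv4Address(r.next_hop)),  # 2. lower IP next hop
            r.eth_tag,                               # 3. lower Ethernet Tag
            _rd_key(r.rd),                           # 4. lower RD
        ),
    )
```

For example, given two routes with sequence number 1 and next hops 10.0.0.9 and 10.0.0.3, plus one route with sequence number 0, the sketch selects the route with next hop 10.0.0.3.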

BGP next hop configuration for EVPN routes

You can configure the BGP next hop to be used for the EVPN routes advertised for a network-instance. By default, this next hop is the IPv4 address configured on subinterface system0.0 of the default network-instance. However, the next-hop address can be changed to any IPv4 address.

The system does not check that the configured IP address exists in the default network-instance. Any valid IP address can be used as next hop of the EVPN routes advertised for the network-instance, irrespective of its existence in any subinterface of the system. However, the receiver leaf nodes create their unicast, multicast and ES destinations to this advertised next-hop, so it is important that the configured next-hop is a valid IPv4 address that exists in the default network-instance.

When the system or loopback interface configured for the BGP next-hop is administratively disabled, EVPN still advertises the routes, as long as a valid IP address is available for the next-hop. However, received traffic on that interface is dropped.

The following example configures a BGP next hop to be used for the EVPN routes advertised for a network-instance.

--{ candidate shared default }--[ network-instance 1 protocols bgp-evpn bgp-instance 1 ]--
# info
    routes {
        next-hop 1.1.1.1
    }

MAC duplication detection for Layer 2 loop prevention in EVPN

MAC loop prevention in EVPN broadcast domains is based on the SR Linux MAC duplication feature (see MAC duplication detection and actions), but considers MACs that are learned via EVPN as well. The feature detects MAC duplication for MACs moving among bridge subinterfaces of the same MAC-VRF, as well as MACs moving between bridge subinterfaces and EVPN in the same MAC-VRF, but not for MACs moving from a VTEP to a different VTEP (via EVPN) in the same MAC-VRF.

In addition, when a MAC is declared duplicate and the blackhole configuration option is added to the interface, frames are discarded on both access and VXLAN paths: incoming frames on bridged subinterfaces are discarded if their MAC SA or DA matches the blackhole MAC, and frames encapsulated in VXLAN packets are discarded if their source or destination MAC matches the blackhole MAC in the mac-table.

When a MAC exceeds the allowed num-moves, the MAC is moved to type duplicate (irrespective of the type of move: EVPN-to-local, local-to-local, or local-to-EVPN), and the EVPN application receives an update that advertises the MAC with a higher sequence number (which may trigger duplication detection in other nodes). The "duplicate" MAC can be overwritten by a higher-priority type or flushed with the tools command (see Deleting entries from the bridge table).

EVPN L2 multi-homing

SR Linux supports single-active multi-homing and all-active multi-homing, as defined in RFC 7432. The EVPN multi-homing implementation uses the following SR Linux features:

  • System network-instance

    A system network-instance container hosts the configuration and state of EVPN for multi-homing.

  • BGP network-instance

    The ES model uses a BGP instance from which the RD/RT and export/import policies are taken to advertise and process the multi-homing ES routes. Only one BGP instance is allowed, and all the ESes are configured under this BGP instance. The RD/RTs cannot be configured when the BGP instance is associated with the system network-instance; however, the operational RD/RTs are still shown in state.

  • Ethernet Segments

    An ES has an admin-state (disabled by default) setting that must be toggled to change any of the parameters that affect the EVPN control plane. In particular, the ESes support the following:

    • General and per-ES boot and activation timers.

    • Manual 10-byte ESI configuration.

    • All-active and single-active multi-homing modes.

    • DF Election algorithm type Default (modulo based) or type Preference.

    • Configuration of ES and AD per-ES routes next-hop, and ES route originating-IP per ES.

    • An AD per ES route is advertised per mac-vrf, where the route carries the network-instance RD and RT.

    • Association with an interface that can be of type Ethernet or LAG. When associated with a LAG, the LAG can be static or LACP-based. In case of LACP, the same system-id/system-priority/port-key settings must be configured on all the nodes attached to the same ES.

  • Aliasing load balancing

    The hashing operation for aliasing load balancing uses the following hash fields in the incoming frames by default:

    • For IP traffic: IP DA and IP SA, Layer 4 source and destination ports, protocol, VLAN ID.

    • For Ethernet (non-IP) traffic: MAC DA and MAC SA, VLAN ID, Ethertype.

    For IPv6 addresses, 32-bit fields are generated by XORing and folding the 128-bit address. The packet fields are supplied as input to the hashing computation.

  • Reload-delay timer

    The reload-delay timer configures an interface to be shut down for a period of time following a node reboot or an IMM reset to avoid black-holing traffic.
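The IPv6 folding step mentioned under aliasing load balancing can be pictured with the following Python sketch. It shows one plausible folding scheme (XOR of the four 32-bit words of the address); the exact scheme used by the hardware is not described in this section, so treat the function as an assumption for illustration only.

```python
import ipaddress


def fold_ipv6_to_32(addr: str) -> int:
    """XOR the four 32-bit words of an IPv6 address into one 32-bit value."""
    value = int(ipaddress.IPv6Address(addr))
    folded = 0
    for _ in range(4):
        folded ^= value & 0xFFFFFFFF  # take the lowest 32-bit word
        value >>= 32                  # move to the next word
    return folded


if __name__ == "__main__":
    # 2001:db8::1 has the words 0x20010db8, 0, 0, and 0x00000001.
    print(hex(fold_ipv6_to_32("2001:db8::1")))  # 0x20010db9
```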

EVPN L2 multi-homing procedures

EVPN relies on three different procedures to handle multi-homing: DF election, split-horizon, and aliasing. DF election is relevant to single-active and all-active multi-homing; split-horizon and aliasing are relevant only to all-active multi-homing.

  • DF Election – The Designated Forwarder (DF) is the leaf that forwards BUM traffic in the ES. Only one DF can exist per ES at a time, and it is elected based on the exchange of ES routes (type 4) and the subsequent DF Election Algorithm (DF Election Alg).

    In single-active multi-homing, the non-DF leafs bring down the subinterface associated with the ES.

    In all-active multi-homing, the non-DF leafs do not forward BUM traffic received from remote EVPN PEs.

  • Split-horizon – This is the mechanism by which BUM traffic received from a peer ES PE is filtered so that it is not looped back to the CE that first transmitted the frame. Local bias is applied in VXLAN services, as described in RFC 8365.

  • Aliasing – This is the procedure by which PEs that are not attached to the ES can process non-zero ESI MAC/IP routes and AD routes and create ES destinations to which per-flow ECMP can be applied.

To support multi-homing, EVPN-VXLAN supports two additional route types:

  • ES routes (type 4) – Used for ES discovery on all the leafs attached to the ES and DF Election.

    ES routes use an ES-import route target extended community (its value is derived from the ESI), so that its distribution is limited to only the leafs that are attached to the ES.

    The ES route is advertised with the DF Election extended community, which indicates the intent to use a specific DF Election Alg and capabilities.

    Upon reception of the remote ES routes, each PE builds a DF candidate list based on the originator IP of the ES routes. Then, based on the agreed DF Election Alg, each PE elects one of the candidates as DF for each mac-vrf where the ES is defined.

  • AD route (type 1) – Advertised to the leafs attached to an ES. There are two versions of AD routes:

    • AD per-ES route – Used to advertise the multi-homing mode (single-active or all-active) and the ESI label, which is not advertised or processed in case of VXLAN. Its withdrawal enables the mass withdrawal procedures in the remote PEs.

    • AD per-EVI route – Used to advertise the availability of an ES in an EVI and its VNI. It is needed by the remote leafs for the aliasing procedures.

    Both versions of AD routes can influence the DF Election. Their withdrawal from a leaf results in removing that leaf from consideration for DF Election for the associated EVI, as long as ac-df exclude is not configured. (The AC-DF capability can be set to exclude only if the DF election algorithm type is set to preference.)
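The default (modulo-based) DF election built on the candidate list can be sketched as follows. The sketch follows the service-carving idea of RFC 7432 (order the candidates by IP address and pick the entry at index V mod N); using the EVI as the value V is an assumption made here for illustration, and the function name is hypothetical.

```python
import ipaddress


def elect_df_default(candidate_ips, evi: int) -> str:
    """Modulo-based election: order candidates by IP, pick index evi mod N."""
    if not candidate_ips:
        raise ValueError("no DF candidates")
    # Remove duplicates and order numerically by IPv4 address.
    ordered = sorted(set(candidate_ips),
                     key=lambda ip: int(ipaddress.IPv4Address(ip)))
    return ordered[evi % len(ordered)]


if __name__ == "__main__":
    peers = ["40.1.1.2", "40.1.1.1"]
    print(elect_df_default(peers, 10))  # 40.1.1.1
    print(elect_df_default(peers, 11))  # 40.1.1.2
```

Because the index depends on the service identifier, different MAC-VRFs sharing the same ES can end up with different DFs, which spreads BUM forwarding across the ES peers.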

EVPN-VXLAN local bias for all-active multi-homing

Local bias for all-active multi-homing is based on the following behavior at the ingress and egress leafs:

  • At the ingress leaf, any BUM traffic received on an all-active multi-homing LAG subinterface (associated with an EVPN-VXLAN mac-vrf) is flooded to all local subinterfaces, irrespective of their DF or NDF status, and VXLAN tunnels.

  • At the egress leaf, any BUM traffic received on a VXLAN subinterface (associated with an EVPN-VXLAN mac-vrf) is flooded to single-homed subinterfaces and, if the leaf is DF for the ES, to multi-homed subinterfaces whose ES is not shared with the owner of the source VTEP.

In SR Linux, the local bias filtering entries on the egress leaf are added or removed based on the ES routes, and they are not modified by the removal of AD per EVI/ES routes. This may cause blackholes in the multi-homed CE for BUM traffic if the local subinterfaces are administratively disabled.
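The egress-leaf behavior described above can be summarized in a short Python sketch. The SubIf class, the egress_flood_list function, and the interface and ES names are hypothetical; the sketch only models the flooding decision, not how the filtering entries are programmed from ES routes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubIf:
    """Hypothetical local subinterface in the mac-vrf."""
    name: str
    es: Optional[str] = None  # None means single-homed
    is_df: bool = False       # DF state of this leaf for the ES


def egress_flood_list(subifs, source_vtep_es):
    """Subinterfaces that receive BUM traffic arriving from a source VTEP.

    source_vtep_es is the set of ESes shared with the owner of the source VTEP.
    """
    out = []
    for s in subifs:
        if s.es is None:
            out.append(s.name)  # single-homed: always flooded
        elif s.is_df and s.es not in source_vtep_es:
            out.append(s.name)  # multi-homed: leaf is DF and ES is not shared
    return out


if __name__ == "__main__":
    local = [
        SubIf("ethernet-1/1.1"),
        SubIf("lag1.1", "ES-1", True),
        SubIf("lag2.1", "ES-2", True),
        SubIf("lag3.1", "ES-3", False),
    ]
    print(egress_flood_list(local, {"ES-1"}))  # ['ethernet-1/1.1', 'lag2.1']
```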

Single-active multi-homing

EVPN L2 single-homing configuration shows a single-active ES attached to two leaf nodes. In this configuration, the ES in single-active mode can be configured to do the following:

  • Associate to an Ethernet interface or a LAG interface (as all-active ESes).

  • Coexist with all-active ESes on the same node, as well as in the same MAC-VRF service.

  • Signal the non-DF state to the CE by using LACP out-of-synch signaling or power off.

    Optionally, the ES can be configured not to signal to the CE. When the LACP synch flag or power off is used to signal the non-DF state to the CE/server, all of the subinterfaces are active on the same node; that is, load balancing is not per-service, but rather per-port. This mode of operation is known as EVPN multi-homing port-active mode.

  • Connect to a CE that uses a single LAG to connect to the ES or separate LAG/ports per leaf in the ES.

Figure 1. EVPN L2 single-homing configuration

All peers in the ES must be configured with the same multi-homing mode; if the nodes are not configured consistently, the oper-multi-homing-mode in state is single-active. From a hardware resource perspective, no local-bias-table entries are populated for ESes in single-active mode.

The following features work in conjunction with single-active mode:

Preference-based DF election / non-revertive option

Preference-based DF election is defined in draft-ietf-bess-evpn-pref-df and specifies a way for the ES peers to elect a DF based on a preference value (highest preference-value wins). The draft-ietf-bess-evpn-pref-df document also defines a non-revertive mode, so that upon recovery of a former DF node, traffic does not revert to the node. This is desirable in most cases to avoid double impact on the traffic (failure and recovery).

The configuration requires the command df-election/algorithm/type preference and the corresponding df-election/algorithm/preference-alg/preference-value. Optionally, you can set non-revertive mode to true. See EVPN multi-homing configuration example.

All of the peers in the ES should be configured with the same algorithm type. However, if that is not the case, all the peers fall back to the default algorithm/oper-type.
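The preference-based election and the fallback rule can be sketched in Python as follows. The EsPeer class and both functions are hypothetical. The default preference of 32767 matches the default mentioned in the configuration example in this section, while breaking ties on the lowest originator IP is an assumption made for the sketch.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class EsPeer:
    """Hypothetical view of one ES peer as seen through its ES route."""
    originator_ip: str
    alg_type: str            # "default" or "preference"
    preference: int = 32767  # default preference-value


def oper_alg_type(peers) -> str:
    """All peers must agree on 'preference'; otherwise fall back to 'default'."""
    if all(p.alg_type == "preference" for p in peers):
        return "preference"
    return "default"


def elect_df_preference(peers) -> str:
    """Highest preference wins; ties go to the lowest IP (assumption)."""
    best = min(
        peers,
        key=lambda p: (-p.preference,
                       int(ipaddress.IPv4Address(p.originator_ip))),
    )
    return best.originator_ip


if __name__ == "__main__":
    peers = [EsPeer("40.1.1.1", "preference", 100),
             EsPeer("40.1.1.2", "preference", 50)]
    print(oper_alg_type(peers))        # preference
    print(elect_df_preference(peers))  # 40.1.1.1
```

The sketch does not model non-revertive mode, which keeps the current DF in place when a former DF recovers.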

Attachment Circuit influenced DF Election (AC-DF)

AC-DF refers to the ability to modify the DF election candidate list for a network-instance based on the presence of the AD per-EVI and per-ES routes. When enabled (ac-df include command), a node cannot become DF if it has the ES subinterface in administratively disabled state. The AC-DF capability is defined in RFC 8584, and it is by default enabled.

The AC-DF capability should be disabled (ac-df exclude command) when single-active multi-homing is used and standby-signaling (lacp or power-off) signals the non-DF state to the multi-homed CE/server. In this case, the same node must be DF for all the contained subinterfaces. Administratively disabling one subinterface does not cause a DF switchover for the network-instance if ac-df exclude is configured.

The AC-DF capability is configured with the command df-election algorithm preference-alg capabilities ac-df; include is the default. See EVPN multi-homing configuration example.
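The effect of the ac-df setting on the DF candidate list can be pictured with a short Python sketch. The function name and arguments are hypothetical; the sketch assumes that the set of peers with valid AD routes for the network-instance is already known.

```python
def df_candidates(es_route_originators, ad_route_originators,
                  ac_df: str = "include"):
    """Return the DF candidate list for one network-instance."""
    if ac_df == "exclude":
        # AD route presence is ignored: every ES route originator is a candidate.
        return sorted(es_route_originators)
    if ac_df != "include":
        raise ValueError("ac_df must be 'include' or 'exclude'")
    # With AC-DF, a peer must also advertise the AD routes to stay a candidate.
    return sorted(ip for ip in es_route_originators
                  if ip in ad_route_originators)


if __name__ == "__main__":
    es_peers = {"40.1.1.1", "40.1.1.2"}
    ad_peers = {"40.1.1.2"}  # 40.1.1.1 withdrew its AD routes
    print(df_candidates(es_peers, ad_peers))             # ['40.1.1.2']
    print(df_candidates(es_peers, ad_peers, "exclude"))  # ['40.1.1.1', '40.1.1.2']
```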

Standby LACP-based or power-off signaling

Standby LACP-based or power-off signaling is used for cases where the AC-DF capability is excluded, and the DF election port-active mode is configured.

When single-active multi-homing is used and all subinterfaces on the node for the ES must be in DF or non-DF state, the multi-homed CE should not send traffic to the non-DF node. SR Linux supports two ways of signaling the non-DF state to the multi-homed CE: LACP standby or power-off.

Signaling the non-DF state is configured at the interface level, using the command interface ethernet standby-signaling, and must also be enabled for a specific ES using the ethernet-segment df-election interface-standby-signaling-on-non-df command. See EVPN multi-homing configuration example.

The LACP signaling method is only available on LAG interfaces with LACP enabled. When the node is in non-DF state, it uses an LACP out-of-synch notification (the synch bit is clear in the LACP PDUs) to signal the non-DF state to the CE. The CE then brings down LACP, and neither the system nor the peer moves to the collecting-distributing state (because of out_of_sync). After the port is out of standby mode, LACP must be re-established before the forwarding ports can be programmed.

The power-off signaling is available on Ethernet and LAG interfaces. When the node is in non-DF state, the interface goes oper-down, and the lasers on the Ethernet interfaces (all members in case of a LAG) are turned off. This brings the CE interface down and avoids any traffic on the link. The interface state shows oper-state down and oper-down-reason standby-signaling.

Reload-delay timer

After the system boots, the reload-delay timer keeps an interface shut down with the laser off for a configured amount of time until connectivity with the rest of the network is established. When applied to an access multi-homed interface (typically an Ethernet Segment interface), this delay can prevent black-holing traffic coming from the multi-homed server or CE.

In EVPN multi-homing scenarios, if one leaf in the ES peer group is rebooting, coming up after an upgrade or a failure, it is important for the ES interface not to become active until after the node is ready to forward traffic to the core. If the ES interface comes up too quickly and the node has not programmed its forwarding tables yet, traffic from the server is black-holed. To prevent this from happening, you can configure a reload-delay timer on the ES interface so that the interface does not become active until after network connectivity is established.

When a reload-delay timer is configured, the interface port is shut down and the laser is turned off from the time that the system determines the interface state following a reboot or reload of the XDP process, until the number of seconds specified in the reload-delay timer elapses.

The reload-delay timer is only supported on Ethernet interfaces that are not enabled with breakout mode. For a multi-homed LAG interface, the reload-delay timer should be configured on all the interface members. The reload-delay timer can be from 1 to 86,400 seconds. There is no default value; if not configured for an interface, there is no reload-delay timer.

Only ES interfaces should be configured with a non-zero reload-delay timer. Single-homed interfaces and network interfaces (used to forward VXLAN traffic) should not have a reload-delay timer configured.

The following example sets the reload-delay timer for an interface to 20 seconds. The timer starts following a system reboot or when the IMM is reconnected, and the system determines the interface state. During the timer period, the interface is deactivated and the port laser is inactive.

--{ * candidate shared default }--[  ]--
# info interface ethernet-1/1
    interface ethernet-1/1 {
        admin-state enable
        ethernet {
            reload-delay 20
        }
    }

When the reload-delay timer is running, the oper-down-reason for the port is shown as interface-reload-timer-active. The reload-delay-expires state indicates the amount of time remaining until the port becomes active. For example:

--{ * candidate shared default }--[  ]--
# info from state interface ethernet-1/1
    interface ethernet-1/1 {
        description eth_seg_1
        admin-state enable
        mtu 9232
        loopback-mode false
        ifindex 671742
        oper-state down
        oper-down-reason interface-reload-timer-active
        last-change "51 seconds ago"
        vlan-tagging true
        ...
        ethernet {
            auto-negotiate false
            lacp-port-priority 32768
            port-speed 100G
            hw-mac-address 00:01:01:FF:00:15
            reload-delay 20
            reload-delay-expires "18 seconds from now"
            flow-control {
                receive false
                transmit false
            }
        }
    }

EVPN multi-homing configuration example

The following is an example of a single-active multi-homing configuration, including standby signaling power-off, AC-DF capability, and preference-based DF algorithm.

The following configures power-off signaling and the reload delay timer for an interface:

--{ * candidate shared default }--[  ]--
# info interface ethernet-1/21 ethernet
    standby-signaling power-off // needed to signal non-DF state to the CE
    reload-delay 100 // upon reboot, this is required to avoid attracting traffic from the multi-homed CE until the node is ready to forward.
    // The time accounts for the time it takes all network protocols to be up and forwarding entries ready.

The following configures DF election settings for the ES, including preference-based DF election and a preference value for the DF election alg. The ac-df setting is set to exclude, which disables the AC-DF capability. The non-revertive option is enabled, which prevents traffic from reverting back to a former DF node when the node reconnects to the network.

--{ * candidate shared default }--[ system network-instance protocols evpn ethernet-segments bgp-instance 1 ethernet-segment eth_seg_1 ]--
# info
    admin-state enable
    esi 00:01:00:00:00:00:00:00:00:00
    interface ethernet-1/21
    multi-homing-mode single-active
    df-election {
        interface-standby-signaling-on-non-df { // presence container that enables the standby-signaling for the ES
        }
        algorithm {
            type preference // enables the use of preference based DF election
            preference-alg {
                preference-value 100 // changes the default 32767 to a specific value
                capabilities {
                    ac-df exclude // turns off the default ac-df capability
                    non-revertive true // enables the non-revertive mode
                }
            }
        }
    }

The following shows the state (and consequently the configuration) of an ES for single-active multi-homing and indicates the default settings for the algorithm/oper-type. All of the peers in the ES should be configured with the same algorithm type. However, if that is not the case, all the peers fall back to the default algorithm.

--{ * candidate shared default }--[ system network-instance protocols evpn ethernet-segments bgp-instance 1 ethernet-segment eth_seg_1 ]--
# info
    admin-state enable
    oper-state up
    esi 00:01:00:00:00:00:00:00:00:00
    interface ethernet-1/21
    multi-homing-mode single-active
    oper-multi-homing-mode single-active // oper mode may be different if not all the ES peers are configured in the same way
    df-election {
        interface-standby-signaling-on-non-df {
        }
        algorithm {
            type preference
            oper-type preference // if at least one peer in the ES is in type default, all the peers will fall back to default
            preference-alg {
                preference-value 100
                capabilities {
                    ac-df exclude
                    non-revertive true
                }
            }
        }
    }
    routes {
        next-hop use-system-ipv4-address
        ethernet-segment {
            originating-ip use-system-ipv4-address
        }
    }
    association {
        network-instance blue {
            bgp-instance 1 {
                designated-forwarder-role-last-change "2 seconds ago"
                designated-forwarder-activation-start-time "2 seconds ago"
                designated-forwarder-activation-time 3
                computed-designated-forwarder-candidates {
                    designated-forwarder-candidate 40.1.1.1 {
                        add-time "2 seconds ago"
                        designated-forwarder true
                    }
                    designated-forwarder-candidate 40.1.1.2 {
                        add-time "2 minutes ago"
                        designated-forwarder false
                    }
                }
            }
        }
    }

To display information about the ES, use the show system network-instance ethernet-segments command. For example:

--{ [FACTORY] + candidate shared default }--[  ]--
# show system network-instance ethernet-segments eth_seg_1
------------------------------------------------------------------------------------
eth_seg_1 is up, single-active
  ESI      : 00:01:00:00:00:00:00:00:00:00
  Alg      : preference
  Peers    : 40.1.1.2
  Interface: ethernet-1/21
  Network-instances:
     blue
      Candidates : 40.1.1.1 (DF), 40.1.1.2
      Interface : ethernet-1/21.1
------------------------------------------------------------------------------------
Summary
 1 Ethernet Segments Up
 0 Ethernet Segments Down
------------------------------------------------------------------------------------

The detail option displays more information about the ES. For example:

--{ [FACTORY] + candidate shared default }--[  ]--
# show system network-instance ethernet-segments eth_seg_1 detail
====================================================================================
Ethernet Segment
====================================================================================
Name                 : eth_seg_1
40.1.1.1 (DF)
Admin State          : enable              Oper State        : up
ESI                  : 00:01:00:00:00:00:00:00:00:00
Multi-homing         : single-active       Oper Multi-homing : single-active
Interface            : ethernet-1/21
ES Activation Timer  : None
DF Election          : preference          Oper DF Election  : preference
 
Last Change          : 2021-04-06T08:49:44.017Z
====================================================================================
 MAC-VRF    Actv Timer Rem   DF
eth_seg_1   0                Yes
------------------------------------------------------------------------------------
DF Candidates
------------------------------------------------------------------------------------
Network-instance       ES Peers
blue                   40.1.1.1 (DF)
blue                   40.1.1.2
====================================================================================

On the DF node, the info from state command displays the following:

--{ [FACTORY] + candidate shared default }--[  ]--
# info from state interface ethernet-1/21 | grep oper
        oper-state up
            oper-state not-present
            oper-state up

--{ [FACTORY] + candidate shared default }--[  ]--
# info from state network-instance blue interface ethernet-1/21.1
    network-instance blue {
        interface ethernet-1/21.1 {
            oper-state up
            oper-mac-learning up
            index 6
            multicast-forwarding BUM
        }
    }

On the non-DF node, the info from state command displays the following:

--{ [FACTORY] + candidate shared default }--[  ]--
# info from state interface ethernet-1/21 | grep oper
        oper-state down
        oper-down-reason standby-signaling
            oper-state not-present
            oper-state down
            oper-down-reason port-down

--{ [FACTORY] + candidate shared default }--[  ]--
# info from state network-instance blue interface ethernet-1/21.1
    network-instance blue {
        interface ethernet-1/21.1 {
            oper-state down
            oper-down-reason subif-down
            oper-mac-learning up
            index 7
            multicast-forwarding none
        }
    }

Layer 2 proxy-ARP/ND

Proxy-ARP (Address Resolution Protocol) and Proxy-ND (Neighbor Discovery) are Layer 2 functions supported on MAC-VRF network-instances, specified in RFC 9161. Proxy-ARP/ND enables the learning of IP-to-MAC bindings so that leaf nodes can reply to ARP-requests or Neighbor Solicitations without having to flood those requests in the BD. The proxy-ARP/ND function supports ARP/ND flooding suppression and security protection against ARP/ND spoofing attacks in large BDs.

When proxy-ARP/ND is enabled for a MAC-VRF, a table is created that contains entries related to proxy-ARP/ND for the BD. Entries in the proxy-ARP/ND table can be of the following types:

  • Dynamic

    Dynamic IP-MAC entries are learned by snooping ARP and ND messages. This requires enabling proxy-ARP/ND and enabling dynamic learning. See Dynamic learning for proxy-ARP/ND.

  • Static

    Static IP-MAC entries are manually provisioned in the proxy-ARP/ND table. This requires enabling proxy-ARP/ND and configuring static entries. See Static proxy-ARP entries.

  • EVPN-learned

    EVPN-learned IP-MAC entries are learned from information received in EVPN MAC/IP (type 2 or RT2) routes from remote PE devices. This requires enabling proxy-ARP/ND and configuring bgp-evpn on the MAC-VRF. See EVPN learning for proxy-ARP/ND.

  • Duplicate

    Duplicate entries are identified as duplicates by the IP duplication detection procedure. You can configure the criteria that determine whether an entry is a duplicate and optionally inject anti-spoofing MACs in case of duplication. See Proxy-ARP/ND duplicate IP detection.

The proxy-ARP/ND table for the MAC-VRF has a default size of 250 entries. You can modify the table size. See Proxy-ARP/ND table.

Figure 2 illustrates how proxy-ARP/ND functions in a BD, using proxy-ARP as the example.

Figure 2. Layer 2 proxy-ARP example

In this example, the SR Linux leaf nodes snoop and learn the local hosts and add dynamic IP-MAC entries for them in the proxy-ARP table for the MAC-VRF. The IP-MAC bindings for the local hosts are advertised in EVPN MAC/IP (type 2 or RT2) routes and installed as EVPN-learned IP-MAC entries in the proxy-ARP table on the remote SR Linux leaf nodes.

For example, SRL LEAF-1 dynamically learns the IP-MAC binding for Host-1 (10.0.0.1-M1) and adds it to the proxy-ARP table as a dynamically learned entry. The IP-MAC binding for Host-1 is advertised in an EVPN MAC/IP route. The remote SRL LEAF-4 installs the IP-MAC binding for Host-1 in its proxy-ARP table as an EVPN-learned entry. When Host-5 sends an ARP-request for 10.0.0.1, SRL LEAF-4 looks it up in the proxy-ARP table and replies with M1. Because SRL LEAF-4 has the 10.0.0.1-M1 binding in its proxy-ARP table, it does not need to flood the request in the BD.

If the lookup is unsuccessful, the ARP-request is re-injected into the datapath and flooded in the BD. Alternatively, this flooding can be suppressed. See Configuring proxy-ARP/ND traffic flooding options.
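The lookup behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not SR Linux code: the function name handle_arp_request, the dictionary-based table, and the placeholder MAC used for M1 are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


def handle_arp_request(table: Dict[str, str], target_ip: str) -> Tuple[str, Optional[str]]:
    """Decide how a leaf handles an ARP-request for target_ip (illustrative model)."""
    mac = table.get(target_ip)
    if mac is not None:
        # Hit: the leaf answers locally with the stored MAC; no flooding is needed.
        return ("reply", mac)
    # Miss: the request is re-injected into the datapath and flooded in the BD.
    return ("flood", None)


# Hypothetical proxy-ARP table on LEAF-4; M1 is written as a placeholder MAC.
leaf4_table = {"10.0.0.1": "00:00:5e:00:53:01"}

print(handle_arp_request(leaf4_table, "10.0.0.1"))
print(handle_arp_request(leaf4_table, "10.0.0.9"))
```

The sketch deliberately ignores the configurable flooding options, which only change what happens on a miss.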

See RFC 9161 for details about proxy-ARP/ND in EVPN deployments.

Dynamic learning for proxy-ARP/ND

When dynamic learning for proxy-ARP/ND is enabled, all frames coming into bridged subinterfaces of the MAC-VRF that have Ethertype 0x0806 (for proxy-ARP) or that are ICMPv6 Neighbor Discovery messages (for proxy-ND) are sent to the CPM for learning. This includes ARP-request and ARP-reply messages (including gratuitous ARP) for proxy-ARP, and Neighbor Solicitation and Neighbor Advertisement messages for proxy-ND. For proxy-ARP, dynamic entries are created from both ARP-request and ARP-reply messages. For proxy-ND, dynamic entries are created only from Neighbor Advertisement messages. An entry is learned irrespective of the MAC Destination Address (DA) of the ARP or ND packet (unicast or broadcast).

The dynamically learned information in the table is based on the ARP/ND payload MAC Source Address (SA) and IP SA, and not the frame outer MAC SA (although they normally match). A valid MAC SA must be present for the entry to be learned.
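As a rough illustration of learning from the payload rather than from the outer frame, the following Python sketch unpacks a 28-byte Ethernet/IPv4 ARP payload and returns the sender's IP-MAC binding. It assumes the binding is taken from the sender hardware address and sender protocol address fields. The function name learn_from_arp and the validity checks (rejecting all-zero or group MACs, and skipping 0.0.0.0 senders) are assumptions for the example, not the documented SR Linux checks.

```python
import socket
import struct
from typing import Optional, Tuple

# ARP payload for Ethernet/IPv4: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total).
ARP_FMT = "!HHBBH6s4s6s4s"
ARP_LEN = struct.calcsize(ARP_FMT)


def learn_from_arp(payload: bytes) -> Optional[Tuple[str, str]]:
    """Return (ip, mac) taken from the ARP payload, or None if nothing is learned."""
    if len(payload) < ARP_LEN:
        return None
    htype, ptype, hlen, plen, _oper, sha, spa, _tha, _tpa = struct.unpack(
        ARP_FMT, payload[:ARP_LEN]
    )
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    # Assumed validity check: reject all-zero and group (multicast/broadcast) MACs.
    if sha == b"\x00" * 6 or sha[0] & 0x01:
        return None
    # Assumed: a 0.0.0.0 sender (DAD probe) carries no binding to learn.
    if spa == b"\x00" * 4:
        return None
    mac = ":".join(f"{b:02x}" for b in sha)
    return (socket.inet_ntoa(spa), mac)


# Example ARP-request from 10.0.0.1 (placeholder MAC) asking for 10.0.0.5.
example = struct.pack(
    ARP_FMT, 1, 0x0800, 6, 4, 1,
    bytes.fromhex("00005e005301"), socket.inet_aton("10.0.0.1"),
    b"\x00" * 6, socket.inet_aton("10.0.0.5"),
)
print(learn_from_arp(example))
```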

In addition to learning the dynamic entry from the ARP-request, ARP-reply, or Neighbor Advertisement, the system re-injects and forwards messages as follows:

  • For received ARP-request or Neighbor Solicitation messages, the system looks up the requested IP address, and if the lookup is successful, it sends an ARP-reply or Neighbor Advertisement with the MAC→IP information. If the lookup is not successful, the ARP-request / Neighbor Solicitation is re-injected and flooded based on the flood list and the configured flooding option, with source squelching. See Configuring proxy-ARP/ND traffic flooding options.

    The re-injected ARP-request or Neighbor Solicitation keeps all existing non-service-delimiting tags of the original frame. Unicast ARP-requests are replied to if there is an entry in the proxy table. If the lookup is not successful, the frame is forwarded to the MAC DA.

  • For ARP-reply / Neighbor Advertisement messages, the MAC DA (unicast) is looked up in the MAC table. In case of a hit, the frame is re-injected and unicasted based on the MAC table information. If there is no hit, the frame is re-injected and flooded based on the flood list information (with source squelching). The re-injected ARP-reply / Neighbor Advertisement keeps all existing non-service-delimiting tags of the original frame.

  • For ARP/ND frames that are sent to the CPM, the ARP-reply / Neighbor Advertisement is always unicast to the subinterface on which the ARP-request / Neighbor Solicitation arrived, even if the MAC itself has not yet been learned in the MAC table.

If a MAC ACL is bound to the ingress of a bridged subinterface (see the SR Linux ACL and Policy-Based Routing Guide), and proxy-ARP/ND is enabled in the network-instance, the rules in the MAC ACL are applied before extraction to the CPM. Consequently, if the MAC ACL drops an ARP packet, it never reaches the CPM.

Disabling dynamic learning for proxy-ARP/ND causes the dynamically learned entries to be flushed from the proxy-ARP/ND table. You can also set an age timer (default disabled) for dynamic entries, after which they are flushed from the proxy table.

In addition, you can set a timer to refresh dynamic entries. At a configured interval (default never), the system generates ARP-requests / Neighbor Solicitations with the intent to refresh the proxy entry; if no response is received, the system sends another request.

Configuring dynamic learning for proxy-ARP/ND

To configure dynamic learning for entries in the proxy-ARP table, enable proxy-ARP and dynamic learning for the MAC-VRF. The same commands exist under proxy-nd.

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                admin-state enable
                dynamic-learning {
                    admin-state enable
                    age-time 600
                    send-refresh 200
                }
            }
        }
    }

This example also configures the age-time and send-refresh timers. By default, both timers are disabled.

  • The age-time specifies in seconds the aging timer for each proxy-ARP/ND entry. When the aging expires, the entry is flushed. The age is reset when a new ARP/GARP/NA for the same IP-MAC binding is received.
  • The send-refresh timer sends ARP-request / Neighbor Solicitation messages at the configured time, so that the owner of the IP address can reply and therefore refresh its IP-MAC (proxy-ARP/ND) and MAC (FDB) entries.
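The interaction of the two timers can be modeled with two small predicates. This is a sketch under stated assumptions: both values are treated as seconds, None stands for a disabled timer, send-refresh is treated as a periodic interval measured from the last refresh, and the function names are invented for the example.

```python
from typing import Optional


def is_expired(last_seen: float, now: float, age_time: Optional[int]) -> bool:
    """True when the entry has aged out; a disabled timer (None) never expires."""
    if age_time is None:
        return False
    return now - last_seen >= age_time


def refresh_due(last_refresh: float, now: float, send_refresh: Optional[int]) -> bool:
    """True when a refresh ARP-request / Neighbor Solicitation should be sent."""
    if send_refresh is None:
        return False
    return now - last_refresh >= send_refresh


# With age-time 600 and send-refresh 200, as in the configuration above,
# a refresh is due well before the entry would age out.
print(refresh_due(0, 200, 200), is_expired(0, 200, 600))
```

With send-refresh set lower than age-time, a host that answers the refresh resets the age before the entry is flushed.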

Static proxy-ARP entries

You can configure static entries in the proxy-ARP/ND table. Static entries have higher priority than snooped dynamic and EVPN entries in the table.

A static proxy-ARP/ND entry requires a static or dynamic MAC entry in the MAC table to become active. The static entries are advertised in MAC/IP routes, with the MAC Mobility extended community following the information associated with the MAC entry (static bit or sequence number).

The system does not validate the MAC addresses configured in static entries.

When proxy-arp or proxy-nd is disabled, the configured static entries remain in the proxy table, unlike dynamically learned and EVPN-learned entries, which are flushed from the table.

Configuring static proxy-ARP entries

To configure static proxy-ARP, add entries to the proxy-ARP table.

The following example enables proxy-ARP and configures a static entry in the proxy-ARP table. Note that the system does not validate MAC addresses specified in static entries.

The same commands exist under proxy-nd, although the configured IP address is an IPv6 address.

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                admin-state enable
                static-entries {
                    neighbor 101.1.1.1 {
                        link-layer-address 00:00:64:01:01:01
                    }
                }
            }
        }
    }

EVPN learning for proxy-ARP/ND

When proxy-ARP/ND is configured in an EVPN, the PE devices dynamically learn IP-MAC bindings for their local hosts and advertise them in EVPN MAC/IP routes to remote PE devices. The remote PE devices add these IP-MAC bindings to the proxy-ARP/ND table as EVPN-learned entries.

When a host sends an ARP-request or Neighbor Solicitation for a host on the remote side of the EVPN, if the PE device has the EVPN-learned IP-MAC binding in its proxy-ARP/ND table, it sends back an ARP-reply or Neighbor Advertisement with the remote host's MAC. If the IP-MAC binding does not exist in the PE device's proxy-ARP/ND table, the ARP-request or Neighbor Solicitation is flooded in the BD (ARP-request / Neighbor Solicitation flooding can be optionally disabled). See Layer 2 proxy-ARP example for an illustration of this process.

Configuring EVPN learning for proxy-ARP/ND

To configure EVPN learning for proxy-ARP/ND, enable proxy-ARP/ND for the MAC-VRF and configure bgp-evpn to advertise the IP-MAC bindings in EVPN MAC/IP Advertisement routes and learn new IP-MAC bindings from the imported MAC/IP Advertisement routes.

For deployments that use a multi-homing solution, advertise-arp-nd-only-with-mac-table-entry must be set to true to avoid issues with ESIs. If this setting is not configured, race conditions may result in MAC/IP routes advertised with ESI=0, and later with non-zero ESIs.

Advertise local MACs and local MAC-IP pairs
--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                admin-state enable
            }
        }
    }
--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 protocols bgp-evpn bgp-instance 1 routes bridge-table
    network-instance MAC-VRF-1 {
        protocols {
            bgp-evpn {
                bgp-instance 1 {
                    routes {
                        bridge-table {
                            mac-ip {
                                advertise true
                            }
                        }
                    }
                }
            }
        }
    }

Advertise local MAC-IP pairs only with a local MAC in the MAC table
--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                admin-state enable
            }
        }
    }
--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 protocols bgp-evpn bgp-instance 1 routes bridge-table
    network-instance MAC-VRF-1 {
        protocols {
            bgp-evpn {
                bgp-instance 1 {
                    routes {
                        bridge-table {
                            mac-ip {
                                advertise-arp-nd-only-with-mac-table-entry true
                            }
                        }
                    }
                }
            }
        }
    }

Note that EVPN MAC/IP routes advertised for Layer 2 and Layer 3 differ: Layer 3 ARP entries are always advertised in MAC/IP routes with a non-zero IP address, while Layer 2 proxy-ARP/ND entries are advertised only when there is a local MAC in the MAC table, because the proxy-ARP/ND entry must be active.

Enabling advertise-arp-nd-only-with-mac-table-entry is recommended for Layer 2 proxy-ARP/ND multi-homing deployments, because the MAC delete process is handled separately from the proxy-ARP/ND delete process. If the MAC delete process happens first, the proxy-ARP/ND entry is advertised with ESI=0 right away, which flushes the MAC on the ES peer.

Configuring proxy-ARP/ND traffic flooding options

If a lookup in the proxy-ARP/ND table is unsuccessful, by default the system re-injects the ARP-request or Neighbor Solicitation into the data path and floods it in the BD.

For proxy-ARP, you can configure the following options for how ARP frames are flooded into the EVPN network. The default for both of these options is true.

  • unknown-arp-req

    This option configures whether unknown broadcast ARP-requests are flooded to EVPN destinations. Unknown in this context means the lookup in the proxy-ARP/ND table was unsuccessful. Non-broadcast ARP-requests are not affected by this option.

  • gratuitous-arp

    This option configures whether Gratuitous ARP (GARP) requests or replies are flooded to EVPN destinations. GARPs are ARP messages where the sender's IP address matches the target's IP address. Normally the MAC DA is a broadcast address.
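The two proxy-ARP options can be read as a small decision table. The following Python sketch classifies an ARP frame whose proxy lookup failed and reports whether it would be flooded to EVPN destinations. The function names and the string labels are assumptions for illustration. The sketch returns None for frames that neither option governs, such as non-broadcast ARP-requests.

```python
from typing import Optional

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def classify_arp(sender_ip: str, target_ip: str, mac_da: str) -> str:
    """Label an ARP frame the way the two flooding options distinguish it."""
    if sender_ip == target_ip:
        # GARP: the sender's IP address matches the target's IP address.
        return "gratuitous-arp"
    if mac_da.lower() == BROADCAST_MAC:
        return "broadcast-arp-req"
    return "other"


def flood_to_evpn(kind: str, unknown_arp_req: bool = True,
                  gratuitous_arp: bool = True) -> Optional[bool]:
    """Return whether the frame is flooded to EVPN, or None if no option applies."""
    if kind == "gratuitous-arp":
        return gratuitous_arp
    if kind == "broadcast-arp-req":
        return unknown_arp_req
    return None


kind = classify_arp("10.0.0.5", "10.0.0.1", "FF:FF:FF:FF:FF:FF")
print(kind, flood_to_evpn(kind, unknown_arp_req=False))
```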

For proxy-ND, you can configure the following options for how Neighbor Solicitation and Neighbor Advertisement frames are flooded into the EVPN network. The default for these options is true.

  • unknown-neighbor-advertise-host

    This option configures whether to flood Neighbor Advertisement replies for type host into the EVPN network.

  • unknown-neighbor-advertise-router

    This option configures whether to flood Neighbor Advertisement replies for type router into the EVPN network.

  • unknown-neighbor-solicitation

    This option configures whether to flood Neighbor Solicitation messages (with source squelching) into the EVPN network.

Disable EVPN flooding for unknown broadcast ARP-requests and GARPs

The following example disables flooding to EVPN destinations for both unknown broadcast ARP-requests and GARPs:

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp evpn flood
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                evpn {
                    flood {
                        unknown-arp-req false
                        gratuitous-arp false
                    }
                }
            }
        }
    }

Disable EVPN flooding for Neighbor Advertisements

The following example disables flooding to EVPN destinations for Neighbor Advertisement replies for type router and type host:

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-nd evpn flood
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-nd {
                evpn {
                    flood {
                        unknown-neighbor-advertise-router false
                        unknown-neighbor-advertise-host false
                    }
                }
            }
        }
    }

Proxy-ARP/ND duplicate IP detection

Proxy-ARP/ND duplicate IP detection is a security mechanism described in RFC 9161 to detect ARP/ND-spoofing attacks. In an ARP/ND-spoofing attack, an attacker sends false ARP/ND messages into a BD, with the goal of associating the attacker's MAC address with the IP address of a target host and directing traffic for that host to the attacker.

The proxy-ARP/ND duplicate IP detection feature monitors changes to active entries in the proxy-ARP/ND table. When an IP move occurs (for example, IP1→MAC1 is replaced by IP1→MAC2 in the table), a monitoring-window timer is started (default 3 minutes). If a specified number of IP moves (default 5) is detected before the monitoring-window timer expires, the IP is considered to be a duplicate.

When the system detects an IP move in the proxy table (for example, IP1→MAC1 changing to IP1→MAC2) it places the IP1→MAC2 proxy table entry in pending-confirmation state for a maximum of 30 seconds. During the pending-confirmation period, the ARP/ND entry is inactive, and a confirm-message is unicast to MAC1. If no reply from MAC1 is received during the pending-confirmation period, the IP1→MAC2 entry is confirmed as legitimate and becomes active. If a reply from MAC1 is received, then MAC2 is sent a confirm-message. If MAC2 replies, an additional confirm-message is sent to MAC1. If both MAC1 and MAC2 keep replying to the confirm-messages, it triggers the duplicate IP detection procedure for IP1, because the number of IP moves exceeds the maximum allowed during the monitoring window.

When an IP is detected as a duplicate, the proxy table cannot be updated with new dynamic or EVPN-learned entries for the same IP (although you can configure a static entry for the IP). The duplicate IP is subject to this restriction until a configured hold-down-time expires (default 9 minutes), after which the entry for the IP is removed from the proxy table, and the monitoring process for the IP is restarted.
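The monitoring-window, num-moves, and hold-down-time interaction can be sketched as a small per-IP state machine. This is a simplified model under several assumptions: times are plain seconds, the move that starts the window counts as the first move, an IP becomes duplicate when the count reaches num-moves, and the pending-confirmation exchange is not modeled. The class name IpMoveMonitor is invented for the example.

```python
from typing import Optional


class IpMoveMonitor:
    """Tracks IP moves for one IP address (illustrative model only)."""

    def __init__(self, monitoring_window: int = 180, num_moves: int = 5,
                 hold_down_time: int = 540) -> None:
        self.monitoring_window = monitoring_window  # default 3 minutes
        self.num_moves = num_moves                  # default 5 moves
        self.hold_down_time = hold_down_time        # default 9 minutes
        self.window_start: Optional[float] = None
        self.moves = 0
        self.duplicate_since: Optional[float] = None

    def record_move(self, now: float) -> bool:
        """Register an IP move at time now; return True if the IP is duplicate."""
        if self.duplicate_since is not None:
            if now - self.duplicate_since < self.hold_down_time:
                return True  # still held down, no new entries accepted
            # Hold-down expired: restart the monitoring process.
            self.duplicate_since = None
            self.window_start = None
            self.moves = 0
        if self.window_start is None or now - self.window_start >= self.monitoring_window:
            # Start a new monitoring window with this move.
            self.window_start = now
            self.moves = 0
        self.moves += 1
        if self.moves >= self.num_moves:
            self.duplicate_since = now
            return True
        return False


monitor = IpMoveMonitor()
results = [monitor.record_move(t) for t in (0, 10, 20, 30, 40)]
print(results)
```

Moves that are spread out further than the monitoring window never accumulate, so a host that legitimately moves now and then is not flagged.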

Anti-Spoofing MAC

You can configure an Anti-Spoofing MAC (AS-MAC). If an AS-MAC is configured, the system associates the duplicate IP with the AS-MAC in the proxy-ARP/ND table. A GARP/unsolicited-NA message with IP1→AS-MAC is sent to the local CEs, and an EVPN MAC/IP route with IP1→AS-MAC is sent to the remote PEs. This updates the ARP/ND caches on the CEs in the BD, so that CEs in the BD use the AS-MAC as MAC DA when sending traffic to IP1.

If you configure the static-blackhole true option, the AS-MAC is installed in the MAC table as a blackhole MAC, which discards incoming frames with a MAC source or destination matching the AS-MAC.

Configuring proxy-ARP/ND duplicate IP detection

To configure proxy-ARP/ND duplicate IP detection, set the following options:
  • monitoring-window

    This is the number of minutes the system monitors a proxy-ARP/ND table entry following an IP move (default 3 minutes).

  • num-moves

    This is the maximum number of IP moves a proxy-ARP/ND table entry can have during the monitoring-window before the IP is considered duplicate (default 5 IP moves).

  • hold-down-time

    This is the number of minutes from the time an IP is declared duplicate to the time the IP is removed from the proxy-ARP/ND table (default 9 minutes).

  • anti-spoof-mac

    If configured, this replaces the MAC of the duplicate IP in the proxy-ARP/ND table. The AS-MAC is advertised in EVPN to remote PEs.

  • static-blackhole

    If this option is set to true, this installs the AS-MAC in the MAC table as a blackhole MAC, causing incoming frames with a MAC source or destination matching the AS-MAC to be discarded.

Configure proxy-ARP duplicate IP detection
--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp ip-duplication
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                ip-duplication {
                    monitoring-window 5
                    num-moves 7
                    hold-down-time 10
                    anti-spoof-mac 00:CA:FE:CA:FE:08
                    static-blackhole true
                }
            }
        }
    }

Configure proxy-ND duplicate IP detection
--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-nd ip-duplication
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-nd {
                ip-duplication {
                    monitoring-window 5
                    num-moves 7
                    hold-down-time 10
                    anti-spoof-mac 00:CA:FE:CA:FE:08
                    static-blackhole true
                }
            }
        }
    }

Configuring processing for proxy-ARP/ND probe packets

ARP Request and Neighbor Solicitation packets that have 0.0.0.0 or :: as the sender's IP address are used as probes by the IPv4/IPv6 Duplicate Address Detection (DAD) function. Proxy-ARP/ND identifies these probe packets and can either process them locally, or forward them into the MAC-VRF (flooding them into the MAC-VRF) so that they reach all the hosts in the BD. You can configure whether the probe packets are processed locally or flooded to the MAC-VRF.
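Identifying a probe comes down to checking whether the sender's IP address is the unspecified address. A minimal Python sketch using the standard ipaddress module follows; the function name is_dad_probe is an assumption for the example.

```python
import ipaddress


def is_dad_probe(sender_ip: str) -> bool:
    """True if the sender address is 0.0.0.0 (IPv4) or :: (IPv6)."""
    return ipaddress.ip_address(sender_ip).is_unspecified


print(is_dad_probe("0.0.0.0"), is_dad_probe("::"), is_dad_probe("10.0.0.1"))
```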

Configure flooding for proxy-ARP probe packets

The following example disables local processing for ARP probe messages by setting process-arp-probes to false. When set to false, ARP probe messages are flooded to the remote nodes if unknown ARP-requests are configured to be flooded.

By default, local processing is set to true, which causes ARP probe messages used by the hosts for DAD to be processed, replied to if a proxy-ARP entry is hit, or reinjected into the data path.

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1
    network-instance MAC-VRF-1 {
        type mac-vrf
        bridge-table {
            proxy-arp {
                process-arp-probes false
            }
        }
    }

Configure flooding for Neighbor Solicitation DAD messages

The following example disables local processing for Neighbor Solicitation DAD messages by setting process-dad-neighbor-solicitations to false. When set to false, Neighbor Solicitation DAD messages are flooded to the remote nodes if unknown-neighbor-solicitation is set to true, that is, if unknown Neighbor Solicitation messages are configured to be flooded.

By default, local processing is set to true, which causes Neighbor Solicitation DAD messages used by the hosts for DAD to be processed, replied to if a proxy-ND entry is hit, or reinjected into the data path.

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-nd {
                process-dad-neighbor-solicitations false
            }
        }
    }

Displaying proxy-ARP/ND duplicate IP detection information

You can use a show command to display information about duplicate IPs detected in the proxy-ARP table. A similar show command exists for proxy-ND.

To display duplicate IP information from the proxy-ARP table for all MAC-VRFs:

--{ running }--[  ]--
# show network-instance * bridge-table proxy-arp ip-duplication duplicate-entries
---------------------------------------------------------------------------------------------
IP-duplication in network instance mac-vrf-1
---------------------------------------------------------------------------------------------
Monitoring window       : 3 minutes
Number of moves allowed : 5
Hold-down-time          : 10 seconds
Anti-Spoof-MAC          : 00:DE:AD:00:00:01 (Static-blackhole)
---------------------------------------------------------------------------------------------
Duplicate entries in network instance mac-vrf-1
----------------------------------------------------------------------------------------------
+--------------+------------------------------+---------------------------+------------------+
|  Neighbor    |     MAC Address              | Detect Time               | Hold down time   |
|              |                              |                           | remaining        |
+==============+==============================+===========================+==================+
| 10.10.10.1   |  00:DE:AD:00:00:01           | 2021-12-11T12:48:24.000Z  | 10               |
| 10.10.10.2   |  00:DE:AD:00:00:01           | 2021-12-11T12:48:24.000Z  | 9                |
| 10.10.10.3   |  00:DE:AD:00:00:01           | 2021-12-11T12:48:24.000Z  | 10               |
+--------------+------------------------------+---------------------------+------------------+
---------------------------------------------------------------------------------------------
IP-duplication in network instance mac-vrf-2
---------------------------------------------------------------------------------------------
...
---------------------------------------------------------------------------------------------
Total Duplicate IPs       : 4  Total 4 Active
---------------------------------------------------------------------------------------------

Proxy-ARP/ND table

When you enable proxy-ARP/ND for a MAC-VRF, this creates a table containing proxy-ARP/ND entries learned dynamically by snooping ARP/ND messages, configured statically, or learned from EVPN MAC/IP routes from remote PE nodes. By default, this table can contain up to 250 entries. You can configure the size of this table to be from 1 to 8,192 entries.

The system generates a log event when the size of the table reaches 90% of the maximum size and when the table reaches 95% of the maximum size. When the table reaches 100% of its maximum size, entries for an IP can be replaced (that is, a different MAC can be learned and added for the IP), but no new IP entries can be added to the table, regardless of the type (dynamic, static, or EVPN-learned).
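The threshold and replacement rules can be expressed as two small helpers. This Python sketch is only a model of the behavior described above; the function names and the dictionary-based table are assumptions, and the comparison uses integer arithmetic so that 95% of 250 is reached at 238 entries.

```python
from typing import Dict, Optional


def usage_level(count: int, max_size: int = 250) -> Optional[int]:
    """Return the highest threshold (90, 95, or 100) the table has reached, if any."""
    for level in (100, 95, 90):
        if count * 100 >= level * max_size:
            return level
    return None


def can_install(table: Dict[str, str], ip: str, max_size: int = 250) -> bool:
    """A known IP can always be re-bound; a new IP needs free space in the table."""
    if ip in table:
        return True
    return len(table) < max_size


full = {f"10.0.{i // 250}.{i % 250 + 1}": "00:00:5e:00:53:01" for i in range(250)}
print(usage_level(len(full)), can_install(full, "10.0.0.1"), can_install(full, "10.9.9.9"))
```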

Configuring the proxy-ARP/ND table size

By default, the proxy-ARP or proxy-ND table can contain up to 250 entries of all types (dynamic, static, EVPN, duplicate). You can increase or decrease the maximum size of the table. If you configure the table size to be lower than the number of entries it currently contains, the system stops and restarts the proxy-ARP/ND application, causing the non-static entries to be flushed from the table.

Configure the proxy-ARP table size

The following example configures the size of the proxy-ARP table for a MAC-VRF:

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-arp table-size
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                table-size 125
            }
        }
    }

Configure the proxy-ND table size

The following example configures the size of the proxy-ND table for a MAC-VRF:

--{ candidate shared default }--[  ]--
# info network-instance MAC-VRF-1 bridge-table proxy-nd table-size
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-nd {
                table-size 125
            }
        }
    }

Clearing entries from the proxy-ARP/ND table

You can use tools commands to clear dynamic or duplicate entries from the proxy-ARP/ND table. You can clear all dynamic or duplicate entries, or you can clear specific entries.

Clear dynamic entries from the proxy-ARP table

To clear dynamic entries from the proxy-ARP table:

--{ running }--[  ]--
# tools network-instance MAC-VRF-1 bridge-table proxy-arp dynamic delete-all
# tools network-instance MAC-VRF-1 bridge-table proxy-arp dynamic entry 10.10.10.1 delete-ip

Clear duplicate entries from the proxy-ARP table

To clear duplicate entries from the proxy-ARP table:

--{ running }--[  ]--
# tools network-instance MAC-VRF-1 bridge-table proxy-arp duplicate delete-all
# tools network-instance MAC-VRF-1 bridge-table proxy-arp duplicate entry 10.10.10.1 delete-ip

Clear dynamic entries from the proxy-ND table

To clear dynamic entries from the proxy-ND table:

--{ running }--[  ]--
# tools network-instance MAC-VRF-1 bridge-table proxy-nd dynamic delete-all
# tools network-instance MAC-VRF-1 bridge-table proxy-nd dynamic entry 2001:db8::1 delete-ip

Clear duplicate entries from the proxy-ND table

To clear duplicate entries from the proxy-ND table:

--{ running }--[  ]--
# tools network-instance MAC-VRF-1 bridge-table proxy-nd duplicate delete-all
# tools network-instance MAC-VRF-1 bridge-table proxy-nd duplicate entry 2001:db8::1 delete-ip

EVPN ARP/ND extended community flags

RFC 9047 defines an extended community (EC) for EVPN MAC/IP advertisement routes that carries flags related to ARP or ND. This EC includes flags that can indicate to a remote PE router whether an ARP/ND entry learned via EVPN belongs to a host or a router, or if the address is an anycast address. Information from the EC flags can allow remote PE routers configured for proxy-ARP/ND to reply to local ARP or ND requests with correct information.

On SR Linux, the RFC 9047 EC is not carried in EVPN MAC/IP advertisement routes by default. You can configure SR Linux to advertise the RFC 9047 EC along with the MAC/IP routes advertised for local static, dynamic, and duplicate proxy-ARP/ND entries. When you enable SR Linux to advertise the RFC 9047 EC, you can also configure how the flags are advertised in EVPN MAC/IP routes and how the flags are set in replies to NS messages for EVPN entries.

This feature is not supported along with IRB interfaces.

The feature must be configured consistently on all PEs attached to the same BD, and in particular attached to the same ES and BD.

At the BGP level, only one EVPN ARP/ND EC is expected to be received along with an EVPN MAC/IP advertisement route. If more than one EVPN ARP/ND EC is received, the router considers only the first one on the list for processing purposes on the PE.

At the EVPN level, the flags are ignored if they come in a MAC/IP route without IP.

RFC 9047 extended community flags

RFC 9047 defines the following flags for the EVPN ARP/ND extended community. The flags field is the third octet of the EC.
  • Router flag (bit 7 in the flags field)

    If set, the Router flag (R flag) indicates the IPv6->MAC pair in the MAC/IP advertisement route belongs to an IPv6 router; if not set, the IPv6->MAC pair belongs to a host.

  • Override flag (bit 6 in the flags field)

    The Override flag (O flag) is normally set to 1 by the egress router when advertising IPv6->MAC pairs, set to 0 when IPv6 anycast is enabled for the BD or interface, and ignored when the received MAC/IP advertisement route belongs to an IPv4->MAC pair. Note that anycast IPv6 addresses are not supported in SR Linux proxy-ND tables.

    The ingress PE installs the ND entry with the received O flag and always uses this O flag value when replying to an NS message for the IPv6 address.

    The O flag is conveyed in EVPN advertisements and advertised based on the received NA messages, and is installed in the proxy-ND entries. However, other than transferring the information between ND and EVPN, the O flag does not change any of the selection of ND entries in proxy-ND.

  • Immutable flag (bit 4 in the flags field)

    If set, the Immutable flag (I flag) indicates the IP->MAC pair in the MAC/IP advertisement route is a configured ARP/ND entry; the IP address in the EVPN MAC/IP advertisement route can only be bound together with the MAC address specified in the same route.

    The I flag applies to IPv6 and IPv4 entries and is set in MAC/IP advertisements for static proxy-ARP/ND entries.
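The bit positions listed above can be turned into masks for encoding and decoding the flags octet. The following Python sketch assumes the usual IETF convention in which bit 0 is the most significant bit of the octet, so bit 7 is the least significant bit. The mask constants and function names are invented for the example and are not taken from any SR Linux API.

```python
from typing import Dict

# Assumption: bits are numbered MSB-first, so bit 7 is the lowest-order bit.
R_FLAG = 0x01  # Router flag, bit 7
O_FLAG = 0x02  # Override flag, bit 6
I_FLAG = 0x08  # Immutable flag, bit 4


def encode_flags(router: bool = False, override: bool = False,
                 immutable: bool = False) -> int:
    """Build the flags octet from the three flag values."""
    octet = 0
    if router:
        octet |= R_FLAG
    if override:
        octet |= O_FLAG
    if immutable:
        octet |= I_FLAG
    return octet


def decode_flags(octet: int) -> Dict[str, bool]:
    """Split the flags octet into its R, O, and I flags."""
    return {
        "R": bool(octet & R_FLAG),
        "O": bool(octet & O_FLAG),
        "I": bool(octet & I_FLAG),
    }


print(hex(encode_flags(router=True, override=True)))
print(decode_flags(0x08))
```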

Enabling advertisement of the RFC 9047 extended community

When you enable advertisement of the RFC 9047 extended community, the EC is advertised along with the MAC/IP routes advertised for local static, dynamic, and duplicate proxy-ARP/ND entries. Enabling this feature also allows you to control how the EC is processed and how ARP/ND entries are selected based on the I flag.

The following example enables SR Linux to advertise the RFC 9047 EC:

--{ candidate shared default }--[  ]--
# info network-instance m1 protocols bgp-evpn bgp-instance 1 routes bridge-table
    network-instance m1 {
        protocols {
            bgp-evpn {
                bgp-instance 1 {
                    routes {
                        bridge-table {
                            mac-ip {
                                advertise-arp-nd-extended-community true
                            }
                        }
                    }
                }
            }
        }
    }

By default, the advertise-arp-nd-extended-community command is set to false. When this command is set to true, the R, I, and O flags are advertised and processed as described in the following sections.

Advertising and processing for the R flag

After you enable SR Linux to advertise the RFC 9047 EC, you can control the advertisement of proxy-ND EVPN entries and the reply to NS messages received for those EVPN entries by configuring the advertise-neighbor-type command. For example:

--{ candidate shared default }--[  ]--
# info network-instance m1 bridge-table
    network-instance m1 {
        bridge-table {
            proxy-nd {
                evpn {
                    advertise-neighbor-type router-host
                }
            }
        }
    }

The advertise-neighbor-type command has three options: host, router, and router-host. The effect this command has on how SR Linux processes the R flag for static, dynamic, EVPN, and duplicate proxy-ND entries is described in the following table.

Table 1. Effect the advertise-neighbor-type setting has on proxy ND entries
For this proxy-ND entry type The advertise-neighbor-type setting has this effect
Static proxy-ND entries

Static proxy-ND entries are added with an R=1 flag if configured with neighbor <ipv6-address> type router.

NS messages received for the entry are replied to with the configured R flag.

The entries are advertised in EVPN if the command advertise-neighbor-type is configured.

If the configuration for an entry is changed from host to router, or from router to host, an unsolicited NA message is triggered containing the new flag.

Whether this unsolicited NA message is flooded into EVPN depends on the configured flooding options.

Dynamic proxy-ND entries

When proxy-nd/dynamic-learning/admin-state enable is configured, the router learns dynamic entries from NA messages; the corresponding R flag is also learned and added to the proxy-ND table.

Subsequent NS messages received for the entry are replied to with the corresponding R flag.

The learned dynamic entries are advertised in EVPN depending on the configured advertise-neighbor-type setting:

  • router advertises dynamic and static entries for which R=1.
  • host advertises dynamic and static entries for which R=0.
  • router-host advertises dynamic and static entries for which R=0 or R=1.

    The router-host option is available only if advertise-arp-nd-extended-community true is configured. The R flag of the entry is then propagated in the R bit of the extended community for the route.

  • The default value is advertise-neighbor-type router.

EVPN proxy-ND entries

The setting for advertise-neighbor-type controls the proxy-ND dynamic/static/duplicate entries that are advertised in EVPN:
  • router only advertises in EVPN dynamic entries learned with R=1, or static entries configured as router.
  • host only advertises in EVPN dynamic entries learned with R=0, or static entries configured as host.
  • router-host advertises in EVPN dynamic and static entries irrespective of the router flag, and with the correct R flag in each case. For an EVPN entry, the command adds the Router flag depending on the R flag coming in the RFC 9047 EC.

  • Irrespective of the configured option, the absence of the RFC 9047 EC in a received MAC/IP route creates a proxy-ND entry of type R=1 (router).

The setting for advertise-neighbor-type must be consistent on all nodes for the same service.

Duplicate proxy-ND entries

Duplicate proxy-ND entries are treated as host or router entries for the purposes of EVPN advertisement and replies to NS messages.

If advertise-neighbor-type host is configured, the duplicate entry is treated as host.

If advertise-neighbor-type router or router-host is configured, the duplicate entry is treated as router.

Note the following:
  • System-generated unsolicited NA messages for duplicate entries carry R=0 or R=1, and are flooded to EVPN if allowed by the configured flooding options.
  • The R flag in the RFC 9047 EC is ignored for proxy-ARP entries.
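
The R flag rules in Table 1 can be summarized in a small sketch. This is a simplified model for illustration, not the SR Linux implementation; the function names are hypothetical, and the model covers only the advertisement decision and the R flag assigned to EVPN and duplicate entries.

```python
from typing import Optional

SETTINGS = ("host", "router", "router-host")


def check_setting(setting: str, ec_enabled: bool) -> None:
    """router-host requires advertise-arp-nd-extended-community true."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown advertise-neighbor-type: {setting}")
    if setting == "router-host" and not ec_enabled:
        raise ValueError("router-host requires the RFC 9047 EC to be advertised")


def is_advertised(setting: str, router_flag: bool) -> bool:
    """Decide whether a static or dynamic entry is advertised in EVPN."""
    if setting == "router":
        return router_flag          # only entries with R=1
    if setting == "host":
        return not router_flag      # only entries with R=0
    if setting == "router-host":
        return True                 # R=0 and R=1 entries alike
    raise ValueError(f"unknown advertise-neighbor-type: {setting}")


def evpn_entry_router_flag(ec_router_flag: Optional[bool]) -> bool:
    """R flag for an entry created from a received MAC/IP route.

    None models a route received without the RFC 9047 EC, which
    creates an entry of type router (R=1).
    """
    return True if ec_router_flag is None else ec_router_flag


def duplicate_entry_router_flag(setting: str) -> bool:
    """Duplicate entries are host with 'host', router otherwise."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown advertise-neighbor-type: {setting}")
    return setting != "host"
```

With the default setting of router, for instance, `is_advertised("router", False)` is false, so a dynamic entry learned with R=0 is not advertised.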

Advertising and processing for the I flag

When the advertise-arp-nd-extended-community true command is configured, SR Linux advertises the I flag as follows:
  • Any static proxy-ARP/ND entry is advertised with I=1.
  • Duplicate entries with AS-MAC are advertised with I=1 (in addition to O=1 and R=0 or 1, based on configuration).
  • The setting of the I flag is independent of the Static bit associated with the FDB entry; the I flag is only used with proxy-ARP/ND advertisements.

Upon reception, the I flag is processed as follows:

When SR Linux receives an EVPN MAC/IP advertisement route containing an IP->MAC and I=1, it installs the IP->MAC entry in the proxy-ARP/ND table as an immutable binding, overriding any existing non-immutable binding for the same IP->MAC. MAC mobility does not consider the I flag.

For multiple routes to the same IP, proxy-ARP/ND selection operates according to the following rules:

  1. Local immutable ARP/ND entries (static) are preferred first
  2. EVPN immutable ARP/ND entries are preferred next
  3. Regular existing ARP/ND selection applies third.

If SR Linux receives multiple EVPN MAC/IP advertisement routes with the I flag set to 1 with the same IP but a different MAC address, a route is selected using these selection rules.
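
The preference order above can be sketched as a ranking function. This is a minimal model under stated assumptions: `Candidate`, `preference`, and `select_entry` are hypothetical names, and because the regular ARP/ND selection is not described in this section, ties within a rank are simply resolved by input order as a placeholder.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical view of one candidate binding for an IP address."""
    ip: str
    mac: str
    origin: str       # for example "static", "dynamic", or "evpn"
    immutable: bool   # I flag; local static entries are always immutable


def preference(candidate: Candidate) -> int:
    """Lower value is preferred."""
    if candidate.origin == "static":
        return 0      # rule 1: local immutable (static) entries
    if candidate.origin == "evpn" and candidate.immutable:
        return 1      # rule 2: EVPN entries received with I=1
    return 2          # rule 3: regular selection applies


def select_entry(candidates: Sequence[Candidate]) -> Candidate:
    """Pick the preferred binding among candidates for the same IP.

    min() returns the first item with the lowest rank, so ties keep
    input order. That tie-break is an assumption standing in for the
    regular selection, which is not modeled here.
    """
    if not candidates:
        raise ValueError("no candidates to select from")
    return min(candidates, key=preference)
```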

If SR Linux originates an EVPN MAC/IP advertisement route containing an IP->MAC and I=1, it also originates the route with the MAC mobility EC Sticky/static flag set if the MAC is static. In this case, the MAC->IP binding is immutable, and it cannot move. If an update for the same immutable and static IP->MAC is received from a different PE, one of the two routes is selected.

There are no changes for generation of confirmation messages or GARP/unsolicited-NA messages when a new entry is learned.

Advertising and processing for the O flag

When the advertise-arp-nd-extended-community true command is configured, SR Linux advertises the O flag as follows:
  • For dynamic entries, the setting for the O flag is taken from what was learned and added to the proxy-ND table.

  • For static and duplicate entries, O=1.

Upon reception, the O flag is processed as follows:

  • The O flag is stored in the proxy-ND table, and is used when replying to a received NS message.

  • For link-local neighbor advertisements (generated for solicitations to the local chassis MAC), NA messages with O=0 are not learned.

  • If an EVPN route is received without the RFC 9047 EC, the entry is installed with the default value of O=1.

  • For solicited or unsolicited NA messages, the O flag is O=1 or O=0, depending on the flag in the proxy-ND entry that was previously learned.
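
As a rough illustration of the O flag rules, the following sketch models the advertised value and the value installed on reception. The function names are hypothetical and the model leaves out the link-local NA case.

```python
from typing import Optional


def advertised_o_flag(entry_type: str, learned_o: Optional[bool] = None) -> bool:
    """O flag placed in the RFC 9047 EC for a local proxy-ND entry."""
    if entry_type in ("static", "duplicate"):
        return True                      # static and duplicate entries use O=1
    if entry_type == "dynamic":
        if learned_o is None:
            raise ValueError("a dynamic entry needs its learned O flag")
        return learned_o                 # reuse what was learned from the NA
    raise ValueError(f"unsupported entry type: {entry_type}")


def installed_o_flag(ec_o_flag: Optional[bool]) -> bool:
    """O flag stored in the proxy-ND table for a received EVPN route.

    None models a route received without the RFC 9047 EC, which is
    installed with the default value O=1.
    """
    return True if ec_o_flag is None else ec_o_flag
```

The stored value is then the one used when replying to NS messages for that entry.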

Displaying proxy-ARP/ND information

You can use show commands to display information about the contents of the proxy-ARP table. Similar show commands exist for proxy-ND.

Display the entire proxy-ARP table

To display all entries in the proxy-ARP table for a MAC-VRF:

--{ candidate shared default }--[  ]--
# show network-instance MAC-VRF-1 bridge-table proxy-arp all
-----------------------------------------------------------------------------------------------
Proxy-ARP table of network instance MAC-VRF-1
-----------------------------------------------------------------------------------------------
+-------------+--------------------+------------+---------+--------+--------------------------+
| Neighbor    | MAC Address        |   Type     | State   | Aging  | Last Update              |
+=============+====================+============+=========+========+==========================+
| 10.10.10.1  | 00:CA:FE:CA:FE:01  |  dynamic   | active  | 300    | 2021-12-11T12:48:24.000Z |
| 10.10.10.2  | 00:CA:FE:CA:FE:02  |  evpn      | active  | N/A    | 2021-12-11T12:48:24.000Z |
| 10.10.10.3  | 00:CA:FE:CA:FE:03  |  duplicate | active  | N/A    | 2021-12-11T12:48:24.000Z |
+-------------+--------------------+------------+---------+--------+--------------------------+
Total Static Neighbors     :    0 Total    0 Active
Total Dynamic Neighbors    :    1 Total    1 Active
Total Evpn Neighbors       :    1 Total    1 Active
Total Duplicate Neighbors  :    1 Total    1 Active
Total Neighbors            :    3 Total    3 Active
-----------------------------------------------------------------------------------------------

The output of this command indicates the total number of entries of each type, as well as the number that are active (replies are sent for received ARP-requests). For an entry to be considered active in the proxy-ARP table, it must have a corresponding entry in the MAC table of the type listed in the following table:

Table 2. MAC table entry types required for proxy-ARP table entry types to be active

Proxy-ARP table entry type    MAC table entry type
Dynamic                       Learned
Dynamic                       Static
Dynamic                       EVPN
Static                        Learned
Static                        Static
EVPN                          EVPN (irrespective of the ESI)
EVPN                          Static or dynamic matching the EVPN ESI
Duplicate

If a proxy-ARP table entry has a corresponding MAC table entry of a type not listed in this table, the proxy-ARP table entry is considered inactive, and no replies are sent for received ARP requests. In addition, if the MAC-VRF is not active, none of its proxy-ARP table entries are active.

Display a specific entry in the proxy-ARP table

--{ candidate shared default }--[  ]--
# show network-instance MAC-VRF-1 bridge-table proxy-arp neighbor 10.10.10.1
----------------------------------------------------------------------------
Proxy-ARP table of network instance MAC-VRF-1
----------------------------------------------------------------------------
Neighbor                : 10.10.10.1
MAC Address             : 00:CA:FE:CA:FE:01
Type                    : dynamic
Programming Status      : Success
Aging                   : 300
Last Update             : 2021-12-11T12:48:24.000Z
Duplicate Detect time   : N/A
Hold down time remaining: N/A
----------------------------------------------------------------------------

Display a summary of the proxy-ARP table

--{ candidate shared default }--[  ]--
# show network-instance * bridge-table proxy-arp summary
------------------------------------------------------
Network Instance Proxy-ARP Table Summary
------------------------------------------------------
Network Instance: MAC-VRF-1
Total Static Neighbors     :    0 Total    0 Active
Total Dynamic Neighbors    :    1 Total    1 Active
Total Evpn Neighbors       :    1 Total    1 Active
Total Duplicate Neighbors  :    1 Total    1 Active
Total Neighbors            :    3 Total    3 Active
Maximum Entries  : 250
Warning Threshold: 95% (237)
Clear Warning    : 90% (225)
-------------------------------------------------------
Network Instance: MAC-VRF-2
Total Static Neighbors     :    1 Total    1 Active
Total Dynamic Neighbors    :    1 Total    1 Active
Total Evpn Neighbors       :    0 Total    0 Active
Total Duplicate Neighbors  :    0 Total    0 Active
Total Neighbors            :    2 Total    2 Active
Maximum Entries  : 250
Warning Threshold: 95% (237)
Clear Warning    : 90% (225)
--------------------------------------------------------
--------------------------------------------------------
Total Static Neighbors     :    1 Total    1 Active
Total Dynamic Neighbors    :    2 Total    2 Active
Total Evpn Neighbors       :    1 Total    1 Active
Total Duplicate Neighbors  :    1 Total    1 Active
Total Neighbors            :    5 Total    5 Active
---------------------------------------------------------
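
When this output is collected by a script, the per-type counters can be cross-checked against the overall total. The following sketch is one way to do that, assuming the text layout matches the sample above exactly; `parse_summary` and `totals_consistent` are hypothetical helper names, and counters that follow a separator line without a Network Instance header are filed under the key "total".

```python
import re
from typing import Dict, Tuple

Counters = Dict[str, Tuple[int, int]]  # kind -> (total, active)

COUNTER_RE = re.compile(
    r"^Total (?:(\w+) )?Neighbors\s*:\s*(\d+)\s+Total\s+(\d+)\s+Active\s*$"
)
INSTANCE_RE = re.compile(r"^Network Instance:\s*(\S+)")


def parse_summary(text: str) -> Dict[str, Counters]:
    """Group neighbor counters by network instance name."""
    result: Dict[str, Counters] = {}
    current = "total"
    for raw in text.splitlines():
        line = raw.strip()
        if line and set(line) == {"-"}:
            current = "total"            # a separator ends the current block
            continue
        match = INSTANCE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = COUNTER_RE.match(line)
        if match:
            kind = (match.group(1) or "all").lower()
            counts = (int(match.group(2)), int(match.group(3)))
            result.setdefault(current, {})[kind] = counts
    return result


def totals_consistent(counters: Counters) -> bool:
    """Check that the per-type counters add up to the 'all' line."""
    parts = [value for kind, value in counters.items() if kind != "all"]
    summed = (sum(t for t, _ in parts), sum(a for _, a in parts))
    return counters.get("all") == summed
```

Applied to the sample output, each block passes the check: for MAC-VRF-1, the static, dynamic, EVPN, and duplicate totals (0, 1, 1, 1) add up to the reported 3.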

Displaying proxy-ARP/ND statistics

To display proxy-ARP statistics, use the info from state command in candidate or running mode, or the info command in state mode. You can display system-wide statistics or statistics for a specific MAC-VRF network-instance.

Display proxy-ARP statistics

The following example displays proxy-ARP statistics for a MAC-VRF network-instance:

--{ candidate shared default }--[  ]--
# info from state network-instance MAC-VRF-1 bridge-table proxy-arp statistics
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-arp {
                statistics {
                    total-entries 0
                    active-entries 0
                    in-active-entries 0
                    pending-entries 0
                    neighbor-origin {
                        origin
                        total-entries 0
                        active-entries 0
                        in-active-entries 0
                        pending-entries 0
                    }
                }
            }
        }
    }

Display proxy-ND statistics

The following example displays proxy-ND statistics for a MAC-VRF network-instance:

--{ candidate shared default }--[  ]--
# info from state network-instance MAC-VRF-1 bridge-table proxy-nd statistics
    network-instance MAC-VRF-1 {
        bridge-table {
            proxy-nd {
                statistics {
                    total-entries 0
                    active-entries 0
                    in-active-entries 0
                    pending-entries 0
                    neighbor-origin {
                        origin
                        total-entries 0
                        active-entries 0
                        in-active-entries 0
                        pending-entries 0
                    }
                }
            }
        }
    }