EVPN Sticky ECMP for IP Prefix Routes

This chapter provides information about EVPN sticky ECMP for IP prefix routes.

Applicability

The information and the configuration in this chapter are based on SR OS Release 25.3.R1. Sticky ECMP for BGP routes is supported in SR OS Release 19.10 and later. Sticky ECMP for EVPN IFL and EVPN IFF routes is supported in SR OS Release 23.10 and later.

EVPN sticky ECMP can be combined with weighted ECMP and IP aliasing ECMP. For weighted ECMP, see the EVPN unequal ECMP for RT5 IFL and IFF routes chapter. For IP aliasing ECMP, see the EVPN IP Aliasing for IP Prefix Routes chapter.

Overview

Weighted traffic distribution toward anycast services network shows the flow distribution in an EVPN network toward an anycast services network. The ECMP distribution of the flows from the border leaf BL-1 toward the Top of Racks (TORs) is weighted. For example, TOR-4 advertises a weight of 3 (expressed as a next-hop count of 3 in the EVPN Link Bandwidth Extended Community), which means it receives three times as many flows as TOR-5 and as the combined total of TOR-2 and TOR-3. CNF-6 has an EBGP session with TOR-2, but not with TOR-3. With IP aliasing ECMP configured between TOR-2 and TOR-3, TOR-3 forwards traffic to CNF-6 without tromboning via TOR-2.
Note: CNF stands for Containerized Network Function. In this example, the CNFs are simulated by SR OS nodes.
Figure 1. Weighted traffic distribution toward anycast services network
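
The weighted split described above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the helper name `flow_shares` and the dictionary keys are hypothetical, and the TOR-2/TOR-3 pair is treated as a single weight-1 member because that is how the text counts it.

```python
from fractions import Fraction

def flow_shares(weights):
    """Return the fraction of flows each ECMP member receives for the given weights."""
    total = sum(weights.values())
    return {member: Fraction(weight, total) for member, weight in weights.items()}

# Weights as described in the overview; TOR-2 and TOR-3 are counted as one member
weights = {"TOR-2+TOR-3": 1, "TOR-4": 3, "TOR-5": 1}
shares = flow_shares(weights)

for member, share in shares.items():
    print(f"{member}: {float(share):.0%}")
```

With these weights, TOR-4 gets three fifths of the flows, which is three times the share of TOR-5 and three times the combined share of TOR-2 and TOR-3.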

Sticky ECMP is implemented in software. On FP-based platforms, each sticky route takes 64 next-hop hashing buckets in the data path. When CNF-10 is removed or a network failure makes CNF-10 unreachable, TOR-5 withdraws the IP prefix route for prefix 10.10.0.0/24. Without sticky ECMP, the flows are redistributed over the remaining next-hops according to the weighted ECMP set, so existing flows via TOR-2, TOR-3, or TOR-4 may also be moved, which can disrupt the TCP sessions of the applications toward the CNFs. When sticky ECMP is configured, the existing flows via TOR-2, TOR-3, or TOR-4 remain unchanged; only the flows via TOR-5 are affected and are redistributed over the remaining routes, as shown in Redistributed traffic flows after CNF-10 is removed. In the Appendix, the table Sticky ECMP flow distribution when one next-hop is removed for 10.10.0.0/24 shows the distribution over the internal hashing buckets.

Figure 2. Redistributed traffic flows after CNF-10 is removed
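
The removal case can be modeled with a small bucket table. The following Python sketch is a simplified model under stated assumptions, not the SR OS implementation: the source only states that a sticky route uses 64 hashing buckets, so the bucket assignment method (largest remainder), the function names, and the member labels are all hypothetical.

```python
BUCKETS = 64  # bucket count per sticky route, as stated for FP-based platforms

def build_buckets(weights, buckets=BUCKETS):
    """Assign buckets to next-hops in proportion to their weights (largest remainder)."""
    total = sum(weights.values())
    quotas = {nh: buckets * w / total for nh, w in weights.items()}
    counts = {nh: int(q) for nh, q in quotas.items()}
    leftovers = buckets - sum(counts.values())
    # Hand the remaining buckets to the members with the largest fractional part
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - counts[n], reverse=True)
    for nh in by_remainder[:leftovers]:
        counts[nh] += 1
    table = []
    for nh, count in counts.items():
        table.extend([nh] * count)
    return table

def remove_sticky(table, removed, weights):
    """Reassign only the buckets of the removed next-hop; leave all others untouched."""
    remaining = {nh: w for nh, w in weights.items() if nh != removed}
    freed = [i for i, nh in enumerate(table) if nh == removed]
    refill = build_buckets(remaining, buckets=len(freed))
    new_table = list(table)
    for index, nh in zip(freed, refill):
        new_table[index] = nh
    return new_table

weights = {"TOR-2+TOR-3": 1, "TOR-4": 3, "TOR-5": 1}
before = build_buckets(weights)
after = remove_sticky(before, "TOR-5", weights)
moved = sum(1 for old, new in zip(before, after) if old != new)
print(f"buckets moved: {moved} of {len(before)}")
```

In this model, only the buckets that pointed to TOR-5 change owner, and they are spread over the remaining members according to their weights. A non-sticky rebuild would instead call `build_buckets` on the remaining weights for the full table, which can move buckets that never pointed to TOR-5.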

The same issue arises when an additional CNF is added. When all flows are initially distributed over TOR-2, TOR-3, and TOR-4, and CNF-10 is added afterward, only a subset of the flows is affected. The total ECMP weight is initially 1 + 3 = 4; after CNF-10 is added, it becomes 1 + 3 + 1 = 5. This implies that 80% of the existing flows via TOR-2, TOR-3, or TOR-4 remain unchanged, while 20% of the flows move to CNF-10 with TOR-5 as next-hop; see Sticky ECMP flow distribution when one next-hop is added for 10.10.0.0/24 in the Appendix.
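
The 80%/20% figures follow directly from the weights. The following Python sketch reproduces the arithmetic; the function name `fraction_moved_on_add` is hypothetical, and the result is the ideal fraction, before any rounding to the 64 hashing buckets.

```python
from fractions import Fraction

def fraction_moved_on_add(existing_weights, added_weight):
    """Ideal fraction of flows that move to a newly added next-hop."""
    total_after = sum(existing_weights) + added_weight
    return Fraction(added_weight, total_after)

# Existing weights 1 (TOR-2 and TOR-3 combined) and 3 (TOR-4); TOR-5 joins with weight 1
moved = fraction_moved_on_add([1, 3], 1)
print(f"moved: {float(moved):.0%}, unchanged: {float(1 - moved):.0%}")
```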

The sticky-ecmp command enables stickiness and is configurable in policy actions.

*A:BL-1# tree flat detail | match sticky-ecmp
configure router policy-options policy-statement default-action no sticky-ecmp
configure router policy-options policy-statement default-action sticky-ecmp
configure router policy-options policy-statement entry action no sticky-ecmp
configure router policy-options policy-statement entry action sticky-ecmp

The sticky-ecmp command has effect only when it is used in a BGP import policy applied to one or more BGP peers in the base router or in a service; it has no effect in a BGP export policy.

Configuration

This section describes several examples, each in its own subsection.

Sticky ECMP can be combined with regular ECMP, weighted ECMP, and IP aliasing ECMP. In all examples in the following sections, weighted ECMP is enabled on BL-1 and all TORs; IP aliasing ECMP is configured on TOR-2 and TOR-3.

Sticky ECMP for EVPN IFL over MPLS

Example topology - IFL EVPN-MPLS shows the example topology with an EVPN-MPLS network in autonomous system (AS) 64500 with border leaf BL-1 and four TORs:
  • TOR-2 and TOR-3 are both connected to CNF-6 and have IP aliasing ECMP
  • TOR-4 is connected to CNF-7, CNF-8, and CNF-9
  • TOR-5 is connected to CNF-10
The CNFs in AS 64501 connect to an anycast services network. The ECMP distribution of the flows from the BL to the TORs is weighted based on the number of CNFs advertising the same anycast network to the TORs, so TOR-4 receives three times as many flows as TOR-5 or the combination of TOR-2 and TOR-3.
Figure 3. Example topology - IFL EVPN-MPLS

Configuration

The initial configuration includes:

  • cards, MDAs, ports
  • router interfaces
  • SR-ISIS on the router interfaces in AS 64500
  • IBGP between route reflector (RR) BL-1 and each TOR for the EVPN address family

Sticky ECMP is configured only on BL-1, using the following policy, which adds stickiness to prefix 10.10.0.0/24 in VPRN-10:

# on BL-1:
configure
    router Base
        policy-options
            begin
            prefix-list "cnf_ips-10"
                prefix 10.10.0.0/24 longer
            exit
            community "comm-10"
                members "target:64500:10"
            exit
            policy-statement "import-add-stickiness-vprn-10"
                entry 10
                    from
                        prefix-list "cnf_ips-10"
                        community "comm-10"
                    exit
                    action accept
                        sticky-ecmp     # add stickiness
                    exit
                exit
                entry 11
                    from
                        community "comm-10"
                    exit
                    action accept
                    exit
                exit
            exit
            commit

The sticky-ecmp command has effect only in import policies. It suffices to apply this policy to one BGP peer in the base router on BL-1, as follows. For the same destination 10.10.0.0/24, the router programs all next-hops (192.0.2.2, 192.0.2.3, 192.0.2.4, and 192.0.2.5) as sticky, even if only BGP peer 192.0.2.2 is configured with this import policy.

# on RR BL-1:
configure
    router Base
        bgp
            vpn-apply-import
            vpn-apply-export
            rapid-withdrawal
            split-horizon
            rapid-update evpn
            group "TORs"
                family evpn
                type internal
                cluster 192.0.2.1
                peer-as 64500
                neighbor 192.0.2.2
                    import "import-add-stickiness-vprn-10"
                exit
                neighbor 192.0.2.3
                exit
                neighbor 192.0.2.4
                exit
                neighbor 192.0.2.5
                exit
            exit

Alternatively, the import policy can be configured at service level, as shown in Sticky ECMP for EVPN IFL over SRv6.

On the TORs, the BGP configuration does not include such an import policy:

# on TOR-2, TOR-3, TOR-4, TOR-5:
configure
    router Base
        autonomous-system 64500
        bgp
            vpn-apply-import
            vpn-apply-export
            rapid-withdrawal
            rapid-update evpn
            group "BL"
                family evpn
                type internal
                peer-as 64500
                neighbor 192.0.2.1
                exit
            exit
        exit

The configuration of VPRN-10 on BL-1 is as follows:

# on BL-1:
configure
    service 
        vprn 10 name "VPRN-10" customer 1 create
            description "EVPN-MPLS IFL VPRN-10"
            ecmp 10
            interface "test-10" create
                address 172.20.10.1/32
                sap 1/1/c10/1:10 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 10
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.1:10
                    vrf-target target:64500:10
                    no shutdown
                exit
            exit
            no shutdown

The service configuration on TOR-2 is as follows. The Ethernet segment is configured on TOR-2 and TOR-3 for IP aliasing ECMP. EBGP is configured in VPRN-10 with neighbor CNF-6 on TOR-2 only, not on TOR-3.

# on TOR-2:
configure
    service
        system
            bgp-evpn
                ethernet-segment "ES-10" virtual create    # same on TOR-3
                    esi 00:00:00:00:00:23:23:23:10:00
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    vprn-next-hop 10.100.10.1            # IP alias on CNF-6
                    evi
                        evi-range 10
                    exit
                    no shutdown
                exit
            exit
        exit
        vprn 10 name "VPRN-10" customer 1 create
            description "EVPN-MPLS IFL VPRN-10"
            ecmp 10
            router-id 192.0.2.2                    # on TOR-3: 192.0.2.3
            autonomous-system 64500
            interface "int-VPRN10-TOR-2-to-CNF-6" create    # "...-TOR-3-..."
                address 10.10.26.1/24              # on TOR-3: 10.10.36.1/24
                sap 1/1/c3/1:10 create
                exit
            exit
            interface "loopback" create            # only on TOR-2; not on TOR-3
                address 10.100.10.2/32
                loopback
            exit
            static-route-entry 10.100.10.1/32
                next-hop 10.10.26.2                # on TOR-3: 10.10.36.2
                    no shutdown
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 10
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.2:10   # on TOR-3: 192.0.2.3:10
                    vrf-target target:64500:10
                    no shutdown
                exit
            exit
            bgp                    # only on TOR-2; no EBGP in VPRN-10 on TOR-3
                rapid-withdrawal
                split-horizon
                group "PE-CE-10"
                    family ipv4
                    type external
                    export "export-evpn-ifl-bgp"
                    peer-as 64501
                    neighbor 10.100.10.1
                        local-address 10.100.10.2
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
            exit
            no shutdown

The nodes in AS 64500 exchange EVPN IFL routes for VPRN-10, while the EBGP sessions between VPRN-10 on the TORs and the base router on the CNFs exchange BGP IPv4 routes. The export policy "export-evpn-ifl-bgp" in VPRN-10 on the TORs is needed to export BGP routes for the corresponding EVPN IFL routes:

# on TOR-2, TOR-3, TOR-4, TOR-5:
configure
    router Base
        policy-options
            begin
            policy-statement "export-evpn-ifl-bgp"
                entry 10
                    from
                        protocol evpn-ifl
                    exit
                    to
                        protocol bgp
                    exit
                    action accept
                    exit
                exit
            exit
            commit

On TOR-4, VPRN-10 is configured as follows:

# on TOR-4:
configure
    service
        vprn 10 name "VPRN-10" customer 1 create
            description "EVPN-MPLS IFL VPRN-10"
            ecmp 10
            autonomous-system 64500
            interface "int-VPRN10-TOR-4-CNF-7" create
                address 10.10.47.1/24
                sap 1/1/c4/1:10 create
                exit
            exit
            interface "int-VPRN10-TOR-4-CNF-8" create
                address 10.10.48.1/24
                sap 1/1/c5/1:10 create
                exit
            exit
            interface "int-VPRN10-TOR-4-CNF-9" create
                address 10.10.49.1/24
                sap 1/1/c6/1:10 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 10
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.4:10
                    vrf-target target:64500:10
                    no shutdown
                exit
            exit
            bgp
                multi-path
                    ipv4 10
                exit
                rapid-withdrawal
                split-horizon
                group "PE-CE-10"
                    family ipv4
                    type external
                    export "export-evpn-ifl-bgp"
                    peer-as 64501
                    neighbor 10.10.47.2
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                    neighbor 10.10.48.2
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                    neighbor 10.10.49.2
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
            exit
            no shutdown

For VPRN-10, TOR-5 only has neighbor CNF-10 and the configuration is as follows:

# on TOR-5:
configure
    service
        vprn 10 name "VPRN-10" customer 1 create
            description "EVPN-MPLS IFL VPRN-10"
            ecmp 10
            autonomous-system 64500
            interface "int-VPRN10-TOR-5-CNF-10" create
                address 10.10.105.1/24
                sap 1/1/c3/1:10 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 10
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.5:10
                    vrf-target target:64500:10
                    no shutdown
                exit
            exit
            bgp
                rapid-withdrawal
                split-horizon
                group "PE-CE-10"
                    family ipv4
                    type external
                    export "export-evpn-ifl-bgp"
                    peer-as 64501
                    neighbor 10.10.105.2
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
            exit
            no shutdown

The BGP configuration on the corresponding CNF-10 is as follows:

# on CNF-10 (simulated by SR OS node):
configure
    router Base
        policy-options
            begin
            community "comm-vrf10"
                members "target:64500:10"
            exit
            prefix-list "anycast-ip-10"
                prefix 10.10.0.0/24 longer
            exit
            policy-statement "export-anycast-ip-10"
                entry 10
                    from
                        protocol direct
                        prefix-list "anycast-ip-10" 
                    exit
                    action accept
                        community add "comm-vrf10"      # for VPRN-10 on TORs
                    exit
                exit
            exit
            commit
        exit
        bgp
            rapid-withdrawal
            split-horizon
            group "PE-CE-10"
                neighbor 10.10.105.1
                    type external
                    export "export-anycast-ip-10"
                    local-as 64501
                    peer-as 64500
                exit
            exit
            no shutdown
        exit

The configuration on the other CNFs is similar. The BGP configuration of CNF-6 uses the alias IP 10.100.10.1, as follows:

# on CNF-6 (simulated by SR OS node):
configure
    router Base
        bgp
            rapid-withdrawal
            split-horizon
            group "PE-CE-10"
                neighbor 10.100.10.2    # neighbor reachable via static route
                    local-address 10.100.10.1
                    type external
                    export "export-anycast-ip-10"
                    local-as 64501
                    peer-as 64500
                exit
            exit
            no shutdown

Verification

BL-1 applies stickiness for destination 10.10.0.0/24 through the import policy on IBGP peer 192.0.2.2. For the same destination, BL-1 programs all next-hops as sticky even if only one of the peers is configured with sticky ECMP. The following route table for prefix 10.10.0.0/24 in VPRN-10 shows that sticky ECMP is requested not only for the EVPN IFL route with next-hop 192.0.2.2, but also for the EVPN IFL routes with next-hops 192.0.2.3, 192.0.2.4, and 192.0.2.5, as indicated by the [S] flag:

*A:BL-1# show router service-name "VPRN-10" route-table 10.10.0.0

===============================================================================
Route Table (Service: 10)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.10.0.0/24   [S]                            Remote  EVPN-IFL  00h02m50s  170
       192.0.2.2 (tunneled:SR-ISIS:524291)                          10
10.10.0.0/24   [S]                            Remote  EVPN-IFL  00h02m50s  170
       192.0.2.3 (tunneled:SR-ISIS:524295)                          10
10.10.0.0/24   [S]                            Remote  EVPN-IFL  00h02m50s  170
       192.0.2.4 (tunneled:SR-ISIS:524299)                          10
10.10.0.0/24   [S]                            Remote  EVPN-IFL  00h02m50s  170
       192.0.2.5 (tunneled:SR-ISIS:524303)                          10
-------------------------------------------------------------------------------
No. of Routes: 4
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

Stickiness is applied for all routes to the same destination, regardless of the weight of these routes.
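
The rule that one sticky route makes the whole destination sticky can be expressed as a simple "any" check. The following Python sketch is a conceptual model only; the `Route` class and the `sticky_destinations` function are hypothetical names and do not represent SR OS internals.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    next_hop: str
    sticky: bool  # set when an import policy with the sticky-ecmp action matched

def sticky_destinations(routes):
    """Return the prefixes for which at least one route carries the sticky flag."""
    return {route.prefix for route in routes if route.sticky}

routes = [
    Route("10.10.0.0/24", "192.0.2.2", sticky=True),   # peer with the import policy
    Route("10.10.0.0/24", "192.0.2.4", sticky=False),
    Route("10.10.0.0/24", "192.0.2.5", sticky=False),
]

sticky = sticky_destinations(routes)
# Every next-hop of a sticky destination is programmed as sticky
programmed = [(route.next_hop, route.prefix in sticky) for route in routes]
print(programmed)
```

In this model, the flag on the route from 192.0.2.2 is enough for the next-hops 192.0.2.4 and 192.0.2.5 to be programmed as sticky too, which matches the [S] flag on every entry in the route table.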

On BL-1, the extensive route table for prefix 10.10.0.0 shows that sticky ECMP is enabled (Sticky ECMP: Yes). The ECMP weight is different for the different next-hops, but the stickiness applies to all next-hops.

*A:BL-1# show router service-name "VPRN-10" route-table 10.10.0.0 extensive

===============================================================================
Route Table (Service: 10)
===============================================================================
Dest Prefix             : 10.10.0.0/24
  Protocol              : EVPN-IFL
  Age                   : 00h03m34s
  Preference            : 170
  Sticky ECMP           : Yes
  Indirect Next-Hop     : 192.0.2.2
    Label               : 524283
    VPN Next-Hop Index  : 23
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 192.0.2.2 (SR-ISIS tunnel:524291)
      Metric            : 10
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 192.0.2.3
    Label               : 524281
    VPN Next-Hop Index  : 20
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 192.0.2.3 (SR-ISIS tunnel:524295)
      Metric            : 10
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 192.0.2.4
    Label               : 524281
    VPN Next-Hop Index  : 25
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 3
    Resolving Next-Hop  : 192.0.2.4 (SR-ISIS tunnel:524299)
      Metric            : 10
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 192.0.2.5
    Label               : 524283
    VPN Next-Hop Index  : 27
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 192.0.2.5 (SR-ISIS tunnel:524303)
      Metric            : 10
      ECMP-Weight       : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

The FIB for prefix 10.10.0.0/24 includes the S-flag, as follows:

*A:BL-1# show router service-name "VPRN-10" fib 1 10.10.0.0/24

===============================================================================
FIB Display
===============================================================================
Prefix [Flags]                                              Protocol
  NextHop
-------------------------------------------------------------------------------
10.10.0.0/24 [S]                                            EVPN-IFL
  192.0.2.2 (VPRN Label:524283 Transport:SR-ISIS:524291)
  192.0.2.3 (VPRN Label:524281 Transport:SR-ISIS:524295)
  192.0.2.4 (VPRN Label:524281 Transport:SR-ISIS:524299)
  192.0.2.5 (VPRN Label:524283 Transport:SR-ISIS:524303)
-------------------------------------------------------------------------------
Total Entries : 1
-------------------------------------------------------------------------------
Flags : S = sticky ECMP supported; R = missing hardware resources
-------------------------------------------------------------------------------
===============================================================================

TOR-2, TOR-4, and TOR-5 have a BGP route for anycast prefix 10.10.0.0/24 in the route table for VPRN-10 with default preference 170, and these TORs generate an EVPN IFL route for this anycast prefix. BL-1 receives the following three IP prefix routes for prefix 10.10.0.0/24:

*A:BL-1# show router bgp routes evpn ip-prefix prefix 10.10.0.0/24
===============================================================================
 BGP Router ID:192.0.2.1        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
Flag  Route Dist.         Prefix
      Tag                 Gw Address
                          NextHop
                          Label
                          ESI
-------------------------------------------------------------------------------
u*>i  192.0.2.2:10        10.10.0.0/24
      0                   00:00:00:00:00:00
                          192.0.2.2
                          LABEL 524283
                          00:00:00:00:00:23:23:23:10:00    # EVPN IP aliasing

u*>i  192.0.2.4:10        10.10.0.0/24
      0                   00:00:00:00:00:00
                          192.0.2.4
                          LABEL 524281
                          ESI-0

u*>i  192.0.2.5:10        10.10.0.0/24
      0                   00:00:00:00:00:00
                          192.0.2.5
                          LABEL 524283
                          ESI-0

-------------------------------------------------------------------------------
Routes : 3
===============================================================================

The detailed information of these IP prefix routes includes the Sticky flag for the route with next-hop 192.0.2.2, which is the peer that is configured with the import policy to add stickiness:

*A:BL-1# show router bgp routes evpn ip-prefix prefix 10.10.0.0/24 hunt
===============================================================================
 BGP Router ID:192.0.2.1        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network        : n/a
Nexthop        : 192.0.2.2
Path Id        : None
From           : 192.0.2.2
Res. Nexthop   : 192.168.12.2
Local Pref.    : 100                    Interface Name : int-BL-1-TOR-2
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:10 evpn-bandwidth:1:1 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.2
Origin         : IGP
Flags          : Used Valid Best Sticky
Route Source   : Internal
AS-Path        : 64501
EVPN type      : IP-PREFIX
ESI            : 00:00:00:00:00:23:23:23:10:00
Tag            : 0
Gateway Address: 00:00:00:00:00:00
Prefix         : 10.10.0.0/24
Route Dist.    : 192.0.2.2:10
MPLS Label     : LABEL 524283
Route Tag      : 0
Neighbor-AS    : 64501
DB Orig Val    : N/A                    Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default
Last Modified  : 00h04m46s

-------------------------------------------------------------------------------

Network        : n/a
Nexthop        : 192.0.2.4
Path Id        : None
From           : 192.0.2.4
Res. Nexthop   : 192.168.14.2
Local Pref.    : 100                    Interface Name : int-BL-1-TOR-4
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:10 evpn-bandwidth:1:3 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.4
Origin         : IGP
Flags          : Used Valid Best
Route Source   : Internal
AS-Path        : 64501
EVPN type      : IP-PREFIX
ESI            : ESI-0
Tag            : 0
Gateway Address: 00:00:00:00:00:00
Prefix         : 10.10.0.0/24
Route Dist.    : 192.0.2.4:10
MPLS Label     : LABEL 524281
Route Tag      : 0
Neighbor-AS    : 64501
DB Orig Val    : N/A                    Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default
Last Modified  : 00h04m30s

-------------------------------------------------------------------------------

Network        : n/a
Nexthop        : 192.0.2.5
Path Id        : None
From           : 192.0.2.5
Res. Nexthop   : 192.168.15.2
Local Pref.    : 100                    Interface Name : int-BL-1-TOR-5
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:10 evpn-bandwidth:1:1 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.5
Origin         : IGP
Flags          : Used Valid Best
Route Source   : Internal
AS-Path        : 64501
EVPN type      : IP-PREFIX
ESI            : ESI-0
Tag            : 0
Gateway Address: 00:00:00:00:00:00
Prefix         : 10.10.0.0/24
Route Dist.    : 192.0.2.5:10
MPLS Label     : LABEL 524283
Route Tag      : 0
Neighbor-AS    : 64501
DB Orig Val    : N/A                    Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default
Last Modified  : 00h04m21s

-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
---snip---
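
The weights used for weighted ECMP can be read from the Community lines in the preceding output. The following Python sketch extracts them from saved CLI text. It is a hypothetical helper, and it assumes that the last field of the evpn-bandwidth community is the weight, which is consistent with the ECMP-Weight values in the extensive route table; the first field is not interpreted.

```python
import re

# Matches for example "evpn-bandwidth:1:3"; only the last field is used as the weight
EVPN_BW = re.compile(r"evpn-bandwidth:(\d+):(\d+)")

def evpn_weight(community_line):
    """Return the weight from an evpn-bandwidth community, or None if it is absent."""
    match = EVPN_BW.search(community_line)
    return int(match.group(2)) if match else None

lines = {
    "192.0.2.2": "Community      : target:64500:10 evpn-bandwidth:1:1 bgp-tunnel-encap:MPLS",
    "192.0.2.4": "Community      : target:64500:10 evpn-bandwidth:1:3 bgp-tunnel-encap:MPLS",
    "192.0.2.5": "Community      : target:64500:10 evpn-bandwidth:1:1 bgp-tunnel-encap:MPLS",
}
parsed = {next_hop: evpn_weight(line) for next_hop, line in lines.items()}
print(parsed)
```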

Sticky ECMP for EVPN IFL over SRv6

Example topology - IFL EVPN-SRv6 shows the topology with VPRN-20 in an EVPN SRv6 network.

Figure 4. Example topology - IFL EVPN-SRv6

Configuration

The initial configuration includes:
  • cards, MDAs, ports
  • router interfaces
  • IS-IS on the router interfaces of BL-1 and the TORs, except for the router interfaces between TORs and CNFs
  • SRv6 on BL-1 and the TORs
  • IBGP on BL-1 and the TORs with BL-1 acting as RR

On BL-1, the policy "add-stickiness-vprn-20" can be applied at BGP peer level or at service level. In the following configuration, the policy is applied as a vrf-import policy in VPRN-20:

# on BL-1:
configure
    router Base
        policy-options
            begin
            prefix-list "cnf_ips-20"
                prefix 10.20.0.0/24 longer
            exit
            community "comm-20"
                members "target:64500:20"
            exit
            policy-statement "AS-20"
                entry 1
                    action accept
                        as-path-prepend 64503 1
                        community add "comm-20"
                    exit
                exit
            exit
            policy-statement "add-stickiness-vprn-20"
                entry 12
                    from
                        prefix-list "cnf_ips-20"
                        community "comm-20"
                    exit
                    action accept
                        sticky-ecmp
                    exit
                exit
                entry 13
                    from
                        community "comm-20"
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit
    exit
    service 
        vprn 20 name "VPRN-20" customer 1 create
            description "IFL-SRv6"
            ecmp 10
            interface "test-20" create
                address 172.20.20.1/30
                sap 1/1/c10/1:20 create
                exit
            exit
            segment-routing-v6 1 create
                locator "BL1-loc"
                    function
                        end-dt4
                        end-dt6
                        end-dt46
                    exit
                exit
            exit
            bgp-evpn
                segment-routing-v6 bgp 1 
                    evi 20
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.1:20
                    source-address 2001:db8::2:1
                    srv6-instance 1 default-locator "BL1-loc" 
                    vrf-export "AS-20"
                    vrf-import "add-stickiness-vprn-20"
                    vrf-target target:64500:20
                    no shutdown
                exit
            exit
            no shutdown

The configuration of VPRN-20 on the TORs is similar, but without the stickiness. IP aliasing ECMP is implemented on TOR-2 and TOR-3 in a similar way as in Sticky ECMP for EVPN IFL over MPLS. The configuration of the CNFs is also similar.

Verification

On BL-1, the routes for prefix 10.20.0.0 have sticky ECMP enabled for all next-hops, as follows:

*A:BL-1# show router service-name "VPRN-20" route-table 10.20.0.0/24

===============================================================================
Route Table (Service: 20)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.20.0.0/24   [S]                            Remote  EVPN-IFL  00h03m58s  170
       2001:db8:aaaa:102:7b1d:b000:: (tunneled:SRV6)                10
10.20.0.0/24   [S]                            Remote  EVPN-IFL  00h03m58s  170
       2001:db8:aaaa:103:7b1d:9000:: (tunneled:SRV6)                10
10.20.0.0/24   [S]                            Remote  EVPN-IFL  00h03m58s  170
       2001:db8:aaaa:104:7b1d:9000:: (tunneled:SRV6)                10
10.20.0.0/24   [S]                            Remote  EVPN-IFL  00h03m58s  170
       2001:db8:aaaa:105:7b1d:b000:: (tunneled:SRV6)                10
-------------------------------------------------------------------------------
No. of Routes: 4
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

The extensive route table also shows the stickiness, which applies to all next-hops regardless of their ECMP weights.

*A:BL-1# show router service-name "VPRN-20" route-table 10.20.0.0 extensive

===============================================================================
Route Table (Service: 20)
===============================================================================
Dest Prefix             : 10.20.0.0/24
  Protocol              : EVPN-IFL
  Age                   : 00h02m05s
  Preference            : 170
  Sticky ECMP           : Yes
  Indirect Next-Hop     : 192.0.2.2
    SRV6 SID            : 2001:db8:aaaa:102:7b1d:b000::
    VPN Next-Hop Index  : 33
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8:aaaa:102:7b1d:b000:: (SRV6 tunnel)
      Metric            : 10
      ECMP-Weight       : 1
  Indirect Next-Hop     : 192.0.2.3
    SRV6 SID            : 2001:db8:aaaa:103:7b1d:9000::
    VPN Next-Hop Index  : 35
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8:aaaa:103:7b1d:9000:: (SRV6 tunnel)
      Metric            : 10
      ECMP-Weight       : 1
  Indirect Next-Hop     : 192.0.2.4
    SRV6 SID            : 2001:db8:aaaa:104:7b1d:9000::
    VPN Next-Hop Index  : 37
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 3
    Resolving Next-Hop  : 2001:db8:aaaa:104:7b1d:9000:: (SRV6 tunnel)
      Metric            : 10
      ECMP-Weight       : 3
  Indirect Next-Hop     : 192.0.2.5
    SRV6 SID            : 2001:db8:aaaa:105:7b1d:b000::
    VPN Next-Hop Index  : 38
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8:aaaa:105:7b1d:b000:: (SRV6 tunnel)
      Metric            : 10
      ECMP-Weight       : 1
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

BL-1 uses the following EVPN IP prefix routes:

*A:BL-1# show router bgp routes evpn ip-prefix prefix 10.20.0.0/24
===============================================================================
 BGP Router ID:192.0.2.1        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
Flag  Route Dist.         Prefix
      Tag                 Gw Address
                          NextHop
                          Label
                          ESI
-------------------------------------------------------------------------------
u*>i  192.0.2.2:20        10.20.0.0/24
      0                   00:00:00:00:00:00
                          192.0.2.2
                          504283
                          00:00:00:00:00:23:23:23:20:00    # IP aliasing ECMP

u*>i  192.0.2.4:20        10.20.0.0/24
      0                   00:00:00:00:00:00
                          192.0.2.4
                          504281
                          ESI-0

u*>i  192.0.2.5:20        10.20.0.0/24
      0                   00:00:00:00:00:00
                          192.0.2.5
                          504283
                          ESI-0

-------------------------------------------------------------------------------
Routes : 3
===============================================================================

The detailed information of these IP prefix routes includes the Sticky flag for all next-hops, as follows:

*A:BL-1# show router bgp routes evpn ip-prefix prefix 10.20.0.0/24 hunt
===============================================================================
 BGP Router ID:192.0.2.1        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network        : n/a
Nexthop        : 192.0.2.2
Path Id        : None
From           : 192.0.2.2
Res. Nexthop   : 192.168.12.2
Local Pref.    : 100                    Interface Name : int-BL-1-TOR-2
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:20 evpn-bandwidth:1:1
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.2
Origin         : IGP
Flags          : Used Valid Best Sticky
Route Source   : Internal
AS-Path        : 64501
EVPN type      : IP-PREFIX
ESI            : 00:00:00:00:00:23:23:23:20:00
Tag            : 0
Gateway Address: 00:00:00:00:00:00
Prefix         : 10.20.0.0/24
Route Dist.    : 192.0.2.2:20
MPLS Label     : 504283
Route Tag      : 0
Neighbor-AS    : 64501
DB Orig Val    : N/A                    Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default
Last Modified  : 00h01m13s
SRv6 TLV Type  : SRv6 L3 Service TLV (5)
SRv6 SubTLV    : SRv6 SID Information (1)
Sid            : 2001:db8:aaaa:102::
Full Sid       : 2001:db8:aaaa:102:7b1d:b000::
Behavior       : End.DT4 (19)
SRv6 SubSubTLV : SRv6 SID Structure (1)
Loc-Block-Len  : 48                     Loc-Node-Len   : 16
Func-Len       : 20                     Arg-Len        : 0
Tpose-Len      : 20                     Tpose-offset   : 64

-------------------------------------------------------------------------------

Network        : n/a
Nexthop        : 192.0.2.4
Path Id        : None
From           : 192.0.2.4
Res. Nexthop   : 192.168.14.2
Local Pref.    : 100                    Interface Name : int-BL-1-TOR-4
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:20 evpn-bandwidth:1:3
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.4
Origin         : IGP
Flags          : Used Valid Best Sticky
Route Source   : Internal
AS-Path        : 64501
EVPN type      : IP-PREFIX
ESI            : ESI-0
Tag            : 0
Gateway Address: 00:00:00:00:00:00
Prefix         : 10.20.0.0/24
Route Dist.    : 192.0.2.4:20
MPLS Label     : 504281
Route Tag      : 0
Neighbor-AS    : 64501
DB Orig Val    : N/A                    Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default
Last Modified  : 00h00m58s
SRv6 TLV Type  : SRv6 L3 Service TLV (5)
SRv6 SubTLV    : SRv6 SID Information (1)
Sid            : 2001:db8:aaaa:104::
Full Sid       : 2001:db8:aaaa:104:7b1d:9000::
Behavior       : End.DT4 (19)
SRv6 SubSubTLV : SRv6 SID Structure (1)
Loc-Block-Len  : 48                     Loc-Node-Len   : 16
Func-Len       : 20                     Arg-Len        : 0
Tpose-Len      : 20                     Tpose-offset   : 64

-------------------------------------------------------------------------------

Network        : n/a
Nexthop        : 192.0.2.5
Path Id        : None
From           : 192.0.2.5
Res. Nexthop   : 192.168.15.2
Local Pref.    : 100                    Interface Name : int-BL-1-TOR-5
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:20 evpn-bandwidth:1:1
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.5
Origin         : IGP
Flags          : Used Valid Best Sticky
Route Source   : Internal
AS-Path        : 64501
EVPN type      : IP-PREFIX
ESI            : ESI-0
Tag            : 0
Gateway Address: 00:00:00:00:00:00
Prefix         : 10.20.0.0/24
Route Dist.    : 192.0.2.5:20
MPLS Label     : 504283
Route Tag      : 0
Neighbor-AS    : 64501
DB Orig Val    : N/A                    Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default
Last Modified  : 00h00m45s
SRv6 TLV Type  : SRv6 L3 Service TLV (5)
SRv6 SubTLV    : SRv6 SID Information (1)
Sid            : 2001:db8:aaaa:105::
Full Sid       : 2001:db8:aaaa:105:7b1d:b000::
Behavior       : End.DT4 (19)
SRv6 SubSubTLV : SRv6 SID Structure (1)
Loc-Block-Len  : 48                     Loc-Node-Len   : 16
Func-Len       : 20                     Arg-Len        : 0
Tpose-Len      : 20                     Tpose-offset   : 64

-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
---snip---
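
The visual check of the hunt output can also be scripted. The following Python sketch assumes that the output was captured as text and that each RIB In entry carries Nexthop, Community, and Flags lines in the layout shown above. The function name parse_rib_in, the default weight of 1, and the reading of the last field of evpn-bandwidth:1:N as the weight are illustrative assumptions, not part of SR OS.

```python
import re

# Excerpt of the three RIB In entries shown above (only the lines that are parsed).
SAMPLE = """\
Nexthop        : 192.0.2.2
Community      : target:64500:20 evpn-bandwidth:1:1
Flags          : Used Valid Best Sticky
Nexthop        : 192.0.2.4
Community      : target:64500:20 evpn-bandwidth:1:3
Flags          : Used Valid Best Sticky
Nexthop        : 192.0.2.5
Community      : target:64500:20 evpn-bandwidth:1:1
Flags          : Used Valid Best Sticky
"""

def parse_rib_in(text):
    """Return one dict per route with its next-hop, weight, and sticky flag."""
    routes = []
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Nexthop":
            # A new entry starts; weight 1 is an assumed default.
            routes.append({"nexthop": value, "weight": 1, "sticky": False})
        elif key == "Community" and routes:
            match = re.search(r"evpn-bandwidth:\d+:(\d+)", value)
            if match:
                routes[-1]["weight"] = int(match.group(1))
        elif key == "Flags" and routes:
            routes[-1]["sticky"] = "Sticky" in value.split()
    return routes

routes = parse_rib_in(SAMPLE)
total = sum(r["weight"] for r in routes)
for r in routes:
    print(f"{r['nexthop']}: sticky={r['sticky']} share={r['weight']}/{total}")
# 192.0.2.2: sticky=True share=1/5
# 192.0.2.4: sticky=True share=3/5
# 192.0.2.5: sticky=True share=1/5
```

The shares of 1/5, 3/5, and 1/5 correspond to the weights 1, 3, and 1 advertised by the three routes.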

Sticky ECMP for EVPN IFF over VXLAN

Example topology - IFF EVPN-VXLAN shows the topology with R-VPLS BD-31 linked to VPRN-30 in an EVPN-VXLAN network.

Figure 5. Example topology - IFF EVPN-VXLAN

Configuration

The initial configuration includes:
  • cards, MDAs, ports
  • router interfaces
  • IS-IS on the router interfaces of BL-1 and the TORs, except for the router interfaces between TORs and CNFs
  • IBGP on BL-1 and the TORs with BL-1 acting as RR

On BL-1, the import policy adds ECMP stickiness to R-VPLS BD-31. This import policy is applied as vsi-import in R-VPLS BD-31, but it can also be applied at BGP peer level. The configuration on BL-1 is as follows:

# on BL-1:
configure
    router Base
        policy-options
            begin
            prefix-list "cnf_ips-30"
                prefix 10.30.0.0/24 longer
            exit
            community "comm-31"
                members "target:64500:31"
            exit
            policy-statement "import-add-stickiness-rvpls-31"
                entry 10
                    from
                        prefix-list "cnf_ips-30"
                        community "comm-31"
                    exit
                    action accept
                        sticky-ecmp     # add stickiness
                    exit
                exit
                entry 11
                    from
                        community "comm-31"
                    exit
                    action accept 
                    exit
                exit
            exit
            commit
        exit
    exit
    service 
        vprn 30 name "VPRN-30" customer 1 create
            ecmp 10
            interface "test-30" create
                address 172.20.30.1/32
                loopback
            exit
            interface "int-BD-31" create
                vpls "BD-31"
                    evpn-tunnel
                exit
            exit
            no shutdown
        exit
        vpls 31 name "BD-31" customer 1 create
            description "broadcast domain 31 connected to VPRN-30"
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 31 create
            exit
            bgp
                vsi-import "import-add-stickiness-rvpls-31" 
            exit
            bgp-evpn
                ip-route-advertisement
                ip-route-link-bandwidth
                    advertise weight dynamic max-dynamic-weight 128
                    weighted-ecmp
                exit
                evi 31
                vxlan bgp 1 vxlan-instance 1
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit
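
The matching logic of the import policy on BL-1 can be pictured with a small Python sketch. It is a simplified model, not SR OS code: it assumes that entries are evaluated in ascending order with the first match deciding the action, and that the "longer" keyword matches the /24 itself as well as more specific prefixes (consistent with 10.30.0.0/24 being installed as a sticky route in this chapter). The function name evaluate_import is hypothetical, and the default action of the policy is not modelled.

```python
import ipaddress

PREFIX_LIST = ipaddress.ip_network("10.30.0.0/24")  # prefix-list "cnf_ips-30" (longer)
COMMUNITY = "target:64500:31"                        # community "comm-31"

def evaluate_import(prefix, communities):
    """Return (action, sticky) for a route, or None when no entry matches."""
    net = ipaddress.ip_network(prefix)
    in_prefix_list = (net.version == PREFIX_LIST.version
                      and net.subnet_of(PREFIX_LIST))
    has_community = COMMUNITY in communities
    # entry 10: prefix-list AND community -> accept and request sticky ECMP
    if in_prefix_list and has_community:
        return ("accept", True)
    # entry 11: community only -> accept without stickiness
    if has_community:
        return ("accept", False)
    return None  # no entry matched; the default action is not modelled here

print(evaluate_import("10.30.0.0/24", {"target:64500:31"}))  # ('accept', True)
print(evaluate_import("10.31.0.0/24", {"target:64500:31"}))  # ('accept', False)
print(evaluate_import("10.30.0.0/24", {"target:64500:99"}))  # None
```

Only routes that match both the prefix list and the community become sticky; other routes with the same route target are still imported, but without stickiness.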

The configuration on TOR-2 is as follows. The EVI in the ES for IP aliasing ECMP corresponds to the R-VPLS BD-31.

# on TOR-2:
configure
    router Base
        policy-options
            begin
            policy-statement "export-to-bgp"
                entry 10
                    to
                        protocol bgp
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit
    exit
    service
        system
            bgp-evpn
                ethernet-segment "ES-31" virtual create    # same on TOR-3
                    esi 00:00:00:00:00:23:23:23:31:00
                    service-carving
                        mode auto
                    exit
                    multi-homing all-active
                    vprn-next-hop 10.100.30.1
                    evi
                        evi-range 31
                    exit
                    no shutdown
                exit
            exit
        exit
        vprn 30 name "VPRN-30" customer 1 create
            ecmp 10
            autonomous-system 64500
            interface "int-BD-31" create
                vpls "BD-31"
                    evpn-tunnel
                exit
            exit
            interface "int-VPRN30-TOR-2-to-CNF-6" create
                address 10.30.26.1/24        # on TOR-3: 10.30.36.1/24
                sap 1/1/c3/1:30 create
                exit
            exit
            interface "loopback" create      # only on TOR-2, not on TOR-3
                address 10.100.30.2/32
                loopback
            exit
            static-route-entry 10.100.30.1/32
                next-hop 10.30.26.2         # on TOR-3: 10.30.36.2
                    no shutdown
                exit
            exit
            bgp                             # only on TOR-2, no EBGP on TOR-3
                preference 168      # preferred over EVPN IFF routes (169)
                rapid-withdrawal
                group "PE-CE"
                    type external
                    export "export-to-bgp"  # from any protocol to bgp
                    peer-as 64501
                    neighbor 10.100.30.1    # IP alias
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
                no shutdown
            exit
            no shutdown
        exit
        vpls 31 name "BD-31" customer 1 create    # same on TOR-3, TOR-4, TOR-5
            allow-ip-int-bind
            exit
            vxlan instance 1 vni 31 create
            exit
            bgp
            exit
            bgp-evpn
                ip-route-advertisement
                ip-route-link-bandwidth
                    advertise weight dynamic max-dynamic-weight 128
                    weighted-ecmp
                exit
                evi 31
                vxlan bgp 1 vxlan-instance 1
                    auto-disc-route-advertisement
                    mh-mode network
                    no shutdown
                exit
            exit
            stp
                shutdown
            exit
            no shutdown
        exit

The configuration on the other TORs is similar, but the interface addresses are used instead of the IP alias.

The route with the lowest preference value is preferred. EVPN IFF routes have a default preference of 169, whereas EVPN IFL routes have a default preference of 170. With default settings, the TORs with EBGP sessions to the CNFs would receive the EBGP route for the anycast prefix with the default BGP preference of 170 and advertise an EVPN IFF route for the anycast prefix. When these TORs in turn receive EVPN IFF routes for the anycast prefix from the other TORs, those routes have preference 169 and are therefore preferred over the EBGP route. The TORs would then install the received EVPN IFF route in the route table for VPRN-30 and withdraw their own EVPN IFF route, so BL-1 would not have an EVPN IFF route to each TOR. To prefer the EBGP route over the EVPN IFF routes, the preference of the EBGP routes is configured with a value lower than 169; in this example, 168.
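
The preference comparison can be illustrated with a short Python sketch. It assumes only that the route with the lowest preference value wins, as this section implies; the tuple layout and the function name best_route are hypothetical, and no other tie-breakers are modelled.

```python
def best_route(candidates):
    """Pick the (protocol, preference) pair with the lowest preference value."""
    return min(candidates, key=lambda candidate: candidate[1])

# Default preferences: the received EVPN IFF route beats the local EBGP route.
default_prefs = [("BGP", 170), ("EVPN-IFF", 169)]
# Preferences as configured on the TORs in this example (bgp preference 168).
tuned_prefs = [("BGP", 168), ("EVPN-IFF", 169)]

print(best_route(default_prefs))  # ('EVPN-IFF', 169)
print(best_route(tuned_prefs))    # ('BGP', 168)
```

With the configured value of 168, each TOR keeps its own EBGP route active and therefore continues to advertise its EVPN IFF route to BL-1.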

Verification

TOR-2, TOR-4, and TOR-5 install the prefix 10.30.0.0/24 in VPRN-30 as a BGP route with the configured preference 168. On TOR-2, the route is as follows:

*A:TOR-2# show router 30 route-table 10.30.0.0 

===============================================================================
Route Table (Service: 30)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
10.30.0.0/24                                  Remote  BGP       00h01m14s  168
       10.30.26.2                                                   1
-------------------------------------------------------------------------------
No. of Routes: 1

The BGP route for prefix 10.30.0.0/24 is preferred over the EVPN IFF routes for the same anycast prefix, so TOR-2, TOR-4, and TOR-5 each generate an EVPN IFF route for prefix 10.30.0.0/24. BL-1 receives the following three IP prefix routes for prefix 10.30.0.0/24:

*A:BL-1# show router bgp routes evpn ip-prefix prefix 10.30.0.0/24
===============================================================================
 BGP Router ID:192.0.2.1        AS:64500       Local AS:64500
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
Flag  Route Dist.         Prefix
      Tag                 Gw Address
                          NextHop
                          Label
                          ESI
-------------------------------------------------------------------------------
u*>i  192.0.2.2:31        10.30.0.0/24
      0                   00:02:fe:ff:ff:5c
                          192.0.2.2
                          VNI 31
                          00:00:00:00:00:23:23:23:31:00

u*>i  192.0.2.4:31        10.30.0.0/24
      0                   00:04:fe:ff:ff:5c
                          192.0.2.4
                          VNI 31
                          ESI-0

u*>i  192.0.2.5:31        10.30.0.0/24
      0                   00:05:fe:ff:ff:5c
                          192.0.2.5
                          VNI 31
                          ESI-0

-------------------------------------------------------------------------------
Routes : 3
===============================================================================

On BL-1, the EVPN-IFF routes for prefix 10.30.0.0 in VPRN-30 have stickiness for all next-hops, as follows:

*A:BL-1# show router service-name "VPRN-30" route-table 10.30.0.0

===============================================================================
Route Table (Service: 30)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
10.30.0.0/24   [S]                            Remote  EVPN-IFF  00h00m26s  169
       int-BD-31 (ET-00:02:fe:ff:ff:5c)                             0
10.30.0.0/24   [S]                            Remote  EVPN-IFF  00h00m26s  169
       int-BD-31 (ET-00:03:fe:ff:ff:5c)                             0
10.30.0.0/24   [S]                            Remote  EVPN-IFF  00h00m26s  169
       int-BD-31 (ET-00:04:fe:ff:ff:5c)                             0
10.30.0.0/24   [S]                            Remote  EVPN-IFF  00h00m26s  169
       int-BD-31 (ET-00:05:fe:ff:ff:5c)                             0
-------------------------------------------------------------------------------
No. of Routes: 4
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
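
The extensive route table also shows the stickiness and the ECMP weight of each next-hop, as follows:

*A:BL-1# show router service-name "VPRN-30" route-table 10.30.0.0 extensive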

===============================================================================
Route Table (Service: 30)
===============================================================================
Dest Prefix             : 10.30.0.0/24
  Protocol              : EVPN-IFF
  Age                   : 00h00m58s
  Preference            : 169
  Sticky ECMP           : Yes
  Next-Hop              : int-BD-31 (ET-00:02:fe:ff:ff:5c)
    Interface           : int-BD-31
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    Metric              : 0
    ECMP-Weight         : 1
  Next-Hop              : int-BD-31 (ET-00:03:fe:ff:ff:5c)
    Interface           : int-BD-31
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    Metric              : 0
    ECMP-Weight         : 1
  Next-Hop              : int-BD-31 (ET-00:04:fe:ff:ff:5c)
    Interface           : int-BD-31
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    Metric              : 0
    ECMP-Weight         : 3
  Next-Hop              : int-BD-31 (ET-00:05:fe:ff:ff:5c)
    Interface           : int-BD-31
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    Metric              : 0
    ECMP-Weight         : 1
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

Conclusion

When an additional EVPN IP prefix route with a new next-hop is advertised for a prefix, or when an EVPN IP prefix route for that prefix is withdrawn, the number of paths changes and therefore the flow distribution changes. Upon withdrawal of one of the next-hops, sticky ECMP redistributes only the affected flows. When a next-hop is added, sticky ECMP minimizes the impact on existing flows. This limits the number of TCP resets. The stickiness is solely associated with next-hops and not with links at LAG level.

Appendix

The sticky ECMP implementation is based on software. The ECMP behavior is emulated by repeating each ECMP next-hop of the sticky route a number of times in different hashing buckets, in proportion to the normalized weight of the next-hop. The assignment of hashing buckets is not based on the number of existing next-hops for a route, but on the maximum number of internal hashing buckets, which is 64 in FP-based platforms and 16 in IXR platforms.

Note: The closer the number of next-hops is to the maximum number of ECMP paths (64 for SR OS), the worse the distribution algorithm works. (In the example, only three next-hops are used for 64 ECMP paths.)
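
As an illustration of the bucket emulation, the following Python sketch fills 64 buckets by repeating a weighted unit of next-hops. The repeating layout and the function names weighted_cycle and fill_buckets are assumptions chosen for readability; the actual SR OS assignment depends on the history of next-hop changes and is not specified in this chapter.

```python
from functools import reduce
from math import gcd

BUCKETS = 64  # 64 on FP-based platforms, 16 on IXR platforms

def weighted_cycle(weights):
    """Expand {next_hop: weight} into one repeating unit, e.g. [1, 2, 2, 2, 3]."""
    divisor = reduce(gcd, weights.values())  # normalize the weights
    unit = []
    for next_hop, weight in weights.items():
        unit.extend([next_hop] * (weight // divisor))
    return unit

def fill_buckets(weights, buckets=BUCKETS):
    """Assign every bucket to a next-hop by repeating the weighted unit."""
    unit = weighted_cycle(weights)
    return [unit[i % len(unit)] for i in range(buckets)]

table = fill_buckets({1: 1, 2: 3, 3: 1})
print(table[:10])                                  # [1, 2, 2, 2, 3, 1, 2, 2, 2, 3]
print({nh: table.count(nh) for nh in (1, 2, 3)})   # {1: 13, 2: 39, 3: 12}
```

Because 64 is not a multiple of 5, the bucket counts (13, 39, and 12) only approximate the 1:3:1 weight ratio.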

Sticky ECMP flow distribution when one next-hop is removed for 10.10.0.0/24 compares the initial sticky ECMP distribution over three next-hops with the sticky ECMP flow distribution after next-hop 3 is removed, as in Redistributed traffic flows after CNF-10 is removed. Next-hop 2 has weight 3, whereas next-hops 1 and 3 each have weight 1.

Table 1. Sticky ECMP flow distribution when one next-hop is removed for 10.10.0.0/24

Initial: sticky ECMP distribution with next-hop 1 (weight 1), next-hop 2 (weight 3), and next-hop 3 (weight 1)
After:   sticky ECMP distribution after next-hop 3 fails

Initial            | After
bucket  next-hop   | bucket  next-hop
00      1          | 00      1
01      2          | 01      2
02      2          | 02      2
03      2          | 03      2
04      3          | 04      1
05      1          | 05      1
06      2          | 06      2
07      2          | 07      2
08      2          | 08      2
09      3          | 09      2
10      1          | 10      1
11      2          | 11      2
12      2          | 12      2
13      2          | 13      2
14      3          | 14      2
15      1          | 15      1
16      2          | 16      2
17      2          | 17      2
18      2          | 18      2
19      3          | 19      2
20      1          | 20      1
21      2          | 21      2
22      2          | 22      2
23      2          | 23      2
24      3          | 24      1
25      1          | 25      1
26      2          | 26      2
27      2          | 27      2
28      2          | 28      2
29      3          | 29      2
30      1          | 30      1
31      2          | 31      2
32      2          | 32      2
33      2          | 33      2
34      3          | 34      2
35      1          | 35      1
36      2          | 36      2
37      2          | 37      2
38      2          | 38      2
39      3          | 39      2
40      1          | 40      1
41      2          | 41      2
42      2          | 42      2
43      2          | 43      2
44      3          | 44      1
45      1          | 45      1
46      2          | 46      2
47      2          | 47      2
48      2          | 48      2
49      3          | 49      2
50      1          | 50      1
51      2          | 51      2
52      2          | 52      2
53      2          | 53      2
54      3          | 54      2
55      1          | 55      1
56      2          | 56      2
57      2          | 57      2
58      2          | 58      2
59      3          | 59      2
60      1          | 60      1
61      2          | 61      2
62      2          | 62      2
63      2          | 63      2

All existing flows with next-hops 1 (TOR-2) or 2 (TOR-4) remain unchanged; only the flows with next-hop 3 (TOR-5) are redistributed over the remaining paths according to the weighted ECMP set.
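
The removal case in Table 1 can be reproduced with a minimal Python sketch. It assumes that the buckets freed by the failed next-hop are handed out to the remaining next-hops in a repeating weighted order; this rule and the function name remove_next_hop are illustrative assumptions, not the documented SR OS algorithm.

```python
from itertools import cycle

def remove_next_hop(buckets, removed, weights):
    """Reassign only the buckets owned by `removed`; leave all others untouched.

    `weights` maps each remaining next-hop to its ECMP weight.
    """
    # Replacement order follows the weights, e.g. {1: 1, 2: 3} -> 1, 2, 2, 2, 1, ...
    replacements = cycle([nh for nh, w in weights.items() for _ in range(w)])
    return [next(replacements) if owner == removed else owner
            for owner in buckets]

initial = [[1, 2, 2, 2, 3][i % 5] for i in range(64)]  # initial columns of Table 1
after = remove_next_hop(initial, removed=3, weights={1: 1, 2: 3})

moved = [i for i in range(64) if initial[i] != after[i]]
print(moved)                      # [4, 9, 14, 19, 24, 29, 34, 39, 44, 49, 54, 59]
print([after[i] for i in moved])  # [1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2]
```

Only the 12 buckets that pointed to next-hop 3 change owner: three move to next-hop 1 and nine move to next-hop 2, in line with the 1:3 weight ratio of the remaining next-hops.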

Similarly, when the initial ECMP distribution has two next-hops (next-hop 1 with weight 1 and next-hop 2 with weight 3) and a third next-hop (next-hop 3 with weight 1) is added, the stickiness ensures that only about 20% of the flows are redistributed (12 of the 64 hashing buckets), as shown in Sticky ECMP flow distribution when one next-hop is added for 10.10.0.0/24. The initial distribution differs from the one in the preceding table.

Table 2. Sticky ECMP flow distribution when one next-hop is added for 10.10.0.0/24

Initial: sticky ECMP distribution with next-hop 1 (weight 1) and next-hop 2 (weight 3)
After:   sticky ECMP distribution after next-hop 3 (weight 1) is added

Initial            | After
bucket  next-hop   | bucket  next-hop
00      1          | 00      1
01      2          | 01      2
02      2          | 02      2
03      2          | 03      2
04      1          | 04      3
05      2          | 05      2
06      2          | 06      2
07      2          | 07      2
08      1          | 08      1
09      2          | 09      3
10      2          | 10      2
11      2          | 11      2
12      1          | 12      1
13      2          | 13      2
14      2          | 14      3
15      2          | 15      2
16      1          | 16      1
17      2          | 17      2
18      2          | 18      2
19      2          | 19      3
20      1          | 20      1
21      2          | 21      2
22      2          | 22      2
23      2          | 23      2
24      1          | 24      3
25      2          | 25      2
26      2          | 26      2
27      2          | 27      2
28      1          | 28      1
29      2          | 29      3
30      2          | 30      2
31      2          | 31      2
32      1          | 32      1
33      2          | 33      2
34      2          | 34      3
35      2          | 35      2
36      1          | 36      1
37      2          | 37      2
38      2          | 38      2
39      2          | 39      3
40      1          | 40      1
41      2          | 41      2
42      2          | 42      2
43      2          | 43      2
44      1          | 44      3
45      2          | 45      2
46      2          | 46      2
47      2          | 47      2
48      1          | 48      1
49      2          | 49      3
50      2          | 50      2
51      2          | 51      2
52      1          | 52      1
53      2          | 53      2
54      2          | 54      3
55      2          | 55      2
56      1          | 56      1
57      2          | 57      2
58      2          | 58      2
59      2          | 59      3
60      1          | 60      1
61      2          | 61      2
62      2          | 62      2
63      2          | 63      2

Note: With sticky ECMP, the distribution over the hashing buckets is not deterministic. The initial distribution is the result of a number of changes (added or deleted next-hops) that happened beforehand, and sticky ECMP keeps as many flows as possible on their existing next-hops.
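
The addition case in Table 2 can be sketched in Python in the same spirit. The sketch assumes that the new next-hop takes over exactly those buckets that a repeating weighted layout of the new next-hop set would give it, while every other bucket keeps its current owner. This selection rule and the function name add_next_hop are illustrative assumptions; as the note above states, the real distribution depends on earlier changes.

```python
def add_next_hop(buckets, new, weights):
    """Give `new` its weighted share of buckets; all other buckets keep their owner.

    `weights` maps all next-hops, including `new`, to their ECMP weights.
    """
    unit = [nh for nh, w in weights.items() for _ in range(w)]  # e.g. [1, 2, 2, 2, 3]
    return [new if unit[i % len(unit)] == new else owner
            for i, owner in enumerate(buckets)]

initial = [[1, 2, 2, 2][i % 4] for i in range(64)]  # initial columns of Table 2
after = add_next_hop(initial, new=3, weights={1: 1, 2: 3, 3: 1})

moved = [i for i in range(64) if initial[i] != after[i]]
print(len(moved), f"{len(moved) / 64:.2%}")         # 12 18.75%
print({nh: after.count(nh) for nh in (1, 2, 3)})    # {1: 13, 2: 39, 3: 12}
```

Twelve of the 64 buckets (18.75%, or roughly 20%) move to next-hop 3; flows hashed to the other 52 buckets stay on their original next-hop.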