BGP Add-Path
This chapter provides information about BGP Add-Path.
Applicability
The chapter was initially written for SR OS Release 14.0.R7, but the CLI in the current edition is based on SR OS Release 22.2.R2.
Overview
When a BGP router learns multiple paths for the same prefix, it selects one route as its best path and advertises only this route to its BGP peers. The BGP add-path feature allows advertising the best n paths for the same prefix, where n is configurable. If the set of n paths includes multiple paths with the same BGP next hop, only the best route with a specific next hop is advertised and the other paths are suppressed.
The BGP add-path feature increases path visibility in the Autonomous System (AS), because more routes are stored in the Routing Information Base (RIB). BGP add-path has the following benefits:
Faster convergence after failure
Enhanced load-sharing
Reduced routing churn
These benefits are described in the following sections.
Faster convergence after failure
RR advertises best path only – path A preferred over path B shows a network that does not support add-path. CE-4 advertises two paths for prefix 10.0.4.0/24 to its EBGP neighbors: PE-1 and PE-2. PE-1 has an import policy that sets the local preference (LP) of path A to 200; PE-2 keeps the default LP of 100 for path B. Therefore, path A that is advertised to PE-1 is preferred in AS 64496. The route reflector RR-5 advertises the preferred path A to PE-2 and PE-3. PE-2 suppresses the advertisement of its external path (B) to RR-5, because path A is preferred. Traffic from CE-6 to CE-4 is sent via PE-3 and PE-1.
When the link between CE-4 and PE-1 fails, the following steps take place for reconvergence:
PE-1 sends a BGP update withdrawing path A to RR-5.
RR-5 receives and propagates the withdrawal to its other clients: PE-2 and PE-3.
PE-2 receives the withdrawal of path A and reruns the BGP decision process. PE-2 selects path B as its best route and advertises path B to RR-5.
RR-5 receives the BGP update for path B and reruns its BGP decision process. RR-5 selects path B as its best path and advertises path B to its other clients: PE-1 and PE-3.
PE-1 and PE-3 rerun their BGP decision process and determine that path B is the best path. Traffic can flow from CE-6 to CE-4 via PE-3 and PE-2.
Reconvergence after path failure (without add-path) shows the BGP updates sent to withdraw path A and advertise path B.
If the propagation time of a BGP update message between RR-5 and any of its clients is X, the convergence time is four times X, plus processing, transmission, and queuing delays.
With the use of add-path on all BGP routers in AS 64496, the convergence time can be reduced considerably, because PE-3 has more than one path for prefix 10.0.4.0/24 in its RIB-IN before the failure takes place. When there are no failures, PE-2 decides that path A is best, and PE-2 also advertises its second-best path (B)—which is its best external path—to RR-5. With add-path enabled, the RR has knowledge of two paths for prefix 10.0.4.0/24 and advertises both to its clients. PE-3 receives two routes for prefix 10.0.4.0/24, reruns the BGP decision process, and updates its forwarding table based on the results. The following options are possible:
Path A is the best path, whereas path B is maintained in the RIB-IN. The FIB entry for destination 10.0.4.0/24 points at path {A} only.
When BGP FRR is enabled as described in chapter BGP Fast Reroute, path A is the best path and path B is the second-best path. The FIB entry for destination 10.0.4.0/24 points to path {A,B}. If path A is available, it is used for all traffic to the destination; if path A is unavailable but path B is available, then all traffic to the destination is directed to path B. In this case, path B is effectively a pre-computed, pre-installed backup path for the destination.
When Equal Cost Multi-Path (ECMP) and BGP multipath are enabled and the paths have an equal cost, both paths A and B represent the best path. The FIB entry for destination 10.0.4.0/24 points to multipath entry {A,B}. When both paths are available, traffic to the destination is load-shared across paths A and B. If only one path is available, traffic is directed to that available path.
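The three options can be summarized in a short sketch. This is a simplified Python model for illustration only, not SR OS code: the names `Path`, `build_fib_entry`, `frr`, and `multipath` are hypothetical, and the BGP decision process is reduced to a comparison of local preference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    local_pref: int


def build_fib_entry(paths, frr=False, multipath=False):
    """Return (primary, backup) lists of path names for one prefix.

    Assumption: the best path is simply the one with the highest LP.
    """
    ranked = sorted(paths, key=lambda p: p.local_pref, reverse=True)
    best = ranked[0]
    if multipath:
        # Paths that tie with the best path are combined into one entry.
        equal = [p.name for p in ranked if p.local_pref == best.local_pref]
        if len(equal) > 1:
            return equal, []
    if frr and len(ranked) > 1:
        # The second-best path is pre-installed as backup.
        return [best.name], [ranked[1].name]
    return [best.name], []


a, b = Path("A", 200), Path("B", 100)
print(build_fib_entry([a, b]))            # (['A'], [])
print(build_fib_entry([a, b], frr=True))  # (['A'], ['B'])
print(build_fib_entry([Path("A", 100), b], multipath=True))  # (['A', 'B'], [])
```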
Advertised paths when BGP add-path is enabled in PEs and RR shows the BGP update messages prior to any failures. RR-5 receives path A from PE-1 and path B from PE-2, whereas it advertises path B to PE-1, path A to PE-2, and both path A and path B to PE-3. Path B has the default LP 100, whereas path A gets LP 200 as per import policy on PE-1. However, in case of ECMP, both paths keep the default LP 100.
Reconvergence after path failure when BGP add-path is enabled shows the BGP update messages that are sent after a link failure between CE-4 and PE-1. With add-path, fewer steps are required for convergence:
PE-1 sends a BGP update message withdrawing path A.
RR-5 receives the withdrawal and propagates it to its clients PE-2 and PE-3.
PE-2 and PE-3 receive the withdrawal, rerun the BGP decision process, and update the forwarding entry for destination 10.0.4.0/24: path B is best.
The convergence time with add-path is much shorter than without add-path. If X is the propagation time of a BGP update message between RR and any of the PEs, then the convergence time is the time required for the BGP update from PE-1 to RR-5 (X) plus the time required for the BGP update propagation from RR-5 to the other PEs (X), in addition to delays for processing, transmission, and queuing. The convergence with add-path is twice as fast as without add-path.
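The comparison can be made concrete with a small calculation. The sketch below assumes a hypothetical propagation time X of 50 ms and ignores processing, transmission, and queuing delays by default; the function name `convergence_time` and its parameters are illustrative.

```python
import math


def convergence_time(sequential_updates, x, overhead=0.0):
    """Convergence time as the number of sequential BGP updates times X."""
    return sequential_updates * x + overhead


X = 0.05  # assumed propagation time in seconds (50 ms)

without_add_path = convergence_time(4, X)  # withdraw, withdraw, advertise, advertise
with_add_path = convergence_time(2, X)     # withdraw, withdraw

print(without_add_path, with_add_path)
print(math.isclose(without_add_path / with_add_path, 2.0))  # True
```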
For some types of failures, the convergence can be even faster:
When PE-1 becomes unreachable, the next-hop tracking by PE-3 will invalidate path A before the BGP withdrawal message is received from RR-5.
If PE-3 implements BGP FRR and path A has been marked as unusable, PE-3 can switch traffic destined to 10.0.4.0/24 to path B.
When Bidirectional Forwarding Detection (BFD) is enabled on the EBGP sessions and in the IGP, failures are detected faster; when BGP FRR is also enabled, BGP convergence is sped up accordingly.
Enhanced load-sharing
When paths A and B are equal in cost or preference, and ECMP and BGP multipath are enabled on all PEs, load-sharing can be done for traffic with destination 10.0.4.0/24. With BGP add-path, both paths A and B are advertised to the PEs. PE-3 runs the BGP decision process and determines that paths A and B are both best paths to destination 10.0.4.0/24, so paths A and B are combined into one multipath forwarding entry: {A,B}.
The benefits of load-sharing for traffic to destination 10.0.4.0/24 are the following:
More even bandwidth utilization of the links in AS 64496
More even bandwidth utilization for traffic across peering points PE-1 and PE-2 with AS 64500
Faster reaction to some failures; for example, when the BGP next hop for one of the paths becomes unreachable in the IGP and next-hop tracking is enabled.
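How a multipath entry {A,B} spreads traffic can be illustrated with a generic per-flow hash. The sketch below is an assumption made for illustration and does not describe the SR OS hashing algorithm; the function `pick_path`, the CRC32 hash, and the flow tuples with source addresses from 198.51.100.0/24 are hypothetical.

```python
import zlib


def pick_path(flow, paths):
    """Map a flow tuple to one member of the multipath entry.

    The same flow always hashes to the same path, so packets of one
    flow are not reordered across paths.
    """
    key = "|".join(str(field) for field in flow).encode()
    return paths[zlib.crc32(key) % len(paths)]


multipath_entry = ["A", "B"]

# Hypothetical flows: (source IP, destination IP, protocol, source port, destination port)
flows = [("198.51.100.%d" % i, "10.0.4.1", 6, 40000 + i, 443) for i in range(1, 9)]

for flow in flows:
    print(flow[0], "->", pick_path(flow, multipath_entry))

# If only one path remains available, all traffic uses that path.
print(pick_path(flows[0], ["B"]))  # B
```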
Reduced routing churn
Routing churn refers to repeated advertisements and withdrawals of a prefix and path. Some degree of routing churn is normal and expected in most networks. However, it should be contained as much as possible to avoid overloading router CPUs. Routing churn can be caused by:
Flapping links (links that repeatedly transition between up and down state)
Route oscillation (in networks that use RRs or AS confederations and where BGP path selection relies on Multi Exit Discriminator (MED) and IGP cost comparisons)
Add-path helps to reduce routing churn by constraining the effect of some failures to the local AS where they occur. For example, the link between CE-4 and PE-1 could repeatedly cycle up and down due to a misconfiguration. When the link goes down, a BGP withdrawal message is sent by PE-1 to RR-5 and from RR-5 to the other RR clients (PE-2 and PE-3). PE-3 will withdraw and advertise path A to its EBGP peer CE-6 in AS 64501, but path B is constantly advertised to CE-6 (when add-path has been negotiated between PE-3 and CE-6).
Without add-path, CE-6 would be affected by the instability in AS 64496 and there would be periods of time when AS 64501 has no paths to destination 10.0.4.0/24 (between the withdrawal of path A and the advertisement of path B).
Add-path implementation
BGP add-path is configured in the base routing instance, for IBGP or EBGP, per address family at different levels: in the global bgp context, per group, and per neighbor. The following address families are supported:
*A:PE-1>config>router>bgp# add-paths ?
- add-paths
- no add-paths
[no] evpn - Configure evpn ADD-PATH limits
[no] ipv4 - Configure ipv4 ADD-PATH limits
[no] ipv6 - Configure ipv6 ADD-PATH limits
[no] label-ipv4 - Configure label-ipv4 ADD-PATH limits
[no] label-ipv6 - Configure label-ipv6 ADD-PATH limits
[no] mcast-vpn-ipv4 - Configure mcast-vpn-ipv4 ADD-PATH limits
[no] mcast-vpn-ipv6 - Configure mcast-vpn-ipv6 ADD-PATH limits
[no] mvpn-ipv4 - Configure mvpn-ipv4 ADD-PATH limits
[no] mvpn-ipv6 - Configure mvpn-ipv6 ADD-PATH limits
[no] vpn-ipv4 - Configure vpn-ipv4 ADD-PATH limits
[no] vpn-ipv6 - Configure vpn-ipv6 ADD-PATH limits
Up to 16 paths are configurable per address family per peer (send-limit):
*A:PE-1>config>router>bgp>add-paths# ipv4 ?
- ipv4 send <send-limit>
- ipv4 send <send-limit> receive [none]
- no ipv4
<send-limit> : [1..16]|none|multipaths
Only the number of advertised routes per prefix is controlled, not the number of received routes. All routes advertised by an add-path peer are accepted; otherwise, routing loops might occur. If a BGP speaker is configured with <send-limit> n, but has more than n paths available in the LOC-RIB, it selects the n best paths with unique BGP next hops following the Add-n path selection algorithm described in draft-ietf-idr-add-paths-guidelines. Also, the send limit n can be overridden, for specific prefixes, using route policies.
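The effect of the send limit and the unique next-hop rule can be sketched as follows. This is a minimal illustration, not the Add-n algorithm from the draft: ranking is reduced to local preference, and the function name `select_add_paths` and the sample paths are hypothetical.

```python
def select_add_paths(paths, send_limit):
    """Pick up to send_limit paths, keeping only the best path per next hop.

    Assumption: a higher local preference means a better path.
    """
    ranked = sorted(paths, key=lambda p: p["local_pref"], reverse=True)
    selected, seen_next_hops = [], set()
    for path in ranked:
        if path["next_hop"] in seen_next_hops:
            continue  # a better path with this next hop is already selected
        seen_next_hops.add(path["next_hop"])
        selected.append(path)
        if len(selected) == send_limit:
            break
    return selected


candidates = [
    {"next_hop": "192.0.2.1", "local_pref": 200},
    {"next_hop": "192.0.2.1", "local_pref": 150},  # suppressed: same next hop
    {"next_hop": "192.0.2.2", "local_pref": 100},
    {"next_hop": "192.0.2.3", "local_pref": 90},   # dropped: beyond send limit 2
]

for path in select_add_paths(candidates, send_limit=2):
    print(path["next_hop"], path["local_pref"])
```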
When BGP add-path is configured for an address family, the BGP capability will be announced to the BGP peer as part of the BGP open message, as follows:
# Enable debugging for BGP open messages on PE-1:
debug
router "Base"
bgp
open
exit
58 2022/05/04 08:04:37.417 UTC MINOR: DEBUG #2001 Base BGP
"BGP: OPEN
Peer 1: 192.0.2.5 - Send (Passive) BGP OPEN: Version 4
AS Num 64496: Holdtime 90: BGP_ID 192.0.2.1: Opt Length 26 (ExtOpt F)
Opt Para: Type CAPABILITY: Length = 24: Data:
Cap_Code GRACEFUL-RESTART: Length 2
Bytes: 0x0 0x78
Cap_Code MP-BGP: Length 4
Bytes: 0x0 0x1 0x0 0x1
Cap_Code ROUTE-REFRESH: Length 0
Cap_Code 4-OCTET-ASN: Length 4
Bytes: 0x0 0x0 0xfb 0xf0
Cap_Code ADD-PATH: Length 4
Bytes: 0x0 0x1 0x1 0x3
"
The BGP add-path capability code value typically consists of one or more blocks of four bytes; two octets for the Address Family Identifier (AFI), one octet for the Subsequent Address Family Identifier (SAFI), and one octet for send/receive. In this example, AFI/SAFI bytes point to an IPv4 address family and send/receive value "3" means that the sender is able to receive and send multiple paths from/to its BGP peer.
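As an illustration, the four bytes from the debug output can be decoded with a few lines of Python. The function name `parse_add_path_capability` is hypothetical; the send/receive values (1 for receive, 2 for send, 3 for both) are taken from RFC 7911.

```python
import struct

SEND_RECEIVE = {1: "receive", 2: "send", 3: "send and receive"}


def parse_add_path_capability(value):
    """Split an ADD-PATH capability value into (AFI, SAFI, send/receive) entries."""
    if len(value) % 4 != 0:
        raise ValueError("capability value must be a multiple of 4 bytes")
    entries = []
    for offset in range(0, len(value), 4):
        # 2 octets AFI, 1 octet SAFI, 1 octet send/receive, network byte order
        afi, safi, send_receive = struct.unpack_from("!HBB", value, offset)
        entries.append((afi, safi, SEND_RECEIVE.get(send_receive, "unknown")))
    return entries


# Bytes: 0x0 0x1 0x1 0x3 (AFI 1 = IPv4, SAFI 1 = unicast)
print(parse_add_path_capability(bytes([0x0, 0x1, 0x1, 0x3])))
# [(1, 1, 'send and receive')]
```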
In BGP update messages, a 4-octet path identifier (ID) is added to the Network Layer Reachability Information (NLRI) field. The combination of both prefix and path ID identifies a BGP path. SR OS allocates path IDs sequentially on a per address family basis, not per prefix. The path ID is only locally significant, which means that when a BGP speaker re-advertises a route with path IDs, it must generate its own path ID.
# Enable debugging for BGP UPDATE messages on RR-5:
debug
router "Base"
bgp
update
exit
RR-5 received the following BGP update for prefix 10.0.4.0/24 with path ID.
50 2022/05/04 08:05:07.380 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.2
"Peer 1: 192.0.2.2: UPDATE
Peer 1: 192.0.2.2 - Received BGP UPDATE:
Withdrawn Length = 0
Total Path Attr Length = 27
Flag: 0x40 Type: 1 Len: 1 Origin: 0
Flag: 0x40 Type: 2 Len: 6 AS Path:
Type: 2 Len: 1 < 64500 >
Flag: 0x40 Type: 3 Len: 4 Nexthop: 192.0.2.2
Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
NLRI: Length = 8
10.0.4.0/24 Path-ID 8
"
When routers have negotiated to advertise (and receive) routes with path identifiers, all BGP updates (advertisements or withdrawals) without path identifier will be rejected. There will be an NLRI parsing error—because the BGP update has an incorrect length—and a notification will be sent.
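The length mismatch can be reproduced with a small encoder and decoder. This is a sketch for IPv4 only with hypothetical function names (`encode_nlri`, `decode_add_path_nlri`); it shows why prefix 10.0.4.0/24 occupies 8 bytes with a path ID and 4 bytes without, and why an NLRI without path ID cannot be parsed when a path ID is expected.

```python
import ipaddress
import struct


def encode_nlri(prefix, path_id=None):
    """Encode one IPv4 NLRI entry, optionally preceded by a 4-octet path ID."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    body = bytes([net.prefixlen]) + net.network_address.packed[:nbytes]
    if path_id is not None:
        body = struct.pack("!I", path_id) + body
    return body


def decode_add_path_nlri(data):
    """Decode one IPv4 NLRI entry that is expected to carry a path ID."""
    if len(data) < 5:
        raise ValueError("NLRI too short to carry a path ID")
    path_id, prefixlen = struct.unpack_from("!IB", data)
    nbytes = (prefixlen + 7) // 8
    if prefixlen > 32 or len(data) != 5 + nbytes:
        raise ValueError("NLRI length does not match the prefix length")
    address = ipaddress.IPv4Address(data[5:] + bytes(4 - nbytes))
    return path_id, f"{address}/{prefixlen}"


with_id = encode_nlri("10.0.4.0/24", path_id=8)
without_id = encode_nlri("10.0.4.0/24")
print(len(with_id), len(without_id))  # 8 4
print(decode_add_path_nlri(with_id))  # (8, '10.0.4.0/24')

try:
    decode_add_path_nlri(without_id)
except ValueError as error:
    print("parsing error:", error)
```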
Configuration
The following configuration examples are in this section:
BGP without add-path
BGP with add-path for address family IPv4: no BGP FRR, no ECMP
BGP with add-path for address family IPv4 and BGP FRR enabled
BGP with add-path for address family IPv4 and ECMP enabled
BGP with add-path for address family VPN-IPv4 and BGP FRR enabled
BGP with add-path for address family VPN-IPv4 and ECMP enabled
Example topology shows the example topology with CE-4 in AS 64500 advertising route 10.0.4.0/24 to its EBGP peers PE-1 and PE-2 in AS 64496. PE-1 has an import policy that sets the LP for this route to 200, whereas PE-2 keeps the default local preference of 100. RR-5 is RR for all PEs in AS 64496. CE-6 in AS 64501 peers with PE-3 in AS 64496 and can send traffic to CE-4 in AS 64500.
Initial configuration
The initial configuration on all nodes includes:
Cards, MDAs, ports
Router interfaces
IS-IS as IGP on all interfaces within AS 64496 (alternatively, OSPF can be used)
LDP on all interfaces between the PEs in AS 64496, but not toward RR-5
BGP is configured on all the nodes. CE-4 peers with PE-1 and PE-2 and exports prefix 10.0.4.0/24 to both EBGP peers, as follows:
# on CE-4:
configure
router Base
autonomous-system 64500
policy-options
begin
prefix-list "10.0.4.0/24"
prefix 10.0.4.0/24 exact
exit
policy-statement "export-bgp"
entry 10
from
prefix-list "10.0.4.0/24"
exit
action accept
exit
exit
exit
commit
exit
bgp
rapid-withdrawal
split-horizon
group "EBGP"
export "export-bgp"
peer-as 64496
neighbor 172.16.14.1
exit
neighbor 172.16.24.1
exit
exit
The BGP configuration on CE-6 is similar.
PE-1 peers with CE-4 in AS 64500 and RR-5 in AS 64496. An import policy is configured to set the LP to 200 for all routes received from CE-4, as follows:
# on PE-1:
configure
router Base
autonomous-system 64496
policy-options
begin
policy-statement "import-bgp-LP200"
default-action accept
local-preference 200
exit
exit
commit
exit
bgp
rapid-withdrawal
split-horizon
group "EBGP"
import "import-bgp-LP200"
peer-as 64500
neighbor 172.16.14.2
exit
exit
group "IBGP"
next-hop-self
peer-as 64496
neighbor 192.0.2.5
exit
exit
The BGP configuration on PE-2 and PE-3 is similar, but there is no import policy.
The BGP configuration on RR-5 is as follows:
# on RR-5:
configure
router Base
autonomous-system 64496
bgp
rapid-withdrawal
split-horizon
group "IBGP"
cluster 192.0.2.5
peer-as 64496
neighbor 192.0.2.1
exit
neighbor 192.0.2.2
exit
neighbor 192.0.2.3
exit
exit
PE-1 advertises a route for prefix 10.0.4.0/24 with LP 200 to RR-5. RR-5 propagates this route to its other clients: PE-2 and PE-3. When PE-2 learns this route, it does not advertise its own route for 10.0.4.0/24 with LP 100 to RR-5 anymore. PE-3 only learns the route for prefix 10.0.4.0/24 with LP 200, as follows:
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 200 None
192.0.2.1 None 10
64500 -
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
Reconvergence without add-path
A failure of the link between CE-4 and PE-1 is simulated as follows:
# on CE-4:
configure
router Base
interface "int-CE-4-PE-1"
shutdown
The following four BGP update messages are received or sent by RR-5.
RR-5 receives the following withdrawal message from PE-1:
# on RR-5:
28 2022/05/04 08:00:38.222 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.1
"Peer 1: 192.0.2.1: UPDATE
Peer 1: 192.0.2.1 - Received BGP UPDATE:
Withdrawn Length = 4
10.0.4.0/24
Total Path Attr Length = 0
"
RR-5 propagates this withdrawal to its other clients, for example to PE-2, as follows:
# on RR-5:
29 2022/05/04 08:00:38.223 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.2
"Peer 1: 192.0.2.2: UPDATE
Peer 1: 192.0.2.2 - Send BGP UPDATE:
Withdrawn Length = 4
10.0.4.0/24
Total Path Attr Length = 0
"
When PE-2 receives this withdrawal, it reruns the BGP decision process and decides that its route for prefix 10.0.4.0/24 with LP 100 is the best route. PE-2 advertises this route to RR-5; it is received by RR-5 as follows:
# on RR-5:
31 2022/05/04 08:00:57.380 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.2
"Peer 1: 192.0.2.2: UPDATE
Peer 1: 192.0.2.2 - Received BGP UPDATE:
Withdrawn Length = 0
Total Path Attr Length = 27
Flag: 0x40 Type: 1 Len: 1 Origin: 0
Flag: 0x40 Type: 2 Len: 6 AS Path:
Type: 2 Len: 1 < 64500 >
Flag: 0x40 Type: 3 Len: 4 Nexthop: 192.0.2.2
Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
NLRI: Length = 4
10.0.4.0/24
"
RR-5 propagates this message to its other clients: PE-1 and PE-3. The following BGP update is sent to PE-3:
# on RR-5:
32 2022/05/04 08:01:00.618 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.3
"Peer 1: 192.0.2.3: UPDATE
Peer 1: 192.0.2.3 - Send BGP UPDATE:
Withdrawn Length = 0
Total Path Attr Length = 41
Flag: 0x40 Type: 1 Len: 1 Origin: 0
Flag: 0x40 Type: 2 Len: 6 AS Path:
Type: 2 Len: 1 < 64500 >
Flag: 0x40 Type: 3 Len: 4 Nexthop: 192.0.2.2
Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
Flag: 0x80 Type: 9 Len: 4 Originator ID: 192.0.2.2
Flag: 0x80 Type: 10 Len: 4 Cluster ID:
192.0.2.5
NLRI: Length = 4
10.0.4.0/24
"
Again, PE-3 has only one route for prefix 10.0.4.0/24, but this time with next hop 192.0.2.2, as follows:
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 100 None
192.0.2.2 None 10
64500 -
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
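Looking back at the two advertisements for path B in the debug output of RR-5, the Total Path Attr Length values (27 when received from PE-2, 41 when reflected to PE-3) can be checked with simple arithmetic. The sketch assumes that every attribute uses the regular 3-byte header (flags, type, and a 1-octet length), which holds for the flags 0x40 and 0x80 shown; the function name `total_path_attr_length` is hypothetical.

```python
ATTR_HEADER = 3  # flags (1) + type code (1) + length (1), no extended length


def total_path_attr_length(value_lengths):
    """Sum header and value bytes over all path attributes."""
    return sum(ATTR_HEADER + length for length in value_lengths)


# Value lengths from the received update: Origin, AS Path, Nexthop, Local Preference
received = [1, 6, 4, 4]
# The RR adds Originator ID and Cluster ID (4 bytes each) when reflecting the route
reflected = received + [4, 4]

print(total_path_attr_length(received))   # 27
print(total_path_attr_length(reflected))  # 41
```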
The configuration is restored as follows:
# on CE-4:
configure
router Base
interface "int-CE-4-PE-1"
no shutdown
Add-path enabled: no BGP FRR, no ECMP
Before add-path is enabled, the following information is displayed on PE-1 for BGP neighbor RR-5:
*A:PE-1# show router bgp neighbor 192.0.2.5 | match "Local AddPath" post-lines 2
Local AddPath Capabi*: Disabled
Remote AddPath Capab*: Send - None
: Receive - None
Add-path is enabled on PE-1 and PE-2 for groups "EBGP" and "IBGP" with a send limit of two paths. The number of received paths is not limited, which is the default setting, as follows:
# on PE-1 and PE-2:
configure
router Base
bgp
group "EBGP"
add-paths
ipv4 send 2 receive
exit
exit
group "IBGP"
add-paths
ipv4 send 2 receive
exit
exit
When the preceding show command is repeated on PE-1 or PE-2, the local BGP add-path capabilities are specified for address family IPv4: a maximum of two paths can be sent for a specific IPv4 prefix. The remote peer RR-5 does not have add-path enabled yet.
*A:PE-1# show router bgp neighbor 192.0.2.5 | match "Local AddPath" post-lines 3
Local AddPath Capabi*: Send - ipv4 (2)
: Receive - ipv4
Remote AddPath Capab*: Send - None
: Receive - None
Initially, add-path remains disabled on PE-3. On the RR, add-path is enabled for neighbors 192.0.2.1 and 192.0.2.2, but not for 192.0.2.3 yet. For neighbor 192.0.2.1, the receive none option implies that the add-path receive capability is not negotiated.
# on RR-5:
configure
router Base
bgp
group "IBGP"
neighbor 192.0.2.1
add-paths
ipv4 send 2 receive none
exit
exit
exit
group "IBGP"
neighbor 192.0.2.2
add-paths
ipv4 send 2 receive
exit
exit
The following output shows that add-path is enabled locally on RR-5 and remotely on PE-1 for address family IPv4. RR-5 can send a maximum of two paths for a specific prefix toward PE-1 and PE-2; toward PE-3, add-path remains disabled.
*A:RR-5# show router bgp neighbor 192.0.2.1 | match "Local AddPath" post-lines 3
Local AddPath Capabi*: Send - ipv4 (2)
: Receive - None
Remote AddPath Capab*: Send - ipv4
: Receive - ipv4
*A:RR-5# show router bgp neighbor 192.0.2.2 | match "Local AddPath" post-lines 3
Local AddPath Capabi*: Send - ipv4 (2)
: Receive - ipv4
Remote AddPath Capab*: Send - ipv4
: Receive - ipv4
*A:RR-5# show router bgp neighbor 192.0.2.3 | match "Local AddPath" post-lines 2
Local AddPath Capabi*: Disabled
Remote AddPath Capab*: Send - None
: Receive - None
The receive none option indicates that RR-5 does not negotiate the add-path receive capability with its peer. PE-1 knows that peer 192.0.2.5 may send IPv4 routes with a path ID, but because this peer has not announced the receive capability, PE-1 advertises its routes to 192.0.2.5 without a path ID:
*A:PE-1# show router bgp neighbor 192.0.2.5 | match "Local AddPath" post-lines 3
Local AddPath Capabi*: Send - ipv4 (2)
: Receive - ipv4
Remote AddPath Capab*: Send - ipv4
: Receive - None
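The outcome of the capability exchange can be expressed as a simple rule, following RFC 7911: path IDs are used in one direction of a session only when the sender announced the send capability and the receiver announced the receive capability. The sketch below models the session between PE-1 and RR-5; the class `AddPathCaps` and the function `uses_path_ids` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddPathCaps:
    send: bool
    receive: bool


def uses_path_ids(sender, receiver):
    """True when updates from sender to receiver carry a path ID."""
    return sender.send and receiver.receive


pe1 = AddPathCaps(send=True, receive=True)    # ipv4 send 2 receive
rr5 = AddPathCaps(send=True, receive=False)   # ipv4 send 2 receive none

print("RR-5 -> PE-1 uses path IDs:", uses_path_ids(rr5, pe1))  # True
print("PE-1 -> RR-5 uses path IDs:", uses_path_ids(pe1, rr5))  # False
```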
With BGP add-path enabled, PE-2 will advertise its second-best route for prefix 10.0.4.0/24 with LP 100 to RR-5. PE-1, PE-2, and RR-5 will have two routes for prefix 10.0.4.0/24 in their RIB-IN, but only the route with LP 200 will be used. The following output shows the BGP routes on RR-5; the output on PE-1 and PE-2 is similar:
*A:RR-5# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.5 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 200 None
192.0.2.1 None 10
64500 -
*i 10.0.4.0/24 100 None
192.0.2.2 1 10
64500 -
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
Even though RR-5 has two routes for this prefix, it only advertises its best route to PE-3, because add-path is not enabled for this BGP session. Therefore, PE-3 only has the route for 10.0.4.0/24 with LP 200, as follows:
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 200 None
192.0.2.1 None 10
64500 -
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
When add-path is enabled on the session between PE-3 and RR-5, the second route will also be advertised, as follows:
# on PE-3:
configure
router Base
bgp
group "IBGP"
add-paths
ipv4 send 2 receive
exit
# on RR-5:
configure
router Base
bgp
group "IBGP"
neighbor 192.0.2.3
add-paths
ipv4 send 2 receive
exit
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 200 None
192.0.2.1 14 10
64500 -
*i 10.0.4.0/24 100 None
192.0.2.2 15 10
64500 -
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
BGP add-path is enabled, but BGP FRR and ECMP are disabled. The routing table on PE-3 only contains one entry for prefix 10.0.4.0/24:
*A:PE-3# show router route-table 10.0.4.0/24
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.0.4.0/24 Remote BGP 00h00m29s 170
192.168.13.1 10
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
Reconvergence with add-path: no BGP FRR, no ECMP
A link failure between CE-4 and PE-1 is simulated as follows:
# on CE-4:
configure
router Base
interface "int-CE-4-PE-1"
shutdown
PE-1 sends a withdrawal message for route 10.0.4.0/24 with LP 200 to RR-5 and reruns the BGP decision process. RR-5 propagates this withdrawal message to its other clients, which also rerun the BGP decision process. As a result, the route for prefix 10.0.4.0/24 with LP 100 will be used on all nodes; for example, on PE-3:
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 100 None
192.0.2.2 15 10
64500 -
-------------------------------------------------------------------------------
Routes : 1
===============================================================================
The routing table contains a route to 10.0.4.0/24 with PE-2 as next hop, as follows:
*A:PE-3# show router route-table 10.0.4.0/24
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.0.4.0/24 Remote BGP 00h00m10s 170
192.168.23.1 10
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
The convergence with add-path enabled is twice as fast as without BGP add-path. With BGP add-path disabled, four sequential messages are sent:
PE-1 sends a withdrawal to RR-5.
RR-5 propagates withdrawal.
PE-2 advertises its route.
RR-5 propagates the route.
In the scenario with add-path, the last two messages have already been sent before the failure occurs. During convergence, only two withdrawal messages are sent: PE-1 sends a withdrawal to RR-5; RR-5 propagates this to its clients.
Add-path and BGP FRR
The convergence time can be further reduced by enabling BGP FRR, where the BGP decision process runs for the best route and the backup path before any failure happens, as described in chapter BGP Fast Reroute. On all PEs, BGP FRR is enabled for the IPv4 address family, as follows:
# on all PEs:
configure
router Base
bgp
backup-path ipv4
Each PE has two routes for prefix 10.0.4.0/24. When BGP FRR is enabled, both routes are used: one as the active route and the other as backup, which is indicated by the "b" flag in the following output:
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 200 None
192.0.2.1 20 10
64500 -
ub*i 10.0.4.0/24 100 None
192.0.2.2 15 10
64500 -
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
The following routing table on PE-3 shows the active route for 10.0.4.0/24 with the "B" flag, which indicates that a BGP backup route is available:
*A:PE-3# show router route-table 10.0.4.0/24
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.0.4.0/24 [B] Remote BGP 00h00m49s 170
192.168.13.1 10
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
The following output shows both the active and the backup route for prefix 10.0.4.0/24:
*A:PE-3# show router route-table 10.0.4.0/24 alternative
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
Alt-NextHop Alt-
Metric
-------------------------------------------------------------------------------
10.0.4.0/24 Remote BGP 00h00m49s 170
192.168.13.1 10
10.0.4.0/24 (Backup) Remote BGP 00h00m49s 170
192.168.23.1 10
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
Backup = BGP backup route
LFA = Loop-Free Alternate nexthop
S = Sticky ECMP requested
===============================================================================
In case of link failure between CE-4 and PE-1, the same BGP withdrawals will be sent from PE-1 to RR-5 and from RR-5 to PE-2 and PE-3. When PE-2 and PE-3 receive the withdrawal, the BGP decision process need not run again. The backup path is promoted to active immediately.
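The difference between rerunning the decision process and promoting a pre-computed backup can be sketched in Python. This is a simplified model under stated assumptions, not the SR OS implementation: the names Path, rank, and FrrEntry are hypothetical, and the ranking compares local preference only, whereas the real BGP decision process evaluates many more attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int


def rank(paths):
    """Order paths best-first. Simplification: only local preference counts."""
    return sorted(paths, key=lambda p: p.local_pref, reverse=True)


class FrrEntry:
    """Active and backup path for one prefix, both chosen before any failure."""

    def __init__(self, paths):
        ordered = rank(paths)
        self.active = ordered[0] if ordered else None
        # Backup: best remaining path that uses a different BGP next hop.
        self.backup = next(
            (p for p in ordered[1:] if p.next_hop != self.active.next_hop),
            None,
        )

    def withdraw(self, next_hop):
        """Handle a withdrawal without ranking the paths again."""
        if self.active and self.active.next_hop == next_hop:
            # Promote the pre-computed backup immediately.
            self.active, self.backup = self.backup, None
        elif self.backup and self.backup.next_hop == next_hop:
            self.backup = None


entry = FrrEntry([Path("192.0.2.1", 200), Path("192.0.2.2", 100)])
entry.withdraw("192.0.2.1")
print(entry.active.next_hop)
```

In this model, the withdrawal of the path via 192.0.2.1 leaves the path via 192.0.2.2 as the active path, and no sorting takes place at failure time.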
BGP FRR is disabled on the PEs as follows:
# on all PEs:
configure
    router Base
        bgp
            no backup-path
Add-path and ECMP
On PE-1, the import policy that sets the LP to 200 is removed, so that the paths via PE-1 and PE-2 both have the default LP of 100 and are equally preferred:
# on PE-1:
configure
    router Base
        bgp
            group "EBGP"
                no import
ECMP is enabled on all PEs with a value of two, as follows:
# on all PEs:
configure
    router Base
        ecmp 2
On all PEs, BGP multipath is configured with the maximum number of paths equal to two in the bgp context, as follows:
# on all PEs:
configure
    router Base
        bgp
            multi-path
                maximum-paths 2
For more information about BGP multipath, see chapter BGP Multipath.
All PEs have two routes for prefix 10.0.4.0/24 and both are active when ECMP is enabled; for example, for PE-3, as follows:
*A:PE-3# show router bgp routes 10.0.4.0/24
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 10.0.4.0/24 100 None
192.0.2.1 20 10
64500 -
u*>i 10.0.4.0/24 100 None
192.0.2.2 15 10
64500 -
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
*A:PE-3# show router route-table 10.0.4.0/24
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
10.0.4.0/24 Remote BGP 00h00m54s 170
192.168.13.1 10
10.0.4.0/24 Remote BGP 00h00m54s 170
192.168.23.1 10
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
Traffic flows with destination 10.0.4.0/24 are load-balanced over the two active paths.
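As a rough illustration of per-flow load sharing over two equal-cost next hops, the following Python sketch hashes a flow identifier and uses the result to pick a next hop. This is a generic technique shown under assumptions: the function pick_next_hop and the flow tuple layout are hypothetical, and the hash inputs and algorithm that SR OS actually uses are not described in this chapter.

```python
import hashlib

# Next hops taken from the route table output on PE-3.
NEXT_HOPS = ["192.168.13.1", "192.168.23.1"]


def pick_next_hop(flow, next_hops):
    """Map a flow tuple to one next hop.

    The same flow always maps to the same next hop, so packets of one
    flow stay in order, while different flows can use different paths.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical flows: (source IP, destination IP, protocol, source port, destination port)
flows = [
    ("10.0.6.1", "10.0.4.1", 6, 40000 + i, 80) for i in range(8)
]
for flow in flows:
    print(flow[3], "->", pick_next_hop(flow, NEXT_HOPS))
```

The sketch only demonstrates that flows are pinned to a path; the actual distribution between the two paths depends on the traffic mix.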
Add-path for family VPN-IPv4 with BGP FRR
Example topology with VPRNs shows the topology with VPRN 1 configured on the PEs in AS 64496. CE-4 exports prefix 172.31.0.0/16 to VPRN 1 on PE-1 and PE-2.
VPRN 1 is configured on all PEs in AS 64496, but not on the RR. BGP FRR is enabled in the VPRN with the enable-bgp-vpn-backup option. The configuration of VPRN 1 is similar on all PEs; for example, for PE-1, the VPRN configuration is as follows:
# on PE-1:
configure
    router Base
        policy-options
            begin
            policy-statement "export-bgp"
                entry 10
                    from
                        protocol bgp-vpn
                    exit
                    to
                        protocol bgp
                    exit
                    action accept
                    exit
                exit
            exit
            policy-statement "import-bgp-LP200"
                default-action accept
                    local-preference 200
                exit
            exit
            commit
        exit
    exit
    service
        vprn 1 name "VPRN 1" customer 1 create
            autonomous-system 64496
            enable-bgp-vpn-backup ipv4     # BGP FRR
            interface "int-PE-1-CE-4_VPRN1" create
                address 172.16.114.1/30
                sap 1/1/3:1 create
                exit
            exit
            bgp-ipvpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    route-distinguisher 64496:1
                    vrf-target target:64496:1
                    no shutdown
                exit
            exit
            bgp
                split-horizon
                group "EBGP_1"
                    next-hop-self
                    import "import-bgp-LP200"
                    export "export-bgp"
                    peer-as 64500
                    neighbor 172.16.114.2
                    exit
                exit
            exit
            export-inactive-bgp            # BGP best-external in VPRN
            no shutdown
The import policy sets the LP to 200 for the routes received from CE-4. The configuration on PE-2 is similar, but without an import policy. Therefore, the path via PE-1 is preferred over the path via PE-2.
The export-inactive-bgp option must be configured on PE-2, because the route for prefix 172.31.0.0/16 received by PE-2 from CE-4 is inactive, but should still be advertised as a BGP VPN-IPv4 route to RR-5; see chapter BGP Best-External in a VPRN. In this example, the export-inactive-bgp option is configured on all PEs.
On the CEs, the configuration can be placed either in the base routing instance (with additional router interfaces and BGP neighbors) or in a VPRN. In this example, the following VPRN is configured on CE-4:
# on CE-4:
configure
    router Base
        policy-options
            begin
            prefix-list "172.31.0.0/16"
                prefix 172.31.0.0/16 longer
            exit
            policy-statement "export_172.31.0.0/16"
                entry 10
                    from
                        prefix-list "172.31.0.0/16"
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit
    exit
    service
        vprn 1 name "VPRN 1" customer 1 create
            autonomous-system 64500
            route-distinguisher 64500:1
            interface "int-CE-4-PE-1_VPRN1" create
                address 172.16.114.2/30
                sap 1/1/1:1 create
                exit
            exit
            interface "int-CE-4-PE-2_VPRN1" create
                address 172.16.124.2/30
                sap 1/1/2:1 create
                exit
            exit
            interface "test_connectedNW" create
                address 172.31.0.1/16
                loopback
            exit
            bgp
                split-horizon
                group "EBGP_1"
                    export "export_172.31.0.0/16"
                    peer-as 64496
                    neighbor 172.16.114.1
                    exit
                    neighbor 172.16.124.1
                    exit
                exit
            exit
            no shutdown
The configuration on CE-6 is similar.
For all BGP speakers in AS 64496, BGP must be configured for address family VPN-IPv4 as well as for IPv4, as follows:
# on PE-1, PE-2, PE-3:
configure
    router Base
        bgp
            group "IBGP"
                family ipv4 vpn-ipv4
BGP add-path cannot be enabled in the bgp context within a VPRN. However, BGP add-path can be enabled in the base routing instance for address family VPN-IPv4. This is done on all PEs at group level with the following command:
# on all PEs:
configure
    router Base
        bgp
            group "IBGP"
                add-paths
                    vpn-ipv4 send 2 receive
In this example, BGP add-path is enabled at neighbor level on RR-5, as follows:
# on RR-5:
configure
    router Base
        bgp
            group "IBGP"
                neighbor 192.0.2.1
                    add-paths
                        vpn-ipv4 send 2 receive
                    exit
                exit
                neighbor 192.0.2.2
                    add-paths
                        vpn-ipv4 send 2 receive
                    exit
                exit
                neighbor 192.0.2.3
                    add-paths
                        vpn-ipv4 send 2 receive
                    exit
                exit
The BGP configuration for group "IBGP" on PE-1 is as follows:
*A:PE-1# configure router bgp group "IBGP"
*A:PE-1>config>router>bgp>group# info
----------------------------------------------
family ipv4 vpn-ipv4
next-hop-self
peer-as 64496
add-paths
ipv4 send 2 receive
vpn-ipv4 send 2 receive
exit
neighbor 192.0.2.5
exit
----------------------------------------------
With add-path enabled for address family VPN-IPv4, PE-1 and PE-2 will advertise their route for prefix 172.31.0.0/16 as a VPN-IPv4 route to RR-5. RR-5 will advertise both routes to its other RR clients. PE-3 receives two VPN-IPv4 routes for prefix 172.31.0.0/16, as follows:
*A:PE-3# show router bgp routes 172.31.0.0/16 vpn-ipv4
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP VPN-IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 64496:1:172.31.0.0/16 200 None
192.0.2.1 3 10
64500 524284
ub*i 64496:1:172.31.0.0/16 100 None
192.0.2.2 15 10
64500 524283
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
Both routes are used: the route via PE-1 is the active route and the route via PE-2 is used as a backup, as indicated by the "b" flag.
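The selection that a sender performs with "send 2" can be sketched as follows. This Python example models the rule stated in the overview: the best n paths are taken, and when several of them share a BGP next hop, only the best path per next hop is advertised. It is an illustration under assumptions, not the SR OS algorithm: the function name select_add_paths and the dictionary layout are hypothetical, and paths are ranked by local preference only.

```python
def select_add_paths(paths, n):
    """Return the paths to advertise for one prefix.

    Step 1: rank the paths best-first (simplified to local preference).
    Step 2: take the best n paths.
    Step 3: within those n, keep only the best path per BGP next hop;
            the others are suppressed.
    """
    ordered = sorted(paths, key=lambda p: p["local_pref"], reverse=True)
    advertised, seen_next_hops = [], set()
    for path in ordered[:n]:
        if path["next_hop"] in seen_next_hops:
            continue  # suppressed: same next hop as a better path
        seen_next_hops.add(path["next_hop"])
        advertised.append(path)
    return advertised


# The two VPN-IPv4 paths for 172.31.0.0/16 as seen in the output on PE-3.
paths = [
    {"next_hop": "192.0.2.2", "local_pref": 100},
    {"next_hop": "192.0.2.1", "local_pref": 200},
]
for path in select_add_paths(paths, 2):
    print(path["next_hop"], path["local_pref"])
```

With the two paths from the example, both are advertised, with the path via 192.0.2.1 first. If the two best paths shared a next hop, this model would advertise only one of them.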
The routing table for VPRN 1 on PE-3 shows that there is a backup route for prefix 172.31.0.0/16, as indicated by "B" as follows:
*A:PE-3# show router 1 route-table 172.31.0.0/16
===============================================================================
Route Table (Service: 1)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
172.31.0.0/16 [B] Remote BGP VPN 00h00m32s 170
192.0.2.1 (tunneled) 10
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
The active route and the alternative (backup) route are shown in the following output:
*A:PE-3# show router 1 route-table 172.31.0.0/16 alternative
===============================================================================
Route Table (Service: 1)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
Alt-NextHop Alt-
Metric
-------------------------------------------------------------------------------
172.31.0.0/16 Remote BGP VPN 00h00m32s 170
192.0.2.1 (tunneled) 10
172.31.0.0/16 (Backup) Remote BGP VPN 00h00m32s 170
192.0.2.2 (tunneled) 10
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
Backup = BGP backup route
LFA = Loop-Free Alternate nexthop
S = Sticky ECMP requested
===============================================================================
BGP FRR is disabled in VPRN 1 on the PEs, as follows:
# on PE-1, PE-2, PE-3:
configure
    service
        vprn "VPRN 1"
            no enable-bgp-vpn-backup
Add-path for family VPN-IPv4 with ECMP
The import policy is removed in VPRN 1 on PE-1 to make the cost of the paths via PE-1 and PE-2 equal, as follows:
# on PE-1:
configure
    service
        vprn "VPRN 1"
            bgp
                group "EBGP_1"
                    no import
ECMP is enabled in VPRN 1 on all PEs, as follows:
# on PE-1, PE-2, PE-3:
configure
    service
        vprn "VPRN 1"
            ecmp 2
BGP multipath must also be enabled in the base routing context; this was already configured in section Add-path and ECMP.
With ECMP enabled, the two routes that are received on PE-3 from RR-5 are both active, as follows:
*A:PE-3# show router bgp routes 172.31.0.0/16 vpn-ipv4
===============================================================================
BGP Router ID:192.0.2.3 AS:64496 Local AS:64496
===============================================================================
Legend -
Status codes : u - used, s - suppressed, h - history, d - decayed, * - valid
l - leaked, x - stale, > - best, b - backup, p - purge
Origin codes : i - IGP, e - EGP, ? - incomplete
===============================================================================
BGP VPN-IPv4 Routes
===============================================================================
Flag Network LocalPref MED
Nexthop (Router) Path-Id IGP Cost
As-Path Label
-------------------------------------------------------------------------------
u*>i 64496:1:172.31.0.0/16 100 None
192.0.2.1 3 10
64500 524284
u*>i 64496:1:172.31.0.0/16 100 None
192.0.2.2 15 10
64500 524283
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
ECMP is enabled with a value of two, so traffic flows in VPRN 1 on PE-3 with destination 172.31.0.0/16 are distributed over two paths: one via PE-1 and another via PE-2, as follows:
*A:PE-3# show router 1 route-table 172.31.0.0/16
===============================================================================
Route Table (Service: 1)
===============================================================================
Dest Prefix[Flags] Type Proto Age Pref
Next Hop[Interface Name] Metric
-------------------------------------------------------------------------------
172.31.0.0/16 Remote BGP VPN 00h00m48s 170
192.0.2.1 (tunneled) 10
172.31.0.0/16 Remote BGP VPN 00h00m48s 170
192.0.2.2 (tunneled) 10
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
B = BGP backup route available
L = LFA nexthop available
S = Sticky ECMP requested
===============================================================================
Conclusion
BGP add-path allows BGP speakers to advertise multiple distinct paths for the same prefix. The potential benefits of BGP add-path include reduced routing churn, faster convergence, and better load-sharing.