BGP Graceful Restart and Long-Lived Graceful Restart

This chapter provides information about BGP Graceful Restart and Long-Lived Graceful Restart.

Applicability

This chapter was initially written for SR OS Release 15.0.R8, but the CLI in the current edition corresponds to SR OS Release 19.10.R2.

Overview

BGP was designed on the assumption that a router should react immediately to peer router failures, so that its forwarding state converges toward the current state of the network. However, BGP is often used to signal Network Layer Reachability Information (NLRI) associated with configuration rather than forwarding, such as flow specifications, Route Target (RT) constraints, BGP Auto-Discovery (BGP-AD), and BGP-VPLS. Graceful Restart (GR) can be applied when there is fate separation between the control plane and the forwarding plane, allowing a restart of the control plane without affecting forwarding.

Table 1 lists the supported address families for GR and Long-Lived Graceful Restart (LLGR) in the base router and in a BGP instance in a VPRN.

Table 1. Supported address families for GR and LLGR in base router and in VPRN

Address family | AFI/SAFI | BGP in base router | BGP in VPRN
IPv4 unicast   | 1/1      | X                  | X
Labeled IPv4   | 1/4      | X                  | X
VPN-IPv4       | 1/128    | X                  |
RT constraint  | 1/132    | X                  |
FlowSpec IPv4  | 1/133    | X                  | X
IPv6 unicast   | 2/1      | X                  | X
Labeled IPv6   | 2/4      | X                  |
VPN-IPv6       | 2/128    | X                  |
FlowSpec IPv6  | 2/133    | X                  | X
L2 VPN         | 25/65    | X                  |

GR

GR can be applied in the general bgp context, in a BGP group, or per BGP neighbor. BGP GR can be applied for the base router or a VPRN. GR can be enabled as follows:

configure router bgp graceful-restart
configure router bgp group <groupName> graceful-restart
configure router bgp group <groupName> neighbor <neighborName> graceful-restart
configure service vprn <vprnId> bgp graceful-restart
configure service vprn <vprnId> bgp group <groupName> graceful-restart
configure service vprn <vprnId> bgp group <groupName> neighbor <neighborName> graceful-restart

The following shows the BGP configuration on the base router for multiple address families. GR is enabled with a stale routes time of 150 seconds and notifications will be sent. No restart time is configured explicitly; the default restart time is 300 seconds at group level and peer level; at BGP instance level, the default restart time is 120 seconds. LLGR is not configured.

# on PE-2:
configure
    router
        bgp
            split-horizon
            group "iBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                graceful-restart
                    stale-routes-time 150
                    enable-notification
                exit
                peer-as 64496
                neighbor 192.0.2.1
                exit
            exit
            no shutdown

A BGP speaker can advertise a GR capability to indicate that it is able to preserve its forwarding state per address family (AF) during BGP restart. The GR capability can be used to inform the BGP peers that an end-of-RIB (EOR) message will be generated after all routing updates have been sent for an address family, as follows:

172 2020/02/12 11:49:58.321 UTC MINOR: DEBUG #2001 Base Peer 1: 192.0.2.1
"Peer 1: 192.0.2.1: UPDATE
Peer 1: 192.0.2.1 - Received BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 7
    Flag: 0x90 Type: 15 Len: 3 Multiprotocol Unreachable NLRI:
        Address Family IPV6
End-of-Rib marker (IPV6)
"

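The EOR marker in the preceding trace can also be recognized programmatically. The following Python sketch assumes that the UPDATE body is available as raw bytes without the 19-octet BGP header; the function name is_end_of_rib is illustrative and is not part of SR OS. It follows RFC 4724: for IPv4 unicast, the EOR is an UPDATE with no withdrawn routes, no path attributes, and no NLRI; for any other family, it is an UPDATE whose only attribute is an MP_UNREACH_NLRI that carries just the AFI and SAFI.

```python
import struct

MP_UNREACH_NLRI = 15      # path attribute type code
EXTENDED_LENGTH = 0x10    # attribute flag: length field is two octets


def is_end_of_rib(update_body: bytes):
    """Return (afi, safi) if the UPDATE body is an EOR marker, else None."""
    (withdrawn_len,) = struct.unpack_from("!H", update_body, 0)
    (attr_len,) = struct.unpack_from("!H", update_body, 2 + withdrawn_len)
    attrs_start = 4 + withdrawn_len
    attrs = update_body[attrs_start:attrs_start + attr_len]
    nlri = update_body[attrs_start + attr_len:]

    if withdrawn_len or nlri:
        return None               # carries real routing information
    if attr_len == 0:
        return (1, 1)             # minimum-length UPDATE: EOR for IPv4 unicast
    if len(attrs) < 4:
        return None

    flags, type_code = attrs[0], attrs[1]
    if flags & EXTENDED_LENGTH:
        (length,) = struct.unpack_from("!H", attrs, 2)
        value = attrs[4:]
    else:
        length = attrs[2]
        value = attrs[3:]

    # The only attribute must be MP_UNREACH_NLRI holding AFI (2) + SAFI (1).
    if type_code == MP_UNREACH_NLRI and length == 3 and len(value) == 3:
        afi, safi = struct.unpack("!HB", value)
        return (afi, safi)
    return None


# Body rebuilt from the trace: Withdrawn Length 0, Total Path Attr Length 7,
# Flag 0x90, Type 15, Len 3, AFI 2 (IPv6), SAFI 1 (unicast).
trace_body = bytes([0x00, 0x00, 0x00, 0x07, 0x90, 0x0F, 0x00, 0x03,
                    0x00, 0x02, 0x01])
print(is_end_of_rib(trace_body))  # (2, 1)
```
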
Figure 1 shows the GR capability with restart flags, restart time, and forwarding flags per address family. RFC 4724 defines the GR BGP capability. The notification bit N is defined in draft-ietf-idr-bgp-gr-notification-13.

Figure 1. BGP GR capability
  • Restart flags:

    • The restart state bit R is used to avoid a possible deadlock when multiple BGP speakers peering with each other restart simultaneously and are waiting for the EOR. When set (R=1), the bit indicates that the BGP speaker has restarted and its peer must not wait for the EOR before advertising routing information.

    • The notification bit N indicates that the BGP speaker is willing to send and receive BGP notification messages in GR mode, including the BGP Notification Cease message, which is a hard-reset message causing a peer to terminate a BGP session.

    • The remaining two restart flag bits are reserved and must be set to 0.

  • The restart time in seconds is the estimated time required to re-establish a BGP session after a restart. When the restart time expires before the BGP session is re-established, the GR helper stops helping and the (stale) routes received from the failed BGP speaker are removed.

  • Flags for address family:

    • The forwarding state bit F indicates whether the forwarding state for routes with a certain AFI/SAFI is preserved during BGP restart. When set (F=1), the forwarding state is preserved. After a hard reset caused by a BGP Notification Cease message, the forwarding bit must be set to 1.

    • The remaining bits are reserved and must be 0.

A BGP speaker can advertise GR capability without any AFI/SAFI, indicating that the sender cannot preserve its forwarding state during BGP restart, but supports procedures for the receiving speaker.

Debugging is enabled for BGP Open messages, as follows:

# on all PEs:
debug 
    router
        bgp 
            open

The following BGP Open message received by PE-2 from PE-1 shows the GR capability for different address families and with the default restart time of 300 seconds. The restart bit R is false because no GR is taking place on peer PE-1. The notification bit N is set to true. The same AFI/SAFI information is presented in the GR capability as in the MP-BGP capabilities, because GR is always enabled for all configured AFI/SAFIs. LLGR is not enabled yet.

50 2020/02/12 13:25:41.971 UTC MINOR: DEBUG #2001 Base BGP
"BGP: OPEN
Peer 1: 192.0.2.1 - Received BGP OPEN: Version 4
   AS Num 64496: Holdtime 90: BGP_ID 192.0.2.1: Opt Length 84 (ExtOpt F)
   Opt Para: Type CAPABILITY: Length = 82: Data:
     Cap_Code GRACEFUL-RESTART: Length 30
       Bytes: 0x41 0x2c 0x0 0x1 0x1 0x0 0x0 0x2 0x1 0x0 0x0 0x1 0x80 0x0 0x0 0x2 0x80 0x0 0x0 0x19 0x41 0x0 0x0 0x1 0x85 0x0 0x0 0x2 0x85 0x0
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x1 0x0 0x1               # AFI 1/SAFI 1 = IPv4 unicast
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x2 0x0 0x1               # AFI 2/SAFI 1 = IPv6 unicast
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x1 0x0 0x80              # AFI 1/SAFI 128 = VPN-IPv4
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x2 0x0 0x80              # AFI 2/SAFI 128 = VPN-IPv6
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x19 0x0 0x41             # AFI 25/SAFI 65 = L2 VPN
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x1 0x0 0x85              # AFI 1/SAFI 133 = IPv4 FlowSpec
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x2 0x0 0x85              # AFI 2/SAFI 133 = IPv6 FlowSpec
     Cap_Code ROUTE-REFRESH: Length 0
     Cap_Code 4-OCTET-ASN: Length 4
       Bytes: 0x0 0x0 0xfb 0xf0
" 

The first two octets in the GR capability are 0x41 0x2c (01000001 00101100 in binary). The first four bits (0100) represent the restart flags: R=0, N=1, and the remaining two bits are reserved and set to 0. The remaining twelve bits (000100101100) represent the restart time in seconds: 256+32+8+4=300.

The following four octets in the GR capability are 0x0 0x1 0x1 0x0 (00000000 00000001 00000001 00000000 in binary). The first two octets represent AFI 1 for IPv4, the third octet SAFI 1 for unicast, and the last octet represents the flags, with the forwarding bit F=0 and all other bits reserved and set to zero. The other bytes are for the other AFI/SAFIs that are configured in the example.
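
The decoding described above can be reproduced with a few lines of Python. This is a minimal sketch, not an SR OS tool: the function name parse_gr_capability and the returned dictionary layout are assumptions made for illustration. The input bytes are copied from the GRACEFUL-RESTART capability in the Open message shown earlier.

```python
def parse_gr_capability(data: bytes) -> dict:
    """Decode the value of a GR capability (RFC 4724 layout plus the N bit)."""
    header = int.from_bytes(data[:2], "big")
    families = []
    # After the 2-octet header, each address family uses 4 octets:
    # AFI (2 octets), SAFI (1 octet), flags (1 octet, F is the top bit).
    for i in range(2, len(data), 4):
        afi = int.from_bytes(data[i:i + 2], "big")
        safi = data[i + 2]
        forwarding = bool(data[i + 3] & 0x80)
        families.append((afi, safi, forwarding))
    return {
        "restart": bool(header & 0x8000),       # R bit
        "notification": bool(header & 0x4000),  # N bit
        "restart_time": header & 0x0FFF,        # lower 12 bits, in seconds
        "families": families,
    }


gr_bytes = bytes([
    0x41, 0x2C,
    0x00, 0x01, 0x01, 0x00,   # IPv4 unicast
    0x00, 0x02, 0x01, 0x00,   # IPv6 unicast
    0x00, 0x01, 0x80, 0x00,   # VPN-IPv4
    0x00, 0x02, 0x80, 0x00,   # VPN-IPv6
    0x00, 0x19, 0x41, 0x00,   # L2 VPN
    0x00, 0x01, 0x85, 0x00,   # IPv4 FlowSpec
    0x00, 0x02, 0x85, 0x00,   # IPv6 FlowSpec
])

cap = parse_gr_capability(gr_bytes)
print(cap["restart"], cap["notification"], cap["restart_time"])  # False True 300
print(cap["families"][0])  # (1, 1, False)
```

Because the restart time occupies only 12 bits, the largest value that can be signaled is 4095 seconds, which matches the upper bound of the restart-time option.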

Debugging is enabled for GR, as follows:

# on all PEs:
debug 
    router 
        bgp 
            graceful-restart

The following messages are in the debug trace on PE-2. The first message shows restart bit R false (no restart ongoing), notification bit N true (GR notifications are supported), restart time 300s (default value), and notification restart false (no GR notifications were sent).

51 2020/02/12 13:25:41.971 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer 1: 192.0.2.1: Restart Capability Receive: restart BIT FALSE: Graceful Notification BIT TRUE: Restart Time 300 secs: NOTIFICATION restart FALSE
" 

The subsequent messages show the GR capabilities per address family with the value of the forwarding-preserved bit F, for example, for the IPv4 unicast address family, as follows. The forwarding-preserved bit is false.

52 2020/02/12 13:25:41.971 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer 1: 192.0.2.1: Restart Capability Receive: afi: AFI_IPV4 safi: SAFI_UNICAST forwarding-preserved BIT FALSE
" 

When routers have negotiated the GR capability for an address family and the BGP session drops, the BGP peers enter the GR helper state and do not immediately delete the routes of that address family received from the failed peer. The helpers mark these routes as stale and keep using them until the BGP session is restored, the BGP routes are refreshed, and an EOR message has been received for the AFI/SAFIs.

However, if the BGP session with the restarting router is not restored before the configured restart time expires, the peer router stops helping and sends withdraw messages for the routes received from the restarting router. Likewise, when the stale routes time expires, the router withdraws all routes received from the restarting router. The restart time has an upper bound of 4095 seconds, so this mechanism is designed for relatively short outages on the order of minutes, not hours. GR is intended for control plane restarts that are limited in scope and severity.

*A:PE-1# configure router bgp graceful-restart restart-time 
  - restart-time <seconds>
  - no restart-time

 <seconds>            : [0..4095]

LLGR

LLGR can handle failure scenarios where the repair takes several hours, such as a network where redundant route reflectors (RRs) fail simultaneously and the configuration-type BGP routes (that is, non-forwarding BGP routes) for FlowSpec, route target, and L2 VPNs can be preserved. BGP routes for forwarding can also be preserved longer. LLGR can be enabled for all address families that have GR enabled, or for a subset of these address families. LLGR allows a BGP session to stay down for hours or even days. The advertised stale time has an upper bound of 16777215 seconds and the default value is 86400 seconds. LLGR is configured in the GR context, which is in the general bgp context, per group, or per neighbor.

*A:PE-1# configure router bgp graceful-restart long-lived advertised-stale-time 
  - advertised-stale-time <seconds>
  - no advertised-stale-time

 <seconds>            : [0..16777215]

When GR is enabled, it automatically applies for all configured AFs; LLGR can be configured per AF, possibly with different LLGR-stale times, for example, for the L2 VPN address family in group "iBGP", as follows:

# on PE-1:
configure 
    router 
        bgp 
            group "iBGP" 
                graceful-restart 
                    long-lived 
                        family l2-vpn 
                            advertised-stale-time 7200
                        exit

Figure 2 shows the LLGR capability, as defined in draft-uttaro-idr-bgp-persistence-03, which adds a long-lived stale time per address family. The LLGR capability must be advertised in conjunction with the GR capability.

Figure 2. LLGR capability

GR and LLGR are configured in the "iBGP" group for all configured AFI/SAFIs, as follows. The long-lived advertised-stale-time is set to 3600 seconds in this example; its default value is 86400 seconds.

configure
    router
        bgp
            group "iBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                graceful-restart
                    stale-routes-time 150
                    enable-notification
                    long-lived
                        advertised-stale-time 3600
                    exit
                exit
            exit

When LLGR is enabled, the BGP Open message contains a long-lived GR capability and a GR capability, with the supported AFI/SAFIs. The following BGP Open message is received by PE-2 from RR PE-1. GR and LLGR are supported for all the AFI/SAFIs in the BGP session.

280 2020/02/12 13:56:38.304 UTC MINOR: DEBUG #2001 Base BGP
"BGP: OPEN
Peer 1: 192.0.2.1 - Received BGP OPEN: Version 4
   AS Num 64496: Holdtime 90: BGP_ID 192.0.2.1: Opt Length 135 (ExtOpt F)
   Opt Para: Type CAPABILITY: Length = 133: Data:
     Cap_Code GRACEFUL-RESTART: Length 30
       Bytes: 0x41 0x2c 0x0 0x1 0x1 0x0 0x0 0x2 0x1 0x0 0x0 0x1 0x80 0x0 0x0 0x2 0x80 0x0 0x0 0x19 0x41 0x0 0x0 0x1 0x85 0x0 0x0 0x2 0x85 0x0
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x1 0x0 0x1
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x2 0x0 0x1
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x1 0x0 0x80
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x2 0x0 0x80
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x19 0x0 0x41
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x1 0x0 0x85
     Cap_Code MP-BGP: Length 4
       Bytes: 0x0 0x2 0x0 0x85
     Cap_Code ROUTE-REFRESH: Length 0
     Cap_Code 4-OCTET-ASN: Length 4
       Bytes: 0x0 0x0 0xfb 0xf0
     Cap_Code LONG-LIVED-GR: Length 49
       Bytes: 0x0 0x1 0x1 0x0 0x0 0xe 0x10 0x0 0x2 0x1 0x0 0x0 0xe 0x10 0x0 0x1 0x80 0x0 0x0 0xe 0x10 0x0 0x2 0x80 0x0 0x0 0xe 0x10 0x0 0x19 0x41 0x0 0x0 0xe 0x10 0x0 0x1 0x85 0x0 0x0 0xe 0x10 0x0 0x2 0x85 0x0 0x0 0xe 0x10
" 

The first seven octets in the LLGR capability—0x0 0x1 0x1 0x0 0x0 0xe 0x10—are for AF IPv4 unicast. The first two—0x0 0x1—represent AFI=1 for IPv4, the third—0x1—represents SAFI=1 for unicast, the fourth—0x0—indicates that the forwarding-preserved bit F=0 (and the other bits are reserved and must be zero). The next three octets represent the LLGR-stale time: 0x0 0xe 0x10 (00000000 00001110 00010000 in binary) is 2048 + 1024 + 512 + 16 = 3600 in decimal.
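
The same decoding can be applied to the whole LLGR capability. The following Python sketch is illustrative only; the function name parse_llgr_capability and the tuple layout are assumptions, and the hexadecimal string is transcribed from the LONG-LIVED-GR bytes in the preceding Open message. Each address family occupies seven octets: AFI (2), SAFI (1), flags (1), and LLGR-stale time (3).

```python
def parse_llgr_capability(data: bytes) -> list:
    """Decode an LLGR capability into (afi, safi, f_bit, stale_time) tuples."""
    entries = []
    for i in range(0, len(data), 7):
        afi = int.from_bytes(data[i:i + 2], "big")
        safi = data[i + 2]
        forwarding = bool(data[i + 3] & 0x80)                  # F bit
        stale_time = int.from_bytes(data[i + 4:i + 7], "big")  # 24 bits, seconds
        entries.append((afi, safi, forwarding, stale_time))
    return entries


llgr_bytes = bytes.fromhex(
    "00010100000e10"   # IPv4 unicast
    "00020100000e10"   # IPv6 unicast
    "00018000000e10"   # VPN-IPv4
    "00028000000e10"   # VPN-IPv6
    "00194100000e10"   # L2 VPN
    "00018500000e10"   # IPv4 FlowSpec
    "00028500000e10"   # IPv6 FlowSpec
)

for entry in parse_llgr_capability(llgr_bytes):
    print(entry)       # first line printed: (1, 1, False, 3600)
```

The three-octet stale time explains the upper bound of 16777215 seconds (2^24 - 1) for the advertised-stale-time option.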

The LLGR-stale time in seconds specifies how long LLGR-stale routes for the AFI/SAFI may be retained, possibly added to the GR time. LLGR starts when GR terminates before the failed router has recovered, that is, when either the restart timer or the stale-routes timer expires (whichever expires first), as shown in Figure 3. LLGR ends when the advertised LLGR-stale time expires or when the failure is restored and all routes are re-advertised followed by an EOR message. When the AFI/SAFI is not listed in the GR capability, the restart time for GR is 0 seconds. The LLGR-stale time is defined by the advertised-stale-time option, which has a default value of 86400 seconds.

Figure 3. GR and LLGR
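
The timing relationship can be expressed as a short calculation. The following Python sketch is a simplified model under stated assumptions: the session stays down for the whole period, the LLGR-stale timer starts at the moment GR terminates, and the function name llgr_window is made up for this illustration. The values are the ones used in this chapter: restart time 300 seconds, stale-routes-time 150 seconds, and advertised-stale-time 3600 seconds.

```python
def llgr_window(restart_time: int, stale_routes_time: int,
                llgr_stale_time: int) -> tuple:
    """Return (llgr_start, llgr_end) in seconds after the session drop.

    Assumes the failed peer does not recover before the timers expire.
    """
    # GR ends when the first of the two GR timers expires.
    gr_end = min(restart_time, stale_routes_time)
    # LLGR then runs for at most the advertised LLGR-stale time.
    return gr_end, gr_end + llgr_stale_time


print(llgr_window(300, 150, 3600))  # (150, 3750)

# AFI/SAFI not listed in the GR capability: restart time is 0, so LLGR
# starts immediately.
print(llgr_window(0, 150, 3600))    # (0, 3600)
```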

The forwarding-preserved bit F is configured with the following command. By default, all F bits are 0, indicating that the forwarding state was not preserved during the previous restart. The forwarding-bits-set command allows F bits for all AFI/SAFIs to be set to 1, or only the F bits for configuration-type (that is, non-forwarding) AFI/SAFIs, such as L2 VPN, route target, IPv4 FlowSpec, and IPv6 FlowSpec.

*A:PE-1# configure router bgp group "iBGP" graceful-restart long-lived forwarding-bits-set 
  - forwarding-bits-set {all|non-fwd}
  - no forwarding-bits-set

An address family is only protected with LLGR if the AFI/SAFI is in the advertised LLGR capability and in the received LLGR capability. In SR OS, LLGR can only be enabled when GR is enabled, so each advertised LLGR capability comes with a GR capability. If a peer advertises the LLGR capability without GR capability, the LLGR capability is ignored.
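
The negotiation rule in the preceding paragraph amounts to a set intersection. The following Python sketch illustrates it; the function name llgr_protected_families and its parameters are hypothetical and only model the rule as stated in the text.

```python
def llgr_protected_families(advertised_llgr: set, received_llgr: set,
                            peer_sent_gr_capability: bool) -> set:
    """Return the AFI/SAFI pairs that are protected by LLGR on a session."""
    # An LLGR capability received without a GR capability is ignored.
    if not peer_sent_gr_capability:
        return set()
    # A family is protected only if both sides listed it.
    return advertised_llgr & received_llgr


L2_VPN = (25, 65)
all_families = {(1, 1), (2, 1), (1, 128), (2, 128), (25, 65), (1, 133), (2, 133)}

# Local router advertises LLGR for L2 VPN only; the peer lists all families.
print(llgr_protected_families({L2_VPN}, all_families, True))   # {(25, 65)}
print(llgr_protected_families({L2_VPN}, all_families, False))  # set()
```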

GR is used for short outages where the helpers pretend that everything is normal; LLGR is for longer outages where the helpers inform the other peers. Table 2 compares the helper actions during GR and LLGR.

Table 2. Helper actions during GR and LLGR

Helper actions during GR | Helper actions during LLGR
Mark GR-eligible routes from the failed peer as stale | Mark LLGR-eligible routes from the failed peer as LLGR-stale
Attempt to reconnect to the peer at periodic intervals | Attempt to reconnect to the peer at periodic intervals
- | Depreference LLGR-stale routes so that they are less preferred than any valid non-LLGR-stale route
- | If an LLGR-stale route remains the best path, inform the other peers by withdrawing the route or re-advertising the route with new attributes

A route is said to be depreferenced if it has its route selection preference reduced in reaction to some event. LLGR automatically depreferences LLGR-stale routes so that any valid non-LLGR-stale route for the same NLRI is more preferred. When advertising LLGR-stale routes to an LLGR-capable peer, LLGR adds the well-known llgr-stale BGP community to the routes, so that the LLGR-capable BGP peers can also depreference the LLGR-stale routes. The following option controls how LLGR-stale routes are advertised.

*A:PE-1# configure router bgp group "iBGP" graceful-restart long-lived advertise-stale-to-all-neighbors 
  - advertise-stale-to-all-neighbors [without-no-export]
  - no advertise-stale-to-all-neighbors

 <without-no-export>  : keyword - Advertise stale routes to neighbors with the
                        addition of the LLGR_STALE community
  • The default is no advertise-stale-to-all-neighbors, in which case LLGR-aware routers re-advertise stale best routes to their LLGR-aware peers, with the addition of the well-known llgr-stale community. Toward BGP peers that did not advertise the LLGR capability, the stale routes are withdrawn.

  • When advertise-stale-to-all-neighbors is configured without the without-no-export keyword (the default), the LLGR-stale routes are withdrawn toward eBGP peers that did not advertise the LLGR capability and re-advertised to all other peers with the llgr-stale community. Toward iBGP (and confederation-eBGP) peers that did not signal the LLGR capability, the route is re-advertised with the well-known llgr-stale and no-export communities and the local preference is set to 0.

  • When advertise-stale-to-all-neighbors is configured with the without-no-export keyword, the LLGR-stale routes are withdrawn toward eBGP peers that did not advertise the LLGR capability and re-advertised to all other peers with the llgr-stale community. Toward iBGP (and confederation-eBGP) peers that did not signal the LLGR capability, the route is re-advertised with the llgr-stale community, but without the no-export community. The local preference is set to 0.

Route policies can match, delete, or add the BGP well-known communities llgr-stale and no-llgr.

An iBGP peer not supporting LLGR normally does not receive route updates with LLGR-stale community, but if it does, it can only depreference them based on local preference 0.

The LLGR-stale routes timer is not stopped when the BGP session with the failed peer is re-established; it only stops when the EOR is received for the AFI/SAFI. When the LLGR-stale routes time expires for an AFI/SAFI, the LLGR phase ends and all remaining LLGR-stale routes for that AFI/SAFI are deleted. However, stale routes are also deleted before the LLGR stale-routes timer expires when the BGP session with the failed peer is re-established and any of the following applies:

  • the GR or LLGR capability is missing

  • the AFI/SAFI is missing from the LLGR capability

  • the forwarding state bit F=0 for the AFI/SAFI
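
The three conditions can be combined into a single check. The following Python sketch is one way to model the rule; the function name and the data layout are assumptions, and it assumes that the F bit in question is the one carried in the LLGR capability of the new Open message.

```python
def delete_stale_on_reconnect(afi_safi: tuple, gr_capability_received: bool,
                              llgr_capability) -> bool:
    """Decide whether LLGR-stale routes are deleted when the session returns.

    llgr_capability maps (afi, safi) to the F bit received in the new Open
    message, or is None when the peer did not send the LLGR capability.
    """
    if not gr_capability_received or llgr_capability is None:
        return True                       # GR or LLGR capability is missing
    if afi_safi not in llgr_capability:
        return True                       # AFI/SAFI missing from LLGR capability
    return not llgr_capability[afi_safi]  # F=0: forwarding state not preserved


new_open = {(25, 65): True, (1, 133): False}
print(delete_stale_on_reconnect((25, 65), True, new_open))   # False
print(delete_stale_on_reconnect((1, 133), True, new_open))   # True
print(delete_stale_on_reconnect((2, 133), True, new_open))   # True
```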

Configuration

Figure 4 shows the example topology with four routers in AS 64496. PE-1 combines the roles of a PE and an RR. A FlowSpec route server sends IPv4 and IPv6 FlowSpec routes to PE-1. Test centers T1 and T2 generate IPv4 and IPv6 traffic to each other, through the base router or a VPLS service. PE-4 is in AS 64500 and has an eBGP session with PE-2 in AS 64496.

Figure 4. Example topology

Initial configuration

The initial configuration on the nodes includes:

  • Cards, MDAs, ports

  • Router interfaces with dual stack

  • IS-IS on all interfaces of the routers in AS 64496 (alternatively, OSPF can be used)

  • LDP on all interfaces in AS 64496, not between PE-2 and PE-4

  • MPLS and RSVP on all interfaces in AS 64496, not between PE-2 and PE-4

  • RSVP-TE LSPs between PE-2 and PE-5

Figure 5 shows the configured services on the PEs. VPRN 1 is configured on PE-2, PE-3, PE-4, and PE-5; VPLS 2 with BGP-AD on PE-2 and PE-5.

Figure 5. VPRN 1 and VPLS 2 in the example topology

The service configuration on PE-2 is as follows. The pseudowire (PW) template is required for BGP-AD in VPLS 2, as described in the LDP VPLS Using BGP-Auto Discovery chapter.

# on PE-2:
configure
    service
        pw-template 1 name "PW 1" create
            split-horizon-group "vpls-shg"
            exit
        exit
        vprn 1 name "VPRN 1" customer 1 create
            route-distinguisher 64496:1
            auto-bind-tunnel
                resolution any
            exit
            vrf-target target:64496:1
            interface "int-VPRN1-PE-2-CE-20" create
                address 172.16.2.1/30
                ipv6
                    address 2001:db8::1:2:1/126
                exit
                sap 1/1/5:1 create
                exit
            exit
            no shutdown
        exit
        vpls 2 name "VPLS 2" customer 1 create
            bgp
                route-distinguisher 64496:2
                route-target export target:64496:2 import target:64496:2
                pw-template-binding 1 import-rt "target:64496:2"
                exit
            exit
            bgp-ad
                vpls-id 64496:2
                vsi-id
                    prefix 192.0.2.2
                exit
                no shutdown
            exit
            sap 1/1/5:2 create
            exit
            sap 1/1/4 create
            exit
            no shutdown
        exit

For the exchange of the routes in the VPRN, the VPN IPv4 and VPN IPv6 address families need to be configured in BGP; for BGP-AD, the L2 VPN address family. BGP is configured on all PEs for the following address families: IPv4, IPv6, VPN-IPv4, VPN-IPv6, L2 VPN, IPv4 FlowSpec, and IPv6 FlowSpec. On RR PE-1, the initial BGP configuration is as follows. The "iBGP" group includes all the PEs in AS 64496, whereas the "FlowSpec" group includes the FlowSpec server only.

# on RR PE-1:
configure
    router
        bgp
            split-horizon
            group "iBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                cluster 192.0.2.1
                peer-as 64496
                neighbor 192.0.2.2
                exit
                neighbor 192.0.2.3
                exit
                neighbor 192.0.2.5
                exit
            exit
            group "FlowSpec"
                family ipv4 ipv6 flow-ipv4 flow-ipv6
                peer-as 64496
                neighbor 192.168.11.2
                exit
            exit
        exit

On PE-2, the prefixes toward the test center T1 are exported. BGP is configured as follows:

# on PE-2:
configure
    router
        policy-options
            begin
            prefix-list "T1"
                prefix 172.16.112.0/28 longer
                prefix 2001:db8::112:0/124 longer
            exit
            policy-statement "export-T1"
                entry 10
                    from
                        protocol direct
                        prefix-list "T1"
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit
        bgp
            split-horizon
            group "eBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                local-as 64496
                peer-as 64500
                neighbor 192.168.24.2
                exit
            exit
            group "iBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                export "export-T1"
                peer-as 64496
                neighbor 192.0.2.1
                exit
            exit
        exit

The configuration on PE-5 is similar, but without the "eBGP" group, and for the export, the prefixes from T2 are included, as follows:

# on PE-5:
configure
    router
        policy-options
            begin
            prefix-list "T2"
                prefix 172.16.225.0/28 longer
                prefix 2001:db8::225:0/124 longer
            exit
            policy-statement "export-T2"
                entry 10
                    from
                        protocol direct
                        prefix-list "T2"
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit
        bgp
            split-horizon
            group "iBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                export "export-T2"
                peer-as 64496
                neighbor 192.0.2.1
                exit
            exit
        exit

On PE-3, the BGP configuration is similar, without the export policy.

The BGP configuration on PE-4 is as follows:

# on PE-4:
configure
    router
        bgp
            split-horizon
            group "eBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                local-as 64500
                peer-as 64496
                neighbor 192.168.24.1
                exit
            exit

BGP routes

Under normal conditions, BGP routes of all the configured address families are advertised. The BGP summary on PE-5 shows the following number of received (Rcv), active (Act), and sent (Sent) BGP routes per address family for neighbor 192.0.2.1. Similar numbers occur on the other RR clients PE-2 and PE-3.

*A:PE-5# show router bgp summary all 

===============================================================================
BGP Summary
===============================================================================
Legend : D - Dynamic Neighbor
===============================================================================
Neighbor
Description
ServiceId          AS PktRcvd InQ  Up/Down   State|Rcv/Act/Sent (Addr Family)
                      PktSent OutQ
-------------------------------------------------------------------------------
192.0.2.1
Def. Instance  64496       69    0 00h01m25s 1/1/1 (IPv4)
                           21    0           1/0/1 (IPv6)
                                             4/3/2 (VpnIPv4)
                                             4/3/2 (VpnIPv6)
                                             1/1/1 (L2VPN)
                                             1/1/0 (FlowIPv4)
                                             1/1/0 (FlowIPv6)

-------------------------------------------------------------------------------

On PE-2, the following BGP IPv4 route is valid, best, and used.

*A:PE-2# show router bgp routes 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>i  172.16.225.0/30                                    100         None
      192.0.2.5                                          None        20
      No As-Path                                                     -
-------------------------------------------------------------------------------
Routes : 1

On PE-2, the following BGP L2 VPN route received from neighbor PE-1 (RR) is valid, best, and used.

*A:PE-2# show router bgp neighbor 192.0.2.1 received-routes l2-vpn 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP L2VPN Routes
===============================================================================
Flag  RouteType                   Prefix                             MED
      RD                          SiteId                             Label
      Nexthop                     VeId                   BlockSize   LocalPref
      As-Path                     BaseOffset             vplsLabelBa 
                                                         se          
-------------------------------------------------------------------------------
u*>i  AutoDiscovery               192.0.2.5              -           0
      64496:2                     -                                  -
      192.0.2.5                   -                      -           100
      No As-Path                  -                      -            
-------------------------------------------------------------------------------
Routes : 1

On PE-2, the following active IPv6 FlowSpec route specifies that all traffic matching the following criteria is dropped (rate limit: 0 kbps): destination prefix 2001:db8::225:2/126, source prefix 2001:db8::112:2/126, IP protocol 6 (TCP), destination port 4191, and source port greater than 1024. This route is generated by the FlowSpec route server connected to PE-1.

*A:PE-2# show router bgp routes flow-ipv6 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP FLOW IPV6 Routes
===============================================================================
Flag  Network             Nexthop                 LocalPref       MED
      As-Path                                                     IGP Cost
-------------------------------------------------------------------------------
u*>i  --                  ::                      100             None
      No As-Path
                                                                  
      Community Action:  rate-limit: 0 kbps
      NLRI Subcomponents:                                         
      Dest Pref : 2001:db8::225:2/126 offset 0
      Src Pref  : 2001:db8::112:2/126 offset 0
      Ip Proto  : [ == 6 ]
      Dest Port : [ == 4191 ]
      Src Port  : [ >1024 ]
-------------------------------------------------------------------------------
Routes : 1
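
The NLRI subcomponents of a FlowSpec route are combined with a logical AND: a packet is subject to the action only if it matches every component. The following Python sketch illustrates this for the route above. It is a simplified model, not the router's filter implementation; the function name matches_flow is made up for illustration, and the prefixes are normalized to their /126 network with strict=False.

```python
import ipaddress

# Match criteria copied from the NLRI subcomponents in the output above.
DEST_PREFIX = ipaddress.ip_network("2001:db8::225:2/126", strict=False)
SRC_PREFIX = ipaddress.ip_network("2001:db8::112:2/126", strict=False)


def matches_flow(dst: str, src: str, ip_proto: int,
                 dst_port: int, src_port: int) -> bool:
    """Return True if a packet matches all components of the FlowSpec route."""
    return (
        ipaddress.ip_address(dst) in DEST_PREFIX
        and ipaddress.ip_address(src) in SRC_PREFIX
        and ip_proto == 6          # Ip Proto  : [ == 6 ]
        and dst_port == 4191       # Dest Port : [ == 4191 ]
        and src_port > 1024        # Src Port  : [ >1024 ]
    )


# Dropped: every component matches.
print(matches_flow("2001:db8::225:2", "2001:db8::112:2", 6, 4191, 2000))  # True
# Not dropped: the source port is not greater than 1024.
print(matches_flow("2001:db8::225:2", "2001:db8::112:2", 6, 4191, 1024))  # False
```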

The following sections describe:

  • Default BGP behavior without GR

  • GR

  • LLGR

Default BGP behavior without GR

The RR PE-1 is isolated from the other PEs by disabling the ports toward PE-2 and PE-3, as follows:

# on PE-1:
configure 
    port 1/1/1
        shutdown
    exit
    port 1/1/2
        shutdown
    exit

All BGP sessions with the BGP peers drop and the BGP peers remove the routes received from RR PE-1; for example, the list of IPv4 routes on PE-2 is empty. The same is true for the other configured address families.

*A:PE-2# show router bgp routes
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
No Matching Entries Found.
===============================================================================

The following BGP summary on PE-2 shows that the session toward PE-4 is established, but the session toward PE-1 is down (state: Connect). A similar output is seen on the other PEs in AS 64496, because all BGP sessions toward the RR are down.

*A:PE-2# show router bgp summary all 

===============================================================================
BGP Summary
===============================================================================
Legend : D - Dynamic Neighbor
===============================================================================
Neighbor
Description
ServiceId          AS PktRcvd InQ  Up/Down   State|Rcv/Act/Sent (Addr Family)
                      PktSent OutQ
-------------------------------------------------------------------------------
192.0.2.1
Def. Instance  64496      108    0 00h01m10s Connect
                            8    0           
192.168.24.2
Def. Instance  64500      137    0 01h05m02s 0/0/0 (IPv4)
                          211    0           0/0/0 (IPv6)
                                             1/1/2 (VpnIPv4)
                                             1/1/2 (VpnIPv6)
                                             0/0/1 (L2VPN)
                                             0/0/0 (FlowIPv4)
                                             0/0/0 (FlowIPv6)

-------------------------------------------------------------------------------

The ports on PE-1 are re-enabled and the BGP routes are re-advertised.

# on PE-1:
configure 
    port 1/1/1
        no shutdown
    exit
    port 1/1/2
        no shutdown
    exit

GR

On all PEs in AS 64496, GR is enabled with a stale-routes time of 150 seconds and with notification enabled, as follows. The default restart time is 300 seconds, but the stale routes are already deleted when the stale-routes time expires after 150 seconds. LLGR is not enabled yet.

# on PE-1, PE-2, PE-3, PE-5:
configure
    router
        bgp 
            group "iBGP"
                graceful-restart
                    stale-routes-time 150
                    enable-notification
                exit
            exit

RR PE-1 is isolated, as follows:

# on PE-1:
configure 
    port 1/1/1
        shutdown
    exit
    port 1/1/2
        shutdown
    exit

When the hold timer expires, the BGP session goes down and both BGP peers, RR PE-1 as well as its clients, enter helper mode. The following debug message is generated on PE-2 if debugging is enabled for graceful restart:

153 2020/02/12 12:50:26.757 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer VR 1: Group iBGP: Peer 192.0.2.1: entering helper mode due to reason hold_timer_expiry
"

Log 99 logs the event as follows:

119 2020/02/12 12:50:26.757 UTC WARNING: BGP #2018 Base VR 1
"(ASN 64496) Peer 1: 192.0.2.1: graceful restart status changed to restarting"

The client PEs do not remove the routes they received from RR PE-1 immediately, but they mark these routes as stale and they keep using them. In the following list of IPv4 unicast routes, the x-status code indicates that the route is stale.

*A:PE-2# show router bgp routes 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>xi 172.16.225.0/30                                    100         None
      192.0.2.5                                          None        20
      No As-Path                                                     -
-------------------------------------------------------------------------------
Routes : 1
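
The flag column can be decoded with the legend printed at the top of the output. The following Python sketch is an illustration only; it assumes that the last character of the flag string is the origin code and that all preceding characters are status codes, which holds for the flag strings shown in this chapter. The function name decode_flags is hypothetical.

```python
# Status and origin codes copied from the legend of "show router bgp routes".
STATUS_CODES = {
    "u": "used", "s": "suppressed", "h": "history", "d": "decayed",
    "*": "valid", "l": "leaked", "x": "stale", ">": "best",
    "b": "backup", "p": "purge",
}
ORIGIN_CODES = {"i": "IGP", "e": "EGP", "?": "incomplete"}


def decode_flags(flags):
    """Split a flag string such as 'u*>xi' into status names and an origin.

    Assumption: the final character is the origin code.
    """
    status, origin = flags[:-1], flags[-1]
    return [STATUS_CODES[c] for c in status], ORIGIN_CODES[origin]


print(decode_flags("u*>xi"))
```

For the route above, the decoded status list contains "stale" in addition to "used", "valid", and "best", which confirms that the stale route is still in use.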

When the BGP sessions are restored and an EOR is received for the AFI/SAFIs, the BGP routes are re-advertised and there are no longer any stale routes. However, if the stale routes timer expires before an EOR is received for the AFI/SAFIs, the stale routes are removed. The following command shows that there are no longer any BGP IPv4 routes on PE-2.

*A:PE-2# show router bgp routes 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
No Matching Entries Found.
===============================================================================

When the stale routes timer expires before an EOR is received for the AFI/SAFIs, the GR phase is terminated and the PE is no longer a GR helper. The following debug messages are logged on PE-2 when debugging is enabled for GR:

154 2020/02/12 12:52:56.757 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
BGP trying to exit helper for peer Peer 1: 192.0.2.1 with reason stale-routes-time expired for all address families
"

155 2020/02/12 12:52:56.757 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
BGP flushing stale routes for peer Peer 1: 192.0.2.1 AF All Address Families
"

156 2020/02/12 12:52:56.758 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer 1: 192.0.2.1: exit helper mode due to reason stale-routes-time expired
"

The following message is logged in log 99 on PE-2.

139 2020/02/12 12:52:56.758 UTC WARNING: BGP #2018 Base VR 1
"(ASN 64496) Peer 1: 192.0.2.1: graceful restart status changed to notHelping"

The situation on PE-1 is restored and the routes are re-advertised.

# on PE-1:
configure 
    port 1/1/1
        no shutdown
    exit
    port 1/1/2
        no shutdown
    exit

In the example, the stale routes time is 150 seconds and the restart time 300 seconds. The helper mode stops when either of these timers expires. When the stale routes time is increased to 400 seconds and the restart time remains 300 seconds, the helper mode will stop when the restart time expires, as shown by the following debug message.

265 2020/02/12 13:40:44.758 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer 1: 192.0.2.1: exit helper mode due to reason restart-time expired
"

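The interaction between the two timers when LLGR is not enabled can be sketched as follows. This Python fragment is an illustration under stated assumptions, not a description of the SR OS implementation: the function name helper_exit is hypothetical, the reason strings are taken from the debug messages above, and the behavior when both timers have the same value is an assumption.

```python
def helper_exit(stale_routes_time, restart_time):
    """Return (seconds_until_exit, reason) for a GR helper without LLGR.

    The helper mode stops when the first of the two timers expires.
    Assumption: when both values are equal, the restart time is reported.
    """
    if stale_routes_time < restart_time:
        return stale_routes_time, "stale-routes-time expired"
    return restart_time, "restart-time expired"


# Values used in this chapter.
print(helper_exit(150, 300))
print(helper_exit(400, 300))
```

With the values of the first example (150 and 300 seconds), the sketch yields an exit after 150 seconds, which corresponds to the interval between the debug messages at 12:50:26 and 12:52:56.
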
LLGR

Initially, LLGR will be configured with the same LLGR-stale time for all the configured AFI/SAFIs, but it is possible to configure LLGR with a different LLGR-stale time per AF. The LLGR-stale time is configured as advertised-stale-time—which is the value that is advertised to the BGP peer—but can be overridden locally without being advertised.

At first, LLGR will be enabled on the "iBGP" group on PE-1, PE-2, PE-3, and PE-5. Later, LLGR will also be enabled on the "eBGP" group on PE-2 and PE-4.

LLGR enabled on iBGP sessions

The following configuration enables LLGR as well as GR in the "iBGP" group on all PEs in AS 64496 for all the already configured AFI/SAFIs.

# on PE-1, PE-2, PE-3, PE-5:
configure
    router
        bgp 
            group "iBGP"
                graceful-restart
                    stale-routes-time 150
                    enable-notification 
                    long-lived
                        advertised-stale-time 3600
                    exit
                exit

Neither GR nor LLGR is enabled in the "eBGP" group on PE-2 and PE-4. This makes no difference for the GR phase on PE-2; only for the LLGR phase.

When the RR PE-1 gets isolated and the hold timer for the BGP session expires, the GR phase starts for the "iBGP" group and the routes received from PE-1 are marked as stale, but remain in use. In the GR phase, the detailed information for the stale IPv4 route 172.16.225.0/30 on PE-2 shows the flags used, valid, best, IGP, and stale (not LLGR-stale), as follows. PE-2 will keep using the stale routes in the GR phase. PE-2 will not withdraw any stale routes and eBGP peer PE-4 remains unaware of the failure.

*A:PE-2# show router bgp routes detail 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
---snip---
===============================================================================
BGP IPv4 Routes
===============================================================================
Original Attributes
 
Network        : 172.16.225.0/30
Nexthop        : 192.0.2.5
Path Id        : None                   
From           : 192.0.2.1
Res. Protocol  : ISIS                   Res. Metric    : 20
Res. Nexthop   : 192.168.23.2
Local Pref.    : 100                    Interface Name : int-PE-2-PE-3
---snip---
Community      : No Community Members
Cluster        : 192.0.2.1
Originator Id  : 192.0.2.5              Peer Router Id : 192.0.2.1
Fwd Class      : None                   Priority       : None
Flags          : Used  Valid  Best  IGP  Stale  
Route Source   : Internal
---snip---

When the stale routes time of 150 seconds expires, the LLGR phase starts. In the route list, the routes keep the stale flag "x", as in the GR phase. The following IPv4 route is marked as stale on PE-2:

*A:PE-2# show router bgp routes 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>xi 172.16.225.0/30                                    100         None
      192.0.2.5                                          None        20
      No As-Path                                                     -
-------------------------------------------------------------------------------
Routes : 1

The detailed information for this route on PE-2 shows the LLGR-stale flag (LlgrStale) instead of the normal stale flag, as follows:

*A:PE-2# show router bgp routes detail 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64496       Local AS:64496      
===============================================================================
---snip---
===============================================================================
BGP IPv4 Routes
===============================================================================
Original Attributes
 
Network        : 172.16.225.0/30
Nexthop        : 192.0.2.5
Path Id        : None                   
From           : 192.0.2.1
Res. Protocol  : ISIS                   Res. Metric    : 20
Res. Nexthop   : 192.168.23.2
Local Pref.    : 100                    Interface Name : int-PE-2-PE-3
---snip---
Community      : No Community Members
Cluster        : 192.0.2.1
Originator Id  : 192.0.2.5              Peer Router Id : 192.0.2.1
Fwd Class      : None                   Priority       : None
Flags          : Used  Valid  Best  IGP  LlgrStale  
---snip---

When debugging is enabled for GR, the following message on PE-2 is generated when the GR phase starts.

873 2020/02/12 10:44:40.453 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer VR 1: Group iBGP: Peer 192.0.2.1: entering helper mode due to reason hold_timer_expiry
"

The following message on PE-2 is generated when the LLGR phase starts.

883 2020/02/12 10:47:10.453 UTC MINOR: DEBUG #2001 Base BGP
"BGP: RESTART
Peer VR 1: Group iBGP: Peer 192.0.2.1: Entering helper, LLGR Phase - reason llgr_Start_on_rtRtTm_pop
"

In the LLGR phase, the following command on PE-2 shows that, for the BGP session with RR PE-1, GR and LLGR are both enabled locally and on the BGP peer, and the GR and LLGR status of the peer PE-1 is "received restart request", so the LLGR phase is ongoing.

The advertised NLRIs of the RR PE-1 and its client PE-2 are similar, so the same AFI/SAFIs (and stale time) have been advertised by PE-2 and received from peer PE-1 for GR, GR notification, and LLGR. LLGR can only work if it is enabled on both BGP peers, which is the case.

*A:PE-2# show router bgp neighbor 192.0.2.1 graceful-restart 

===============================================================================
BGP Neighbor 192.0.2.1 Graceful Restart
===============================================================================
Graceful Restart locally configured for peer: Enabled
GR Notification                             : Enabled
Peer's Graceful Restart feature             : Enabled
NLRI(s) that peer supports restart for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) that peer saved forwarding for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) that restart is negotiated for      : None
NLRI(s) of received end-of-rib markers      : None
NLRI(s) of all end-of-rib markers sent      : None
NLRI(s) peer supports NOTIFICATION GR for   : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6 
Restart time locally configured for peer    : 300 seconds
Restart time requested by the peer          : 300 seconds
Time until stale routes are deleted or        
become long-lived stale                     : 150 seconds
Graceful restart status on the peer         : Rcvd restart request
Long-Lived GR status on the peer            : Rcvd restart request
Number of Restarts                          : 1
Last Restart at                             : 02/12/2020 09:19:52
-------------------------------------------------------------------------------
LLGR Configuration                          : Enabled
Peer's LLGR feature                         : Enabled
NLRI(s) peer signaled LLGR for & stale time
& F-bit                                     : ipv4 : 3600 seconds (F)
                                              ipv6 : 3600 seconds (F)
                                              vpn-ipv4 : 3600 seconds (F)
                                              vpn-ipv6 : 3600 seconds (F)
                                              l2-vpn : 3600 seconds (F)
                                              flow-ipv4 : 3600 seconds (F)
                                              flow-ipv6 : 3600 seconds (F)
NLRI(s) LLGR negotiated for and stale time  : ipv4 : 3600 seconds
                                              ipv6 : 3600 seconds
                                              vpn-ipv4 : 3600 seconds
                                              vpn-ipv6 : 3600 seconds
                                              l2-vpn : 3600 seconds
                                              flow-ipv4 : 3600 seconds
                                              flow-ipv6 : 3600 seconds
LLGR Restart time overridden for the peer   : n/a
NLRI(s) LLGR advertised & stale time & F-bit: ipv4 : 3600 seconds
                                              ipv6 : 3600 seconds
                                              vpn-ipv4 : 3600 seconds
                                              vpn-ipv6 : 3600 seconds
                                              l2-vpn : 3600 seconds
                                              flow-ipv4 : 3600 seconds
                                              flow-ipv6 : 3600 seconds
===============================================================================
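
The timer values in this output can be related to the debug timestamps shown earlier with some simple arithmetic. The following Python sketch is an illustration only; it assumes that the session is not restored, that the LLGR phase starts when the stale routes time expires, and that the LLGR-stale time is counted from the start of the LLGR phase. The function name llgr_timeline is hypothetical.

```python
from datetime import datetime, timedelta


def llgr_timeline(gr_start, stale_routes_time, llgr_stale_time):
    """Return (llgr_start, llgr_end) for a helper that enters GR at gr_start.

    Assumptions: the peer does not come back, the GR phase lasts
    stale_routes_time seconds, and the LLGR phase then lasts
    llgr_stale_time seconds.
    """
    llgr_start = gr_start + timedelta(seconds=stale_routes_time)
    llgr_end = llgr_start + timedelta(seconds=llgr_stale_time)
    return llgr_start, llgr_end


# GR phase start taken from the debug message on PE-2.
gr_start = datetime(2020, 2, 12, 10, 44, 40, 453000)
llgr_start, llgr_end = llgr_timeline(gr_start, 150, 3600)
print(llgr_start.time(), llgr_end.time())
```

The computed start of the LLGR phase, 10:47:10.453, matches the timestamp of the debug message that reports the start of the LLGR phase.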

On PE-2, the following command shows that GR and LLGR are disabled for the eBGP session with PE-4.

*A:PE-2# show router bgp neighbor 192.168.24.2 graceful-restart 

===============================================================================
BGP Neighbor 192.168.24.2 Graceful Restart
===============================================================================
Graceful Restart locally configured for peer: Disabled
GR Notification                             : Disabled
Peer's Graceful Restart feature             : Disabled
NLRI(s) that peer supports restart for      : None
NLRI(s) that peer saved forwarding for      : None
NLRI(s) that restart is negotiated for      : None
NLRI(s) of received end-of-rib markers      : ipv4 ipv6
NLRI(s) of all end-of-rib markers sent      : ipv4 ipv6
NLRI(s) peer supports NOTIFICATION GR for   : None
Restart time locally configured for peer    : 120 seconds
Restart time requested by the peer          : 0 seconds
Time until stale routes are deleted or
become long-lived stale                     : 360 seconds
Graceful restart status on the peer         : Not currently being helped
Long-Lived GR status on the peer            : Not currently being helped
Number of Restarts                          : 0
Last Restart at                             : Never
-------------------------------------------------------------------------------
LLGR Configuration                          : Disabled
Peer's LLGR feature                         : Disabled
NLRI(s) peer signaled LLGR for & stale time   
& F-bit                                     : n/a
NLRI(s) LLGR negotiated for and stale time  : n/a
LLGR Restart time overridden for the peer   : n/a
NLRI(s) LLGR advertised & stale time & F-bit: n/a
===============================================================================

In the LLGR phase, the stale routes remain stale, but are depreferenced. In this example, there are no alternative routes with a better preference, so the stale routes remain valid, best, and used. Traffic between PE-2, PE-3, and PE-5 is still uninterrupted.
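
The effect of depreferencing can be sketched as a route selection in which LLGR-stale routes are considered only after all other routes. The following Python fragment is a strongly simplified illustration, not the BGP decision process of SR OS: it compares only the LLGR-stale state and the local preference, and the Route class, the select_best function, and the alternative peer 192.0.2.9 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Route:
    peer: str
    local_pref: int
    llgr_stale: bool = False


def select_best(routes):
    """Pick the best route, treating LLGR-stale routes as least preferred.

    False sorts before True, so any non-stale route beats any LLGR-stale
    route; the local preference is only used as a tie-breaker.
    """
    return min(routes, key=lambda r: (r.llgr_stale, -r.local_pref))


stale = Route(peer="192.0.2.1", local_pref=100, llgr_stale=True)
# Without an alternative, the LLGR-stale route remains the best route.
print(select_best([stale]).peer)
# A hypothetical alternative wins even with a lower local preference.
print(select_best([stale, Route(peer="192.0.2.9", local_pref=50)]).peer)
```

In the example setup, only the first case applies, because the RR PE-1 is the only source of these routes.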

However, the eBGP session between PE-2 and PE-4 does not have LLGR enabled. In the LLGR phase, the LLGR-stale routes are immediately withdrawn by PE-2; for example, the following BGP update sent to PE-4 withdraws the IPv4 route 172.16.225.0/30 and the VPN-IPv4 routes. Therefore, VPN traffic can no longer be exchanged between VPRN 1 on PE-3 (or PE-5) and VPRN 1 on PE-4.

884 2020/02/12 10:47:10.458 UTC MINOR: DEBUG #2001 Base Peer 1: 192.168.24.2
"Peer 1: 192.168.24.2: UPDATE
Peer 1: 192.168.24.2 - Send BGP UPDATE:
    Withdrawn Length = 5
        172.16.225.0/30
    Total Path Attr Length = 39
    Flag: 0x90 Type: 15 Len: 35 Multiprotocol Unreachable NLRI:
        Address Family VPN_IPV4
        172.16.5.0/30 RD 64496:1 Label 0
        172.16.3.0/30 RD 64496:1 Label 0
"

Even though GR is also disabled for the eBGP session between PE-2 and PE-4, the routes are only withdrawn in the LLGR phase, not in the GR phase. GR is meant for short interruptions where the GR helper PE-2 pretends that the situation is normal and traffic can be forwarded based on stale routes, while LLGR is meant for longer failures and the neighbors need to be informed.

The ports on PE-1 are re-enabled and the routes are re-advertised followed by an EOR per AFI/SAFI, which terminates the LLGR phase.

# on PE-1:
configure 
    port 1/1/1
        no shutdown
    exit
    port 1/1/2
        no shutdown
    exit

LLGR enabled on eBGP session

On PE-2 and PE-4, GR and LLGR are enabled for the "eBGP" group, as follows:

# on PE-2, PE-4:
configure
    router
        bgp
            group "eBGP"
                graceful-restart
                    stale-routes-time 150
                    enable-notification
                    long-lived
                        advertised-stale-time 3600
                    exit
                exit
            exit

In the LLGR phase, PE-2 re-advertises the routes it sent to PE-4, but with the well-known community llgr-stale. PE-4 was unaware of the GR phase; it only gets involved in the LLGR phase. The following BGP update was sent by PE-2 to its eBGP peer PE-4 for the IPv4 address family:

339 2020/02/12 12:43:06.007 UTC MINOR: DEBUG #2001 Base Peer 1: 192.168.24.2
"Peer 1: 192.168.24.2: UPDATE
Peer 1: 192.168.24.2 - Send BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 27
    Flag: 0x40 Type: 1 Len: 1 Origin: 0
    Flag: 0x40 Type: 2 Len: 6 AS Path:
        Type: 2 Len: 1 < 64496 >
    Flag: 0x40 Type: 3 Len: 4 Nexthop: 192.168.24.1
    Flag: 0xc0 Type: 8 Len: 4 Community:
        llgr-stale
    NLRI: Length = 5
        172.16.225.0/30
"

PE-4 does not mark the route as stale in the way that PE-2 does; the BGP route does not get the stale flag "x", as follows:

*A:PE-4# show router bgp routes        
===============================================================================
 BGP Router ID:192.0.2.4        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>i  172.16.225.0/30                                    None        None
      192.168.24.1                                       None        0
      64496                                                          -
-------------------------------------------------------------------------------
Routes : 1

The detailed information for this route on PE-4 shows the community llgr-stale, but no LLGR-stale flag, as follows:

*A:PE-4# show router bgp routes detail 
===============================================================================
 BGP Router ID:192.0.2.4        AS:64500       Local AS:64500      
===============================================================================
---snip---
===============================================================================
BGP IPv4 Routes
===============================================================================
Original Attributes
 
Network        : 172.16.225.0/30
Nexthop        : 192.168.24.1
Path Id        : None                   
From           : 192.168.24.1
Res. Protocol  : LOCAL                  Res. Metric    : 0
Res. Nexthop   : 192.168.24.1
Local Pref.    : n/a                    Interface Name : int-PE-4-PE-2
---snip---
Community      : llgr-stale
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.2
Fwd Class      : None                   Priority       : None
Flags          : Used  Valid  Best  IGP  
Route Source   : External
AS-Path        : 64496 
---snip---
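
The two behaviors observed on the eBGP session, withdrawal when LLGR is not enabled and re-advertisement with the llgr-stale community when it is, can be summarized in a small decision function. The following Python sketch is an illustration only; the function name export_action is hypothetical, and the sketch assumes the default setting no advertise-stale-to-all-neighbors.

```python
LLGR_STALE = "llgr-stale"


def export_action(route_communities, peer_llgr_negotiated):
    """Decide what a helper sends to a peer for a route that became LLGR-stale.

    Returns ("withdraw", None) for a peer without LLGR, or
    ("advertise", communities) with llgr-stale added for a peer with LLGR.
    """
    if not peer_llgr_negotiated:
        return "withdraw", None
    communities = list(route_communities)
    if LLGR_STALE not in communities:
        communities.append(LLGR_STALE)
    return "advertise", communities


print(export_action([], peer_llgr_negotiated=False))
print(export_action([], peer_llgr_negotiated=True))
```

The first call corresponds to the situation where LLGR is only enabled on the iBGP sessions; the second call corresponds to the situation where LLGR is also enabled in the "eBGP" group.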

Per-AF LLGR

The following configuration on PE-2 enables GR for the same address families as before, while LLGR will only be applied for IPv4 FlowSpec and IPv6 FlowSpec, with different LLGR-stale times. The default LLGR-stale time—advertised-stale-time—is 86400 seconds, but helper-override-stale-time 0 in the iBGP group context overrides the LLGR-stale time to a zero value for the iBGP group. For FlowSpec routes, the advertised-stale-time is set to a value of 20000 seconds. For IPv4 FlowSpec, the helper-override-stale-time is set to 2000 seconds; for IPv6 FlowSpec, it is set to 3000 seconds. The forwarding bit is only set for non-forwarding AFs—forwarding-bits-set non-fwd—so it will be set for configuration routes, such as FlowSpec routes.

# on PE-2:
configure
    router
        bgp
            group "iBGP"
                family ipv4 ipv6 vpn-ipv4 vpn-ipv6 l2-vpn flow-ipv4 flow-ipv6
                graceful-restart
                    stale-routes-time 150
                    enable-notification
                    long-lived
                        advertised-stale-time 86400         # default
                        helper-override-stale-time 0
                        family flow-ipv4
                            advertised-stale-time 20000
                            helper-override-stale-time 2000
                        exit
                        family flow-ipv6
                            advertised-stale-time 20000
                            helper-override-stale-time 3000
                        exit
                        forwarding-bits-set non-fwd
                        no advertise-stale-to-all-neighbors     # default
                    exit
                exit
                peer-as 64496
                neighbor 192.0.2.1
                exit

With this configuration on PE-2, the LLGR phase will be reduced to zero seconds for all AFs except IPv4 FlowSpec and IPv6 FlowSpec, but the helper-override-stale-time is not advertised to the BGP peer; only the advertised-stale-time is advertised. The GR phase applies for all configured address families with the same timers. When the BGP configuration on PE-1 is preserved and LLGR is enabled for the same address families, the following command shows the GR information on PE-2 for peer PE-1.

*A:PE-2# show router bgp neighbor 192.0.2.1 graceful-restart 

===============================================================================
BGP Neighbor 192.0.2.1 Graceful Restart
===============================================================================
Graceful Restart locally configured for peer: Enabled
GR Notification                             : Enabled
Peer's Graceful Restart feature             : Enabled
NLRI(s) that peer supports restart for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) that peer saved forwarding for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) that restart is negotiated for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) of received end-of-rib markers      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) of all end-of-rib markers sent      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) peer supports NOTIFICATION GR for   : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
Restart time locally configured for peer    : 300 seconds
Restart time requested by the peer          : 300 seconds
Time until stale routes are deleted or        
become long-lived stale                     : 150 seconds
Graceful restart status on the peer         : Not currently being helped
Long-Lived GR status on the peer            : Not currently being helped
Number of Restarts                          : 0
Last Restart at                             : Never
-------------------------------------------------------------------------------
LLGR Configuration                          : Enabled
Peer's LLGR feature                         : Enabled
NLRI(s) peer signaled LLGR for & stale time   
& F-bit                                     : ipv4 : 3600 seconds (F)
                                              ipv6 : 3600 seconds (F)
                                              vpn-ipv4 : 3600 seconds (F)
                                              vpn-ipv6 : 3600 seconds (F)
                                              l2-vpn : 3600 seconds (F)
                                              flow-ipv4 : 3600 seconds (F)
                                              flow-ipv6 : 3600 seconds (F)
NLRI(s) LLGR negotiated for and stale time  : ipv4 : 0 seconds
                                              ipv6 : 0 seconds
                                              vpn-ipv4 : 0 seconds
                                              vpn-ipv6 : 0 seconds
                                              l2-vpn : 0 seconds
                                              flow-ipv4 : 2000 seconds
                                              flow-ipv6 : 3000 seconds
LLGR Restart time overridden for the peer   : n/a
NLRI(s) LLGR advertised & stale time & F-bit: ipv4 : 86400 seconds
                                              ipv6 : 86400 seconds
                                              vpn-ipv4 : 86400 seconds
                                              vpn-ipv6 : 86400 seconds
                                              l2-vpn : 86400 seconds(F)
                                              flow-ipv4 : 20000 seconds(F)
                                              flow-ipv6 : 20000 seconds(F)
===============================================================================

LLGR is enabled on PE-1 and PE-2. BGP peer PE-1 has signaled LLGR-stale times of 3600 seconds for the supported AFI/SAFIs. PE-2 has advertised the default LLGR-stale time of 86400 seconds for all supported AFI/SAFIs except for the FlowSpec AFI/SAFIs, where the LLGR-stale time is 20000 seconds. On PE-2, the F-bit is only set for the non-forwarding routes; in this case, L2 VPN, IPv4 FlowSpec, and IPv6 FlowSpec.

The helper-override-stale-time is not advertised to the BGP peer, but it determines the local LLGR behavior, as shown in the "NLRI(s) LLGR negotiated for and stale time" field of the PE-2 output. Only the FlowSpec AFs get a non-zero LLGR-stale time: 2000 seconds for IPv4 FlowSpec and 3000 seconds for IPv6 FlowSpec.
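
The relationship between the signaled and the locally applied stale times can be summarized in a minimal Python sketch. It is an illustrative model, not SR OS code: it assumes that a helper-override-stale-time, when present for an address family, simply replaces the value signaled by the peer. The function and variable names are hypothetical, and the override values are inferred from the PE-2 output above rather than taken from a known configuration.

```python
# Illustrative model only; names are hypothetical and this is not SR OS code.
def effective_stale_times(peer_signaled, helper_override):
    """Return the LLGR-stale time applied locally per address family.

    Assumption: a helper override, when present for a family, replaces the
    value signaled by the peer; otherwise the signaled value is used.
    """
    return {
        family: helper_override.get(family, signaled)
        for family, signaled in peer_signaled.items()
    }


families = ["ipv4", "ipv6", "vpn-ipv4", "vpn-ipv6",
            "l2-vpn", "flow-ipv4", "flow-ipv6"]

# PE-1 signaled 3600 seconds for every family.
peer_signaled = {family: 3600 for family in families}

# Overrides inferred from the negotiated values in the PE-2 output.
helper_override = {family: 0 for family in families}
helper_override.update({"flow-ipv4": 2000, "flow-ipv6": 3000})

applied = effective_stale_times(peer_signaled, helper_override)

# Families with a non-zero stale time are the only ones LLGR can protect.
protected = [family for family, seconds in applied.items() if seconds > 0]
print(protected)
```

In this model, only flow-ipv4 and flow-ipv6 end up with a non-zero locally applied stale time, which matches the negotiated values in the PE-2 output.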

The following GR/LLGR information on peer PE-1 shows the advertised LLGR-stale time, not the helper-override-stale-time configured on PE-2.

*A:PE-1# show router bgp neighbor 192.0.2.2 graceful-restart 

===============================================================================
BGP Neighbor 192.0.2.2 Graceful Restart
===============================================================================
Graceful Restart locally configured for peer: Enabled
GR Notification                             : Enabled
Peer's Graceful Restart feature             : Enabled
NLRI(s) that peer supports restart for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) that peer saved forwarding for      : l2-vpn flow-ipv4 flow-ipv6
NLRI(s) that restart is negotiated for      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) of received end-of-rib markers      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) of all end-of-rib markers sent      : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
NLRI(s) peer supports NOTIFICATION GR for   : ipv4 ipv6 vpn-ipv4 vpn-ipv6
                                              l2-vpn flow-ipv4 flow-ipv6
Restart time locally configured for peer    : 300 seconds
Restart time requested by the peer          : 300 seconds
Time until stale routes are deleted or        
become long-lived stale                     : 150 seconds
Graceful restart status on the peer         : Restart completed
Long-Lived GR status on the peer            : Restart completed
Number of Restarts                          : 4
Last Restart at                             : 02/12/2020 14:21:25
-------------------------------------------------------------------------------
LLGR Configuration                          : Enabled
Peer's LLGR feature                         : Enabled
NLRI(s) peer signaled LLGR for & stale time   
& F-bit                                     : ipv4 : 86400 seconds
                                              ipv6 : 86400 seconds
                                              vpn-ipv4 : 86400 seconds
                                              vpn-ipv6 : 86400 seconds
                                              l2-vpn : 86400 seconds (F)
                                              flow-ipv4 : 20000 seconds (F)
                                              flow-ipv6 : 20000 seconds (F)
NLRI(s) LLGR negotiated for and stale time  : ipv4 : 86400 seconds
                                              ipv6 : 86400 seconds
                                              vpn-ipv4 : 86400 seconds
                                              vpn-ipv6 : 86400 seconds
                                              l2-vpn : 86400 seconds
                                              flow-ipv4 : 20000 seconds
                                              flow-ipv6 : 20000 seconds
LLGR Restart time overridden for the peer   : n/a
NLRI(s) LLGR advertised & stale time & F-bit: ipv4 : 3600 seconds(F)
                                              ipv6 : 3600 seconds(F)
                                              vpn-ipv4 : 3600 seconds(F)
                                              vpn-ipv6 : 3600 seconds(F)
                                              l2-vpn : 3600 seconds(F)
                                              flow-ipv4 : 3600 seconds(F)
                                              flow-ipv6 : 3600 seconds(F)
===============================================================================

With this configuration, only the FlowSpec routes can get the LLGR-stale flag. No LLGR phase will start for the other AFs, so the stale routes of those AFs will be withdrawn when the GR phase ends.
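
The resulting timeline per address family can be sketched as follows. This is a simplified Python model under stated assumptions: the session stays down, the GR phase lasts for the 300-second restart time shown in the output, and it is followed by an LLGR phase that lasts for the locally applied LLGR-stale time, where a stale time of 0 means that no LLGR phase starts. The function name and the sample time instants are hypothetical.

```python
# Illustrative model only; it assumes the BGP session does not re-establish.
def route_state(elapsed, restart_time, llgr_stale_time):
    """Classify a stale route `elapsed` seconds after the session went down."""
    if elapsed < restart_time:
        return "stale"          # GR phase: route is stale but still in use
    if elapsed < restart_time + llgr_stale_time:
        return "llgr-stale"     # LLGR phase: route is long-lived stale
    return "withdrawn"          # no protection left for this family


RESTART_TIME = 300  # seconds, as shown in the show command output

# Locally applied LLGR-stale times on PE-2 (ipv4 stands for all non-FlowSpec AFs).
applied_stale_time = {"ipv4": 0, "flow-ipv4": 2000, "flow-ipv6": 3000}

for family, stale_time in applied_stale_time.items():
    states = [route_state(t, RESTART_TIME, stale_time) for t in (100, 400, 2500)]
    print(family, states)
```

In the same model, a restart time of 0 makes the first branch unreachable, so a family with a non-zero stale time is long-lived stale from the moment the session goes down.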

The GR restart time can be overridden on PE-2 so that the LLGR phase starts immediately, without going through the GR phase, as follows:

# on PE-2:
configure 
    router 
        bgp 
            group iBGP 
                graceful-restart 
                    long-lived 
                        helper-override-restart-time 0

When the BGP session goes down on PE-2, the GR phase is omitted because the restart time of zero seconds expires instantly, so the LLGR phase starts immediately, as follows.

*A:PE-2# show router bgp neighbor 192.0.2.1 graceful-restart 

===============================================================================
BGP Neighbor 192.0.2.1 Graceful Restart
===============================================================================
Graceful Restart locally configured for peer: Enabled
GR Notification                             : Enabled
Peer's Graceful Restart feature             : Enabled
---snip---
Restart time locally configured for peer    : 300 seconds
---snip---
Graceful restart status on the peer         : Restart completed
Long-Lived GR status on the peer            : Rcvd restart request
---snip---

-------------------------------------------------------------------------------
---snip---

LLGR Restart time overridden for the peer   : 0
---snip---

When the LLGR phase starts immediately, only the FlowSpec address families are protected, while all routes of the other AFs are withdrawn. The FlowSpec routes get the LLGR-stale flag, and route updates sent to eBGP peer PE-4 carry the LLGR-stale community, as follows:

*A:PE-2# show router bgp routes flow-ipv4 hunt 
===============================================================================
---snip---
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
---snip---
From           : 192.0.2.1
---snip---
Flags          : Used  Valid  Best  IGP  LlgrStale  
---snip---
 
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
---snip---
To             : 192.168.24.2
---snip---
Community      : llgr-stale rate-limit: 0 kbps
---snip--- 
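
The handling of LLGR-stale routes seen in the RIB In and RIB Out entries above can be sketched in Python. This is a simplified model, not the SR OS implementation: best-path selection is reduced to "non-stale before stale, then highest local preference", routes carrying the NO_LLGR community are assumed not to be retained, and the LLGR-stale community is added only when a stale route is advertised. The prefixes, the second peer 192.0.2.3, and all function names are hypothetical.

```python
from dataclasses import dataclass, field

LLGR_STALE = "llgr-stale"
NO_LLGR = "no-llgr"


@dataclass
class Route:
    prefix: str
    peer: str
    local_pref: int = 100
    communities: set = field(default_factory=set)
    llgr_stale: bool = False


def mark_llgr_stale(routes, failed_peer):
    """Flag routes from the failed peer; drop those tagged NO_LLGR."""
    kept = []
    for route in routes:
        if route.peer == failed_peer:
            if NO_LLGR in route.communities:
                continue  # assumption: such routes are not retained
            route.llgr_stale = True
        kept.append(route)
    return kept


def select_best(candidates):
    """Simplified selection: any non-stale route beats an LLGR-stale one."""
    return max(candidates, key=lambda r: (not r.llgr_stale, r.local_pref))


def advertised_communities(route):
    """Communities sent to peers; stale routes also carry llgr-stale."""
    extra = {LLGR_STALE} if route.llgr_stale else set()
    return route.communities | extra


rib = [
    Route("flow-A", "192.0.2.1", local_pref=200),
    Route("flow-B", "192.0.2.1", local_pref=200),
    Route("flow-B", "192.0.2.3", local_pref=100),  # hypothetical second peer
    Route("flow-C", "192.0.2.1", communities={NO_LLGR}),
]
rib = mark_llgr_stale(rib, failed_peer="192.0.2.1")

for prefix in ("flow-A", "flow-B"):
    best = select_best([r for r in rib if r.prefix == prefix])
    print(prefix, best.peer, sorted(advertised_communities(best)))
```

In this sketch, flow-A has no alternative path, so the LLGR-stale route stays best and is advertised with the llgr-stale community, whereas flow-B falls back to the non-stale route even though its local preference is lower.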

Conclusion

Graceful restart helpers avoid withdrawing BGP routes immediately when the BGP session goes down. Routes that were received from the failed router are marked as stale, but remain in use. When the BGP session stays down for a longer time, such as hours or days, LLGR can take over when the GR phase ends, possibly only for a subset of AFI/SAFIs. In the LLGR phase, the LLGR-stale routes are depreferenced, but if they remain best and valid, they can be re-advertised to the BGP peers as LLGR-stale.