BGP Segment Routing Using the Prefix SID Attribute

This chapter describes BGP Segment Routing using the prefix SID attribute.

Applicability

The information and configuration in this chapter are based on SR OS Release 23.3.R1. BGP Segment Routing (SR) is supported in SR OS Release 19.10.R1 and later.

Overview

Segment Routing (SR) has become a foundational technology for Software-Defined Networking (SDN) in Wide Area Networks (WANs). Also, SR is being extended beyond WAN borders into Data Centers (DCs).

SR allows an ingress node to route a packet from the source, by prepending an SR header containing an ordered list of segment identifiers (SIDs). A SID represents a topological or service-based instruction. A SID can have a local meaning for one specific node, or a global meaning within the SR domain, such as the instruction to forward a packet on the Equal-Cost Multipath (ECMP) aware shortest path to reach some prefix.

In WANs, infrastructure IP reachability is nearly always conveyed by an IGP, such as OSPF or IS-IS, but in large-scale DCs, BGP has become the protocol of choice. In a typical DC design, BGP is used for endpoint reachability, as follows:

  • Each node (Top of Rack (TOR), leaf, spine, and so on) has its own Autonomous System (AS).

  • Each node has an eBGP session to each of its directly connected peers.

  • Each node originates the IPv4 (or IPv6) address of its loopback interface into BGP and announces it to its neighbors.

To extend SR-MPLS into DCs that use this type of BGP design, the SR OS nodes must advertise their loopback IP prefix in a BGP labeled-unicast (BGP-LU) IPv4 route with a prefix SID attribute. The prefix SID attribute is ignored when attached to other types of BGP routes, including BGP-LU IPv6 routes, but it is still propagated.

A BGP prefix SID is always a global SID within the SR domain and identifies an instruction to forward the packet along the ECMP-aware BGP-computed best paths to reach the prefix. The BGP prefix SID attribute can also help to create SR paths that transit across multiple administrative domains that do not share IGP SR topology information.

Figure 1 shows a node in AS 64501 advertising a BGP-LU IPv4 route for prefix 10.0.0.1/32 with SID 20101. The SR-capable nodes forward packets with SID 20101 via the best BGP path to 10.0.0.1, using any of the available multipaths computed by BGP.

Figure 1. BGP-LU IPv4 route with prefix SID BGP path attribute

The BGP prefix SID attribute with type code 40 is an optional and transitive BGP path attribute, meaning that the attribute is expected to be propagated by routers that do not recognize the type value. When SR is deployed using an MPLS dataplane (SR-MPLS), the BGP prefix SID encodes:

  • A 32-bit label-index Type-Length-Value (TLV) (mandatory TLV)

  • An originator Segment Routing Global Block (SRGB) TLV containing one or more SRGB fields (optional TLV). If the SRGB field occurs multiple times in the SRGB TLV, the SRGB space of the ingress node consists of multiple ranges that are concatenated.

Figure 2 shows that node PE-1 exports a BGP-LU IPv4 route with prefix 10.0.0.1/32 and label 20101. The BGP prefix SID attribute is attribute type 40 and contains an SR label index of 1 and the originator SRGB with start label 20100 and size 100 (from 20100 to 20199). Node PE-2 imports the BGP-LU IPv4 route and exports it to the next node.

Figure 2. BGP signaling overview
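The relationship between a label index and an originator SRGB that consists of one or more concatenated ranges can be sketched as follows. This is a minimal illustration, not SR OS code: the function name label_from_srgb is hypothetical, and the sketch assumes that the ranges form a single index space in the order in which they are listed. The two-range SRGB in the second call is invented for illustration only.

```python
from typing import List, Optional, Tuple


def label_from_srgb(ranges: List[Tuple[int, int]], label_index: int) -> Optional[int]:
    """Map a label index onto a list of (start_label, size) SRGB ranges.

    The ranges are treated as one concatenated index space, in the order
    given. Returns None if the index does not fit in the total space.
    """
    if label_index < 0:
        return None
    remaining = label_index
    for start_label, size in ranges:
        if remaining < size:
            return start_label + remaining
        remaining -= size  # skip this range and continue in the next one
    return None


# Originator SRGB from the example: start label 20100, size 100.
print(label_from_srgb([(20100, 100)], 1))                  # 20101
# Hypothetical SRGB with two ranges: index 120 falls in the second range.
print(label_from_srgb([(20100, 100), (30000, 50)], 120))   # 30020
```

With a single range, the lookup reduces to adding the label index to the start label.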

To add, replace, or process a BGP prefix SID, SR must be administratively enabled in the bgp context. Configuring the prefix-sid-range is mandatory. The BGP prefix SID range can be set to either global (that is, equal to the SRGB that is also used by SR-OSPF or SR-ISIS and is defined in the router "Base" mpls-labels sr-labels context) or a subset of the SRGB, defined by the start-label command in combination with max-index. All BGP prefix SID values must reside within the global SRGB; otherwise, the start-label command fails.

To originate BGP SR prefixes, two policies with an sr-label-index action are required. The two policies can be identical or different:

  • route-table-import policy-name <policy-name> used to populate a local BGP-SR table with an SR label index

  • export policy [<policy>] to advertise a prefix to a neighbor with an SR label index

In the example topology used in this chapter, the import and export policies are identical and have an entry with action-type accept and sr-label-index value 1. On PE-1, the prefix SID for prefix 10.0.0.1/32 therefore equals 20101, which is the sum of the start label of the prefix SID range (20100) and the SR label index (1).
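The range check and the label derivation described above can be sketched in a few lines. This is an illustrative model only: the class PrefixSidRange and its methods are hypothetical names, and the sketch assumes inclusive bounds, so that max-index 99 yields the 100 labels 20100 to 20199. The SRGB bounds 20000 and 20999 and the max-index of 99 are the values used in the configuration section of this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixSidRange:
    """BGP prefix SID range: labels start_label .. start_label + max_index."""
    start_label: int
    max_index: int

    def fits_in(self, srgb_start: int, srgb_end: int) -> bool:
        """True if every label of the range lies inside the SRGB (inclusive)."""
        return (self.max_index >= 0
                and srgb_start <= self.start_label
                and self.start_label + self.max_index <= srgb_end)

    def label_for(self, label_index: int) -> int:
        """Prefix SID label for a label index: start label plus index."""
        if not 0 <= label_index <= self.max_index:
            raise ValueError(
                f"label index {label_index} outside 0..{self.max_index}")
        return self.start_label + label_index


sid_range = PrefixSidRange(start_label=20100, max_index=99)
print(sid_range.fits_in(20000, 20999))  # True
print(sid_range.label_for(1))           # 20101 (10.0.0.1/32 on PE-1)
print(sid_range.label_for(4))           # 20104 (10.0.0.4/32 on PE-4)
```

A range that extends beyond the SRGB, for example a hypothetical start label 20950 with max-index 99, fails the fits_in check, which mirrors the rule that all BGP prefix SID values must reside within the SRGB.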

A unique label index value must be assigned to each different IPv4 prefix that is advertised with a BGP prefix SID. However, in case of a conflict with another SR-programmed Label Forwarding Information Base (LFIB) entry, the conflict situation is addressed as follows:

  • If the conflict is with another BGP-LU IPv4 route for a different prefix with a prefix SID attribute, all the conflicting BGP-LU IPv4 routes for both prefixes are advertised with normal BGP-LU labels from the dynamic label range, not from the dedicated SR label range.

  • If the conflict is with an IGP route and the route-table-import policy action does not contain the prefer-igp option in the sr-label-index command, the BGP-LU IPv4 route loses to the IGP route and is advertised with a normal BGP-LU label from the dynamic label range.

  • If the conflict is with an IGP route and the route-table-import policy action contains the prefer-igp option in the sr-label-index command, this is not considered a conflict and BGP uses the IGP-signaled label index to derive its advertised label. This stitches the BGP SR tunnel to the IGP SR tunnel.
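The three conflict cases in the preceding list can be summarized as a small decision function. This is a sketch of the rules as stated in this chapter, not the SR OS implementation; the enum members, the function name, and the returned strings are all hypothetical labels chosen for illustration.

```python
from enum import Enum


class Conflict(Enum):
    NONE = "no conflict"
    OTHER_BGP_LU_PREFIX = "BGP-LU IPv4 route for a different prefix"
    IGP_ROUTE = "IGP route"


def advertised_label_source(conflict: Conflict, prefer_igp: bool = False) -> str:
    """Return where the advertised label comes from for a BGP-LU IPv4 route."""
    if conflict is Conflict.NONE:
        # Label derived from the BGP prefix SID range and the label index.
        return "bgp-prefix-sid-range"
    if conflict is Conflict.OTHER_BGP_LU_PREFIX:
        # All conflicting routes fall back to normal BGP-LU labels.
        return "dynamic-label-range"
    if conflict is Conflict.IGP_ROUTE:
        # With prefer-igp, BGP reuses the IGP-signaled label index (stitching).
        return "igp-label-index" if prefer_igp else "dynamic-label-range"
    raise ValueError(f"unknown conflict type: {conflict!r}")


print(advertised_label_source(Conflict.OTHER_BGP_LU_PREFIX))         # dynamic-label-range
print(advertised_label_source(Conflict.IGP_ROUTE))                   # dynamic-label-range
print(advertised_label_source(Conflict.IGP_ROUTE, prefer_igp=True))  # igp-label-index
```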

Stitching of SR-ISIS or SR-OSPF to SR-BGP is one of the main advantages of implementing SR-BGP.

Any /32 BGP-LU IPv4 route containing a prefix SID attribute is resolvable and usable in the same way as /32 BGP-LU IPv4 routes without a prefix SID attribute. The routes can be installed in the route table and tunnel table, have ECMP next hops or FRR backup next hops, and can be used as transport tunnels.

Receiving a /32 BGP-LU IPv4 route with a prefix SID attribute does not create a tunnel in the SR database; it only creates a label swap entry when the route is re-advertised with a new next hop. This means that the first SID in any SID list of an SR policy should not be based on a BGP prefix SID because the data path would not be programmed correctly. However, the BGP prefix SID can be used as a non-first SID in any SR policy.

Each node capable of receiving and propagating the BGP prefix SID attribute can be configured with the block-prefix-sid command at the BGP global, group, or neighbor configuration levels to:

  • block the propagation of the attribute outside its local SR domain

  • block inbound propagation of the attribute from another SR domain

When block-prefix-sid applies to a BGP session, the prefix SID attribute is stripped from all sent and received routes on that session, even if the prefix SID attribute was added to the outbound routes by the local router. By default, this feature is not configured, so the prefix SID is propagated freely to and from all BGP peers.

Configuration

Figure 3 shows the example topology with four nodes in different ASs. The loopback addresses 10.0.0.1/32 on PE-1 and 10.0.0.4/32 on PE-4 are exported in BGP-LU IPv4 routes with a prefix SID attribute.

Figure 3. Example topology

The initial configuration includes:

  • Cards, MDAs, ports

  • Router interfaces

  • eBGP sessions for the label-IPv4 address family

  • PE-3 and PE-4 have ecmp and multipath max-paths set to 2 for BGP address family label-ipv4

No IGP is configured, so neither SR-OSPF nor SR-ISIS can be used.

Configure BGP segment routing using prefix SID

BGP SR is enabled on all PEs. Also, the SRGB is configured and the BGP SR labels are defined as a subset of the SRGB, as follows:

# on PE-1, PE-2, PE-3, PE-4:
configure exclusive
    router "Base" {
        mpls-labels {
            sr-labels {
                start 20000
                end 20999
            }
        }
        bgp {
            segment-routing {
                admin-state enable
                prefix-sid-range {
                    start-label 20100
                    max-index 99
                }
            }
        }
    }

It is possible to define different policies with the sr-label-index action for importing and exporting the prefixes, but in this example, the same policy is used. The following policy is used for exporting and importing prefix 10.0.0.1/32 on PE-1:

# on PE-1:
configure exclusive
    policy-options {
        prefix-list "10.0.0.1/32" {
            prefix 10.0.0.1/32 type exact {
            }
        }
        policy-statement "prefix-sid-1" {
            entry 10 {
                from {
                    prefix-list ["10.0.0.1/32"]
                }
                action {
                    action-type accept
                    sr-label-index {
                        value 1
                    }
                }
            }
        }
    }

Likewise, PE-4 exports prefix 10.0.0.4/32 with SR label index value 4, resulting in a BGP prefix SID 20104 (start label 20100 + index 4 = 20104).

The route-table-import policy-name command is used to populate a local BGP-SR table with SR label 20101 (20100 + 1 = 20101), as follows:

# on PE-1:
configure exclusive
    router "Base" {
        bgp {
            rib-management {
                label-ipv4 {
                    route-table-import {
                        policy-name "prefix-sid-1"
                    }
                }
            }
        }
    }

The export policy is configured in the BGP group, as follows:

# on PE-1:
configure exclusive
    router "Base" {
        bgp {
            group "eBGP" {
                family {
                    label-ipv4 true
                }
                export {
                    policy ["prefix-sid-1"]
                }
            }
            neighbor "192.168.12.2" {
                group "eBGP"
                peer-as 64502
            }
        }
    }

The following show commands display the BGP-SR table on the different PEs:

[/]
A:admin@PE-1# show router bgp sr-label

===============================================================================
BGP SR labels
Flags: B - entry has backup next-hop, E - entry has ECMP next-hops
===============================================================================
Prefix                                            Advertised  Received    Flags
                                                    Label       Label     
-------------------------------------------------------------------------------
10.0.0.1/32                                       20101       -           -
10.0.0.4/32                                       20104       20104       -
-------------------------------------------------------------------------------
Total Labels allocated:   2
===============================================================================
[/]
A:admin@PE-2# show router bgp sr-label

===============================================================================
BGP SR labels
Flags: B - entry has backup next-hop, E - entry has ECMP next-hops
===============================================================================
Prefix                                            Advertised  Received    Flags
                                                    Label       Label     
-------------------------------------------------------------------------------
10.0.0.1/32                                       20101       20101       -
10.0.0.4/32                                       20104       20104       -
-------------------------------------------------------------------------------
Total Labels allocated:   2
===============================================================================
[/]
A:admin@PE-3# show router bgp sr-label

===============================================================================
BGP SR labels
Flags: B - entry has backup next-hop, E - entry has ECMP next-hops
===============================================================================
Prefix                                            Advertised  Received    Flags
                                                    Label       Label     
-------------------------------------------------------------------------------
10.0.0.1/32                                       20101       20101       -
10.0.0.4/32                                       20104       20104       E
-------------------------------------------------------------------------------
Total Labels allocated:   2
===============================================================================
[/]
A:admin@PE-4# show router bgp sr-label

===============================================================================
BGP SR labels
Flags: B - entry has backup next-hop, E - entry has ECMP next-hops
===============================================================================
Prefix                                            Advertised  Received    Flags
                                                    Label       Label     
-------------------------------------------------------------------------------
10.0.0.1/32                                       20101       20101       E
10.0.0.4/32                                       20104       -           -
-------------------------------------------------------------------------------
Total Labels allocated:   2
===============================================================================

Because PE-3 and PE-4 have ECMP and BGP multipath configured, traffic flows can be load-balanced over two links. The E flag in the last column indicates that ECMP next hops are available for prefix 10.0.0.4/32 on PE-3 and for prefix 10.0.0.1/32 on PE-4.

The tunnel table on PE-1 shows that a tunnel with ID 262145 is available toward destination 10.0.0.4/32:

[/]
A:admin@PE-1# show router tunnel-table

===============================================================================
IPv4 Tunnel Table (Router: Base)
===============================================================================
Destination           Owner     Encap TunnelId  Pref   Nexthop        Metric
   Color                                                              
-------------------------------------------------------------------------------
10.0.0.4/32           bgp       MPLS  262145    12     192.168.12.2   1000
-------------------------------------------------------------------------------
Flags: B = BGP or MPLS backup hop available
       L = Loop-Free Alternate (LFA) hop available
       E = Inactive best-external BGP route
       k = RIB-API or Forwarding Policy backup hop
===============================================================================

The FP-tunnel table provides more information about the label (20104) and next hop (192.168.12.2):

[/]
A:admin@PE-1# show router fp-tunnel-table 1

===============================================================================
IPv4 Tunnel Table Display

Legend: 
label stack is ordered from bottom-most to top-most
B - FRR Backup
===============================================================================
Destination                                  Protocol         Tunnel-ID
  Lbl/SID                                                      
    NextHop                                                   Intf/Tunnel
  Lbl/SID (backup)                                            
    NextHop   (backup)                                        
-------------------------------------------------------------------------------
10.0.0.4/32                                  BGP               -
  20104
    192.168.12.2                                             1/1/c1/1:100
-------------------------------------------------------------------------------
Total Entries : 1
-------------------------------------------------------------------------------
===============================================================================

On PE-2, two tunnels are available: one toward destination 10.0.0.1/32 with SR label 20101 and another toward destination 10.0.0.4/32 with SR label 20104:

[/]
A:admin@PE-2# show router fp-tunnel-table 1

===============================================================================
IPv4 Tunnel Table Display

Legend: 
label stack is ordered from bottom-most to top-most
B - FRR Backup
===============================================================================
Destination                                  Protocol         Tunnel-ID
  Lbl/SID                                                      
    NextHop                                                   Intf/Tunnel
  Lbl/SID (backup)                                            
    NextHop   (backup)                                        
-------------------------------------------------------------------------------
10.0.0.1/32                                  BGP               -
  20101
    192.168.12.1                                             1/1/c1/2:100
10.0.0.4/32                                  BGP               -
  20104
    192.168.23.2                                             1/1/c1/1:100
-------------------------------------------------------------------------------
Total Entries : 2
-------------------------------------------------------------------------------
===============================================================================

On PE-3, two tunnel entries with a total of three next hops are available: one toward destination 10.0.0.1/32 with SR label 20101 and one toward destination 10.0.0.4/32 with SR label 20104 and two ECMP next hops:

[/]
A:admin@PE-3# show router fp-tunnel-table 1

===============================================================================
IPv4 Tunnel Table Display

Legend: 
label stack is ordered from bottom-most to top-most
B - FRR Backup
===============================================================================
Destination                                  Protocol         Tunnel-ID
  Lbl/SID                                                      
    NextHop                                                   Intf/Tunnel
  Lbl/SID (backup)                                            
    NextHop   (backup)                                        
-------------------------------------------------------------------------------
10.0.0.1/32                                  BGP               -
  20101
    192.168.23.1                                             1/1/c1/2:100
10.0.0.4/32                                  BGP               -
  20104
    192.168.34.2                                             1/1/c1/1:100
  20104
    192.168.34.6                                             1/1/c1/3:100
-------------------------------------------------------------------------------
Total Entries : 2
-------------------------------------------------------------------------------
===============================================================================

On PE-4, one tunnel entry with two ECMP next hops is available toward destination 10.0.0.1/32 with SR label 20101:

[/]
A:admin@PE-4# show router fp-tunnel-table 1

===============================================================================
IPv4 Tunnel Table Display

Legend: 
label stack is ordered from bottom-most to top-most
B - FRR Backup
===============================================================================
Destination                                  Protocol         Tunnel-ID
  Lbl/SID                                                      
    NextHop                                                   Intf/Tunnel
  Lbl/SID (backup)                                            
    NextHop   (backup)                                        
-------------------------------------------------------------------------------
10.0.0.1/32                                  BGP               -
  20101
    192.168.34.1                                             1/1/c1/2:100
  20101
    192.168.34.5                                             1/1/c1/3:100
-------------------------------------------------------------------------------
Total Entries : 1
-------------------------------------------------------------------------------
===============================================================================

PE-1 advertised a BGP-LU IPv4 route for prefix 10.0.0.1/32 with label 20101 to PE-2. The following command on PE-2 shows the received route:

[/]
A:admin@PE-2# show router bgp routes 10.0.0.1/32 label-ipv4
===============================================================================
 BGP Router ID:192.0.2.2        AS:64502       Local AS:64502      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP LABEL-IPV4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>i  10.0.0.1/32                                        None        None
      192.168.12.1                                       None        0
      64501                                                          20101
-------------------------------------------------------------------------------
Routes : 1
===============================================================================

This route is advertised to PE-3 and finally to PE-4. The following command on PE-4 shows two BGP-LU IPv4 routes for prefix 10.0.0.1/32 with label 20101: one with next hop 192.168.34.1 and another one with next hop 192.168.34.5.

[/]
A:admin@PE-4# show router bgp routes 10.0.0.1/32 label-ipv4
===============================================================================
 BGP Router ID:192.0.2.4        AS:64504       Local AS:64504      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP LABEL-IPV4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>i  10.0.0.1/32                                        None        None
      192.168.34.1                                       None        0
      64503 64502 64501                                              20101
u*>i  10.0.0.1/32                                        None        None
      192.168.34.5                                       None        0
      64503 64502 64501                                              20101
-------------------------------------------------------------------------------
Routes : 2
===============================================================================

The detailed output for the BGP-LU IPv4 routes on PE-4 shows the prefix SID attribute with index 1 and originator SRGB with start label 20100 and size 100, as follows:

[/]
A:admin@PE-4# show router bgp routes 10.0.0.1/32 label-ipv4 detail
===============================================================================
 BGP Router ID:192.0.2.4        AS:64504       Local AS:64504      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP LABEL-IPV4 Routes
===============================================================================
Original Attributes
 
Network        : 10.0.0.1/32
Nexthop        : 192.168.34.1
Path Id        : None                   
From           : 192.168.34.1
Res. Nexthop   : 192.168.34.1
Local Pref.    : n/a                    Interface Name : int-PE-4-PE-3
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 0
Connector      : None
Community      : No Community Members
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.3
Fwd Class      : None                   Priority       : None
IPv4 Label     : 20101                  
Flags          : Used Valid Best IGP In-TTM In-RTM 
Route Source   : External
AS-Path        : 64503 64502 64501 
Route Tag      : 0                      
Neighbor-AS    : 64503
DB Orig Val    : NotFound               Final Orig Val : N/A
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
RIB Priority   : Normal                 
Last Modified  : 00h01m18s              
Prefix SID     : index 1, originator-srgb [20100/100]
---snip---
-------------------------------------------------------------------------------
Routes : 2
===============================================================================

The following debug message shows how the prefix SID attribute is advertised in a BGP update:

18 2023/04/17 18:16:28.069 CEST MINOR: DEBUG #2001 Base Peer 1: 192.168.34.1
"Peer 1: 192.168.34.1: UPDATE
Peer 1: 192.168.34.1 - Received BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 66
    Flag: 0x90 Type: 14 Len: 17 Multiprotocol Reachable NLRI:
        Address Family LBL-IPV4
        NextHop len 4 NextHop 192.168.34.1
        10.0.0.1/32 Label 20101
    Flag: 0x40 Type: 1 Len: 1 Origin: 0
    Flag: 0x40 Type: 2 Len: 14 AS Path:
        Type: 2 Len: 3 < 64503 64502 64501 >
    Flag: 0xc0 Type: 40 Len: 21 Prefix-SID-attr:
       Label Index TLV (10 bytes):-
          flags: 0x0  label Index: 1
       Originator SRGB TLV (11 bytes):-
          flags: 0x0  start_label: 20100  num_label: 100
"
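To relate the debug output to the on-the-wire encoding, the following sketch decodes the value of a prefix SID attribute into its Label-Index and Originator SRGB TLVs. It assumes the TLV layout defined in RFC 8669 (1-byte type, 2-byte length, then the TLV body) and is not taken from SR OS. The function name decode_prefix_sid is hypothetical, and the sample bytes are reconstructed from the values in the debug message, not captured from the wire.

```python
import struct

LABEL_INDEX_TLV = 1       # TLV type for Label-Index
ORIGINATOR_SRGB_TLV = 3   # TLV type for Originator SRGB


def decode_prefix_sid(value: bytes) -> dict:
    """Decode the value field of a BGP prefix SID attribute (type 40)."""
    result = {}
    offset = 0
    while offset + 3 <= len(value):
        tlv_type, tlv_len = struct.unpack_from(">BH", value, offset)
        body = value[offset + 3: offset + 3 + tlv_len]
        if len(body) != tlv_len:
            raise ValueError("truncated TLV")
        if tlv_type == LABEL_INDEX_TLV:
            if tlv_len != 7:
                raise ValueError("Label-Index TLV body must be 7 bytes")
            # 1 reserved byte, 2 flag bytes, 4-byte label index.
            _reserved, _flags, index = struct.unpack(">BHI", body)
            result["label_index"] = index
        elif tlv_type == ORIGINATOR_SRGB_TLV:
            if tlv_len < 2 or (tlv_len - 2) % 6 != 0:
                raise ValueError("malformed Originator SRGB TLV")
            # 2 flag bytes, then (3-byte base, 3-byte range) pairs.
            srgbs = []
            for pos in range(2, tlv_len, 6):
                base = int.from_bytes(body[pos:pos + 3], "big")
                size = int.from_bytes(body[pos + 3:pos + 6], "big")
                srgbs.append((base, size))
            result["originator_srgb"] = srgbs
        offset += 3 + tlv_len  # unknown TLV types are skipped
    if offset != len(value):
        raise ValueError("trailing bytes after last TLV")
    return result


# Sample value rebuilt from the debug output: index 1, SRGB 20100/100.
sample = (struct.pack(">BHBHI", LABEL_INDEX_TLV, 7, 0, 0, 1)
          + struct.pack(">BHH", ORIGINATOR_SRGB_TLV, 8, 0)
          + (20100).to_bytes(3, "big") + (100).to_bytes(3, "big"))
print(len(sample))                # 21
print(decode_prefix_sid(sample))  # {'label_index': 1, 'originator_srgb': [(20100, 100)]}
```

The 21-byte length matches the Len: 21 value in the debug message, which is the sum of the 10-byte Label Index TLV and the 11-byte Originator SRGB TLV.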

Configure VPRN

Figure 4 shows the example topology with a basic VPRN service to demonstrate the end-to-end control plane signaling and data plane verification.

Figure 4. Example topology with VPRN 1

A BGP multi-hop session for address family VPN-IPv4 is configured between the GRT loopback addresses 10.0.0.1/32 on PE-1 and 10.0.0.4/32 on PE-4. On PE-1, the additional BGP configuration is as follows:

# on PE-1:
configure exclusive
    router "Base" {
        bgp {
            group "eBGP-VPN" {
                family {
                    vpn-ipv4 true
                }
            }
            neighbor "10.0.0.4" {
                group "eBGP-VPN"
                multihop 64
                local-address 10.0.0.1
                peer-as 64504
            }
        }
    }

In addition, the VPRN 1 service has loopback addresses 192.168.1.1/32 on PE-1 and 192.168.1.4/32 on PE-4. The configuration on PE-1 is as follows:

# on PE-1:
configure exclusive
    service {
        vprn "VPRN 1" {
            admin-state enable
            service-id 1
            customer "1"
            bgp-ipvpn {
                mpls {
                    admin-state enable
                    route-distinguisher "1:1"
                    vrf-target {
                        community "target:1:1"
                    }
                    auto-bind-tunnel {
                        resolution any
                    }
                }
            }
            interface "lo1" {
                loopback true
                ipv4 {
                    primary {
                        address 192.168.1.1
                        prefix-length 32
                    }
                }
            }
        }
    }

The configuration on PE-4 is similar.

The following VPN-IPv4 route is received on PE-1:

[/]
A:admin@PE-1# show router bgp routes vpn-ipv4
===============================================================================
 BGP Router ID:192.0.2.1        AS:64501       Local AS:64501      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP VPN-IPv4 Routes
===============================================================================
Flag  Network                                            LocalPref   MED
      Nexthop (Router)                                   Path-Id     IGP Cost
      As-Path                                                        Label
-------------------------------------------------------------------------------
u*>i  4:1:192.168.1.4/32                                 None        None
      10.0.0.4                                           None        0
      64504                                                          524286
-------------------------------------------------------------------------------
Routes : 1
===============================================================================

The route table for VPRN 1 on PE-1 is as follows:

[/]
A:admin@PE-1# show router 1 route-table

===============================================================================
Route Table (Service: 1)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric   
-------------------------------------------------------------------------------
192.168.1.1/32                                Local   Local     00h01m31s  0
       lo1                                                          0
192.168.1.4/32                                Remote  BGP VPN   00h01m14s  170
       10.0.0.4 (tunneled:BGP)                                      1000
-------------------------------------------------------------------------------
No. of Routes: 2
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

Conclusion

With BGP SR, it is possible to use SR without an IGP (for example, to cross AS boundaries). It is also possible to stitch SR-IGP and SR-BGP tunnels together. BGP SR uses the prefix SID attribute.