BNG UPF resiliency

Resiliency based on Fate Sharing Group

The BNG CPF groups the sessions in Fate Sharing Groups (FSGs). All sessions in an FSG share their fate, that is, they become active or standby together. The BNG CPF provides the following parameters to the BNG UPF per FSG:

  • FSG ID
  • status (active or standby)
  • unique FSG MAC address, also known as a virtual MAC
  • list of associated sessions and one or more aggregate routes associated with these sessions. For more information, see IP gateway, services, and routing.
  • FSG template

When the BNG CPF does not provide an FSG template, the template with the name default is used. If there is no default template, the setup of the FSG and any associated session fails.
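
The template fallback described above can be sketched as follows. This is a minimal illustration, not the BNG UPF implementation: the function name, the exception, and the dictionary of configured templates are hypothetical, and the handling of a named template that is not configured is an assumption because the text only covers the case where no template is provided.

```python
from typing import Optional


class FsgSetupError(Exception):
    """Hypothetical error: setup of the FSG and its sessions fails."""


def resolve_fsg_template(requested: Optional[str], configured: dict) -> dict:
    """Return the FSG template to use for a new FSG.

    requested  -- template name provided by the BNG CPF, or None
    configured -- locally configured templates, keyed by name (hypothetical)
    """
    # No template provided by the BNG CPF: fall back to the one named "default".
    name = requested if requested is not None else "default"
    if name not in configured:
        # Documented for a missing "default" template; assumed here for a
        # named template that is not configured.
        raise FsgSetupError(f"FSG template {name!r} is not configured")
    return configured[name]
```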

To configure FSG templates, use the configure subscriber-mgmt up-resiliency fate-sharing-group-template command.

In the remainder of this section, the terms active BNG UPF and standby BNG UPF refer to the status of a BNG UPF for a single FSG. Each BNG UPF can have multiple FSGs and can have a different status for each FSG.

To attract traffic from the access network, an active BNG UPF replies to ARP requests or ND messages for any IP gateway associated with the FSG. A standby BNG UPF never replies to those ARP or ND messages. To expedite convergence upon switching from standby to active, the new active BNG UPF sends Gratuitous ARP (GARP) messages using one of the IP gateway addresses for the FSG, or the system IP address if no IP gateway is known.
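
The ARP and GARP behavior above can be summarized in a short sketch. The function names are hypothetical, and picking the first known gateway address is an assumption: the text only states that one of the IP gateway addresses is used.

```python
from typing import Sequence


def replies_to_arp_or_nd(status: str) -> bool:
    """Only the active BNG UPF answers ARP requests or ND messages for the FSG."""
    return status == "active"


def garp_source_address(gateway_addresses: Sequence[str], system_ip: str) -> str:
    """Pick the address used in GARP messages after a standby-to-active switch."""
    if gateway_addresses:
        # Assumption: any known IP gateway address is acceptable; use the first.
        return gateway_addresses[0]
    # No IP gateway is known for the FSG: fall back to the system IP address.
    return system_ip
```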

To configure the granularity of GARP messages for q-in-q SAPs, use the configure subscriber-mgmt up-resiliency fate-sharing-group-template gratuitous-arp command. You can configure the BNG UPF to send a single GARP message per SAP or per outer tag.

To correctly draw traffic to the active BNG UPF, the state command in the configure policy-options policy-statement entry from context supports the fsg-active and fsg-standby options. All routes received from PFCP, including per-session framed routes, carry one of these values as a parameter. You can match on this parameter in routing export policies to adjust route attributes; for example, set a metric or a preference that suits the routing protocol in use.

The following reduced configuration example shows a simplified policy that sets a metric of 100 for active routes and a metric of 200 for standby routes.

[gl:/configure policy-options policy-statement "upf_resiliency_aware_export"]
A:admin@BNG-UPF# info
    entry 20 {
        from {
            origin pfcp
            state fsg-active
        }
        action {
            action-type accept
            metric {
                set 100
            }
        }
    }
    entry 30 {
        from {
            origin pfcp
            state fsg-standby
        }
        action {
            action-type accept
            metric {
                set 200
            }
        }
    }
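
The matching logic of this example policy can be modeled as follows. This is only an illustrative sketch of how the two entries map the FSG state to a metric; the Route class and the function name are hypothetical, and the policy's default action is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    prefix: str
    origin: str  # for example "pfcp"
    state: str   # "fsg-active" or "fsg-standby" for routes received from PFCP


# Mirrors entries 20 and 30 of the example policy, in order.
POLICY_ENTRIES = [
    {"origin": "pfcp", "state": "fsg-active", "metric": 100},
    {"origin": "pfcp", "state": "fsg-standby", "metric": 200},
]


def export_metric(route: Route) -> Optional[int]:
    """Return the metric set by the first matching entry, or None if none match."""
    for entry in POLICY_ENTRIES:
        if route.origin == entry["origin"] and route.state == entry["state"]:
            return entry["metric"]
    return None  # what happens to unmatched routes is outside this sketch
```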

An active BNG UPF always forwards traffic in both directions. It uses the FSG MAC as the source MAC for downlink unicast traffic. By default, a standby BNG UPF forwards downlink traffic using its local port MAC as the source MAC and drops all received uplink traffic. You can modify the default behavior in the following ways:
  • To shunt downlink traffic from the standby to the active BNG UPF and have the active BNG UPF forward that downlink traffic, do the following:
    • Configure a redundant interface using the configure subscriber-mgmt up-resiliency fate-sharing-group-template redundant-interface command.
    • Configure the same shunt ID on the active and the standby BNG UPF using the configure service ies subscriber-mgmt multi-chassis-shunt-id or configure service vprn subscriber-mgmt multi-chassis-shunt-id command depending on the service (IES or VPRN).
  • To enable forwarding of uplink traffic by the standby BNG UPF, use the configure subscriber-mgmt up-resiliency fate-sharing-group-template uplink-forwarding-while-standby command.
    Caution: Enabling the uplink-forwarding-while-standby command can lead to packet replication toward the core network. To prevent the possibility of packet replication toward the core network, provision the access network not to replicate unicast packets to the BNG UPF.
When the standby BNG UPF forwards uplink traffic, it can significantly lower packet loss during transition scenarios. The following examples illustrate this benefit:
  • If the current active BNG UPF fails and an access node detects this faster than the BNG CPF (for example, using BFD), the access node can start sending packets to the standby BNG UPF before that BNG UPF has become active. When the uplink-forwarding-while-standby command is enabled, the uplink packets are not lost because the standby BNG UPF forwards them.
  • During scheduled maintenance, the BNG CPF can switch the roles of the active and standby BNG UPF while both are healthy. For some time, the access node continues to send packets to the previously active BNG UPF that has become the standby BNG UPF. When the uplink-forwarding-while-standby command is enabled, the uplink packets are not lost because the previously active BNG UPF still forwards them. When the uplink-forwarding-while-standby command is disabled, the previously active BNG UPF drops the packets until the access node learns the path to the new active BNG UPF (using GARP).
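
The forwarding rules above can be condensed into a small decision sketch. The class and function names are hypothetical, and the redundant interface and matching shunt ID are collapsed into a single flag for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FsgForwardingOptions:
    # True when the redundant interface and a matching shunt ID are configured.
    downlink_shunting: bool = False
    # True when uplink-forwarding-while-standby is enabled.
    uplink_forwarding_while_standby: bool = False


def uplink_action(status: str, options: FsgForwardingOptions) -> str:
    """Describe how a BNG UPF treats received uplink traffic for an FSG."""
    if status == "active" or options.uplink_forwarding_while_standby:
        return "forward"
    return "drop"


def downlink_action(status: str, options: FsgForwardingOptions) -> str:
    """Describe how a BNG UPF treats downlink traffic for an FSG."""
    if status == "active":
        return "forward with FSG MAC as source MAC"
    if options.downlink_shunting:
        return "shunt to active BNG UPF"
    return "forward with local port MAC as source MAC"
```

With the default options, a standby BNG UPF drops uplink traffic, which corresponds to the packet loss window in the scheduled maintenance example.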

Resiliency based on FSG does not use SRRP, but the system internally consumes an SRRP instance for each unique combination of FSG, port, and group interface template. To avoid potential conflicts with pre-configured SRRP instance IDs, define a range of SRRP instance IDs for the inter-UPF resiliency functionality using the configure redundancy srrp auto-srrp-id-range command.
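
When sizing the SRRP instance ID range, it can help to estimate how many instances are consumed. The following sketch counts unique combinations; the function name and the tuple layout are hypothetical and only serve as a planning aid.

```python
from typing import Iterable, Tuple

# (FSG ID, port, group interface template name); a hypothetical representation.
Binding = Tuple[int, str, str]


def srrp_instances_needed(bindings: Iterable[Binding]) -> int:
    """Count SRRP instances consumed: one per unique (FSG, port, template)."""
    return len(set(bindings))
```

For example, two FSGs that each use the same two ports and one group interface template consume four SRRP instances.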

BNG UPF health reporting

The BNG UPF can send health reports to the BNG CPF using PFCP Node Report messages. The BNG CPF uses the health reports to determine the need for a BNG UPF status change (active or standby). Per FSG, the BNG CPF selects the active and the standby BNG UPF. For example, the BNG CPF can base its decision on link failures in the access network.

The BNG UPF supports health reports for the following contexts:
  • per network instance

    Configure the health monitoring using the commands in the configure service ies subscriber-mgmt up-resiliency or configure service vprn subscriber-mgmt up-resiliency context, depending on the service. The health reports per network instance can, for example, be used to indicate the status of the network where the subscriber is serviced.

  • per Layer 2 access ID

    Configure the health monitoring using the commands in the configure service vpls capture-sap pfcp up-resiliency context. The health reports per Layer 2 access ID can, for example, be used to indicate the status of the access links.

Each health report carries a single-byte health value between 0 (unhealthy) and 255 (healthy). The base health value is 255; it decreases by the number of failed members in the operational group multiplied by the health drop number configured for that operational group.

Whenever a member of the operational group changes its state (fails or recovers), the BNG UPF calculates the health value and sends an updated report to the BNG CPF.
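
The health calculation can be expressed as a short function. This is a sketch of the rule described above, not the BNG UPF implementation: the function name is hypothetical, and flooring the result at 0 is an assumption derived from the stated value range of 0 to 255.

```python
BASE_HEALTH = 255  # health value when no member of the operational group has failed


def health_value(failed_members: int, health_drop: int = 255) -> int:
    """Compute the single-byte health value for one monitored operational group.

    failed_members -- number of members of the operational group that are down
    health_drop    -- configured health drop number (default 255, per this section)
    """
    if failed_members < 0 or health_drop < 0:
        raise ValueError("failed_members and health_drop must not be negative")
    # Assumption: the value is floored at 0 so that it stays within one byte.
    return max(0, BASE_HEALTH - failed_members * health_drop)
```

With a health drop of 51, one failed member yields 204 and five failed members yield 0. With the default health drop of 255, a single failure already yields 0.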

To configure the operational group and the health drop number, use the monitor-oper-group and the monitor-oper-group health-drop commands in the contexts listed above.

For more information about operational groups, see 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 3 Services Guide: IES and VPRN, sections Object grouping and state monitoring.

With the following example configuration, the BNG UPF sends health reports for Layer 2 access ID (port) lag-access. The operational group has five members (port 1/1/20 to port 1/1/24) and the health value decreases by 51 per failed member, that is, by 20% of the base health value.

[gl:/configure port 1/1/20]
A:admin@BNG-UPF# info
    oper-group "lag_access_health"
[gl:/configure port 1/1/21]
A:admin@BNG-UPF# info
    oper-group "lag_access_health"
[gl:/configure port 1/1/22]
A:admin@BNG-UPF# info
    oper-group "lag_access_health"
[gl:/configure port 1/1/23]
A:admin@BNG-UPF# info
    oper-group "lag_access_health"
[gl:/configure port 1/1/24]
A:admin@BNG-UPF# info
    oper-group "lag_access_health"
[gl:/configure service vpls "access" capture-sap lag-access:*.* pfcp up-resiliency]
A:admin@BNG-UPF# info
    monitor-oper-group "lag_access_health" {
        health-drop 51
    }

With the following example configuration, the BNG UPF sends health reports for network instance HSI. The health drop number is not configured, so the default value of 255 is used. The health is based on a BFD session that checks whether the BNG UPF is isolated from the rest of the network. When the BFD session is up, the health value equals 255; otherwise, the health value equals 0.

[gl:/configure service oper-group "hsi-bfd"]
A:admin@BNG-UPF# info
    bfd-liveness {
        router-instance "to_uplink_router"
        interface-name "endpoint"
        dest-ip 203.0.113.10
    }
[gl:/configure service vprn "hsi" subscriber-mgmt up-resiliency]
A:admin@BNG-UPF# info
    monitor-oper-group "hsi-bfd" {
    }

The BNG UPF sends a health report for every status change in the operational group. Additionally, it sends all health reports periodically (every 60 seconds) and when a PFCP audit is requested.