Geo-redundancy

Two MAG-c systems can be deployed in a geo-redundant configuration. If one system fails, the other system ensures uninterrupted service and minimizes the impact of failure for broadband clients.

Geo-redundancy overview

Geo-redundancy is the deployment of two MAG-c systems, a primary and a secondary system, in a redundant configuration. If the primary system fails, the secondary system continues the service and minimizes the impact of failure for broadband clients.

The session states are synchronized between the two systems so that the secondary system can operate at full state after switchover.

The user configures the administrative primary or secondary role for each system. The operational active and standby roles are determined at run time. The algorithm ensures that only one system has the active role at a specific time.

Both systems get the same service address (for example, the PFCP address) and advertise their route with different metrics. Geo-redundancy is integrated in the routing protocols so that the active system always attracts the CP traffic.

The following figure shows a geo-redundant MAG-c deployment.
Figure 1. Geo-redundant MAG-c deployment

Operational and administrative roles

Each system in a geo-redundant deployment has one of the following operational roles at runtime:
  • active – attracts and processes the traffic
  • standby – takes over from the active system when required

Only one system is active and only the active system processes traffic.

Each system also has one of the following configured administrative roles:
  • primary
  • secondary
The primary system is preferred over the secondary when the active role is assigned.

The administrative and operational roles represent two different concepts; that is, a primary system can be standby, and a secondary system can be active.

The following events trigger a switchover of the operational roles:

  • failure of the active system
  • execution of the following command for a manual switchover:
    admin redundancy mc-mobile-switchover

The mc-mobile protocol runs between the two systems to select the active system. The standby system detects a failure of the active system when mc-mobile goes down. Optionally, a BFD session can be bound to mc-mobile to speed up the failure detection.

Use the following command to display the administrative and operational roles:
show redundancy multi-chassis mc-mobile peer

Traffic detection

A network issue that brings down the mc-mobile protocol can trigger an unwanted switchover even though the active system has not failed. To avoid a switchover in this scenario, the standby system listens for incoming traffic (traffic detection) before it switches to the active role.

Use the following command to configure the traffic detection behavior:
configure redundancy multi-chassis peer mc-mobile traffic-detection
When traffic detection is set to relaxed, the standby system changes to the active role only when it receives a PFCP or IBCP packet.
Use the following command to configure traffic detection on the active system:
configure redundancy multi-chassis peer mc-mobile master-traffic-detection
When the preceding command is enabled, the active system performs traffic detection to avoid an active/active scenario when mc-mobile is down.
Note: Nokia recommends setting traffic-detection to relaxed and master-traffic-detection to enable.
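
The standby-side decision for the relaxed setting can be sketched as follows. This is an illustration in Python, not the MAG-c implementation: the function name and its parameters are hypothetical, and only the relaxed behavior described in this section is modeled.

```python
# Illustrative sketch only; names are hypothetical, not MAG-c internals.

# Packet types that this section names as relevant for traffic detection.
DETECTION_PROTOCOLS = {"PFCP", "IBCP"}


def standby_should_become_active(mc_mobile_up, traffic_detection, received_packets):
    """Return True if the standby system should take the active role.

    mc_mobile_up      -- True while the mc-mobile protocol to the peer is up
    traffic_detection -- configured mode; only "relaxed" is modeled here
    received_packets  -- protocol names of packets seen during detection
    """
    if mc_mobile_up:
        # The peer is still reachable, so no failure has been detected.
        return False
    if traffic_detection == "relaxed":
        # Switch only when CP traffic actually arrives at this system.
        return any(p in DETECTION_PROTOCOLS for p in received_packets)
    # Other settings are not described in this section, so the sketch
    # refuses to guess their behavior.
    raise ValueError("traffic_detection setting not modeled in this sketch")
```

For example, `standby_should_become_active(False, "relaxed", ["IBCP"])` returns `True`, while the same call with an empty packet list returns `False`, which corresponds to a network issue on the mc-mobile path only.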

If the number of BNG-UPs is small, ensure that the MAG-c receives PFCP or IBCP packets during traffic detection by decreasing the PFCP heartbeat timer on the BNG-UP and increasing the traffic detection time on the MAG-c.

Use the following command to decrease the PFCP heartbeat timer on the BNG-UP:
  • MD-CLI
    configure subscriber-mgmt pfcp association heartbeat
  • classic CLI
    configure subscriber-mgmt pfcp-association heartbeat
Use the following command to increase the traffic detection timer on the MAG-c:
configure redundancy multi-chassis peer mc-mobile traffic-detection-poll-timer

State synchronization

The session states are synchronized between the active and standby systems to ensure continuity of service after a switchover.

The system can be in one of the following synchronization states:

  • hot – all session states are synchronized
  • warm – synchronization is ongoing
  • cold – no session states are synchronized
Use the following command to configure the session state synchronization:
configure redundancy multi-chassis peer mc-mobile mc-complete-ue-sync
Use the following command to display the synchronization state:
show redundancy multi-chassis mc-mobile

Routing

The active system attracts traffic by advertising its route with a better metric than the standby system. Both systems advertise the same route, with a metric that depends on their operational state. To advertise different metrics, use the following command to configure a route policy that exports the route, with a separate entry and metric for each state:
configure router policy-options policy-statement entry
The options for the from state command in the preceding context include the following:
  • mobile-master

    Routes associated with an active system match this entry.

  • mobile-slave

    Routes associated with the following systems match this entry:

    • standby system when mc-mobile is down, when shunting is configured but down, or when shunting is not configured (see Shunting)
      Note: When mc-mobile is up and shunting is up, the route on the standby system does not match this entry.
    • active system during manual switchover after the active system switches to standby, but before the route is withdrawn from the active system
  • mobile-pre-slave

    Routes associated with an active system during a manual switchover before the active system switches to standby match this entry.
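
The matching rules above can be summarized in a small lookup function. This is a sketch in Python written from the rules in this section; the function name, the parameters, and the phase labels are hypothetical and do not correspond to any MAG-c API.

```python
# Illustrative sketch only; names and phase labels are hypothetical.

def matching_state(role, mc_mobile_up=True, shunting_configured=False,
                   shunting_up=False, own_switchover_phase=None):
    """Return the "from state" value that a system's route matches, or None.

    role                 -- "active" or "standby" (current operational role)
    own_switchover_phase -- for the system on which a manual switchover was
                            started: "before-role-change",
                            "after-role-change", or None otherwise
    """
    if role == "active":
        if own_switchover_phase == "before-role-change":
            # Manual switchover started, role not yet changed.
            return "mobile-pre-slave"
        return "mobile-master"
    if role != "standby":
        raise ValueError("role must be 'active' or 'standby'")
    if own_switchover_phase == "after-role-change":
        # Formerly active system, route not yet withdrawn.
        return "mobile-slave"
    if mc_mobile_up and shunting_configured and shunting_up:
        # Per the note above, the standby route does not match mobile-slave.
        return None
    return "mobile-slave"
```

A standby system with mc-mobile up and shunting up returns `None` here, which reflects only that its route matches none of the three states listed in this section.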

The configured metrics on the primary and secondary systems for each state must meet the following requirements:

  • The metric values for each state must be assigned from best to worst in the following order:
    1. primary active (best metric value)
    2. secondary active
    3. primary standby
    4. secondary standby (worst metric value)
  • The mobile-pre-slave and the mobile-master metric values must be identical.
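
The two requirements can be checked before a policy is committed. The following Python sketch is a hypothetical helper, not a MAG-c tool. It assumes that a lower metric value is better, which is consistent with the values used in the configuration examples in this section (10 and 30 on the primary system, 20 and 40 on the secondary system).

```python
# Illustrative sketch only; assumes a lower metric value is the better one.

def check_geo_redundancy_metrics(primary, secondary):
    """Return a list of problems; an empty list means both rules hold.

    primary, secondary -- dicts that map each state name to its metric
    """
    problems = []
    # Rule: mobile-pre-slave and mobile-master must be identical per system.
    for name, metrics in (("primary", primary), ("secondary", secondary)):
        if metrics["mobile-pre-slave"] != metrics["mobile-master"]:
            problems.append(
                f"{name}: mobile-pre-slave and mobile-master metrics differ")
    # Rule: metrics must be ordered from best to worst.
    ranked = [
        ("primary active", primary["mobile-master"]),
        ("secondary active", secondary["mobile-master"]),
        ("primary standby", primary["mobile-slave"]),
        ("secondary standby", secondary["mobile-slave"]),
    ]
    for (better_name, better), (worse_name, worse) in zip(ranked, ranked[1:]):
        if not better < worse:
            problems.append(
                f"{better_name} ({better}) must be better than "
                f"{worse_name} ({worse})")
    return problems


# Metric values taken from the configuration examples in this section.
primary = {"mobile-master": 10, "mobile-pre-slave": 10, "mobile-slave": 30}
secondary = {"mobile-master": 20, "mobile-pre-slave": 20, "mobile-slave": 40}
print(check_geo_redundancy_metrics(primary, secondary))  # prints []
```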

Shunting

When the standby system receives a packet (for example, before the routing finishes convergence after a switchover), it can forward the packet to the active system. This behavior, which is called shunting, is configurable.

Use the following command to enable shunting:
configure redundancy multi-chassis peer mc-mobile mc-redirect

Shunting is supported for VPRN over generic routing encapsulation (GRE) service destination point (SDP) for the following types of traffic:

  • PFCP
  • IBCP
  • RADIUS CoA

The standby system performs shunting when a received and resolved multiprotocol BGP (MPBGP) VPN-IPv4 or VPN-IPv6 route matches the local route for supported traffic. The MPBGP route is preferred over the local route.

Manual switchover

Use the following command to execute a manual switchover:
admin redundancy mc-mobile-switchover

A manual switchover can only be triggered on the active system.

The process of a manual switchover is as follows:

  1. A user executes the mc-mobile-switchover command on the active system.
  2. The active system starts synchronizing its session states with the standby system and changes its state to mobile-pre-slave, which triggers a routing metric update.
  3. When the synchronization is complete, the active system changes its state to mobile-slave, which triggers another routing metric update.
  4. If configured, shunting is enabled on the active system.
  5. The standby system becomes the active system and advertises its route with state mobile-master.
  6. The previously active system (now standby) withdraws its routes.
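
The sequence above can be traced from the point of view of the system on which the command is executed. The following Python sketch is illustrative only: the function name is hypothetical, and the starting state of mobile-master is an assumption based on the route matching rules in the Routing section.

```python
# Illustrative sketch only; it replays the documented steps, nothing more.

def manual_switchover_steps(shunting_configured):
    """Return (description, route state) pairs for the initially active system.

    A route state of None means that the route has been withdrawn.
    """
    steps = [
        ("switchover command executed", "mobile-master"),
        ("session synchronization starts, metric update", "mobile-pre-slave"),
        ("synchronization complete, metric update", "mobile-slave"),
    ]
    if shunting_configured:
        steps.append(("shunting enabled", "mobile-slave"))
    steps.append(("peer becomes active and advertises mobile-master",
                  "mobile-slave"))
    steps.append(("routes withdrawn", None))
    return steps


for description, state in manual_switchover_steps(shunting_configured=True):
    print(f"{description}: {state or 'route withdrawn'}")
```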

Deploying and configuring geo-redundancy

Deployment guidelines

When using geo-redundancy, configure the PFCP path timers so that it takes longer to terminate a PFCP path than to complete a geo-redundancy switchover, including the detection time and the convergence of the routing. Otherwise, the BNG-UPs may terminate a PFCP path and remove all corresponding sessions during a geo-redundancy switchover.

PFCP headless mode can be used to meet this requirement. See PFCP connectivity failure for more information about the PFCP path timers.
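
The timing guideline can be expressed as a simple inequality. The following Python sketch is a hypothetical planning aid: the function name, the parameter names, and the example values are assumptions, and the actual detection and convergence times depend on the deployment.

```python
# Illustrative sketch only; all names and values are hypothetical.

def pfcp_path_outlives_switchover(pfcp_path_termination_s,
                                  failure_detection_s,
                                  routing_convergence_s):
    """Return True if a PFCP path survives a geo-redundancy switchover.

    pfcp_path_termination_s -- time before a BNG-UP terminates the PFCP path
    failure_detection_s     -- time for the standby to detect the failure
    routing_convergence_s   -- time for routing to converge on the new active
    """
    switchover_s = failure_detection_s + routing_convergence_s
    # The path must take strictly longer to terminate than the switchover.
    return pfcp_path_termination_s > switchover_s


# Example with made-up values: 5 s detection plus 10 s convergence.
print(pfcp_path_outlives_switchover(30, 5, 10))  # prints True
print(pfcp_path_outlives_switchover(10, 5, 10))  # prints False
```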

Configuration commands

The following commands and their leaf commands configure geo-redundancy:
configure redundancy multi-chassis peer mc-mobile
configure router policy-options policy-statement

Geo-redundancy configuration on a primary system

config>redundancy>multi-chassis# info
----------------------------------------------
            peer 46.46.46.46 create
                mc-mobile
                    mc-redirect
                    mc-complete-ue-sync
                    master-traffic-detection enable
                    traffic-detection relaxed
                    mobile-gateway 1 role primary
                        no shutdown
                    exit
                exit
                no shutdown
            exit
----------------------------------------------
config>router>policy-options>policy-statement# info
----------------------------------------------
                entry 10
                    from
                        prefix-list "prefix-list-1"
                        state mobile-slave
                    exit
                    action accept
                        community add "vprn100"
                        metric set 30
                    exit
                exit
                entry 20
                    from
                        prefix-list "prefix-list-1"
                        state mobile-master
                    exit
                    action accept
                        community add "vprn100"
                        metric set 10
                    exit
                exit
                entry 30
                    from
                        prefix-list "prefix-list-1"
                        state mobile-pre-slave
                    exit
                    action accept
                        community add "vprn100"
                        metric set 10
                    exit
                exit

Geo-redundancy configuration on a secondary system

config>redundancy>multi-chassis# info
----------------------------------------------
            peer 45.45.45.45 create
                mc-mobile
                    mc-redirect
                    mc-complete-ue-sync
                    master-traffic-detection enable
                    traffic-detection relaxed
                    mobile-gateway 1 role secondary
                        no shutdown
                    exit
                exit
                no shutdown
            exit
----------------------------------------------
config>router>policy-options>policy-statement# info
            policy-statement "mcred"
                entry 10
                    from
                        prefix-list "prefix-list-1"
                        state mobile-slave
                    exit
                    action accept
                        community add "vprn100"
                        metric set 40
                    exit
                exit
                entry 20
                    from
                        prefix-list "prefix-list-1"
                        state mobile-master
                    exit
                    action accept
                        community add "vprn100"
                        metric set 20
                    exit
                exit
                entry 30
                    from
                        prefix-list "prefix-list-1"
                        state mobile-pre-slave
                    exit
                    action accept
                        community add "vprn100"
                        metric set 20
                    exit
                exit
            exit