Event Handling System

This chapter provides information about the event handling system (EHS).

Applicability

This chapter was initially written for SR OS Release 13.0.R3. The MD-CLI in the current edition is based on SR OS Release 23.7.R2.

SR OS Release 13.0.R1 introduced the event handling system (EHS).

SR OS Release 14.0.R4 introduced enhanced EHS script capabilities, such as static variables and advanced syntax (shell scripting commands). The examples in this chapter do not include these enhancements.

Overview

The event handling system (EHS) in SR OS allows operators to configure user-defined actions defined in CLI scripts that the router executes in response to an event. The event is referred to as the trigger, where the trigger can be all or part of any event message generated by the event-control framework. The user-defined action is controlled by the script-control function. This script-control function references one or more scripts that are able to execute any command available in CLI when the trigger event occurs.

This feature allows for customized automated event management based on specific operator requirements.

Configuration

The topology shown in Figure 1 provides an example of an EHS configuration. All routers within the example topology participate in the same IS-IS level-2 area and run LDP. All routers are BGP speakers and form part of autonomous system 64496, exchanging routes for the IPv4 address family only.

Figure 1. Example topology

PE-1 has a connected CE router (CE-1) that is attached to a VPLS service. This VPLS has spoke-SDPs to an IES instance on both PE-2 and PE-3, which provide a redundant default gateway to CE-1 using the virtual router redundancy protocol (VRRP). The subnet used for this redundant gateway connectivity between PE-2 and PE-3 is 172.16.1.0/29. The configuration at PE-3 is shown in the following output. The configuration at PE-2 is similar, except for the IP addressing and the VRRP priority, which is 254.

# on PE-3:
configure {
    service {
        ies "IES-1" {
            admin-state enable
            service-id 1
            customer "1"
            interface "redundant-interface" {
                ip-mtu 1500
                spoke-sdp 31:1 {
                }
                ipv4 {
                    primary {
                        address 172.16.1.3
                        prefix-length 29
                    }
                    vrrp 1 {
                        backup [172.16.1.1]
                        priority 253
                        ping-reply true
                    }
                }
            }
        }

The objective is to ensure that both upstream and downstream traffic are always routed through the same PE router. That is, if PE-3 is VRRP primary, it will attract upstream traffic from CE-1 using the VRRP virtual IP/MAC. At the same time, PE-3 should also attract the downstream traffic destined toward CE-1. Having both upstream and downstream traffic transit through the same PE router simplifies troubleshooting, QoS configuration, and reconciliation of ingress/egress statistics.

In normal operation, PE-2 is the VRRP master and advertises the BGP prefix 172.16.1.0/29 with a local preference of 100 (default value). Similarly, PE-3 is the VRRP backup and advertises the BGP prefix 172.16.1.0/29 with a local preference of 50, using the BGP export policy "redundant-interface":

# on PE-3:
configure {
    policy-options {
        prefix-list "172.16.1.0/29" {
            prefix 172.16.1.0/29 type exact {
            }
        }
        policy-statement "redundant-interface" {
            entry 10 {
                from {
                    prefix-list ["172.16.1.0/29"]
                }
                to {
                    protocol {
                        name [bgp]
                    }
                }
                action {
                    action-type accept
                    local-preference 50
                    origin igp
                }
            }
        }

Therefore, upstream and downstream traffic normally transit through PE-2. The following shows that the VRRP instance on "redundant-interface" on PE-3 is backup.

[/]
A:admin@PE-3# show router vrrp instance

===============================================================================
VRRP Instances
===============================================================================
Interface Name                   VR Id Own Adm  State       Base Pri   Msg Int
                                 IP        Opr  Pol Id      InUse Pri  Inh Int
-------------------------------------------------------------------------------
redundant-interface              1     No  Up   Backup       253       1
                                 IPv4      Up   n/a         253        No
  Backup Addr: 172.16.1.1
-------------------------------------------------------------------------------
Instances : 1
===============================================================================

When PE-3 is backup, it advertises the prefix 172.16.1.0/29 with a local preference of 50, as follows:

[/]
A:admin@PE-3# show router bgp routes 172.16.1.0/29 hunt | match 'Network|Nexthop|To|Local Pref'
Network        : 172.16.1.0/29
Nexthop        : 192.0.2.2
Res. Nexthop   : 192.168.23.1
Local Pref.    : 100                    Interface Name : int-PE-3-PE-2
Network        : 172.16.1.0/29
Nexthop        : 192.0.2.3
To             : 192.0.2.6
Res. Nexthop   : n/a
Local Pref.    : 50                     Interface Name : NotAvailable

When PE-3 transitions from backup to primary, it must modify its local preference attribute for prefix 172.16.1.0/29 to a value of 150 to attract downstream traffic destined toward CE-1. Similarly, when PE-3 reverts to backup, it must advertise the prefix with a local preference of 50.
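
The effect of the three local preference values can be checked with a small model. The following Python sketch is illustrative only and is not router code. It assumes that local preference alone decides the preferred route (both routes are otherwise valid and comparable), and the function name preferred_pe is hypothetical.

```python
def preferred_pe(advertised):
    """Return the PE whose route has the highest local preference.

    `advertised` maps a PE name to the local preference it attaches to
    172.16.1.0/29. Ties are not modeled; the example values never tie.
    """
    return max(advertised, key=advertised.get)


# PE-2 always advertises the default of 100; PE-3 switches between 50 and 150.
normal = {"PE-2": 100, "PE-3": 50}      # PE-3 is VRRP backup
failover = {"PE-2": 100, "PE-3": 150}   # PE-3 is VRRP primary

print(preferred_pe(normal))    # PE-2
print(preferred_pe(failover))  # PE-3
```

With a value of 50, PE-3 stays below the default of 100 used by PE-2; with 150, it rises above it, which is why these two values are sufficient for the example.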

Script control

The first step in configuring event handling is to configure a script containing the CLI commands to be executed when the event is triggered. This script can be stored locally on the compact flash, or it can be stored off-node at a defined remote URL, where it can be accessed using FTP or TFTP. When the script is stored locally on the compact flash and the router is equipped with redundant CPMs, the script must be manually saved on the same compact flash on both CPMs, because it is not synchronized automatically.

The first requirement is to modify the local preference of the prefix 172.16.1.0/29 to 150 on transition to VRRP master. The script, which in this example is held locally on CF3:/, therefore contains the following commands (where the policy-statement, redundant-interface, is the name of the export policy used to advertise the 172.16.1.0/29 prefix):

[/]
A:admin@PE-3# file show cf3:/vrrp-master.txt
File: vrrp-master.txt
-------------------------------------------------------------------------------
exit all
configure global
policy-options {
policy-statement "redundant-interface" {
entry 10 {
action {
action-type accept
local-preference 150
commit
exit all

===============================================================================

There is no syntax checking when the script file is created; instead, a script that contains a syntax error fails with a command error when it is run. Also, transactional CLI (for example, the edit command) cannot be used in the script and fails with a command error.

Within the system script-control context, the script is assigned a name and reference is made to its location. It is then configured with admin-state enable. When the script has been defined, a script-policy is configured that calls the previously configured script. The script-policy also specifies a location and filename for a results file that records the successful or unsuccessful conclusion of each script run and each command executed during that run. Each time the script is run, the results are recorded in a file with the name specified for results, followed by an underscore and the date and time when the script was run. A results file must be specified in order for the script to successfully run. The results file can be on the local compact flash, or a remote URL can be specified. As with the script, the script-policy must also be administratively enabled.

# on PE-3:
configure {
    system {
        script-control {
            script "vrrp-master-script" owner "TiMOS CLI" {
                admin-state enable
                location "cf3:/vrrp-master.txt"
            }
            script-policy "vrrp-master-policy" owner "TiMOS CLI" {
                admin-state enable
                max-completed 4
                results "cf3:/script-results.txt"
                lifetime forever
                script {
                    name "vrrp-master-script"
                }
            }

The optional lifetime command specifies the maximum time that the script may run. The max-completed command specifies the maximum number of script run history status entries to be retained. An optional expire-time command specifies the maximum time that the system keeps the run history status (the default is 1 hour). The system maintains the script run history table, which has a maximum size of 255 entries. Entries are removed from this table when the max-completed or expire-time thresholds are crossed. If the table reaches its maximum size, subsequent script launch requests are not run until older run history entries expire (due to expire-time), or entries are manually cleared. To manually clear entries, the following command is used:

clear system script-control script-policy completed <script-policy-name> 
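
The retention rules can be pictured with a short model. The following Python sketch is an illustration only, not a description of the SR OS implementation: the class RunHistory and its methods are hypothetical, it tracks a single script policy, and it assumes that the oldest entry is the one dropped when max-completed is exceeded.

```python
from collections import deque


class RunHistory:
    """Toy model of the completed-run history kept for one script policy."""

    def __init__(self, max_completed=4, expire_time=3600):
        self.max_completed = max_completed   # entries retained
        self.expire_time = expire_time       # seconds (3600 s = 1 h default)
        self.entries = deque()               # completion timestamps, oldest first

    def record(self, completed_at):
        """Store a finished run and drop the oldest beyond max_completed."""
        self.entries.append(completed_at)
        while len(self.entries) > self.max_completed:
            self.entries.popleft()

    def expire(self, now):
        """Drop entries that have been kept for expire_time or longer."""
        while self.entries and now - self.entries[0] >= self.expire_time:
            self.entries.popleft()


history = RunHistory(max_completed=4, expire_time=3600)
for t in range(0, 600, 100):      # six runs completing at t = 0, 100, ..., 500
    history.record(t)
print(list(history.entries))      # [200, 300, 400, 500]

history.expire(now=3850)          # the run from t = 200 is now older than 1 h
print(list(history.entries))      # [300, 400, 500]
```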

The script run history status information can be viewed using the following command (in this case, after one successful run of the corresponding script):

[/]
A:admin@PE-3# show system script-control script-policy "vrrp-master-policy"

===============================================================================
Script-policy Information
===============================================================================
Script-policy                : vrrp-master-policy
Script-policy Owner          : TiMOS CLI
Administrative status        : enabled
Operational status           : enabled
Script                       : vrrp-master-script
Script owner                 : TiMOS CLI
Python script                : N/A
Source location              : cf3:/vrrp-master.txt
Results location             : cf3:/script-results.txt
Max running allowed          : 1
Max completed run histories  : 4
Max lifetime allowed         : 248d 13:13:56 (21474836 seconds)
Completed run histories      : 1
Executing run histories      : 0
Initializing run histories   : 0
Max time run history saved   : 0d 01:00:00 (3600 seconds)
Script start error           : N/A
Python script start error    : N/A
Last change                  : 2023/09/14 07:36:06  N/A
Max row expire time          : never
Last application             : event-script
Last auth. user account      : not-specified

===============================================================================
Script Run History Status Information
-------------------------------------------------------------------------------
Script Run #1
-------------------------------------------------------------------------------
Start time    : 2023/09/14 09:38:29 CEST
End time      : 2023/09/14 09:38:29 CEST
Elapsed time  : 0d 00:00:00             Lifetime      : 0d 00:00:00
State         : terminated              Run exit code : noError
Result time   : 2023/09/14 09:38:29 CEST
Keep history  : 0d 00:58:50
Error time    : never
Source file   : cf3:/vrrp-master.txt
Results file  : cf3:/script-results.txt_20230914-073829-UTC.306729.out
Run exit      : Success
Error         : N/A
Application   : event-script            Auth. user ac*: not-specified
* indicates that the corresponding row element may have been truncated.
===============================================================================
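
The results file name in the preceding output can be reconstructed from the configured results value and the run start time. The following Python sketch is illustrative: the suffix layout (an underscore, the UTC date and time, and what appears to be a microsecond field) is inferred from this single example rather than taken from a specification, and results_file_name is a hypothetical helper.

```python
from datetime import datetime, timezone


def results_file_name(results, started):
    """Append an underscore and the UTC start time to the results location."""
    utc = started.astimezone(timezone.utc)
    return f"{results}_{utc:%Y%m%d-%H%M%S}-UTC.{utc.microsecond:06d}.out"


# Start time of Script Run #1, expressed in UTC (09:38:29 CEST).
start = datetime(2023, 9, 14, 7, 38, 29, 306729, tzinfo=timezone.utc)
print(results_file_name("cf3:/script-results.txt", start))
# cf3:/script-results.txt_20230914-073829-UTC.306729.out
```

Because every run gets its own time-stamped file, a script policy that runs often can accumulate many results files on the compact flash.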

Event handler

The second step in configuring event handling is to assign actions to be performed as a result of the trigger event. These actions are typically one or more configured scripts defined as entries in an action list. In the following output, the event handler is assigned the name event-handler-1, and the action list consists of a single entry. This entry calls the previously configured script policy vrrp-master-policy (which in turn references the previously defined script vrrp-master-script). If multiple actions are required based on a single event trigger, they can be configured in the action list with subsequent entries, which are run in sequence (up to 1500 action list entries are supported).

For this example, only a single entry is required; therefore, there is a one-to-one relationship between the event handler and the action list entry. Both the entry within the action list and the handler should be configured with admin-state enable.

# on PE-3:
configure {
    log {
        event-handling {
            handler "event-handler-1" {
                admin-state enable
                entry 10 {
                    script-policy {
                        name "vrrp-master-policy"
                    }
                }
            }
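
The reference chain from the handler to the script file can be followed with a small model. The following Python sketch is illustrative only: the dictionaries mirror the names configured in this chapter, while the admin_enabled flag handling and the function scripts_for_handler are assumptions made for the example.

```python
# Script-control objects, keyed by name, as configured in this chapter.
SCRIPTS = {"vrrp-master-script": "cf3:/vrrp-master.txt"}
SCRIPT_POLICIES = {"vrrp-master-policy": "vrrp-master-script"}

# Handler action list: entry-id -> script policy and admin state.
HANDLERS = {
    "event-handler-1": {
        10: {"script_policy": "vrrp-master-policy", "admin_enabled": True},
    },
}


def scripts_for_handler(handler_name):
    """Return script locations for a handler, in ascending entry-id order."""
    locations = []
    for entry_id in sorted(HANDLERS[handler_name]):
        entry = HANDLERS[handler_name][entry_id]
        if not entry["admin_enabled"]:
            continue  # assumption: disabled entries are skipped
        script_name = SCRIPT_POLICIES[entry["script_policy"]]
        locations.append(SCRIPTS[script_name])
    return locations


print(scripts_for_handler("event-handler-1"))  # ['cf3:/vrrp-master.txt']
```

Adding a second entry (for example, entry 20) to the handler dictionary would append its script after the first, which reflects the in-sequence behavior of the action list.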

Event trigger

The final step in configuring event handling is to configure the event trigger. The event trigger defines the event that triggers the running of the script. The event trigger is based on any event generated by the event-control framework, and can match against the application and event number (event_id). Log filters can also be used to match against specific events using the subject and/or message fields. Regular expressions can be used where required. EHS will not use any message that is suppressed through event-control configuration, or any event message that is throttled.

The general format for an event in an event log is as follows:

nnnn YYYY/MM/DD HH:MM:SS.SS TZONE <severity>:<application> #<event_id> <router-name> <subject>
<message>

Where:

nnnn                 The log entry sequence number
YYYY/MM/DD           The UTC date stamp for the log entry:
                        YYYY - Year
                        MM - Month
                        DD - Date
HH:MM:SS.SS          The UTC time stamp for the event
                        HH - Hours (24 hour format)
                        MM - Minutes
                        SS.SS - Seconds
TZONE                The timezone 
<severity>           The severity level name of the event
<application>        The application generating the log message
<event_id>           The application’s event ID number for the event
<router-name>        The router instance for the event (Base in the examples in this chapter)
<subject>            The subject/affected object for the event
<message>            A textual description of the event
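
The header fields can be separated with a regular expression. The following Python sketch is a minimal illustration, assuming that the header is a single line with space-separated fields as described above; the sample line is the header of the VRRP event that PE-3 logs in this chapter, and the pattern and the function parse_header are hypothetical. The quoted message text that follows the header is not parsed here.

```python
import re

# One named group per header field, in the order listed above.
HEADER = re.compile(
    r"^(?P<seq>\d+) "
    r"(?P<date>\d{4}/\d{2}/\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<tzone>\S+) "
    r"(?P<severity>[A-Z]+): "
    r"(?P<application>\S+) "
    r"#(?P<event_id>\d+) "
    r"(?P<router_name>\S+) "
    r"(?P<subject>.*)$"
)


def parse_header(line):
    """Split an event header line into named fields, or return None."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


sample = "160 2023/09/14 09:37:25.606 CEST MINOR: VRRP #2001 Base Becoming Master"
fields = parse_header(sample)
print(fields["application"], fields["event_id"], fields["subject"])
# VRRP 2001 Becoming Master
```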

In the example, the following event message is generated when PE-3 becomes VRRP primary:

160 2023/09/14 09:37:25.606 CEST MINOR: VRRP #2001 Base Becoming Master
"VRRP virtual router instance 1 on interface redundant-interface (primary address 172.16.1.3) 
changed state to master"

Therefore, the event-trigger configuration is based on an application of VRRP and an event vrrpTrapNewMaster (with event number 2001). In the following snippet, vrrp event vrrpTrapNewMaster is configured. The trigger entry is defined as 1, and in this example, there is only one trigger event. Up to 1500 trigger entries can be included, each of which can act as a potential trigger event. The trigger entry also references the previously configured event-handler-1. (Recall that the event handler references the script policy, which in turn references the script that should be run.)

# on PE-3:
configure {
    log {
        event-trigger {
            vrrp event vrrpTrapNewMaster {
                admin-state enable
                entry 1 {
                    filter "itf 172.31.1.3 becomes primary"
                    handler "event-handler-1"
                }
            }

Finally, there is a reference to log-filter "itf 172.31.1.3 becomes primary". Without more explicit filtering, event handling will be triggered on any event with the application of VRRP and event number 2001. There may be multiple VRRP instances running on this router, but the requirement is that event handling should only be triggered when the VRRP instance running on redundant-interface transitions to master at PE-3. Therefore, log filter "itf 172.31.1.3 becomes primary" is used to define a more explicit match using the message field, which contains an explicit reference to the interface. Both the trigger entry and the event handler must be administratively enabled.

configure {
    log {
        filter "itf 172.31.1.3 becomes primary" {
            default-action drop
            named-entry "newPrimary" {
                action forward
                match {
                    message {
                        eq "interface redundant-interface (primary address 172.16.1.3) changed state to master"
                    }
                }
            }
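
The combined effect of the event trigger and the log filter can be sketched as a two-stage check. The following Python model is illustrative only. It assumes that a message eq match succeeds when the configured text occurs anywhere in the event message, which is inferred from the fact that the configured string is only part of the logged message. The function names, the second interface name, and its address are hypothetical.

```python
def filter_forwards(message, entries, default_action="drop"):
    """Return True when the log filter's resulting action is 'forward'.

    Assumption: a message 'eq' match succeeds when the configured text
    occurs anywhere in the event message (substring match).
    """
    for pattern, action in entries:
        if pattern in message:
            return action == "forward"
    return default_action == "forward"


def trigger_fires(application, event_id, message):
    """Model of trigger entry 1 under vrrp event vrrpTrapNewMaster."""
    entries = [
        ("interface redundant-interface (primary address 172.16.1.3) "
         "changed state to master", "forward"),
    ]
    if (application, event_id) != ("VRRP", 2001):
        return False
    return filter_forwards(message, entries)


logged = ("VRRP virtual router instance 1 on interface redundant-interface "
          "(primary address 172.16.1.3) changed state to master")
other = ("VRRP virtual router instance 2 on interface other-interface "
         "(primary address 10.0.0.1) changed state to master")  # hypothetical

print(trigger_fires("VRRP", 2001, logged))  # True
print(trigger_fires("VRRP", 2001, other))   # False
print(trigger_fires("VRRP", 2006, logged))  # False
```

The second call shows why the filter is needed: the application and event number match, but the default action of drop prevents the handler from running for an unrelated VRRP instance.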

The configuration of the example event handling for the failure event (PE-3 transitions to VRRP primary) is now complete. By disabling the spoke-SDP between PE-1 and PE-2, it is possible to simulate a failure event where the VRRP message path is broken. As a result, five events are generated:

  • The first indicates that PE-3 has become VRRP master for the interface named redundant-interface.

  • The second indicates that EHS handler event-handler-1 was invoked by a CLI user.

  • The third indicates that a script file has initiated an attempt to execute CLI commands contained in script file vrrp-master.txt.

  • The fourth indicates that a commit by Cron/EHS was successful.

  • The fifth indicates that the attempt to execute those CLI commands was successful.

164 2023/09/14 09:38:29.306 CEST MINOR: VRRP #2001 Base Becoming Master
"VRRP virtual router instance 1 on interface redundant-interface (primary address 172.16.1.3) 
changed state to master"

165 2023/09/14 09:38:29.306 CEST MINOR: SYSTEM #2069 Base EHS script
"Ehs handler :"event-handler-1" with the description : "" was invoked by the cli-user account
 "not-specified"."

166 2023/09/14 09:38:29.309 CEST MAJOR: SYSTEM #2052 Base CLI 'exec'
"A CLI user has initiated an 'exec' operation to process the commands in the SROS CLI file 
cf3:/vrrp-master.txt"

167 2023/09/14 09:38:29.314 CEST WARNING: SYSTEM #2121 Base Commit
"Commit to configure by  (Cron/EHS) from Cron/EHS succeeded."

168 2023/09/14 09:38:29.315 CEST MAJOR: SYSTEM #2053 Base CLI 'exec'
"The CLI user initiated 'exec' operation to process the commands in the SROS CLI file 
cf3:/vrrp-master.txt has completed with the result of success"

A successful script run shows the commands contained in the script, followed by an indication that the commands were executed.

[/]
A:admin@PE-3# file show cf3:/script-results.txt_20230914-073829-UTC.306729.out
File: script-results.txt_20230914-073829-UTC.306729.out
-------------------------------------------------------------------------------
exit all
configure global
INFO: CLI #2054: Entering global configuration mode
policy-options {
policy-statement "redundant-interface" {
entry 10 {
action {
action-type accept
local-preference 150
commit
exit all
INFO: CLI #2056: Exiting global configuration mode
Executed 10 lines in 0.0 seconds from file "cf3:/vrrp-master.txt"

===============================================================================

The following output confirms that PE-3 is VRRP primary:

[/]
A:admin@PE-3# show router vrrp instance

===============================================================================
VRRP Instances
===============================================================================
Interface Name                   VR Id Own Adm  State       Base Pri   Msg Int
                                 IP        Opr  Pol Id      InUse Pri  Inh Int
-------------------------------------------------------------------------------
redundant-interface              1     No  Up   Master       253       1
                                 IPv4      Up   n/a         253        No
  Backup Addr: 172.16.1.1
-------------------------------------------------------------------------------
Instances : 1
===============================================================================

Also, the local preference attribute for prefix 172.16.1.0/29 has changed to a value of 150. The result of this action is that PE-3 will now be the transit router for both upstream and downstream traffic.

[/]
A:admin@PE-3# show router bgp routes 172.16.1.0/29 hunt | match 'Network|Nexthop|To|Local Pref'
Network        : 172.16.1.0/29
Nexthop        : 192.0.2.3
Res. Nexthop   : Unresolved
Local Pref.    : 150                    Interface Name : NotAvailable
---snip---
Network        : 172.16.1.0/29
Nexthop        : 192.0.2.3
To             : 192.0.2.6
Res. Nexthop   : n/a
Local Pref.    : 150                    Interface Name : NotAvailable

The following output shows the event handler statistics, which confirm that the referenced script was triggered and run. The Handler Action-List Entry Execution Statistics section provides the number of times an action (script) was queued to run, and the number of times an error was experienced, both during launch and due to a non-operational admin status. The remainder of the fields in the output are self-explanatory.

[/]
A:admin@PE-3# show log event-handling handler "event-handler-1"

===============================================================================
Event Handling System - Handlers
===============================================================================

===============================================================================
Handler          : event-handler-1
===============================================================================
Description      : (Not Specified)
Admin State      : up                                Oper State : up

-------------------------------------------------------------------------------
Handler Execution Statistics
  Success        : 1
  Err No Entry   : 0
  Err Adm Status : 0
Total            : 1

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Handler Action-List Entry
-------------------------------------------------------------------------------
Entry-id         : 10
Description      : (Not Specified)
Admin State      : up                                Oper State : up
Script
  Policy Name    : vrrp-master-policy
  Policy Owner   : TiMOS CLI
Min Delay        : 0
Last Exec        : 09/14/23 09:38:29 CEST
-------------------------------------------------------------------------------
Handler Action-List Entry Execution Statistics
  Success        : 1
  Err Min Delay  : 0
  Err Launch     : 0
  Err Adm Status : 0
Total            : 1
===============================================================================

The example includes an event trigger and script to meet the requirements of a fail-forward where PE-3 becomes VRRP primary. Now, configuration is needed for when PE-3 reverts to VRRP backup. Without another event trigger and script, PE-3 will continue to advertise the prefix 172.16.1.0/29 with a local preference of 150 and upstream/downstream traffic will be asymmetric through PE-2/PE-3 respectively.

As before, a script is required. Because PE-2 advertises the prefix with a local preference of 100 (default), PE-3 needs to advertise the same prefix with a lower value (50 in the following output), so that PE-2 is the preferred next hop.

[/]
A:admin@PE-3# file show cf3:vrrp-backup.txt
File: vrrp-backup.txt
-------------------------------------------------------------------------------
exit all
configure global
policy-options {
policy-statement "redundant-interface" {
entry 10 {
action {
action-type accept
local-preference 50
commit
exit all

===============================================================================
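
The two scripts differ only in the local preference value, so both can be produced from one template. The following Python sketch is a convenience illustration, not part of the EHS feature: the function render_script is hypothetical, and the generated text still has to be copied to the compact flash by the operator.

```python
def render_script(local_preference, policy="redundant-interface", entry=10):
    """Return the CLI script text that sets the local preference."""
    lines = [
        "exit all",
        "configure global",
        "policy-options {",
        f'policy-statement "{policy}" {{',
        f"entry {entry} {{",
        "action {",
        "action-type accept",
        f"local-preference {local_preference}",
        "commit",
        "exit all",
    ]
    return "\n".join(lines) + "\n"


master = render_script(150)   # contents for cf3:/vrrp-master.txt
backup = render_script(50)    # contents for cf3:/vrrp-backup.txt

# The two scripts differ in exactly one line.
diff = [(a, b) for a, b in zip(master.splitlines(), backup.splitlines()) if a != b]
print(diff)  # [('local-preference 150', 'local-preference 50')]
```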

The script must then be configured within the script-control context, and subsequently referenced in a script policy as vrrp-backup-policy.

# on PE-3:
configure {
    system {
        script-control {
            script "vrrp-backup-script" owner "TiMOS CLI" {
                admin-state enable
                location "cf3:/vrrp-backup.txt"
            }
            script-policy "vrrp-backup-policy" owner "TiMOS CLI" {
                admin-state enable
                max-completed 4
                results "cf3:/script-revert-results.txt"
                lifetime forever
                script {
                    name "vrrp-backup-script"
                }
            }

The event handler acts as the interface between the configured script policy and event trigger. Therefore, a second event handler is configured with an entry referencing the newly configured vrrp-backup-policy.

# on PE-3:
configure {
    log {
        event-handling {               
            handler "event-handler-2" {
                admin-state enable
                entry 10 {
                    script-policy {
                        name "vrrp-backup-policy"
                    }
                }
            }

Finally, the event trigger is configured. To revert to VRRP Backup, the application is VRRP and the event is tmnxVrrpBecameBackup (event number 2006). The configuration is filtered on the message field, as before, using log filter "itf 172.16.1.3 state becomes backup", so that it is specific to the interface named redundant-interface.

# on PE-3:
configure {
    log {
        filter "itf 172.16.1.3 state becomes backup" {
            default-action drop
            named-entry "becameBackup" {
                action forward
                match {
                    message {
                        eq "interface redundant-interface changed state to backup"
                    }
                }
            }
        }

# on PE-3:
configure {
    log {
        event-trigger {
            vrrp event tmnxVrrpBecameBackup {
                admin-state enable
                entry 1 {
                    filter "itf 172.16.1.3 state becomes backup"
                    handler "event-handler-2"
                }
            }

The configuration of the example event handling for the revertive failure event (PE-3 transitions to VRRP backup) is now complete. By re-enabling the spoke-SDP between PE-1 and PE-2, the VRRP message path is restored, and PE-2 again becomes the VRRP master. The following events are generated:

  • The first indicates that PE-3 has become VRRP backup for the interface named redundant-interface.

  • The second indicates that EHS handler event-handler-2 was invoked by a CLI user.

  • The third indicates that a script file has initiated an attempt to execute CLI commands contained in script file vrrp-backup.txt.

  • The fourth indicates that a commit by Cron/EHS was successful.

  • The fifth indicates that the attempt to execute those CLI commands was successful.

175 2023/09/14 09:46:07.110 CEST MINOR: VRRP #2006 Base Becoming Backup
"VRRP virtual router instance 1 on interface redundant-interface changed state to backup
 - current master is 172.16.1.2"

176 2023/09/14 09:46:07.110 CEST MINOR: SYSTEM #2069 Base EHS script
"Ehs handler :"event-handler-2" with the description : "" was invoked by the cli-user 
account "not-specified"."

177 2023/09/14 09:46:07.113 CEST MAJOR: SYSTEM #2052 Base CLI 'exec'
"A CLI user has initiated an 'exec' operation to process the commands in the SROS CLI file 
cf3:/vrrp-backup.txt"

178 2023/09/14 09:46:07.118 CEST WARNING: SYSTEM #2121 Base Commit
"Commit to configure by  (Cron/EHS) from Cron/EHS succeeded."

179 2023/09/14 09:46:07.118 CEST MAJOR: SYSTEM #2053 Base CLI 'exec'
"The CLI user initiated 'exec' operation to process the commands in the SROS CLI file 
cf3:/vrrp-backup.txt has completed with the result of success"

The following outputs confirm that PE-3 is VRRP backup, and that the local preference attribute for prefix 172.16.1.0/29 has changed to a value of 50. The result of this action is that PE-2 will now be the transit router for both upstream and downstream traffic.

[/]
A:admin@PE-3# show router vrrp instance

===============================================================================
VRRP Instances
===============================================================================
Interface Name                   VR Id Own Adm  State       Base Pri   Msg Int
                                 IP        Opr  Pol Id      InUse Pri  Inh Int
-------------------------------------------------------------------------------
redundant-interface              1     No  Up   Backup       253       1
                                 IPv4      Up   n/a         253        No
  Backup Addr: 172.16.1.1
-------------------------------------------------------------------------------
Instances : 1
===============================================================================

[/]
A:admin@PE-3# show router bgp routes 172.16.1.0/29 hunt | match 'Network|Nexthop|To|Local Pref'
Network        : 172.16.1.0/29
Nexthop        : 192.0.2.2
Res. Nexthop   : 192.168.23.1
Local Pref.    : 100                    Interface Name : int-PE-3-PE-2
Network        : 172.16.1.0/29
Nexthop        : 192.0.2.3
To             : 192.0.2.6
Res. Nexthop   : n/a
Local Pref.    : 50                     Interface Name : NotAvailable

Conclusion

EHS allows operators to configure user-defined actions on the router when an event occurs. The event trigger can be anything that is generated by the event-control framework, and explicit filtering is possible using regular expressions. A user-defined action typically runs a script that allows any CLI commands to be executed. Multiple actions are permitted, running multiple scripts if required.