Event Handling System
This chapter provides information about event handling systems (EHS).
Topics in this chapter include Applicability, Overview, Configuration, and Conclusion.
Applicability
This chapter was initially written for SR OS Release 13.0.R3. The CLI in the current edition is based on SR OS Release 23.7.R2.
SR OS Release 13.0.R1 introduced the event handling system (EHS).
SR OS Release 14.0.R4 introduced enhanced EHS script capabilities, such as static variables and advanced syntax (shell scripting commands). The examples in this chapter do not include these enhancements.
Overview
The event handling system (EHS) in SR OS allows operators to define actions in CLI scripts that the router executes in response to an event. The event is referred to as the trigger, and the trigger can be all or part of any event message generated by the event-control framework. The user-defined action is controlled by the script-control function, which references one or more scripts that can execute any command available in the CLI when the trigger event occurs.
This feature allows for customized automated event management based on specific operator requirements.
Configuration
The topology shown in Example topology provides an example of an EHS configuration. All routers within the example topology participate in the same IS-IS level-2 area and run LDP. All routers are BGP speakers and form part of autonomous system 64496, exchanging routes for IPv4 address family only.
A CE router (CE-1) is connected to PE-1 and mapped into a VPLS service. This VPLS has spoke-SDPs to an IES instance on both PE-2 and PE-3, which provide a redundant default gateway to CE-1 using the virtual router redundancy protocol (VRRP). The subnet used for this redundant gateway connectivity between PE-2 and PE-3 is 172.16.1.0/29. The configuration at PE-3 is shown in the following output. The configuration at PE-2 is similar; the exceptions are the IP addressing and the VRRP priority, which is 254 on PE-2.
# on PE-3:
configure
    service
        ies 1 name "IES-1" customer 1 create
            interface "redundant-interface" create
                address 172.16.1.3/29
                ip-mtu 1500
                vrrp 1
                    backup 172.16.1.1
                    priority 253
                    ping-reply
                exit
                spoke-sdp 31:1 create
                    no shutdown
                exit
            exit
            no shutdown
        exit
The objective is to ensure that both upstream and downstream traffic are always routed through the same PE router. That is, if PE-3 is VRRP master, it attracts upstream traffic from CE-1 using the VRRP virtual IP/MAC. At the same time, PE-3 should also attract the downstream traffic destined toward CE-1. Having both upstream and downstream traffic transit through the same PE router simplifies troubleshooting, QoS configuration, and reconciliation of ingress/egress statistics.
In normal operation, PE-2 is the VRRP master and advertises the BGP prefix 172.16.1.0/29 with a local preference of 100 (default value). Similarly, PE-3 is the VRRP backup and advertises the BGP prefix 172.16.1.0/29 with a local preference of 50, using the BGP export policy "redundant-interface":
# on PE-3:
configure
    router Base
        policy-options
            begin
            prefix-list "172.16.1.0/29"
                prefix 172.16.1.0/29 exact
            exit
            policy-statement "redundant-interface"
                entry 10
                    from
                        prefix-list "172.16.1.0/29"
                    exit
                    to
                        protocol bgp
                    exit
                    action accept
                        origin igp
                        local-preference 50
                    exit
                exit
            exit
            commit
Therefore, upstream and downstream traffic normally transit through PE-2. The following shows that the VRRP instance on "redundant-interface" on PE-3 is backup.
*A:PE-3# show router vrrp instance
===============================================================================
VRRP Instances
===============================================================================
Interface Name VR Id Own Adm State Base Pri Msg Int
IP Opr Pol Id InUse Pri Inh Int
-------------------------------------------------------------------------------
redundant-interface 1 No Up Backup 253 1
IPv4 Up n/a 253 No
Backup Addr: 172.16.1.1
-------------------------------------------------------------------------------
Instances : 1
===============================================================================
When PE-3 is backup, it advertises the prefix 172.16.1.0/29 with a local preference of 50, as follows:
*A:PE-3# show router bgp routes 172.16.1.0/29 hunt | match expression "Network|Nexthop|To|Local Pref"
Network : 172.16.1.0/29
Nexthop : 192.0.2.2
Res. Nexthop : 192.168.23.1
Local Pref. : 100 Interface Name : int-PE-3-PE-2
Network : 172.16.1.0/29
Nexthop : 192.0.2.3
To : 192.0.2.6
Res. Nexthop : n/a
Local Pref. : 50 Interface Name : NotAvailable
When PE-3 transitions from backup to primary, it must modify its local preference attribute for prefix 172.16.1.0/29 to a value of 150 to attract downstream traffic destined toward CE-1. Similarly, when PE-3 reverts to backup, it must advertise the prefix with a local preference of 50.
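The intended effect on route selection can be sketched in a few lines of Python. This is a simplified illustration rather than router behavior: it assumes local preference is the only deciding attribute and ignores every other BGP tie-breaker. The `Route` class and the `preferred_advertiser` function are hypothetical names introduced for this example; the local preference values are the ones given in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    advertiser: str
    prefix: str
    local_pref: int


# Local preference that PE-3 applies per VRRP state, as described in this chapter.
PE3_LOCAL_PREF = {"backup": 50, "master": 150}
PE2_LOCAL_PREF = 100  # default value, unchanged on PE-2


def preferred_advertiser(pe3_vrrp_state: str) -> str:
    """Return the PE whose route for 172.16.1.0/29 would be preferred."""
    routes = [
        Route("PE-2", "172.16.1.0/29", PE2_LOCAL_PREF),
        Route("PE-3", "172.16.1.0/29", PE3_LOCAL_PREF[pe3_vrrp_state]),
    ]
    # Assumption: the highest local preference wins; other tie-breakers are ignored.
    return max(routes, key=lambda route: route.local_pref).advertiser


if __name__ == "__main__":
    for state in ("backup", "master"):
        print(state, "->", preferred_advertiser(state))
```

With PE-3 as backup, the sketch selects PE-2 (100 beats 50); with PE-3 as master, it selects PE-3 (150 beats 100), which matches the symmetric-path objective.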
Script control
The first step in configuring event handling is to configure a script containing the CLI commands to be executed when the event is triggered. This script can be stored locally on the compact flash, or it can be stored off-node at a defined remote URL, where it can be accessed using FTP or TFTP. When the script is stored locally on the compact flash and the router is equipped with redundant CPMs, the script must be manually saved on the same compact flash on both CPMs, because it is not synchronized automatically.
The first requirement is to modify the local preference of the prefix 172.16.1.0/29 to 150 on transition to VRRP master. The script, which in this example is held locally on CF3:/, therefore contains the following commands (where the policy-statement, redundant-interface, is the name of the export policy used to advertise the 172.16.1.0/29 prefix):
*A:PE-3>file cf3:\ # type cf3:vrrp-master.txt
File: vrrp-master.txt
-------------------------------------------------------------------------------
exit all
configure router policy-options
begin
policy-statement redundant-interface
entry 10
action accept
local-preference 150
exit
exit
exit
commit
exit all
===============================================================================
There is no syntax checking when the script file is created; a script that contains an invalid command fails with a command error when it runs. Also, transactional CLI (for example, the edit command) cannot be used in the script, and will fail with a command error.
Within the system script-control context, the script is assigned a name and reference is made to its location. It is then put in the no shutdown state. When the script has been defined, a script-policy is configured that calls the previously configured script. The script-policy also specifies a location and filename for a results file that records the successful or unsuccessful conclusion of each script run and each command executed during that run. Each time the script is run, the results are recorded in a file with the name specified for results, followed by an underscore and the date and time when the script was run. A results file must be specified in order for the script to successfully run. The results file can be on the local compact flash, or a remote URL can be specified. As with the script, the script-policy must also be put in the no shutdown state.
# on PE-3:
configure
    system
        script-control
            script "vrrp-master-script"
                location "cf3:/vrrp-master.txt"
                no shutdown
            exit
            script-policy "vrrp-master-policy"
                results "cf3:/script-results.txt"
                script "vrrp-master-script"
                max-completed 4
                expire-time 3600
                lifetime forever
                no shutdown
            exit
        exit
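The naming of the per-run results file can be illustrated with a short Python sketch. The suffix layout used here (`_YYYYMMDD-HHMMSS-UTC.<fraction>.out`) is inferred from the single sample file name that appears later in this chapter, and treating the numeric field as microseconds is an assumption. The function name `results_file_name` is hypothetical.

```python
from datetime import datetime, timezone


def results_file_name(results: str, started: datetime) -> str:
    """Build a per-run results file name from the configured results URL.

    Assumption: the suffix is '_<date>-<time>-UTC.<microseconds>.out',
    as inferred from one sample file name; it is not a documented format.
    """
    utc = started.astimezone(timezone.utc)
    return f"{results}_{utc:%Y%m%d-%H%M%S}-UTC.{utc.microsecond:06d}.out"


if __name__ == "__main__":
    run_start = datetime(2023, 9, 13, 7, 45, 34, 833059, tzinfo=timezone.utc)
    print(results_file_name("cf3:/script-results.txt", run_start))
```

Because each run gets its own timestamped file, the results of earlier runs are not overwritten.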
The optional lifetime command specifies the maximum time that the script may run. The max-completed command specifies the maximum number of script run history status entries to be retained. The optional expire-time command specifies the maximum time that the system keeps the run history status (the default is 1 hour). The system maintains the script run history table, which has a maximum size of 255 entries. Entries are removed from this table when the max-completed or expire-time thresholds are crossed. If the table reaches its maximum size, subsequent script launch requests are not run until older run history entries expire (due to expire-time), or entries are manually cleared. To manually clear entries, the following command is used:
clear system script-control script-policy completed <script-policy-name>
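The retention rules for completed runs can be modeled with a small Python sketch. It is an illustrative model only: the `RunHistory` class is hypothetical, removing the oldest entry first when max-completed is exceeded is an assumption, and the system-wide limit of 255 entries is not modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RunHistory:
    """Completed-run entries for one script policy (illustrative model only)."""

    max_completed: int = 4    # corresponds to max-completed
    expire_time: int = 3600   # corresponds to expire-time, in seconds
    completed: deque = field(default_factory=deque)  # end times, oldest first

    def prune(self, now: float) -> None:
        # Remove entries that have been kept for expire-time or longer.
        while self.completed and now - self.completed[0] >= self.expire_time:
            self.completed.popleft()
        # Assumption: when max-completed is exceeded, the oldest entry is removed.
        while len(self.completed) > self.max_completed:
            self.completed.popleft()

    def record(self, end_time: float) -> None:
        self.completed.append(end_time)
        self.prune(end_time)

    def clear(self) -> None:
        # Counterpart of manually clearing the completed entries.
        self.completed.clear()


if __name__ == "__main__":
    history = RunHistory(max_completed=2, expire_time=10)
    for end_time in (0, 1, 2):
        history.record(end_time)
    print(list(history.completed))
```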
The script run history status information can be viewed using the following command (in this case, after one successful run of the corresponding script):
*A:PE-3# show system script-control script-policy "vrrp-master-policy"
===============================================================================
Script-policy Information
===============================================================================
Script-policy : vrrp-master-policy
Script-policy Owner : TiMOS CLI
Administrative status : enabled
Operational status : enabled
Script : vrrp-master-script
Script owner : TiMOS CLI
Python script : N/A
Source location : cf3:/vrrp-master.txt
Results location : cf3:/script-results.txt
Max running allowed : 1
Max completed run histories : 4
Max lifetime allowed : 248d 13:13:56 (21474836 seconds)
Completed run histories : 1
Executing run histories : 0
Initializing run histories : 0
Max time run history saved : 0d 01:00:00 (3600 seconds)
Script start error : N/A
Python script start error : N/A
Last change : 2023/09/13 07:43:55 UTC
Max row expire time : never
Last application : event-script
Last auth. user account : not-specified
===============================================================================
Script Run History Status Information
-------------------------------------------------------------------------------
Script Run #1
-------------------------------------------------------------------------------
Start time : 2023/09/13 07:45:35 UTC
End time : 2023/09/13 07:45:35 UTC
Elapsed time : 0d 00:00:00 Lifetime : 0d 00:00:00
State : terminated Run exit code : noError
Result time : 2023/09/13 07:45:35 UTC
Keep history : 0d 00:59:29
Error time : never
Source file : cf3:/vrrp-master.txt
Results file : cf3:/script-results.txt_20230913-074534-UTC.833059.out
Run exit : Success
Error : N/A
Application : event-script Auth. user ac*: not-specified
* indicates that the corresponding row element may have been truncated.
===============================================================================
Event handler
The second step in configuring event handling is to assign actions to be performed as a result of the trigger event. These actions are typically one or more configured scripts defined as entries in an action list. In the following output, the event handler is assigned the name event-handler-1, and the action list consists of a single entry. This entry calls the previously configured script policy vrrp-master-policy (which in turn references the previously defined script vrrp-master-script). If multiple actions are required based on a single event trigger, they can be configured in the action list with subsequent entries, which are run in sequence (up to 1500 action list entries are supported).
For this example, only a single entry is required; therefore, there is a one to one relationship between the event handler and the action list entry. Both the entry within the action list and the handler should be put in the no shutdown state.
# on PE-3:
configure
    log
        event-handling
            handler "event-handler-1"
                action-list
                    entry 10
                        script-policy "vrrp-master-policy"
                        no shutdown
                    exit
                exit
                no shutdown
            exit
        exit
Event trigger
The final step in configuring event handling is to configure the event trigger. The event trigger defines the event that triggers the running of the script. The event trigger is based on any event generated by the event-control framework, and can match against the application and event number (event_id). Log filters can also be used to match against specific events using the subject and/or message fields. Regular expressions can be used where required. EHS will not use any message that is suppressed through event-control configuration, or any event message that is throttled.
The general format for an event in an event log is as follows:
nnnn YYYY/MM/DD HH:MM:SS.SS TZONE <severity>: <application> #<event_id> <router-name> <subject>
<message>
Where:
nnnn            The log entry sequence number
YYYY/MM/DD      The UTC date stamp for the log entry:
                YYYY - Year
                MM - Month
                DD - Day
HH:MM:SS.SS     The UTC time stamp for the event:
                HH - Hours (24 hour format)
                MM - Minutes
                SS.SS - Seconds
TZONE           The timezone
<severity>      The severity level name of the event
<application>   The application generating the log message
<event_id>      The application’s event ID number for the event
<router-name>   The router instance that generated the event (for example, Base)
<subject>       The subject/affected object for the event
<message>       A textual description of the event
In the example, the following event message is generated when PE-3 becomes VRRP primary:
152 2023/09/13 07:44:50.432 UTC MINOR: VRRP #2001 Base Becoming Master
"VRRP virtual router instance 1 on interface redundant-interface
(primary address 172.16.1.3) changed state to master"
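To make the field layout concrete, the following Python sketch splits an event log entry into its fields. It is a minimal parser written for this example only: the regular expression assumes the header layout of the sample above, a router name without spaces, and a quoted message on the lines that follow the header. The names `HEADER`, `Event`, and `parse_event` are hypothetical.

```python
import re
from typing import NamedTuple

# Header layout assumed from the sample entry in this chapter.
HEADER = re.compile(
    r"^(?P<seq>\d+) (?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<tzone>\S+) (?P<severity>\w+): (?P<application>\S+) #(?P<event_id>\d+) "
    r"(?P<router_name>\S+) (?P<subject>.*)$"
)


class Event(NamedTuple):
    seq: int
    severity: str
    application: str
    event_id: int
    router_name: str
    subject: str
    message: str


def parse_event(text: str) -> Event:
    """Parse one log entry: a header line followed by a quoted message."""
    header, *body = [line.strip() for line in text.strip().splitlines()]
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"unrecognized event header: {header!r}")
    message = " ".join(body)  # re-join a message wrapped over several lines
    if len(message) >= 2 and message[0] == message[-1] == '"':
        message = message[1:-1]  # drop the surrounding quotes
    return Event(
        seq=int(match["seq"]),
        severity=match["severity"],
        application=match["application"],
        event_id=int(match["event_id"]),
        router_name=match["router_name"],
        subject=match["subject"],
        message=message,
    )


SAMPLE = '''152 2023/09/13 07:44:50.432 UTC MINOR: VRRP #2001 Base Becoming Master
"VRRP virtual router instance 1 on interface redundant-interface
(primary address 172.16.1.3) changed state to master"'''

if __name__ == "__main__":
    print(parse_event(SAMPLE))
```

The application (VRRP) and event ID (2001) extracted here are the two values an event trigger matches on; the message field is what a log filter can match further.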
Therefore, the event-trigger configuration is based on the VRRP application and event number 2001 (vrrpTrapNewMaster). In the following snippet, vrrp 2001 is configured as the event. The trigger entry is defined as 1; in this example, there is only one trigger event. Up to 1500 trigger entries can be included, each of which can act as a potential trigger event. The trigger entry also references the previously configured event-handler-1. (Recall that the event handler references the script policy, which in turn references the script that should be run.)
# on PE-3:
configure
    log
        event-trigger
            event "vrrp" 2001
                trigger-entry 1
                    event-handler "event-handler-1"
                    log-filter 1
                    no shutdown
                exit
                no shutdown
            exit
        exit
Finally, there is a reference to log-filter 1. Without more explicit filtering, event handling will be triggered on any event with the application of VRRP and event number 2001. There may be multiple VRRP instances running on this router, but the requirement is that event handling should only be triggered when the VRRP instance running on redundant-interface transitions to master at PE-3. Therefore, log filter 1 is used to define a more explicit match using the message field, which contains an explicit reference to the interface. Both the trigger entry and the event handler should be put in the no shutdown state.
# on PE-3:
configure
    log
        filter 1 name "itf 172.16.1.3 becomes primary"
            default-action drop
            entry 10 name "newPrimary"
                action forward
                match
                    message eq pattern "interface redundant-interface (primary address 172.16.1.3) changed state to master"
                exit
            exit
        exit
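The combined effect of the event trigger and the log filter can be sketched in Python as follows. This is a simplified model, not the router's implementation: it assumes that the message pattern is applied as a plain substring match, and the function names `log_filter_1` and `handler_for` are hypothetical. The second message in the usage example (interface other-interface, address 10.0.0.1) is invented purely for illustration.

```python
from typing import Optional

# Pattern configured in entry 10 of log filter 1.
PATTERN = (
    "interface redundant-interface "
    "(primary address 172.16.1.3) changed state to master"
)


def log_filter_1(message: str) -> str:
    """Return 'forward' or 'drop' for an event message."""
    # Assumption: the pattern is treated as a plain substring of the message.
    if PATTERN in message:
        return "forward"  # entry 10: action forward
    return "drop"         # default-action drop


def handler_for(application: str, event_id: int, message: str) -> Optional[str]:
    """Return the handler to invoke, or None when the trigger does not fire."""
    if (application.lower(), event_id) != ("vrrp", 2001):
        return None  # not the configured event
    if log_filter_1(message) != "forward":
        return None  # event matches, but the log filter drops it
    return "event-handler-1"


if __name__ == "__main__":
    wanted = (
        "VRRP virtual router instance 1 on interface redundant-interface "
        "(primary address 172.16.1.3) changed state to master"
    )
    # Hypothetical second VRRP instance, used only to show the filter's effect.
    other = (
        "VRRP virtual router instance 2 on interface other-interface "
        "(primary address 10.0.0.1) changed state to master"
    )
    print(handler_for("VRRP", 2001, wanted))
    print(handler_for("VRRP", 2001, other))
```

Only the first message selects event-handler-1; the second is the same event (VRRP 2001) but is dropped by the filter, which is the reason for adding the filter in the first place.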
The configuration of the example event handling for the failure event (PE-3 transitions to VRRP master) is now complete. Shutting down the spoke-SDP between PE-1 and PE-2 simulates a failure in which the VRRP message path is broken. As a result, four events are generated:
- The first indicates that PE-3 has become VRRP master for the interface named redundant-interface.
- The second indicates that EHS handler event-handler-1 was invoked.
- The third indicates that an exec operation was initiated to process the CLI commands contained in script file vrrp-master.txt.
- The fourth indicates that the attempt to execute those CLI commands was successful.
154 2023/09/13 07:45:34.832 UTC MINOR: VRRP #2001 Base Becoming Master
"VRRP virtual router instance 1 on interface redundant-interface
(primary address 172.16.1.3) changed state to master"
155 2023/09/13 07:45:34.832 UTC MINOR: SYSTEM #2069 Base EHS script
"Ehs handler :"event-handler-1" with the description : "" was invoked by the
cli-user account "not-specified"."
156 2023/09/13 07:45:34.836 UTC MAJOR: SYSTEM #2052 Base CLI 'exec'
"A CLI user has initiated an 'exec' operation to process the commands in the SROS CLI
file cf3:/vrrp-master.txt"
157 2023/09/13 07:45:34.841 UTC MAJOR: SYSTEM #2053 Base CLI 'exec'
"The CLI user initiated 'exec' operation to process the commands in the SROS CLI
file cf3:/vrrp-master.txt has completed with the result of success"
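When these events are collected off-box, the outcome of a script run can be read from the message of event 2053. The following Python sketch is a hedged illustration: the function name `exec_result` is hypothetical, and only the result string "success" is confirmed by the output above; any other result wording is not shown in this chapter.

```python
import re
from typing import Optional

# Phrase taken from the SYSTEM #2053 message shown in this chapter.
RESULT = re.compile(r"has completed with the result of (\w+)")


def exec_result(message: str) -> Optional[str]:
    """Return the reported result of an 'exec' operation, or None if absent."""
    # Collapse line wraps and repeated spaces before searching.
    normalized = " ".join(message.split())
    match = RESULT.search(normalized)
    return match.group(1) if match else None


if __name__ == "__main__":
    completed = (
        "The CLI user initiated 'exec' operation to process the commands in the SROS CLI\n"
        "file cf3:/vrrp-master.txt has completed with the result of success"
    )
    print(exec_result(completed))
```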
A successful script run shows the commands contained in the script, followed by an indication that the commands were executed.
*A:PE-3>file cf3:\ # type script-results.txt_20230913-074534-UTC.833059.out
File: script-results.txt_20230913-074534-UTC.833059.out
-------------------------------------------------------------------------------
exit all
configure router policy-options
begin
policy-statement redundant-interface
entry 10
action accept
local-preference 150
exit
exit
exit
commit
exit all
Executed 14 lines in 0.0 seconds from file "cf3:/vrrp-master.txt"
===============================================================================
The following output confirms that PE-3 is VRRP primary:
*A:PE-3# show router vrrp instance
===============================================================================
VRRP Instances
===============================================================================
Interface Name VR Id Own Adm State Base Pri Msg Int
IP Opr Pol Id InUse Pri Inh Int
-------------------------------------------------------------------------------
redundant-interface 1 No Up Master 253 1
IPv4 Up n/a 253 No
Backup Addr: 172.16.1.1
-------------------------------------------------------------------------------
Instances : 1
===============================================================================
Also, the local preference attribute for prefix 172.16.1.0/29 has changed to a value of 150. The result of this action is that PE-3 will now be the transit router for both upstream and downstream traffic.
*A:PE-3# show router bgp routes 172.16.1.0/29 hunt | match expression "Network|Nexthop|To|Local Pref"
Network : 172.16.1.0/29
Nexthop : 192.0.2.3
Res. Nexthop : Unresolved
Local Pref. : 150 Interface Name : NotAvailable
---snip---
Network : 172.16.1.0/29
Nexthop : 192.0.2.3
To : 192.0.2.6
Res. Nexthop : n/a
Local Pref. : 150 Interface Name : NotAvailable
The following command output shows that the event handler triggered and ran the referenced script. The Handler Action-List Entry Execution Statistics section shows the number of times an action (script) was queued to run, and the number of times an error occurred, either during launch or because of a non-operational admin status. The remaining fields in the output are self-explanatory.
*A:PE-3# show log event-handling handler "event-handler-1"
===============================================================================
Event Handling System - Handlers
===============================================================================
===============================================================================
Handler : event-handler-1
===============================================================================
Description : (Not Specified)
Admin State : up Oper State : up
-------------------------------------------------------------------------------
Handler Execution Statistics
Success : 1
Err No Entry : 0
Err Adm Status : 0
Total : 1
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Handler Action-List Entry
-------------------------------------------------------------------------------
Entry-id : 10
Description : (Not Specified)
Admin State : up Oper State : up
Script
Policy Name : vrrp-master-policy
Policy Owner : TiMOS CLI
Min Delay : 0
Last Exec : 09/13/23 07:45:35 UTC
-------------------------------------------------------------------------------
Handler Action-List Entry Execution Statistics
Success : 1
Err Min Delay : 0
Err Launch : 0
Err Adm Status : 0
Total : 1
===============================================================================
The example so far includes an event trigger and script that cover the failover case, where PE-3 becomes VRRP master. Configuration is also needed for the case where PE-3 reverts to VRRP backup. Without another event trigger and script, PE-3 will continue to advertise the prefix 172.16.1.0/29 with a local preference of 150, and upstream/downstream traffic will be asymmetric through PE-2/PE-3 respectively.
As before, a script is required. Because PE-2 advertises the prefix with a local preference of 100 (default), PE-3 needs to advertise the same prefix with a lower value (50 in the following output), so that PE-2 is the preferred next hop.
*A:PE-3>file cf3:\ # type cf3:vrrp-backup.txt
File: vrrp-backup.txt
-------------------------------------------------------------------------------
exit all
configure router policy-options
begin
policy-statement redundant-interface
entry 10
action accept
local-preference 50
exit
exit
exit
commit
exit all
===============================================================================
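The master and backup scripts differ only in the local preference value, so both files could be generated from one template. The following Python sketch shows one way to do that off-box before copying the files to the router; the function name `render_script` is hypothetical, and the command sequence is copied from the script files shown in this chapter.

```python
def render_script(policy: str, entry: int, local_pref: int) -> str:
    """Render the CLI script that sets local-preference in an export policy."""
    lines = [
        "exit all",
        "configure router policy-options",
        "begin",
        f"policy-statement {policy}",
        f"entry {entry}",
        "action accept",
        f"local-preference {local_pref}",
        "exit",
        "exit",
        "exit",
        "commit",
        "exit all",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values from this chapter: 150 when master, 50 when backup.
    for name, pref in (("vrrp-master.txt", 150), ("vrrp-backup.txt", 50)):
        print(f"--- {name} ---")
        print(render_script("redundant-interface", 10, pref), end="")
```

Generating both files from the same template keeps the two scripts from drifting apart if the policy name or entry number changes.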
The script must then be configured within the script-control context, and subsequently referenced in a script policy as vrrp-backup-policy.
# on PE-3:
configure
    system
        script-control
            script "vrrp-backup-script"
                location "cf3:/vrrp-backup.txt"
                no shutdown
            exit
            script-policy "vrrp-backup-policy"
                results "cf3:/script-revert-results.txt"
                script "vrrp-backup-script"
                max-completed 4
                lifetime forever
                no shutdown
            exit
The event handler acts as the interface between the configured script policy and event trigger. Therefore, a second event handler is configured with an action list consisting of a single entry referencing the newly configured vrrp-backup-policy.
# on PE-3:
configure
    log
        event-handling
            handler "event-handler-2"
                action-list
                    entry 10
                        script-policy "vrrp-backup-policy"
                        no shutdown
                    exit
                exit
                no shutdown
            exit
Finally, the event trigger is configured. For the transition to VRRP backup, the application is VRRP and the event number is 2006 (tmnxVrrpBecameBackup). As before, the trigger is filtered on the message field, this time using log filter 2, so that it is specific to the interface named redundant-interface.
# on PE-3:
configure
    log
        filter 2 name "itf 172.16.1.3 state becomes backup"
            default-action drop
            entry 10 name "becameBackup"
                action forward
                match
                    message eq pattern "interface redundant-interface changed state to backup"
                exit
            exit
        exit
# on PE-3:
configure
    log
        event-trigger
            event "vrrp" 2006
                trigger-entry 1
                    event-handler "event-handler-2"
                    log-filter 2
                    no shutdown
                exit
                no shutdown
            exit
        exit
When PE-3 reverts to VRRP backup, four events are generated:
- The first indicates that PE-3 has become VRRP backup for the interface named redundant-interface.
- The second indicates that EHS handler event-handler-2 was invoked.
- The third indicates that an exec operation was initiated to process the CLI commands contained in script file vrrp-backup.txt.
- The fourth indicates that the attempt to execute those CLI commands was successful.
158 2023/09/13 07:58:24.686 UTC MINOR: VRRP #2006 Base Becoming Backup
"VRRP virtual router instance 1 on interface redundant-interface changed state to
backup - current master is 172.16.1.2"
159 2023/09/13 07:58:24.686 UTC MINOR: SYSTEM #2069 Base EHS script
"Ehs handler :"event-handler-2" with the description : "" was invoked by the cli-user
account "not-specified"."
160 2023/09/13 07:58:24.691 UTC MAJOR: SYSTEM #2052 Base CLI 'exec'
"A CLI user has initiated an 'exec' operation to process the commands in the SROS CLI
file cf3:/vrrp-backup.txt"
161 2023/09/13 07:58:24.696 UTC MAJOR: SYSTEM #2053 Base CLI 'exec'
"The CLI user initiated 'exec' operation to process the commands in the SROS CLI
file cf3:/vrrp-backup.txt has completed with the result of success"
The following outputs confirm that PE-3 is VRRP backup, and that the local preference attribute for prefix 172.16.1.0/29 has changed to a value of 50. The result of this action is that PE-2 will now be the transit router for both upstream and downstream traffic.
*A:PE-3# show router vrrp instance
===============================================================================
VRRP Instances
===============================================================================
Interface Name VR Id Own Adm State Base Pri Msg Int
IP Opr Pol Id InUse Pri Inh Int
-------------------------------------------------------------------------------
redundant-interface 1 No Up Backup 253 1
IPv4 Up n/a 253 No
Backup Addr: 172.16.1.1
-------------------------------------------------------------------------------
Instances : 1
===============================================================================
*A:PE-3# show router bgp routes 172.16.1.0/29 hunt | match expression "Network|Nexthop|To|Local Pref"
Network : 172.16.1.0/29
Nexthop : 192.0.2.2
Res. Nexthop : 192.168.23.1
Local Pref. : 100 Interface Name : int-PE-3-PE-2
Network : 172.16.1.0/29
Nexthop : 192.0.2.3
To : 192.0.2.6
Res. Nexthop : n/a
Local Pref. : 50 Interface Name : NotAvailable
Conclusion
EHS allows operators to configure user-defined actions on the router when an event occurs. The event trigger can be anything that is generated by the event-control framework, and explicit filtering is possible using regular expressions. A user-defined action typically runs a script that allows any CLI commands to be executed. Multiple actions are permitted, running multiple scripts if required.