Configuring event handler for operational groups
The key to preventing traffic from being black-holed is to not allow it to be forwarded to a leaf that has no active uplinks; for example, by disabling access links as soon as uplinks become operationally disabled.
The following sections provide an example of configuring event handler for an operational group (oper-group):
Configuring the event handler instance
To configure an event handler instance for the oper-group feature:
- Define a set of uplinks to monitor in the paths statement.
- Specify downlinks (access links) and other parameters in the options statement.
- Provide the name of a MicroPython script in the upython-script statement.
Defining uplinks to monitor in the paths statement
In the paths statement of an event handler instance used with an oper-group, you define the set of uplinks that are necessary to provide service for a set of downlinks. The oper-group feature works by monitoring the operational state of the uplinks, and uses the state information to determine whether the operational state of the access links must be changed.
In this example, the operational state of two uplink interfaces on Leaf 1, ethernet-1/49 and ethernet-1/50, is being monitored. If the operational state of the uplink interfaces changes to down, the oper-group feature changes the state of the access interface to down to avoid black-holing of traffic from the client.
To monitor the operational state for a set of interfaces, configure the paths statement in an event handler instance. For example:
--{ candidate shared default }--[ ]--
# info system event-handler instance opergroup paths
system {
    event-handler {
        instance opergroup {
            paths [
                "interface ethernet-1/{49..50} oper-state"
            ]
        }
    }
}
Specify the contents of the paths statement in SR Linux CLI format. In the example above, the paths statement is equivalent to the following CLI command:
--{ running }--[ ]--
# info from state interface ethernet-1/{49..50} oper-state
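To illustrate which paths the range notation covers, the following minimal Python sketch expands a {49..50} range or a {1,2} list into individual paths. The expand_path helper is hypothetical and written only for this illustration; it is not part of SR Linux and does not claim to reflect how the CLI performs the expansion internally.

```python
import re

def expand_path(pattern):
    """Expand one {a..b} range or {a,b,...} list in a CLI-style path.

    Hypothetical helper for illustration; handles a single brace group only.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        # No brace group: the pattern already names a single path.
        return [pattern]
    body = match.group(1)
    if ".." in body:
        start, end = body.split("..")
        items = [str(n) for n in range(int(start), int(end) + 1)]
    else:
        items = [item.strip() for item in body.split(",")]
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [prefix + item + suffix for item in items]

paths = expand_path("interface ethernet-1/{49..50} oper-state")
print(paths)
```

For the example above, the sketch yields one path for ethernet-1/49 and one for ethernet-1/50, which matches the two entries that event handler later passes to the script.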
Specifying downlinks and other parameters in the options statement
The options statement in the event handler instance allows you to define objects that are passed to the script to be used as input parameters.
For an oper-group configuration, you can use the options statement to indicate the relationship between the monitored uplinks and the access links.
In the following example, the options define objects that specify the following:
- The access links that react to state changes of the uplinks
- The number of operationally up uplinks required for the access links to stay up
--{ candidate shared default }--[ ]--
# info system event-handler instance opergroup options
system {
    event-handler {
        instance opergroup {
            options {
                object down-links {
                    values [
                        ethernet-1/1
                    ]
                }
                object required-up-uplinks {
                    value 1
                }
                object debug {
                    value true
                }
            }
        }
    }
}
In this example, the down-links object specifies an interface name. When the object is passed to the script, it can be used as a parameter indicating the access link associated with the uplinks. For this oper-group configuration, the down-links object indicates the interface for which the operational state depends on the state of the uplinks defined in the paths statement.
The required-up-uplinks object specifies the number of uplinks that must be operationally up for the access link to remain up. For this oper-group configuration, the value is 1, which means that at least one uplink must be up. The script calculates the number of uplinks that are operationally up, and compares that number to the value in the required-up-uplinks object.
The debug object is set to true, which directs the script to print the values of certain script variables.
Specifying script name in the upython-script statement
The upython-script statement in the event handler instance specifies the name of the MicroPython script to be invoked when SR Linux detects a change in the interfaces defined in the paths statement.
The MicroPython script must reside in one of the following locations:
- /etc/opt/srlinux/eventmgr for user-provided scripts
- /opt/srlinux/eventmgr for Nokia-provided scripts
--{ candidate shared default }--[ ]--
# info system event-handler instance opergroup upython-script
system {
    event-handler {
        instance opergroup {
            upython-script oper-group.py
        }
    }
}
For this oper-group configuration, whenever a change occurs to the oper-state of the interfaces defined in the paths statement, event handler invokes the oper-group.py script.
Event handler oper-group configuration
When administratively enabled, the full configuration for this event handler instance looks like the following:
--{ candidate shared default }--[ ]--
# info system event-handler instance oper-group
system {
    event-handler {
        instance oper-group {
            admin-state enable
            upython-script oper-group.py
            paths [
                "interface ethernet-1/{1,2} oper-state"
            ]
            options {
                object down-links {
                    values [
                        ethernet-1/{20,22}
                    ]
                }
                object hold-down-time {
                    value 5000
                }
                object required-up-uplinks {
                    value 2
                }
            }
        }
    }
}
MicroPython script for oper-group
When there is a state change in any of the paths defined in the paths statement of the event handler instance, the script defined in the upython-script statement is invoked. Event handler calls the function event_handler_main() in the script, passing it a JSON string indicating the current state of the monitored paths, as well as the object:value pairs defined in the options statement.
The script receives this input, processes it, and returns a list of actions.
Script input
For the example in Event handler oper-group configuration, the input JSON string consists of the current state of the two uplinks and the provided options. The following JSON string is passed to the oper-group.py script if the operational state of interface ethernet-1/49 changes to down:
{
    "paths": [
        {
            "path": "interface ethernet-1/49 oper-state",
            "value": "down"
        },
        {
            "path": "interface ethernet-1/50 oper-state",
            "value": "up"
        }
    ],
    "options": {
        "debug": "true",
        "required-up-uplinks": "1",
        "down-links": [
            "ethernet-1/1"
        ]
    }
}
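As a small illustration of this input format, the sketch below parses the JSON string above with the standard json module. Note that the option values arrive as strings ("1", "true"), so a script has to convert them before comparing numbers or testing flags. The variable names states, required, and debug are chosen for this illustration only.

```python
import json

# Same content as the input string shown above.
in_json_str = """
{
  "paths": [
    {"path": "interface ethernet-1/49 oper-state", "value": "down"},
    {"path": "interface ethernet-1/50 oper-state", "value": "up"}
  ],
  "options": {
    "debug": "true",
    "required-up-uplinks": "1",
    "down-links": ["ethernet-1/1"]
  }
}
"""

in_json = json.loads(in_json_str)

# Map each monitored path to its reported value.
states = {entry["path"]: entry["value"] for entry in in_json["paths"]}

# Option values are strings, so convert them explicitly.
required = int(in_json["options"]["required-up-uplinks"])
debug = in_json["options"]["debug"] == "true"

print(states)
print(required, debug)
```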
Script processing
The following is the oper-group.py script referenced in the event handler instance.
import sys
import json

def count_up_uplinks(paths):
    up_cnt = 0
    for path in paths:
        if path.get('value','down') == 'up':
            up_cnt = up_cnt+1
    return up_cnt

def required_up_uplinks(options):
    return int(options.get('required-up-uplinks', '1'))

def hold_time(options):
    return int(options.get('hold-down-time', '0'))

def bool_to_oper_state(val):
    return ('down','up')[bool(val)]

def event_handler_main(in_json_str):
    in_json = json.loads(in_json_str)
    paths = in_json['paths']
    options = in_json['options']
    persist = in_json.get('persistent-data', {})

    num_up_uplinks = count_up_uplinks(paths)
    downlink_should_be_up = required_up_uplinks(options) <= num_up_uplinks
    needs_hold_down = False
    # down->up transition will be held for optional hold-time
    if (hold_time(options) > 0) and downlink_should_be_up:
        needs_hold_down = persist.get("last-state", "up") == "down"

    if options.get("debug") == "true":
        print(
            f"hold down time = {hold_time(options)}ms\n\
num of required up uplinks = {required_up_uplinks(options)}\n\
detected num of up uplinks = {num_up_uplinks}\n\
downlinks new state = {bool_to_oper_state(downlink_should_be_up)}\n\
needs_hold_down = {str(needs_hold_down)}"
        )

    response_actions = []
    oper_state_str = bool_to_oper_state(not needs_hold_down and downlink_should_be_up)
    for downlink in options.get('down-links'):
        response_actions.append({'set-ephemeral-path' : {'path':'interface {0} oper-state'.format(downlink),'value':oper_state_str}})
    if needs_hold_down:
        response_actions.append({'reinvoke-with-delay' : hold_time(options)})
    response_persistent_data = {'last-state':bool_to_oper_state(downlink_should_be_up)}
    response = {'actions':response_actions,'persistent-data':response_persistent_data}
    return json.dumps(response)
The following sections describe how each part of the script processes the input for this oper-group example.
Parsing input JSON
Starting with the event_handler_main function, the incoming JSON string is parsed and the relevant portions are extracted.
def event_handler_main(in_json_str):
    in_json = json.loads(in_json_str)
    paths = in_json['paths']
    options = in_json['options']
    persist = in_json.get('persistent-data', {})
The paths and options are objects defined in the incoming JSON string, and they are saved in their respective like-named variables.
Determining the state for the downlink
With the input parsed, the script determines the required state of the downlink, based on the received input.
num_up_uplinks = count_up_uplinks(paths)
downlink_should_be_up = required_up_uplinks(options) <= num_up_uplinks
needs_hold_down = False
First, the script counts the number of uplinks in oper-state up using the count_up_uplinks() function, which simply walks through the current state of the uplinks passed into the script by event handler.
def count_up_uplinks(paths):
    up_cnt = 0
    for path in paths:
        if path.get('value','down') == 'up':
            up_cnt = up_cnt+1
    return up_cnt
After calculating how many uplinks are operationally up, the script determines the required state for the downlinks. To make this decision, it compares the number of operational uplinks to the required number of uplinks (from the required-up-uplinks option):
If the number of operationally up uplinks is less than the required number, the downlink is set operationally down to prevent traffic black-holing. On the other hand, if the number of operational uplinks is greater than or equal to the required number of uplinks, the downlink is set operationally up.
The calculated state of the downlink is saved in the downlink_should_be_up variable.
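The comparison can be exercised on its own. The following sketch reuses the counting logic from the script and evaluates it for the example input and for two variations. The downlink_state function is a hypothetical wrapper added for this illustration; it is not part of oper-group.py.

```python
def count_up_uplinks(paths):
    # Count monitored paths whose reported value is 'up';
    # a path without a value is treated as down.
    return sum(1 for path in paths if path.get('value', 'down') == 'up')

def downlink_state(paths, required):
    # Hypothetical wrapper: 'up' when enough uplinks are up, otherwise 'down'.
    return 'up' if required <= count_up_uplinks(paths) else 'down'

one_down = [
    {'path': 'interface ethernet-1/49 oper-state', 'value': 'down'},
    {'path': 'interface ethernet-1/50 oper-state', 'value': 'up'},
]
both_down = [dict(path, value='down') for path in one_down]

print(downlink_state(one_down, 1))   # up: one uplink is still available
print(downlink_state(both_down, 1))  # down: no uplink is available
print(downlink_state(one_down, 2))   # down: two uplinks would be required
```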
Populating the debug log
The debug option causes the script variables to appear in the debug log.
if options.get("debug") == "true":
    print(
        f"hold down time = {hold_time(options)}ms\n\
num of required up uplinks = {required_up_uplinks(options)}\n\
detected num of up uplinks = {num_up_uplinks}\n\
downlinks new state = {bool_to_oper_state(downlink_should_be_up)}\n\
needs_hold_down = {str(needs_hold_down)}"
    )
The debug log is present only if the debug option is set to "true" in the event handler instance configuration.
You can display the debug log by using the following CLI command:
--{ running }--[ ]--
# info from state system event-handler instance opergroup last-stdout-stderr
Composing output
At this point, the script is able to define the correct state for the downlinks, based on the state of the monitored uplinks and the required number of healthy uplinks. For the event handler to take action, the script needs to output a JSON string following the format defined in Actions.
response_actions = []
oper_state_str = bool_to_oper_state(not needs_hold_down and downlink_should_be_up)
for downlink in options.get('down-links'):
    response_actions.append({'set-ephemeral-path' : {'path':'interface {0} oper-state'.format(downlink),'value':oper_state_str}})
if needs_hold_down:
    response_actions.append({'reinvoke-with-delay' : hold_time(options)})
response_persistent_data = {'last-state':bool_to_oper_state(downlink_should_be_up)}
response = {'actions':response_actions,'persistent-data':response_persistent_data}
return json.dumps(response)
This code composes the output JSON string from the calculated downlink state (oper_state_str) and the list of downlinks provided in the down-links option.
The output JSON string contains the set-ephemeral-path action, which sets the oper-state of the downlink to the correct value (up or down).
The output is provided via the response dictionary, and is JSON-encoded before returning from the function. This routine provides a JSON string back to the event handler, which processes and executes the actions passed to it.
The result of this processing shows the implementation of the oper-group feature: the event handler executes actions to set the state of a downlink based on the state of a group of uplinks.
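The script also handles the optional hold-down-time option, which delays the down-to-up transition of the downlinks. The following sketch mirrors the output-composition logic of the script for that case. The compose_response function, its signature, and the sample values are hypothetical and exist only for this illustration. The second call assumes that the re-invocation receives the persistent data returned by the first call.

```python
import json

def compose_response(down_links, should_be_up, last_state, hold_ms):
    # Hold the downlinks down when recovering from 'down' and a hold time is set.
    needs_hold_down = hold_ms > 0 and should_be_up and last_state == 'down'
    state = 'up' if (should_be_up and not needs_hold_down) else 'down'
    actions = [
        {'set-ephemeral-path': {'path': 'interface {0} oper-state'.format(link),
                                'value': state}}
        for link in down_links
    ]
    if needs_hold_down:
        # Ask to be called again once the hold time has elapsed.
        actions.append({'reinvoke-with-delay': hold_ms})
    persistent = {'last-state': 'up' if should_be_up else 'down'}
    return json.dumps({'actions': actions, 'persistent-data': persistent})

# Uplinks recover after an outage, with a 5000 ms hold-down time configured.
first = json.loads(compose_response(['ethernet-1/1'], True, 'down', 5000))

# Assumed re-invocation after the delay, using the stored last-state.
second = json.loads(compose_response(
    ['ethernet-1/1'], True, first['persistent-data']['last-state'], 5000))

print(first['actions'])
print(second['actions'])
```

In the first response the downlink stays down and a reinvoke-with-delay action is added; in the second response the stored last-state is up, so the downlink is set to up without a further delay.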
Displaying oper-group information
When an event handler instance is configured and administratively enabled, an initial sync of the monitored paths state is performed. As a result of this initial sync, event handler immediately attempts to execute a script when it receives the state for the monitored paths.
You can display the status of an event handler instance by querying the state datastore. For example:
# /info from state system event-handler instance opergrp
system {
event-handler {
instance opergrp {
admin-state enable
upython-script oper-group.py
oper-state up
paths [
"interface ethernet-1/1 oper-state"
"interface ethernet-1/4 oper-state"
]
options {
object down-links {
values [
ethernet-1/3
ethernet-1/8
]
}
object required-num-up-links {
value 2
}
}
last-execution {
start-time now
end-time now
upython-duration 1
input "{\"paths\":[{\"path\":\"interface ethernet-1/1 oper-state\",\"value\":\"up\"},{\"path\":\"interface ethernet-1/4 oper-state\",\"value\":\"up\"}],\"options\":{\"down-links\":[\"ethernet-1/3\",\"ethernet-1/8\"],\"required-num-up-links\":\"2\"},\"persistent-data\":{\"last-state\":\"up\"}}"
output "{\"actions\": [{\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/3 oper-state\", \"value\": \"up\"}}, {\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/8 oper-state\", \"value\": \"up\"}}], \"persistent-data\": {\"last-state\": \"up\"}}"
stdout-stderr ""
}
last-errored-execution {
oper-down-reason admin-disabled
oper-down-reason-detail ""
start-time "26 seconds ago"
end-time "25 seconds ago"
upython-duration 0
input "{\"paths\":[{\"path\":\"interface ethernet-1/1 oper-state\",\"value\":\"up\"},{\"path\":\"interface ethernet-1/4 oper-state\",\"value\":\"down\"}],\"options\":{\"down-links\":[\"ethernet-1/3\",\"ethernet-1/8\"],\"required-num-up-links\":\"2\"},\"persistent-data\":{\"last-state\":\"down\"}}"
output "{\"actions\": [{\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/3 oper-state\", \"value\": \"up\"}}, {\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/8 oper-state\", \"value\": \"up\"}}], \"persistent-data\": {\"last-state\": \"up\"}}"
stdout-stderr ""
}
statistics {
upython-duration 516
execution-count 1643
execution-successes 1642
execution-errors 1
}
}
}
}
This command displays the following information:
oper-state
The operational state of the event handler instance. If there are any errors in the script or the configuration, the state is down.
last-execution
Information about the most recent time the script was executed.
last-errored-execution
Information about the last time the script was executed with an error result. This includes the oper-down-reason, the oper-down-reason-detail, the input and output JSON strings, and the output of print statements and log messages sent to stdout-stderr by the script.
statistics
Statistics related to the execution process.
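The input and output leaves hold JSON strings, so they can be decoded for inspection outside the CLI. The following sketch decodes the output value from the example above, written here with the CLI's backslash escapes removed. How the string is retrieved from the device is outside the scope of this sketch, and the variable names are illustrative.

```python
import json

# The output leaf from the example above, without the CLI's backslash escapes.
output_leaf = (
    '{"actions": ['
    '{"set-ephemeral-path": {"path": "interface ethernet-1/3 oper-state", "value": "up"}}, '
    '{"set-ephemeral-path": {"path": "interface ethernet-1/8 oper-state", "value": "up"}}], '
    '"persistent-data": {"last-state": "up"}}'
)

response = json.loads(output_leaf)

# List each downlink path together with the state the script requested for it.
requested = [
    (action["set-ephemeral-path"]["path"], action["set-ephemeral-path"]["value"])
    for action in response["actions"]
    if "set-ephemeral-path" in action
]
for path, value in requested:
    print(path, "->", value)
```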
For the oper-group example, the following output is displayed if one of the uplinks goes down:
--{ running }--[ ]--
# info from state system event-handler instance opergrp
system {
event-handler {
instance opergrp {
admin-state enable
upython-script oper-group.py
oper-state up
last-execution {
start-time now
end-time now
upython-duration 1
input "{\"paths\":[{\"path\":\"interface ethernet-1/1 oper-state\",\"value\":\"up\"},{\"path\":\"interface ethernet-1/4 oper-state\",\"value\":\"up\"}],\"options\":{\"down-links\":[\"ethernet-1/3\",\"ethernet-1/8\"],\"required-num-up-links\":\"2\"},\"persistent-data\":{\"last-state\":\"up\"}}"
output "{\"actions\": [{\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/3 oper-state\", \"value\": \"up\"}}, {\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/8 oper-state\", \"value\": \"up\"}}], \"persistent-data\": {\"last-state\": \"up\"}}"
stdout-stderr ""num of required up uplinks = 1
detected num of up uplinks = 1
downlinks new state = up"
}
}
}
}
The setting for downlinks new state is up because the detected num of up uplinks did not drop below the required number of 1. The downlink interface therefore remains operationally up.
If both of the uplinks go down, the following is displayed:
--{ running }--[ ]--
# info from state system event-handler instance opergrp
system {
event-handler {
instance opergrp {
admin-state enable
upython-script oper-group.py
oper-state up
last-execution {
start-time now
end-time now
upython-duration 1
input "{\"paths\":[{\"path\":\"interface ethernet-1/1 oper-state\",\"value\":\"up\"},{\"path\":\"interface ethernet-1/4 oper-state\",\"value\":\"up\"}],\"options\":{\"down-links\":[\"ethernet-1/3\",\"ethernet-1/8\"],\"required-num-up-links\":\"2\"},\"persistent-data\":{\"last-state\":\"up\"}}"
output "{\"actions\": [{\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/3 oper-state\", \"value\": \"up\"}}, {\"set-ephemeral-path\": {\"path\": \"interface ethernet-1/8 oper-state\", \"value\": \"up\"}}], \"persistent-data\": {\"last-state\": \"up\"}}"
stdout-stderr ""num of required up uplinks = 1
detected num of up uplinks = 0
downlinks new state = down"
}
}
}
}
The detected num of up uplinks is 0, which is below the required number of 1. This causes event handler to set the downlink interface to operationally down.
In this way, event handler uses the oper-group feature to disable the access link when the uplink interfaces go down, therefore preventing traffic from black-holing.