What are the best practices for indicator creation and management?
Indicator best practices
Keep the following notes in mind when creating and managing indicators:
-
Consider the following when you set the collection interval and window duration:
-
The collection interval configured in NSP is ignored by resources that are managed by NFM-P. Instead, the collection interval of resources managed by NFM-P is determined by the corresponding NFM-P MIB entry policy.
-
When configuring an indicator with a formula for resources managed by NFM-P, the window duration must be at least twice as long as, and be a multiple of, the collection interval set by the NFM-P MIB entry policy. For example, a 15 min window duration satisfies both conditions for a 5 min MIB policy.
-
When configuring an indicator with a formula for resources managed by MDM, the window duration must be at least twice as long as, and be a multiple of, the collection interval.
-
Avoid divide-by-zero scenarios when constructing formulas. An arithmetic operation that divides by zero produces no output, while an aggregation operation outputs zero for the operation that divides by zero.
-
Before completing indicator creation, it is a best practice to verify resources to ensure you see a reasonable value for each counter.
Certain counters may not be fully supported in an NE managed by NFM-P and they default to zero, which causes an unexpected calculated KPI value. Other counters may have an actual value of zero. For example, a value of zero for a speed counter can indicate a down port—that is, an actual value of zero speed—or a counter that does not appear in the relevant mapping artifact.
-
Clicking VERIFY RESOURCES in an indicator shows the network device identifiers for the resources that respond to the object filter. The network device identifier is the unique identifier included by the NE in the telemetry messages.
The network device identifier is also the NE identifier used for indicator charting and threshold events in Visualizations.
-
If an indicator configuration is updated, the indicator restarts, causing an expected gap in data until it can start collecting data with the new window duration.
-
Indicator data points are time stamped to the start of the window duration. For example, for an indicator with a window duration of 15 min, the data point or threshold event at 16:15 on the chart represents the window from 16:15 to 16:30.
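The window duration rule and the time stamp convention in the notes above can be sketched in Python. This is only an illustration: the function names are hypothetical and not part of NSP, and the sketch assumes that windows are aligned to midnight, which the notes do not state.

```python
from datetime import datetime, timedelta


def is_valid_window(window_min: int, interval_min: int) -> bool:
    """Return True if the window is at least twice the collection
    interval and a whole multiple of it (rule from the notes above)."""
    return window_min >= 2 * interval_min and window_min % interval_min == 0


def window_start(ts: datetime, window_min: int) -> datetime:
    """Return the start of the window containing ts, which is the time
    stamp applied to the data point.

    Assumption: windows are aligned to midnight of the same day.
    """
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed_min = int((ts - midnight).total_seconds() // 60)
    return midnight + timedelta(minutes=(elapsed_min // window_min) * window_min)
```

With a 5 min collection interval, `is_valid_window(15, 5)` accepts the 15 min window, while `is_valid_window(12, 5)` rejects a window that is not a multiple of the interval. Under the alignment assumption, a sample taken at 16:22 belongs to the data point stamped 16:15.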
Output from indicators with formulas
After the window duration ends, the data required by the formula must be received and processed before the indicator value is generated.
A keep-alive delay of 120 s is applied after the window duration to ensure that all messages from the window have been received before the value is output. The output for a window is delayed until both of the following criteria are met: the keep-alive has expired, and a new message is received after the expiry of the keep-alive.
Example: Indicators with a 5-minute collection interval and 15-minute window duration. The 15-minute collection window starts at midnight (00:00 to 00:15).
After the window closes, the indicator waits for the next set of messages to be received before proceeding to calculate the indicator value for the previous window. When a message is received with a time stamp after 00:17, that is, the end of the window plus the 2 min keep-alive, it confirms that all messages from the 00:00 to 00:15 window have been received and the value can be calculated.
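The keep-alive check described above can be sketched as follows. The names `KEEP_ALIVE` and `confirms_window` are hypothetical and used only for illustration, and the comparison is assumed to be strict, matching the wording "after 00:17".

```python
from datetime import datetime, timedelta

# 120 s keep-alive applied after the end of the window (value from the text).
KEEP_ALIVE = timedelta(seconds=120)


def confirms_window(window_end: datetime, message_ts: datetime) -> bool:
    """Return True if a message time stamp falls after the end of the
    window plus the keep-alive, which confirms that all messages from
    the window have been received.

    Assumption: the comparison is strict (after, not at, the expiry).
    """
    return message_ts > window_end + KEEP_ALIVE
```

For the 00:00 to 00:15 window, a message stamped 00:15 does not confirm the window, while the next 5-minute collection at 00:20 does.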
The delay for calculation depends on the formula:
-
Output from an indicator with an arithmetic formula requires an additional collection interval. For this example, the delay is 5 min.
Example formula: {received-octets-periodic_sum}+{transmitted-octets-periodic_sum}
The data point for the 00:00 to 00:15 collection window is processed at 00:20. The sum can be calculated when the next collection occurs and a value time stamped after 00:17 is received.
-
Output from an indicator with an aggregate formula requires an additional collection interval, plus the window duration (for this example, 20 minutes).
Example formulas:
The data point for the 00:00 to 00:15 window is processed at 00:35.
The example formula is a sum of values which are collected at an interval. Each of these values is confirmed when the next interval starts, at 00:20. After the values are confirmed, the sum can be calculated. The sum is confirmed when another aggregated message from after 00:17 is received, at 00:35.
-
A more complex formula requiring multiple levels of aggregation requires additional delays.
Example formula, with two layers of aggregation: max(({transmitted-octets-periodic_sum}+{received-octets-periodic_sum})/sum({speed_avg}))
The delay for this formula is one collection interval, plus one window duration for each level of aggregation. For this example, the delay is 5 min + 2 × 15 min = 35 min.
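The three cases above follow one pattern: one collection interval, plus one window duration per level of aggregation. A minimal sketch of that arithmetic, using a hypothetical helper name and treating an arithmetic formula as having zero aggregation levels:

```python
def output_delay_min(interval_min: int, window_min: int, aggregation_levels: int) -> int:
    """Estimate the delay, in minutes after the window closes, before
    the indicator value is output.

    This generalizes the three examples in the text; it is an
    illustration, not a documented NSP formula.
    """
    if aggregation_levels < 0:
        raise ValueError("aggregation_levels must be zero or greater")
    return interval_min + aggregation_levels * window_min
```

With a 5 min collection interval and a 15 min window duration, this gives 5 min for an arithmetic formula, 20 min for one level of aggregation, and 35 min for two levels, matching the examples above.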