How do baselines and anomalies work?

Components

A baseline provides the logic for collecting baseline statistics and for detecting anomalies.

When you create a baseline, you specify a resource or group of resources from which to collect a set of statistics over a defined time window. Information collected during that window is used to calculate a data point for the baseline. You also define a season, which is the length of time over which statistics must be measured to assess trends. For example, to assess network traffic, you could set up a 15-minute window with a one-week season, which provides values calculated every 15 minutes over a one-week period.

Creating a baseline also creates a baseline subscription to collect the required data.

Note: Baseline subscriptions and telemetry subscriptions are separate. A baseline cannot be generated from data collected by a telemetry subscription.

On-demand NFM-P statistics cannot be used to create baselines.

A baseline consists of the following components, which appear in the Create and Edit forms.

Baselines are created on a per-resource basis. A resource is an entity that can collect the desired statistics. In the Create Baselines form, you configure the required parameters and choose the resources to collect the statistics.

If the NE is managed using MDM, configuring a baseline initiates statistics collection. If the NE is managed by NFM-P, statistics collection must be configured on the NFM-P, and the resource must already be collecting the desired statistics before a baseline can be created.

Note: Baseline Analytics is different from the NSP Analytics application.

Baseline Analytics provides near-real-time baseline and anomaly detection from telemetry counters, for example, received octets for the /telemetry:base/interfaces/interface telemetry type.

The Analytics application computes a baseline for data configured for reporting, for example, utilization and throughput for a port in a Port LAG Details report, or bandwidth and data for an application group in a Router Level Usage Summary report with Baseline. See the Analytics Report Catalog and the NSP User Guide for more information about Analytics.

Baseline Analytics data storage

Baseline data is stored in Postgres, unless an auxiliary database is enabled, in which case all collected data is stored in the auxiliary database.

The following data is stored:

By default, data is stored in Postgres for 35 days and in the auxiliary database for 90 days. These values can be changed using the RESTCONF API or by updating the age-out policy; see How do I edit an age-out policy?.

General

The general parameters include the following:

For example, if you create a baseline and set the Collection Interval to 30, the Season to 1 week, and the Window Duration to 15 minutes, the baseline subscription collects the statistics values every 30 seconds, calculates a baseline data point every 15 minutes, and assesses trends based on one week of data.
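The arithmetic implied by these settings can be sketched as follows. This is an illustrative calculation only; the variable names below are not part of the NSP API and simply mirror the parameters described above.

```python
# Illustrative arithmetic for the example baseline settings above.
# These names are not NSP parameters; they mirror the text.

collection_interval_s = 30           # one raw sample every 30 seconds
window_duration_s = 15 * 60          # one baseline data point per 15-minute window
season_s = 7 * 24 * 60 * 60          # trends assessed over a one-week season

samples_per_window = window_duration_s // collection_interval_s
windows_per_season = season_s // window_duration_s

print(samples_per_window)   # 30 raw samples feed each baseline data point
print(windows_per_season)   # 672 data points per one-week season
```

So with these settings, each baseline data point summarizes 30 raw samples, and a full season of trend data comprises 672 data points.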

Filter & Counters

The Filter & Counters parameters declare the telemetry values to be collected, the counter types, and the resources of interest.

When a telemetry type is selected, the COUNTERS button becomes available.

You can configure one of the following counter types:

Configure an object filter as needed to filter the available resources; see How do object filters work?.

When at least one counter is added and a counter type is specified, the VERIFY RESOURCES button becomes available.

Detectors

A detector defines the rules for anomaly detection. A detector rule provides an acceptable range of expected values. If a collected value falls outside the range, it is marked as anomalous.

Anomaly detection is optional.

A detector rule is composed of the following:

The comparison and threshold parameters define the range of acceptable values. For example, a rule could state that a value with an absolute Z-score greater than 2 is an anomaly.
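A minimal sketch of such a Z-score rule, assuming the baseline history for the current window is available as a list of values. The function and variable names are illustrative, not the product's detector implementation.

```python
import statistics

def is_anomalous(value, history, threshold=2.0):
    """Flag a value whose absolute Z-score against the baseline
    history exceeds the threshold. Illustrative sketch only."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # No variability in the baseline: any deviation is anomalous.
        return value != mean
    z = (value - mean) / stdev
    return abs(z) > threshold

# Invented sample data: mean 100, sample standard deviation 2.
history = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomalous(103, history))  # False: within 2 standard deviations
print(is_anomalous(120, history))  # True: far outside the expected range
```

The threshold parameter plays the role of the configured comparison value: raising it tolerates larger deviations before a value is marked anomalous.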

Algorithms

You can define a rule based on an algorithm.

The following algorithms are suitable for most purposes:

The Z-score algorithms are useful because they incorporate the standard deviation: in addition to recording how far the current value is from the mean, the algorithm also factors in the variability of the values. This can be very important when deciding if a value is anomalous. If your values are highly variable, that is, the standard deviation is high, it is important to choose a Z-score algorithm.
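To illustrate why the standard deviation matters, consider two series with the same mean: the same absolute distance from the mean can be strongly anomalous in a stable series but unremarkable in a noisy one. The data below is invented for illustration.

```python
import statistics

def z_score(value, series):
    """Distance from the mean in units of standard deviation."""
    return (value - statistics.fmean(series)) / statistics.stdev(series)

stable = [50, 51, 49, 50, 50, 51, 49, 50]   # low variability, mean 50
noisy  = [50, 70, 30, 55, 45, 65, 35, 50]   # high variability, mean 50

# The test value 60 is 10 above the mean in both cases.
print(z_score(60, stable))  # well above a Z-score threshold of 2
print(z_score(60, noisy))   # well below a Z-score threshold of 2
```

A fixed absolute threshold would treat both cases identically; the Z-score correctly flags only the value that is unusual relative to how the series normally varies.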

You can also use one of the following: