Understanding Operational Intelligence

Use Operational Intelligence to identify and prevent potential service outages. Based on historical metric data, Operational Intelligence detects anomalous CI behavior that events might not capture. Anomaly alerts can be promoted to regular IT alerts, which appear on the Alert Console and on the service health dashboard so that preventive action can be taken.

Starting with the New York release, Operational Intelligence is part of ITOM Health in the IT Operations Management product.

Anomaly detection

Metric data is collected by data sources such as SCOM, the SolarWinds monitoring system, or a Nagios XI server (some of which are partially configured for metric collection by default). These monitoring systems collect metric data from the source environment at regular intervals. Operational Intelligence captures the raw data from these monitoring systems and uses event rules and the CMDB identification engine to map the data to existing CIs and their resources. The data is then analyzed to detect anomalies and to provide other statistical scores.
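
The binding implementation itself is part of the platform, but the following minimal Python sketch illustrates the general idea of normalizing a raw monitoring payload into a CI-bound metric sample before analysis. The payload layout, the field names, and the resolve_ci helper are hypothetical stand-ins, not ServiceNow APIs.

    from dataclasses import dataclass

    @dataclass
    class MetricSample:
        ci_id: str        # CI that the sample is bound to
        resource: str     # sub-component, such as 'Disk C:' (may be empty)
        metric_type: str  # source metric type, such as '% Free Space'
        timestamp: int    # epoch seconds
        value: float

    def normalize(raw, resolve_ci):
        # resolve_ci stands in for the event-rule/identification step that
        # finds the matching CI in the CMDB.
        return MetricSample(
            ci_id=resolve_ci(raw["host"]),
            resource=raw.get("resource", ""),
            metric_type=raw["metric"],
            timestamp=int(raw["time"]),
            value=float(raw["value"]),
        )

    sample = normalize(
        {"host": "web01", "resource": "Disk C:", "metric": "% Free Space",
         "time": 1700000000, "value": 37.2},
        resolve_ci=lambda host: "ci_" + host,  # placeholder for the CMDB lookup
    )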

Operational Intelligence uses historical metric data to build statistical models. These models facilitate projection of expected metric values along with upper and lower bounds. Operational Intelligence then uses these projections to detect statistical outliers and to calculate anomaly scores. Anomalies are scored on a range of 0-10. High anomaly scores for CI metrics can indicate that a CI is at risk of causing a service outage.
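
The exact scoring formula is internal to Operational Intelligence, so the following Python sketch only illustrates the general idea: a value inside the control bounds scores 0, and the score grows with the relative distance outside the nearest bound, capped at 10.

    def anomaly_score(value, lower, upper, max_score=10.0):
        # Values inside the control bounds are not anomalous.
        if lower <= value <= upper:
            return 0.0
        width = max(upper - lower, 1e-9)  # guard against a zero-width band
        # Relative distance outside the nearest bound.
        deviation = (value - upper) / width if value > upper else (lower - value) / width
        # Squash the deviation into the 0-10 range; larger deviations score higher.
        return max_score * deviation / (1.0 + deviation)

    # A value far above the upper bound yields a high score:
    print(round(anomaly_score(140.0, lower=40.0, upper=80.0), 1))  # 6.0

A score above a threshold is what generates an anomaly alert, as described under Terms used with Operational Intelligence below.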

After processing, the Insights Explorer shows metric statistics and charts, and the Anomaly Map shows correlated scores across a timeline for the CIs with the highest anomaly scores.

Operational Intelligence is available when you activate the Operational Intelligence (com.snc.sa.metric) plugin.

Terms used with Operational Intelligence

Source metric type

A metric, such as '% Free Space' or 'Current Bandwidth', that a data source can measure for a CI. For each data source, you can choose which of the possible source metric types are processed. For example, about 380 source metric types are active by default for the SCOM data source.

Anomaly
Data that is outside the control bounds is considered a statistical outlier. These outliers are used to compute an anomaly score, a value between 0 and 10 that indicates how unlikely the metric value appears. When an anomaly score is above a threshold, an anomaly alert is generated. Anomaly alerts are reported separately from regular IT alerts.
Resource
A component of a CI, of which there can be multiple individual instances of a similar type, each of which can be monitored separately. For example, individual web pages, or specific disks such as 'Disk C:' and 'Disk D:'.
Time series
A series of values (such as metric values) over a time range, associated with a CI and a metric type. Because an anomaly score is evaluated for each metric, the series of anomaly scores over a period of time are also a time series. Time series are computed by the statistical model built for a metric data series, and are used with metric data values, anomaly scores, and upper and lower control bounds.
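
As an illustration only (this is not the schema of the Metric Time Series Models [sa_time_series] table mentioned below), a time series can be pictured as timestamped values for one CI and metric type, with the derived anomaly scores forming a second time series over the same timestamps. The score values here are made up for the example.

    # Hypothetical shape only: timestamped values for one CI and metric type.
    series = {
        "ci_id": "ci_web01",
        "metric_type": "CPU Utilization",
        "points": [              # (epoch seconds, metric value)
            (1700000000, 42.5),
            (1700000060, 44.1),
            (1700000120, 97.3),  # outlier
        ],
    }
    # The anomaly scores evaluated for these points are themselves a time
    # series; the values below are illustrative, not computed.
    anomaly_scores = [(1700000000, 0.0), (1700000060, 0.0), (1700000120, 7.2)]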

Statistical models

Operational Intelligence jobs learn from past metric data (up to 32 days old). A model training process analyzes historical data to construct a model that projects future values. Typically, a model remains in effect until the next time the learning process runs. These models are used to calculate upper and lower control bounds. Incoming values that fall outside those bounds, and that deviate with statistical significance from expected values, generate anomalies. Each model is labeled with a classifier that describes the general behavior of the data, and this classification determines whether anomaly detection can be applied. For most models, it is possible to project which future values deviate from expected values. Such models are associated with control bounds, and anomaly detection can be applied to them (if enabled).

However, for some models, there is insufficient data to determine which values are anomalous, and anomaly detection cannot be applied without additional information (even if anomaly detection is enabled).

The learned data models are stored in the Metric Time Series Models [sa_time_series] table.

The following statistical models and classifiers are used in anomaly detection:

Time Series statistical model
After it is established, a time series model does not adjust to changes in the incoming metric data. Therefore, if the pattern of the incoming data changes, those changes are likely to be identified as anomalous. After they are learned, the upper and lower control bounds persist until the next time the learning process runs (data is learned every day).
Weekly
Data with a pattern that repeats itself over weekly intervals (seasonal model).

Requires a minimum of 15 days of data in the series, as set by the weekly_model_min_days configuration setting.

Diagram of the Weekly classifier.

Daily
Data with a pattern that repeats itself over a daily interval (seasonal model).

Requires a minimum of 3 days of data in the series, as set by the daily_model_min_days configuration setting.

Diagram of the Daily classifier.

Trendy
Data that has a linear trend with some slope and with some noise.

Requires a minimum of 30 data points in the series, as set by the corrupt_data_count_threshold configuration setting.

Diagram of the Trendy classifier.

Noisy
Typical noisy data; this is the basic pattern classification in a data model. The pattern cannot be identified as having a specific trend or seasonality.

Requires a minimum of 30 data points in the series, as set by the corrupt_data_count_threshold configuration setting.

Diagram of the Noisy classifier.

Positive clipped noisy
Similar to the noisy classifier, except that the lower bound is fixed at 0.

Requires a minimum of 30 data points in the series, as set by the corrupt_data_count_threshold configuration setting.

Diagram of the Positive clipped noisy classifier.

Centered noisy

Noisy data that typically spreads symmetrically between user-specified upper and lower bounds. The formula used to set the bound and width values ignores the statistical data, and the lower and upper widths have an identical value.

Requires that the number of data points in the series is zero.

See Specify custom upper and lower metric bounds for more information.

Diagram of the Centered noisy classifier.

Skewed noisy

Noisy data that is not evenly spread between user-specified upper and lower bounds, but instead tends to concentrate closer to one of the bounds. The median of the data is used to separately compute an upper width and a lower width.

Requires a minimum of one data point in the series.

See Specify custom upper and lower metric bounds for more information.

Diagram of the Skewed noisy classifier.

Accumulator
Data pattern similar to the trendy classifier, but with a monotonic increase and without noise. For this classifier, there is no data model and no anomaly detection.

Requires a minimum of 30 data points in the series, as set by the corrupt_data_count_threshold configuration setting.

Diagram of the Accumulator classifier.

Near Constant
Nearly constant data, in which most values are a specific constant value. For this classifier, there is no data model and no anomaly detection.

Requires a minimum of 30 data points in the series, as set by the corrupt_data_count_threshold configuration setting.

Diagram of the Near Constant classifier.

Multinomial
Data pattern in which all values are one of a relatively small number of values. For example, values are always 100 or 99.9. For this classifier, there is no data model and no anomaly detection.

Requires a minimum of 400 data points in the series, calculated as 10 times the value of the multinomial_count_threshold configuration setting.

Diagram of the Multinomial classifier.

Corrupt
Data has insufficient data points to identify a pattern. For this classifier, there is no data model and no anomaly detection.

Requires that the number of data points in the series is less than the value of the corrupt_data_count_threshold configuration setting (30 by default).
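
The exact selection logic is internal to the Learner, but the data-count requirements listed above can be summarized in a sketch like the following. The parameter names mirror the configuration settings mentioned above with their documented defaults; the ordering of the checks and the distinct-value test for the multinomial classifier are simplifying assumptions.

    def candidate_classifiers(num_points, num_days, num_distinct,
                              corrupt_data_count_threshold=30,
                              daily_model_min_days=3,
                              weekly_model_min_days=15,
                              multinomial_count_threshold=40):
        # Too few points: no pattern can be identified.
        if num_points < corrupt_data_count_threshold:
            return ["Corrupt"]
        candidates = []
        # Seasonal models need enough history to see the pattern repeat.
        if num_days >= weekly_model_min_days:
            candidates.append("Weekly")
        if num_days >= daily_model_min_days:
            candidates.append("Daily")
        # Few distinct values over many points suggests the multinomial
        # classifier (assumption: the threshold also bounds the distinct values).
        if (num_points >= 10 * multinomial_count_threshold
                and num_distinct <= multinomial_count_threshold):
            candidates.append("Multinomial")
        # With at least 30 points, the trend- and noise-based classifiers apply.
        candidates += ["Trendy", "Noisy", "Positive clipped noisy",
                       "Accumulator", "Near Constant"]
        return candidates

    print(candidate_classifiers(num_points=5000, num_days=20, num_distinct=3000))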

Kalman Filter statistical model
An add-on to the time series statistical model, applicable only to the noisy and positive clipped noisy classifiers. This model is a general method of estimating model parameters from a stream of data, where the level is the only parameter in the model. The Kalman Filter model can adjust to new values in incoming metric data. When there are no clear patterns in the noise, or if there is too much noise, the Kalman Filter model is not used.
Local level
When incoming data clusters around a new value relative to the current control bounds, the Learner adjusts the data model to accommodate a permanent change. The clustering is detected as a new level in the data model, so that most incoming data is again within the control bounds rather than anomalous. Such change detection is useful when, for example, cores or memory are added to a server, which shifts the baselines.

Requires a minimum of 30 data points in the series, as set by the corrupt_data_count_threshold configuration setting.

Diagram of the Kalman Filter Local Level classifier.

Unrecognized
When data does not fit the local level classifier, time series classifiers are used. This happens when it is not possible to adjust the variance ratio in a learned local level model to reasonable values.
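
A local level model is a standard Kalman filter whose only state is the level of the series. The following minimal Python sketch (with illustrative noise parameters, not values used by Operational Intelligence) shows how the estimated level follows a permanent shift in the incoming data, such as the server upgrade example above.

    def local_level_filter(values, process_var=0.5, obs_var=4.0):
        # Minimal local-level Kalman filter: the only state is the level.
        level, p = values[0], 1.0        # initial level estimate and its variance
        estimates = []
        for y in values:
            p += process_var             # predict: level unchanged, uncertainty grows
            gain = p / (p + obs_var)     # Kalman gain
            level += gain * (y - level)  # correct toward the new observation
            p *= (1.0 - gain)
            estimates.append(level)
        return estimates

    # Data that jumps from around 40 to around 70 partway through:
    data = [40.0, 41.0, 39.5, 40.5, 70.0, 69.5, 70.5, 70.0, 69.0, 70.5, 70.0]
    print([round(x, 1) for x in local_level_filter(data)])
    # The estimated level tracks the shift toward 70, which is how the model
    # can adapt its baseline after a permanent change.
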
Non-Parametric statistical model
An add-on to the positive clipped noisy classifier. In the non-parametric model, the noise distribution is not symmetrical and does not fit any seasonal pattern. The non-parametric model creates control bounds that better fit the actual data and, once learned, the control bounds persist until the next learning cycle. This model does not adjust itself to changes in the data, so it takes longer for a deviation to be identified as an anomaly.
Stationary Non-Parametric
Data that is not time-dependent, meaning that there is no significant shift in parameters such as mean and variance when the data is shifted in time.

Requires a minimum of 5000 data points in the series, as set by the snpm_minimum_data_count configuration setting.

Diagram of the Non-Parametric Stationary classifier.

Unrecognized
When data does not fit the stationary classifier, time series classifiers are used.
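
The estimator that Operational Intelligence uses for the non-parametric bounds is not described here. As a purely illustrative sketch of the general approach, empirical percentiles of the historical data can serve as control bounds without assuming any distribution or symmetry. The percentile choices and the half-normal sample data below are assumptions for the example; the 5,000-point history matches the snpm_minimum_data_count default mentioned above.

    import random

    def empirical_bounds(values, lower_pct=0.5, upper_pct=99.5):
        # Sort once and read off the requested percentiles; no distribution
        # (and no symmetry) is assumed, which suits skewed noise.
        ordered = sorted(values)
        def pct(p):
            return ordered[int(round((p / 100.0) * (len(ordered) - 1)))]
        return pct(lower_pct), pct(upper_pct)

    random.seed(1)
    history = [abs(random.gauss(0, 5)) for _ in range(5000)]  # positive, skewed noise
    print(empirical_bounds(history))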