State and health of alerts

There are three key components that help you understand how your alerts behave during their evaluation: alert instance state, alert rule state, and alert rule health. Although related, each component conveys subtly different information.

Alert instance state

An alert instance can be in one of the following states:

| State | Description |
|---|---|
| Normal | The state of an alert when the condition (threshold) is not met. |
| Pending | The state of an alert that has breached the threshold but for less than the pending period. |
| Alerting | The state of an alert that has breached the threshold for longer than the pending period. |
| NoData | The state of an alert whose query returns no data or all values are null. |
| Error | The state of an alert when an error or timeout occurred evaluating the alert rule. |
Alert instance state diagram
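The transitions in the table above can be sketched as a small state machine. This is a minimal illustration, not Grafana's implementation: the `Instance` class, the `evaluate` helper, and the "greater than threshold" condition are all hypothetical, and treating an elapsed time exactly equal to the pending period as Alerting is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InstanceState(Enum):
    NORMAL = "Normal"
    PENDING = "Pending"
    ALERTING = "Alerting"
    NO_DATA = "NoData"
    ERROR = "Error"


@dataclass
class Instance:
    state: InstanceState = InstanceState.NORMAL
    # Time of the first evaluation in the current run of breaches, if any.
    breached_since: Optional[float] = None


def evaluate(instance, now, value, threshold, pending_period, error=False):
    """Advance one alert instance by a single evaluation (hypothetical helper)."""
    if error:
        # An error or timeout occurred while evaluating the rule.
        instance.state = InstanceState.ERROR
        instance.breached_since = None
    elif value is None:
        # The query returned no data (or only nulls).
        instance.state = InstanceState.NO_DATA
        instance.breached_since = None
    elif value <= threshold:
        # Condition not met.
        instance.state = InstanceState.NORMAL
        instance.breached_since = None
    else:
        # Condition met: Pending until the pending period has elapsed.
        if instance.breached_since is None:
            instance.breached_since = now
        elapsed = now - instance.breached_since
        # Assumption: elapsed == pending_period already counts as Alerting.
        if elapsed >= pending_period:
            instance.state = InstanceState.ALERTING
        else:
            instance.state = InstanceState.PENDING
    return instance.state
```

With a threshold of 80 and a 60-second pending period, an instance that first breaches at `now=10` stays Pending on later breaching evaluations until `now=70`, when it becomes Alerting.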

Notifications

Alert instances will be routed for notifications when they are in the Alerting state or have been Resolved, transitioning from Alerting to Normal state.
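As a rough sketch of the rule above, the routing decision can be written as a predicate over the previous and current instance state. The `should_notify` function is hypothetical; it only models the condition described here and ignores notification policies, grouping, and timing.

```python
def should_notify(previous: str, current: str) -> bool:
    """Return True if an instance qualifies for notification routing (illustrative only)."""
    firing = current == "Alerting"
    # Resolved: the instance was Alerting and is now back to Normal.
    resolved = previous == "Alerting" and current == "Normal"
    return firing or resolved
```

Under this sketch, a Pending instance that returns to Normal never qualifies, because it never reached the Alerting state.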

Keep last state

The “Keep Last State” option helps mitigate temporary data source issues, preventing alerts from unintentionally firing, resolving, and re-firing.

In the alert rule settings, you can configure an alert instance to keep its last state when a NoData and/or Error state is encountered. Just like normal evaluation, the alert instance transitions from Pending to Alerting after the pending period has elapsed.

However, in situations where strict monitoring is critical, relying solely on the “Keep Last State” option may not be appropriate. Instead, consider using an alternative or implementing additional alert rules to ensure that issues with prolonged data source disruptions are detected.
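The effect of keeping the last state can be sketched as follows. The function name and its parameters are hypothetical and do not correspond to Grafana settings; `keep_last_on` stands for whichever of NoData and Error the rule is configured to handle this way.

```python
def apply_keep_last_state(evaluated, last_state, keep_last_on, pending_elapsed=False):
    """Pick the state an instance ends up in after one evaluation (illustrative only).

    evaluated:       state produced by this evaluation, e.g. "NoData"
    last_state:      state the instance had before this evaluation
    keep_last_on:    set of states ("NoData", "Error") configured to keep the last state
    pending_elapsed: True if the pending period has already elapsed
    """
    if evaluated not in keep_last_on:
        # Not configured for this state: the evaluated state applies as usual.
        return evaluated
    if last_state == "Pending" and pending_elapsed:
        # As in normal evaluation, Pending still moves on to Alerting.
        return "Alerting"
    return last_state
```

In this sketch, a brief data source outage that yields NoData leaves an Alerting instance in Alerting rather than resolving and re-firing it. The same logic keeps a Normal instance in Normal for as long as the outage lasts, which is why prolonged disruptions need separate detection.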

Special alerts for NoData and Error

When the evaluation of an alert rule produces a NoData or Error state, Grafana Alerting generates a new alert instance that has the following additional labels:

  • alertname: Either DatasourceNoData or DatasourceError depending on the state.
  • datasource_uid: The UID of the data source that caused the state.

You can manage these alerts like regular ones by using their labels to apply actions such as adding a silence, routing via notification policies, and more.
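To illustrate how these labels can be targeted, the sketch below matches a special alert's label set against equality matchers. The `matches` helper and the `datasource_uid` value are hypothetical, and only exact-equality matching is modeled here.

```python
# Labels of a hypothetical special alert; the UID value is made up for illustration.
special_alert = {
    "alertname": "DatasourceNoData",
    "datasource_uid": "example-uid",
}


def matches(labels: dict, matchers: dict) -> bool:
    """Return True if every matcher key is present in labels with an equal value."""
    return all(labels.get(name) == value for name, value in matchers.items())


# A matcher set that would select all NoData special alerts.
no_data_matchers = {"alertname": "DatasourceNoData"}
selected = matches(special_alert, no_data_matchers)
```

Adding `datasource_uid` to the matchers would narrow the selection to alerts caused by a single data source.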

Alert rule state

The alert rule state is determined by the “worst case” state of the alert instances produced. For example, if one alert instance is Alerting, the alert rule state is Firing.

An alert rule can be in one of the following states:

| State | Description |
|---|---|
| Normal | None of the alert instances returned by the evaluation engine is in a Pending or Alerting state. |
| Pending | At least one alert instance returned by the evaluation engine is Pending. |
| Firing | At least one alert instance returned by the evaluation engine is Alerting. |
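The "worst case" rule can be sketched as a simple precedence check over the instance states. The `rule_state` function is a hypothetical helper that follows the table literally, so instances in NoData or Error do not affect the result.

```python
def rule_state(instance_states):
    """Derive the alert rule state from its instances' states (illustrative only)."""
    states = set(instance_states)
    if "Alerting" in states:
        return "Firing"   # at least one instance is Alerting
    if "Pending" in states:
        return "Pending"  # no Alerting instances, at least one Pending
    return "Normal"       # no instance is Pending or Alerting
```

For example, a rule whose instances are Normal, Pending, and Alerting is Firing, because Alerting takes precedence over Pending.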

Alert rule health

An alert rule can have one of the following health statuses:

| State | Description |
|---|---|
| Ok | No error when evaluating an alerting rule. |
| Error | An error occurred when evaluating an alerting rule. |
| NoData | The absence of data in at least one time series returned during a rule evaluation. |
| {status}, KeepLast | The rule would have received another status but was configured to keep the last state of the alert rule. |