Prometheus Time Series Collection and Processing Server

Rules

ansible managed alert rules			5.261s ago	1.422ms
Rule	State	Error	Last Evaluation	Evaluation Time
alert: Watchdog expr: vector(1) for: 10m labels: severity: warning annotations: description: This is an alert meant to ensure that the entire alerting pipeline is functional. This alert is always firing, therefore it should always be firing in Alertmanager and always fire against a receiver. There are integrations with various notification mechanisms that send a notification when this alert is not firing. For example the "DeadMansSnitch" integration in PagerDuty. summary: Ensure entire alerting pipeline is functional	ok		5.263s ago	387.4us
alert: InstanceDown expr: up == 0 for: 5m labels: severity: critical annotations: description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.' summary: Instance {{ $labels.instance }} down	ok		5.263s ago	387.6us
alert: CriticalCPULoad expr: 100 - (avg by(instance, job) (irate(node_cpu_seconds_total{job="node",mode="idle"}[5m])) * 100) > 96 for: 2m labels: severity: critical annotations: description: '{{ $labels.instance }} of job {{ $labels.job }} has Critical CPU load for more than 2 minutes.' summary: Instance {{ $labels.instance }} - Critical CPU load	ok		5.263s ago	186.2us
alert: CriticalRAMUsage expr: (1 - ((node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes) / node_memory_MemTotal_bytes)) * 100 > 98 for: 5m labels: severity: critical annotations: description: '{{ $labels.instance }} has had Critical Memory Usage for more than 5 minutes.' summary: Instance {{ $labels.instance }} has Critical Memory Usage	ok		5.263s ago	130.8us
alert: CriticalDiskSpace expr: node_filesystem_free_bytes{fstype!~"(squashfs|fuse.*)",job="node",mountpoint!~"^/run(/.*|$)"} / node_filesystem_size_bytes{job="node"} < 0.1 for: 4m labels: severity: critical annotations: description: '{{ $labels.instance }} of job {{ $labels.job }} has less than 10% space remaining.' summary: Instance {{ $labels.instance }} - Critical disk space usage	ok		5.263s ago	198.1us
alert: RebootRequired expr: node_reboot_required > 0 labels: severity: warning annotations: description: '{{ $labels.instance }} requires a reboot.' summary: Instance {{ $labels.instance }} - reboot required	ok		5.263s ago	54.58us
alert: ClockSkewDetected expr: abs(node_timex_offset_seconds) * 1000 > 30 for: 2m labels: severity: warning annotations: description: Clock skew detected on {{ $labels.instance }}. Ensure NTP is configured correctly on this host. summary: Instance {{ $labels.instance }} - Clock skew detected	ok		5.263s ago	59.28us
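
The Rule column above shows each rule's YAML flattened onto a single line. As a rough sketch, three of those rows would look like this in a rule file. This assumes the standard Prometheus rule-file layout and that the group name is the one shown in the group header; the actual Ansible-managed file and its name are not part of this page, so treat the layout as illustrative.

```yaml
# Illustrative reconstruction of three rules from the table above.
# Assumption: the group name matches the group header shown on the Rules page.
groups:
  - name: ansible managed alert rules
    rules:
      # Fires when a scrape target has reported up == 0 for 5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
          summary: 'Instance {{ $labels.instance }} down'

      # No "for:" clause, so this fires on the first evaluation where the expression is true.
      - alert: RebootRequired
        expr: node_reboot_required > 0
        labels:
          severity: warning
        annotations:
          description: '{{ $labels.instance }} requires a reboot.'
          summary: 'Instance {{ $labels.instance }} - reboot required'

      # Offset is reported in seconds; multiplying by 1000 compares it against 30 ms.
      - alert: ClockSkewDetected
        expr: abs(node_timex_offset_seconds) * 1000 > 30
        for: 2m
        labels:
          severity: warning
        annotations:
          description: 'Clock skew detected on {{ $labels.instance }}. Ensure NTP is configured correctly on this host.'
          summary: 'Instance {{ $labels.instance }} - Clock skew detected'
```

A file in this shape can be checked with `promtool check rules <file>` before Prometheus is reloaded.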