Common information
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1002-dev
- description: Unit remove_dangling_cinder_snapshots.service on node cloudbackup1002-dev has been down for a long time.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1002-dev has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudbackup1002-dev:9100
- job: node
- name: remove_dangling_cinder_snapshots.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
Firing alerts
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1002-dev
- description: Unit remove_dangling_cinder_snapshots.service on node cloudbackup1002-dev has been down for a long time.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1002-dev has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudbackup1002-dev:9100
- job: node
- name: remove_dangling_cinder_snapshots.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot