Server Admin Log

From Wikitech

2024-12-14

  • 12:46 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 12:41 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync

2024-12-13

  • 20:19 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1025.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:28 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 19:27 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1025.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 19:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 19:10 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 19:08 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 19:08 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 19:08 mstyles@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 19:07 mstyles@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 18:25 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 17:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2113.codfw.wmnet
  • 17:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2113.codfw.wmnet
  • 17:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2112.codfw.wmnet
  • 17:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2112.codfw.wmnet
  • 17:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2111.codfw.wmnet
  • 17:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2111.codfw.wmnet
  • 17:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2110.codfw.wmnet
  • 17:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2110.codfw.wmnet
  • 17:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2110.codfw.wmnet with OS bookworm
  • 17:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2111.codfw.wmnet with OS bookworm
  • 17:26 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2110.codfw.wmnet with reason: host reimage
  • 17:21 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2111.codfw.wmnet with reason: host reimage
  • 17:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 17:17 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2110.codfw.wmnet with reason: host reimage
  • 17:16 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2111.codfw.wmnet with reason: host reimage
  • 17:16 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:11 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:10 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:05 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:04 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:02 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:00 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2110
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2110
  • 16:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2111
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2111
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2110.codfw.wmnet with OS bookworm
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2111.codfw.wmnet with OS bookworm
  • 16:58 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2110.codfw.wmnet
  • 16:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2110.codfw.wmnet
  • 16:57 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2111.codfw.wmnet
  • 16:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2111.codfw.wmnet
  • 16:54 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:50 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:48 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:47 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:45 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:41 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:39 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:36 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:35 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:35 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:34 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:32 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:31 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:31 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:30 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:29 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:19 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2112.codfw.wmnet with OS bookworm
  • 16:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2113.codfw.wmnet with OS bookworm
  • 15:59 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:57 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2112.codfw.wmnet with reason: host reimage
  • 15:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2113.codfw.wmnet with reason: host reimage
  • 15:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2112.codfw.wmnet with reason: host reimage
  • 15:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2113.codfw.wmnet with reason: host reimage
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2113
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2113
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2112
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2112
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2113.codfw.wmnet with OS bookworm
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2112.codfw.wmnet with OS bookworm
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2112.codfw.wmnet
  • 15:26 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2112.codfw.wmnet
  • 15:26 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2113.codfw.wmnet
  • 15:26 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2113.codfw.wmnet
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2114.codfw.wmnet
  • 15:18 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2114.codfw.wmnet
  • 15:17 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2115.codfw.wmnet
  • 15:17 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2115.codfw.wmnet
  • 15:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2115.codfw.wmnet with OS bookworm
  • 15:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2114.codfw.wmnet with OS bookworm
  • 14:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2115.codfw.wmnet with reason: host reimage
  • 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host build2002.codfw.wmnet with OS bookworm
  • 14:54 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[2007-2010].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 14:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2010.codfw.wmnet with OS bookworm
  • 14:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2114.codfw.wmnet with reason: host reimage
  • 14:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2115.codfw.wmnet with reason: host reimage
  • 14:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2114.codfw.wmnet with reason: host reimage
  • 14:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2010.codfw.wmnet with reason: host reimage
  • 14:31 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2010.codfw.wmnet with reason: host reimage
  • 14:31 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2115
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2115
  • 14:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2114
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2114
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2115.codfw.wmnet with OS bookworm
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2114.codfw.wmnet with OS bookworm
  • 14:29 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:28 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2114.codfw.wmnet
  • 14:26 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2114.codfw.wmnet
  • 14:26 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:26 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2115.codfw.wmnet
  • 14:25 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2115.codfw.wmnet
  • 14:25 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:21 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2116.codfw.wmnet
  • 14:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2116.codfw.wmnet
  • 14:21 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2117.codfw.wmnet
  • 14:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2117.codfw.wmnet
  • 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 14:15 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2010
  • 14:14 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2010
  • 14:13 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2010.codfw.wmnet with OS bookworm
  • 14:13 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 14:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2009.codfw.wmnet with OS bookworm
  • 14:03 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:58 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host build2002.codfw.wmnet with OS bookworm
  • 13:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM build2002.codfw.wmnet
  • 13:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2009.codfw.wmnet with reason: host reimage
  • 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM build2002.codfw.wmnet
  • 13:48 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:45 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2009.codfw.wmnet with reason: host reimage
  • 13:38 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 13:28 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2116.codfw.wmnet with OS bookworm
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2117.codfw.wmnet with OS bookworm
  • 13:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2116.codfw.wmnet with reason: host reimage
  • 13:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2117.codfw.wmnet with reason: host reimage
  • 13:01 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2116.codfw.wmnet with reason: host reimage
  • 13:00 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2117.codfw.wmnet with reason: host reimage
  • 12:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2116
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2116
  • 12:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2117
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2117
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2117.codfw.wmnet with OS bookworm
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2116.codfw.wmnet with OS bookworm
  • 12:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2116.codfw.wmnet
  • 12:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2116.codfw.wmnet
  • 12:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2117.codfw.wmnet
  • 12:30 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2009
  • 12:30 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2009
  • 12:30 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2009.codfw.wmnet with OS bookworm
  • 12:28 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2008.codfw.wmnet with OS bookworm
  • 12:27 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2117.codfw.wmnet
  • 12:24 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2119.codfw.wmnet
  • 12:24 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2119.codfw.wmnet
  • 12:24 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2118.codfw.wmnet
  • 12:24 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2118.codfw.wmnet
  • 12:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2118.codfw.wmnet with OS bookworm
  • 12:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2119.codfw.wmnet with OS bookworm
  • 12:09 moritzm: bump build2002 to 400G T379343
  • 12:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2008.codfw.wmnet with reason: host reimage
  • 12:06 aokoth@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Update
  • 12:05 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2008.codfw.wmnet with reason: host reimage
  • 12:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2118.codfw.wmnet with reason: host reimage
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2119.codfw.wmnet with reason: host reimage
  • 11:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2118.codfw.wmnet with reason: host reimage
  • 11:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2119.codfw.wmnet with reason: host reimage
  • 11:48 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2008
  • 11:48 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2008
  • 11:47 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2008.codfw.wmnet with OS bookworm
  • 11:46 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2007.codfw.wmnet with OS bookworm
  • 11:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2118
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2118
  • 11:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2119
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2119
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2118.codfw.wmnet with OS bookworm
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2119.codfw.wmnet with OS bookworm
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2118.codfw.wmnet
  • 11:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2118.codfw.wmnet
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2121.codfw.wmnet
  • 11:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2121.codfw.wmnet
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2120.codfw.wmnet
  • 11:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2120.codfw.wmnet
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2119.codfw.wmnet
  • 11:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2119.codfw.wmnet
  • 11:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2007.codfw.wmnet with reason: host reimage
  • 11:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2007.codfw.wmnet with reason: host reimage
  • 11:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2007
  • 11:06 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2007
  • 11:05 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2007.codfw.wmnet with OS bookworm
  • 11:04 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[2007-2010].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:00 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[2001-2004].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:00 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2004.codfw.wmnet with OS bookworm
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2120.codfw.wmnet with OS bookworm
  • 10:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2121.codfw.wmnet with OS bookworm
  • 10:40 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2004.codfw.wmnet with reason: host reimage
  • 10:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2120.codfw.wmnet with reason: host reimage
  • 10:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2121.codfw.wmnet with reason: host reimage
  • 10:30 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2004.codfw.wmnet with reason: host reimage
  • 10:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2120.codfw.wmnet with reason: host reimage
  • 10:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2121.codfw.wmnet with reason: host reimage
  • 10:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2004
  • 10:13 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2004
  • 10:13 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2004.codfw.wmnet with OS bookworm
  • 10:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2120
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2120
  • 10:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2121
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2121
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2120.codfw.wmnet with OS bookworm
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2121.codfw.wmnet with OS bookworm
  • 10:11 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2003.codfw.wmnet with OS bookworm
  • 10:10 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2120.codfw.wmnet
  • 10:10 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2120.codfw.wmnet
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2121.codfw.wmnet
  • 10:07 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2121.codfw.wmnet
  • 09:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2122.codfw.wmnet
  • 09:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2122.codfw.wmnet
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2123.codfw.wmnet
  • 09:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2123.codfw.wmnet
  • 09:50 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2003.codfw.wmnet with reason: host reimage
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2123.codfw.wmnet with OS bookworm
  • 09:47 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2003.codfw.wmnet with reason: host reimage
  • 09:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2122.codfw.wmnet with OS bookworm
  • 09:42 Emperor: depool/restart swift/repool ms-fe1014
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2018.codfw.wmnet
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2018.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:36 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2018.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2123.codfw.wmnet with reason: host reimage
  • 09:29 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2003
  • 09:29 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2003
  • 09:29 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2003.codfw.wmnet with OS bookworm
  • 09:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2002.codfw.wmnet with OS bookworm
  • 09:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2018.codfw.wmnet
  • 09:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2122.codfw.wmnet with reason: host reimage
  • 09:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2123.codfw.wmnet with reason: host reimage
  • 09:22 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2122.codfw.wmnet with reason: host reimage
  • 09:09 xSavitar: T382078 Ran mwscript-k8s --comment="T382078" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=trwikiquote --logwiki=metawiki 'Roggenwolf' 'ChopinAficionado'
  • 09:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 09:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2123
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2123
  • 09:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2122
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2122
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2123.codfw.wmnet with OS bookworm
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2122.codfw.wmnet with OS bookworm
  • 09:05 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2122.codfw.wmnet
  • 09:02 xSavitar: T382078 Ran mwscript-k8s --comment="T382078" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Norberto Luis Amoroso Jacquet' 'Renamed user fe0fd27068061604303a2a5ab7390149'
  • 09:01 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Update
  • 08:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2122.codfw.wmnet
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2123.codfw.wmnet
  • 08:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2123.codfw.wmnet
  • 08:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2126.codfw.wmnet
  • 08:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2126.codfw.wmnet
  • 08:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2125.codfw.wmnet
  • 08:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2125.codfw.wmnet
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2126.codfw.wmnet with OS bookworm
  • 08:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2017.codfw.wmnet
  • 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2017.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2017.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2125.codfw.wmnet with OS bookworm
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2002
  • 08:45 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2002
  • 08:45 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2002.codfw.wmnet with OS bookworm
  • 08:43 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2001.codfw.wmnet with OS bookworm
  • 08:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2006.codfw.wmnet
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2005.codfw.wmnet
  • 08:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2017.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2006.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2005.codfw.wmnet
  • 08:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2126.codfw.wmnet with reason: host reimage
  • 08:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2004.codfw.wmnet
  • 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2003.codfw.wmnet
  • 08:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2125.codfw.wmnet with reason: host reimage
  • 08:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2004.codfw.wmnet
  • 08:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2003.codfw.wmnet
  • 08:22 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host maps-test2002.codfw.wmnet
  • 08:21 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2126.codfw.wmnet with reason: host reimage
  • 08:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host maps-test2001.codfw.wmnet
  • 08:21 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2125.codfw.wmnet with reason: host reimage
  • 08:19 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2002.codfw.wmnet
  • 08:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
  • 08:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2125
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2125
  • 08:02 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2001
  • 08:02 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2001
  • 08:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2126
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2126
  • 08:02 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bookworm
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2125.codfw.wmnet with OS bookworm
  • 08:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2126.codfw.wmnet with OS bookworm
  • 08:00 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[2001-2004].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2125.codfw.wmnet
  • 07:59 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 07:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2125.codfw.wmnet
  • 07:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2126.codfw.wmnet
  • 07:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2126.codfw.wmnet
  • 07:49 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 07:41 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 07:30 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync

2024-12-12

  • away: UTC late deploys done
  • 22:35 tgr@deploy2002: Finished scap sync-world: Backport for change metric types back to counters (T374050) (duration: 19m 10s)
  • 22:30 tgr@deploy2002: tgr, cwhite: Continuing with sync
  • 22:29 eileen: config revision changed from ca701cba to 404bbbd5
  • 22:20 tgr@deploy2002: tgr, cwhite: Backport for change metric types back to counters (T374050) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:16 tgr@deploy2002: Started scap sync-world: Backport for change metric types back to counters (T374050)
  • 22:14 tgr@deploy2002: Finished scap sync-world: Backport for EditCheck: move checks to a sidebar (T341308 T379443) (duration: 29m 12s)
  • 22:09 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker127[6-7].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 22:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1277.eqiad.wmnet with OS bookworm
  • 22:08 inflatador: bking@cumin2002 sudo cumin A:gitlab-runner 'systemctl restart ferm.service' T371994
  • 22:03 tgr@deploy2002: tgr, kemayo: Continuing with sync
  • 22:02 tgr@deploy2002: tgr, kemayo: Backport for EditCheck: move checks to a sidebar (T341308 T379443) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1277.eqiad.wmnet with reason: host reimage
  • 21:46 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1277.eqiad.wmnet with reason: host reimage
  • 21:45 tgr@deploy2002: Started scap sync-world: Backport for EditCheck: move checks to a sidebar (T341308 T379443)
  • 21:38 tgr@deploy2002: Finished scap sync-world: Backport for Reader Survey: Deploy on eswiki, dewiki and frwiki. (T378660) (duration: 12m 42s)
  • 21:35 inflatador: bking@gitlab-runner2004 restart ferm to troubleshoot missing iptables rules T371994
  • 21:32 tgr@deploy2002: dani, tgr: Continuing with sync
  • 21:32 inflatador: bking@gitlab-runner2004 restart docker to troubleshoot missing iptables rules T371994
  • 21:32 tgr@deploy2002: dani, tgr: Backport for Reader Survey: Deploy on eswiki, dewiki and frwiki. (T378660) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1277
  • 21:25 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1277
  • 21:25 tgr@deploy2002: Started scap sync-world: Backport for Reader Survey: Deploy on eswiki, dewiki and frwiki. (T378660)
  • 21:25 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1277.eqiad.wmnet with OS bookworm
  • 21:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1276.eqiad.wmnet with OS bookworm
  • 21:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 21:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1276.eqiad.wmnet with reason: host reimage
  • 21:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1275.eqiad.wmnet with reason: host reimage
  • 21:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1276.eqiad.wmnet with reason: host reimage
  • 20:56 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1275.eqiad.wmnet with reason: host reimage
  • 20:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1276
  • 20:40 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1276
  • 20:40 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1276.eqiad.wmnet with OS bookworm
  • 20:38 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker127[6-7].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 20:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1275
  • 20:37 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1275
  • 20:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 20:32 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1270-1275].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 20:32 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 19:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1275
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1275
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 19:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1274.eqiad.wmnet with OS bookworm
  • 19:42 swfrench@deploy2002: Finished scap sync-world: Deployment to populate mw-api-int migration release files - T377040 (duration: 02m 13s)
  • 19:40 swfrench@deploy2002: Started scap sync-world: Deployment to populate mw-api-int migration release files - T377040
  • 19:30 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 19:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 19:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 19:27 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1274.eqiad.wmnet with reason: host reimage
  • 19:24 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1274.eqiad.wmnet with reason: host reimage
  • 19:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1274
  • 19:04 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1274
  • 19:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1274.eqiad.wmnet with OS bookworm
  • 19:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1273.eqiad.wmnet with OS bookworm
  • 18:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
  • 18:39 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
  • 18:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:22 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1273
  • 18:19 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1273
  • 18:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1273.eqiad.wmnet with OS bookworm
  • 18:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1272.eqiad.wmnet with OS bookworm
  • 18:08 James_F: Running `mwscript-k8s -f -- extensions/WikiLambda/maintenance/updateSecondaryTables.php --wiki=wikifunctionswiki --zType Z4 --report --verbose`
  • 17:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1272.eqiad.wmnet with reason: host reimage
  • 17:55 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1272.eqiad.wmnet with reason: host reimage
  • 17:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:53 ottomata: killing wikidatawiki xml dump process to try to unstick it - T382084
  • 17:41 aqu@deploy2002: Finished deploy [airflow-dags/analytics@c2d7e08]: Backfill pageview actor hourly 2024 12 (duration: 03m 03s)
  • 17:41 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:38 aqu@deploy2002: Started deploy [airflow-dags/analytics@c2d7e08]: Backfill pageview actor hourly 2024 12
  • 17:37 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:37 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1272
  • 17:35 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1272
  • 17:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1272.eqiad.wmnet with OS bookworm
  • 17:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1271.eqiad.wmnet with OS bookworm
  • 17:25 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:23 bvibber: chart-renderer deployment T382039 complete
  • 17:21 bvibber@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 17:20 bvibber@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 17:20 bvibber@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 17:19 bvibber@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 17:19 bvibber@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 17:18 bvibber@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 17:16 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 17:15 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1271.eqiad.wmnet with reason: host reimage
  • 17:13 bvibber: doing service deploy for chart-renderer (T382039)
  • 17:11 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1271.eqiad.wmnet with reason: host reimage
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: sync
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: sync
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: sync
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: sync
  • 16:52 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 16:52 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:52 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 16:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1271
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1271
  • 16:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1271.eqiad.wmnet with OS bookworm
  • 16:51 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1270.eqiad.wmnet with OS bookworm
  • 16:44 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:43 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 16:41 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:39 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:39 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 16:39 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:37 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 16:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1270.eqiad.wmnet with reason: host reimage
  • 16:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1270.eqiad.wmnet with reason: host reimage
  • 16:27 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:17 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1270
  • 16:08 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1270
  • 16:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1270.eqiad.wmnet with OS bookworm
  • 16:06 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1270-1275].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:06 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:56 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:43 bking@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: 0.3.150 (duration: 00m 13s)
  • 15:43 bking@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: 0.3.150
  • 15:34 ladsgroup@deploy2002: Finished scap sync-world: Backport for Activate tigwiki (T381377) (duration: 09m 25s)
  • 15:34 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:29 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 15:28 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:28 ladsgroup@deploy2002: ladsgroup: Backport for Activate tigwiki (T381377) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:24 ladsgroup@deploy2002: Started scap sync-world: Backport for Activate tigwiki (T381377)
  • 15:24 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:19 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 15:19 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 15:18 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 15:17 ladsgroup@deploy2002: Finished scap sync-world: Backport for Add tigwiki to pre-install (T381377) (duration: 09m 35s)
  • 15:16 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 15:16 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:16 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:12 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:11 ladsgroup@deploy2002: ladsgroup: Backport for Add tigwiki to pre-install (T381377) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:09 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 15:09 bking@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: 0.3.150 (duration: 00m 05s)
  • 15:09 bking@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: 0.3.150
  • 15:08 ladsgroup@deploy2002: Started scap sync-world: Backport for Add tigwiki to pre-install (T381377)
  • 15:03 eevans@cumin1002: END (ERROR) - Cookbook sre.cassandra.roll-restart (exit_code=97) for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:57 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:57 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:55 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 14:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:54 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 14:54 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 14:54 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 14:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 14:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71708 and previous config saved to /var/cache/conftool/dbconfig/20241212-144846-root.json
  • 14:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71707 and previous config saved to /var/cache/conftool/dbconfig/20241212-143340-root.json
  • 14:30 Amir1: ladsgroup@mwmaint2002:~$ foreachwikiindblist all userOptions.php --delete VectorSkinVersion (T54777)
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71706 and previous config saved to /var/cache/conftool/dbconfig/20241212-141835-root.json
  • 14:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host db1208.eqiad.wmnet
  • 14:04 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71705 and previous config saved to /var/cache/conftool/dbconfig/20241212-140329-root.json
  • 14:03 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2127.codfw.wmnet
  • 14:03 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2127.codfw.wmnet
  • 14:03 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 14:01 btullis@cumin1002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2127.codfw.wmnet with OS bookworm
  • 13:55 btullis@cumin1002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 13:53 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:52 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:52 btullis@cumin1002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 13:48 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:48 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71704 and previous config saved to /var/cache/conftool/dbconfig/20241212-134824-root.json
  • 13:48 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:47 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1169.eqiad.wmnet with reason: maintenance
  • 13:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1169.eqiad.wmnet with reason: maintenance
  • 13:46 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 13:46 btullis@cumin1002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 13:41 moritzm: installing Python 3.11 security updates
  • 13:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2127.codfw.wmnet with reason: host reimage
  • 13:39 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2127.codfw.wmnet with reason: host reimage
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T381532)', diff saved to https://phabricator.wikimedia.org/P71703 and previous config saved to /var/cache/conftool/dbconfig/20241212-133633-marostegui.json
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:32 moritzm: rebalance Ganeti cluster in codfw/D following server refresh T376594
  • 13:29 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2016].codfw.wmnet with reason: maintenance
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc[2014,2016].codfw.wmnet with reason: maintenance
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2127
  • 13:18 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2127
  • 13:18 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2127.codfw.wmnet with OS bookworm
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2127.codfw.wmnet
  • 13:15 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2127.codfw.wmnet
  • 13:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[1013,1017].eqiad.wmnet with reason: maintenance
  • 13:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc[1013,1017].eqiad.wmnet with reason: maintenance
  • 13:12 mszabo@deploy2002: Finished scap sync-world: Backport for Enable IRS in the Project namespace on ptwiki (T382061) (duration: 09m 41s)
  • 13:06 mszabo@deploy2002: mszabo: Continuing with sync
  • 13:05 mszabo@deploy2002: mszabo: Backport for Enable IRS in the Project namespace on ptwiki (T382061) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:02 mszabo@deploy2002: Started scap sync-world: Backport for Enable IRS in the Project namespace on ptwiki (T382061)
  • 12:36 btullis@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:31 btullis@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:30 btullis@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:29 btullis@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:15 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 12:15 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 12:15 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 12:15 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 12:10 hnowlan@deploy2002: Finished scap sync-world: syncing changes to mediawiki chart vendor dependencies (duration: 09m 30s)
  • 12:07 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Bugfixes - oblivian@cumin1002 - T382062"
  • 12:06 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfixes - oblivian@cumin1002 - T382062
  • 12:06 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfixes - oblivian@cumin1002 - T382062
  • 12:06 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Bugfixes - oblivian@cumin1002 - T382062"
  • 12:03 hnowlan@deploy2002: Started scap sync-world: syncing changes to mediawiki chart vendor dependencies
  • 11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1154.eqiad.wmnet with reason: maintenance
  • 11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1154.eqiad.wmnet with reason: maintenance
  • 11:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:33 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:33 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
  • 11:32 elukey@deploy2002: helmfile [codfw] START helmfile.d/admin 'sync'.
  • 11:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2187.codfw.wmnet with reason: maintenance
  • 11:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2187.codfw.wmnet with reason: maintenance
  • 11:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:19 aqu@deploy2002: Finished deploy [airflow-dags/analytics@0e18d4f]: Backfill webrequest actor label hourly 2024 12 (duration: 02m 52s)
  • 11:16 aqu@deploy2002: Started deploy [airflow-dags/analytics@0e18d4f]: Backfill webrequest actor label hourly 2024 12
  • 11:08 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 11:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 11:07 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:04 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:04 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:04 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 11:03 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:03 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 11:03 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:02 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:02 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 11:01 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:01 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 11:01 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 10:48 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 10:44 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 10:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 10:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 10:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 10:36 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:36 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:34 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:34 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:22 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:22 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:18 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:18 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1027.eqiad.wmnet with reason: maintenance
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1027.eqiad.wmnet with reason: maintenance
  • 08:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1026.eqiad.wmnet with reason: maintenance
  • 08:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1026.eqiad.wmnet with reason: maintenance
  • 08:27 kartik@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription for 7 wikis (T372386) (duration: 20m 05s)
  • 08:23 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:23 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:21 kartik@deploy2002: kartik, abi: Continuing with sync
  • 08:12 kartik@deploy2002: kartik, abi: Backport for Translate: Enable message group subscription for 7 wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1024.eqiad.wmnet with reason: maintenance
  • 08:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1024.eqiad.wmnet with reason: maintenance
  • 08:07 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription for 7 wikis (T372386)
  • 08:02 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:02 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: maintenance
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: maintenance
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 07:56 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 07:56 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 07:52 moritzm: installing upx-ucl security updates

2024-12-11

  • 23:24 tzatziki: removing 7 files for legal compliance
  • 23:02 tzatziki: removing 4 files for legal compliance
  • 22:52 tzatziki: removing three files for legal compliance
  • 21:53 eileen: civicrm upgraded from ddda6d67 to 0d7f2866
  • 21:33 TheresNoTime: done UTC late backport window
  • 21:32 samtar@deploy2002: Finished scap sync-world: Backport for Follow-up I9df39fdcc: Convert missed 'this' to 'el' (T381741) (duration: 10m 01s)
  • 21:26 samtar@deploy2002: novemlinguae, samtar: Continuing with sync
  • 21:25 samtar@deploy2002: novemlinguae, samtar: Backport for Follow-up I9df39fdcc: Convert missed 'this' to 'el' (T381741) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 21:22 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 21:22 samtar@deploy2002: Started scap sync-world: Backport for Follow-up I9df39fdcc: Convert missed 'this' to 'el' (T381741)
  • 21:14 samtar@deploy2002: Finished scap sync-world: Backport for Enable AutoModerator on bnwiki (T381000) (duration: 11m 01s)
  • 21:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 21:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 21:10 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:10 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:08 samtar@deploy2002: kgraessle, samtar: Continuing with sync
  • 21:08 samtar@deploy2002: kgraessle, samtar: Backport for Enable AutoModerator on bnwiki (T381000) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:03 samtar@deploy2002: Started scap sync-world: Backport for Enable AutoModerator on bnwiki (T381000)
  • 21:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:42 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Hardware replacement
  • 20:42 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Hardware replacement
  • 18:18 claime: homer 'lsw1-d6-codfw*' commit 'T379788'
  • 18:17 claime: homer 'lsw1-c1-codfw*' commit 'T379788'
  • 18:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[2180-2183].codfw.wmnet
  • 18:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2180-2183].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 18:15 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2180-2183].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 18:10 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 18:06 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:06 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:04 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:04 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:00 cgoubert@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[2180-2183].codfw.wmnet
  • 17:59 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:59 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:55 claime: homer 'lsw1-a6-codfw' commit 'T379788'
  • 17:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[2047,2066,2085-2086].codfw.wmnet
  • 17:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2047,2066,2085-2086].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 17:53 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2047,2066,2085-2086].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 17:48 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 17:38 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UI improvements, add uncomitted changes warning - oblivian@cumin1002"
  • 17:38 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UI improvements, add uncomitted changes warning - oblivian@cumin1002
  • 17:37 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UI improvements, add uncomitted changes warning - oblivian@cumin1002
  • 17:37 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UI improvements, add uncomitted changes warning - oblivian@cumin1002"
  • 17:32 cgoubert@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[2047,2066,2085-2086].codfw.wmnet
  • 17:31 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
  • 17:31 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 17:19 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 17:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2047,2066,2085-2086,2180-2183].codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2047,2066,2085-2086,2180-2183].codfw.wmnet
  • 16:48 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:47 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:46 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:43 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:42 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:35 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:34 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:32 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 16:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 16:24 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:24 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:23 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:22 otto@deploy2002: Finished scap sync-world: Backport for mediawiki.org/beacon/event/index.php - use EventBus->send (T353817) (duration: 11m 36s)
  • 16:21 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 16:21 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 16:16 otto@deploy2002: otto: Continuing with sync
  • 16:16 otto@deploy2002: otto: Backport for mediawiki.org/beacon/event/index.php - use EventBus->send (T353817) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:10 otto@deploy2002: Started scap sync-world: Backport for mediawiki.org/beacon/event/index.php - use EventBus->send (T353817)
  • 15:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2135.codfw.wmnet with reason: maintenance
  • 15:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2135.codfw.wmnet with reason: maintenance
  • 15:39 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 15:38 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 15:38 klausman@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:38 klausman@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:36 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:36 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:36 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
  • 15:35 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
  • 15:23 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:23 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:22 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:21 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:20 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:19 klausman@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 15:19 klausman@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 15:15 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:13 elukey@deploy2002: Finished deploy [docker-pkg/deploy@9305554]: Update to 4.0.3 (duration: 00m 37s)
  • 15:13 elukey@deploy2002: Started deploy [docker-pkg/deploy@9305554]: Update to 4.0.3
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2184-2187].codfw.wmnet
  • 15:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2184-2187].codfw.wmnet
  • 15:02 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2085.codfw.wmnet with OS bullseye
  • 15:00 jelto: homer 'lsw1-d3-codfw*' commit 'T377877'
  • 14:58 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 14:57 jelto: homer 'lsw1-c3-codfw*' commit 'T377877'
  • 14:57 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 14:56 klausman@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:56 jelto: homer 'lsw1-d5-codfw*' commit 'T377877'
  • 14:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2185.codfw.wmnet with OS bookworm
  • 14:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2187.codfw.wmnet with OS bookworm
  • 14:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2184.codfw.wmnet with OS bookworm
  • 14:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2186.codfw.wmnet with OS bookworm
  • 14:42 TheresNoTime: done UTC afternoon backport window
  • 14:41 samtar@deploy2002: Finished scap sync-world: Backport for Add Atieno's public key (duration: 08m 47s)
  • 14:39 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
  • 14:36 samtar@deploy2002: arlolra, samtar: Continuing with sync
  • 14:36 samtar@deploy2002: arlolra, samtar: Backport for Add Atieno's public key synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2185.codfw.wmnet with reason: host reimage
  • 14:34 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
  • 14:32 samtar@deploy2002: Started scap sync-world: Backport for Add Atieno's public key
  • 14:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2185.codfw.wmnet with reason: host reimage
  • 14:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2187.codfw.wmnet with reason: host reimage
  • 14:30 samtar@deploy2002: Finished scap sync-world: Backport for ve.ui.CodeMirror.v6: Use plugin callback to load the actual module (T374072), styles: Avoid misalignments when line numbering is disabled (T381714) (duration: 10m 42s)
  • 14:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2184.codfw.wmnet with reason: host reimage
  • 14:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2187.codfw.wmnet with reason: host reimage
  • 14:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2184.codfw.wmnet with reason: host reimage
  • 14:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2186.codfw.wmnet with reason: host reimage
  • 14:25 samtar@deploy2002: samtar, func: Continuing with sync
  • 14:23 samtar@deploy2002: samtar, func: Backport for ve.ui.CodeMirror.v6: Use plugin callback to load the actual module (T374072), styles: Avoid misalignments when line numbering is disabled (T381714) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:22 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2186.codfw.wmnet with reason: host reimage
  • 14:19 samtar@deploy2002: Started scap sync-world: Backport for ve.ui.CodeMirror.v6: Use plugin callback to load the actual module (T374072), styles: Avoid misalignments when line numbering is disabled (T381714)
  • 14:19 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2085.codfw.wmnet with OS bullseye
  • 14:18 samtar@deploy2002: Finished scap sync-world: Backport for Remove feature flag which controls wikibase item link location (T377809) (duration: 12m 32s)
  • 14:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2185
  • 14:15 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2185
  • 14:14 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 14:14 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 14:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2185
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2185.codfw.wmnet 89.32.192.10.in-addr.arpa 9.8.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2185.codfw.wmnet 89.32.192.10.in-addr.arpa 9.8.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2185 - jelto@cumin1002"
  • 14:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 14:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2185 - jelto@cumin1002"
  • 14:12 samtar@deploy2002: samtar, joelyrookewmde: Continuing with sync
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2187
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2187
  • 14:11 btullis@cumin1002: END (PASS) - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash (exit_code=0) rolling restart_daemons on A:apifeatureusage
  • 14:11 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2187
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2187.codfw.wmnet 87.48.192.10.in-addr.arpa 7.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:11 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:10 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2187.codfw.wmnet 87.48.192.10.in-addr.arpa 7.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:09 samtar@deploy2002: samtar, joelyrookewmde: Backport for Remove feature flag which controls wikibase item link location (T377809) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:09 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2185
  • 14:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2184
  • 14:08 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2184
  • 14:08 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:08 btullis@cumin1002: START - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash rolling restart_daemons on A:apifeatureusage
  • 14:07 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2184
  • 14:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2184.codfw.wmnet 41.32.192.10.in-addr.arpa 1.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:07 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2184.codfw.wmnet 41.32.192.10.in-addr.arpa 1.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:06 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:06 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:05 samtar@deploy2002: Started scap sync-world: Backport for Remove feature flag which controls wikibase item link location (T377809)
  • 14:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2186
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2186
  • 14:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2186
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2186.codfw.wmnet 180.48.192.10.in-addr.arpa 0.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:04 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2186.codfw.wmnet 180.48.192.10.in-addr.arpa 0.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2186 - jelto@cumin1002"
  • 14:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2186 - jelto@cumin1002"
  • 14:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2187
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2186
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2184
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2187.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2186.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2185.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2184.codfw.wmnet with OS bookworm
  • 13:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2184.codfw.wmnet wikikube-worker2185.codfw.wmnet wikikube-worker2186.codfw.wmnet wikikube-worker2187.codfw.wmnet on all recursors
  • 13:57 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2184.codfw.wmnet wikikube-worker2185.codfw.wmnet wikikube-worker2186.codfw.wmnet wikikube-worker2187.codfw.wmnet on all recursors
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2024 to wikikube-worker2187
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2187
  • 13:53 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2187
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2024 to wikikube-worker2187 - jelto@cumin1002"
  • 13:53 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2024 to wikikube-worker2187 - jelto@cumin1002"
  • 13:42 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:42 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2024 to wikikube-worker2187
  • 13:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2022 to wikikube-worker2186
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2186
  • 13:40 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2186
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2022 to wikikube-worker2186 - jelto@cumin1002"
  • 13:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2022 to wikikube-worker2186 - jelto@cumin1002"
  • 13:34 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:34 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2022 to wikikube-worker2186
  • 13:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2021 to wikikube-worker2185
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2185
  • 13:32 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2185
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2021 to wikikube-worker2185 - jelto@cumin1002"
  • 13:31 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2021 to wikikube-worker2185 - jelto@cumin1002"
  • 13:28 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:27 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2021 to wikikube-worker2185
  • 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2017 to wikikube-worker2184
  • 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2184
  • 13:25 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2184
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2017 to wikikube-worker2184 - jelto@cumin1002"
  • 13:25 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2017 to wikikube-worker2184 - jelto@cumin1002"
  • 13:21 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:21 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2017 to wikikube-worker2184
  • 13:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2017,2021-2022,2024].codfw.wmnet
  • 13:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2017,2021-2022,2024].codfw.wmnet
  • 13:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:04 kart_: Updated cxserver to 2024-12-10-132417-production (T369815)
  • 13:04 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 13:01 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 13:00 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 13:00 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:59 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:57 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 12:54 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:54 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 12:54 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:47 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 12:15 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:14 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:13 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:12 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:12 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:11 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:04 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2180-2183].codfw.wmnet
  • 11:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2180-2183].codfw.wmnet
  • 11:56 jelto: homer 'lsw1-c1-codfw*' commit 'T377877'
  • 11:54 jelto: homer 'lsw1-d6-codfw*' commit 'T377877'
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 11:37 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 11:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2181.codfw.wmnet with reason: host reimage
  • 11:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2181.codfw.wmnet with reason: host reimage
  • 11:25 mszabo@deploy2002: Finished scap sync-world: Backport for Prep pilot wiki config for IRS (T374105) (duration: 11m 04s)
  • 11:20 mszabo@deploy2002: mszabo: Continuing with sync
  • 11:17 mszabo@deploy2002: mszabo: Backport for Prep pilot wiki config for IRS (T374105) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 11:14 mszabo@deploy2002: Started scap sync-world: Backport for Prep pilot wiki config for IRS (T374105)
  • 11:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2181
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2181
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 11:11 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 11:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Revert^2 "Stats: Move StatsFactory flush into emitBufferedStats" (duration: 14m 22s)
  • 11:03 dreamyjazz@deploy2002: dreamyjazz, cwhite: Continuing with sync
  • 10:59 dreamyjazz@deploy2002: dreamyjazz, cwhite: Backport for Revert^2 "Stats: Move StatsFactory flush into emitBufferedStats" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:58 fabfur: merging https://gerrit.wikimedia.org/r/c/operations/dns/+/1100084 to direct Argentina, Chile, Uruguay to magru (T359054)
  • 10:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2182.codfw.wmnet with reason: host reimage
  • 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for Revert^2 "Stats: Move StatsFactory flush into emitBufferedStats"
  • 10:51 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2182.codfw.wmnet with reason: host reimage
  • 10:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2183.codfw.wmnet with OS bookworm
  • 10:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2182
  • 10:33 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2182
  • 10:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 10:32 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 10:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2180.codfw.wmnet with OS bookworm
  • 10:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2183.codfw.wmnet with reason: host reimage
  • 10:14 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2183.codfw.wmnet with reason: host reimage
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71699 and previous config saved to /var/cache/conftool/dbconfig/20241211-101051-root.json
  • 10:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2180.codfw.wmnet with reason: host reimage
  • 10:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2180.codfw.wmnet with reason: host reimage
  • 09:59 aqu@deploy2002: Finished deploy [airflow-dags/analytics@416a3c0]: Backfill webrequest actor metrics rollup hourly 2024 12 (duration: 01m 02s)
  • 09:58 aqu@deploy2002: Started deploy [airflow-dags/analytics@416a3c0]: Backfill webrequest actor metrics rollup hourly 2024 12
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2183
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2183
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71698 and previous config saved to /var/cache/conftool/dbconfig/20241211-095546-root.json
  • 09:55 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2183
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2183.codfw.wmnet 29.48.192.10.in-addr.arpa 9.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:55 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2183.codfw.wmnet 29.48.192.10.in-addr.arpa 9.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2183 - jelto@cumin1002"
  • 09:55 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2183 - jelto@cumin1002"
  • 09:51 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2181
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2181
  • 09:51 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2181
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2181.codfw.wmnet 110.32.192.10.in-addr.arpa 0.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:51 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2181.codfw.wmnet 110.32.192.10.in-addr.arpa 0.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2181 - jelto@cumin1002"
  • 09:51 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2181 - jelto@cumin1002"
  • 09:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2183
  • 09:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2182
  • 09:47 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2182
  • 09:47 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:46 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2182
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2182.codfw.wmnet 28.48.192.10.in-addr.arpa 8.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:46 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2182.codfw.wmnet 28.48.192.10.in-addr.arpa 8.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:44 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2181
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2180
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2180
  • 09:44 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:44 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2180
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2180.codfw.wmnet 109.32.192.10.in-addr.arpa 9.0.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:44 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2180.codfw.wmnet 109.32.192.10.in-addr.arpa 9.0.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2180 - jelto@cumin1002"
  • 09:44 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2180 - jelto@cumin1002"
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71697 and previous config saved to /var/cache/conftool/dbconfig/20241211-094040-root.json
  • 09:40 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2183.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2182
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2180
  • 09:39 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2180.codfw.wmnet with OS bookworm
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2180.codfw.wmnet wikikube-worker2181.codfw.wmnet wikikube-worker2182.codfw.wmnet wikikube-worker2183.codfw.wmnet on all recursors
  • 09:37 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2180.codfw.wmnet wikikube-worker2181.codfw.wmnet wikikube-worker2182.codfw.wmnet wikikube-worker2183.codfw.wmnet on all recursors
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2014 to wikikube-worker2183
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2183
  • 09:36 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2183
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2014 to wikikube-worker2183 - jelto@cumin1002"
  • 09:36 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2014 to wikikube-worker2183 - jelto@cumin1002"
  • 09:32 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:32 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 09:32 elukey@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 09:32 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2014 to wikikube-worker2183
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2013 to wikikube-worker2182
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2182
  • 09:31 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2182
  • 09:30 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:30 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2013 to wikikube-worker2182 - jelto@cumin1002"
  • 09:30 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2013 to wikikube-worker2182 - jelto@cumin1002"
  • 09:26 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:26 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2013 to wikikube-worker2182
  • 09:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2012 to wikikube-worker2181
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71696 and previous config saved to /var/cache/conftool/dbconfig/20241211-092535-root.json
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2181
  • 09:25 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2181
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2012 to wikikube-worker2181 - jelto@cumin1002"
  • 09:24 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2012 to wikikube-worker2181 - jelto@cumin1002"
  • 09:20 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:20 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2012 to wikikube-worker2181
  • 09:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2011 to wikikube-worker2180
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2180
  • 09:19 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2180
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2011 to wikikube-worker2180 - jelto@cumin1002"
  • 09:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2011 to wikikube-worker2180 - jelto@cumin1002"
  • 09:14 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:14 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2011 to wikikube-worker2180
  • 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71695 and previous config saved to /var/cache/conftool/dbconfig/20241211-091029-root.json
  • 09:08 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2011-2014].codfw.wmnet
  • 09:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2011-2014].codfw.wmnet
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2011-2014].codfw.wmnet
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2136 to upgrade MariaDB 10.11 T378940', diff saved to https://phabricator.wikimedia.org/P71694 and previous config saved to /var/cache/conftool/dbconfig/20241211-090538-marostegui.json
  • 09:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2136.codfw.wmnet with reason: maintenance
  • 09:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2136.codfw.wmnet with reason: maintenance
  • 09:04 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2011-2014].codfw.wmnet
  • 02:30 eileen: civicrm upgraded from 3ef855ca to ddda6d67
  • 01:36 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2027.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 00:50 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2027.codfw.wmnet w/ force delete existing files, repooling both afterwards

2024-12-10

  • 23:35 eileen: config revision changed from b3741848 to ca701cba add phone update job
  • 22:54 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 22:54 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 22:49 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1088.eqiad.wmnet with OS bookworm
  • 22:36 cjming: end of UTC late backport window
  • 22:22 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:19 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:15 cjming@deploy2002: Finished scap sync-world: Backport for Disable stats collection when WMF_MAINTENANCE_OFFLINE is set (T380609) (duration: 11m 24s)
  • 22:10 cjming@deploy2002: cwhite, cjming: Continuing with sync
  • 22:08 cjming@deploy2002: cwhite, cjming: Backport for Disable stats collection when WMF_MAINTENANCE_OFFLINE is set (T380609) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:04 cjming@deploy2002: Started scap sync-world: Backport for Disable stats collection when WMF_MAINTENANCE_OFFLINE is set (T380609)
  • 21:59 cjming@deploy2002: Finished scap sync-world: Backport for Beta Cluster: Enable MetricsPlatform extension on all wikis (T381849 T381853) (duration: 10m 50s)
  • 21:56 jhathaway@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS bookworm
  • 21:53 cjming@deploy2002: cjming, phuedx: Continuing with sync
  • 21:52 cjming@deploy2002: cjming, phuedx: Backport for Beta Cluster: Enable MetricsPlatform extension on all wikis (T381849 T381853) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:48 cjming@deploy2002: Started scap sync-world: Backport for Beta Cluster: Enable MetricsPlatform extension on all wikis (T381849 T381853)
  • 21:47 eileen: civicrm upgraded from f9c89e50 to 3ef855ca
  • 21:47 cjming@deploy2002: Finished scap sync-world: Backport for Reader Survey: Increase coverage (T378660) (duration: 10m 02s)
  • 21:41 cjming@deploy2002: cjming, dani: Continuing with sync
  • 21:41 cjming@deploy2002: cjming, dani: Backport for Reader Survey: Increase coverage (T378660) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:37 cjming@deploy2002: Started scap sync-world: Backport for Reader Survey: Increase coverage (T378660)
  • 21:35 cjming@deploy2002: Finished scap sync-world: Backport for LanguageConverter: Ignore content inside <math> and <svg> elements (T381617) (duration: 11m 55s)
  • 21:30 cjming@deploy2002: bvibber, cjming: Continuing with sync
  • 21:27 cjming@deploy2002: bvibber, cjming: Backport for LanguageConverter: Ignore content inside <math> and <svg> elements (T381617) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:23 cjming@deploy2002: Started scap sync-world: Backport for LanguageConverter: Ignore content inside <math> and <svg> elements (T381617)
  • 21:22 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 20:55 mforns@deploy2002: Finished deploy [airflow-dags/analytics@2af4e1a]: Fix for the Commons Impact Metrics job (duration: 01m 38s)
  • 20:54 mforns@deploy2002: Started deploy [airflow-dags/analytics@2af4e1a]: Fix for the Commons Impact Metrics job
  • 20:47 mforns@deploy2002: Finished deploy [analytics/refinery@25c1946] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@25c1946c] (duration: 00m 27s)
  • 20:46 mforns@deploy2002: Started deploy [analytics/refinery@25c1946] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@25c1946c]
  • 20:46 mforns@deploy2002: Finished deploy [analytics/refinery@25c1946] (thin): Regular analytics weekly train THIN [analytics/refinery@25c1946c] (duration: 00m 31s)
  • 20:45 mforns@deploy2002: Started deploy [analytics/refinery@25c1946] (thin): Regular analytics weekly train THIN [analytics/refinery@25c1946c]
  • 20:45 mforns@deploy2002: Finished deploy [analytics/refinery@25c1946]: Regular analytics weekly train [analytics/refinery@25c1946c] (duration: 13m 12s)
  • 20:38 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 20:38 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet, repooling source-only afterwards
  • 20:37 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet, repooling source-only afterwards
  • 20:32 mforns@deploy2002: Started deploy [analytics/refinery@25c1946]: Regular analytics weekly train [analytics/refinery@25c1946c]
  • 20:28 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 20:28 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 20:04 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 20:04 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 18:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71693 and previous config saved to /var/cache/conftool/dbconfig/20241210-183545-root.json
  • 18:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71692 and previous config saved to /var/cache/conftool/dbconfig/20241210-182040-root.json
  • 18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71691 and previous config saved to /var/cache/conftool/dbconfig/20241210-180534-root.json
  • 18:02 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 18:02 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 18:02 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 18:01 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 18:01 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 18:00 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:00 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 17:55 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:54 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71690 and previous config saved to /var/cache/conftool/dbconfig/20241210-175029-root.json
  • 17:47 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:47 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:42 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:42 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:42 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71688 and previous config saved to /var/cache/conftool/dbconfig/20241210-173524-root.json
  • 17:30 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-lab1002.eqiad.wmnet with reason: host reimage
  • 17:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2158.codfw.wmnet with reason: maintenance
  • 17:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2158.codfw.wmnet with reason: maintenance
  • 17:25 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-lab1002.eqiad.wmnet with reason: host reimage
  • 17:24 herron@cumin1002: dbctl commit (dc=all): 'depooling db2158 T381901', diff saved to https://phabricator.wikimedia.org/P71687 and previous config saved to /var/cache/conftool/dbconfig/20241210-172424-herron.json
  • 17:18 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:18 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:13 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 17:13 swfrench-wmf: deployed shellbox 2024-12-07-073046 for T381830
  • 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 17:12 klausman@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 17:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 17:08 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
  • 17:07 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
  • 17:06 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
  • 17:05 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
  • 17:05 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
  • 17:04 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
  • 17:03 ottomata: restarting eventgate-analytics to pick up stream config changes for T381322
  • 17:01 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 17:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 16:59 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:58 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:58 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:58 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 16:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 16:57 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:56 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 16:56 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 16:51 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 16:51 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 16:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:50 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 16:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:48 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:48 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 16:47 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 16:38 denisse@deploy2002: Finished deploy [librenms/librenms@f049593]: Upgrade LibreNMS to 24.10.0 - T381785 (duration: 00m 13s)
  • 16:38 denisse@deploy2002: Started deploy [librenms/librenms@f049593]: Upgrade LibreNMS to 24.10.0 - T381785
  • 16:26 klausman@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 16:25 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:20 klausman@cumin1002: START - Cookbook sre.hosts.provision for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:15 klausman@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-lab1002
  • 16:15 klausman@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ml-lab1002
  • 16:13 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:13 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 16:13 moritzm: installing postgresql-15 security updates
  • 16:12 klausman@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:12 klausman@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update DNS for newly-provisioned ml-lab1002 - klausman@cumin1002"
  • 16:12 klausman@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update DNS for newly-provisioned ml-lab1002 - klausman@cumin1002"
  • 16:09 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
  • 16:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on phab1004.eqiad.wmnet with reason: nftables
  • 16:09 elukey@deploy2002: helmfile [codfw] START helmfile.d/admin 'sync'.
  • 16:09 mutante: phabricator production host needs a maintenance reboot - expect short downtime
  • 16:09 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on phab1004.eqiad.wmnet with reason: nftables
  • 16:09 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 16:08 elukey@deploy2002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 16:08 klausman@cumin1002: START - Cookbook sre.dns.netbox
  • 16:07 moritzm: manually clean out ganeti1009 from puppetdb, decom cookbook got interrupted T381652
  • 16:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: ganeti1009.eqiad.wmnet
  • 16:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: ganeti1009.eqiad.wmnet
  • 16:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71685 and previous config saved to /var/cache/conftool/dbconfig/20241210-160322-root.json
  • 15:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:52 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:52 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71684 and previous config saved to /var/cache/conftool/dbconfig/20241210-154816-root.json
  • 15:48 moritzm: installing usb.ids updates from Bullseye point release
  • 15:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:35 moritzm: installing imagemagick security updates
  • 15:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71683 and previous config saved to /var/cache/conftool/dbconfig/20241210-153311-root.json
  • 15:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71682 and previous config saved to /var/cache/conftool/dbconfig/20241210-151805-root.json
  • 15:15 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:06 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:05 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Event Logging: Update streamName and schemaId (T364460) (duration: 25m 40s)
  • 15:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71681 and previous config saved to /var/cache/conftool/dbconfig/20241210-150300-root.json
  • 15:00 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Continuing with sync
  • 14:47 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ml-lab1002.eqiad.wmnet
  • 14:47 klausman@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:47 klausman@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ml-lab1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - klausman@cumin1002"
  • 14:46 klausman@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ml-lab1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - klausman@cumin1002"
  • 14:44 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Backport for Event Logging: Update streamName and schemaId (T364460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:40 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Event Logging: Update streamName and schemaId (T364460)
  • 14:39 samtar@deploy2002: Finished scap sync-world: Backport for Revert "IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki" (duration: 10m 15s)
  • 14:32 samtar@deploy2002: samtar: Continuing with sync
  • 14:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1080,1082-1083].eqiad.wmnet
  • 14:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1080,1082-1083].eqiad.wmnet
  • 14:32 samtar@deploy2002: samtar: Backport for Revert "IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:31 klausman@cumin1002: START - Cookbook sre.dns.netbox
  • 14:29 TheresNoTime: revert 1101545 for T377121
  • 14:29 samtar@deploy2002: Started scap sync-world: Backport for Revert "IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki"
  • 14:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T381532)', diff saved to https://phabricator.wikimedia.org/P71678 and previous config saved to /var/cache/conftool/dbconfig/20241210-141820-marostegui.json
  • 14:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:17 jelto: homer 'lsw1-f3-eqiad*' commit 'T377876', homer 'cr*eqiad*' commit 'T377876'
  • 14:15 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 13:01 klausman@cumin1002: START - Cookbook sre.hosts.decommission for hosts ml-lab1002.eqiad.wmnet
  • 13:00 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on ml-lab1002.eqiad.wmnet with reason: Moving to analytics network
  • 13:00 klausman@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on ml-lab1002.eqiad.wmnet with reason: Moving to analytics network
  • 12:55 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 12:53 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 12:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1083.eqiad.wmnet with OS bookworm
  • 12:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1080.eqiad.wmnet with OS bookworm
  • 12:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1082.eqiad.wmnet with OS bookworm
  • 12:07 samtar@deploy2002: Finished scap sync-world: Backport for IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki (T377121) (duration: 14m 06s)
  • 12:02 samtar@deploy2002: samtar: Continuing with sync
  • 12:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1083.eqiad.wmnet with reason: host reimage
  • 11:58 samtar@deploy2002: samtar: Backport for IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki (T377121) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1080.eqiad.wmnet with reason: host reimage
  • 11:53 samtar@deploy2002: Started scap sync-world: Backport for IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki (T377121)
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1082.eqiad.wmnet with reason: host reimage
  • 11:51 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1080.eqiad.wmnet with reason: host reimage
  • 11:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1083.eqiad.wmnet with reason: host reimage
  • 11:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1082.eqiad.wmnet with reason: host reimage
  • 11:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1083.eqiad.wmnet with OS bookworm
  • 11:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1082.eqiad.wmnet with OS bookworm
  • 11:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1080.eqiad.wmnet with OS bookworm
  • 11:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1080.eqiad.wmnet wikikube-worker1081.eqiad.wmnet wikikube-worker1082.eqiad.wmnet wikikube-worker1083.eqiad.wmnet on all recursors
  • 11:29 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1080.eqiad.wmnet wikikube-worker1081.eqiad.wmnet wikikube-worker1082.eqiad.wmnet wikikube-worker1083.eqiad.wmnet on all recursors
  • 11:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1058 to wikikube-worker1083
  • 11:28 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1083
  • 11:27 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1083
  • 11:27 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:27 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1058 to wikikube-worker1083 - jelto@cumin1002"
  • 11:27 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1058 to wikikube-worker1083 - jelto@cumin1002"
  • 11:24 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:23 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1058 to wikikube-worker1083
  • 11:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1057 to wikikube-worker1082
  • 11:21 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1082
  • 11:20 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1082
  • 11:20 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:20 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1057 to wikikube-worker1082 - jelto@cumin1002"
  • 11:19 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1057 to wikikube-worker1082 - jelto@cumin1002"
  • 11:16 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:15 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1057 to wikikube-worker1082
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1056 to wikikube-worker1081
  • 11:14 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1081
  • 11:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1081
  • 11:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:14 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1056 to wikikube-worker1081 - jelto@cumin1002"
  • 11:14 claime: Done deploying no-op cfssl-issuer admin_ng change - 1101455
  • 11:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1056 to wikikube-worker1081 - jelto@cumin1002"
  • 11:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:11 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:11 cgoubert@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 11:11 cgoubert@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 11:10 cgoubert@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:10 cgoubert@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 11:10 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:10 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1056 to wikikube-worker1081
  • 11:09 cgoubert@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:09 cgoubert@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 11:09 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1055 to wikikube-worker1080
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1080
  • 11:08 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:08 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1080
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1055 to wikikube-worker1080 - jelto@cumin1002"
  • 11:08 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:07 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1055 to wikikube-worker1080 - jelto@cumin1002"
  • 11:06 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:05 claime: Deploying no-op cfssl-issuer admin_ng change - 1101455
  • 11:02 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:01 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1055 to wikikube-worker1080
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1055-1058].eqiad.wmnet
  • 10:53 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1055-1058].eqiad.wmnet
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71675 and previous config saved to /var/cache/conftool/dbconfig/20241210-102038-root.json
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71674 and previous config saved to /var/cache/conftool/dbconfig/20241210-101815-root.json
  • 10:17 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12 (duration: 20m 51s)
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71673 and previous config saved to /var/cache/conftool/dbconfig/20241210-100532-root.json
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71672 and previous config saved to /var/cache/conftool/dbconfig/20241210-100310-root.json
  • 10:00 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1076-1079].eqiad.wmnet
  • 10:00 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1076-1079].eqiad.wmnet
  • 09:57 jelto: homer 'lsw1-f3-eqiad*' commit 'T377876', homer 'lsw1-e3-eqiad*' commit 'T377876'
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1078.eqiad.wmnet with OS bookworm
  • 09:56 aqu@deploy2002: Started deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12
  • 09:56 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12 (duration: 07m 37s)
  • 09:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1079.eqiad.wmnet with OS bookworm
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71670 and previous config saved to /var/cache/conftool/dbconfig/20241210-095027-root.json
  • 09:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1077.eqiad.wmnet with OS bookworm
  • 09:48 aqu@deploy2002: Started deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12
  • 09:48 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12 (duration: 07m 22s)
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71669 and previous config saved to /var/cache/conftool/dbconfig/20241210-094805-root.json
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1076.eqiad.wmnet with OS bookworm
  • 09:45 joal@deploy2002: Finished deploy [analytics/refinery@0ffc330] (hadoop-test): Analytics backfill train - TEST [analytics/refinery@0ffc3306] (duration: 00m 26s)
  • 09:44 joal@deploy2002: Started deploy [analytics/refinery@0ffc330] (hadoop-test): Analytics backfill train - TEST [analytics/refinery@0ffc3306]
  • 09:44 joal@deploy2002: Finished deploy [analytics/refinery@0ffc330] (thin): Analytics backfill train - THIN [analytics/refinery@0ffc3306] (duration: 00m 31s)
  • 09:44 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:44 joal@deploy2002: Started deploy [analytics/refinery@0ffc330] (thin): Analytics backfill train - THIN [analytics/refinery@0ffc3306]
  • 09:43 joal@deploy2002: Finished deploy [analytics/refinery@0ffc330]: Analytics backfill train [analytics/refinery@0ffc3306] (duration: 02m 04s)
  • 09:43 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:41 joal@deploy2002: Started deploy [analytics/refinery@0ffc330]: Analytics backfill train [analytics/refinery@0ffc3306]
  • 09:41 aqu@deploy2002: Started deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1078.eqiad.wmnet with reason: host reimage
  • 09:36 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71668 and previous config saved to /var/cache/conftool/dbconfig/20241210-093522-root.json
  • 09:34 moritzm: rebalance Ganeti cluster in codfw/c following server refresh T376594
  • 09:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1079.eqiad.wmnet with reason: host reimage
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71667 and previous config saved to /var/cache/conftool/dbconfig/20241210-093259-root.json
  • 09:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 09:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1077.eqiad.wmnet with reason: host reimage
  • 09:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1076.eqiad.wmnet with reason: host reimage
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1079.eqiad.wmnet with reason: host reimage
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1078.eqiad.wmnet with reason: host reimage
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1077.eqiad.wmnet with reason: host reimage
  • 09:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1076.eqiad.wmnet with reason: host reimage
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 100%: 5', diff saved to https://phabricator.wikimedia.org/P71666 and previous config saved to /var/cache/conftool/dbconfig/20241210-092243-root.json
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71665 and previous config saved to /var/cache/conftool/dbconfig/20241210-092016-root.json
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71664 and previous config saved to /var/cache/conftool/dbconfig/20241210-091754-root.json
  • 09:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 75%: 5', diff saved to https://phabricator.wikimedia.org/P71663 and previous config saved to /var/cache/conftool/dbconfig/20241210-090738-root.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71662 and previous config saved to /var/cache/conftool/dbconfig/20241210-090732-root.json
  • 09:07 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:06 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1079.eqiad.wmnet with OS bookworm
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1078.eqiad.wmnet with OS bookworm
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 5%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71661 and previous config saved to /var/cache/conftool/dbconfig/20241210-090511-root.json
  • 09:04 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1077.eqiad.wmnet with OS bookworm
  • 09:04 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1076.eqiad.wmnet with OS bookworm
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 5%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71660 and previous config saved to /var/cache/conftool/dbconfig/20241210-090248-root.json
  • 09:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1076.eqiad.wmnet wikikube-worker1077.eqiad.wmnet wikikube-worker1078.eqiad.wmnet wikikube-worker1079.eqiad.wmnet on all recursors
  • 09:01 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1076.eqiad.wmnet wikikube-worker1077.eqiad.wmnet wikikube-worker1078.eqiad.wmnet wikikube-worker1079.eqiad.wmnet on all recursors
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1054 to wikikube-worker1079
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1079
  • 08:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1079
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1054 to wikikube-worker1079 - jelto@cumin1002"
  • 08:59 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1054 to wikikube-worker1079 - jelto@cumin1002"
  • 08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet with reason: Alter table
  • 08:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet with reason: Alter table
  • 08:55 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:55 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1054 to wikikube-worker1079
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1053 to wikikube-worker1078
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1078
  • 08:54 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1078
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1053 to wikikube-worker1078 - jelto@cumin1002"
  • 08:53 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1053 to wikikube-worker1078 - jelto@cumin1002"
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 50%: 5', diff saved to https://phabricator.wikimedia.org/P71659 and previous config saved to /var/cache/conftool/dbconfig/20241210-085232-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71658 and previous config saved to /var/cache/conftool/dbconfig/20241210-085227-root.json
  • 08:50 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 1%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71657 and previous config saved to /var/cache/conftool/dbconfig/20241210-085006-root.json
  • 08:50 elukey: manual run of docker-system-prune-all on build2001 to free some space
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1053 to wikikube-worker1078
  • 08:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1052 to wikikube-worker1077
  • 08:49 marostegui@cumin1002: dbctl commit (dc=all): 'Change es2024 weight', diff saved to https://phabricator.wikimedia.org/P71656 and previous config saved to /var/cache/conftool/dbconfig/20241210-084932-marostegui.json
  • 08:49 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1077
  • 08:48 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1077
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1052 to wikikube-worker1077 - jelto@cumin1002"
  • 08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2045 to dbctl T381259', diff saved to https://phabricator.wikimedia.org/P71655 and previous config saved to /var/cache/conftool/dbconfig/20241210-084844-marostegui.json
  • 08:48 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1052 to wikikube-worker1077 - jelto@cumin1002"
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 1%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71654 and previous config saved to /var/cache/conftool/dbconfig/20241210-084743-root.json
  • 08:44 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:44 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1052 to wikikube-worker1077
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1051 to wikikube-worker1076
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1076
  • 08:41 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1076
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1051 to wikikube-worker1076 - jelto@cumin1002"
  • 08:41 gmodena: UTC morning backport deploys done
  • 08:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1051 to wikikube-worker1076 - jelto@cumin1002"
  • 08:39 gmodena@deploy2002: Finished scap sync-world: Backport for EventStreamConfig: add content_history streams. (T381322) (duration: 17m 16s)
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 25%: 5', diff saved to https://phabricator.wikimedia.org/P71653 and previous config saved to /var/cache/conftool/dbconfig/20241210-083726-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71652 and previous config saved to /var/cache/conftool/dbconfig/20241210-083721-root.json
  • 08:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:36 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1051 to wikikube-worker1076
  • 08:34 gmodena@deploy2002: gmodena: Continuing with sync
  • 08:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1051-1054].eqiad.wmnet
  • 08:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1051-1054].eqiad.wmnet
  • 08:26 gmodena@deploy2002: gmodena: Backport for EventStreamConfig: add content_history streams. (T381322) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:22 gmodena@deploy2002: Started scap sync-world: Backport for EventStreamConfig: add content_history streams. (T381322)
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 10%: 5', diff saved to https://phabricator.wikimedia.org/P71650 and previous config saved to /var/cache/conftool/dbconfig/20241210-082221-root.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71649 and previous config saved to /var/cache/conftool/dbconfig/20241210-082216-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Add db1159 to dbctl depooled T381550', diff saved to https://phabricator.wikimedia.org/P71648 and previous config saved to /var/cache/conftool/dbconfig/20241210-082020-marostegui.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71647 and previous config saved to /var/cache/conftool/dbconfig/20241210-080710-root.json
  • 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.4 (duration: 01m 25s)

2024-12-09

  • 22:29 ryankemper: [wdqs-internal graph split] Cleared away old categories units on 5 hosts (`wdqs20[18-20],wdqs202[6-7]`)
  • 22:28 cjming: end of UTC late backport window
  • 22:23 cjming@deploy2002: Finished scap sync-world: Backport for idwikivoyage: add timezone, sitename and project namespace (T381080) (duration: 10m 46s)
  • 22:17 cjming@deploy2002: cjming, anzx: Continuing with sync
  • 22:16 cjming@deploy2002: cjming, anzx: Backport for idwikivoyage: add timezone, sitename and project namespace (T381080) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:12 cjming@deploy2002: Started scap sync-world: Backport for idwikivoyage: add timezone, sitename and project namespace (T381080)
  • 22:10 cjming@deploy2002: Finished scap sync-world: Backport for jawiki: lift IP cap on 2024-12-17 and 2025-01-14 for Edit-a-ton (T381729) (duration: 10m 02s)
  • 22:05 cjming@deploy2002: cjming, anzx: Continuing with sync
  • 22:05 cjming@deploy2002: cjming, anzx: Backport for jawiki: lift IP cap on 2024-12-17 and 2025-01-14 for Edit-a-ton (T381729) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:00 cjming@deploy2002: Started scap sync-world: Backport for jawiki: lift IP cap on 2024-12-17 and 2025-01-14 for Edit-a-ton (T381729)
  • 21:57 cjming@deploy2002: Finished scap sync-world: Backport for Disable QuickSurveys for recommendations (T379241 T380778) (duration: 10m 15s)
  • 21:52 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:51 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:51 cjming@deploy2002: cjming, jdlrobson: Continuing with sync
  • 21:51 cjming@deploy2002: cjming, jdlrobson: Backport for Disable QuickSurveys for recommendations (T379241 T380778) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:47 cjming@deploy2002: Started scap sync-world: Backport for Disable QuickSurveys for recommendations (T379241 T380778)
  • 21:46 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:46 cjming@deploy2002: Finished scap sync-world: Backport for Expand support for dark mode for anonymous users (itwiki, enwikivoyage) (T379352) (duration: 11m 08s)
  • 21:44 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:40 cjming@deploy2002: jdlrobson, cjming: Continuing with sync
  • 21:39 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:39 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:38 cjming@deploy2002: jdlrobson, cjming: Backport for Expand support for dark mode for anonymous users (itwiki, enwikivoyage) (T379352) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 cjming@deploy2002: Started scap sync-world: Backport for Expand support for dark mode for anonymous users (itwiki, enwikivoyage) (T379352)
  • 21:34 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:33 cjming@deploy2002: Finished scap sync-world: Backport for cirrus: Enable mlr-2024 for select wikis (T377128) (duration: 10m 28s)
  • 21:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:29 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:28 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:27 cjming@deploy2002: cjming, ebernhardson: Continuing with sync
  • 21:27 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:27 cjming@deploy2002: cjming, ebernhardson: Backport for cirrus: Enable mlr-2024 for select wikis (T377128) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 cjming@deploy2002: Started scap sync-world: Backport for cirrus: Enable mlr-2024 for select wikis (T377128)
  • 21:21 cjming@deploy2002: Finished scap sync-world: Backport for Actually load IRS in production (T374105) (duration: 12m 29s)
  • 21:14 cjming@deploy2002: cjming, mszabo: Continuing with sync
  • 21:13 cjming@deploy2002: cjming, mszabo: Backport for Actually load IRS in production (T374105) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 cjming@deploy2002: Started scap sync-world: Backport for Actually load IRS in production (T374105)
  • 20:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics@1d9b4b5]: Canary events generation: pooling (duration: 01m 46s)
  • 20:23 aqu@deploy2002: Started deploy [airflow-dags/analytics@1d9b4b5]: Canary events generation: pooling
  • 20:07 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching aqs1010.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:58 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching aqs1010.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 18:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 17:52 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:51 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:47 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 17:44 cdanis: 💙cdanis@cumin1002.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:cp' 'enable-puppet "cdanis testing in production I464702d8fb T381771"'
  • 17:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 17:36 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:35 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:22 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1072,1074-1075].eqiad.wmnet
  • 17:22 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1072,1074-1075].eqiad.wmnet
  • 17:20 jelto: homer 'lsw1-e3-eqiad*' commit 'T377876'
  • 17:18 cdanis: T381771 💙cdanis@cp1107.eqiad.wmnet ~ 🕧☕ sudo run-puppet-agent --force
  • 17:16 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 17:15 cdanis: 💙cdanis@cumin1002.eqiad.wmnet ~ 🕛☕ sudo cumin 'A:cp' 'disable-puppet "cdanis testing in production I464702d8fb T381771"'
  • 17:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 16:59 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 16:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 16:47 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 21m 09s)
  • 16:26 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 16:12 moritzm: rebalance Ganeti cluster in codfw/B following server refresh T376594
  • 16:06 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2089-2090].codfw.wmnet
  • 16:06 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2089-2090].codfw.wmnet
  • 16:05 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 23m 00s)
  • 15:56 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 15:55 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 15:45 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2090.codfw.wmnet with OS bookworm
  • 15:44 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 15:43 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2089.codfw.wmnet with OS bookworm
  • 15:34 Emperor: depool/restart swift/repool ms-fe1012
  • 15:34 mszabo@deploy2002: Finished scap sync-world: Backport for dialog: Fix wrong title on Types of unacceptable behavior step (T381529), dialog: Fix spacing between buttons in the dialog footer (T381530), Prep IRS config for testwiki (duration: 13m 39s)
  • 15:33 Emperor: depool/restart swift/repool ms-fe1010
  • 15:28 mszabo@deploy2002: mszabo: Continuing with sync
  • 15:25 mszabo@deploy2002: mszabo: Backport for dialog: Fix wrong title on Types of unacceptable behavior step (T381529), dialog: Fix spacing between buttons in the dialog footer (T381530), Prep IRS config for testwiki synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2090.codfw.wmnet with reason: host reimage
  • 15:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
  • 15:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1075.eqiad.wmnet with OS bookworm
  • 15:20 mszabo@deploy2002: Started scap sync-world: Backport for dialog: Fix wrong title on Types of unacceptable behavior step (T381529), dialog: Fix spacing between buttons in the dialog footer (T381530), Prep IRS config for testwiki
  • 15:20 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2090.codfw.wmnet with reason: host reimage
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1074.eqiad.wmnet with OS bookworm
  • 15:18 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
  • 15:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1072.eqiad.wmnet with OS bookworm
  • 15:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1075.eqiad.wmnet with reason: host reimage
  • 15:01 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1074.eqiad.wmnet with reason: host reimage
  • 15:00 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2090
  • 15:00 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2090
  • 15:00 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2090.codfw.wmnet with OS bookworm
  • 15:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:59 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2090.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:59 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for idwikivoyage: add logo, wordmark (T381080) (duration: 11m 44s)
  • 14:59 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2089
  • 14:59 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2089
  • 14:58 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2089.codfw.wmnet with OS bookworm
  • 14:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2089.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1072.eqiad.wmnet with reason: host reimage
  • 14:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1075.eqiad.wmnet with reason: host reimage
  • 14:53 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1074.eqiad.wmnet with reason: host reimage
  • 14:53 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
  • 14:53 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1072.eqiad.wmnet with reason: host reimage
  • 14:51 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for idwikivoyage: add logo, wordmark (T381080) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:47 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for idwikivoyage: add logo, wordmark (T381080)
  • 14:46 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:44 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2089.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:44 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2090.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:39 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386) (duration: 14m 34s)
  • 14:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1075.eqiad.wmnet with OS bookworm
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1074.eqiad.wmnet with OS bookworm
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1072.eqiad.wmnet with OS bookworm
  • 14:34 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wikikube-worker[2089-2090].codfw.wmnet with reason: reimage
  • 14:34 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wikikube-worker[2089-2090].codfw.wmnet with reason: reimage
  • 14:34 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2089-2090].codfw.wmnet
  • 14:33 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Continuing with sync
  • 14:33 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2089-2090].codfw.wmnet
  • 14:29 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Backport for Translate: Enable message group subscription for 6 wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1072.eqiad.wmnet wikikube-worker1073.eqiad.wmnet wikikube-worker1074.eqiad.wmnet wikikube-worker1075.eqiad.wmnet on all recursors
  • 14:25 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1072.eqiad.wmnet wikikube-worker1073.eqiad.wmnet wikikube-worker1074.eqiad.wmnet wikikube-worker1075.eqiad.wmnet on all recursors
  • 14:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1050 to wikikube-worker1075
  • 14:24 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1075
  • 14:24 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386)
  • 14:24 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1075
  • 14:24 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:24 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1050 to wikikube-worker1075 - jelto@cumin1002"
  • 14:23 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add Metrics Platform stream configuration for translate_extension (T364460) (duration: 17m 12s)
  • 14:23 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1050 to wikikube-worker1075 - jelto@cumin1002"
  • 14:19 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:18 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1050 to wikikube-worker1075
  • 14:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1049 to wikikube-worker1074
  • 14:17 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1074
  • 14:16 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Continuing with sync
  • 14:16 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1074
  • 14:16 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:16 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1049 to wikikube-worker1074 - jelto@cumin1002"
  • 14:15 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1049 to wikikube-worker1074 - jelto@cumin1002"
  • 14:12 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1049 to wikikube-worker1074
  • 14:11 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Backport for Add Metrics Platform stream configuration for translate_extension (T364460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1048 to wikikube-worker1073
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1073
  • 14:10 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1073
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1048 to wikikube-worker1073 - jelto@cumin1002"
  • 14:09 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1048 to wikikube-worker1073 - jelto@cumin1002"
  • 14:06 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:06 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add Metrics Platform stream configuration for translate_extension (T364460)
  • 14:05 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1048 to wikikube-worker1073
  • 14:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1047 to wikikube-worker1072
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1072
  • 14:03 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1072
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1047 to wikikube-worker1072 - jelto@cumin1002"
  • 14:02 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1047 to wikikube-worker1072 - jelto@cumin1002"
  • 14:00 Lucas_WMDE: 'Updated the Wikidata property suggester with data from 20241125’s JSON dump: mwscript-k8s --attach -- extensions/PropertySuggester/maintenance/UpdateTable.php --wiki wikidatawiki --file php://stdin < wbs_propertypairs.csv # T377986, T376604'
  • 13:58 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:57 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1047 to wikikube-worker1072
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1047-1050].eqiad.wmnet
  • 13:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1047-1050].eqiad.wmnet
  • 13:46 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2103-2106].codfw.wmnet
  • 13:46 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2103-2106].codfw.wmnet
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1068,1070-1071].eqiad.wmnet
  • 13:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1068,1070-1071].eqiad.wmnet
  • 13:16 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 12:57 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 12:26 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2106.codfw.wmnet with OS bookworm
  • 12:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2105.codfw.wmnet with OS bookworm
  • 12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 12:12 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2104.codfw.wmnet with OS bookworm
  • 12:08 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2103.codfw.wmnet with OS bookworm
  • 12:07 moritzm: installing reportbug bugfix updates
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2106.codfw.wmnet with reason: host reimage
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 12:04 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 12:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2105.codfw.wmnet with reason: host reimage
  • 11:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1071.eqiad.wmnet with OS bookworm
  • 11:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1068.eqiad.wmnet with reason: host reimage
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1070.eqiad.wmnet with OS bookworm
  • 11:51 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2104.codfw.wmnet with reason: host reimage
  • 11:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1068.eqiad.wmnet with reason: host reimage
  • 11:48 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2103.codfw.wmnet with reason: host reimage
  • 11:46 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2106.codfw.wmnet with reason: host reimage
  • 11:45 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2105.codfw.wmnet with reason: host reimage
  • 11:45 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2104.codfw.wmnet with reason: host reimage
  • 11:45 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2103.codfw.wmnet with reason: host reimage
  • 11:42 root@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1210.eqiad.wmnet onto db1159.eqiad.wmnet
  • 11:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1071.eqiad.wmnet with reason: host reimage
  • 11:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1070.eqiad.wmnet with reason: host reimage
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 11:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1071.eqiad.wmnet with reason: host reimage
  • 11:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1070.eqiad.wmnet with reason: host reimage
  • 11:27 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 11:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2106
  • 11:27 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2106
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2106.codfw.wmnet with OS bookworm
  • 11:26 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2105
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2105
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2105.codfw.wmnet with OS bookworm
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2104
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2104
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2104.codfw.wmnet with OS bookworm
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2103
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2103
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2103.codfw.wmnet with OS bookworm
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2104.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2103.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2106.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2105.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1071.eqiad.wmnet with OS bookworm
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1070.eqiad.wmnet with OS bookworm
  • 11:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 11:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1068.eqiad.wmnet wikikube-worker1069.eqiad.wmnet wikikube-worker1070.eqiad.wmnet wikikube-worker1071.eqiad.wmnet on all recursors
  • 11:05 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1068.eqiad.wmnet wikikube-worker1069.eqiad.wmnet wikikube-worker1070.eqiad.wmnet wikikube-worker1071.eqiad.wmnet on all recursors
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1046 to wikikube-worker1071
  • 11:04 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1071
  • 11:03 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1071
  • 11:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:03 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1046 to wikikube-worker1071 - jelto@cumin1002"
  • 11:03 root@cumin1002: START - Cookbook sre.mysql.clone of db1210.eqiad.wmnet onto db1159.eqiad.wmnet
  • 11:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1046 to wikikube-worker1071 - jelto@cumin1002"
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2106.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2105.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2104.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2103.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: cloning
  • 11:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: cloning
  • 11:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: cloning
  • 11:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: cloning
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1210 to clone db1159 T381550', diff saved to https://phabricator.wikimedia.org/P71640 and previous config saved to /var/cache/conftool/dbconfig/20241209-105941-marostegui.json
  • 10:59 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:59 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1046 to wikikube-worker1071
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1045 to wikikube-worker1070
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1070
  • 10:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1070
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1045 to wikikube-worker1070 - jelto@cumin1002"
  • 10:56 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1045 to wikikube-worker1070 - jelto@cumin1002"
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 to clone es2045', diff saved to https://phabricator.wikimedia.org/P71639 and previous config saved to /var/cache/conftool/dbconfig/20241209-105508-marostegui.json
  • 10:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 10:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 10:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:52 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1045 to wikikube-worker1070
  • 10:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1044 to wikikube-worker1069
  • 10:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1069
  • 10:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1069
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1044 to wikikube-worker1069 - jelto@cumin1002"
  • 10:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1044 to wikikube-worker1069 - jelto@cumin1002"
  • 10:45 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1044 to wikikube-worker1069
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1043 to wikikube-worker1068
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1068
  • 10:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1068
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1043 to wikikube-worker1068 - jelto@cumin1002"
  • 10:42 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1043 to wikikube-worker1068 - jelto@cumin1002"
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2103-2106].codfw.wmnet
  • 10:39 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2103-2106].codfw.wmnet
  • 10:38 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:38 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wikikube-worker[2103-2106].codfw.wmnet with reason: reimage
  • 10:38 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1043 to wikikube-worker1068
  • 10:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wikikube-worker[2103-2106].codfw.wmnet with reason: reimage
  • 10:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1043-1046].eqiad.wmnet
  • 10:35 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 10:35 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 10:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1043-1046].eqiad.wmnet
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2091.codfw.wmnet with OS bookworm
  • 10:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2075.codfw.wmnet with OS bookworm
  • 10:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2074.codfw.wmnet with OS bookworm
  • 10:20 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1064-1067].eqiad.wmnet
  • 10:20 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1064-1067].eqiad.wmnet
  • 10:10 moritzm: rebalance Ganeti cluster in codfw/A following server refresh T376594
  • 10:10 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2091.codfw.wmnet with reason: host reimage
  • 10:06 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 10:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2075.codfw.wmnet with reason: host reimage
  • 10:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1065.eqiad.wmnet with OS bookworm
  • 10:03 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2074.codfw.wmnet with reason: host reimage
  • 10:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 10:00 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2091.codfw.wmnet with reason: host reimage
  • 09:59 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2075.codfw.wmnet with reason: host reimage
  • 09:59 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2074.codfw.wmnet with reason: host reimage
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1067.eqiad.wmnet with OS bookworm
  • 09:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1066.eqiad.wmnet with OS bookworm
  • 09:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1064.eqiad.wmnet with OS bookworm
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1065.eqiad.wmnet with reason: host reimage
  • 09:42 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2124.codfw.wmnet with reason: host reimage
  • 09:40 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2075.codfw.wmnet with OS bookworm
  • 09:40 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2074.codfw.wmnet with OS bookworm
  • 09:40 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2091.codfw.wmnet with OS bookworm
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2075.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2074.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2091.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2124.codfw.wmnet with reason: host reimage
  • 09:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1067.eqiad.wmnet with reason: host reimage
  • 09:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1066.eqiad.wmnet with reason: host reimage
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1064.eqiad.wmnet with reason: host reimage
  • 09:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1067.eqiad.wmnet with reason: host reimage
  • 09:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1066.eqiad.wmnet with reason: host reimage
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1065.eqiad.wmnet with reason: host reimage
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1064.eqiad.wmnet with reason: host reimage
  • 09:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 09:16 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 09:16 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 09:14 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2091.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:13 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2075.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:13 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2074.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1067.eqiad.wmnet with OS bookworm
  • 09:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1066.eqiad.wmnet with OS bookworm
  • 09:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1065.eqiad.wmnet with OS bookworm
  • 09:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1064.eqiad.wmnet with OS bookworm
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1064.eqiad.wmnet wikikube-worker1065.eqiad.wmnet wikikube-worker1066.eqiad.wmnet wikikube-worker1067.eqiad.wmnet on all recursors
  • 09:07 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1064.eqiad.wmnet wikikube-worker1065.eqiad.wmnet wikikube-worker1066.eqiad.wmnet wikikube-worker1067.eqiad.wmnet on all recursors
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1042 to wikikube-worker1067
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1067
  • 09:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1067
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1042 to wikikube-worker1067 - jelto@cumin1002"
  • 09:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1042 to wikikube-worker1067 - jelto@cumin1002"
  • 09:04 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 09:02 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 09:01 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wikikube-worker[2074-2075,2091,2124].codfw.wmnet with reason: reimage
  • 09:01 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:00 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wikikube-worker[2074-2075,2091,2124].codfw.wmnet with reason: reimage
  • 09:00 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1042 to wikikube-worker1067
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1041 to wikikube-worker1066
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1066
  • 08:58 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1066
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1041 to wikikube-worker1066 - jelto@cumin1002"
  • 08:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1041 to wikikube-worker1066 - jelto@cumin1002"
  • 08:54 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1041 to wikikube-worker1066
  • 08:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1040 to wikikube-worker1065
  • 08:52 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1065
  • 08:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1065
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1040 to wikikube-worker1065 - jelto@cumin1002"
  • 08:50 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1040 to wikikube-worker1065 - jelto@cumin1002"
  • 08:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1040 to wikikube-worker1065
  • 08:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1039 to wikikube-worker1064
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1064
  • 08:41 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:40 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1064
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1039 to wikikube-worker1064 - jelto@cumin1002"
  • 08:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1039 to wikikube-worker1064 - jelto@cumin1002"
  • 08:36 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:35 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:35 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1039 to wikikube-worker1064
  • 08:35 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:34 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1039-1042].eqiad.wmnet
  • 08:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1039-1042].eqiad.wmnet
  • 07:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1056.eqiad.wmnet
  • 07:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1056.eqiad.wmnet
  • 07:18 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 06:29 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 06:28 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:54 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:53 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:41 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:40 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:39 tstarling@deploy2002: Finished deploy [restbase/deploy@8184836]: also deploy to restbase2036-9 T380726 T377896 (duration: 16m 06s)
  • 05:23 tstarling@deploy2002: Started deploy [restbase/deploy@8184836]: also deploy to restbase2036-9 T380726 T377896
  • 04:45 tstarling@deploy2002: Finished deploy [restbase/deploy@0531d4e]: try again after removing decom servers T380790 T380726 (duration: 14m 36s)
  • 04:31 tstarling@deploy2002: Started deploy [restbase/deploy@0531d4e]: try again after removing decom servers T380790 T380726
  • 04:29 tstarling@deploy2002: Finished deploy [restbase/deploy@27f4a8e]: try again, seems like restbase2026 at least was skipped T380726 (duration: 09m 00s)
  • 04:20 tstarling@deploy2002: Started deploy [restbase/deploy@27f4a8e]: try again, seems like restbase2026 at least was skipped T380726
  • 04:08 tstarling@deploy2002: Finished deploy [restbase/deploy@27f4a8e]: add 3 wikis T380726 (duration: 10m 46s)
  • 03:58 tstarling@deploy2002: Started deploy [restbase/deploy@27f4a8e]: add 3 wikis T380726
  • 03:55 tstarling@deploy2002: Finished deploy [restbase/deploy@6d0b97e]: no-op test deploy (duration: 11m 22s)
  • 03:44 tstarling@deploy2002: Started deploy [restbase/deploy@6d0b97e]: no-op test deploy
  • 03:30 tstarling@deploy2002: Finished scap sync-world: Backport for Prepare for migration of the Interwiki extension to core (T33951) (duration: 31m 17s)
  • 03:20 tstarling@deploy2002: tstarling: Continuing with sync
  • 03:10 tstarling@deploy2002: tstarling: Backport for Prepare for migration of the Interwiki extension to core (T33951) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 02:59 tstarling@deploy2002: Started scap sync-world: Backport for Prepare for migration of the Interwiki extension to core (T33951)

2024-12-08

  • 19:25 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:24 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-07

  • 00:33 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-06

  • 23:18 mutante: clouddumps1001/clouddumps1002: rm /srv/dumps/xmldatadumps/public/other/misc/phabricator_public.dump - an uncompressed old file from Sep 2023 - normal dumps are gzipped and current
  • 22:33 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 22:33 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 20:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 20:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 19:40 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:00 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 17:31 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 17:29 topranks: splitting codfw -> eqsin traffic over path via ulsfo as direct link is saturated
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1058-1063].eqiad.wmnet
  • 17:08 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1058-1063].eqiad.wmnet
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 16:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1059.eqiad.wmnet with reason: host reimage
  • 16:45 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1059.eqiad.wmnet with reason: host reimage
  • 16:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1061.eqiad.wmnet with OS bookworm
  • 16:29 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 16:29 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1063.eqiad.wmnet with OS bookworm
  • 16:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1062.eqiad.wmnet with OS bookworm
  • 16:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1060.eqiad.wmnet with OS bookworm
  • 16:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1058.eqiad.wmnet with OS bookworm
  • 16:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1061.eqiad.wmnet with reason: host reimage
  • 16:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 16:10 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1063.eqiad.wmnet with reason: host reimage
  • 16:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1062.eqiad.wmnet with reason: host reimage
  • 16:01 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1060.eqiad.wmnet with reason: host reimage
  • 15:59 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1063.eqiad.wmnet with reason: host reimage
  • 15:58 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1062.eqiad.wmnet with reason: host reimage
  • 15:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1058.eqiad.wmnet with reason: host reimage
  • 15:58 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1061.eqiad.wmnet with reason: host reimage
  • 15:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1060.eqiad.wmnet with reason: host reimage
  • 15:54 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1058.eqiad.wmnet with reason: host reimage
  • 15:43 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1063.eqiad.wmnet with OS bookworm
  • 15:43 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1062.eqiad.wmnet with OS bookworm
  • 15:42 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1061.eqiad.wmnet with OS bookworm
  • 15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1060.eqiad.wmnet with OS bookworm
  • 15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 15:39 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1058.eqiad.wmnet with OS bookworm
  • 15:36 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker1058.eqiad.wmnet
  • 15:36 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1058.eqiad.wmnet with OS bullseye
  • 15:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1058.eqiad.wmnet with OS bullseye
  • 15:36 kamila@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1058.eqiad.wmnet
  • 15:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1058.eqiad.wmnet wikikube-worker1059.eqiad.wmnet wikikube-worker1060.eqiad.wmnet wikikube-worker1061.eqiad.wmnet wikikube-worker1062.eqiad.wmnet wikikube-worker1063.eqiad.wmnet on all recursors
  • 15:34 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1058.eqiad.wmnet wikikube-worker1059.eqiad.wmnet wikikube-worker1060.eqiad.wmnet wikikube-worker1061.eqiad.wmnet wikikube-worker1062.eqiad.wmnet wikikube-worker1063.eqiad.wmnet on all recursors
  • 15:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1434 to wikikube-worker1062
  • 15:33 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1062
  • 15:32 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1062
  • 15:32 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1435 to wikikube-worker1063
  • 15:30 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1063
  • 15:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:29 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1063
  • 15:29 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1433 to wikikube-worker1061
  • 15:28 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1061
  • 15:27 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:27 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1061
  • 15:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:26 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1433 to wikikube-worker1061 - kamila@cumin1002"
  • 15:26 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1433 to wikikube-worker1061 - kamila@cumin1002"
  • 15:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1432 to wikikube-worker1060
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1060
  • 15:22 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:22 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1060
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1432 to wikikube-worker1060 - kamila@cumin1002"
  • 15:22 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1432 to wikikube-worker1060 - kamila@cumin1002"
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1435 to wikikube-worker1063
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1434 to wikikube-worker1062
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1433 to wikikube-worker1061
  • 15:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1431 to wikikube-worker1059
  • 15:19 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1059
  • 15:18 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:18 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1059
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1431 to wikikube-worker1059 - kamila@cumin1002"
  • 15:18 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1432 to wikikube-worker1060
  • 15:18 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1431 to wikikube-worker1059 - kamila@cumin1002"
  • 15:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1430 to wikikube-worker1058
  • 15:14 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1058
  • 15:14 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:13 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1058
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1430 to wikikube-worker1058 - kamila@cumin1002"
  • 15:13 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1430 to wikikube-worker1058 - kamila@cumin1002"
  • 15:10 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1431 to wikikube-worker1059
  • 15:08 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:08 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1430 to wikikube-worker1058
  • 15:05 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1430-1435].eqiad.wmnet
  • 15:02 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1430-1435].eqiad.wmnet
  • 14:50 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 14:49 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: ganeti1009.eqiad.wmnet
  • 14:27 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: ganeti1009.eqiad.wmnet
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1020.eqiad.wmnet
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 14:19 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 14:11 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1056.eqiad.wmnet with OS bookworm
  • 13:56 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1020.eqiad.wmnet
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1018.eqiad.wmnet
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1018.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1056.eqiad.wmnet with reason: host reimage
  • 13:47 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1018.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1056.eqiad.wmnet with reason: host reimage
  • 13:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1018.eqiad.wmnet
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1017.eqiad.wmnet
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1017.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1017.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 13:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1056.eqiad.wmnet with OS bookworm
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1056.eqiad.wmnet wikikube-worker1057.eqiad.wmnet on all recursors
  • 13:25 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1056.eqiad.wmnet wikikube-worker1057.eqiad.wmnet on all recursors
  • 13:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1017.eqiad.wmnet
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1038 to wikikube-worker1057
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1057
  • 13:18 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1057
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:17 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1038 to wikikube-worker1057 - jelto@cumin1002"
  • 13:17 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1038 to wikikube-worker1057 - jelto@cumin1002"
  • 13:13 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:13 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1038 to wikikube-worker1057
  • 13:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1037 to wikikube-worker1056
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1016.eqiad.wmnet
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:11 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1056
  • 13:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:10 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1056
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1037 to wikikube-worker1056 - jelto@cumin1002"
  • 13:07 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:01 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1037 to wikikube-worker1056 - jelto@cumin1002"
  • 12:58 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1016.eqiad.wmnet
  • 12:56 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:56 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1037 to wikikube-worker1056
  • 12:48 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1037-1038].eqiad.wmnet
  • 12:47 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1037-1038].eqiad.wmnet
  • 12:40 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti1009.eqiad.wmnet
  • 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1009.eqiad.wmnet
  • 12:15 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1054-1055].eqiad.wmnet
  • 12:15 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1054-1055].eqiad.wmnet
  • 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host build2002.codfw.wmnet with OS bookworm
  • 11:58 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1055.eqiad.wmnet with OS bookworm
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1054.eqiad.wmnet with OS bookworm
  • 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 11:48 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1055.eqiad.wmnet with reason: host reimage
  • 11:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1054.eqiad.wmnet with reason: host reimage
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1055.eqiad.wmnet with reason: host reimage
  • 11:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1054.eqiad.wmnet with reason: host reimage
  • 11:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host build2002.codfw.wmnet with OS bookworm
  • 11:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1055.eqiad.wmnet with OS bookworm
  • 11:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1054.eqiad.wmnet with OS bookworm
  • 11:00 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1054.eqiad.wmnet wikikube-worker1055.eqiad.wmnet on all recursors
  • 11:00 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1054.eqiad.wmnet wikikube-worker1055.eqiad.wmnet on all recursors
  • 10:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1036 to wikikube-worker1055
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1055
  • 10:57 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1055
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1036 to wikikube-worker1055 - jelto@cumin1002"
  • 10:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1036 to wikikube-worker1055 - jelto@cumin1002"
  • 10:53 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2042.codfw.wmnet to cluster codfw and group D
  • 10:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1036 to wikikube-worker1055
  • 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2042.codfw.wmnet to cluster codfw and group D
  • 10:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1035 to wikikube-worker1054
  • 10:48 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1054
  • 10:47 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1054
  • 10:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:47 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1035 to wikikube-worker1054 - jelto@cumin1002"
  • 10:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1035 to wikikube-worker1054 - jelto@cumin1002"
  • 10:43 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:43 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1035 to wikikube-worker1054
  • 10:41 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1035-1036].eqiad.wmnet
  • 10:39 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1035-1036].eqiad.wmnet
  • 10:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1052-1053].eqiad.wmnet
  • 10:27 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1052-1053].eqiad.wmnet
  • 10:11 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 10:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1053.eqiad.wmnet with OS bookworm
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1052.eqiad.wmnet with OS bookworm
  • 09:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1053.eqiad.wmnet with reason: host reimage
  • 09:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1052.eqiad.wmnet with reason: host reimage
  • 09:46 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1053.eqiad.wmnet with reason: host reimage
  • 09:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1052.eqiad.wmnet with reason: host reimage
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1053.eqiad.wmnet with OS bookworm
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1052.eqiad.wmnet with OS bookworm
  • 09:24 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1052.eqiad.wmnet wikikube-worker1053.eqiad.wmnet on all recursors
  • 09:24 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1052.eqiad.wmnet wikikube-worker1053.eqiad.wmnet on all recursors
  • 09:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1034 to wikikube-worker1053
  • 09:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1053
  • 09:21 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1053
  • 09:21 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:21 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1034 to wikikube-worker1053 - jelto@cumin1002"
  • 09:20 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1034 to wikikube-worker1053 - jelto@cumin1002"
  • 09:16 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:16 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1034 to wikikube-worker1053
  • 09:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1033 to wikikube-worker1052
  • 09:15 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1052
  • 09:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1052
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1033 to wikikube-worker1052 - jelto@cumin1002"
  • 09:13 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1033 to wikikube-worker1052 - jelto@cumin1002"
  • 09:09 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1033 to wikikube-worker1052
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1033-1034].eqiad.wmnet
  • 09:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1033-1034].eqiad.wmnet
  • 09:00 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:55 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:43 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:33 moritzm: uploaded ruby-sys-filesystem 1.4.3-1~wmf11u1 to component/puppet7 for Bullseye (needed by the mountpoints fact in facter 4) T381538
  • 08:33 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:30 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:30 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:26 hashar@deploy2002: Finished deploy [gerrit/gerrit@ac50ebe]: Reinstate the banner for the developer survey (duration: 00m 11s)
  • 08:26 hashar@deploy2002: Started deploy [gerrit/gerrit@ac50ebe]: Reinstate the banner for the developer survey
  • 08:18 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:18 elukey@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:17 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:17 elukey@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:17 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:16 elukey@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71633 and previous config saved to /var/cache/conftool/dbconfig/20241206-072120-root.json
  • 07:20 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:19 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:07 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:06 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71632 and previous config saved to /var/cache/conftool/dbconfig/20241206-070614-root.json
  • 07:05 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:04 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71630 and previous config saved to /var/cache/conftool/dbconfig/20241206-063603-root.json
  • 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71629 and previous config saved to /var/cache/conftool/dbconfig/20241206-062527-root.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71628 and previous config saved to /var/cache/conftool/dbconfig/20241206-062058-root.json
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71627 and previous config saved to /var/cache/conftool/dbconfig/20241206-061021-root.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 5%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71626 and previous config saved to /var/cache/conftool/dbconfig/20241206-060552-root.json
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71625 and previous config saved to /var/cache/conftool/dbconfig/20241206-055516-root.json
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 1%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71624 and previous config saved to /var/cache/conftool/dbconfig/20241206-055047-root.json
  • 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2044 to dbctl depooled T381259', diff saved to https://phabricator.wikimedia.org/P71623 and previous config saved to /var/cache/conftool/dbconfig/20241206-054457-marostegui.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71622 and previous config saved to /var/cache/conftool/dbconfig/20241206-054010-root.json
  • 01:50 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:48 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:32 TimStarling: on mwmaint2002: deleting MediaWiki:Sitesupport-url pages per T379205
  • 01:16 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:15 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:19 urbanecm: mwmaint2002: foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/revalidateLinkRecommendations.php --all --verbose # T380455
  • 00:19 urbanecm: Delete previously-started mwscript-k8s instances of revalidateLinkRecommendations.php (T380455)

2024-12-05

  • 23:26 jhathaway: looking at puppet failures on an-workers
  • 23:23 urbanecm: Start revalidateLinkRecommendations.php for Add Link-enabled wikis via mwscript-k8s (T380455)
  • 22:53 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 22:00 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 21:24 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 21:23 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 21:21 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 21:21 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 21:20 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 21:19 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 20:27 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 19:34 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 19:28 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.6 refs T375665
  • 18:44 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 18:13 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1048-1049,1051].eqiad.wmnet
  • 18:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1048-1049,1051].eqiad.wmnet
  • 18:00 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 18:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:59 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 17:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 17:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 17:41 pt1979@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:40 pt1979@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1051.eqiad.wmnet with reason: host reimage
  • 17:36 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1051.eqiad.wmnet with reason: host reimage
  • 17:36 jhathaway: upgrading facter on bullseye puppet nodes
  • 17:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1050.eqiad.wmnet with reason: host reimage
  • 17:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1050.eqiad.wmnet with reason: host reimage
  • 17:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1125.eqiad.wmnet with reason: Test setup should not alert
  • 17:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1125.eqiad.wmnet with reason: Test setup should not alert
  • 17:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:19 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:19 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 17:19 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 17:11 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 17:11 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 17:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1049.eqiad.wmnet with OS bookworm
  • 17:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1048.eqiad.wmnet with OS bookworm
  • 17:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:01 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:01 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:00 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1049.eqiad.wmnet with reason: host reimage
  • 16:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1048.eqiad.wmnet with reason: host reimage
  • 16:46 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 16:45 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 16:43 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1049.eqiad.wmnet with reason: host reimage
  • 16:43 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1048.eqiad.wmnet with reason: host reimage
  • 16:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:31 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:31 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:30 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:29 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:29 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 16:29 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 16:29 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 16:28 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 16:28 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 16:28 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 16:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1049.eqiad.wmnet with OS bookworm
  • 16:25 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1048.eqiad.wmnet with OS bookworm
  • 16:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1048.eqiad.wmnet wikikube-worker1049.eqiad.wmnet wikikube-worker1050.eqiad.wmnet wikikube-worker1051.eqiad.wmnet on all recursors
  • 16:23 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1048.eqiad.wmnet wikikube-worker1049.eqiad.wmnet wikikube-worker1050.eqiad.wmnet wikikube-worker1051.eqiad.wmnet on all recursors
  • 16:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1032 to wikikube-worker1051
  • 16:22 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:22 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1051
  • 16:21 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1051
  • 16:21 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:21 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1032 to wikikube-worker1051 - jelto@cumin1002"
  • 16:21 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1032 to wikikube-worker1051 - jelto@cumin1002"
  • 16:20 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 16:20 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:20 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:20 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 16:20 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 16:19 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 16:19 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:18 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:18 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 16:18 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 16:17 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:17 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:17 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:17 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:15 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:15 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:15 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for cloudelastic - jclark@cumin1002"
  • 16:14 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for cloudelastic - jclark@cumin1002"
  • 16:14 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:14 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:13 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1032 to wikikube-worker1051
  • 16:13 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:12 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:11 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 16:11 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 16:11 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 16:11 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 16:08 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1010.eqiad.wmnet
  • 16:08 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1010.eqiad.wmnet
  • 16:08 jiji@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:07 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 16:06 jiji@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 16:04 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 16:04 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:04 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:02 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1031 to wikikube-worker1050
  • 15:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1050
  • 15:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1050
  • 15:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1031 to wikikube-worker1050 - jelto@cumin1002"
  • 15:55 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1031 to wikikube-worker1050 - jelto@cumin1002"
  • 15:55 moritzm: installing nghttp2 security updates
  • 15:52 jiji@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 15:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:51 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1031 to wikikube-worker1050
  • 15:41 jiji@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 15:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1030 to wikikube-worker1049
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1049
  • 15:27 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1049
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1030 to wikikube-worker1049 - jelto@cumin1002"
  • 15:26 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1030 to wikikube-worker1049 - jelto@cumin1002"
  • 15:24 moritzm: installing postgresql security updates
  • 15:22 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:22 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1030 to wikikube-worker1049
  • 15:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1029 to wikikube-worker1048
  • 15:20 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1048
  • 15:18 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1048
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1029 to wikikube-worker1048 - jelto@cumin1002"
  • 15:18 dbrant@deploy2002: Finished scap sync-world: Backport for Enable Parsoid Fragment mode on Chart pilot wikis (T381436 T381312 T380758) (duration: 19m 51s)
  • 15:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1029 to wikikube-worker1048 - jelto@cumin1002"
  • 15:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:13 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:12 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1029 to wikikube-worker1048
  • 15:10 dbrant@deploy2002: dbrant, cscott: Continuing with sync
  • 15:10 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1029-1032].eqiad.wmnet
  • 15:10 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:10 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:10 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:10 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:09 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:09 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1029-1032].eqiad.wmnet
  • 15:07 dbrant@deploy2002: dbrant, cscott: Backport for Enable Parsoid Fragment mode on Chart pilot wikis (T381436 T381312 T380758) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 14:58 dbrant@deploy2002: Started scap sync-world: Backport for Enable Parsoid Fragment mode on Chart pilot wikis (T381436 T381312 T380758)
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS bookworm
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
  • 14:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1046-1047].eqiad.wmnet
  • 14:36 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1046-1047].eqiad.wmnet
  • 14:34 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
  • 14:29 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 14:28 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 14:28 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 14:27 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 14:20 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:20 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 14:20 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1047.eqiad.wmnet with OS bookworm
  • 14:18 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 14:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 14:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1046.eqiad.wmnet with OS bookworm
  • 14:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1047.eqiad.wmnet with reason: host reimage
  • 13:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1046.eqiad.wmnet with reason: host reimage
  • 13:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1047.eqiad.wmnet with reason: host reimage
  • 13:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1046.eqiad.wmnet with reason: host reimage
  • 13:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:42 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2178.codfw.wmnet
  • 13:42 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2178.codfw.wmnet
  • 13:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1047.eqiad.wmnet with OS bookworm
  • 13:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1046.eqiad.wmnet with OS bookworm
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1046.eqiad.wmnet wikikube-worker1047.eqiad.wmnet on all recursors
  • 13:32 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1046.eqiad.wmnet wikikube-worker1047.eqiad.wmnet on all recursors
  • 13:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2178.codfw.wmnet with OS bookworm
  • 13:08 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1005,1010].eqiad.wmnet with reason: Hardware refresh
  • 13:08 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1005,1010].eqiad.wmnet with reason: Hardware refresh
  • 13:03 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2178.codfw.wmnet with reason: host reimage
  • 12:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1028 to wikikube-worker1047
  • 12:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1047
  • 12:57 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1047
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1028 to wikikube-worker1047 - jelto@cumin1002"
  • 12:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1028 to wikikube-worker1047 - jelto@cumin1002"
  • 12:56 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2178.codfw.wmnet with reason: host reimage
  • 12:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1028 to wikikube-worker1047
  • 12:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1027 to wikikube-worker1046
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1046
  • 12:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1046
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1027 to wikikube-worker1046 - jelto@cumin1002"
  • 12:50 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1027 to wikikube-worker1046 - jelto@cumin1002"
  • 12:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1027 to wikikube-worker1046
  • 12:42 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2176-2177,2179].codfw.wmnet
  • 12:42 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2176-2177,2179].codfw.wmnet
  • 12:39 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1027-1028].eqiad.wmnet
  • 12:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2178
  • 12:37 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2178
  • 12:36 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2178
  • 12:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2178.codfw.wmnet 185.48.192.10.in-addr.arpa 5.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:36 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2178.codfw.wmnet 185.48.192.10.in-addr.arpa 5.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:36 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2178 - jayme@cumin2002"
  • 12:36 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2178 - jayme@cumin2002"
  • 12:36 jgleeson: payments updated from 119448ca to ab7e70ec
  • 12:35 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1027-1028].eqiad.wmnet
  • 12:26 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 12:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T371742)', diff saved to https://phabricator.wikimedia.org/P71620 and previous config saved to /var/cache/conftool/dbconfig/20241205-121609-ladsgroup.json
  • 12:15 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1038-1043].eqiad.wmnet
  • 12:15 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1038-1043].eqiad.wmnet
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 12:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2177.codfw.wmnet with OS bookworm
  • 12:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P71617 and previous config saved to /var/cache/conftool/dbconfig/20241205-120102-ladsgroup.json
  • 11:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2176.codfw.wmnet with reason: host reimage
  • 11:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P71616 and previous config saved to /var/cache/conftool/dbconfig/20241205-114555-ladsgroup.json
  • 11:43 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2176.codfw.wmnet with reason: host reimage
  • 11:42 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2177.codfw.wmnet with reason: host reimage
  • 11:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2177.codfw.wmnet with reason: host reimage
  • 11:38 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2179.codfw.wmnet with reason: host reimage
  • 11:34 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2179.codfw.wmnet with reason: host reimage
  • 11:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T371742)', diff saved to https://phabricator.wikimedia.org/P71615 and previous config saved to /var/cache/conftool/dbconfig/20241205-113048-ladsgroup.json
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2178
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2176
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2176
  • 11:25 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2176
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2176.codfw.wmnet 81.48.192.10.in-addr.arpa 1.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:25 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2176.codfw.wmnet 81.48.192.10.in-addr.arpa 1.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2176 - jayme@cumin2002"
  • 11:24 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2176 - jayme@cumin2002"
  • 11:20 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:19 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2176
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2177
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2177
  • 11:19 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2177
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2177.codfw.wmnet 83.48.192.10.in-addr.arpa 3.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2177.codfw.wmnet 83.48.192.10.in-addr.arpa 3.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2177 - jayme@cumin2002"
  • 11:19 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2177 - jayme@cumin2002"
  • 11:15 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 11:15 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 11:15 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2177
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2179
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2179
  • 11:15 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2179
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2179.codfw.wmnet 207.48.192.10.in-addr.arpa 7.0.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:15 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2179.codfw.wmnet 207.48.192.10.in-addr.arpa 7.0.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2179 - jayme@cumin2002"
  • 11:15 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2179 - jayme@cumin2002"
  • 11:14 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 11:13 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2177.codfw.wmnet with OS bookworm
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2449 to wikikube-worker2177
  • 11:12 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2177
  • 11:12 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2178.codfw.wmnet with OS bookworm
  • 11:11 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2177
  • 11:11 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:11 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2449 to wikikube-worker2177 - jayme@cumin2002"
  • 11:11 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:11 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2179
  • 11:11 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:10 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:09 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2449 to wikikube-worker2177 - jayme@cumin2002"
  • 11:09 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:05 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:05 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2449 to wikikube-worker2177
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2451 to wikikube-worker2179
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2179
  • 10:52 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2179
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2450 to wikikube-worker2178
  • 10:50 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2178
  • 10:50 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:50 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2178
  • 10:50 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2450 to wikikube-worker2178 - jayme@cumin2002"
  • 10:50 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2450 to wikikube-worker2178 - jayme@cumin2002"
  • 10:47 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2448 to wikikube-worker2176
  • 10:46 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2176
  • 10:46 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2176
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2448 to wikikube-worker2176 - jayme@cumin2002"
  • 10:45 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2448 to wikikube-worker2176 - jayme@cumin2002"
  • 10:43 dcausse: reindexed all wikidata entity schemas (T376252)
  • 10:42 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2451 to wikikube-worker2179
  • 10:42 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2449 to wikikube-worker2177
  • 10:42 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2450 to wikikube-worker2178
  • 10:42 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2449 to wikikube-worker2177
  • 10:42 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:41 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2448 to wikikube-worker2176
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2451.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2450.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2449.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2448.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:33 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1044-1045].eqiad.wmnet
  • 10:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1044-1045].eqiad.wmnet
  • 10:28 jelto: homer 'lsw1-f3-eqiad*' commit 'T377876'
  • 10:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1045.eqiad.wmnet with OS bookworm
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1045.eqiad.wmnet with reason: host reimage
  • 10:04 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1045.eqiad.wmnet with reason: host reimage
  • 09:56 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2451.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T371742)', diff saved to https://phabricator.wikimedia.org/P71614 and previous config saved to /var/cache/conftool/dbconfig/20241205-095554-ladsgroup.json
  • 09:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:55 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2450.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:55 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2449.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:54 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1045.eqiad.wmnet with OS bookworm
  • 09:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1044.eqiad.wmnet with OS bookworm
  • 09:40 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2450-2451].codfw.wmnet
  • 09:39 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2450-2451].codfw.wmnet
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2448-2451].codfw.wmnet with reason: reimage
  • 09:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2448-2451].codfw.wmnet with reason: reimage
  • 09:38 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2448-2449].codfw.wmnet
  • 09:37 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2448-2449].codfw.wmnet
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1044.eqiad.wmnet with reason: host reimage
  • 09:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1044.eqiad.wmnet with reason: host reimage
  • 09:20 jayme: destroyed unused expiring puppet certs - T381474
  • 09:15 fabfur: deploying haproxykafka also on magru and drmrs (T378578)
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1044.eqiad.wmnet with OS bookworm
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1044.eqiad.wmnet wikikube-worker1045.eqiad.wmnet on all recursors
  • 09:06 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1044.eqiad.wmnet wikikube-worker1045.eqiad.wmnet on all recursors
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1026 to wikikube-worker1045
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1045
  • 09:03 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1045
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1026 to wikikube-worker1045 - jelto@cumin1002"
  • 09:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1026 to wikikube-worker1045 - jelto@cumin1002"
  • 08:58 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:58 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1026 to wikikube-worker1045
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1025 to wikikube-worker1044
  • 08:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1044
  • 08:55 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1044
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1025 to wikikube-worker1044 - jelto@cumin1002"
  • 08:54 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1025 to wikikube-worker1044 - jelto@cumin1002"
  • 08:49 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1025 to wikikube-worker1044
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:46 moritzm: rebalance Ganeti eqiad/D following server refreshes
  • 08:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T371742)', diff saved to https://phabricator.wikimedia.org/P71611 and previous config saved to /var/cache/conftool/dbconfig/20241205-080745-ladsgroup.json
  • 07:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P71610 and previous config saved to /var/cache/conftool/dbconfig/20241205-075237-ladsgroup.json
  • 07:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P71609 and previous config saved to /var/cache/conftool/dbconfig/20241205-073730-ladsgroup.json
  • 07:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1025-1026].eqiad.wmnet
  • 07:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1025-1026].eqiad.wmnet
  • 07:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T371742)', diff saved to https://phabricator.wikimedia.org/P71608 and previous config saved to /var/cache/conftool/dbconfig/20241205-072223-ladsgroup.json
  • 07:16 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 06:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 100%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71607 and previous config saved to /var/cache/conftool/dbconfig/20241205-063132-root.json
  • 06:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 75%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71606 and previous config saved to /var/cache/conftool/dbconfig/20241205-061626-root.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71605 and previous config saved to /var/cache/conftool/dbconfig/20241205-060631-root.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71604 and previous config saved to /var/cache/conftool/dbconfig/20241205-060612-root.json
  • 06:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 50%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71603 and previous config saved to /var/cache/conftool/dbconfig/20241205-060121-root.json
  • 05:58 eileen: civicrm upgraded from 74c059a4 to f9c89e50
  • 05:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T371742)', diff saved to https://phabricator.wikimedia.org/P71602 and previous config saved to /var/cache/conftool/dbconfig/20241205-055442-ladsgroup.json
  • 05:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T371742)', diff saved to https://phabricator.wikimedia.org/P71601 and previous config saved to /var/cache/conftool/dbconfig/20241205-055420-ladsgroup.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71600 and previous config saved to /var/cache/conftool/dbconfig/20241205-055125-root.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71599 and previous config saved to /var/cache/conftool/dbconfig/20241205-055106-root.json
  • 05:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 25%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71598 and previous config saved to /var/cache/conftool/dbconfig/20241205-054615-root.json
  • 05:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2023.codfw.wmnet with reason: cloning
  • 05:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2023.codfw.wmnet with reason: cloning
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2023 to clone es2044', diff saved to https://phabricator.wikimedia.org/P71597 and previous config saved to /var/cache/conftool/dbconfig/20241205-054200-marostegui.json
  • 05:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2025 to es5 master T381259', diff saved to https://phabricator.wikimedia.org/P71596 and previous config saved to /var/cache/conftool/dbconfig/20241205-054114-marostegui.json
  • 05:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P71595 and previous config saved to /var/cache/conftool/dbconfig/20241205-053912-ladsgroup.json
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71593 and previous config saved to /var/cache/conftool/dbconfig/20241205-053620-root.json
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71592 and previous config saved to /var/cache/conftool/dbconfig/20241205-053601-root.json
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 10%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71591 and previous config saved to /var/cache/conftool/dbconfig/20241205-053109-root.json
  • 05:28 marostegui: Failover m3 from db1159 to db1213 - T381365
  • 05:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P71590 and previous config saved to /var/cache/conftool/dbconfig/20241205-052405-ladsgroup.json
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71589 and previous config saved to /var/cache/conftool/dbconfig/20241205-052114-root.json
  • 05:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71588 and previous config saved to /var/cache/conftool/dbconfig/20241205-052056-root.json
  • 05:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1159,1213,1217].eqiad.wmnet with reason: m3 master switchover T381365
  • 05:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1159,1213,1217].eqiad.wmnet with reason: m3 master switchover T381365
  • 05:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 1%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71587 and previous config saved to /var/cache/conftool/dbconfig/20241205-051604-root.json
  • 05:15 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2043 depooled T381259', diff saved to https://phabricator.wikimedia.org/P71586 and previous config saved to /var/cache/conftool/dbconfig/20241205-051545-marostegui.json
  • 05:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T371742)', diff saved to https://phabricator.wikimedia.org/P71585 and previous config saved to /var/cache/conftool/dbconfig/20241205-050858-ladsgroup.json
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71584 and previous config saved to /var/cache/conftool/dbconfig/20241205-050609-root.json
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71583 and previous config saved to /var/cache/conftool/dbconfig/20241205-050550-root.json
  • 03:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T371742)', diff saved to https://phabricator.wikimedia.org/P71578 and previous config saved to /var/cache/conftool/dbconfig/20241205-033803-ladsgroup.json
  • 03:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 03:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 03:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T371742)', diff saved to https://phabricator.wikimedia.org/P71577 and previous config saved to /var/cache/conftool/dbconfig/20241205-033751-ladsgroup.json
  • 03:34 eileen: tools upgraded from b230f718 to c7b53ecd
  • 03:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P71576 and previous config saved to /var/cache/conftool/dbconfig/20241205-032245-ladsgroup.json
  • 03:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P71575 and previous config saved to /var/cache/conftool/dbconfig/20241205-030737-ladsgroup.json
  • 02:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T371742)', diff saved to https://phabricator.wikimedia.org/P71574 and previous config saved to /var/cache/conftool/dbconfig/20241205-025230-ladsgroup.json
  • 02:45 eileen: civicrm upgraded from 6361a578 to 74c059a4
  • 01:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1209 (T371742)', diff saved to https://phabricator.wikimedia.org/P71573 and previous config saved to /var/cache/conftool/dbconfig/20241205-012108-ladsgroup.json
  • 01:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T371742)', diff saved to https://phabricator.wikimedia.org/P71572 and previous config saved to /var/cache/conftool/dbconfig/20241205-012046-ladsgroup.json
  • 01:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P71571 and previous config saved to /var/cache/conftool/dbconfig/20241205-010539-ladsgroup.json
  • 01:03 sukhe: re-enabling puppet on A:lvs [post-wdqs merge]
  • 00:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:57 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P71570 and previous config saved to /var/cache/conftool/dbconfig/20241205-005031-ladsgroup.json
  • 00:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T371742)', diff saved to https://phabricator.wikimedia.org/P71569 and previous config saved to /var/cache/conftool/dbconfig/20241205-003524-ladsgroup.json
  • 00:30 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 00:15 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1085.eqiad.wmnet with OS bullseye
  • 00:15 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 00:15 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 00:00 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 23:00:00 on 8 hosts with reason: T376150 non-prod hosts

2024-12-04

  • 23:59 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 23:00:00 on 8 hosts with reason: T376150 non-prod hosts
  • 23:57 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1085.eqiad.wmnet with reason: host reimage
  • 23:54 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1085.eqiad.wmnet with reason: host reimage
  • 23:47 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 23:43 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1085.eqiad.wmnet with OS bullseye
  • 23:42 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS bookworm
  • 23:42 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:40 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:39 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:35 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:35 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:26 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1085.eqiad.wmnet with OS bullseye
  • 23:21 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:20 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1043.eqiad.wmnet with OS bookworm
  • 23:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1042.eqiad.wmnet with OS bookworm
  • 23:10 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 22:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1043.eqiad.wmnet with reason: host reimage
  • 22:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T371742)', diff saved to https://phabricator.wikimedia.org/P71567 and previous config saved to /var/cache/conftool/dbconfig/20241204-225545-ladsgroup.json
  • 22:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 22:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T371742)', diff saved to https://phabricator.wikimedia.org/P71566 and previous config saved to /var/cache/conftool/dbconfig/20241204-225523-ladsgroup.json
  • 22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1041.eqiad.wmnet with OS bookworm
  • 22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1042.eqiad.wmnet with reason: host reimage
  • 22:51 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1040.eqiad.wmnet with OS bookworm
  • 22:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1043.eqiad.wmnet with reason: host reimage
  • 22:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1042.eqiad.wmnet with reason: host reimage
  • 22:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P71565 and previous config saved to /var/cache/conftool/dbconfig/20241204-224016-ladsgroup.json
  • 22:38 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
  • 22:37 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1045.eqiad.wmnet with OS bookworm
  • 22:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1041.eqiad.wmnet with reason: host reimage
  • 22:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1043.eqiad.wmnet with OS bookworm
  • 22:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1042.eqiad.wmnet with OS bookworm
  • 22:33 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
  • 22:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1040.eqiad.wmnet with reason: host reimage
  • 22:30 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1041.eqiad.wmnet with reason: host reimage
  • 22:29 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1040.eqiad.wmnet with reason: host reimage
  • 22:26 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 22:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P71564 and previous config saved to /var/cache/conftool/dbconfig/20241204-222509-ladsgroup.json
  • 22:18 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1085.eqiad.wmnet with OS bullseye
  • 22:16 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS bookworm
  • 22:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1041.eqiad.wmnet with OS bookworm
  • 22:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1040.eqiad.wmnet with OS bookworm
  • 22:12 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
  • 22:12 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:12 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T371742)', diff saved to https://phabricator.wikimedia.org/P71563 and previous config saved to /var/cache/conftool/dbconfig/20241204-221001-ladsgroup.json
  • 21:59 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.6 refs T375665
  • 21:57 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 21:57 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1044.eqiad.wmnet with OS bookworm
  • 21:49 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
  • 21:46 cjming: end of UTC late backport window
  • 21:46 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1045
  • 21:45 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
  • 21:45 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1045
  • 21:43 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:43 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 21:43 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 21:43 cjming@deploy2002: Finished scap sync-world: Backport for CSP for banner preview: allow remind me later SMS host (T380232) (duration: 17m 39s)
  • 21:40 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 21:37 cjming@deploy2002: cjming, gjg: Continuing with sync
  • 21:34 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
  • 21:34 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:34 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:32 cjming@deploy2002: cjming, gjg: Backport for CSP for banner preview: allow remind me later SMS host (T380232) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:26 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs-internal-scholarly.discovery.wmnet on all recursors
  • 21:26 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache wdqs-internal-scholarly.discovery.wmnet on all recursors
  • 21:26 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs-internal-main.discovery.wmnet on all recursors
  • 21:26 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache wdqs-internal-main.discovery.wmnet on all recursors
  • 21:25 cjming@deploy2002: Started scap sync-world: Backport for CSP for banner preview: allow remind me later SMS host (T380232)
  • 21:25 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
  • 21:24 ryankemper: T379334 `ryankemper@dns1004:~$ sudo -i authdns-update` completed
  • 21:23 cjming@deploy2002: Finished scap sync-world: Backport for Enable Chart extension on several pilot wikis (T381436 T381312) (duration: 17m 29s)
  • 21:22 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
  • 21:21 ryankemper: T379334 Final step (step 9) of spinning up these new services; merged https://gerrit.wikimedia.org/r/c/operations/dns/+/1100165/, next up is the authdns update
  • 21:18 ryankemper@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly
  • 21:18 ryankemper@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-main
  • 21:17 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 21:14 cjming@deploy2002: cjming, bvibber: Continuing with sync
  • 21:13 cjming@deploy2002: cjming, bvibber: Backport for Enable Chart extension on several pilot wikis (T381436 T381312) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 ryankemper: T380555 Rolling out prod change => `ryankemper@cumin2002:~$ sudo cumin -b 8 'A:dnsbox' 'run-puppet-agent'`
  • 21:05 ryankemper: T380555 Moving `wdqs-internal-[main,scholarly]` services into prod by merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094074
  • 21:05 cjming@deploy2002: Started scap sync-world: Backport for Enable Chart extension on several pilot wikis (T381436 T381312)
  • 21:03 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting shortly
  • 21:03 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting shortly
  • 21:01 joal@deploy2002: Finished deploy [analytics/refinery@7ba91e1] (hadoop-test): Regular analytics weekly train TEST - HOTFIX 2 [analytics/refinery@7ba91e13] (duration: 00m 29s)
  • 21:00 joal@deploy2002: Started deploy [analytics/refinery@7ba91e1] (hadoop-test): Regular analytics weekly train TEST - HOTFIX 2 [analytics/refinery@7ba91e13]
  • 21:00 joal@deploy2002: Finished deploy [analytics/refinery@7ba91e1] (thin): Regular analytics weekly train THIN - HOTFIX 2 [analytics/refinery@7ba91e13] (duration: 00m 31s)
  • 20:59 joal@deploy2002: Started deploy [analytics/refinery@7ba91e1] (thin): Regular analytics weekly train THIN - HOTFIX 2 [analytics/refinery@7ba91e13]
  • 20:59 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 20:59 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 20:59 joal@deploy2002: Finished deploy [analytics/refinery@7ba91e1]: Regular analytics weekly train - HOTFIX 2 [analytics/refinery@7ba91e13] (duration: 01m 48s)
  • 20:57 joal@deploy2002: Started deploy [analytics/refinery@7ba91e1]: Regular analytics weekly train - HOTFIX 2 [analytics/refinery@7ba91e13]
  • 20:51 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1039.eqiad.wmnet with OS bookworm
  • 20:50 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:49 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1038.eqiad.wmnet with OS bookworm
  • 20:37 ryankemper: T380555 hosts happily pooled (except that `lvs2013` aka `A:lvs-low-traffic-codfw` cannot talk to `wdqs2026`) and `sudo ipvsadm -L -n` shows `10.2.1.93` and `10.2.1.94` as expected, codfw all done
  • 20:33 sukhe@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad and A:lvs
  • 20:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1039.eqiad.wmnet with reason: host reimage
  • 20:32 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad and A:lvs
  • 20:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T371742)', diff saved to https://phabricator.wikimedia.org/P71562 and previous config saved to /var/cache/conftool/dbconfig/20241204-203043-ladsgroup.json
  • 20:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 20:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 20:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T371742)', diff saved to https://phabricator.wikimedia.org/P71561 and previous config saved to /var/cache/conftool/dbconfig/20241204-203021-ladsgroup.json
  • 20:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-codfw and A:lvs
  • 20:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1038.eqiad.wmnet with reason: host reimage
  • 20:28 ryankemper: T380555 `sudo cookbook sre.loadbalancer.restart-pybal --query 'A:lvs-low-traffic-codfw' --reason 'rolling out new wdqs-internal-[main,scholarly] services' restart_daemons`
  • 20:28 ryankemper@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-codfw and A:lvs
  • 20:28 ryankemper: T380555 ran `sudo -E cumin 'A:lvs-low-traffic-codfw' 'run-puppet-agent --force'`
  • 20:28 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1020.eqiad.wmnet
  • 20:28 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs1020.eqiad.wmnet
  • 20:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1039.eqiad.wmnet with reason: host reimage
  • 20:25 sukhe@cumin1002: END (ERROR) - Cookbook sre.loadbalancer.restart-pybal (exit_code=97) rolling-restart of pybal on A:lvs-secondary-eqiad and A:lvs
  • 20:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1038.eqiad.wmnet with reason: host reimage
  • 20:24 ryankemper: T380555 hosts happily pooled and `sudo ipvsadm -L -n` shows `10.2.1.93` and `10.2.1.94` as expected, proceeding to `A:lvs-low-traffic-codfw`
  • 20:23 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad and A:lvs
  • 20:23 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 20:22 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 20:22 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 20:21 ryankemper@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw and A:lvs
  • 20:21 ryankemper: T380555 `sudo cookbook sre.loadbalancer.restart-pybal --query 'A:lvs-secondary-codfw' --reason 'rolling out new wdqs-internal-[main,scholarly] services' restart_daemons`
  • 20:21 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 20:20 ryankemper@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw and A:lvs
  • 20:20 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 20:20 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 20:18 ryankemper: T380555 `sudo cookbook sre.loadbalancer.restart-pybal 'A:lvs-secondary-codfw' --reason 'rolling out new wdqs-internal-[main,scholarly] services'`
  • 20:17 ryankemper: T380555 `sudo -E cumin 'A:lvs-secondary-codfw' 'run-puppet-agent --force'`
  • 20:17 ryankemper: T380555 Beginning lvs rolling restarts. First up: `A:lvs-secondary-codfw`
  • 20:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P71560 and previous config saved to /var/cache/conftool/dbconfig/20241204-201513-ladsgroup.json
  • 20:12 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 20:12 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 20:12 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 20:12 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 20:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1039.eqiad.wmnet with OS bookworm
  • 20:09 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 20:09 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 20:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1038.eqiad.wmnet with OS bookworm
  • 20:08 ryankemper: T380555 ran `ryankemper@cumin2002:~$ sudo -E cumin 'lvs*' 'disable-puppet T380555'`
  • 20:07 ryankemper: T380555 Disabling puppet on lvs hosts in preparation for merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094070 which will move `wdqs-internal-[main,scholarly]` from `service_setup` to `lvs_setup`
  • 20:04 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "T377876 - kamila@cumin1002"
  • 20:04 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "T377876 - kamila@cumin1002"
  • 20:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P71559 and previous config saved to /var/cache/conftool/dbconfig/20241204-200006-ladsgroup.json
  • 19:57 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1038.eqiad.wmnet wikikube-worker1039.eqiad.wmnet wikikube-worker1040.eqiad.wmnet wikikube-worker1041.eqiad.wmnet wikikube-worker1042.eqiad.wmnet wikikube-worker1043.eqiad.wmnet on all recursors
  • 19:57 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1038.eqiad.wmnet wikikube-worker1039.eqiad.wmnet wikikube-worker1040.eqiad.wmnet wikikube-worker1041.eqiad.wmnet wikikube-worker1042.eqiad.wmnet wikikube-worker1043.eqiad.wmnet on all recursors
  • 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1038.eqiad.wmnet on all recursors
  • 19:55 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1038.eqiad.wmnet on all recursors
  • 19:55 ryankemper: T380555 Running puppet on `wdqs2018`
  • 19:55 ryankemper: T380555 Proceeding to step 5 of new lvs service process. Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094069 to enable lvs::realserver functionality
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1496 to wikikube-worker1043
  • 19:53 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1043
  • 19:53 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 19:53 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1043
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1496 to wikikube-worker1043 - kamila@cumin1002"
  • 19:52 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1496 to wikikube-worker1043 - kamila@cumin1002"
  • 19:52 sukhe: sudo cumin "O:config_master" "run-puppet-agent"
  • 19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1495 to wikikube-worker1042
  • 19:49 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1042
  • 19:49 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:49 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1042
  • 19:49 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:49 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1495 to wikikube-worker1042 - kamila@cumin1002"
  • 19:49 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1495 to wikikube-worker1042 - kamila@cumin1002"
  • 19:45 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1496 to wikikube-worker1043
  • 19:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T371742)', diff saved to https://phabricator.wikimedia.org/P71558 and previous config saved to /var/cache/conftool/dbconfig/20241204-194459-ladsgroup.json
  • 19:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1494 to wikikube-worker1041
  • 19:43 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:43 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1041
  • 19:43 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1041
  • 19:43 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:43 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1494 to wikikube-worker1041 - kamila@cumin1002"
  • 19:42 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1494 to wikikube-worker1041 - kamila@cumin1002"
  • 19:40 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 19:40 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1495 to wikikube-worker1042
  • 19:40 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 19:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1493 to wikikube-worker1040
  • 19:39 joal@deploy2002: Finished deploy [airflow-dags/analytics@df2cac9]: Regular analytics weekly train [airflow-dags/analytics@df2cac98] (duration: 03m 55s)
  • 19:38 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1040
  • 19:38 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:38 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1040
  • 19:38 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:38 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1493 to wikikube-worker1040 - kamila@cumin1002"
  • 19:38 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1493 to wikikube-worker1040 - kamila@cumin1002"
  • 19:37 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1494 to wikikube-worker1041
  • 19:36 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 19:35 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 19:35 joal@deploy2002: Started deploy [airflow-dags/analytics@df2cac9]: Regular analytics weekly train [airflow-dags/analytics@df2cac98]
  • 19:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1492 to wikikube-worker1039
  • 19:34 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1039
  • 19:34 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1039
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1492 to wikikube-worker1039 - kamila@cumin1002"
  • 19:33 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1492 to wikikube-worker1039 - kamila@cumin1002"
  • 19:30 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1493 to wikikube-worker1040
  • 19:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1491 to wikikube-worker1038
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1038
  • 19:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:29 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1038
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1491 to wikikube-worker1038 - kamila@cumin1002"
  • 19:29 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1491 to wikikube-worker1038 - kamila@cumin1002"
  • 19:26 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1492 to wikikube-worker1039
  • 19:25 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:25 joal@deploy2002: Finished deploy [analytics/refinery@1f94312] (hadoop-test): Regular analytics weekly train TEST - HOTFIX [analytics/refinery@1f94312a] (duration: 00m 26s)
  • 19:25 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1491 to wikikube-worker1038
  • 19:24 joal@deploy2002: Started deploy [analytics/refinery@1f94312] (hadoop-test): Regular analytics weekly train TEST - HOTFIX [analytics/refinery@1f94312a]
  • 19:24 joal@deploy2002: Finished deploy [analytics/refinery@1f94312] (thin): Regular analytics weekly train THIN - HOTFIX [analytics/refinery@1f94312a] (duration: 00m 30s)
  • 19:23 joal@deploy2002: Started deploy [analytics/refinery@1f94312] (thin): Regular analytics weekly train THIN - HOTFIX [analytics/refinery@1f94312a]
  • 19:23 joal@deploy2002: Finished deploy [analytics/refinery@1f94312]: Regular analytics weekly train - HOTFIX [analytics/refinery@1f94312a] (duration: 03m 17s)
  • 19:22 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1491-1496].eqiad.wmnet
  • 19:20 joal@deploy2002: Started deploy [analytics/refinery@1f94312]: Regular analytics weekly train - HOTFIX [analytics/refinery@1f94312a]
  • 19:19 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1491-1496].eqiad.wmnet
  • 19:16 ryankemper: T380555 Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094069 to enable `lvs::realserver`
  • 19:09 ryankemper: T379333 Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1097542 to establish envoy on `A:wdqs-internal-main` and `A:wdqs-internal-scholarly`; running puppet on `wdqs2018` to test change
  • 19:03 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:02 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:02 ryankemper: T380555 Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094061 to establish initial service definitions for `wdqs-internal-main` and `wdqs-internal-scholarly`
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:57 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs-internal-main.svc.eqiad.wmnet on all recursors
  • 18:57 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache wdqs-internal-main.svc.eqiad.wmnet on all recursors
  • 18:55 ryankemper: T379334 Successfully ran `sudo authdns-update` on `dns1004`
  • 18:52 ryankemper: T379334 Creating A and PTR records for `wdqs-internal-main` and `wdqs-internal-scholarly` VIPs [merging https://gerrit.wikimedia.org/r/c/operations/dns/+/1100010/ & running authdns update after]
  • 18:48 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 18:47 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 18:47 ryankemper: T379330 `wdqs-internal-main` and `wdqs-internal-scholarly` pools created
  • 18:46 ryankemper@cumin2002: conftool action : set/pooled=yes:weight=10; selector: cluster=wdqs-internal-main,service=wdqs-main
  • 18:46 ryankemper@cumin2002: conftool action : set/pooled=yes:weight=10; selector: cluster=wdqs-internal-scholarly,service=wdqs-scholarly
  • 18:35 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 18:35 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 18:34 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 18:33 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 18:30 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 18:30 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 18:13 swfrench@deploy2002: Finished scap sync-world: Deployment to clear noop chart diff from 1081449 - T377040 (duration: 02m 07s)
  • 18:11 swfrench@deploy2002: Started scap sync-world: Deployment to clear noop chart diff from 1081449 - T377040
  • 18:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T371742)', diff saved to https://phabricator.wikimedia.org/P71556 and previous config saved to /var/cache/conftool/dbconfig/20241204-180114-ladsgroup.json
  • 18:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 18:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 18:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T371742)', diff saved to https://phabricator.wikimedia.org/P71555 and previous config saved to /var/cache/conftool/dbconfig/20241204-180052-ladsgroup.json
  • 17:55 joal@deploy2002: Finished deploy [analytics/refinery@6e3ee14] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6e3ee14b] (duration: 00m 31s)
  • 17:54 joal@deploy2002: Started deploy [analytics/refinery@6e3ee14] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6e3ee14b]
  • 17:54 joal@deploy2002: Finished deploy [analytics/refinery@6e3ee14] (thin): Regular analytics weekly train THIN [analytics/refinery@6e3ee14b] (duration: 00m 37s)
  • 17:54 joal@deploy2002: Started deploy [analytics/refinery@6e3ee14] (thin): Regular analytics weekly train THIN [analytics/refinery@6e3ee14b]
  • 17:52 joal@deploy2002: Finished deploy [analytics/refinery@6e3ee14]: Regular analytics weekly train [analytics/refinery@6e3ee14b] (duration: 02m 05s)
  • 17:50 bd808: Moved SAL fediverse posts to https://wikimedia.social/@sal. Many thanks to botsin.space for providing hosting for so long.
  • 17:50 joal@deploy2002: Started deploy [analytics/refinery@6e3ee14]: Regular analytics weekly train [analytics/refinery@6e3ee14b]
  • 17:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P71554 and previous config saved to /var/cache/conftool/dbconfig/20241204-174544-ladsgroup.json
  • 17:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P71553 and previous config saved to /var/cache/conftool/dbconfig/20241204-173037-ladsgroup.json
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T371742)', diff saved to https://phabricator.wikimedia.org/P71551 and previous config saved to /var/cache/conftool/dbconfig/20241204-171530-ladsgroup.json
  • 17:10 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon1003.eqiad.wmnet
  • 17:10 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:10 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 17:08 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 17:04 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 17:00 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon1003.eqiad.wmnet
  • 16:59 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon1002.eqiad.wmnet
  • 16:59 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:59 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:59 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:56 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 16:52 jgleeson: smashpig-listeners updated from 79b463b4 to 17ac74f2
  • 16:51 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon1002.eqiad.wmnet
  • 16:46 isaranto@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 16:45 isaranto@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 16:38 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon1001.eqiad.wmnet
  • 16:38 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:38 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:38 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:37 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 16:37 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 16:37 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 16:37 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 16:36 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:36 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:36 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 16:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 16:35 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2173-2175].codfw.wmnet
  • 16:35 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2173-2175].codfw.wmnet
  • 16:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:34 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 16:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 16:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 16:34 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 16:33 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 16:33 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:33 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:33 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:33 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 16:32 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 16:32 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2175.codfw.wmnet with OS bookworm
  • 16:27 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon1001.eqiad.wmnet
  • 16:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Schema change
  • 16:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Schema change
  • 16:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2174.codfw.wmnet with OS bookworm
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P71550 and previous config saved to /var/cache/conftool/dbconfig/20241204-162127-root.json
  • 16:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1009.eqiad.wmnet
  • 16:19 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1009.eqiad.wmnet
  • 16:18 jiji@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2173.codfw.wmnet with OS bookworm
  • 16:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2175.codfw.wmnet with reason: host reimage
  • 16:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 16:12 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 16:09 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2175.codfw.wmnet with reason: host reimage
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P71549 and previous config saved to /var/cache/conftool/dbconfig/20241204-160622-root.json
  • 16:06 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 42m 17s)
  • 16:01 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2174.codfw.wmnet with reason: host reimage
  • 15:55 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2174.codfw.wmnet with reason: host reimage
  • 15:54 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2173.codfw.wmnet with reason: host reimage
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P71548 and previous config saved to /var/cache/conftool/dbconfig/20241204-155116-root.json
  • 15:51 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2175
  • 15:51 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2175
  • 15:50 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2173.codfw.wmnet with reason: host reimage
  • 15:46 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2175
  • 15:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2175.codfw.wmnet 80.48.192.10.in-addr.arpa 0.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:46 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2175.codfw.wmnet 80.48.192.10.in-addr.arpa 0.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:45 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2175 - jayme@cumin2002"
  • 15:45 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2175 - jayme@cumin2002"
  • 15:45 vgutierrez: restarting purged on cp1115
  • 15:41 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1036-1037].eqiad.wmnet
  • 15:41 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1036-1037].eqiad.wmnet
  • 15:39 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:37 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2175
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2174
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2174
  • 15:36 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2174
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P71546 and previous config saved to /var/cache/conftool/dbconfig/20241204-153611-root.json
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2174.codfw.wmnet 79.48.192.10.in-addr.arpa 9.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:36 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2174.codfw.wmnet 79.48.192.10.in-addr.arpa 9.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2174 - jayme@cumin2002"
  • 15:36 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2174 - jayme@cumin2002"
  • 15:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T371742)', diff saved to https://phabricator.wikimedia.org/P71545 and previous config saved to /var/cache/conftool/dbconfig/20241204-153234-ladsgroup.json
  • 15:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 15:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 15:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T371742)', diff saved to https://phabricator.wikimedia.org/P71544 and previous config saved to /var/cache/conftool/dbconfig/20241204-153212-ladsgroup.json
  • 15:31 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:31 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2174
  • 15:31 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2173
  • 15:31 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2173
  • 15:30 jiji@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 15:30 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2173
  • 15:30 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2173.codfw.wmnet 78.48.192.10.in-addr.arpa 8.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:30 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2173.codfw.wmnet 78.48.192.10.in-addr.arpa 8.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:30 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:30 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2173 - jayme@cumin2002"
  • 15:30 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2173 - jayme@cumin2002"
  • 15:28 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2175.codfw.wmnet with OS bookworm
  • 15:27 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2174.codfw.wmnet with OS bookworm
  • 15:27 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 15:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1037.eqiad.wmnet with OS bookworm
  • 15:26 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:26 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2173.codfw.wmnet wikikube-worker2174.codfw.wmnet wikikube-worker2175.codfw.wmnet on all recursors
  • 15:26 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2173.codfw.wmnet wikikube-worker2174.codfw.wmnet wikikube-worker2175.codfw.wmnet on all recursors
  • 15:26 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2173
  • 15:26 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2173.codfw.wmnet with OS bookworm
  • 15:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2447 to wikikube-worker2175
  • 15:24 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 15:24 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2175
  • 15:24 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2175
  • 15:24 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2446 to wikikube-worker2174
  • 15:22 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2174
  • 15:22 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:21 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2174
  • 15:21 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:21 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2446 to wikikube-worker2174 - jayme@cumin2002"
  • 15:21 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2446 to wikikube-worker2174 - jayme@cumin2002"
  • 15:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1045.eqiad.wmnet with OS bookworm
  • 15:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2445 to wikikube-worker2173
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P71543 and previous config saved to /var/cache/conftool/dbconfig/20241204-152105-root.json
  • 15:20 jiji@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 15:20 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2173
  • 15:18 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2173
  • 15:18 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2445 to wikikube-worker2173 - jayme@cumin2002"
  • 15:18 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2445 to wikikube-worker2173 - jayme@cumin2002"
  • 15:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P71542 and previous config saved to /var/cache/conftool/dbconfig/20241204-151705-ladsgroup.json
  • 15:16 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:10 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2447 to wikikube-worker2175
  • 15:10 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2446 to wikikube-worker2174
  • 15:10 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:10 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2445 to wikikube-worker2173
  • 15:06 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 15:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1037.eqiad.wmnet with reason: host reimage
  • 15:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1037.eqiad.wmnet with reason: host reimage
  • 15:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P71541 and previous config saved to /var/cache/conftool/dbconfig/20241204-150157-ladsgroup.json
  • 15:01 TheresNoTime: '[samtar@deploy2002 ~]$ mwscript-k8s --comment="T373634" -f -- namespaceDupes.php --wiki hsbwiktionary --fix' for T373634
  • 14:59 samtar@deploy2002: Finished scap sync-world: Backport for Add new namespaces to hsb wiktionary (T373634) (duration: 10m 16s)
  • 14:54 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2015.codfw.wmnet
  • 14:54 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2015.codfw.wmnet
  • 14:52 samtar@deploy2002: samtar, srishakatux: Continuing with sync
  • 14:51 samtar@deploy2002: samtar, srishakatux: Backport for Add new namespaces to hsb wiktionary (T373634) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2446.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2447.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2445.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 14:48 samtar@deploy2002: Started scap sync-world: Backport for Add new namespaces to hsb wiktionary (T373634)
  • 14:47 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1037.eqiad.wmnet with OS bookworm
  • 14:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T371742)', diff saved to https://phabricator.wikimedia.org/P71540 and previous config saved to /var/cache/conftool/dbconfig/20241204-144651-ladsgroup.json
  • 14:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1036.eqiad.wmnet with OS bookworm
  • 14:46 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 14:31 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:29 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386) (duration: 18m 12s)
  • 14:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1036.eqiad.wmnet with reason: host reimage
  • 14:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1036.eqiad.wmnet with reason: host reimage
  • 14:22 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Continuing with sync
  • 14:19 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 14:18 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 14:17 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Backport for Translate: Enable message group subscription for 6 wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386)
  • 14:07 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1036.eqiad.wmnet with OS bookworm
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1036.eqiad.wmnet wikikube-worker1037.eqiad.wmnet on all recursors
  • 14:05 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1036.eqiad.wmnet wikikube-worker1037.eqiad.wmnet on all recursors
  • 14:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2446.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2447.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2445.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 14:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS bookworm
  • 14:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 14:00 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Ensure IP reveal buttons are not shown on Special:MassGlobalBlock (T124607) (duration: 13m 08s)
  • 13:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2015.codfw.wmnet with reason: host reimage
  • 13:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw[2445-2447].codfw.wmnet with reason: reimage
  • 13:57 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw[2445-2447].codfw.wmnet with reason: reimage
  • 13:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1024 to wikikube-worker1037
  • 13:55 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1037
  • 13:54 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2015.codfw.wmnet with reason: host reimage
  • 13:54 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1037
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1024 to wikikube-worker1037 - jelto@cumin1002"
  • 13:54 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1024 to wikikube-worker1037 - jelto@cumin1002"
  • 13:53 dreamyjazz@deploy2002: tchanders, dreamyjazz: Continuing with sync
  • 13:53 dreamyjazz@deploy2002: tchanders, dreamyjazz: Backport for Ensure IP reveal buttons are not shown on Special:MassGlobalBlock (T124607) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:51 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2445-2447].codfw.wmnet
  • 13:50 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:50 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2445-2447].codfw.wmnet
  • 13:50 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1024 to wikikube-worker1037
  • 13:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1023 to wikikube-worker1036
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1036
  • 13:47 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1036
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1023 to wikikube-worker1036 - jelto@cumin1002"
  • 13:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1023 to wikikube-worker1036 - jelto@cumin1002"
  • 13:47 dreamyjazz@deploy2002: Started scap sync-world: Backport for Ensure IP reveal buttons are not shown on Special:MassGlobalBlock (T124607)
  • 13:42 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), [[gerrit:1100449|Revert "Stats: Move StatsFactory flush into emitBufferedSt
  • 13:41 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016,2171-2172].codfw.wmnet
  • 13:41 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016,2171-2172].codfw.wmnet
  • 13:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1023 to wikikube-worker1036
  • 13:35 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 13:35 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 13:33 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 13:33 dreamyjazz@deploy2002: dreamyjazz: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Revert "Stats: Move StatsFactory flush into emitBufferedStats" synced
  • 13:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1023-1024].eqiad.wmnet
  • 13:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1023-1024].eqiad.wmnet
  • 13:28 dreamyjazz@deploy2002: Started scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), [[gerrit:1100449|Revert "Stats: Move StatsFactory flush into emitBufferedSta
  • 13:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Alter table
  • 13:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on clouddb1020.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on clouddb1020.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on clouddb1016.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on clouddb1016.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 13:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 13:12 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T371742)', diff saved to https://phabricator.wikimedia.org/P71537 and previous config saved to /var/cache/conftool/dbconfig/20241204-130614-ladsgroup.json
  • 13:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 13:06 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 13:05 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:00 dreamyjazz@deploy2002: Started scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)
  • 12:59 dreamyjazz@deploy2002: Started scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)
  • 12:57 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:56 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:55 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:55 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:54 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1004,1009].eqiad.wmnet with reason: Hardware refresh
  • 12:54 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1004,1009].eqiad.wmnet with reason: Hardware refresh
  • 12:52 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
  • 12:47 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
  • 12:47 moritzm: uploaded mailman3 3.3.8-2~deb12u2+wmf1 T377045
  • 12:42 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 12:40 hnowlan: imported debs for mercurius_1.0.2
  • 12:38 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:33 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:32 moritzm: installing glib2.0 security updates
  • 12:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 12:06 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Create a DB list for wikis with continuous MediaModeration scans (T355169) (duration: 13m 02s)
  • 12:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2172.codfw.wmnet with reason: host reimage
  • 12:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 12:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 11:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 11:59 dreamyjazz@deploy2002: dreamyjazz: Backport for Create a DB list for wikis with continuous MediaModeration scans (T355169) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:58 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2172.codfw.wmnet with reason: host reimage
  • 11:53 dreamyjazz@deploy2002: Started scap sync-world: Backport for Create a DB list for wikis with continuous MediaModeration scans (T355169)
  • 11:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 11:49 vgutierrez: re-enabling outbound bandwidth limits enforced by haproxy on the upload cluster
  • 11:39 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 11:38 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 11:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2171.codfw.wmnet with OS bookworm
  • 11:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 11:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 11:32 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2016.codfw.wmnet with reason: host reimage
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2016.codfw.wmnet with reason: host reimage
  • 11:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 11:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 11:14 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2171.codfw.wmnet with reason: host reimage
  • 11:13 vgutierrez: disabling outbound bandwidth limits enforced by haproxy on the upload cluster (we are getting haproxy crashes)
  • 11:11 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2171.codfw.wmnet with reason: host reimage
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2016
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2016
  • 11:07 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2016
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2016.codfw.wmnet 151.32.192.10.in-addr.arpa 1.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:07 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2016.codfw.wmnet 151.32.192.10.in-addr.arpa 1.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2016 - jayme@cumin2002"
  • 11:07 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2016 - jayme@cumin2002"
  • 11:03 vgutierrez: restarting haproxy on cp1107
  • 10:58 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:58 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2016
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2015
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2015
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2015.codfw.wmnet wikikube-worker2016.codfw.wmnet wikikube-worker2171.codfw.wmnet wikikube-worker2172.codfw.wmnet on all recursors
  • 10:57 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2015.codfw.wmnet wikikube-worker2016.codfw.wmnet wikikube-worker2171.codfw.wmnet wikikube-worker2172.codfw.wmnet on all recursors
  • 10:57 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2015
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2015.codfw.wmnet 149.32.192.10.in-addr.arpa 9.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:57 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2015.codfw.wmnet 149.32.192.10.in-addr.arpa 9.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2015 - jayme@cumin2002"
  • 10:57 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2015 - jayme@cumin2002"
  • 10:54 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 10:53 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:53 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2015
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2171
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2171
  • 10:52 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2171
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2171.codfw.wmnet 152.32.192.10.in-addr.arpa 2.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:52 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2171.codfw.wmnet 152.32.192.10.in-addr.arpa 2.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:49 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 10:49 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1005.eqiad.wmnet
  • 10:49 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:48 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 10:47 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2442 to wikikube-worker2016
  • 10:47 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2016
  • 10:46 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2016
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2442 to wikikube-worker2016 - jayme@cumin2002"
  • 10:46 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2442 to wikikube-worker2016 - jayme@cumin2002"
  • 10:41 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2172
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2172
  • 10:41 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2172
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2172.codfw.wmnet 77.48.192.10.in-addr.arpa 7.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:41 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2172.codfw.wmnet 77.48.192.10.in-addr.arpa 7.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2172 - jayme@cumin2002"
  • 10:40 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2172 - jayme@cumin2002"
  • 10:39 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1005.eqiad.wmnet
  • 10:38 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 10:38 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2440 to wikikube-worker2015
  • 10:37 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker2016
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2015
  • 10:37 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2171
  • 10:36 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2171.codfw.wmnet with OS bookworm
  • 10:36 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:36 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2015
  • 10:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:36 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2172
  • 10:36 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 10:35 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2443 to wikikube-worker2171
  • 10:35 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2171
  • 10:34 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:33 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1004.eqiad.wmnet
  • 10:33 moritzm: removing ganeti2018 from active Ganeti nodes T376594
  • 10:33 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:30 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2444 to wikikube-worker2172
  • 10:30 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2171
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2172
  • 10:29 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2172
  • 10:29 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:29 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2444 to wikikube-worker2172 - jayme@cumin2002"
  • 10:28 vgutierrez: enabling outbound bandwidth limits enforced by haproxy on the upload cluster
  • 10:28 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:27 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2442 to wikikube-worker2016
  • 10:27 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2444 to wikikube-worker2172 - jayme@cumin2002"
  • 10:27 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker2016
  • 10:23 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2442 to wikikube-worker2016
  • 10:23 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker2016
  • 10:23 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:23 jayme@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:22 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2444 to wikikube-worker2172
  • 10:22 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2443 to wikikube-worker2171
  • 10:22 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2442 to wikikube-worker20160
  • 10:22 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:22 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker20160
  • 10:22 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1004.eqiad.wmnet
  • 10:21 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:21 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:20 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2440 to wikikube-worker2015
  • 10:19 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1003.eqiad.wmnet
  • 10:19 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:19 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 10:19 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:19 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:19 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:19 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:13 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 10:10 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 10:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 8 hosts with reason: Rebooting
  • 10:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 8 hosts with reason: Rebooting
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2018.codfw.wmnet
  • 10:04 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1003.eqiad.wmnet
  • 10:03 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1002.eqiad.wmnet
  • 10:03 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:03 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 10:02 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 09:58 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 09:56 godog: bump space for prometheus k8s-mlserve in eqiad
  • 09:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2444.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2443.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2440.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:39 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1002.eqiad.wmnet
  • 09:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 09:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 to clone es2045', diff saved to https://phabricator.wikimedia.org/P71535 and previous config saved to /var/cache/conftool/dbconfig/20241204-093541-marostegui.json
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2023 to es5 master T381259', diff saved to https://phabricator.wikimedia.org/P71534 and previous config saved to /var/cache/conftool/dbconfig/20241204-093519-marostegui.json
  • 09:35 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1001.eqiad.wmnet
  • 09:35 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:35 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 09:34 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 09:33 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2444.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:33 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2443.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:32 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2442.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:30 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 09:21 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1001.eqiad.wmnet
  • 09:15 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2442.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:14 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2440.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:12 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2440,2442-2444].codfw.wmnet with reason: T377877
  • 09:12 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 100%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71533 and previous config saved to /var/cache/conftool/dbconfig/20241204-091229-root.json
  • 09:12 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2440,2442-2444].codfw.wmnet with reason: T377877
  • 09:07 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2440,2442-2444].codfw.wmnet
  • 09:05 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2440,2442-2444].codfw.wmnet
  • 08:57 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 75%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71532 and previous config saved to /var/cache/conftool/dbconfig/20241204-085724-root.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2022.codfw.wmnet with reason: cloning
  • 08:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2022.codfw.wmnet with reason: cloning
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2022 to clone es2043', diff saved to https://phabricator.wikimedia.org/P71531 and previous config saved to /var/cache/conftool/dbconfig/20241204-085143-marostegui.json
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2020 to es4 master T381259', diff saved to https://phabricator.wikimedia.org/P71530 and previous config saved to /var/cache/conftool/dbconfig/20241204-085124-marostegui.json
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71529 and previous config saved to /var/cache/conftool/dbconfig/20241204-084650-root.json
  • 08:42 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 50%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71528 and previous config saved to /var/cache/conftool/dbconfig/20241204-084219-root.json
  • 08:35 moritzm: rebalance Ganeti eqiad/C following server refreshes
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71527 and previous config saved to /var/cache/conftool/dbconfig/20241204-083145-root.json
  • 08:27 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 25%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71526 and previous config saved to /var/cache/conftool/dbconfig/20241204-082714-root.json
  • 08:25 kharlan@deploy2002: Finished scap sync-world: Backport for dialog: Don't duplicate the footer in the behaviour list template (T381189) (duration: 12m 08s)
  • 08:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2018.codfw.wmnet
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2018.codfw.wmnet
  • 08:18 kharlan@deploy2002: kharlan: Continuing with sync
  • 08:18 kharlan@deploy2002: kharlan: Backport for dialog: Don't duplicate the footer in the behaviour list template (T381189) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71525 and previous config saved to /var/cache/conftool/dbconfig/20241204-081640-root.json
  • 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2018.codfw.wmnet
  • 08:13 kharlan@deploy2002: Started scap sync-world: Backport for dialog: Don't duplicate the footer in the behaviour list template (T381189)
  • 08:12 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 10%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71524 and previous config saved to /var/cache/conftool/dbconfig/20241204-081208-root.json
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71522 and previous config saved to /var/cache/conftool/dbconfig/20241204-080134-root.json
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 1%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71520 and previous config saved to /var/cache/conftool/dbconfig/20241204-075703-root.json
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2046 to es5 depooled T381259', diff saved to https://phabricator.wikimedia.org/P71519 and previous config saved to /var/cache/conftool/dbconfig/20241204-075427-marostegui.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71518 and previous config saved to /var/cache/conftool/dbconfig/20241204-074629-root.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71517 and previous config saved to /var/cache/conftool/dbconfig/20241204-070855-root.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71516 and previous config saved to /var/cache/conftool/dbconfig/20241204-070829-root.json
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71515 and previous config saved to /var/cache/conftool/dbconfig/20241204-065349-root.json
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71514 and previous config saved to /var/cache/conftool/dbconfig/20241204-065324-root.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71513 and previous config saved to /var/cache/conftool/dbconfig/20241204-063844-root.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71512 and previous config saved to /var/cache/conftool/dbconfig/20241204-063819-root.json
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71511 and previous config saved to /var/cache/conftool/dbconfig/20241204-062339-root.json
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71510 and previous config saved to /var/cache/conftool/dbconfig/20241204-062313-root.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2042 to dbctl depooled T381259', diff saved to https://phabricator.wikimedia.org/P71509 and previous config saved to /var/cache/conftool/dbconfig/20241204-061821-marostegui.json
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71508 and previous config saved to /var/cache/conftool/dbconfig/20241204-060834-root.json
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71507 and previous config saved to /var/cache/conftool/dbconfig/20241204-060808-root.json
  • 02:40 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1045.eqiad.wmnet with OS bookworm
  • 02:33 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1046.eqiad.wmnet with OS bookworm
  • 02:33 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 02:32 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS bookworm
  • 02:32 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 02:08 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1085.eqiad.wmnet with OS bullseye
  • 01:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 01:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 01:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1044.eqiad.wmnet with OS bookworm
  • 01:39 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 01:39 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1046.eqiad.wmnet with reason: host reimage
  • 01:36 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1046.eqiad.wmnet with reason: host reimage
  • 01:23 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
  • 01:20 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1046.eqiad.wmnet with OS bookworm
  • 01:20 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 01:19 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
  • 01:15 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:15 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:03 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS bookworm
  • 01:02 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1041.eqiad.wmnet with OS bookworm
  • 01:00 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:57 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS bookworm
  • 00:57 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 00:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 00:53 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:52 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:50 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:48 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:48 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:47 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1085.eqiad.wmnet with OS bullseye
  • 00:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:43 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:43 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:42 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:41 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:40 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
  • 00:37 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:36 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
  • 00:31 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:30 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 00:26 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:18 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:18 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:16 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:13 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1084.eqiad.wmnet with OS bullseye
  • 00:13 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 00:09 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:09 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-03

  • 23:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:52 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 23:50 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:48 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:48 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS bookworm
  • 23:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:41 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 23:41 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS bookworm
  • 23:40 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS bookworm
  • 23:39 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:39 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1085 - vriley@cumin1002"
  • 23:39 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1085 - vriley@cumin1002"
  • 23:37 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:36 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 23:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 23:35 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 23:34 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1084.eqiad.wmnet with reason: host reimage
  • 23:30 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1084.eqiad.wmnet with reason: host reimage
  • 23:29 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
  • 23:28 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
  • 23:27 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1089.eqiad.wmnet with OS bullseye
  • 23:27 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:25 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:22 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:22 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:22 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:20 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1084.eqiad.wmnet with OS bullseye
  • 23:19 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1083.eqiad.wmnet with OS bullseye
  • 23:19 vriley@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 23:19 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 23:12 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1087.eqiad.wmnet with OS bullseye
  • 23:12 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:11 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:11 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1090.eqiad.wmnet with OS bullseye
  • 23:11 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:11 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:08 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1088.eqiad.wmnet with OS bullseye
  • 23:08 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:08 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1089.eqiad.wmnet with reason: host reimage
  • 23:08 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:04 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1089.eqiad.wmnet with reason: host reimage
  • 23:04 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150
  • 23:02 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150
  • 23:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 23:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 22:57 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 22:53 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:53 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1089.eqiad.wmnet with OS bullseye
  • 22:52 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1090.eqiad.wmnet with reason: host reimage
  • 22:52 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:52 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:52 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:50 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:46 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1087.eqiad.wmnet with reason: host reimage
  • 22:43 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1090.eqiad.wmnet with reason: host reimage
  • 22:43 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:43 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 22:43 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1087.eqiad.wmnet with reason: host reimage
  • 22:42 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:38 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 00m 13s)
  • 22:38 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 22:37 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1083.eqiad.wmnet with reason: host reimage
  • 22:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 22:35 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 22:34 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1083.eqiad.wmnet with reason: host reimage
  • 22:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS bullseye
  • 22:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1090.eqiad.wmnet with OS bullseye
  • 22:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1087.eqiad.wmnet with OS bullseye
  • 22:32 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 13s)
  • 22:32 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 22:32 ryankemper@deploy2002: deploy aborted: deploy to fresh wdqs-internal-scholarly host (duration: 03m 59s)
  • 22:32 dancy@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 22:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1090.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1088.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1087.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:31 dancy@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 22:28 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 22:23 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1083.eqiad.wmnet with OS bullseye
  • 22:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1090.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1088.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1087.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:15 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:15 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:12 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:10 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 21:52 ebernhardson@deploy2002: Finished scap sync-world: Backport for cirrus: Configure MLR buckets (T377128) (duration: 17m 47s)
  • 21:45 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 21:40 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Configure MLR buckets (T377128) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 ebernhardson@deploy2002: Started scap sync-world: Backport for cirrus: Configure MLR buckets (T377128)
  • 21:32 ebernhardson@deploy2002: Finished scap sync-world: Backport for Rerunning Web browser extension survey (T380778), Reenable non-UI experiment quick survey (T379241), Deploy Vector22 To Wikis (T381041) (duration: 22m 00s)
  • 21:28 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:28 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Backfill allocations for mw-parsoid LVS VIPs - swfrench@cumin2002"
  • 21:28 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Backfill allocations for mw-parsoid LVS VIPs - swfrench@cumin2002"
  • 21:24 ebernhardson@deploy2002: bwang, ebernhardson, lmora, jdrewniak: Continuing with sync
  • 21:23 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 21:16 ebernhardson@deploy2002: bwang, ebernhardson, lmora, jdrewniak: Backport for Rerunning Web browser extension survey (T380778), Reenable non-UI experiment quick survey (T379241), Deploy Vector22 To Wikis (T381041) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:10 ebernhardson@deploy2002: Started scap sync-world: Backport for Rerunning Web browser extension survey (T380778), Reenable non-UI experiment quick survey (T379241), Deploy Vector22 To Wikis (T381041)
  • 21:08 dancy@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 21:07 dancy@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 20:49 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1087
  • 20:49 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1087
  • 20:49 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1088
  • 20:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1088
  • 20:48 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1090
  • 20:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1090
  • 20:48 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1089
  • 20:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1089
  • 20:46 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:46 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 20:46 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 20:42 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 20:38 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1278-1279].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 20:38 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1279.eqiad.wmnet with OS bookworm
  • 20:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1279.eqiad.wmnet with reason: host reimage
  • 20:15 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1279.eqiad.wmnet with reason: host reimage
  • 20:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 20:00 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:57 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:55 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1279.eqiad.wmnet with OS bookworm
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1278.eqiad.wmnet with OS bookworm
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1278.eqiad.wmnet with reason: host reimage
  • 19:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1278.eqiad.wmnet with reason: host reimage
  • 19:19 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
  • 19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.6 refs T375665
  • 19:15 topranks: rebooting rpki2003 to clear out tmpfs filesystem which is full
  • 19:15 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
  • 19:14 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:13 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 07s)
  • 19:13 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 19:13 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 01m 09s)
  • 19:11 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:11 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 19:11 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 02m 45s)
  • 19:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1278.eqiad.wmnet with OS bookworm
  • 19:09 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 19:04 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:04 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1278-1279].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 19:02 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:00 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 18:59 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T376150, initialize wdqs internal main tier) xfer scholarly_articles from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet, repooling source-only afterwards
  • 18:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs2027.codfw.wmnet with reason: T376150
  • 18:58 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs2027.codfw.wmnet with reason: T376150
  • 18:56 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer scholarly_articles from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet, repooling source-only afterwards
  • 18:49 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 00m 14s)
  • 18:49 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 18:47 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 00m 14s)
  • 18:47 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 18:43 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 03m 31s)
  • 18:40 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 18:39 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 11s)
  • 18:39 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 18:39 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 11s)
  • 18:39 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 18:35 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1034-1035].eqiad.wmnet
  • 18:35 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1034-1035].eqiad.wmnet
  • 18:23 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 18:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1035.eqiad.wmnet with OS bookworm
  • 18:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 18:00 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 17:57 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 17:57 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 17:57 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 17:56 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 17:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1035.eqiad.wmnet with reason: host reimage
  • 17:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs2026.codfw.wmnet with reason: T376150
  • 17:50 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs2026.codfw.wmnet with reason: T376150
  • 17:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1035.eqiad.wmnet with reason: host reimage
  • 17:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: dc=magru,service=cdn,name=cp7001.magru.wmnet
  • 17:46 brett: Removing RSA certificate support from haproxy/cp (T370837)
  • 17:38 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 17:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1035.eqiad.wmnet with OS bookworm
  • 17:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 17:20 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 17:17 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 17:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1034.eqiad.wmnet with reason: host reimage
  • 17:08 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1034.eqiad.wmnet with reason: host reimage
  • 17:07 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 16:58 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:52 brett@puppetserver1001: conftool action : set/pooled=no; selector: dc=magru,service=cdn,name=cp7001.magru.wmnet
  • 16:51 sbisson@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
  • 16:51 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 16:50 urbanecm@deploy2002: Finished scap sync-world: Backport for Revert "Increase Nuke max age to 90 days" (T380846) (duration: 12m 29s)
  • 16:49 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 16:47 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:44 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:44 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:38 urbanecm@deploy2002: Started scap sync-world: Backport for Revert "Increase Nuke max age to 90 days" (T380846)
  • 16:30 brett: Disabling puppet on A:cp to prep for RSA removal - T370837
  • 16:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:19 moritzm: rebalance Ganeti eqiad/B following server refreshes
  • 16:07 moritzm: installing intel-microcode security updates
  • 15:51 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 15:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1034.eqiad.wmnet wikikube-worker1035.eqiad.wmnet on all recursors
  • 15:48 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1034.eqiad.wmnet wikikube-worker1035.eqiad.wmnet on all recursors
  • 15:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1022 to wikikube-worker1035
  • 15:47 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1035
  • 15:45 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1035
  • 15:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:45 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1022 to wikikube-worker1035 - jelto@cumin1002"
  • 15:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1022 to wikikube-worker1035 - jelto@cumin1002"
  • 15:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1022 to wikikube-worker1035
  • 15:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1021 to wikikube-worker1034
  • 15:37 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1034
  • 15:36 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1034
  • 15:36 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1021 to wikikube-worker1034 - jelto@cumin1002"
  • 15:35 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1021 to wikikube-worker1034 - jelto@cumin1002"
  • 15:31 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:31 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1021 to wikikube-worker1034
  • 15:14 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1021-1022].eqiad.wmnet
  • 15:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1021-1022].eqiad.wmnet
  • 15:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS bookworm
  • 15:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS bookworm
  • 15:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:57 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:44 urbanecm@deploy2002: Finished scap sync-world: Backport for fix: show thumbnails in surfacing popups (T381364), fix: show thumbnails in surfacing popups (T381364) (duration: 19m 24s)
  • 14:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
  • 14:37 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 14:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
  • 14:30 urbanecm@deploy2002: migr, urbanecm: Backport for fix: show thumbnails in surfacing popups (T381364), fix: show thumbnails in surfacing popups (T381364) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:25 urbanecm@deploy2002: Started scap sync-world: Backport for fix: show thumbnails in surfacing popups (T381364), fix: show thumbnails in surfacing popups (T381364)
  • 14:22 urbanecm@deploy2002: Finished scap sync-world: Backport for Increase Nuke max age to 90 days (T380846), knwiki: remove module namespace names from core-Namespaces.php (T346583), Remove temporary fix for badly set CentralAuth cookies (duration: 17m 04s)
  • 14:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS bookworm
  • 14:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS bookworm
  • 14:13 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1015-1016].eqiad.wmnet
  • 14:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1015-1016].eqiad.wmnet
  • 14:13 urbanecm@deploy2002: matmarex, chlod, urbanecm, anzx: Continuing with sync
  • 14:11 urbanecm@deploy2002: matmarex, chlod, urbanecm, anzx: Backport for Increase Nuke max age to 90 days (T380846), knwiki: remove module namespace names from core-Namespaces.php (T346583), Remove temporary fix for badly set CentralAuth cookies synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:05 urbanecm@deploy2002: Started scap sync-world: Backport for Increase Nuke max age to 90 days (T380846), knwiki: remove module namespace names from core-Namespaces.php (T346583), Remove temporary fix for badly set CentralAuth cookies
  • 13:57 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 13:41 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:40 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:39 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:39 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:35 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:33 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:32 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:30 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:30 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:28 fabfur: upgrade haproxykafka to version 0.3.4 (https://gitlab.wikimedia.org/repos/sre/haproxykafka/-/commits/main?ref_type=heads) (T380583)
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1022.eqiad.wmnet
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:23 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
  • 13:22 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
  • 13:22 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
  • 13:22 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
  • 13:21 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:20 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:19 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:19 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:19 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
  • 13:18 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
  • 13:18 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1016.eqiad.wmnet with OS bookworm
  • 13:14 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1022.eqiad.wmnet
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1012.eqiad.wmnet
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:13 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:13 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:10 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:10 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:06 jnuche@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 13:06 jnuche@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 13:04 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1012.eqiad.wmnet
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1016.eqiad.wmnet with reason: host reimage
  • 12:57 jnuche@deploy2002: Installing scap version "4.132.0" for 207 host(s)
  • 12:56 jnuche@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 12:55 jnuche@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1016.eqiad.wmnet with reason: host reimage
  • 12:54 jnuche@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 12:53 jnuche@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 12:47 klausman@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ml-lab1001.eqiad.wmnet
  • 12:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1016.eqiad.wmnet with OS bookworm
  • 12:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1015.eqiad.wmnet with OS bookworm
  • 12:35 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-lab1001.eqiad.wmnet
  • 12:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1015.eqiad.wmnet with reason: host reimage
  • 12:15 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1015.eqiad.wmnet with reason: host reimage
  • 11:58 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1015.eqiad.wmnet with OS bookworm
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1019.eqiad.wmnet wikikube-worker1015.eqiad.wmnet kubernetes1020.eqiad.wmnet wikikube-worker1016.eqiad.wmnet on all recursors
  • 11:53 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1019.eqiad.wmnet wikikube-worker1015.eqiad.wmnet kubernetes1020.eqiad.wmnet wikikube-worker1016.eqiad.wmnet on all recursors
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1020 to wikikube-worker1016
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1016
  • 11:49 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1016
  • 11:49 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:49 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1020 to wikikube-worker1016 - jelto@cumin1002"
  • 11:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1020 to wikikube-worker1016 - jelto@cumin1002"
  • 11:45 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:44 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1020 to wikikube-worker1016
  • 11:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1019 to wikikube-worker1015
  • 11:43 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1015
  • 11:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1015
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1019 to wikikube-worker1015 - jelto@cumin1002"
  • 11:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1019 to wikikube-worker1015 - jelto@cumin1002"
  • 11:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:37 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1019 to wikikube-worker1015
  • 11:33 volans@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:32 volans@cumin1002: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:31 topranks: pushing new nftables rules to cloudgw1001 to block abuse from paws T381078
  • 11:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2025 to clone es2046', diff saved to https://phabricator.wikimedia.org/P71497 and previous config saved to /var/cache/conftool/dbconfig/20241203-112015-marostegui.json
  • 10:49 volans: installed spicerack v9.0.0 on cumin[12]002
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1019-1020].eqiad.wmnet
  • 10:41 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1019-1020].eqiad.wmnet
  • 10:30 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:27 volans@cumin1002: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71496 and previous config saved to /var/cache/conftool/dbconfig/20241203-102143-root.json
  • 10:19 volans@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update hieradata from Netbox - volans@cumin2002"
  • 10:19 volans@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update hieradata from Netbox - volans@cumin2002"
  • 10:16 robh@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti7004.magru.wmnet with OS bookworm
  • 10:16 robh@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin2002"
  • 10:16 bking@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
  • 10:16 bking@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin1002"
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71495 and previous config saved to /var/cache/conftool/dbconfig/20241203-100638-root.json
  • 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1006.eqiad.wmnet
  • 09:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1006.eqiad.wmnet
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71494 and previous config saved to /var/cache/conftool/dbconfig/20241203-095133-root.json
  • 09:40 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 09:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71493 and previous config saved to /var/cache/conftool/dbconfig/20241203-093627-root.json
  • 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:27 moritzm: rebalance Ganeti eqiad/A following server refreshes
  • 09:24 moritzm: removing ganeti1009 from active Ganeti nodes T378921
  • 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71492 and previous config saved to /var/cache/conftool/dbconfig/20241203-092122-root.json
  • 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: parse2017.codfw.wmnet
  • 08:45 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: parse2017.codfw.wmnet
  • 08:37 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 08:35 urbanecm@deploy2002: Finished scap sync-world: Backport for Growth: enable temporary Surfacing Alpha on pilot wikis (T379976) (duration: 21m 30s)
  • 08:34 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 08:32 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:31 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:27 moritzm: installing unbound security updates
  • 08:26 urbanecm@deploy2002: urbanecm, migr: Continuing with sync
  • 08:21 urbanecm@deploy2002: urbanecm, migr: Backport for Growth: enable temporary Surfacing Alpha on pilot wikis (T379976) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1213 from dbctl T375593', diff saved to https://phabricator.wikimedia.org/P71489 and previous config saved to /var/cache/conftool/dbconfig/20241203-081434-marostegui.json
  • 08:13 urbanecm@deploy2002: Started scap sync-world: Backport for Growth: enable temporary Surfacing Alpha on pilot wikis (T379976)
  • 08:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1217.eqiad.wmnet with reason: Moving to m3
  • 08:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1217.eqiad.wmnet with reason: Moving to m3
  • 08:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1213.eqiad.wmnet with reason: Moving to m3
  • 08:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1213.eqiad.wmnet with reason: Moving to m3
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213', diff saved to https://phabricator.wikimedia.org/P71487 and previous config saved to /var/cache/conftool/dbconfig/20241203-080726-marostegui.json
  • 07:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 07:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2021', diff saved to https://phabricator.wikimedia.org/P71486 and previous config saved to /var/cache/conftool/dbconfig/20241203-075751-marostegui.json
  • 07:57 marostegui: Switchover es4 codfw master to es2022 dbmaint (this happened an hour ago) T381259
  • 07:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 07:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 06:41 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:41 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change VIPs for wdqs-internal-main and wdqs-internal-scholarly to avoid mw-parsoid collision - ryankemper@cumin2002"
  • 06:41 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change VIPs for wdqs-internal-main and wdqs-internal-scholarly to avoid mw-parsoid collision - ryankemper@cumin2002"
  • 06:37 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2022 to es4 master T381259', diff saved to https://phabricator.wikimedia.org/P71485 and previous config saved to /var/cache/conftool/dbconfig/20241203-063408-marostegui.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71484 and previous config saved to /var/cache/conftool/dbconfig/20241203-063234-root.json
  • 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 06:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71483 and previous config saved to /var/cache/conftool/dbconfig/20241203-061729-root.json
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add VIPs for wdqs-internal-main and wdqs-internal-scholarly - ryankemper@cumin2002"
  • 06:10 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add VIPs for wdqs-internal-main and wdqs-internal-scholarly - ryankemper@cumin2002"
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2041 to es4 with just minimal weight T381259', diff saved to https://phabricator.wikimedia.org/P71482 and previous config saved to /var/cache/conftool/dbconfig/20241203-060847-marostegui.json
  • 06:06 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 06:06 ryankemper: [Netbox] T379334 Aborted netbox sync cookbook due to wrong IPs for wdqs-internal-scholarly. Fixed in UI, re-running cookbook now
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2041 depooled T381259', diff saved to https://phabricator.wikimedia.org/P71481 and previous config saved to /var/cache/conftool/dbconfig/20241203-060614-marostegui.json
  • 06:06 ryankemper@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71480 and previous config saved to /var/cache/conftool/dbconfig/20241203-060224-root.json
  • 06:00 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 06:00 ryankemper: [Netbox] T379334 Added VIPs via UI for wdqs-internal-[main,scholarly].svc.[eqiad,codfw].wmnet
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71479 and previous config saved to /var/cache/conftool/dbconfig/20241203-054718-root.json
  • 05:44 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150 non-prod hosts
  • 05:44 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 12:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150 non-prod hosts
  • 05:17 eileen: config revision changed from b3741848 to 694158ae
  • 05:17 eileen: civicrm upgraded from be7e5d33 to 6361a578
  • 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.3 (duration: 01m 27s)
  • 04:51 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.6 refs T375665 (duration: 48m 24s)
  • 04:02 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.6 refs T375665
  • 02:56 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 02:37 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 02:36 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 02:20 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:20 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1084 - vriley@cumin1002"
  • 02:20 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1084 - vriley@cumin1002"
  • 02:16 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 01:53 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:47 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:38 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:38 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:36 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 01:35 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:35 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:26 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 01:26 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:26 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 pt1979@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 pt1979@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:34 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:34 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 00:34 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:32 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:32 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:31 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:30 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:25 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:15 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART

2024-12-02

  • 23:58 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:50 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:50 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:50 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:50 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:46 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 22:27 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:27 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:27 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:26 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:25 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:24 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:20 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:20 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:16 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:16 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:05 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:05 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:45 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:40 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:36 urbanecm@deploy2002: Finished scap sync-world: Backport for testwiki: no growth experiment anymore (T380659), fix(surfacing): don't redirect to desktop (duration: 13m 22s)
  • 21:35 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:29 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:29 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 21:27 urbanecm@deploy2002: migr, urbanecm: Backport for testwiki: no growth experiment anymore (T380659), fix(surfacing): don't redirect to desktop synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 urbanecm@deploy2002: Started scap sync-world: Backport for testwiki: no growth experiment anymore (T380659), fix(surfacing): don't redirect to desktop
  • 21:21 urbanecm@deploy2002: Finished scap sync-world: Backport for Enable VisualEditor by default on Indonesian Wikiquote (T381214), votewiki, testwiki: add securepoll-edit-poll to electionadmin (T377531), cawiki: stop Flow being the default for some talk namespaces (T381295) (duration: 13m 40s)
  • 21:15 urbanecm@deploy2002: kemayo, urbanecm, nmw03, sd: Continuing with sync
  • 21:14 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:12 urbanecm@deploy2002: kemayo, urbanecm, nmw03, sd: Backport for Enable VisualEditor by default on Indonesian Wikiquote (T381214), votewiki, testwiki: add securepoll-edit-poll to electionadmin (T377531), cawiki: stop Flow being the default for some talk namespaces (T381295) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:09 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:09 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1083 - vriley@cumin1002"
  • 21:09 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1083 - vriley@cumin1002"
  • 21:08 urbanecm@deploy2002: Started scap sync-world: Backport for Enable VisualEditor by default on Indonesian Wikiquote (T381214), votewiki, testwiki: add securepoll-edit-poll to electionadmin (T377531), cawiki: stop Flow being the default for some talk namespaces (T381295)
  • 21:04 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 20:10 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: done: checking icinga alerts]
  • 19:55 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: checking icinga alerts]
  • 19:22 volans@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:22 volans@cumin1002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:21 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs3010.esams.wmnet
  • 19:20 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:15 dancy@deploy2002: Installation of scap version "4.131.0" completed for 207 hosts
  • 19:14 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs3010.esams.wmnet
  • 19:13 sukhe: rebooting lvs3010 to test CR 1093958
  • 19:11 dancy@deploy2002: Installing scap version "4.131.0" for 207 hosts
  • 19:07 sukhe: disable puppet on A:lvs to finish rolling out CR 1093958: T358260
  • 19:01 volans@cumin1002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 18:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2242']
  • 18:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2241']
  • 18:39 urbanecm@deploy2002: Finished scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) (duration: 10m 35s)
  • 18:38 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2242']
  • 18:38 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2241']
  • 18:37 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 18:36 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 18:35 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 18:35 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 18:35 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 18:35 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 18:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 18:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 18:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 18:34 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 18:34 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 18:33 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 18:33 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 18:33 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 18:33 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 18:32 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 18:32 urbanecm@deploy2002: urbanecm: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:32 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 18:32 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 18:31 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 18:31 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 18:31 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 18:28 urbanecm@deploy2002: Started scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277)
  • 18:19 jiji@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2242.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:18 jiji@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 18:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2241.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:17 jiji@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:00 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 18:00 urbanecm@deploy2002: urbanecm: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:58 jiji@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 17:57 urbanecm@deploy2002: Started scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277)
  • 17:54 urbanecm@deploy2002: Sync cancelled.
  • 17:54 urbanecm@deploy2002: urbanecm: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:54 fabfur@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox
  • 17:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 17:50 urbanecm@deploy2002: Started scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277)
  • 17:48 jiji@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 17:48 dancy@deploy2002: Installation of scap version "4.129.0" completed for 207 hosts
  • 17:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2242.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2241.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:44 dancy@deploy2002: Installing scap version "4.129.0" for 207 hosts
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2241-2 to codfw - jhancock@cumin2002"
  • 17:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2241-2 to codfw - jhancock@cumin2002"
  • 17:38 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 17:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1006.eqiad.wmnet with reason: host reimage
  • 17:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1006.eqiad.wmnet with reason: host reimage
  • 17:16 fabfur@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough
  • 17:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 17:07 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "missing data for wikikube-worker1006 - jayme@cumin1002"
  • 17:07 topranks: resetting ulsfo->eqsin link to normal metric to put all codfw->eqsin traffic back on Arelion cct
  • 17:07 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "missing data for wikikube-worker1006 - jayme@cumin1002"
  • 17:03 fabfur@cumin1002: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough and A:wikidough
  • 16:55 fabfur@cumin1002: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox
  • 16:54 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 28s)
  • 16:52 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 10m 36s)
  • 16:38 dancy@deploy2002: Installation of scap version "4.130.0" completed for 207 hosts
  • 16:34 dancy@deploy2002: Installing scap version "4.130.0" for 207 hosts
  • 16:32 jan_drewniak: starting portals deploy
  • 16:25 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 16:00 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 16:00 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 16:00 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 15:59 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 15:58 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 15:47 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 15:46 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 15:42 volans: uploaded spicerack_9.0.0 to apt.wikimedia.org bullseye-wikimedia
  • 15:42 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 15:42 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 15:32 taavi@deploy2002: Finished scap sync-world: Backport for wikitech: Drop contentadmin group (T375950) (duration: 09m 42s)
  • 15:29 sukhe: sudo cumin -b1 -s10 "A:cp" 'run-puppet-agent --enable "merging CR 1091748"'
  • 15:26 taavi@deploy2002: taavi: Continuing with sync
  • 15:26 taavi@deploy2002: taavi: Backport for wikitech: Drop contentadmin group (T375950) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:24 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: [done] testing CR 1091748]
  • 15:22 taavi@deploy2002: Started scap sync-world: Backport for wikitech: Drop contentadmin group (T375950)
  • 15:17 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: testing CR 1091748]
  • 15:14 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1091748"' [trafficserver: remove inbound TLS and related settings]
  • 15:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 15:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1018.eqiad.wmnet wikikube-worker1006.eqiad.wmnet on all recursors
  • 15:03 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1018.eqiad.wmnet wikikube-worker1006.eqiad.wmnet on all recursors
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1018 to wikikube-worker1006
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1006
  • 14:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1006
  • 14:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:58 marostegui: Deploy schema change on db1167 dbmaint eqiad - s8 sanitarium master, there will be days of lag in wikireplicas in s8 T367856
  • 14:57 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:53 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 14:50 sukhe: running authdns-update for CR 1099713
  • 14:44 jelto@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:43 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976) (duration: 19m 08s)
  • 14:36 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 14:34 moritzm: installing curl security updates
  • 14:29 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:29 jiji@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc-gp[1001-1003].eqiad.wmnet
  • 14:29 jiji@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:28 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1018 to wikikube-worker1006
  • 14:27 urbanecm@deploy2002: migr, urbanecm: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:27 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:23 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976)
  • 14:17 urbanecm@deploy2002: Finished scap sync-world: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075) (duration: 14m 37s)
  • 14:11 urbanecm@deploy2002: urbanecm, daimona: Continuing with sync
  • 14:08 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp[1001-1003].eqiad.wmnet
  • 14:07 urbanecm@deploy2002: urbanecm, daimona: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:03 urbanecm@deploy2002: Started scap sync-world: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075)
  • 14:00 moritzm: removing ganeti1020 from active Ganeti nodes T378921
  • 13:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1007.eqiad.wmnet
  • 13:57 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1007.eqiad.wmnet
  • 13:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1003,1008].eqiad.wmnet with reason: Hardware refresh
  • 13:51 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1003,1008].eqiad.wmnet with reason: Hardware refresh
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71471 and previous config saved to /var/cache/conftool/dbconfig/20241202-134648-root.json
  • 13:46 isaranto@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:41 isaranto@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1002,1007].eqiad.wmnet with reason: Hardware refresh
  • 13:37 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1002,1007].eqiad.wmnet with reason: Hardware refresh
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71470 and previous config saved to /var/cache/conftool/dbconfig/20241202-133143-root.json
  • 13:31 effie: replacing kafka-main1003 in production with kafka-main1008 - T363214
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1018.eqiad.wmnet
  • 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1018.eqiad.wmnet
  • 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc-gp[2002-2003].codfw.wmnet
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp[2002-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 13:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:21 jiji@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp[2002-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 13:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1005.eqiad.wmnet
  • 13:18 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1005.eqiad.wmnet
  • 13:17 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71469 and previous config saved to /var/cache/conftool/dbconfig/20241202-131638-root.json
  • 13:06 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:01 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp[2002-2003].codfw.wmnet
  • 13:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71467 and previous config saved to /var/cache/conftool/dbconfig/20241202-130132-root.json
  • 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1020.eqiad.wmnet
  • 12:22 topranks: re-routing traffic from Drmrs towards TECHLIB-TCZ - AS2852 - National Library of Technology, Prague, to avoid path via GEANT
  • 12:18 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006].codfw.wmnet
  • 12:18 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006].codfw.wmnet
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc-gp2001.codfw.wmnet
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2005.codfw.wmnet with OS bookworm
  • 12:05 jiji@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 12:04 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 12:02 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 12:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1005.eqiad.wmnet with OS bookworm
  • 11:57 moritzm: upload mapnik 4.0.3+ds-2~wmf12u2 (adding a forward ported mapnik-config script to be consumed by node-mapnik even with the switch of mapnik 4 towards pkg-config) T327396
  • 11:56 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp2001.codfw.wmnet
  • 11:56 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2006.codfw.wmnet with OS bookworm
  • 11:55 marostegui: Stop mariadb on es2020 to clone es2041 T381259
  • 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1070.eqiad.wmnet
  • 11:52 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1070.eqiad.wmnet
  • 11:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2005.codfw.wmnet with reason: host reimage
  • 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum two overlarge container dbs
  • 11:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum two overlarge container dbs
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1005.eqiad.wmnet with reason: host reimage
  • 11:42 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2005.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1005.eqiad.wmnet with reason: host reimage
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2020.codfw.wmnet with reason: cloning
  • 11:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2020.codfw.wmnet with reason: cloning
  • 11:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2006.codfw.wmnet with reason: host reimage
  • 11:33 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2006.codfw.wmnet with reason: host reimage
  • 11:26 ladsgroup@deploy2002: Finished scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250) (duration: 11m 21s)
  • 11:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2005
  • 11:23 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2005
  • 11:23 topranks: rollback OSPF metric change on cr4-ulsfo to place all codfw to eqsin traffic back on primary transport link
  • 11:22 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1005.eqiad.wmnet with OS bookworm
  • 11:21 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2020 T381259', diff saved to https://phabricator.wikimedia.org/P71463 and previous config saved to /var/cache/conftool/dbconfig/20241202-112105-marostegui.json
  • 11:19 ladsgroup@deploy2002: abi, ladsgroup: Continuing with sync
  • 11:19 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2005
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2005.codfw.wmnet 40.32.192.10.in-addr.arpa 0.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2005.codfw.wmnet 40.32.192.10.in-addr.arpa 0.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2005 - jayme@cumin2002"
  • 11:19 ladsgroup@deploy2002: abi, ladsgroup: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:19 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2005 - jayme@cumin2002"
  • 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:15 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2088.codfw.wmnet with OS bullseye
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1017.eqiad.wmnet wikikube-worker1005.eqiad.wmnet on all recursors
  • 11:15 ladsgroup@deploy2002: Started scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250)
  • 11:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1017.eqiad.wmnet wikikube-worker1005.eqiad.wmnet on all recursors
  • 11:14 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:14 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2005
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2006
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2006
  • 11:13 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2006
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2006.codfw.wmnet 141.32.192.10.in-addr.arpa 1.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:13 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2006.codfw.wmnet 141.32.192.10.in-addr.arpa 1.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2006 - jayme@cumin2002"
  • 11:13 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2006 - jayme@cumin2002"
  • 11:09 ladsgroup@deploy2002: Started scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250)
  • 11:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1017 to wikikube-worker1005
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1005
  • 11:05 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1005
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1017 to wikikube-worker1005 - jelto@cumin1002"
  • 11:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1017 to wikikube-worker1005 - jelto@cumin1002"
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2005.codfw.wmnet with OS bookworm
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2006
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2006.codfw.wmnet with OS bookworm
  • 11:01 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2005.codfw.wmnet wikikube-worker2006.codfw.wmnet on all recursors
  • 11:01 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2005.codfw.wmnet wikikube-worker2006.codfw.wmnet on all recursors
  • 11:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:00 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1017 to wikikube-worker1005
  • 10:55 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2437 to wikikube-worker2006
  • 10:55 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2006
  • 10:54 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2006
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2437 to wikikube-worker2006 - jayme@cumin2002"
  • 10:54 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2437 to wikikube-worker2006 - jayme@cumin2002"
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2436 to wikikube-worker2005
  • 10:52 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2005
  • 10:51 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:51 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2005
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2436 to wikikube-worker2005 - jayme@cumin2002"
  • 10:50 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2436 to wikikube-worker2005 - jayme@cumin2002"
  • 10:48 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 10:46 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2437 to wikikube-worker2006
  • 10:46 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2436 to wikikube-worker2005
  • 10:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1017.eqiad.wmnet
  • 10:45 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1017.eqiad.wmnet
  • 10:44 ladsgroup@deploy2002: Finished scap sync-world: Backport for Enable new ParserCache key schema on every page (T373037) (duration: 17m 25s)
  • 10:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1017.eqiad.wmnet with OS bookworm
  • 10:37 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 10:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2088.codfw.wmnet with OS bullseye
  • 10:33 ladsgroup@deploy2002: ladsgroup: Backport for Enable new ParserCache key schema on every page (T373037) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:31 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2087.codfw.wmnet with OS bullseye
  • 10:26 ladsgroup@deploy2002: Started scap sync-world: Backport for Enable new ParserCache key schema on every page (T373037)
  • 10:16 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2437.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:16 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2436.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 10:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 10:12 marostegui: Deploy schema change on db1167 - s8 sanitarium master, there will be days of lag in wikireplicas in s8 T367856
  • 10:12 marostegui@cumin2002: dbctl commit (dc=all): 'Depool db1167 for an alter table', diff saved to https://phabricator.wikimedia.org/P71461 and previous config saved to /var/cache/conftool/dbconfig/20241202-101225-marostegui.json
  • 10:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: alter
  • 10:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: alter
  • 10:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 8:00:00 on db1167.eqiad.wmnet with reason: alter
  • 10:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 8:00:00 on db1167.eqiad.wmnet with reason: alter
  • 10:09 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
  • 10:05 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
  • 10:04 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2437.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:03 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2436.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:56 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host pc1017.eqiad.wmnet with OS bookworm
  • 09:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2436-2437].codfw.wmnet with reason: rename/reimage
  • 09:52 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2436-2437].codfw.wmnet with reason: rename/reimage
  • 09:52 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2087.codfw.wmnet with OS bullseye
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:48 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2436-2437].codfw.wmnet
  • 09:47 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2436-2437].codfw.wmnet
  • 09:45 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2086.codfw.wmnet with OS bullseye
  • 09:45 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 09:45 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 09:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: optimizing
  • 09:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: optimizing
  • 09:42 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 09:41 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:35 marostegui: Installing mariadb 10.6.20 on db1198 T378940
  • 09:28 marostegui@cumin2002: dbctl commit (dc=all): 'Depoll db1198 to install 10.6.20', diff saved to https://phabricator.wikimedia.org/P71460 and previous config saved to /var/cache/conftool/dbconfig/20241202-092854-marostegui.json
  • 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1198.eqiad.wmnet with reason: testing
  • 09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1198.eqiad.wmnet with reason: testing
  • 09:24 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
  • 09:20 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
  • 09:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1020.eqiad.wmnet
  • 09:09 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:59 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:52 dcausse: restarting blazegraph on wdqs1019 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 08:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:36 kartik@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription feature for some wikis (T372386) (duration: 23m 39s)
  • 08:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:29 kartik@deploy2002: abi, kartik: Continuing with sync
  • 08:25 kartik@deploy2002: abi, kartik: Backport for Translate: Enable message group subscription feature for some wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1020.eqiad.wmnet
  • 08:12 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription feature for some wikis (T372386)
  • 08:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1020.eqiad.wmnet
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 08:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1017.eqiad.wmnet with reason: T378068, host is not pooled
  • 08:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1017.eqiad.wmnet with reason: T378068, host is not pooled
  • 08:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled
  • 08:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled
  • 05:20 TimStarling: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol=https
  • 05:14 TimStarling: on mwmaint2002: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=idwikivoyage --force-protocol=https
  • 04:41 TimStarling: installed id.wikivoyage.org
  • 04:39 TimStarling: on db2123: grant alter ON `%wik%`.* TO `wikiadmin2023`@`10.%`
  • 04:26 tstarling@deploy2002: Finished scap sync-world: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726) (duration: 31m 05s)
  • 04:13 tstarling@deploy2002: tstarling: Continuing with sync
  • 04:12 tstarling@deploy2002: tstarling: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 03:55 tstarling@deploy2002: Started scap sync-world: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726)

2024-12-01

  • 23:53 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1156 gradually with 4 steps - Maint over (T381213)
  • 13:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1233 gradually with 4 steps - Maint over (T381213)
  • 12:31 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db1156 gradually with 4 steps - Maint over (T381213)
  • 12:16 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db1233 gradually with 4 steps - Maint over (T381213)
  • 12:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1156.eqiad.wmnet onto db1233.eqiad.wmnet
  • 10:45 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db1156.eqiad.wmnet onto db1233.eqiad.wmnet
  • 10:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool to reclone (T381213)', diff saved to https://phabricator.wikimedia.org/P71451 and previous config saved to /var/cache/conftool/dbconfig/20241201-104441-ladsgroup.json
  • 06:18 marostegui@cumin2002: dbctl commit (dc=all): 'Depoll db1233', diff saved to https://phabricator.wikimedia.org/P71450 and previous config saved to /var/cache/conftool/dbconfig/20241201-061841-marostegui.json


Archives

See Server Admin Log/Archives.
