1.36.0-wmf.18 deployment blockers
Closed, Resolved | Public | Release

Details

Backup Train Conductor: hashar
Release Version: 1.36.0-wmf.18
Release Date: Nov 16 2020, 12:00 AM

2020 week 47 1.36-wmf.18 Changes wmf/1.36.0-wmf.18

This MediaWiki Train Deployment is scheduled for the week of Monday, November 16th:

  • Monday, November 16th: Backports only.
  • Tuesday, November 17th: Branch wmf.18 and deploy to Group 0 Wikis.
  • Wednesday, November 18th: Deploy wmf.18 to Group 1 Wikis.
  • Thursday, November 19th: Deploy wmf.18 to all Wikis.
  • Friday: No deployments on Fridays.

How this works

  • Any serious bugs affecting wmf.18 should be added as subtasks beneath this one.
  • Any open subtask(s) block the train from moving forward. This means no further deployments until the blockers are resolved.
  • If something is serious enough to warrant a rollback then you should bring it to the attention of deployers on the #wikimedia-operations IRC channel.
  • If you have a risky change in this week's train, add a comment to this task using the Risky patch template.
  • For more info about deployment blockers, see Holding the train.
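
The blocking rule above can be sketched in a few lines of Python. This is only an illustration, not the tooling the train actually uses: the `Subtask` type, both function names, and the open/closed statuses in the example list are assumptions made up for this sketch, and `T000001` is a placeholder ID.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    """Hypothetical stand-in for a subtask of the blockers task."""
    task_id: str
    is_open: bool


def open_blockers(subtasks):
    """Return the IDs of all subtasks that are still open."""
    return [t.task_id for t in subtasks if t.is_open]


def train_is_blocked(subtasks):
    """The train may not move forward while any subtask is open."""
    return len(open_blockers(subtasks)) > 0


# Illustrative statuses only; T000001 is a placeholder, not a real task.
subtasks = [
    Subtask("T267668", is_open=True),
    Subtask("T000001", is_open=False),
]

if train_is_blocked(subtasks):
    print("Train blocked by:", ", ".join(open_blockers(subtasks)))
else:
    print("No open blockers; the train can proceed.")
```

Once every subtask is resolved, `open_blockers` returns an empty list and the train is free to move to the next group.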

Related Links

Other Deployments

Previous: 1.36.0-wmf.17
Next: 1.36.0-wmf.19

Event Timeline

thcipriani triaged this task as Medium priority.
thcipriani updated Other Assignee, added: mmodell.

If at all possible, can I get the train window at 14:00 local time (CET), which would be 13:00 UTC? That would be easier for me to manage.

thcipriani updated Other Assignee, added: hashar; removed: mmodell.

Looks like I had this task assigned incorrectly vs our weekly meeting notes. You're backup for this week, but you'll be primary on T263186: 1.36.0-wmf.20 deployment blockers

Consider this train blocked until the cause behind T267668 is either fixed or reverted in master. I've prepared a revert at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/640503, but I'm giving my team until the branch cut next Monday/Tuesday to try to fix it in master.

We rolled out the fix for T267668: Some recent Commons uploads not available on other wikis (2020-11) yesterday (Nov. 12th). As Timo said, the original change has not been reverted in master.

Change 641319 had a related patch set uploaded (by DannyS712; owner: trainbranchbot):
[mediawiki/core@wmf/1.36.0-wmf.18] Branch commit for wmf/1.36.0-wmf.18

https://gerrit.wikimedia.org/r/641319

Now fixed in master and included in wmf.18:

[mediawiki/core@master] filerepo: remove repo name from getSharedCacheKey()
https://gerrit.wikimedia.org/r/640868

[mediawiki/core@wmf/1.36.0-wmf.18] filerepo: remove repo name from getSharedCacheKey()
https://gerrit.wikimedia.org/r/641290

T268012 and T268008 are two GrowthExperiments issues that are not fixed in wmf.18 as far as I’m aware. Both have fixes available, and for the former I just deployed the fix for wmf.16, but I’m not sure how to backport changes for not-yet-deployed trains (last time I tried, it didn’t quite work out), so I’ve left the wmf.18 branch alone for now. @kostajh can tell you more about whether those issues have any serious user impact or “just” logspam; depending on that, I expect you’ll decide whether to apply the backports to wmf.18 before the initial rollout, or afterwards.

Hi @dancy, we've backported a fix to wmf.16 but need https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/641294 backported to wmf.18 as well. I've put it on the deployment calendar for the upcoming evening backport window, but that will happen after this gets to group0. Now that I'm writing this, I think that is probably OK; this is just a heads-up.

Change 641319 merged by jenkins-bot:
[mediawiki/core@wmf/1.36.0-wmf.18] Branch commit for wmf/1.36.0-wmf.18

https://gerrit.wikimedia.org/r/641319

@kostajh https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/641294 is deployed to group0.

I hope to see something to deal with the T268008 log messages soon. Thanks!

@dancy the patch attached to that task would fix the problem, I think (cc @Tgr and @Catrope to review) but it seems like a bigger problem with the job queue infrastructure, not specific to that particular job.

Indeed, the larger issue is very troubling (our job doesn't do anything important, but if the parameter corruption affects some job which writes data, and it gets corrupted in such a way that it doesn't trigger an error, just results in the wrong data being written, that would be rather bad) so I don't think we should suppress it until the core issue is found.

We found the underlying issue (it was specific to the extension) so the logspam will be fixed soon.

Looks good now. Thanks!

Thanks for the note. Is this a request to hold the train, or just a hint that there may be a patch coming in soon?

I’m not the right person to decide that, sorry – I just saw a few reports of the issue and thought it was worth mentioning here.

The train has been rolled back to group0 (i.e., group0 = 1.36.0-wmf.18, group1 and group2 = 1.36.0-wmf.16).

T267668 is now blocking the train.

@Krinkle deployed the fix for T267668 to wmf.16 and testing looks good.

group0: wmf.18
group1: wmf.16
group2: wmf.16

@greg says we can finish the wmf.18 train on Monday and Tuesday.

wmf.18 has been rolled back out to group1.

Current status:
group0: wmf.18
group1: wmf.18
group2: wmf.16

It seems to work properly and T267668 has not been reopened. If all goes well overnight, I will deploy it to the rest of the wikis during the European time slot at 13:00 UTC (14:00 CET). I will check with SRE before pushing it, obviously!
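
For anyone double-checking the window, here is a minimal Python sketch of the 14:00 CET to 13:00 UTC conversion. It assumes CET is a fixed UTC+1 offset, which holds in November when daylight saving time is not in effect. The date below is illustrative only and is not taken from the deployment calendar.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CET modelled as a fixed UTC+1 offset (valid outside summer time).
CET = timezone(timedelta(hours=1), "CET")


def cet_to_utc(local_time):
    """Convert a CET-aware datetime to UTC."""
    return local_time.astimezone(timezone.utc)


# Illustrative date in the train week; only the time of day matters here.
window_local = datetime(2020, 11, 17, 14, 0, tzinfo=CET)
window_utc = cet_to_utc(window_local)

print(window_local.strftime("%H:%M %Z"), "=", window_utc.strftime("%H:%M UTC"))
```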

Deployed to all wikis. Congratulations everyone.
