This task is an initial dive into how we might implement T148589.
What about this: a client-side hash of the loaded content gets sent to the event log, and a bot reads the event log entries, makes API calls to load and render the corresponding revisions, and reports when a hash is off.
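A rough sketch of what the bot's comparison step could look like, in Python for illustration only. The `LoadEvent` fields, the `render` callback (standing in for the API calls that load and render a revision), and the choice of SHA-256 are all assumptions, not existing CentralNotice or EventLogging interfaces.

```python
import hashlib
from typing import Callable, Iterable, List, NamedTuple


class LoadEvent(NamedTuple):
    """Hypothetical event log entry sent by the client."""
    banner: str
    revision_id: int
    client_hash: str


def content_hash(content: str) -> str:
    # Hash the UTF-8 bytes so the result does not depend on platform encoding.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def find_mismatches(
    events: Iterable[LoadEvent],
    render: Callable[[str, int], str],
) -> List[LoadEvent]:
    """Return the events whose client-reported hash differs from a fresh render."""
    return [
        event
        for event in events
        if content_hash(render(event.banner, event.revision_id)) != event.client_hash
    ]
```

The client would have to hash exactly the same bytes as the bot does, which is where the approach is most fragile.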
Here's a proposal for doing this server-side (as per tech-talk discussion with @cwd yesterday).
- Use a server-side hook to run some code on every change to a page in the MediaWiki namespace. It can be run using deferred updates so it doesn't slow down the pageviews it runs on.
- Create a hash of the message contents as they should be sent via Special:BannerLoader.
- Store the hashes somewhere (memcached or DB).
- On every call to Special:BannerLoader, verify the hash of the banner content to be sent.
- If the hash is off, flag it for the client in the same way we're now flagging caught exceptions (client-side call to cn.handleBannerLoaderException(), which causes an error status flag to be sent with recordimpression).
- Don't cache such responses in Varnish (this should be worked on in a separate task).
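The steps above could be sketched roughly as follows, in Python for illustration only (the real code would live in the extension). `BannerHashStore`, `build_response`, the in-memory dict standing in for memcached or the DB, and the response field names are all hypothetical; SHA-256 is just one possible hash.

```python
import hashlib
from typing import Dict


def banner_hash(content: str) -> str:
    # Hash the UTF-8 bytes of the content as it should be sent to the client.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class BannerHashStore:
    """In-memory stand-in for the memcached or DB storage in the proposal."""

    def __init__(self) -> None:
        self._hashes: Dict[str, str] = {}

    def record_change(self, banner: str, expected_content: str) -> None:
        # Would be called from the (hypothetical) page-change hook,
        # ideally inside a deferred update.
        self._hashes[banner] = banner_hash(expected_content)

    def matches(self, banner: str, served_content: str) -> bool:
        expected = self._hashes.get(banner)
        # Assumption: a missing hash is treated the same as a mismatch.
        return expected is not None and banner_hash(served_content) == expected


def build_response(store: BannerHashStore, banner: str, served_content: str) -> dict:
    """Attach an error flag and a cacheability hint to a banner response."""
    ok = store.matches(banner, served_content)
    return {
        "content": served_content,
        "hash_error": not ok,  # client would report this with recordimpression
        "cacheable": ok,       # mismatched responses should not be cached
    }
```

For example, after `store.record_change("B1", "new text")`, calling `build_response(store, "B1", "old text")` comes back with `hash_error` set and `cacheable` cleared.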
Thoughts?
I think doing this server-side is wise because browsers' wild variations in character set, encoding, and language seem likely to cause a lot of false mismatches when comparing hashes.
The general plan seems sensible to me; the only part I know nothing about is Varnish.
Special:BannerLoader will serve many variations of each banner, which means the "change" hook (which probably doesn't exist) would have to figure out the matrix of all possible languages and countries for a banner, and calculate a hash for each.
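To make the size of that matrix concrete, here is a small Python sketch. The `render` callback and the function name are hypothetical, and it assumes language and country are the only dimensions that vary, which may not be true.

```python
import hashlib
from itertools import product
from typing import Callable, Dict, Sequence, Tuple


def hashes_for_all_variations(
    banner: str,
    languages: Sequence[str],
    countries: Sequence[str],
    render: Callable[[str, str, str], str],
) -> Dict[Tuple[str, str], str]:
    """Render every (language, country) variation of a banner and hash each one."""
    return {
        (lang, country): hashlib.sha256(
            render(banner, lang, country).encode("utf-8")
        ).hexdigest()
        for lang, country in product(languages, countries)
    }
```

Every change would trigger one render per (language, country) pair, so the work per change is the product of the two list sizes.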
There's also the issue of how the code can know that it has the "correct" content. If we're really talking about an off-by-one problem, where we might get the previous revision's content, there's no way we can know whether the new revision was deliberately changed to exactly match an older version, or whether we got the older content by mistake.
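A tiny Python illustration of that ambiguity, with made-up revision texts: when the newest revision is a revert, its hash is identical to an older revision's hash, so a content hash alone cannot tell a correct response from a stale one.

```python
import hashlib
from typing import List


def content_hash(content: str) -> str:
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def stale_candidates(revisions: List[str]) -> List[int]:
    """Indexes of older revisions whose hash equals the latest revision's hash."""
    latest = content_hash(revisions[-1])
    return [
        index
        for index, content in enumerate(revisions[:-1])
        if content_hash(content) == latest
    ]


# Hypothetical history: the third revision reverts to the first one.
history = ["Donate today", "Donate now", "Donate today"]
ambiguous = stale_candidates(history)  # older revisions indistinguishable by hash
```

Here the first revision's content passes verification whether it was served on purpose or by mistake.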
This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!
For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)