Page MenuHomePhabricator

Split PHPUnit test suite for wmf-quibble-vendor-mysql-php74-docker
Closed, ResolvedPublic

Description

According to the analysis in T361118: [EPIC][Infra] Reduce CI test runtime for Wikibase and related extensions, wmf-quibble-vendor-mysql-php74-docker takes the longest of the checks with 21.6 minutes, with the PHPUnit test suite taking 14.5 minutes. Any improvement to the runtime for this check would directly reduce the total round-trip time for CI checks.

Investigate splitting the test suite into smaller chunks. The chunks can then be run in parallel.

  • creating two chunks of 7.25 minutes would already improve the CI run time by 7.25 minutes
  • In the Python world, pytest-split is a popular plugin that implements test suite splitting for pytest suites.
  • phpunit has basic functionality for filtering test cases by test class name (e.g. phpunit --filter '/^[A-M]/', phpunit --filter '/^[N-Z]/')

Demonstrate the improvement in runtime by running the quibble docker image locally, with before / after timings, and propose patches to integration/config and integration/quibble .

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Prio Notes:

Impact AreaAffected
production / end users
monitoring
development efforts
onboarding efforts
additional stakeholders
ArthurTaylor renamed this task from [REPO][CLIENT][SW] Split PHPUnit test suite for wmf-quibble-vendor-mysql-php74-docker to [REPO][CLIENT] Split PHPUnit test suite for wmf-quibble-vendor-mysql-php74-docker.Apr 11 2024, 8:16 AM
ArthurTaylor claimed this task.

Some progress notes / a status update.

Running the test job on its own (11200 tests):

$ time composer run --timeout=0 phpunit:entrypoint -- --testsuite extensions --group Database --exclude-group Broken,ParserFuzz,Stub,Standalone
...
real	7m38.242s
user	6m58.434s
sys	0m30.815s

Running tests with classes grouped into 8 groups, but the groups running sequentially (11290 tests):

real	4m15.663s
user	3m44.455s
sys	0m14.330s

With 8 suites running in parallel, and test classes assigned round-robin to suites:

real	2m26.348s
user	8m8.877s
sys	0m29.452s

With 8 suites running parallel, and test classes assigned to balance suite duration:

real	2m5.897s
user	9m47.516s
sys	0m37.622s

In the related subtasks, I've been fixing up some tests that run okay in a normal sequential suite but fail if they are run with an unfamiliar set of classes (or in an unexpected order). I think what confuses me most so far is the difference between the linear run with all tests, and the linear run with the tests split into 8 groups. One obvious theory is that the split groups are running different tests - it's confusing that the number of tests is more in the split run than in the normal run. Another more out-there theory is that in the process of running 11200 tests in the same process, there's enough stored state, leaked memory and overhead to slow the rest of the suite down.

Next steps will be to continue to fix tests that fail on reordering, and to investigate exactly which tests are running in the one configuration that are not in the other.

I've uploaded the first (very ugly) versions of scripts that have the desired effect to gitlab:

https://gitlab.wikimedia.org/arthurtaylor/phpunit-quibble-splittool.git

If you create a file bootstrap-splittool.sh in your local quibble cache folder with the following contents:

#!/bin/bash
set -ex
if [ ! -d splittool ]; then
  git clone https://gitlab.wikimedia.org/arthurtaylor/phpunit-quibble-splittool.git splittool
else
  pushd splittool
  git pull
  popd
fi
exec splittool/splittool.sh

You should be able to run a parallel version of the test suite by passing the bootstrap script to quibble:

docker run -it -p9413:9413 --entrypoint=quibble-with-supervisord --tmpfs /workspace/db:size=640M --volume /home/arthur.taylor/work/quibble-run-folder/src:/workspace/src --volume /home/arthur.taylor/work/quibble-run-folder/cache:/cache --volume /home/arthur.taylor/work/quibble-run-folder/log:/workspace/log --volume /home/arthur.taylor/work/quibble-run-folder/ref:/srv/git:ro --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --env-file=env-Wikibase-Master-20240410 --init --rm docker-registry.wikimedia.org/releng/quibble-buster-php74:1.7.0-s1 --packages-source composer --db mysql --db-dir /workspace/db --git-parallel=8 -c /cache/bootstrap-splittool.sh

On my laptop this runs in a little under 4 minutes, with the test suite itself being about 2m30 of that.

Change #1031903 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[integration/quibble@master] [WIP] Add parallel execution for PHPUnit extensions suite

https://gerrit.wikimedia.org/r/1031903

The first sketch of an integration of the parallel testing with Quibble is now ready. On my laptop, the quibble run takes around 7m30s, though there are still test failures that need to be debugged.

The first sketch of an integration of the parallel testing with Quibble is now ready. On my laptop, the quibble run takes around 7m30s, though there are still test failures that need to be debugged.

Nice work :)

One thing I am uncertain of, when I looked at this in the past, was whether using --filter to split test suites up might end up in some tests not running. I wonder if we can use TestDox output or some other additional output to diff the test classes and test cases being run, to ensure we don't lose anything.

I also wanted to make sure you know about T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel, it sounds like there is some upcoming work planned there in the short term, so probably worth coordinating with @pmiazga on this: https://phabricator.wikimedia.org/T50217#9790465

In this approach I'm not using filter. I'm using phpunit's --list-tests-xmlto get a list of tests that would be run using the suite, then scanning the filesystem to find the test files associated with those test classes, then creating suites in phpunit.xml that contain those files. Basically expanding out the suite definitions that we have in phpunit.xml.dist.

I've been using the .phpunit.result.cache file to see if tests are being missed - I run the suite linearly, run the suite in parallel, and compare the list of tests in the two cache files. So far I haven't spotted any difference, but that's for sure something to keep an eye on.

Thanks for the tip about the other tickets - I'll have a look there and follow up with @pmiazga.

In this approach I'm not using filter. I'm using phpunit's --list-tests-xmlto get a list of tests that would be run using the suite, then scanning the filesystem to find the test files associated with those test classes, then creating suites in phpunit.xml that contain those files. Basically expanding out the suite definitions that we have in phpunit.xml.dist.

I've been using the .phpunit.result.cache file to see if tests are being missed - I run the suite linearly, run the suite in parallel, and compare the list of tests in the two cache files. So far I haven't spotted any difference, but that's for sure something to keep an eye on.

Thanks for the tip about the other tickets - I'll have a look there and follow up with @pmiazga.

Got it, thanks for the clarification.

@ArthurTaylor Based on the approach you describe, do you think it would be feasible to encapsulate that logic into a PHP script? That could potentially go into MediaWiki core (e.g. composer phpunit:parallel), and then it would be much easier for MediaWiki developers to make use of the parallel execution functionality in their local work. An advantage there is that it's easier for people to debug the parallel execution code in case of issues, whereas, most MediaWiki engineers are unfamiliar with Quibble and don't run it locally.

That said, the primary use case for parallel execution is CI, so I'm not opposed to making a Quibble-only implementation--just seems like, if a PHP script in MediaWiki core could do the job, then it would be nice to be able to support the use case of local development environments in addition to CI.

Hi @kostajh,

Thanks for the feedback. My original implementation of this was actually a PHP script, which I still have, so it's 100% possible to implement the logic in PHP. In terms of actually integrating this in composer, there are a couple of reasons why I went for the Python/Quibble approach:

  • Per other developers' experiences trying to get paratest working, if you start using anything other than composer phpunit:entrypoint, you need to understand how the current bootstrap.php is all working, which I honestly don't. I could do more research into that, but launching the jobs in parallel with an external (not composer) launcher was the path of least resistance.
  • At least for my setup (and I don't know how many other developers have the same experience), my development environment is similar to Quibble, but not 100% the same in terms of extensions and configuration. I can often run phpunit tests locally with PHPStorm or mw docker mediawiki exec composer, but I very often simply can't and there are often failures that I'm only able to reproduce in the Quibble docker container running on my machine. In that sense, it doesn't really matter whether the implementation is part of Quibble or part of composer.

I had a chat with @pmiazga yesterday, and he also raised the use-case of developers running the whole suite locally (i.e. from PHP/Composer), so I hear now two votes for that and I'm happy to take the feedback / hear from more people. Also if someone who knows their way around composer is interested to take the php version and run with it, it can be found in the phpunit-quibble-splittool repository.

I have filed a ticket for the composer/php implementation - T365567 - and added some info there - I'm very happy to address that as a follow-up. If you can imagine to go with the quibble/python implementation for now then I will continue to polish that and we can hopefully get some easy wins in the short term before I do the (I think harder) work of making it composer-native.

I will file a ticket for the composer/php implementation and add some info there - I'm very happy to address that as a follow-up. If you can imagine to go with the quibble/python implementation for now then I will continue to polish that and we can hopefully get some easy wins in the short term before I do the (I think harder) work of making it composer-native.

That sounds reasonable to me!

Change #1035364 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[integration/quibble@master] [DNM] Enable parallel runs by setting QUIBBLE_PHPUNIT_PARALLEL

https://gerrit.wikimedia.org/r/1035364

Progress update from my side. I have a patch that I'm happy with, where I've ironed out the kinks with logging and return codes. You can compare the slow run and the fast run by looking at the integration-quibble-fullrun-extensions-phpunit check results / log. In the fast case the suites (without database 5885 + 4333 + 2540 + 3596 + 2910 + 3068 + 3175 = 25507 tests, with database 934 + 562 + 1854 + 1043 + 934 + 5201 + 841 + 1564 = 12933 tests) need 7 minutes. In the slow case the suites (without database 25507 tests, with database 12933 tests) need 26 minutes.

All my changes to the tests themselves (cleaning up state mixing between tests) have been merged - the only failures / errors there should be actual problems on master. Otherwise, the patch requires no changes to mediawiki or anyone's dev setup, and is only active in CI when the environment variable is set.

I will file a ticket for the composer/php implementation and add some info there - I'm very happy to address that as a follow-up. If you can imagine to go with the quibble/python implementation for now then I will continue to polish that and we can hopefully get some easy wins in the short term before I do the (I think harder) work of making it composer-native.

Further building on the staged implementation of this, I think we could split up the PHP port into two parts: first implement the "build step" logic in a PHP script, so that Quibble invokes e.g. composer phpunit:prepare-parallel-test-list, and then after that is done, we could work on the parallel execution bit in PHP.

That makes sense. I've filed T365976 and T365978 to track these. Is there a board / tag / topic that I should be labeling these tickets with for greater visibility?

That makes sense. I've filed T365976 and T365978 to track these. Is there a board / tag / topic that I should be labeling these tickets with for greater visibility?

Maybe mark them as subtasks of T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel and tag with Developer Productivity and MediaWiki-Core-Tests ?

That makes sense. I've filed T365976 and T365978 to track these. Is there a board / tag / topic that I should be labeling these tickets with for greater visibility?

Maybe mark them as subtasks of T50217: Speed up MediaWiki PHPUnit build by running integration tests in parallel and tag with Developer Productivity and MediaWiki-Core-Tests ?

done!

ArthurTaylor renamed this task from [REPO][CLIENT] Split PHPUnit test suite for wmf-quibble-vendor-mysql-php74-docker to Split PHPUnit test suite for wmf-quibble-vendor-mysql-php74-docker.May 29 2024, 10:36 AM

Change #1043719 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/quibble@master] release: Quibble 1.9.0

https://gerrit.wikimedia.org/r/1043719

Change #1031903 merged by jenkins-bot:

[integration/quibble@master] Add parallel execution for PHPUnit extensions suite

https://gerrit.wikimedia.org/r/1031903

Change #1043824 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/quibble@master] build: Enable parallel runs by setting QUIBBLE_PHPUNIT_PARALLEL

https://gerrit.wikimedia.org/r/1043824

Change #1043824 merged by jenkins-bot:

[integration/quibble@master] build: Enable parallel runs by setting QUIBBLE_PHPUNIT_PARALLEL

https://gerrit.wikimedia.org/r/1043824

Change #1044365 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] dockerfiles: update Quibble to 1.9.0

https://gerrit.wikimedia.org/r/1044365

Change #1044365 merged by jenkins-bot:

[integration/config@master] dockerfiles: update Quibble to 1.9.0

https://gerrit.wikimedia.org/r/1044365

Change #1044421 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: switch jobs to Quibble 1.9.0

https://gerrit.wikimedia.org/r/1044421

Change #1035364 abandoned by Arthur taylor:

[integration/quibble@master] build: Enable parallel runs by setting QUIBBLE_PHPUNIT_PARALLEL

Reason:

Abandoning this for I9c2b0df089d18b6f0232ea1817b4855e5591e9b9

https://gerrit.wikimedia.org/r/1035364

Change #1044421 merged by jenkins-bot:

[integration/config@master] jjb: switch jobs to Quibble 1.9.0

https://gerrit.wikimedia.org/r/1044421

Change #1046625 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[integration/config@master] Enable parallel phpunit for WikibaseLexeme

https://gerrit.wikimedia.org/r/1046625

Change #1046625 merged by jenkins-bot:

[integration/config@master] Enable parallel phpunit for WikibaseLexeme

https://gerrit.wikimedia.org/r/1046625

Mentioned in SAL (#wikimedia-releng) [2024-06-17T12:14:10Z] <hashar> Reloaded Zuul for "Enable parallel phpunit for WikibaseLexeme" https://gerrit.wikimedia.org/r/1046625 # T361190

Change #1047919 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/config@master] zuul: Enable PHPUnit parallel for all repos except core

https://gerrit.wikimedia.org/r/1047919

Change #1047919 merged by jenkins-bot:

[integration/config@master] zuul: Enable PHPUnit parallel for all WMDE-maintained repos

https://gerrit.wikimedia.org/r/1047919

Change #1049562 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/config@master] zuul: Enable PHPUnit parallel for WikibaseQualityConstraints

https://gerrit.wikimedia.org/r/1049562

Change #1049919 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[integration/quibble@master] Add notice at the end of console log for parallel test runs

https://gerrit.wikimedia.org/r/1049919

Change #1049919 merged by jenkins-bot:

[integration/quibble@master] Add notice at the end of console log for parallel test runs

https://gerrit.wikimedia.org/r/1049919

Change #1049562 merged by jenkins-bot:

[integration/config@master] zuul: Enable PHPUnit parallel for WikibaseQualityConstraints

https://gerrit.wikimedia.org/r/1049562

kostajh claimed this task.

(restoring assignee after column trigger unintentionally cleared it)

  NODES
chat 1
composer 26
HOME 5
mac 1
Note 3
os 84
text 3
Users 1