Speedup regr_test.py by running test cases concurrently #10714

AlexWaygood merged 7 commits into python:main

Conversation
Example of a successful CI run: https://github.com/python/typeshed/actions/runs/6199279959/job/16831483816

Example of a failing CI run: https://github.com/python/typeshed/actions/runs/6199348575/job/16831709368?pr=10714
Here's an alternative approach that uses It seems to work fine, without race conditions, and is a similar approach to what
Any specific reason to use processes rather than threads here? Seems like the workers are mostly waiting on external processes, so the GIL shouldn't be much of an issue. I tend to avoid multiprocessing as it's more prone to failing in weird ways than threads are.
I initially tried a ThreadPoolExecutor, but quickly ran into weird race conditions where some of the mypy subprocesses couldn't find various stdlib modules. But I'll take another look and see if I can make threads work here. I agree they make more sense for this kind of thing. |
I couldn't reproduce the race conditions when I tried again, so I switched to a `ThreadPoolExecutor`.

(Note that although we have the infrastructure set up to run test cases on stubs that have non-types dependencies, we don't yet have any test cases for any stubs with non-types dependencies. So there are some bits of this script that are currently "dead code, lying in wait". Those bits of the script might be more susceptible to #9537-type issues than the bits that are currently being used.)
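The thread-based approach can be sketched like this (illustrative code only, not the actual `regr_test.py` implementation): `subprocess.run()` blocks with the GIL released while it waits on the child process, so a `ThreadPoolExecutor` overlaps the external work just as well as processes would, without the pickling overhead of `ProcessPoolExecutor`.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_case(code: str) -> subprocess.CompletedProcess:
    # Each worker thread blocks inside subprocess.run() with the GIL
    # released, so the child interpreters genuinely run in parallel.
    return subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )

# Stand-ins for the real test-case subprocess invocations
cases = [f"print({n} * {n})" for n in range(5)]

with ThreadPoolExecutor(max_workers=4) as executor:
    # executor.map preserves input order, unlike as_completed()
    results = list(executor.map(run_case, cases))

outputs = [result.stdout.strip() for result in results]
```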
Thanks @JelleZijlstra! |
```python
test_case_dir: Path
tempdir: Path

def print_description(self, *, verbosity: Verbosity) -> None:
```
@AlexWaygood After this refactor, the `verbosity` argument of `Result.print_description` has been left unused. Was that intentional?
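For illustration, here is one way the `verbosity` argument could be wired in. All names below (`Verbosity` as an `IntEnum`, the `Result` fields, the level names) are hypothetical stand-ins, not the actual typeshed code:

```python
from dataclasses import dataclass
from enum import IntEnum
from pathlib import Path

class Verbosity(IntEnum):  # hypothetical stand-in for the script's Verbosity
    QUIET = 0
    NORMAL = 1
    VERBOSE = 2

@dataclass
class Result:  # hypothetical stand-in for the script's Result
    test_case_dir: Path
    tempdir: Path

    def print_description(self, *, verbosity: Verbosity) -> None:
        # Print nothing at QUIET; add the tempdir detail only at VERBOSE.
        if verbosity is Verbosity.QUIET:
            return
        print(f"Test cases from {self.test_case_dir}")
        if verbosity is Verbosity.VERBOSE:
            print(f"  (running in {self.tempdir})")
```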
As we add more and more test cases, `regr_test.py` is getting kinda slow. Other than mypy_primer, it's now the slowest CI job we have by a long way (and we don't run mypy_primer on all PRs -- for example, it's skipped on this PR!).

The reason for the slowness is that we now have regression tests for 11 stubs packages. We run all regression tests on Python 3.8-3.12 inclusive, and we run them all on linux, darwin and win32. That adds up to a total of 165 subprocesses created for each run of the test when it's run with `--all` (the flag we use in CI).

At some point we may want to consider sharding this test between GitHub Actions workers, similar to the way we run `mypy_test.py` and pyright in CI. (We can also possibly reconsider whether we need to, e.g., run all tests on darwin, linux and Windows.) For now, though, we can speed things up a lot just by running the subprocesses concurrently using a `ProcessPoolExecutor`. This cuts the execution time in CI roughly in half, from around 5-6 minutes to around 2-3 minutes.

(N.B.: A `ProcessPoolExecutor` feels like a slightly blunt instrument here with a lot of overhead; I'm sure there are more efficient ways of spawning subprocesses concurrently. I got this to work reasonably quickly, however, and when I tried different approaches I quickly ran into race conditions. I think this is ~good enough for now.)
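The 165 figure above is just the product of the three dimensions of the test matrix:

```python
stub_packages = 11   # stubs packages that have regression test cases
python_versions = 5  # Python 3.8 through 3.12 inclusive
platforms = 3        # linux, darwin, win32

subprocesses = stub_packages * python_versions * platforms
print(subprocesses)  # → 165 subprocesses per --all run
```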