test: make timers-blocking-callback more reliable#14831
test: make timers-blocking-callback more reliable#14831Trott wants to merge 5 commits intonodejs:masterfrom
Conversation
test-timers-blocking-callback may fail erroneously on resource-constrained machines due to the timing nature of the test. There is likely no way around the timing issue. This change tries to decrease the probability of the test failing erroneously by having it retry a small number of times on failure. Tested on 0.10.38 (which has a bug that this test was written for) and (modifying the test slightly to remove ES6 stuff) the test still seems to fail 100% of the time there, which is what we want/expect. Fixes: nodejs#14792
|
/cc @aqrln @misterdjules |
|
For anyone else wanting to verify this still fails 100% in Node.js 0.10.38 and passes 100% in Node.js 0.10.39, here's the version of this test that I used with ES6-isms removed: Details |
| retries--; | ||
| return retry(callback); | ||
| } | ||
| assert.fail(`timeout delayed by more than 100ms (${latestDelay}ms)`); |
There was a problem hiding this comment.
Maybe
assert.fail(`timeout delayed by more than 100% (${latestDelay}ms)`);There was a problem hiding this comment.
I'm pretty sure the bug that this is a regression test for was one that delayed callbacks by 100ms, not 100%. But I can't find the info. Maybe @misterdjules knows if I'm mistaken about that or not.
(It does suggest that the TIMEOUT * 2 line might be better as TIMEOUT + 100. And maybe this needs a comment explaining that TIMEOUT needs to be short.)
There was a problem hiding this comment.
I thought TIMEOUT * 2 at first, but then read L57, also the test is (latestDelay > TIMEOUT * 2) implies 100%
There was a problem hiding this comment.
Nope, I'm wrong, it really should be 100%. Will change, thanks.
| TIMEOUT | ||
| ); | ||
| setTimeout( | ||
| blockingCallback.bind(null, testAddingTimerToNonEmptyTimersList), |
There was a problem hiding this comment.
CR: could you add a common.mustCall() to assert the async task actually runs
There was a problem hiding this comment.
That's not a callback. That's a "use this as a retry function if the test fails". So it almost never gets called.
EDIT: In the above comment, I assume you mean a common.mustCall() around testAddingTimerToNonEmptyTimersList. If you meant around blockingCallback.bind(...), I'd rather not. setTimeout() holds the event loop open (and thus the process running) until the callback is called. If it fails to run, that's a The Whole World Is Broken And On Fire kind of problem and I don't think we need to flag it in this test. I don't think we tend to wrap setTimeout() stuff in common.mustCall() generally and I wouldn't want to create an impression that moving in that direction is desirable.
There was a problem hiding this comment.
blockingCallback takes a 2nd arg callback
There was a problem hiding this comment.
Agree that it's a "Whole World Is Broken And On Fire" sitch, but adding it might help readability in that it will flag that this should be the end of the test.
|
probably cc @misterdjules tbh |
| let latestDelay; | ||
| let timeCallbackScheduled; | ||
|
|
||
| // These tests are somewhat probablistic so they may fail even when the bug is |
There was a problem hiding this comment.
nit: somehow the term "probabilistic" seems a bit vague, and doesn't seem to add to the rest of the sentence ("they may fail even when the bug is not present"), so I'd be inclined to remove that term. Not a big deal though.
There was a problem hiding this comment.
@misterdjules I updated the comment to remove the term "probablistic" and to hopefully add a bit more clarity about what the problem is.
| // https://github.com/nodejs/node-v0.x-archive/issues/9333.. | ||
| assert(latestDelay < TIMEOUT * 2); | ||
| // https://github.com/nodejs/node-v0.x-archive/issues/9333. | ||
| if (latestDelay > TIMEOUT * 2) { |
| // not present. However, they fail 100% of the time when the bug *is* present, | ||
| // so to increase reliability, allow for a small number of retries. (Keep it | ||
| // small because as currently written, one failure could result in multiple | ||
| // simultaneous retries of the test. Don't want to timer-bomb ourselves. |
There was a problem hiding this comment.
as currently written, one failure could result in multiple simultaneous retries of the test
Is there a way to rewrite that part of the test so that's not the case anymore?
There was a problem hiding this comment.
I think I've fixed that now. Will run another stress test, but in the meantime, PTAL. Thanks!
|
@refack Updated the message per your nit. LGTY? |
|
FreeBSD CI failure is unrelated. CI is good. @refack Is this good by you? |
|
Landed in c473d80 |
test-timers-blocking-callback may fail erroneously on resource-constrained machines due to the timing nature of the test. There is likely no way around the timing issue. This change tries to decrease the probability of the test failing erroneously by having it retry a small number of times on failure. Tested on 0.10.38 (which has a bug that this test was written for) and (modifying the test slightly to remove ES6 stuff) the test still seems to fail 100% of the time there, which is what we want/expect. PR-URL: #14831 Fixes: #14792 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>
test-timers-blocking-callback may fail erroneously on resource-constrained machines due to the timing nature of the test. There is likely no way around the timing issue. This change tries to decrease the probability of the test failing erroneously by having it retry a small number of times on failure. Tested on 0.10.38 (which has a bug that this test was written for) and (modifying the test slightly to remove ES6 stuff) the test still seems to fail 100% of the time there, which is what we want/expect. PR-URL: #14831 Fixes: #14792 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>
test-timers-blocking-callback may fail erroneously on resource-constrained machines due to the timing nature of the test. There is likely no way around the timing issue. This change tries to decrease the probability of the test failing erroneously by having it retry a small number of times on failure. Tested on 0.10.38 (which has a bug that this test was written for) and (modifying the test slightly to remove ES6 stuff) the test still seems to fail 100% of the time there, which is what we want/expect. PR-URL: #14831 Fixes: #14792 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>
test-timers-blocking-callback may fail erroneously on resource-constrained machines due to the timing nature of the test. There is likely no way around the timing issue. This change tries to decrease the probability of the test failing erroneously by having it retry a small number of times on failure. Tested on 0.10.38 (which has a bug that this test was written for) and (modifying the test slightly to remove ES6 stuff) the test still seems to fail 100% of the time there, which is what we want/expect. PR-URL: #14831 Fixes: #14792 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>
test-timers-blocking-callback may fail erroneously on
resource-constrained machines due to the timing nature of the test.
There is likely no way around the timing issue. This change tries to
decrease the probability of the test failing erroneously by having it
retry a small number of times on failure.
Tested on 0.10.38 (which has a bug that this test was written for) and
(modifying the test slightly to remove ES6 stuff) the test still seems
to fail 100% of the time there, which is what we want/expect.
Fixes: #14792
Checklist
make -j4 test(UNIX), orvcbuild test(Windows) passesAffected core subsystem(s)
test timers