This is implemented in an extremely hacky way due to poor DBIx feature
support. Ideally, what we'd need is a way to tell DBIx to ignore the
errormsg column unless explicitly requested, and to automatically add a
computed 'errormsg IS NULL' column otherwise. Since DBIx does not support
that, this commit instead hacks some support via method overrides while
taking care to not break anything obvious.
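The intended behaviour (skip errormsg by default, expose a computed
'errormsg IS NULL' column instead) can be sketched outside of DBIx. This is
only an illustration in Python with sqlite3, not the Perl method overrides
used in this commit; the `jobsets` schema, `build_select`, and the
`errormsg_is_null` alias are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema for illustration; not Hydra's actual table definition.
ALL_COLUMNS = ["id", "name", "errormsg"]
LARGE_COLUMN = "errormsg"


def build_select(table, include_large=False):
    """Build a SELECT that skips the large column unless it is requested."""
    if include_large:
        columns = list(ALL_COLUMNS)
    else:
        columns = [c for c in ALL_COLUMNS if c != LARGE_COLUMN]
        # Computed column: tells callers whether an error exists without
        # transferring the (possibly large) message itself.
        columns.append(f"{LARGE_COLUMN} IS NULL AS errormsg_is_null")
    return f"SELECT {', '.join(columns)} FROM {table} ORDER BY id"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobsets (id INTEGER PRIMARY KEY, name TEXT, errormsg TEXT)")
conn.executemany(
    "INSERT INTO jobsets (id, name, errormsg) VALUES (?, ?, ?)",
    [(1, "ok-jobset", None), (2, "broken-jobset", "evaluation failed")],
)

# Default query: no error text, only a flag saying whether one exists.
default_rows = conn.execute(build_select("jobsets")).fetchall()
# Explicit request: the full error text is fetched.
full_rows = conn.execute(build_select("jobsets", include_large=True)).fetchall()
```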
It seemed there was no self-contained end-to-end test actually doing
this?!
Among other things, this will help ensure that the switch-over to
`nix-eval-jobs` is correct.
Original commit message:
> There are some known regressions regarding local testing setups - since
> everything was kinda half written with the expectation that build dir =
> source dir (which should not be true anymore). But everything builds and
> the test suite runs fine, after several hours spent debugging random
> crashes in libpqxx with MALLOC_PERTURB_...
I have not experienced regressions with local testing.
(cherry picked from commit 4b886d9c45cd2d7fe9b0a8dbc05c7318d46f615d)
We've seen many failures on ofborg; a lot of them ultimately appear to come down to
a timeout being hit, resulting in something like this:
Failure executing slapadd -F /<path>/slap.d -b dc=example -l /<path>/load.ldif.
Hopefully this resolves it for most cases.
I've done some endurance testing and this helps a lot.
Some other commands also regularly time out under high load:
- hydra-init
- hydra-create-user
- nix-store --delete
This should address most issues with tests randomly failing.
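For reference, the general idea of giving a slow command a generous,
explicit timeout and reporting a timeout distinctly from an ordinary
failure might look like the sketch below. It is written in Python for
illustration only (the test suite itself is Perl); `run_with_timeout` and
its return shape are assumptions, not code from this commit.

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return (ok, detail); a timeout is reported, not raised."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return False, f"timed out after {timeout_seconds}s: {' '.join(cmd)}"
    if result.returncode != 0:
        return False, f"exit {result.returncode}: {result.stderr.strip()}"
    return True, result.stdout
```

Keeping the timeout message separate from the exit-code message makes it
easier to tell load-related flakiness apart from real failures.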
Used the following script for endurance testing:
```
import os
import subprocess

run_counter = 0
fail_counter = 0

while True:
    try:
        run_counter += 1
        print(f"Starting run {run_counter}")
        # Copy the environment so the parent process is left untouched.
        env = os.environ.copy()
        env["YATH_JOB_COUNT"] = "20"
        result = subprocess.run(["perl", "t/test.pl"], env=env)
        if result.returncode != 0:
            fail_counter += 1
        print(f"Finish run {run_counter}, total fail count: {fail_counter}")
    except KeyboardInterrupt:
        # Ctrl-C ends the endurance run and prints the summary.
        print(f"Finished {run_counter} runs with {fail_counter} fails")
        break
```
In case someone else wants to do it on their system :).
Note that YATH_JOB_COUNT may need to be adjusted, loosely based on your
core count.
I only have 4 cores (8 threads), so for others higher numbers might
yield better results in flushing out unstable tests.
* Let tests themselves intentionally leak temp dir
By default Yath will clean up temporary files, so the result is the
same. But `--keep-dirs` can be passed to `yath test` telling Yath to
*not* clean them up instead. This is very useful for debugging.
* Update t/lib/HydraTestContext.pm
Co-authored-by: Cole Helbling <cole.e.helbling@outlook.com>
This is necessary because jobset and project names are not allowed to
begin with a digit, and yet the generated jobset and project names would
do just that.
Not the most elegant solution, but it works.
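One way to avoid the leading-digit problem is to put a fixed letter in
front of the random part. The sketch below is in Python for illustration
and is not necessarily what this commit does; `random_name`, the `t`
prefix, and the `NAME_RE` pattern are assumptions (the pattern only
approximates the naming rule described above).

```python
import re
import string
import uuid

# Assumed rule for this sketch: a name must start with an ASCII letter.
NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")


def random_name(prefix="t"):
    """Return a random name that never begins with a digit."""
    if not prefix or prefix[0] not in string.ascii_letters:
        raise ValueError("prefix must start with an ASCII letter")
    # uuid4().hex alone may start with 0-9, hence the letter prefix.
    return prefix + uuid.uuid4().hex
```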
This might, hopefully, I don't know, possibly force the
database to live a little while longer and *reduce* but not
eliminate errors around stopping the database before we lose all
our DB::PG handles to it.
At the moment, the jobset object is unlikely to actually retrieve the
evaluation error output, because it isn't refreshed after
hydra-eval-jobsets is run.
Explicitly calling DBIx::Class::Row->discard_changes causes any updated
data to be refreshed, at the cost of losing any not-yet committed
changes to the row.
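The stale-object problem and the effect of a refresh can be illustrated
with a toy row class. This is a Python/sqlite3 analogy, not DBIx::Class
itself; the `Row` class, the `jobsets` schema, and the method body are
assumptions that only mimic the behaviour described above.

```python
import sqlite3


class Row:
    """Tiny stand-in for an ORM row object: values are cached when loaded."""

    def __init__(self, conn, row_id):
        self.conn = conn
        self.row_id = row_id
        self.discard_changes()

    def discard_changes(self):
        # Re-read the row from the database, overwriting any local values.
        cur = self.conn.execute(
            "SELECT name, errormsg FROM jobsets WHERE id = ?", (self.row_id,)
        )
        self.name, self.errormsg = cur.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobsets (id INTEGER PRIMARY KEY, name TEXT, errormsg TEXT)")
conn.execute("INSERT INTO jobsets (id, name, errormsg) VALUES (1, 'main', NULL)")

jobset = Row(conn, 1)

# Something else (standing in for hydra-eval-jobsets) updates the row.
conn.execute("UPDATE jobsets SET errormsg = 'evaluation failed' WHERE id = 1")

stale = jobset.errormsg            # the cached value is still the old one
jobset.name = "renamed-locally"    # an unsaved local change
jobset.discard_changes()
fresh = jobset.errormsg            # now reflects the update
```

After the refresh the local rename is gone as well, which is the cost
mentioned above.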
Set `dest_store` in the test hydra config, so that the testsuite ensures
that the distinction between the local store and the destination store
is properly taken into account.
Fix #938
By moving the tests subdirectory to t, we gain the ability to run
`yath test` with no arguments from inside `nix develop` in the root of
the repo.
(`nix develop` is necessary in order to set the proper env vars for
`yath` to find our test libraries.)