When people reach out to the git repository they probably want to use
hydra from the same source.
This also removes the need for an overlay with simpler and more
performant direct use of the nixpkgs passed in. Before it was
re-importing nixpkgs.
test
There is an overlay for the `hydra` name, but `hydra_unstable` was used, which can refer to the nixpkgs package and lead to and outdated hydra version and requires configuring the correct package attribute downstream.
In my system logs I see this every time a new eval starts:
```
hydra-evaluator[PID]: hint: Using 'master' as the name for the initial branch. This default branch name
hydra-evaluator[PID]: hint: is subject to change. To configure the initial branch name to use in all
hydra-evaluator[PID]: hint: of your new repositories, which will suppress this warning, call:
hydra-evaluator[PID]: hint:
hydra-evaluator[PID]: hint: git config --global init.defaultBranch <name>
hydra-evaluator[PID]: hint:
hydra-evaluator[PID]: hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hydra-evaluator[PID]: hint: 'development'. The just-created branch can be renamed via this command:
hydra-evaluator[PID]: hint:
hydra-evaluator[PID]: hint: git branch -m <name>
```
This ensures this hint is not logged anymore and unclutters the syslog.
I presume it does not really matter what name is chosen for the branch.
See https://github.com/NixOS/hydra/pull/1414#issuecomment-2412350929
The variable is defined in src/lib/Hydra/Helper/Nix.pm
Error message without this patch:
```
hydra-evaluator[PID]: Couldn't require Hydra::Plugin::S3Backup : Global symbol "$MACHINE_LOCAL_STORE" requires explicit package name (did you forget to declare "my $MACHINE_LOCAL_STORE"?) at /nix/store/xxx-hydra-0-unstable-2024-09-24/libexec/hydra/lib/Hydra/Plugin/S3Backup.pm line 95.
hydra-evaluator[PID]: Compilation failed in require at /nix/store/xxx-hydra-perl-deps/lib/perl5/site_perl/5.38.2/Module/Runtime.pm line 314.
hydra-evaluator[PID]: at /nix/store/xxx-hydra-perl-deps/lib/perl5/site_perl/5.38.2/Module/Pluggable.pm line 32.
```
These are required for the `signString` and `readFile` subroutines used when signing NARs.
(cherry picked from commit b94a7b6d5c56362af9ea85d944f8454d861ec001)
We should look into how to resolve this, but I tried some things and nothing really worked.
Let's put it skipped for now until someone comes along to improve it.
Only log issues/failures when something's actually up.
It has irked me for a long time that so much output came
out of running the tests, this seems to silence it.
It does hide some warnings, but I think it makes the output
so much more readable that it's worth the tradeoff.
Helps for highly parallel running of jobs, sometimes they'd not give output for a while.
Setting this timeout higher appears to help.
Not completely sure if this is the right place to do it, but it works fine for me.
We've seen many fails on ofborg, at lot of them ultimately appear to come down to
a timeout being hit, resulting in something like this:
Failure executing slapadd -F /<path>/slap.d -b dc=example -l /<path>/load.ldif.
Hopefully this resolves it for most cases.
I've done some endurance testing and this helps a lot.
some other commands also regularly time-out with high load:
- hydra-init
- hydra-create-user
- nix-store --delete
This should address most issues with tests randomly failing.
Used the following script for endurance testing:
```
import os
import subprocess
run_counter = 0
fail_counter = 0
while True:
try:
run_counter += 1
print(f"Starting run {run_counter}")
env = os.environ
env["YATH_JOB_COUNT"] = "20"
result = subprocess.run(["perl", "t/test.pl"], env=env)
if (result.returncode != 0):
fail_counter += 1
print(f"Finish run {run_counter}, total fail count: {fail_counter}")
except KeyboardInterrupt:
print(f"Finished {run_counter} runs with {fail_counter} fails")
break
```
In case someone else wants to do it on their system :).
Note that YATH_JOB_COUNT may need to be changed loosely based on your
cores.
I only have 4 cores (8 threads), so for others higher numbers might
yield better results in hashing out unstable tests.