Compare commits

41 Commits

Author SHA1 Message Date
John Ericson
33a935e8ef Queue-runner: Always produce a machines JSON object
Even if there are no machines, there should at least be an empty object.
2025-04-09 11:31:47 -04:00
Pierre Bourdon
65618fd590 web: replace 'errormsg' with 'errormsg IS NULL' in most cases
This is implemented in an extremely hacky way due to poor DBIx feature
support. Ideally, what we'd need is a way to tell DBIx to ignore the
errormsg column unless explicitly requested, and to automatically add a
computed 'errormsg IS NULL' column in others. Since it does not support
that, this commit instead hacks some support via method overrides while
taking care to not break anything obvious.
2025-04-09 11:31:47 -04:00
Pierre Bourdon
06ba54fca7 queue-runner: release machine reservation while copying outputs
This allows for better builder usage when the queue runner is busy. To
avoid running into uncontrollable imbalances between builder/queue
runner, we only release the machine reservation after the local
throttler has found a slot to start copying the outputs for that build.

As opposed to asserting uniqueness to understand resource utilization,
we just switch to using `std::unique_ptr`.
2025-04-09 11:31:47 -04:00
Jörg Thalheim
5b9c22dd18 bump nixpkgs 2025-04-09 11:31:47 -04:00
K900
e15070c6c2 Add metric for builds waiting for download slot
(cherry picked from commit f23ec71227911891807706b6b978836e4d80edde)
2025-04-09 11:31:47 -04:00
Jörg Thalheim
37744c7018 don't build hydra twice in a pull request + enable merge queue 2025-04-09 11:31:47 -04:00
Pierre Bourdon
1e3929e75f queue-runner: switch to pseudorandom ordering of builds processing
We don't rely on sequential / monotonic build IDs processing anymore, so
randomizing actually has the advantage of mixing builds for different
systems together, to avoid only one chunk of builds for a single system
getting processed while builders for other systems are starved.
2025-04-09 11:31:47 -04:00
Pierre Bourdon
28da0a705f queue runner: introduce some parallelism for remote paths lookup
Each output for a given step being ingested is looked up in parallel,
which should basically multiply the speed of builds ingestion by the
average number of outputs per derivation.
2025-04-09 11:31:47 -04:00
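As a rough illustration of the idea in the commit above (not the queue runner's actual C++ code), the sketch below looks up all outputs of one step concurrently. `lookup_remote_path` is a hypothetical stand-in that only simulates network latency, and the example store paths are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def lookup_remote_path(path: str) -> bool:
    """Hypothetical stand-in for a remote store query; sleeps to mimic latency."""
    time.sleep(0.05)
    return path.endswith("-out")


def lookup_outputs(paths: list[str]) -> dict[str, bool]:
    """Look up every output of one step concurrently, one worker per output."""
    if not paths:
        return {}
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = list(pool.map(lookup_remote_path, paths))
    return dict(zip(paths, results))


# Made-up paths for illustration only.
example = lookup_outputs(["/nix/store/aaa-hello-out", "/nix/store/bbb-hello-dev"])
```

With N outputs the wall-clock time is roughly that of one lookup rather than N, which is where the "multiply by the average number of outputs" estimate comes from.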
Pierre Bourdon
2050b2c324 queue-runner: reduce the time between queue monitor restarts
This will induce more DB queries (though these are fairly cheap), but with
the benefit of processing bumps within 1m instead of within 10m.
2025-04-09 11:31:47 -04:00
Pierre Bourdon
21d6d805ba queue-runner: remove id > X from new builds query
Running the query with/without it shows that it makes no difference to
postgres, since there's an index on finished=0 already. This allows a
few simplifications, but also paves the way towards running multiple
parallel monitor threads in the future.
2025-04-09 11:31:47 -04:00
Pierre Bourdon
478bb01f7f queue-runner: add prom metrics to allow detecting internal bottlenecks
By looking at the ratio of running vs. waiting for the dispatcher and
the queue monitor, we should get better visibility into what hydra is
currently bottlenecked on.

There are other side effects we can try to measure to get to the same
result, but having a simple way doesn't cost us much.
2025-04-09 11:31:47 -04:00
Pierre Bourdon
08bf31b71a queue-runner: limit parallelism of CPU intensive operations
My current theory is that running more parallel xz than available CPU
cores is reducing our overall throughput by requiring more scheduling
overhead and more cache thrashing.
2025-04-09 11:31:47 -04:00
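A minimal sketch of this kind of throttling, written in Python rather than the queue runner's C++. The limit of one slot per CPU core is an assumption; the commit message does not state the value Hydra uses. The names `CPU_SLOTS`, `compress_throttled` and `compress_many` are hypothetical.

```python
import lzma
import os
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumption: one compression slot per CPU core.
CPU_SLOTS = os.cpu_count() or 1
_cpu_gate = threading.BoundedSemaphore(CPU_SLOTS)


def compress_throttled(data: bytes) -> bytes:
    """Compress with xz, but only while holding one of CPU_SLOTS slots."""
    with _cpu_gate:
        return lzma.compress(data)


def compress_many(blobs: list[bytes], workers: int = 32) -> list[bytes]:
    """Many threads may be in flight, yet at most CPU_SLOTS compress at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_throttled, blobs))
```

Threads beyond the limit block on the semaphore instead of competing for cores, which is the scheduling and cache pressure the commit message hypothesizes about.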
Pierre Bourdon
641056bd0e web: Skip System on /machines
It is redundant
2025-04-09 11:31:47 -04:00
Jörg Thalheim
29a7ab8009 test/gitea: fix eval 2025-04-09 11:31:47 -04:00
John Ericson
eddc234915 Fix evaluation of NixOS tests, avoid with 2025-04-09 11:31:47 -04:00
Maximilian Bosch
80f917d8fa readIntoSocket: fix with store URIs containing an &
The third argument to `open()` in `-|` mode is passed to a shell if it's
a string. In my case the store URI contains
`?secret-key=${signingKey.directory}/secret&compression=zstd`

For the `nix store cat` case this means that

* everything up to the `&` is started as a background process. This fails
  immediately because no path to cat is specified.
* `compression=zstd` is treated as a variable assignment
* the shell attempts to execute the `$path` argument to `store cat` as
  another command

Passing just the list solves the problem.

(cherry picked from commit 3ee51dbe589458cc54ff753317bbc6db530bddc0)
2025-04-09 11:31:47 -04:00
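The same string-versus-list distinction exists in Python's `subprocess`, so the failure mode can be reproduced outside Perl. This is an analogue under stated assumptions, not Hydra's code: `printf` stands in for `nix store cat`, and the URI and store path are made-up placeholders.

```python
import subprocess

# Placeholder values; the store path is not expected to exist.
store_uri = "file:///tmp/cache?secret-key=/keys/secret&compression=zstd"
store_path = "/nix/store/placeholder-hello"


def run_as_string(uri: str, path: str) -> subprocess.CompletedProcess:
    """String form: the whole line is handed to /bin/sh, so `&` splits it."""
    return subprocess.run(f"printf '%s\\n' {uri} {path}", shell=True,
                          capture_output=True, text=True)


def run_as_list(uri: str, path: str) -> subprocess.CompletedProcess:
    """List form: each element reaches the program verbatim, no shell involved."""
    return subprocess.run(["printf", "%s\\n", uri, path],
                          capture_output=True, text=True)
```

In the list form both arguments arrive intact. In the string form the shell backgrounds the part before `&`, so the program never sees the full URI or the path.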
git@71rd.net
5cb82812f2 Stream files from store instead of buffering them
When an artifact is requested from hydra the output is first copied
from the nix store into memory and then sent as a response, delaying
the download and taking up significant amounts of memory.

As reported in https://github.com/NixOS/hydra/issues/1357

Instead of calling a command and blocking while reading in the entire
output, this adds read_into_socket(). The function takes a
command, starts a subprocess with that command, and returns a file
descriptor attached to its stdout.
This file descriptor is then used by Catalyst's response builder to stream
the output directly.

(cherry picked from commit 459aa0a5983a0bd546399c08231468d6e9282f54)
2025-04-09 11:31:47 -04:00
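The buffering-versus-streaming difference can be sketched in Python (Hydra's implementation is Perl on Catalyst; this is only an analogue). `stream_command` is a hypothetical name, and the child process here is a small Python one-liner standing in for the store command.

```python
import subprocess
import sys
from typing import Iterator


def stream_command(command: list[str], chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the command's stdout chunk by chunk instead of reading it all.

    Memory use stays bounded by chunk_size regardless of output size.
    """
    # The context manager closes the pipe and waits for the child on exit.
    with subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        while chunk := proc.stdout.read(chunk_size):
            yield chunk


# Usage: consume 200 kB produced by a child process without holding it all.
demo = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'x' * 200_000)"]
total = sum(len(chunk) for chunk in stream_command(demo))
```

A web framework would write each chunk to the client socket as it arrives, so the download starts immediately rather than after the whole artifact has been copied into memory.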
ajs124
17094c8371 lazy-load evaluation errors
Closes #1362
2025-04-09 11:31:47 -04:00
Maximilian Bosch
d5fb163618 Only show stepname if it doesn't equal the name of the drv
When building e.g. nixpkgs, the "Running builds" view will mostly look
like this

    hello.x86_64-linux (Build of hello-X.Y)
    exa.x86_64-linux (Build of exa-X.Y)
    ...

This doesn't provide any useful information. Showing the step name only
makes sense if it's not a child of the job's derivation. With this
patch, that information will only be shown if the drv name (i.e. w/o
`/nix/store/` prefix, .drv ext & hash) is not equal to the drv name of
the job itself (build.nixname).
2025-04-09 11:31:47 -04:00
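A small sketch of the rule described above, in Python rather than Hydra's Perl. It relies on the assumptions the commit messages themselves state: paths start with `/nix/store/` and the hash contains no `-`. The function names and example paths are hypothetical.

```python
import re
from typing import Optional

# The first "-" after the prefix ends the hash; a trailing ".drv" is dropped.
_DRV_RE = re.compile(r"^/nix/store/[^-/]+-(?P<name>.+?)(?:\.drv)?$")


def drv_name(drvpath: str) -> str:
    """Strip the /nix/store/ prefix, the hash and the .drv extension."""
    match = _DRV_RE.match(drvpath)
    # Fall back to the raw path if it does not look like a store path.
    return match.group("name") if match else drvpath


def step_label(step_drvpath: str, build_nixname: str) -> Optional[str]:
    """Return the step name only if it differs from the job's own drv name."""
    name = drv_name(step_drvpath)
    return None if name == build_nixname else name
```

For a job whose own derivation is being built, `step_label` returns `None` and nothing extra is shown; for a dependency such as a kernel build inside a machine configuration, the dependency's name is returned.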
Maximilian Bosch
baec2bbb4c Running builds view: show build step names
When using Hydra to build machine configurations, you'll often see
"nixosConfigurations.foo" five times, i.e. for each build step being
run. This isn't very helpful I think because in such a case, a single
build step can also be compiling the Linux kernel.

This change also fetches the `drvpath` and `type` from the `buildsteps`
relation. We're already joining it, so this doesn't make much difference
(confirmed via query logging that this doesn't cause extra SQL queries).

Unfortunately build steps don't have a human readable name, so I'm
deriving it from the drvpath by stripping away the hash (assuming that
it'll never contain a `-` and that `/nix/store/` is used as prefix). I
decided against using the Nix bindings for that to avoid too much
overhead due to store operations for each build step.
2025-04-09 11:31:47 -04:00
Maximilian Bosch
b55bd25581 Make "timed out" and "log limit exceeded" builds aborted
In 73694087a088ed4481b4ab268a03351b1bcaac3c I gave builds that failed
because of a timeout or exceeded log limit a stop sign and I stand by
that reasoning: with that it's possible to distinguish between actual
build failures and rather transient things such as timeouts.

Back then I considered it a feature that these are shown in a different
tab, but I don't think that's a good idea anymore. When using a jobset to
e.g. track the regressions from a mass rebuild (like a compiler or gcc
update), "Newly failed builds" should exclusively display regressions (and
flaky builds of course, not much I can do about that).

Also, when a bunch of builds fail in such a jobset because of e.g. a
broken connection to a builder that results in a timeout, I want to be
able to restart them all w/o rebuilding actual regressions.

To make it clear that we not only have "Aborted" builds in the tab, I
renamed the label to "Aborted / Timed out".
2025-04-09 11:31:47 -04:00
Pierre Bourdon
1ca17faed4 web: include current step status on /machines 2025-04-09 11:31:47 -04:00
John Ericson
9c022848cf Fix the build 2025-04-09 11:31:47 -04:00
John Ericson
f58a752419 Fix Nix code
Can now at least enter dev shell, but build is still broken.
2025-04-09 11:31:47 -04:00
John Ericson
0769853dec flake.lock: Update to nix and nix-eval-jobs 2.28
Flake lock file updates:

• Updated input 'nix':
    'github:NixOS/nix/d0f98c76f962147610489e84c10033ca92e9c532?narHash=sha256-u6RhBWQ1XohTZ4Ub5ml1PTcaxQgtqFNng6Sohy1rojw%3D' (2025-04-07)
  → 'github:NixOS/nix/a4962f73b5fc874d4b16baef47921daf349addfc?narHash=sha256-r%2BpsCOW77vTSTNbxTVrYHeh6OgB0QukbnyUVDwg8s4I%3D' (2025-04-07)
• Updated input 'nix-eval-jobs':
    'github:nix-community/nix-eval-jobs/62f9c9e8d00d2ff6ab27a6197ab459a8e0808e59?narHash=sha256-PypQspB7h7EENe4RQQUQj2Ay8J1%2BO49AKNO9JbAU4Ek%3D' (2025-04-07)
  → 'github:nix-community/nix-eval-jobs/cba718bafe5dc1607c2b6761ecf53c641a6f3b21?narHash=sha256-v5n6t49X7MOpqS9j0FtI6TWOXvxuZMmGsp2OfUK5QfA%3D' (2025-04-07)
2025-04-09 11:31:47 -04:00
John Ericson
21c6afa83b Fix build (due to C++ API changes) 2025-04-09 11:31:47 -04:00
John Ericson
1022514027 flake.lock: Update to nix and nix-eval-jobs 2.27
Flake lock file updates:

• Updated input 'nix':
    'github:NixOS/nix/e310c19a1aeb1ce1ed4d41d5ab2d02db596e0918?narHash=sha256-q/RgA4bB7zWai4oPySq9mch7qH14IEeom2P64SXdqHs%3D' (2025-02-18)
  → 'github:NixOS/nix/d0f98c76f962147610489e84c10033ca92e9c532?narHash=sha256-u6RhBWQ1XohTZ4Ub5ml1PTcaxQgtqFNng6Sohy1rojw%3D' (2025-04-07)
• Updated input 'nix-eval-jobs':
    'github:nix-community/nix-eval-jobs/f7418fc1fa45b96d37baa95ff3c016dd5be3876b?narHash=sha256-Lo4KFBNcY8tmBuCmEr2XV0IUZtxXHmbXPNLkov/QSU0%3D' (2025-03-26)
  → 'github:nix-community/nix-eval-jobs/62f9c9e8d00d2ff6ab27a6197ab459a8e0808e59?narHash=sha256-PypQspB7h7EENe4RQQUQj2Ay8J1%2BO49AKNO9JbAU4Ek%3D' (2025-04-07)
2025-04-09 11:31:47 -04:00
Jörg Thalheim
2d4232475c gitignore hydra-data as created by foreman 2025-04-09 11:31:47 -04:00
Jörg Thalheim
d799742057 fix development workflow after switching to meson-based build 2025-04-09 11:31:47 -04:00
Robin Stumm
485aa93f2d hydra-eval-jobset: do not wait on n-e-j inside transaction
fixes #1429
2025-04-09 11:31:47 -04:00
Josef Kemetmüller
590e8d8511 Fix rendering of metrics with special characters
My main motivation here is to get metrics with brackets to work in order
to support "pytest" test names:

- test_foo.py::test_bar[1]
- test_foo.py::test_bar[2]

I couldn't find an "HTML escape"-style function that would generate
valid html `id` attribute names from random strings, so I went with a
hash digest instead.
2025-04-09 11:31:47 -04:00
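The digest approach can be sketched as follows. This is a Python illustration, not the code from the commit: the choice of SHA-256, the 16-character truncation and the `metric-` prefix are all assumptions made for the example.

```python
import hashlib
import re


def metric_dom_id(metric_name: str) -> str:
    """Map an arbitrary metric name to a string that is safe as an HTML id.

    The hex digest contains only [0-9a-f]; the prefix guarantees the id
    starts with a letter.
    """
    digest = hashlib.sha256(metric_name.encode("utf-8")).hexdigest()
    return f"metric-{digest[:16]}"


id_one = metric_dom_id("test_foo.py::test_bar[1]")
id_two = metric_dom_id("test_foo.py::test_bar[2]")
```

The mapping is deterministic, so the same metric always gets the same id, and characters such as `[`, `]` and `:` never reach the attribute value.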
Maximilian Bosch
90a8a0d94a Reimplement (named) constituent jobs (+globbing) based on nix-eval-jobs
Depends on https://github.com/nix-community/nix-eval-jobs/pull/349 & #1421.

Almost equivalent to #1425, but with a small change: when having e.g. an
aggregate job with a glob that matches nothing, the jobset evaluation is
failed now. This was the intended behavior before (hydra-eval-jobset
fails hard if an aggregate is broken), the code-path was never reached
however since the aggregate was never marked as broken in this case
before.
2025-04-09 11:31:47 -04:00
zowoq
eb17619ee5 flake.lock: Update
Flake lock file updates:

• Updated input 'nix-eval-jobs':
    'github:nix-community/nix-eval-jobs/4b392b284877d203ae262e16af269f702df036bc?narHash=sha256-3wIReAqdTALv39gkWXLMZQvHyBOc3yPkWT2ZsItxedY%3D' (2025-02-14)
  → 'github:nix-community/nix-eval-jobs/f7418fc1fa45b96d37baa95ff3c016dd5be3876b?narHash=sha256-Lo4KFBNcY8tmBuCmEr2XV0IUZtxXHmbXPNLkov/QSU0%3D' (2025-03-26)
2025-04-09 11:31:47 -04:00
zowoq
ebefdb0a3d hydraTest: remove outdated postgresql version
error: postgresql_12 has been removed since it reached its EOL upstream
2025-04-09 11:31:47 -04:00
Martin Weinelt
55349930f1 Fix race condition in hydra-compress-logs 2025-04-09 11:31:47 -04:00
John Ericson
847a8ae6cd Revert "Use LegacySSHStore"
There were some hangs caused by this. Need to fix them, ideally
reproducing the issue in a test, before trying this again.

This reverts commit 4a4a0f901c70676ee47f830d2ff6a72789ba1baf.
2025-04-09 11:31:47 -04:00
86d0009448
add declarative hydra spec 2025-04-01 15:02:44 -04:00
a20f37b97f
add gitea refs
Signed-off-by: ahuston-0 <aliceghuston@gmail.com>
Reviewed-on: https://<censored>/ahuston-0/hydra/pulls/1
2025-03-31 14:52:51 -04:00
a94f84118c
add Gitea pulls docs entry
Signed-off-by: ahuston-0 <aliceghuston@gmail.com>
2025-03-31 14:52:51 -04:00
Faye Chun
99e3ad325c
Merge branch 'NixOS:master' into add-gitea-pulls 2025-03-01 22:04:13 -05:00
Faye Chun
2f1fa2b069
Add a plugin to poll Gitea pull requests
Based off the existing GithubPulls.pm and GitlabPulls.pm plugins.

Also adds an integration test for the new 'giteapulls' input type to
the existing 'gitea' test.
2024-12-21 08:02:57 -05:00
2 changed files with 125 additions and 0 deletions

90
hydra/jobsets.nix Normal file

@@ -0,0 +1,90 @@
{ pulls, branches, ... }:
let
  # create the json spec for the jobset
  makeSpec =
    contents:
    builtins.derivation {
      name = "spec.json";
      system = "x86_64-linux";
      preferLocalBuild = true;
      allowSubstitutes = false;
      builder = "/bin/sh";
      args = [
        (builtins.toFile "builder.sh" ''
          echo "$contents" > $out
        '')
      ];
      contents = builtins.toJSON contents;
    };
  prs = readJSONFile pulls;
  refs = readJSONFile branches;
  # template for creating a job
  makeJob =
    {
      schedulingshares ? 10,
      keepnr ? 3,
      description,
      flake,
      enabled ? 1,
    }:
    {
      inherit
        description
        flake
        schedulingshares
        keepnr
        enabled
        ;
      type = 1;
      hidden = false;
      checkinterval = 300; # every 5 minutes
      enableemail = false;
      emailoverride = "";
    };
  giteaHost = "ssh://gitea@nayeonie.com:2222";
  repo = "ahuston-0/hydra";
  # Create a hydra job for a branch
  jobOfRef =
    name:
    { ref, ... }:
    if ((builtins.match "^refs/heads/(.*)$" ref) == null) then
      null
    else
      {
        name = builtins.replaceStrings [ "/" ] [ "-" ] "branch-${name}";
        value = makeJob {
          description = "Branch ${name}";
          flake = "git+${giteaHost}/${repo}?ref=${ref}";
        };
      };
  # Create a hydra job for a PR
  jobOfPR = id: info: {
    name = if info.draft then "draft-${id}" else "pr-${id}";
    value = makeJob {
      description = "PR ${id}: ${info.title}";
      flake = "git+${giteaHost}/${repo}?ref=${info.head.ref}";
      enabled = info.state == "open";
    };
  };
  # some utility functions
  # converts json to name/value dicts
  attrsToList = l: builtins.attrValues (builtins.mapAttrs (name: value: { inherit name value; }) l);
  # wrapper function for reading json from file
  readJSONFile = f: builtins.fromJSON (builtins.readFile f);
  # remove null values from a set, in-case of branches that don't exist
  mapFilter = f: l: builtins.filter (x: (x != null)) (map f l);
  # Create job set from PRs and branches
  jobs = makeSpec (
    builtins.listToAttrs (map ({ name, value }: jobOfPR name value) (attrsToList prs))
    // builtins.listToAttrs (mapFilter ({ name, value }: jobOfRef name value) (attrsToList refs))
  );
in
{
  jobsets = jobs;
}

35
hydra/spec.json Normal file

@@ -0,0 +1,35 @@
{
  "enabled": 1,
  "hidden": false,
  "description": "ahuston-0's fork of hydra",
  "nixexprinput": "nixexpr",
  "nixexprpath": "hydra/jobsets.nix",
  "checkinterval": 60,
  "schedulingshares": 100,
  "enableemail": false,
  "emailoverride": "",
  "keepnr": 3,
  "type": 0,
  "inputs": {
    "nixexpr": {
      "value": "ssh://gitea@nayeonie.com:2222/ahuston-0/hydra.git add-gitea-pulls",
      "type": "git",
      "emailresponsible": false
    },
    "nixpkgs": {
      "value": "https://github.com/NixOS/nixpkgs nixos-unstable",
      "type": "git",
      "emailresponsible": false
    },
    "pulls": {
      "type": "giteapulls",
      "value": "nayeonie.com ahuston-0 hydra https",
      "emailresponsible": false
    },
    "branches": {
      "type": "gitea_refs",
      "value": "nayeonie.com ahuston-0 hydra heads https -",
      "emailresponsible": false
    }
  }
}