Commit Graph

305 Commits

Author SHA1 Message Date
d571e44b86 Keep stats for the Hydra auto scaler
"hydra-queue-runner --status" now prints how many runnable and running
build steps exist for each machine type. This allows additional
machines to be provisioned based on the Hydra load.
2015-08-17 13:50:41 +02:00
d4759c1da2 hydra-queue-runner: Detect changes to the scheduling shares 2015-08-12 13:17:56 +02:00
576dc0c120 For completeness, re-implement meta.schedulingPriority 2015-08-12 12:05:43 +02:00
b7965df928 Load the queue in order of global priority 2015-08-11 02:14:34 +02:00
97f11baa8d Revive jobset scheduling
(I.e. taking the jobset scheduling share into account.)
2015-08-11 01:31:56 +02:00
eb13007fe6 Allow build to be bumped to the front of the queue via the web interface
Builds now have a "Bump up" action. This will cause the queue runner
to prioritise the steps of the build above all other steps.
2015-08-10 16:19:47 +02:00
27182c7c1d Start steps in order of ascending build ID 2015-08-10 16:19:47 +02:00
593850b956 Fix potential race in dispatcher wakeup 2015-08-10 12:54:55 +02:00
6a1c950e94 Unindent 2015-08-10 11:33:22 +02:00
f21b88e388 Remove superfluous check 2015-08-07 04:20:34 +02:00
f1fbf8c605 Fix race in finishing builds that have been cancelled 2015-08-07 04:18:48 +02:00
ff3f5eb4d8 Fix remote building on Nix 1.10 2015-07-31 03:41:55 +02:00
5b9a288123 Workaround for RemoteStore not supporting cmdBuildDerivation yet 2015-07-31 03:39:20 +02:00
4d26546d3c Add support for tracking custom metrics
Builds can now emit metrics that Hydra will store in its database and
render as time series via flot charts. Typical applications are to
keep track of performance indicators, coverage percentages, artifact
sizes, and so on.

For example, a coverage build can emit the coverage percentage as
follows:

  echo "lineCoverage $pct %" > $out/nix-support/hydra-metrics

Graphs of all metrics for a job can be seen at

  http://.../job/<project>/<jobset>/<job>#tabs-charts

Specific metrics are also visible at

  http://.../job/<project>/<jobset>/<job>/metric/<metric>

The latter URL also allows getting the data in JSON format (e.g. via
"curl -H 'Accept: application/json'").
2015-07-31 00:57:30 +02:00
c18fb0ad74 Temporarily disable machines after a connection failure 2015-07-21 15:58:47 +02:00
7e026d35f7 Split hydra-queue-runner.cc more 2015-07-21 15:14:17 +02:00
5370be9f52 hydra-queue-runner: Use cmdBuildDerivation
See 1511aa9f48 and eda2f36c2a.
2015-07-21 01:54:24 +02:00
3ded87329d Keep track of how many threads are waiting 2015-07-10 19:10:14 +02:00
89fb723ace Notify the queue runner when a build is deleted 2015-07-08 11:43:35 +02:00
35b7c4f82b Allow only 1 thread to send a closure to a given machine at the same time
This prevents a race where multiple threads see that machine X is
missing path P, and start sending it concurrently. Nix handles this
correctly, but it's still wasteful (especially for the case where P ==
GHC).

A more refined scheme would be to have per machine, per path locks.
2015-07-07 14:06:48 +02:00
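
The per-machine lock described in 35b7c4f82b can be sketched roughly as follows. This is a minimal illustration, not the queue runner's actual code: `RemoteMachine`, `sendLock`, `presentPaths` and `sendClosure` are hypothetical names, and the "copy" is simulated by inserting into an in-memory set. The point is that whoever holds the lock sees the paths already sent by the previous holder, so nothing is sent twice.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for a build machine; not Hydra's real type.
struct RemoteMachine {
    std::string name;
    std::mutex sendLock;                  // held while a closure is copied to this machine
    std::set<std::string> presentPaths;   // simulated contents of the remote store
};

// Copy every path of `closure` that the machine is missing.
// Returns the number of paths that actually had to be sent.
std::size_t sendClosure(RemoteMachine & machine, const std::set<std::string> & closure)
{
    // Only one thread at a time may query and copy to a given machine,
    // so two threads never both decide that the same path is missing.
    std::lock_guard<std::mutex> lock(machine.sendLock);

    std::size_t sent = 0;
    for (const auto & path : closure)
        if (machine.presentPaths.insert(path).second)
            ++sent;                       // a real implementation would copy the path here
    return sent;
}
```

The finer-grained scheme mentioned in the commit message (per machine, per path locks) would replace the single `sendLock` with one lock per path, at the cost of more bookkeeping.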
16696a4aee Namespace cleanup 2015-07-07 10:29:43 +02:00
63745b8e25 Move buildRemote() into State 2015-07-07 10:25:33 +02:00
df29527531 Refactor 2015-07-07 10:17:21 +02:00
dffb629b8a Unify Hydra's NixOS module with the one used for hydra.nixos.org
In particular, the queue runner and web server now run under different
UIDs.
2015-07-02 01:01:44 +02:00
2ece42b2b9 Support preferLocalBuild
Derivations with "preferLocalBuild = true" can now be executed on
specific machines (typically localhost) by setting the mandatory system
features field to include "local". For example:

  localhost x86_64-linux,i686-linux - 10 100 - local

says that "localhost" can *only* do builds with "preferLocalBuild =
true". The speed factor of 100 will make the machine almost always win
over other machines.
2015-06-30 00:20:19 +02:00
008d610467 getQueuedBuilds(): Don't catch errors while loading a build from the queue
Otherwise we never recover from reset daemon connections, e.g.

  hydra-queue-runner[16106]: while loading build 599369: cannot start daemon worker: reading from file: Connection reset by peer
  hydra-queue-runner[16106]: while loading build 599236: writing to file: Broken pipe
  ...

The error is now handled in queueMonitor(), causing the next call to
queueMonitorLoop() to create a new connection.
2015-06-26 21:06:35 +02:00
2f4676bd97 JSONObject doesn't handle 64-bit integers 2015-06-25 16:59:48 +02:00
c6fcce3b3b Moar stats 2015-06-25 16:47:39 +02:00
18a3c3ff1c Update "make check" for the new queue runner
Also, if the machines file contains an entry for localhost, then run
"nix-store --serve" directly, without going through SSH.
2015-06-25 16:47:39 +02:00
32210905d8 Automatically reload $NIX_REMOTE_SYSTEMS when it changes
Otherwise, you'd have to restart the queue runner to add or remove
machines.
2015-06-25 16:47:25 +02:00
1a0e1eb5a0 More stats 2015-06-24 13:19:27 +02:00
3f8891b6ff Fix incorrect debug message 2015-06-23 17:53:15 +02:00
af5cbe97aa createStep(): Cache finished derivations
This gets rid of a lot of redundant calls to readDerivation().
2015-06-23 03:25:31 +02:00
681f63a382 Typo 2015-06-23 02:15:11 +02:00
524ee295e0 Fix sending notifications in the successful case 2015-06-23 02:13:06 +02:00
4db7c51b5c Rate-limit the number of threads copying closures at the same time
Having a hundred threads doing I/O at the same time is bad on magnetic
disks because of the excessive disk seeks. So allow only 4 threads to
copy closures in parallel.
2015-06-23 01:49:14 +02:00
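
One way to express the limit from 4db7c51b5c is a small counting semaphore with an RAII guard. This is a sketch under assumptions, not the code from the commit: `CopySlots` and `CopySlotGuard` are hypothetical names, and only the limit of 4 comes from the commit message.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore limiting concurrent closure copies.
class CopySlots {
    std::mutex mutex;
    std::condition_variable freed;
    unsigned int available;
public:
    explicit CopySlots(unsigned int n) : available(n) {}

    // Block until a slot is free, then take it.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex);
        freed.wait(lock, [this] { return available > 0; });
        --available;
    }

    // Take a slot only if one is free right now.
    bool tryAcquire() {
        std::lock_guard<std::mutex> lock(mutex);
        if (available == 0) return false;
        --available;
        return true;
    }

    // Return a slot and wake one waiting thread.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            ++available;
        }
        freed.notify_one();
    }
};

// RAII helper: holds one slot for the lifetime of the object.
class CopySlotGuard {
    CopySlots & slots;
public:
    explicit CopySlotGuard(CopySlots & s) : slots(s) { slots.acquire(); }
    ~CopySlotGuard() { slots.release(); }
    CopySlotGuard(const CopySlotGuard &) = delete;
    CopySlotGuard & operator=(const CopySlotGuard &) = delete;
};
```

A builder thread would create a `CopySlotGuard` on a shared `CopySlots(4)` instance just before copying a closure. The slot is returned when the guard goes out of scope, including when the copy throws.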
a317d24b29 hydra-queue-runner: Send build notifications
Since our notification plugins are written in Perl, sending
notifications from C++ requires a small Perl helper named
‘hydra-notify’.
2015-06-23 00:14:49 +02:00
5312e1209b Keep per-machine stats 2015-06-22 17:11:17 +02:00
d06366e7cf Remove obsolete comment 2015-06-22 16:59:50 +02:00
e069ee960e Doh 2015-06-22 16:58:40 +02:00
41ba7418e2 hydra-queue-runner: More stats 2015-06-22 15:34:33 +02:00
62b53a0a47 Guard against concurrent invocations of hydra-queue-runner 2015-06-22 14:24:03 +02:00
fbd7c02217 Periodically dump/log status 2015-06-22 14:15:43 +02:00
4f4141e1db Add command ‘hydra-queue-runner --status’ to show current status 2015-06-22 14:06:44 +02:00
44a2b74f5a Keep track of the number of build steps that are being built
(As opposed to being in the closure copying stage.)
2015-06-22 11:23:00 +02:00
fed71d3fe9 Move "created" field into Step::State 2015-06-22 11:07:52 +02:00
90a08db241 hydra-queue-runner: Fix assertion failure 2015-06-22 10:59:07 +02:00
d744362e4a hydra-queue-runner: Fix segfault sorting machines by load
While sorting machines by load, the load of a machine
(machine->currentJobs) can be changed by other threads. If that
happens, the comparator is no longer a proper ordering, in which case
std::sort() can segfault. So we now make a copy of currentJobs before
sorting.
2015-06-21 16:21:42 +02:00
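
The fix described in d744362e4a amounts to sorting on a snapshot of the load rather than on the live counter. The following is a minimal sketch of that idea, not the actual patch: `Machine` here is a simplified stand-in, and `MachineSnapshot` and `sortByLoad` are hypothetical names. Only `currentJobs` is taken from the commit message.

```cpp
#include <algorithm>
#include <atomic>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the queue runner's machine record.
struct Machine {
    std::string name;
    std::atomic<unsigned int> currentJobs;   // may be changed by other threads at any time
    Machine(std::string n, unsigned int jobs) : name(std::move(n)), currentJobs(jobs) {}
};

// A machine together with the load observed at one point in time.
struct MachineSnapshot {
    std::shared_ptr<Machine> machine;
    unsigned int currentJobs;
};

// Return the machines ordered by ascending load.
std::vector<std::shared_ptr<Machine>> sortByLoad(
    const std::vector<std::shared_ptr<Machine>> & machines)
{
    // Read each counter exactly once, before sorting.
    std::vector<MachineSnapshot> snapshot;
    snapshot.reserve(machines.size());
    for (const auto & m : machines)
        snapshot.push_back({m, m->currentJobs.load()});

    // The comparator only sees the copied values, so it stays a strict
    // weak ordering even if the live counters change during the sort.
    std::sort(snapshot.begin(), snapshot.end(),
        [](const MachineSnapshot & a, const MachineSnapshot & b) {
            return a.currentJobs < b.currentJobs;
        });

    std::vector<std::shared_ptr<Machine>> result;
    result.reserve(snapshot.size());
    for (const auto & s : snapshot)
        result.push_back(s.machine);
    return result;
}
```

The resulting order may be slightly stale by the time it is used, but that is harmless compared to the undefined behaviour of calling std::sort() with a comparator whose answers change mid-sort.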
a0eff6fc15 Fix machine selection 2015-06-19 17:45:26 +02:00
81abb6e166 Improve parsing of hydra-build-products 2015-06-19 17:20:20 +02:00