Commit Graph

293 Commits

Author SHA1 Message Date
66ae66024e Sync with latest Nix 2017-07-17 11:38:58 +02:00
1f94f03699 Fix build 2017-04-26 15:11:12 +02:00
cc85208fe4 Fix build 2017-04-18 20:50:18 +02:00
426aea1236 hydra-queue-runner: Allow multiple concurrent daemon connections 2017-04-06 18:50:53 +02:00
5810042a3b Periodically clear Store's path info cache
Otherwise the queue runner can consider paths as valid that have been
garbage-collected since the first time it queried them.
2017-04-06 17:20:23 +02:00
8364f4ec70 Upload log files to the right location
We were mixing up builds and steps. So for example

  https://cache.nixos.org/log/2w66a98iqbjdppc5s2b8qvhi3gprvy45-freecell-solver-4.8.0.drv

at the moment contains the log for
/nix/store/442r9d5ihbcpgq8q9dhijhvhlmplzp96-perl-namespace-autoclean-0.28.drv
because the latter is a step in http://hydra.nixos.org/build/51300420.
Oops.
2017-04-06 13:05:30 +02:00
4f11cf45dc Fix build cancellation
We nowadays ignore SIGINT, so the sshd child process inherited this
and ignored SIGINT as well.
2017-04-05 11:01:57 +02:00
147ba3ca31 Set proper charset on log files 2017-03-31 18:00:08 +02:00
8771f7f913 Merge pull request from shlevy/cached-build-notifications
Send BuildFinished notifications on cached build results.
2017-03-29 18:52:20 +02:00
57bc0eaead hydra-queue-runner: Limit concurrent database connections
Adding a 96-core aarch64 build machine to the build farm caused the
potential number of database connections to increase a lot, so we
started hitting the Postgres connection limit.
2017-03-21 11:53:46 +01:00
150228d7de Upload build logs to the binary cache 2017-03-15 16:59:57 +01:00
7e6486e694 Move log compression to a plugin 2017-03-15 16:59:57 +01:00
d1afb42f12 Supress debug message 2017-03-15 16:59:56 +01:00
73900e9f5f Fix std::stoi exception 2017-03-08 15:07:52 +01:00
edebdf33f0 hydra-queue-runner: Handle SIGINT 2017-03-03 12:41:00 +01:00
500c27e4d5 Add hydra.conf option "nar_buffer_size" to configure memoryTokens limit
It defaults to half the physical RAM.
2017-03-03 12:37:27 +01:00
53b1f7da64 Decrease memoryTokens 2017-02-03 14:44:52 +01:00
a366f362e1 Use latest nixUnstable 2017-02-03 14:39:18 +01:00
8a120006f0 Fix version test 2016-12-08 16:03:50 +01:00
9989e6c0f4 Get exact build start/stop times from the remote 2016-12-07 16:10:21 +01:00
f6081668dc Allow determinism checking for entire jobsets
Setting

  xxx-jobset-repeats = patchelf:master:2

will cause Hydra to perform every build step in the specified jobset 2
additional times (i.e. 3 times in total). Non-determinism is not fatal
unless the derivation has the attribute "isDeterministic = true"; we
just note the lack of determinism in the Hydra database. This will
allow us to get stats about the (lack of) reproducibility of all of
Nixpkgs.
2016-12-07 15:57:13 +01:00
8bb36e79bd Support testing build determinism
Builds can now specify the attribute "isDeterministic = true" to tell
Hydra to build with build-repeat > 0. If there is a mismatch between
rounds, the step / build fails with a suitable status.

Maybe this should be a meta attribute, but that makes it invisible to
hydra-queue-runner, and it seems reasonable to make a claim of
mandatory determinism part of the derivation (since e.g. enabling this
flag should trigger a rebuild).
2016-12-06 17:46:06 +01:00
afb8765ae4 hydra-queue-runner: Bump memory limit to reflect more accurate accounting 2016-11-16 17:51:18 +01:00
b4d32a3085 hydra-queue-runner: More accurate memory accounting
We now take into account the memory necessary for compressing the NAR
being exported to the binary cache, plus xz compression overhead.

Also, we now release the memory tokens for the NAR accessor *after*
releasing the NAR accessor. Previously the memory for the NAR accessor
might still be in use while another thread does an allocation, causing
the maximum to be exceeded temporarily.

Also, use notify_all instead of notify_one to wake up memory token
waiters. This is not very nice, but not every waiter is requesting the
same number of tokens, so some might be able to proceed.
2016-11-16 17:48:50 +01:00
cb5e438a08 Bump Nix
Fixes .
2016-11-09 19:15:13 +01:00
1ecc8a4f40 hydra-queue-runner: Fix a race keeping cancelled steps alive
If a step is cancelled just as its builder step is starting,
doBuildStep() will return sRetry. This causes builder() to make the
step runnable again, since the queue monitor may have added new builds
referencing it. The idea is that if the latter condition is not true,
the step's reference count will drop to zero and it will be
deleted. However, if the dispatcher thread sees and locks the step
before the reference count can drop to zero in the builder thread, the
dispatcher thread will start a new builder thread for the step. Thus
the step can be kept alive for an indefinite amount of time.

The fix is for State::builder() to use a weak pointer to the step, to
ensure that the step's reference count can drop to zero *before* it's
added to the runnable queue.
2016-11-08 11:47:49 +01:00
de9d7bcf25 hydra-queue-runner: Handle exceptions in the dispatcher thread
E.g. "resource unavailable" when creating new threads.
2016-11-08 11:25:43 +01:00
7863d2e1da Step cancellation: Don't use pthread_cancel()
This was a bad idea because pthread_cancel() is unsalvageable broken
in C++. Destructors are not allowed to throw exceptions (especially in
C++11), but pthread_cancel() can cause a __cxxabiv1::__forced_unwind
exception inside any destructor that invokes a cancellation
point. (This exception can be caught but *must* be rethrown.) So let's
just kill the builder process instead.
2016-11-07 19:38:24 +01:00
d7453bd8be hydra-queue-runner: Fix message 2016-11-02 12:44:18 +01:00
4f08c85c69 hydra-queue-runner: Fix assertion failure
It was hitting

    assert(reservation.unique());

Since we do want the machine reservation to be released before calling
wakeDispatcher(), let's use a different object for keeping track of
active steps.
2016-11-02 12:41:00 +01:00
b3169ce438 Kill active build steps when builds are cancelled
We now kill active build steps when there are no more referring
builds. This is useful e.g. for preventing cancelled multi-hour TPC-H
benchmark runs from hogging build machines.
2016-10-31 14:58:29 +01:00
a816ef873d Warn against empty machines file 2016-10-31 11:40:36 +01:00
3b84d4711b Bump Nix 2016-10-26 15:10:56 +02:00
0b00d51baf Prevent orphaned build steps
If two active steps of the same build failed, then the first would be
marked as "failed", but the second would end up as "orphaned", causing
it to be marked as "aborted" later on. Now it's correctly marked as
"failed".
2016-10-26 14:42:28 +02:00
8e1d791d0c Truncate the log just before starting the remote build
This gets rid of all those remote substitution messages that were
polluting the build logs.
2016-10-26 13:41:51 +02:00
3fcfa20d1a Fix regression caused by ee2e9f53
‘basicDrv.inputSrcs’ also contains the outputs of inputDrvs. These
don't necessarily exist in the local store, so copying them may cause
an exception. We should only copy the real inputSrcs.
2016-10-24 16:49:11 +02:00
a3efdcdfd9 Use std::regex 2016-10-21 18:06:26 +02:00
e0b2921ff2 Concurrent hydra-evaluator
This rewrites the top-level loop of hydra-evaluator in C++. The Perl
stuff is moved into hydra-eval-jobset. (Rewriting the entire evaluator
would be nice but is a bit too much work.) The new version has some
advantages:

* It can run multiple jobset evaluations in parallel.

* It uses PostgreSQL notifications so it doesn't have to poll the
  database. So if a jobset is triggered via the web interface or from
  a GitHub / Bitbucket webhook, evaluation of the jobset will start
  almost instantaneously (assuming the evaluator is not at its
  concurrency limit).

* It imposes a timeout on evaluations. So if e.g. hydra-eval-jobset
  hangs connecting to a Mercurial server, it will eventually be
  killed.
2016-10-14 14:22:12 +02:00
16feddd5d4 Drop obsolete -laws-cpp-sdk-s3 2016-10-14 14:22:12 +02:00
dd5af7637d Remove finally.hh 2016-10-14 14:22:12 +02:00
ee2e9f5335 Update to reflect BinaryCacheStore changes
BinaryCacheStore no longer implements buildPaths() and ensurePath(),
so we need to use copyPath() / copyClosure().
2016-10-07 20:23:05 +02:00
6a313c691b hydra-queue-runner: Fix build 2016-10-06 16:58:54 +02:00
7089142fdc Add error/warnings for deprecated store specification 2016-10-06 15:10:14 +02:00
a73f211bf2 Use store-api for binary cache instantiation 2016-10-06 15:09:44 +02:00
1c2f6281b9 Remove signing parameter (nix#f435f82) 2016-10-06 15:09:12 +02:00
232e6e8556 Replace buildVerbosity with verboseBuild (nix#5761827) 2016-10-06 15:08:02 +02:00
492d16074c Remove s3binarystore (moved to nix in d155d80) 2016-10-06 15:07:21 +02:00
b1512a152a Fix build failure on GCC 5.4 2016-09-30 17:05:07 +02:00
5962367ffc Send BuildFinished notifications on cached build results.
Fixes 
2016-08-17 06:40:12 -04:00
a55942603a Provide a plugin hook for when build steps finish
Fixes .
2016-05-27 14:35:32 +02:00