177 Commits

Author SHA1 Message Date
Eelco Dolstra
75e7b35477 Fix retry of transient failures 2016-03-10 16:44:26 +01:00
Eelco Dolstra
33da40f272 Doh 2016-03-09 17:31:57 +01:00
Eelco Dolstra
4b9c76e502 hydra-queue-runner: Ensure regular status dumps 2016-03-09 17:11:34 +01:00
Eelco Dolstra
4151be7e69 Make the output size limit configurable
The maximum output size per build step (as the sum of the NARs of each
output) can be set via hydra.conf, e.g.

  max-output-size = 1000000000

The default is 2 GiB.

Also refactored the build error / status handling a bit.
2016-03-09 17:00:09 +01:00
Eelco Dolstra
dc790c5f7e Fix bad format string 2016-03-09 16:59:35 +01:00
Eelco Dolstra
80ff78b1b6 Unify build and step status codes
Also remove the obsolete status code 5 from the database.
2016-03-09 15:30:43 +01:00
Eelco Dolstra
9127f5bbc3 hydra-queue-runner: Limit memory usage
When using a binary cache store, the queue runner receives NARs from
the build machines, compresses them, and uploads them to the
cache. However, keeping multiple large NARs in memory can cause the
queue runner to run out of memory. This can happen for instance when
it's processing multiple ISO images concurrently.

The fix is to use a TokenServer to prevent the builder threads from
storing more than a certain total size of NARs concurrently (at the
moment, this is hard-coded at 4 GiB). Builder threads that would cause
the limit to be exceeded will block until other threads have finished.

The 4 GiB limit does not include certain other allocations, such as
for xz compression or for FSAccessor::readFile(). But since these are
unlikely to be more than the size of the NARs and hydra.nixos.org has
32 GiB RAM, it should be fine.
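
The blocking behaviour described above can be sketched as a counting budget guarded by a mutex and a condition variable. This is an illustration only, not Hydra's TokenServer: the class name TokenBudget and its methods are hypothetical, and the rule that a request larger than the whole budget is admitted when nothing else is in flight is an assumption made here to avoid deadlock.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical sketch of a byte budget shared by builder threads.
class TokenBudget {
public:
    explicit TokenBudget(std::size_t maxTokens) : maxTokens_(maxTokens) {}

    // Block until `n` tokens fit under the limit, then claim them.
    void acquire(std::size_t n) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return fits(n); });
        inUse_ += n;
    }

    // Non-blocking variant: claim `n` tokens only if they fit right now.
    bool tryAcquire(std::size_t n) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fits(n)) return false;
        inUse_ += n;
        return true;
    }

    // Return `n` tokens and wake any threads waiting in acquire().
    void release(std::size_t n) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(n <= inUse_);
            inUse_ -= n;
        }
        cv_.notify_all();
    }

    std::size_t inUse() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return inUse_;
    }

private:
    // Assumption: an oversized request is admitted when the budget is
    // idle, so a single huge NAR cannot block forever.
    bool fits(std::size_t n) const {
        return inUse_ == 0 ||
               (inUse_ <= maxTokens_ && n <= maxTokens_ - inUse_);
    }

    const std::size_t maxTokens_;
    std::size_t inUse_ = 0;
    mutable std::mutex mutex_;
    std::condition_variable cv_;
};
```

In this sketch a builder thread would call acquire(narSize) before buffering a NAR and release(narSize) once the upload has finished; a 4 GiB limit would correspond to TokenBudget(4ULL << 30) on a 64-bit platform.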
2016-03-09 14:30:13 +01:00
Eelco Dolstra
b77a43b83d Get rid of "will retry" messages after "maybe cancelling..." 2016-03-08 13:09:39 +01:00
Eelco Dolstra
718fef29ef Keep track of time required to load builds 2016-03-08 13:09:29 +01:00
Eelco Dolstra
2feb17c681 Some more logging 2016-03-08 13:08:07 +01:00
Eelco Dolstra
45b237453a hydra-queue-runner: Recycle finishedDrvs
This should prevent the queue monitor thread from looking up the same
derivations over and over again.
2016-03-08 11:52:13 +01:00
Eelco Dolstra
2ab8e9a1e0 hydra-queue-runner: Fix handling of missing derivations
This barfed with 'queue monitor: ERROR: column "errormsg" of relation
"builds" does not exist' due to the removal of the errorMsg column.
2016-03-07 19:05:24 +01:00
Eelco Dolstra
e7ce225558 Fix build 2016-03-04 17:51:32 +01:00
Eelco Dolstra
86a2d6471c Fix a boost format string abort 2016-03-02 20:06:48 +01:00
Eelco Dolstra
232ca8fea2 Fix build 2016-03-02 17:05:07 +01:00
Eelco Dolstra
b98a061c24 Add some instrumentation to keep track of dispatcher cost 2016-03-02 14:18:39 +01:00
Eelco Dolstra
6beee0ab49 Fix segfault sorting runnable steps
Same problem as d744362e4a249a92ecfc3e48eae135e3a46db49e.

    at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/predefined_ops.h:166
    __last@entry=..., __comp=...) at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:1827
    __comp=...) at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:4717
2016-03-02 13:59:24 +01:00
Eelco Dolstra
7cd08c7c46 Warn if PostgreSQL appears stalled 2016-02-29 15:10:30 +01:00
Eelco Dolstra
922dc541c2 Add log message 2016-02-29 11:58:06 +01:00
Eelco Dolstra
610a8d67ae Better AWS error messages 2016-02-26 22:40:27 +01:00
Eelco Dolstra
1a055e7e9e Reduce severity level of some message 2016-02-26 21:31:08 +01:00
Eelco Dolstra
6bb860fd6e Add FIXME 2016-02-26 21:15:05 +01:00
Eelco Dolstra
53ca41ef9f Use US standard S3 region 2016-02-26 20:57:47 +01:00
Eelco Dolstra
c635f5d0ea Fix Makefile.am 2016-02-26 19:54:55 +01:00
Eelco Dolstra
b9afaadfb3 Keep better bytesReceived/bytesSent stats 2016-02-26 16:17:05 +01:00
Eelco Dolstra
6d741d2ffa Prevent download of NARs we just uploaded 2016-02-26 15:21:44 +01:00
Eelco Dolstra
02190b0fef Support hydra-build-products on binary cache stores 2016-02-26 14:45:03 +01:00
Eelco Dolstra
8e24ad6f0d Sync with Nix 2016-02-25 10:58:31 +01:00
Eelco Dolstra
8321a3eb27 Sync with Nix 2016-02-24 14:04:31 +01:00
Eelco Dolstra
7b509237cd Bleh Automake 2016-02-22 18:05:15 +01:00
Eelco Dolstra
6c3ae36648 hydra-queue-runner: Get store mode configuration from hydra.conf
To use the local Nix store (default):

  store_mode = direct

To use a local binary cache:

  store_mode = local-binary-cache
  binary_cache_dir = /var/lib/hydra/binary-cache

To use an S3 bucket:

  store_mode = s3-binary-cache
  binary_cache_s3_bucket = my-nix-bucket

Also, respect binary_cache_{secret,public}_key_file for signing the
binary cache.
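
One way the three modes above could be dispatched, assuming the settings have already been parsed into a string map. The StoreMode enum, the StoreSettings struct, and the selectStore function are hypothetical names for illustration; only the configuration keys and mode strings are taken from the message.

```cpp
#include <map>
#include <stdexcept>
#include <string>

enum class StoreMode { Direct, LocalBinaryCache, S3BinaryCache };

// Hypothetical result type: the chosen mode plus the directory or
// bucket it refers to (empty for the local Nix store).
struct StoreSettings {
    StoreMode mode;
    std::string location;
};

StoreSettings selectStore(const std::map<std::string, std::string> & cfg) {
    // Look up a key, falling back to a default when it is absent.
    auto get = [&](const std::string & key, const std::string & def) {
        auto it = cfg.find(key);
        return it == cfg.end() ? def : it->second;
    };
    // Look up a key that the selected mode cannot work without.
    auto require = [&](const std::string & key) {
        auto it = cfg.find(key);
        if (it == cfg.end() || it->second.empty())
            throw std::runtime_error("missing setting: " + key);
        return it->second;
    };

    const std::string mode = get("store_mode", "direct");
    if (mode == "direct")
        return {StoreMode::Direct, ""};
    if (mode == "local-binary-cache")
        return {StoreMode::LocalBinaryCache, require("binary_cache_dir")};
    if (mode == "s3-binary-cache")
        return {StoreMode::S3BinaryCache, require("binary_cache_s3_bucket")};
    throw std::runtime_error("unknown store_mode: " + mode);
}
```

Rejecting unknown modes and missing locations with an exception is a design choice of this sketch; the real queue runner may handle these cases differently.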
2016-02-22 17:23:06 +01:00
Eelco Dolstra
94817d77d9 BinaryCacheStore: Respect build-use-substitutes 2016-02-22 17:21:39 +01:00
Eelco Dolstra
5668aa5f71 After uploading a .narinfo, add it to the LRU cache 2016-02-20 10:35:16 +01:00
Eelco Dolstra
88a05763cc Pool local store connections 2016-02-20 00:04:08 +01:00
Eelco Dolstra
1cefd6cac8 Fix log message 2016-02-20 00:02:37 +01:00
Eelco Dolstra
2b76094a23 S3BinaryCacheStore::isValidPath(): Do a GET instead of HEAD 2016-02-19 17:41:11 +01:00
Eelco Dolstra
bd76f9120a Cache .narinfo lookups 2016-02-19 16:19:40 +01:00
Eelco Dolstra
a0f74047da Keep some statistics for the binary cache stores 2016-02-19 14:24:23 +01:00
Eelco Dolstra
dc4a00347d Use a single BinaryCacheStore for all threads
This will make it easier to do caching / keep stats. Also, we won't
have S3Client's connection pooling if we create multiple S3Client
instances.
2016-02-18 17:31:19 +01:00
Eelco Dolstra
00a7be13a2 Make queue runner internal status available under /queue-runner-status 2016-02-18 17:11:46 +01:00
Eelco Dolstra
8c9fc677c1 Typo 2016-02-18 16:43:24 +01:00
Eelco Dolstra
db3fcc0f5e Enable substitution on the build machines
If properly configured, this allows them to get store paths directly
from S3, rather than having to receive them from the queue runner.
2016-02-18 16:42:05 +01:00
Eelco Dolstra
2d40888e2e Add an S3-backed binary cache store 2016-02-18 16:18:50 +01:00
Eelco Dolstra
0e254ca66d Refactor local binary cache code into a subclass 2016-02-18 14:06:17 +01:00
Eelco Dolstra
a992f688d1 Rename class 2016-02-18 13:02:20 +01:00
Eelco Dolstra
de77cc2910 Rename file 2016-02-18 13:02:20 +01:00
Eelco Dolstra
ce5790285a Merge remote-tracking branch 'origin/master' into binary-cache 2016-02-17 11:54:59 +01:00
Eelco Dolstra
d7a123fcd4 Keep track of the time we spend copying to/from build machines 2016-02-17 10:30:23 +01:00
Eelco Dolstra
25022bf5fd hydra-queue-runner: Support generating a signed binary cache 2016-02-16 16:41:42 +01:00
Eelco Dolstra
744cee134e hydra-queue-runner: Compress binary cache NARs using xz 2016-02-15 21:56:53 +01:00