hydra

Author	SHA1	Message	Date
John Ericson	3f932a6731	build-remote: Use `std::map<StorePath, UnkeyedValidPathInfo>` It is less denormalized	2023-12-09 11:59:09 -05:00
John Ericson	4515b5aa17	Merge pull request #1321 from NixOS/master Merge `master` into `nix-next`	2023-12-09 11:53:58 -05:00
John Ericson	831021808c	Merge pull request #1318 from obsidiansystems/use-build-result-serialiser Use factored-out `BuildResult` serializer	2023-12-08 11:25:05 -05:00
John Ericson	20c8263e3c	Update to Nix master The point of this branch is to always track Nix master, so we are proactively ready to upgrade to the next Nix release when it is ready. Flake lock file updates: • Updated input 'nix': 'github:NixOS/nix/50f8f1c8bc019a4c0fd098b9ac674b94cfc6af0d' (2023-11-27) → 'github:NixOS/nix/c3827ff6348a4d5199eaddf8dbc2ca2e2ef46ec5' (2023-12-07) • Added input 'nix/libgit2': 'github:libgit2/libgit2/45fd9ed7ae1a9b74b957ef4f337bc3c8b3df01b5' (2023-10-18)	2023-12-07 13:11:31 -05:00
John Ericson	6a54ab24e2	Use factored-out `BuildResult` serializer For the record, here is the Nix 2.19 version: https://github.com/NixOS/nix/blob/2.19-maintenance/src/libstore/serve-protocol.cc, which is what we would initially use. It is a more complete version of what Hydra has today except for one thing: it always unconditionally sets the start/stop times. I think that is correct, as the other end seems to unconditionally measure them, but just to be extra careful, I reproduced the old behavior of falling back on Hydra's own measurements if `startTime` is 0. The only difference is that the fallback `stopTime` is now measured from after the entire `BuildResult` is transferred over the wire, but I think that should be negligible if it is measurable at all. (And remember, this is a fallback case I already suspect is dead code.)	2023-12-07 02:00:22 -05:00
John Ericson	86cd5e9076	`copyClosureTo`: Use `SubstituteFlag` instead of `bool` This matches Nix (in the same serialization logic in `src/libstore/legacy-ssh-store.cc`) and adds clarity.	2023-12-07 00:18:50 -05:00
John Ericson	363604846a	Again, use `const` in for loop As requested by @teh. Was lost in merge with master, now added back.	2023-12-04 11:31:05 -05:00
John Ericson	162b538912	Remove unused `thisArrow` variable	2023-12-04 11:27:39 -05:00
John Ericson	104baef503	Document the connection initialization process	2023-12-04 09:42:04 -05:00
John Ericson	67eeabd518	Merge remote-tracking branch 'upstream/master' into split-buildRemote	2023-12-04 09:12:58 -05:00
John Ericson	622c25e3c4	Sedding prior to merge	2023-12-04 08:56:06 -05:00
John Ericson	c922e73c11	Update to Nix 2.19 Flake lock file updates: • Updated input 'nix': 'github:NixOS/nix/f5f4de6a550327b4b1a06123c2e450f1b92c73b6' (2023-10-02) → 'github:NixOS/nix/50f8f1c8bc019a4c0fd098b9ac674b94cfc6af0d' (2023-11-27)	2023-11-30 15:26:46 -05:00
John Ericson	e172461e55	Use `const` in for loop As requested by @teh	2023-11-30 12:19:20 -05:00
John Ericson	0917145622	Make new functions not in header `static`	2023-11-30 12:19:05 -05:00
John Ericson	2bda7ca642	Further use `Machine::Connection` to deduplicate	2023-11-30 11:31:58 -05:00
John Ericson	831a2d9bd5	Merge remote-tracking branch 'upstream/master' into split-buildRemote	2023-11-30 11:27:40 -05:00
chayleaf	e9da80fff6	support nix 2.18	2023-11-21 18:41:52 +07:00
Eelco Dolstra	35ccc9ebb2	Fix indentation Co-authored-by: John Ericson <git@JohnEricson.me>	2023-08-23 17:04:45 +02:00
Linus Heckemann	9f0427385f	Apply LTO fix suggested by Ericson2314	2023-08-20 14:55:56 +02:00
Linus Heckemann	b23431a657	Support Nix 2.17	2023-08-04 15:53:48 +02:00
Eelco Dolstra	9f69bb5c2c	Fix compilation against Nix 2.16	2023-06-23 15:06:55 +02:00
John Ericson	3526d61ff2	Merge remote-tracking branch 'upstream/master' into split-buildRemote	2022-10-25 11:24:54 -04:00
Théophane Hufschmitt	143c31734f	Move all the build remote utils to their namespace Just don't pollute the global one	2022-10-25 10:04:29 +02:00
Eelco Dolstra	44e1efff7f	Send the right nix-serve client version We were using protocol version 6 but requesting version 4. The only reason that this worked was because of a broken version check in 'nix-store --serve'. That was fixed in `c2d7456926`, which had the side-effect of breaking hydra-queue-runner.	2022-09-08 11:51:13 +02:00
Eelco Dolstra	bcaad1c934	openConnection(): Don't throw exceptions in forked child On hydra.nixos.org the queue runner had child processes that were stuck handling an exception: Thread 1 (Thread 0x7f501f7fe640 (LWP 1413473) "bld~v54h5zkhmb3"): #0 futex_wait (private=0, expected=2, futex_word=0x7f50c27969b0 <_rtld_local+2480>) at ../sysdeps/nptl/futex-internal.h:146 #1 __lll_lock_wait (futex=0x7f50c27969b0 <_rtld_local+2480>, private=0) at lowlevellock.c:52 #2 0x00007f50c21eaee4 in __GI___pthread_mutex_lock (mutex=0x7f50c27969b0 <_rtld_local+2480>) at ../nptl/pthread_mutex_lock.c:115 #3 0x00007f50c1854bef in __GI___dl_iterate_phdr (callback=0x7f50c190c020 <_Unwind_IteratePhdrCallback>, data=0x7f501f7fb040) at dl-iteratephdr.c:40 #4 0x00007f50c190d2d1 in _Unwind_Find_FDE () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1 #5 0x00007f50c19099b3 in uw_frame_state_for () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1 #6 0x00007f50c190ab90 in uw_init_context_1 () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1 #7 0x00007f50c190b08e in _Unwind_RaiseException () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1 #8 0x00007f50c1b02ab7 in __cxa_throw () from /nix/store/dd8swlwhpdhn6bv219562vyxhi8278hs-gcc-10.3.0-lib/lib/libstdc++.so.6 #9 0x00007f50c1d01abe in nix::parseURL (url="root@cb893012.packethost.net") at src/libutil/url.cc:53 #10 0x0000000000484f55 in extraStoreArgs (machine="root@cb893012.packethost.net") at build-remote.cc:35 #11 operator() (__closure=0x7f4fe9fe0420) at build-remote.cc:79 ... Maybe the fork happened while another thread was holding some global stack unwinding lock (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744). Anyway, since the hanging child inherits all file descriptors to SSH clients, shutting down remote builds (via 'child.to = -1' in State::buildRemote()) doesn't work and 'child.pid.wait()' hangs forever. So let's not do any significant work between fork and exec.	2022-03-30 22:39:48 +02:00
ajs124	089da272c7	fix build against nix 2.7.0 fix build after such commits as df552ff53e68dff8ca360adbdbea214ece1d08ee and e862833ec662c1bffbe31b9a229147de391e801a	2022-03-29 15:38:24 -04:00
Graham Christensen	3b048ed136	Revert "Revert "Use `copyClosure` instead of `computeFSClosure` + `copyPaths`"" This reverts commit `8e3ada2afc`.	2022-03-29 15:28:47 -04:00
Théophane Hufschmitt	6e571e26ff	Build the resolved derivation and not the original one	2022-03-29 17:05:30 +02:00
Théophane Hufschmitt	92b627ac1b	Remove an accidental re-indenting of a comment Co-authored-by: Eelco Dolstra <edolstra@gmail.com>	2022-03-29 17:04:19 +02:00
Théophane Hufschmitt	b430d41afd	Use the `BuildOptions` more eagerly	2022-03-29 17:04:19 +02:00
Théophane Hufschmitt	fd0ae78eba	Factor out the copying from the build store	2022-03-29 17:04:19 +02:00
Théophane Hufschmitt	a778a89f04	Factor out the `queryPathInfos` part of the build	2022-03-29 17:04:19 +02:00
Théophane Hufschmitt	365776f5d7	Factor out the building part	2022-03-29 17:04:19 +02:00
Théophane Hufschmitt	9f1b911625	Factor more stuff out	2022-03-29 17:04:17 +02:00
Théophane Hufschmitt	2f494b7834	Factor out the creation of the log file	2022-03-29 16:52:59 +02:00
Cole Helbling	8e3ada2afc	Revert "Use `copyClosure` instead of `computeFSClosure` + `copyPaths`" This reverts commit `f14c583ce5`.	2022-03-28 09:54:02 -07:00
Eelco Dolstra	962bf36939	Merge pull request #1162 from obsidiansystems/less-ref Make `copyClosureTo` take a regular C++ ref to the store	2022-03-23 16:25:59 +01:00
John Ericson	445bba337b	Make `copyClosureTo` take a regular C++ ref to the store This is syntactically lighter weight, and demonstrates there are no weird dynamic lifetimes involved, just regular passing of a reference to the callee, which only borrows it for the duration of the call.	2022-02-20 17:22:43 +00:00
John Ericson	f14c583ce5	Use `copyClosure` instead of `computeFSClosure` + `copyPaths` It is more terse, and in the future it is possible `copyClosure` will become more sophisticated.	2022-02-19 11:59:17 -05:00
Graham Christensen	f6e86efc9f	Merge pull request #1091 from Ma27/ssh-remote-store-location hydra-queue-runner: support store URIs declaring an alternate store location	2022-01-24 14:10:54 -05:00
Graham Christensen	3a4ea6e563	Merge pull request #1124 from obsidiansystems/simplify--closure-of-path-set simplify, `computeFSClosure` can take a set now	2022-01-24 14:09:35 -05:00
Graham Christensen	ba96a13407	Record metrics when getting the closure to localhost	2022-01-21 15:38:05 -05:00
Graham Christensen	7e9e82398d	build-remote: copy missing paths from the binary cache to localhost In a Hydra instance I saw: possibly transient failure building ‘/nix/store/X.drv’ on ‘localhost’: dependency '/nix/store/Y' of '/nix/store/Y.drv' does not exist, and substitution is disabled This is confusing because the Hydra in question does have substitution enabled. This instance uses: keep-outputs = true keep-derivations = true and an S3 binary cache which is not configured as a substituter in the nix.conf. It appears this instance encountered a situation where store path Y was built and present in the binary cache, and Y.drv was GC rooted on the instance, however Y was not on the host. When Hydra would try to build this path locally, it would look in the binary cache to see if it was cached: (nix) 439 bool valid = isValidPathUncached(storePath); 440 441 if (diskCache && !valid) 442 // FIXME: handle valid = true case. 443 diskCache->upsertNarInfo(getUri(), hashPart, 0); 444 445 return valid; Since it was cached, the store path was considered Valid. The queue monitor would then not put this input in for substitution, because the path is valid: (hydra) 470 if (!destStore->isValidPath(i.second.path(localStore, step->drv->name, i.first))) { 471 valid = false; 472 missing.insert_or_assign(i.first, i.second); 473 } Hydra appears to correctly handle the case of missing paths that need to be substituted from the binary cache already, but since most Hydra instances use `keep-outputs` and all paths in the binary cache originate from that machine, it is not common for a path to be cached and not GC rooted locally. I'll run Hydra with this patch for a while and see if we run in to the problem again. A big thanks to John Ericson who helped debug this particular issue.	2022-01-21 15:26:45 -05:00
John Ericson	e7a1ae87aa	simplify, `computeFSClosure` can take a set now	2022-01-20 14:53:01 -05:00
Maximilian Bosch	a18b487403	hydra-queue-runner: support store URIs declaring an alternate store location When having a builder like this in `/etc/nix/machines` ssh://mfbuild?remote-store=/home/bosch/store Hydra cannot build there since it tries to pass the entire value to `ssh(1)` which doesn't work. Also, an alternate store-location is e.g. used if the user isn't a trusted user on the remote system and thus cannot use `/nix/store`. If such a URI is given, Hydra will now add a `--store /home/bosch/store` to the `ssh`-command to select the appropriate location remotely.	2022-01-12 15:56:05 +01:00
Eelco Dolstra	5edb58b314	Fix build	2021-08-10 13:47:16 +02:00
Maximilian Bosch	2808227eb7	Fix `std::bad_alloc` errors for remote builds In Nix the protocol was slightly altered[1] to also contain more information about realisations. This however wasn't read from the pipe that was used to read from the store. After the `cmdBuildDerivation` command which caused this issue, Hydra will issue a `cmdQueryPathInfos` that tries to read from the remote store as well. However, there's still data left over to read from the previous command and thus Nix fails to properly allocate the expected string. [1] See rev a2b69660a9b326b95d48bd222993c5225bbd5b5f Fixes #898	2021-04-15 15:16:52 +02:00
regnat	26ffd4a93e	Fix build with latest master	2021-04-08 17:11:15 +02:00
Shea Levy	930f05c38e	Bump Nix version	2021-03-10 12:53:03 -05:00
Graham Christensen	68ac64dbd9	Merge pull request #832 from wizeman/fix-hash-mismatch Fix persistent hash mismatch errors when importing	2021-03-02 16:04:23 -05:00

130 Commits