It has a performance cost, and as the comment says, we should be
implementing the better solution instead. We want to land this
preparatory change on prod while the rest is still on staging, so we
should just skip it for now.
Skipping it will not affect regular fixed-output and input-addressed
derivations, which are the only ones prod would deal with upon getting
this code.
The main CA derivations support branch will revert this commit so it
still works.
For the record, here is the Nix 2.19 version:
https://github.com/NixOS/nix/blob/2.19-maintenance/src/libstore/serve-protocol.cc,
which is what we would initially use.
It is a more complete version of what Hydra has today, except for one
thing: it always unconditionally sets the start/stop times.
I think that is correct, as the other end seems to measure them
unconditionally, but just to be extra careful I reproduced the old
behavior of falling back on Hydra's own measurements if `startTime` is
0.
The only difference is that the fallback `stopTime` is now measured
after the entire `BuildResult` has been transferred over the wire, but I
think that difference should be negligible if it is measurable at all.
(And remember, this is a fallback case I already suspect is dead code.)
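For illustration, a minimal sketch of that fallback, with invented names
(the struct fields and `localStart` are assumptions, not Hydra's actual
identifiers):

    #include <ctime>

    // Illustrative only: field names are assumptions, not Hydra's actual ones.
    struct BuildResult {
        time_t startTime = 0;
        time_t stopTime = 0;
    };

    // Fall back on the queue runner's own measurements when the remote
    // side reported no timings (startTime == 0).
    void fixupTimes(BuildResult & res, time_t localStart)
    {
        if (res.startTime == 0) {
            res.startTime = localStart;    // measured before the build started
            res.stopTime = time(nullptr);  // measured after the BuildResult arrived
        }
    }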
This is just C++ changes, without any Perl / frontend / SQL schema
changes.
The idea is that it should be possible to redeploy Hydra with these
changes with (a) no schema migration and (b) no regressions. We should
be able to deploy these much more safely to a staging server and then to
production `hydra.nixos.org`.
Extracted from #875
Co-Authored-By: Théophane Hufschmitt <theophane.hufschmitt@tweag.io>
Co-Authored-By: Alexander Sosedkin <monk@unboiled.info>
Co-Authored-By: Andrea Ciceri <andrea.ciceri@autistici.org>
Co-Authored-By: Charlotte 🦝 Delenk <lotte@chir.rs>
Co-Authored-By: Sandro Jäckel <sandro.jaeckel@gmail.com>
We were using protocol version 6 but requesting version 4. The only
reason this worked was a broken version check in 'nix-store --serve'.
That was fixed in
c2d7456926,
which had the side-effect of breaking hydra-queue-runner.
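For context, a generic sketch of how such a handshake is normally kept
consistent (not the exact serve-protocol code; the constant and function
names are invented): announce the version you actually implement and
speak the minimum of the two sides.

    #include <algorithm>
    #include <cstdint>

    // Generic version-negotiation sketch; names are illustrative. The bug
    // above amounted to implementing one version while announcing a lower
    // one on the wire.
    constexpr uint64_t OUR_PROTOCOL_VERSION = 6;

    uint64_t negotiatedVersion(uint64_t remoteVersion)
    {
        // Both ends should end up speaking the highest version they share.
        return std::min(OUR_PROTOCOL_VERSION, remoteVersion);
    }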
On hydra.nixos.org the queue runner had child processes that were
stuck handling an exception:
Thread 1 (Thread 0x7f501f7fe640 (LWP 1413473) "bld~v54h5zkhmb3"):
#0 futex_wait (private=0, expected=2, futex_word=0x7f50c27969b0 <_rtld_local+2480>) at ../sysdeps/nptl/futex-internal.h:146
#1 __lll_lock_wait (futex=0x7f50c27969b0 <_rtld_local+2480>, private=0) at lowlevellock.c:52
#2 0x00007f50c21eaee4 in __GI___pthread_mutex_lock (mutex=0x7f50c27969b0 <_rtld_local+2480>) at ../nptl/pthread_mutex_lock.c:115
#3 0x00007f50c1854bef in __GI___dl_iterate_phdr (callback=0x7f50c190c020 <_Unwind_IteratePhdrCallback>, data=0x7f501f7fb040) at dl-iteratephdr.c:40
#4 0x00007f50c190d2d1 in _Unwind_Find_FDE () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#5 0x00007f50c19099b3 in uw_frame_state_for () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#6 0x00007f50c190ab90 in uw_init_context_1 () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#7 0x00007f50c190b08e in _Unwind_RaiseException () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#8 0x00007f50c1b02ab7 in __cxa_throw () from /nix/store/dd8swlwhpdhn6bv219562vyxhi8278hs-gcc-10.3.0-lib/lib/libstdc++.so.6
#9 0x00007f50c1d01abe in nix::parseURL (url="root@cb893012.packethost.net") at src/libutil/url.cc:53
#10 0x0000000000484f55 in extraStoreArgs (machine="root@cb893012.packethost.net") at build-remote.cc:35
#11 operator() (__closure=0x7f4fe9fe0420) at build-remote.cc:79
...
Maybe the fork happened while another thread was holding some global
stack unwinding lock
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744). Anyway, since
the hanging child inherits all file descriptors to SSH clients,
shutting down remote builds (via 'child.to = -1' in
State::buildRemote()) doesn't work and 'child.pid.wait()' hangs
forever.
So let's not do any significant work between fork and exec.
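A rough sketch of that pattern, with invented names (`spawnBuilder` is
not the real function): everything that can allocate, lock, or throw
happens in the parent before fork(), and the child does nothing but
exec().

    #include <string>
    #include <vector>
    #include <unistd.h>

    // Illustrative only: prepare everything that might allocate, lock, or
    // throw *before* forking, so the child only has to call exec().
    void spawnBuilder(const std::string & machine)
    {
        // Heavy work (parsing, allocation, possible exceptions) happens
        // here, in the parent, before fork().
        std::string prog = "ssh";
        std::string host = machine;
        std::vector<char *> argv { prog.data(), host.data(), nullptr };

        pid_t pid = fork();
        if (pid == 0) {
            // Child: only async-signal-safe calls between fork() and exec().
            execvp(argv[0], argv.data());
            _exit(1); // exec failed
        }
        // Parent continues; the child is reaped elsewhere with waitpid().
    }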
This is a syntactically lighter way, and it demonstrates that there are
no weird dynamic lifetimes involved: we just pass a regular reference to
the callee, which only borrows it for the duration of the call.
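Schematically, the difference looks like this (names invented for
illustration): a reference parameter expresses a borrow for the duration
of the call, while a std::shared_ptr parameter implies shared ownership
and a dynamic lifetime.

    #include <memory>

    struct Config { int jobs = 1; };  // stand-in type for illustration

    // Borrow: the callee may use cfg only for the duration of the call.
    int effectiveJobs(const Config & cfg)
    {
        return cfg.jobs;
    }

    // Shared ownership: the callee could keep cfg alive arbitrarily long,
    // raising exactly the lifetime questions the lighter form avoids.
    int effectiveJobsShared(std::shared_ptr<Config> cfg)
    {
        return cfg->jobs;
    }

    int main()
    {
        Config cfg;
        effectiveJobs(cfg);                                  // plain borrow
        effectiveJobsShared(std::make_shared<Config>(cfg));  // ownership juggling
        return 0;
    }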
In a Hydra instance I saw:
possibly transient failure building ‘/nix/store/X.drv’ on ‘localhost’:
dependency '/nix/store/Y' of '/nix/store/Y.drv' does not exist,
and substitution is disabled
This is confusing because the Hydra in question does have substitution enabled.
This instance uses:
keep-outputs = true
keep-derivations = true
and an S3 binary cache which is not configured as a substituter in the nix.conf.
It appears this instance encountered a situation where store path Y had
been built and was present in the binary cache, and Y.drv was GC-rooted
on the instance, but Y itself was not present on the host.
When Hydra would try to build this path locally, it would look in the
binary cache to see if it was already cached:
(nix)
    bool valid = isValidPathUncached(storePath);

    if (diskCache && !valid)
        // FIXME: handle valid = true case.
        diskCache->upsertNarInfo(getUri(), hashPart, 0);

    return valid;
Since it was cached, the store path was considered Valid.
The queue monitor would then not put this input in for substitution, because
the path is valid:
(hydra)
    if (!destStore->isValidPath(*i.second.path(*localStore, step->drv->name, i.first))) {
        valid = false;
        missing.insert_or_assign(i.first, i.second);
    }
Hydra already appears to correctly handle the case of missing paths that
need to be substituted from the binary cache, but since most Hydra
instances use `keep-outputs` *and* all paths in the binary cache
originate from that machine, it is not common for a path to be cached
but not GC-rooted locally.
I'll run Hydra with this patch for a while and see if we run into the
problem again.
A big thanks to John Ericson who helped debug this particular issue.