hydra

Author	SHA1	Message	Date
John Ericson	ef7bf1e67b	Merge pull request #1375 from NixOS/nix-2.21 Nix 2.21	2024-04-12 17:28:37 -04:00
Maximilian Bosch	99afff03b0	hydra-queue-runner: drop broken connections from pool Closes #1336 When restarting postgresql, the connections are still reused in `hydra-queue-runner` causing errors like this main thread: Lost connection to the database server. queue monitor: Lost connection to the database server. and no more builds being processed. `hydra-evaluator` doesn't have that issue since it crashes right away. We could let it retry indefinitely as well (see below), but I don't want to change too much. If the DB is still unreachable 10s later, the process will stop with a non-zero exit code because of a missing DB connection. This however isn't such a big deal because it will be immediately restarted afterwards. With the current configuration, Hydra will never give up, but restart (and retry) infinitely. To me that seems reasonable, i.e. to retry DB connections on a long-running process. If this doesn't work out, the monitoring should fire anyways because the queue fills up, but I'm open to discuss that. Please note that this isn't reproducible with the DB and the queue runner on the same machine when using `services.hydra-dev`, because of the `Requires=` dependency `hydra-queue-runner.service` -> `hydra-init.service` -> `postgresql.service` that causes the queue runner to be restarted on `systemctl restart postgresql`. Internally, Hydra uses Nix's pool data structure: it basically has N slots (here DB connections) and whenever a new one is requested, an idle slot is provided or a new one is created (when N slots are active, it'll be waited until one slot is free). The issue in the code here is however that whenever an error is encountered, the slot is released, however the same broken connection will be reused the next time. By using `Pool::Handle::markBad`, Nix will drop a broken slot. This is now being done when `pqxx::broken_connection` was caught.	2024-03-15 14:09:31 +01:00
Maximilian Bosch	e499509595	Switch to new Nix bindings, update Nix for that Implements support for Nix's new Perl bindings[1]. The current state basically does `openStore()`, but always uses `auto` and doesn't support stores at other URIs. Even though the stores are cached inside the Perl implementation, I decided to instantiate those once in the Nix helper module. That way store openings aren't cluttered across the entire codebase. Also, there are two stores used later on - MACHINE_LOCAL_STORE for `auto`, BINARY_CACHE_STORE for the one from `store_uri` in `hydra.conf` - and using consistent names should make the intent clearer then. This doesn't contain any behavioral changes, i.e. the build product availability issue from #1352 isn't fixed. This patch only contains the migration to the new API. [1] https://github.com/NixOS/nix/pull/9863	2024-02-12 18:50:56 +01:00
John Ericson	7b826ec5ad	Merge branch 'nix-next' into nix-2.20	2024-01-30 13:26:45 -05:00
John Ericson	fcde5908d8	More CA derivations prep Again, with care not to change the schema in any way.	2024-01-25 21:32:22 -05:00
John Ericson	7a53b866f6	Merge branch 'master' into nix-next • Updated input 'nix' (merge): 'github:NixOS/nix/212ba69e6f995992f8b4e4c0656d19c0156c8714' 'github:NixOS/nix/2c4bb93ba5a97e7078896ebc36385ce172960e4e' (2024-01-25) → 'github:NixOS/nix/8df68a213fc52a57b02a57005b0e06cc8de40ce3' (2024-01-25)	2024-01-25 16:26:07 -05:00
John Ericson	c64eed7d07	Simplify `StoreConfig::getDefaultSystemFeatures` call That method is now static.	2024-01-25 15:58:07 -05:00
John Ericson	b1fa6b3aac	Use `StoreConfig::getDefaultSystemFeatures` for default machine config We have to oddly make a `StoreConfig` subclass to get it, but https://github.com/NixOS/nix/pull/9848 will fix that. The purpose of this is to ensure that, absent an explicit config, `localhost` includes `ca-derivations` and `recursive-nix` if those experimental features are enabled. Very much the complement of #1342, the previous PR.	2024-01-24 21:37:13 -05:00
John Ericson	07cb5d1b7c	Use `nix::ParsedDerivation::getRequiredSystemFeatures()` A slight dedup, and also ensures that floating CA derivations require a `ca-derivations` experimental feature. This fixes the scheduling issue that @SuperSandro2000 found.	2024-01-24 21:04:14 -05:00
John Ericson	449eb2d873	Use more `nix::Machine` fields The upstream fields were made to match Hydra, so we can get rid of the extra fields temporarily added in `70e5469303`.	2024-01-24 20:14:31 -05:00
John Ericson	9e7ac58042	Merge branch 'master' into nix-next	2024-01-24 18:36:03 -05:00
John Ericson	d45e14fd43	Merge pull request #1316 from NixOS/ca-derivations-prep Prepare for CA derivation support with lower impact changes	2024-01-24 18:12:42 -05:00
John Ericson	9a86da0e7b	Merge branch 'master' into nix-next	2024-01-23 15:49:14 -05:00
John Ericson	70e5469303	Use Nix's `Machine` type in a minimal way This is just using the fields from that type, and only where the types coincide. (There are two fields with different types, `speedFactor` most interestingly.) No code is reused, so we can be sure that no behavior is changed. Once the types are reconciled on the Nix side, then we can start carefully actually reusing code. Progress on #1164	2024-01-23 12:18:57 -05:00
John Ericson	2e6ee28f9b	`Machine` -> `::Machine` so we don't conflict with Nix's	2024-01-23 11:03:19 -05:00
John Ericson	7386caaecf	Use Nix's `SSHMaster`	2024-01-23 10:24:02 -05:00
John Ericson	84c46b6b68	Update to newer Nix Flake lock file updates: • Updated input 'nix': 'github:NixOS/nix/74534829f23b668fb9b2f2a14ff6afa4d5e71d4a' (2024-01-22) → 'github:NixOS/nix/b6aee9a93f6646bbffd919d362a5c75c37bb9caa' (2024-01-23)	2024-01-23 10:21:48 -05:00
John Ericson	f1d9230f25	Merge remote-tracking branch 'upstream/master' into nix-next	2024-01-23 01:18:13 -05:00
John Ericson	4e8fbaa3d6	Replace `Child` with `SSHMaster::Connection` Nix defines basically an identical struct for the same purpose, so let's just use that.	2024-01-23 01:11:46 -05:00
John Ericson	4ac31c89df	Use `nix::serv_proto::BasicConnection` in build_remote.cc - Use the type itself This lays the foundation for being able to dedup the protocol code. - Use `BasicConnection::handshake`, replacing ours. - Use `BasicConnection::queryValidPaths` - Use `BasicConnection::putBuildDerivationRequest`	2024-01-22 14:20:39 -05:00
John Ericson	89cfe26533	Merge remote-tracking branch 'upstream/master' into nix-next	2024-01-22 13:01:40 -05:00
John Ericson	588a0c5269	Merge remote-tracking branch 'upstream/master' into ca-derivations-prep	2023-12-23 19:19:54 -05:00
John Ericson	75f26f1fc4	Clean up `std::optional` dereferencing in the queue runner Instead of doing this partial operation a number of times, assert (with a comment), get a reference to the thing inside, and use that just once. (This refactor was done twice, "just once" for each time.)	2023-12-23 19:10:58 -05:00
John Ericson	6e67884ff1	One more `queryDerivationOutputMap` should use the eval store param	2023-12-11 14:05:18 -05:00
John Ericson	a6b6c5a539	Revert query -- those columns don't exist yet!	2023-12-11 12:58:54 -05:00
John Ericson	ebfefb9161	Sync up with some changes done to the main CA branch	2023-12-11 12:46:36 -05:00
John Ericson	8783dd53f6	Merge remote-tracking branch 'upstream/master' into ca-derivations-prep	2023-12-11 12:42:43 -05:00
John Ericson	f3a760ad9c	Merge pull request #1324 from obsidiansystems/serve-proto-build-options-serializer Use `ServeProto::Serialise<ServeProto::BuildOptions>`	2023-12-11 10:45:33 -05:00
John Ericson	8c10331ee8	Fix `totalNarSize` summation I accidentally removed it in `d0d3b0a298`.	2023-12-10 14:05:26 -05:00
John Ericson	20f5a2120c	Use `ServeProto::Serialise<ServeProto::BuildOptions>`	2023-12-10 13:24:17 -05:00
John Ericson	b56d2383c1	Do not attempt to speak a newer version of the protocol Both sides need to agree on a version (with `std::min`) for anything to work. Somehow... we've never done this. With this commit, the next commit succeeds. Without this commit, the next commit fails. This is because the next commit exposes serializers which do different things for proto version 2.7, and we're currently requesting 2.6. Opened https://github.com/NixOS/nix/issues/9584 to track this issue	2023-12-10 13:24:17 -05:00
John Ericson	69a5b00e60	Use `ServeProto::BuildOption` More deduplication with Nix.	2023-12-10 13:01:00 -05:00
John Ericson	f6f817926a	`std::move` into the path info map	2023-12-09 12:12:00 -05:00
John Ericson	d0d3b0a298	Use `ServeProto::Serialise<UnkeyedValidPathInfo>` for `QueryValidPaths` Companion to already-merged https://github.com/NixOS/nix/pull/9560	2023-12-09 12:08:04 -05:00
John Ericson	3f932a6731	build-remote: Use `std::map<StorePath, UnkeyedValidPathInfo>` It is less denormalized	2023-12-09 11:59:09 -05:00
John Ericson	4515b5aa17	Merge pull request #1321 from NixOS/master Merge `master` into `nix-next`	2023-12-09 11:53:58 -05:00
John Ericson	831021808c	Merge pull request #1318 from obsidiansystems/use-build-result-serialiser Use factored-out `BuildResult` serializer	2023-12-08 11:25:05 -05:00
John Ericson	2ee0068fdc	Do not copy for both stores for now It has a performance cost, and as the comment says we should be doing the better solution. We want to land this preparatory change on prod while the rest is still on staging, so we should just skip it for now. Skipping it will not affect regular fixed-output and input-addressed derivations, which are the only ones prod would deal with upon getting this code. The main CA derivations support branch will revert this commit so it still works.	2023-12-07 15:05:03 -05:00
John Ericson	31ea6458ca	Merge remote-tracking branch 'upstream/master' into ca-derivations-prep	2023-12-07 15:01:35 -05:00
John Ericson	20c8263e3c	Update to Nix master The point of this branch is to always track Nix master, so we are proactively ready to upgrade to the next Nix release when it is ready. Flake lock file updates: • Updated input 'nix': 'github:NixOS/nix/50f8f1c8bc019a4c0fd098b9ac674b94cfc6af0d' (2023-11-27) → 'github:NixOS/nix/c3827ff6348a4d5199eaddf8dbc2ca2e2ef46ec5' (2023-12-07) • Added input 'nix/libgit2': 'github:libgit2/libgit2/45fd9ed7ae1a9b74b957ef4f337bc3c8b3df01b5' (2023-10-18)	2023-12-07 13:11:31 -05:00
John Ericson	6a54ab24e2	Use factored-out `BuildResult` serializer For the record, here is the Nix 2.19 version: https://github.com/NixOS/nix/blob/2.19-maintenance/src/libstore/serve-protocol.cc, which is what we would initially use. It is a more complete version of what Hydra has today except for one thing: it always unconditionally sets the start/stop times. I think that is correct, as the other end seems to unconditionally measure them, but just to be extra careful, I reproduced the old behavior of falling back on Hydra's own measurements if `startTime` is 0. The only difference is that the fallback `stopTime` is now measured from after the entire `BuildResult` is transferred over the wire, but I think that should be negligible if it is measurable at all. (And remember, this is a fallback case I already suspect is dead code.)	2023-12-07 02:00:22 -05:00
John Ericson	86cd5e9076	`copyClosureTo`: Use `SubstituteFlag` instead of `bool` This matches Nix (in the same serialization logic in `src/libstore/legacy-ssh-store.cc`) and adds clarity.	2023-12-07 00:18:50 -05:00
John Ericson	11f8030b0f	Add comment from GitHub about adding to store as code comment	2023-12-06 17:59:25 -05:00
John Ericson	e3443cd22a	Put back nicer `copyClosure` instead of manual closure + copy It looks like we accidentally got the old code back, probably after a merge conflict resolution.	2023-12-04 17:41:11 -05:00
John Ericson	8046ec2668	Remove unused `outputHashes` variable This looks like a stray copy paste.	2023-12-04 16:21:56 -05:00
John Ericson	9ba4417940	Prepare for CA derivation support with lower impact changes This is just C++ changes without any Perl / Frontend / SQL Schema changes. The idea is that it should be possible to redeploy Hydra with these changes with (a) no schema migration and also (b) no regressions. We should be able to much more safely deploy these to a staging server and then production `hydra.nixos.org`. Extracted from #875 Co-Authored-By: Théophane Hufschmitt <theophane.hufschmitt@tweag.io> Co-Authored-By: Alexander Sosedkin <monk@unboiled.info> Co-Authored-By: Andrea Ciceri <andrea.ciceri@autistici.org> Co-Authored-By: Charlotte 🦝 Delenk Mlotte@chir.rs> Co-Authored-By: Sandro Jäckel <sandro.jaeckel@gmail.com>	2023-12-04 16:14:47 -05:00
John Ericson	a5d44b60ea	Merge pull request #1313 from obsidiansystems/split-buildRemote Split the `buildRemote` function, take 2	2023-12-04 11:37:36 -05:00
John Ericson	363604846a	Again, use `const` in for loop As requested by @teh. Was lost in merge with master, now added back.	2023-12-04 11:31:05 -05:00
John Ericson	162b538912	Remove unused `thisArrow` variable	2023-12-04 11:27:39 -05:00
John Ericson	104baef503	Document the connection initialization process	2023-12-04 09:42:04 -05:00
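
The pool behaviour described in commit 99afff03b0 (a released slot is reused even if its connection is broken, unless it is marked bad) can be sketched roughly as follows. This is a toy, single-threaded illustration and not Nix's or Hydra's actual code: `ToyPool`, `FakeConnection`, and `runQuery` are hypothetical names, the slot limit N and the waiting logic are left out, and a plain `std::runtime_error` stands in for `pqxx::broken_connection`. Only the name `markBad` is taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a database connection.
struct FakeConnection {
    int id;
    bool broken = false;
    explicit FakeConnection(int id) : id(id) {}
    void query() const {
        if (broken) throw std::runtime_error("Lost connection to the database server.");
    }
};

// Toy pool: hands out an idle resource if one exists, otherwise creates one.
template <typename R>
class ToyPool {
    std::function<std::unique_ptr<R>()> factory;
    std::vector<std::unique_ptr<R>> idle;

public:
    explicit ToyPool(std::function<std::unique_ptr<R>()> f) : factory(std::move(f)) {}

    class Handle {
        ToyPool & pool;
        std::unique_ptr<R> res;
        bool bad = false;

    public:
        Handle(ToyPool & p, std::unique_ptr<R> r) : pool(p), res(std::move(r)) {}
        Handle(Handle && o) : pool(o.pool), res(std::move(o.res)), bad(o.bad) {}
        Handle(const Handle &) = delete;
        Handle & operator=(const Handle &) = delete;

        // On release, a healthy resource goes back to the idle list;
        // a resource marked bad is simply destroyed.
        ~Handle() {
            if (res && !bad) pool.idle.push_back(std::move(res));
        }

        R * operator->() { return res.get(); }
        void markBad() { bad = true; }
    };

    Handle get() {
        if (!idle.empty()) {
            auto r = std::move(idle.back());
            idle.pop_back();
            return Handle(*this, std::move(r));
        }
        return Handle(*this, factory());
    }

    std::size_t idleCount() const { return idle.size(); }
};

// Runs one query. On a connection error the slot is marked bad, so the
// next caller gets a fresh connection instead of the broken one.
bool runQuery(ToyPool<FakeConnection> & pool) {
    auto conn = pool.get();
    try {
        conn->query();
        return true;
    } catch (const std::runtime_error &) {
        conn.markBad();
        return false;
    }
}
```

Without the `markBad()` call in the `catch` branch, the broken connection would return to the idle list and every later `runQuery` would fail in the same way, which matches the symptom the commit message reports.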

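The version agreement mentioned in commit b56d2383c1 (both sides settle on the lower of the two versions via `std::min`) might look like the sketch below. The helper names `makeVersion`, `minorOf`, and `negotiate` are hypothetical, and packing the version as major in the high byte and minor in the low byte is an assumption made for illustration; it is not copied from Hydra or Nix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using ProtoVersion = std::uint32_t;

// Assumed encoding: major in bits 8..15, minor in bits 0..7.
inline ProtoVersion makeVersion(unsigned major, unsigned minor) {
    return (static_cast<ProtoVersion>(major) << 8) | (minor & 0xffu);
}

inline unsigned minorOf(ProtoVersion v) {
    return v & 0xffu;
}

// Each side computes the same result from the two advertised versions,
// so neither side tries to speak a dialect the other does not know.
inline ProtoVersion negotiate(ProtoVersion ours, ProtoVersion theirs) {
    return std::min(ours, theirs);
}
```

Under these assumptions, a client that advertises 2.6 talking to a server that advertises 2.7 ends up on 2.6, so serializers whose behaviour changes at 2.7 are chosen consistently on both ends.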

440 Commits