Depends on https://github.com/nix-community/nix-eval-jobs/pull/349 & #1421.
Almost equivalent to #1425, but with a small change: when an aggregate
job has a glob that matches nothing, the jobset evaluation now fails.
This was the intended behavior before (hydra-eval-jobset fails hard if
an aggregate is broken), but that code path was never reached, since the
aggregate was never marked as broken in this case.
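Purely as an illustration of that behaviour (a Python sketch with
made-up names, not the actual evaluator code), failing hard when a
constituent glob matches nothing could look like:

```
import fnmatch

# Hypothetical sketch: an aggregate's constituent glob must match at least
# one job name, otherwise the whole jobset evaluation fails.
def resolve_constituents(glob_pattern, job_names):
    matches = fnmatch.filter(job_names, glob_pattern)
    if not matches:
        raise RuntimeError(
            f"aggregate constituent glob '{glob_pattern}' matched no jobs"
        )
    return matches

print(resolve_constituents("tests.*", ["tests.basic", "tests.ldap", "release"]))
```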
incrementally ingest eval results
nix-eval-jobs streams output, unlike hydra-eval-jobs. Now that we've
migrated, we can use this to:
1. Use less RAM by avoiding buffering a whole eval's worth of metadata
into a Perl string and an array of JSON objects.
2. Lower eval latency a bit by allowing the queue runner to start
ingesting builds sooner (see the sketch after this list).
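As an illustration of this streaming pattern (a Python sketch, not
Hydra's actual Perl ingester; the flake ref is an example), consuming
nix-eval-jobs' one-JSON-object-per-line output incrementally might look
like:

```
import json
import subprocess

# Sketch: ingest each job record as soon as nix-eval-jobs prints it,
# instead of buffering the whole evaluation in memory first.
proc = subprocess.Popen(
    ["nix-eval-jobs", "--flake", ".#hydraJobs"],
    stdout=subprocess.PIPE,
    text=True,
)
for line in proc.stdout:
    job = json.loads(line)  # one JSON object per line
    # Hand the job off for ingestion immediately (placeholder print).
    print(job.get("attr"), job.get("error") or job.get("drvPath"))
proc.wait()
```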
Also use the newly-restored constituents support in `nix-eval-jobs`
Note: we now pass --workers and --max-memory-size to n-e-j. These were
lost in the h-e-j -> n-e-j migration, causing evaluation to always be
single-threaded and limited to 4 GiB of RAM. We now follow the config
settings the way h-e-j used to (via its C++ code).
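For illustration, a Python sketch of how those settings might be
threaded through to the n-e-j command line (the config key names are
assumptions based on what h-e-j used to read; the flags themselves are
real n-e-j options):

```
# Sketch: build the nix-eval-jobs invocation from hydra.conf-style settings.
def nej_command(config):
    workers = config.get("evaluator_workers", 1)  # assumed key name
    max_memory_mib = config.get("evaluator_max_memory_size", 4096)  # assumed
    return [
        "nix-eval-jobs",
        "--workers", str(workers),
        "--max-memory-size", str(max_memory_mib),
    ]

print(nej_command({"evaluator_workers": 8, "evaluator_max_memory_size": 16384}))
```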
`nix-eval-jobs` should check `hydraJobs` and then `checks` with flakes
(cherry picked from commit 6d4ccff43c41adaf6e4b2b9bced7243bc2f6e97b)
(cherry picked from commit b0e9b4b2f99f9d8f5c4e780e89f955c394b5ced4)
(cherry picked from commit cdfc5c81e8037d3e4818a3e459d0804b2c157ea9)
(cherry picked from commit 4b107e6ff36bd89958fba36e0fe0340903e7cd13)
Co-Authored-By: Maximilian Bosch <maximilian@mbosch.me>
We've seen many test failures on ofborg; a lot of them ultimately appear
to come down to a timeout being hit, resulting in something like this:

    Failure executing slapadd -F /<path>/slap.d -b dc=example -l /<path>/load.ldif.

Hopefully this resolves it for most cases.
I've done some endurance testing and this helps a lot.
Some other commands also regularly time out under high load:
- hydra-init
- hydra-create-user
- nix-store --delete
This should address most issues with tests randomly failing.
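The real change is in the Perl test setup; purely to illustrate the
idea, a retry-with-timeout wrapper for such load-sensitive commands
might look like this Python sketch (command, limits, and backoff are
hypothetical):

```
import subprocess
import time

# Hypothetical sketch: retry a command that sometimes stalls under load.
def run_with_retries(cmd, attempts=3, timeout=60):
    for attempt in range(1, attempts + 1):
        try:
            return subprocess.run(cmd, timeout=timeout, check=True)
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
            if attempt == attempts:
                raise
            time.sleep(attempt)  # simple linear backoff before retrying

# e.g.: run_with_retries(["nix-store", "--delete", some_store_path])
```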
Used the following script for endurance testing:
```
import os
import subprocess

run_counter = 0
fail_counter = 0

while True:
    try:
        run_counter += 1
        print(f"Starting run {run_counter}")
        # Copy the environment so we don't mutate this process's os.environ.
        env = os.environ.copy()
        env["YATH_JOB_COUNT"] = "20"
        result = subprocess.run(["perl", "t/test.pl"], env=env)
        if result.returncode != 0:
            fail_counter += 1
        print(f"Finished run {run_counter}, total fail count: {fail_counter}")
    except KeyboardInterrupt:
        print(f"Finished {run_counter} runs with {fail_counter} fails")
        break
```
In case someone else wants to do it on their system :).
Note that YATH_JOB_COUNT may need to be adjusted based on your core
count. I only have 4 cores (8 threads), so for others, higher numbers
might yield better results in shaking out unstable tests.
This version has a worse UI, but also changes the schema less: one
non-null constraint is removed, but no new columns are added.
Co-Authored-By: Andrea Ciceri <andrea.ciceri@autistici.org>
Co-Authored-By: regnat <rg@regnat.ovh>
At the moment, aggregate jobs can easily break and cause the entire
evaluation to fail, which is not ideal. For Nixpkgs, we do have some
important aggregate jobs (like `tested`), but for debugging and building
purposes it's still useful to get a partial result even if the channel
won't actually advance.
This commit changes the behaviour of hydra-eval-jobs so that it collects
any errors found during the construction of an aggregate and, instead of
failing the evaluation, annotates the job with the failure so that it
shows up in a "cleaner" way.
There are really two types of failure that we care about: one where the
attribute ends up missing altogether from the final output, and one
where the attribute is present in the output but fails to evaluate. Both
are handled here.
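As a rough illustration (a Python sketch with made-up names; the real
logic lives in the evaluator), collecting both failure modes onto the
aggregate instead of aborting might look like:

```
# Hypothetical sketch: annotate the aggregate with all constituent errors.
def annotate_aggregate(aggregate, constituent_names, jobs):
    errors = []
    for name in constituent_names:
        job = jobs.get(name)
        if job is None:
            # Failure type 1: attribute missing from the eval output entirely.
            errors.append(f"constituent '{name}' does not exist")
        elif job.get("error"):
            # Failure type 2: attribute present but it failed to evaluate.
            errors.append(f"constituent '{name}' failed: {job['error']}")
    if errors:
        aggregate["error"] = "\n".join(errors)
    return aggregate
```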
Note that this does mean that the same error message may be output
multiple times, but this aids debuggability because it'll be much
clearer what's blocking the job from being created.