hydra

Author	SHA1	Message	Date
ajs124	17094c8371	lazy-load evaluation errors Closes #1362	2025-04-09 11:31:47 -04:00
Maximilian Bosch	90a8a0d94a	Reimplement (named) constituent jobs (+globbing) based on nix-eval-jobs Depends on https://github.com/nix-community/nix-eval-jobs/pull/349 & #1421. Almost equivalent to #1425, but with a small change: when having e.g. an aggregate job with a glob that matches nothing, the jobset evaluation is failed now. This was the intended behavior before (hydra-eval-jobset fails hard if an aggregate is broken), the code-path was never reached however since the aggregate was never marked as broken in this case before.	2025-04-09 11:31:47 -04:00
John Ericson	341b2f1309	Update build system to depend on Nix 2.26	2025-02-13 21:54:35 -05:00
Pierre Bourdon	2f92846e5a	hydra-eval-jobs: remove, replaced by nix-eval-jobs (cherry picked from commit ed7c58708cd3affd62a598a22a500ed2adf318bf)	2025-02-07 16:55:28 -05:00
Pierre Bourdon	d84ff32ce6	hydra-eval-jobset: Use `nix-eval-jobs` instead of `hydra-eval-jobs` incrementally ingest eval results nix-eval-jobs streams output, unlike hydra-eval-jobs. Now that we've migrated, we can use this to: 1. Use less RAM by avoiding buffering a whole eval's worth of metadata into a Perl string and an array of JSON objects. 2. Make evals latency a bit lower by allowing the queue runner to start ingesting builds faster. Also use the newly-restored constituents support in `nix-eval-jobs` Note, we pass --workers and --max-memory-size to n-e-j Lost in the h-e-j -> n-e-j migration, causing evaluation to always be single threaded and limited to 4GiB RAM. Follow the config settings like h-e-j used to do (via C++ code). `nix-eval-jobs` should check `hydraJobs` and then `checks` with flakes (cherry picked from commit 6d4ccff43c41adaf6e4b2b9bced7243bc2f6e97b) (cherry picked from commit b0e9b4b2f99f9d8f5c4e780e89f955c394b5ced4) (cherry picked from commit cdfc5c81e8037d3e4818a3e459d0804b2c157ea9) (cherry picked from commit 4b107e6ff36bd89958fba36e0fe0340903e7cd13) Co-Authored-By: Maximilian Bosch <maximilian@mbosch.me>	2025-02-07 16:55:28 -05:00
John Ericson	141b5fd0b5	Improve tests around constituents - Test how shorter names are preferred when multiple jobs resolve to the same derivation. - Test the exact aggregate map we get, by looking in the DB.	2025-02-07 16:39:13 -05:00
John Ericson	8a8ac14877	Test using Hydra with flakes It seemed there was no self-contained end-to-end test actually doing this?! Among other things, this will help ensure that the switch-over to `nix-eval-jobs` is correct.	2025-02-06 21:30:49 -05:00
Pierre Bourdon	182a48c9fb	autotools -> meson Original commit message: > There are some known regressions regarding local testing setups - since > everything was kinda half written with the expectation that build dir = > source dir (which should not be true anymore). But everything builds and > the test suite runs fine, after several hours spent debugging random > crashes in libpqxx with MALLOC_PERTURB_... I have not experienced regressions with local testing. (cherry picked from commit 4b886d9c45cd2d7fe9b0a8dbc05c7318d46f615d)	2024-11-24 15:58:26 -05:00
Jörg Thalheim	b6f44b5cd0	Merge pull request #1402 from NixOS/like-sub tests: use `like` for testing regexes	2024-09-15 23:50:13 +02:00
Martin Weinelt	f730433789	Create eval-jobset role and guard /api/push route	2024-08-27 19:49:05 +02:00
Janne Heß	916531dc9c	api: Require POST for /api/push	2024-08-27 17:52:13 +02:00
Jörg Thalheim	250780aaf2	tests: use `like` for testing regexes This gives us better diagnostics when the test fails.	2024-08-21 08:34:25 +02:00
Rick van Schijndel	54002f0fcf	t/evaluator/evaluate-oom-job.t: always skip, the test always fails We should look into how to resolve this, but I tried some things and nothing really worked. Let's put it skipped for now until someone comes along to improve it.	2024-07-31 17:15:02 +02:00
Rick van Schijndel	a6b14369ee	t/test.pl: increase event-timeout, set qvf Only log issues/failures when something's actually up. It has irked me for a long time that so much output came out of running the tests, this seems to silence it. It does hide some warnings, but I think it makes the output so much more readable that it's worth the tradeoff. Helps for highly parallel running of jobs, sometimes they'd not give output for a while. Setting this timeout higher appears to help. Not completely sure if this is the right place to do it, but it works fine for me.	2024-07-31 17:15:02 +02:00
Rick van Schijndel	578a3d2292	t: increase timeouts for slow commands with high load
We've seen many fails on ofborg, a lot of them ultimately appear to come down to a timeout being hit, resulting in something like this:
Failure executing slapadd -F /<path>/slap.d -b dc=example -l /<path>/load.ldif.
Hopefully this resolves it for most cases. I've done some endurance testing and this helps a lot. some other commands also regularly time-out with high load:
- hydra-init
- hydra-create-user
- nix-store --delete
This should address most issues with tests randomly failing. Used the following script for endurance testing:
```
import os
import subprocess
run_counter = 0
fail_counter = 0
while True:
    try:
        run_counter += 1
        print(f"Starting run {run_counter}")
        env = os.environ
        env["YATH_JOB_COUNT"] = "20"
        result = subprocess.run(["perl", "t/test.pl"], env=env)
        if (result.returncode != 0):
            fail_counter += 1
        print(f"Finish run {run_counter}, total fail count: {fail_counter}")
    except KeyboardInterrupt:
        print(f"Finished {run_counter} runs with {fail_counter} fails")
        break
```
In case someone else wants to do it on their system :). Note that YATH_JOB_COUNT may need to be changed loosely based on your cores. I only have 4 cores (8 threads), so for others higher numbers might yield better results in hashing out unstable tests.	2024-07-31 17:13:28 +02:00
John Ericson	8b48579593	Merge pull request #1374 from Mindavi/bugfix/rendering-issue-content-addressed ca-derivations: fix rendering issue	2024-04-18 13:08:30 -04:00
Rick van Schijndel	3f913a771d	t: content-addressed: add a comment about a misleading testcase	2024-04-03 22:55:42 +02:00
Rick van Schijndel	1665aed5e3	t: content-addressed: add test for caDependingOnFailingCA This uncovers an issue with the front-end.	2024-04-03 22:45:53 +02:00
Maximilian Bosch	e499509595	Switch to new Nix bindings, update Nix for that Implements support for Nix's new Perl bindings[1]. The current state basically does `openStore()`, but always uses `auto` and doesn't support stores at other URIs. Even though the stores are cached inside the Perl implementation, I decided to instantiate those once in the Nix helper module. That way store openings aren't cluttered across the entire codebase. Also, there are two stores used later on - MACHINE_LOCAL_STORE for `auto`, BINARY_CACHE_STORE for the one from `store_uri` in `hydra.conf` - and using consistent names should make the intent clearer then. This doesn't contain any behavioral changes, i.e. the build product availability issue from #1352 isn't fixed. This patch only contains the migration to the new API. [1] https://github.com/NixOS/nix/pull/9863	2024-02-12 18:50:56 +01:00
John Ericson	c62eaf248f	Remove now-unneeded workaround	2024-01-26 01:20:07 -05:00
John Ericson	13b5f007ef	Merge branch 'master' into ca-no-new-col	2024-01-26 01:19:45 -05:00
John Ericson	5ee0e443e4	Remove now-unneeded workaround	2024-01-26 01:08:11 -05:00
John Ericson	323b556dc8	Minimal CA support This version has a worse UI, but also changes the schema less: One non-null constraint is removed, but no new columns are added. Co-Authored-By: Andrea Ciceri <andrea.ciceri@autistici.org> Co-Authored-By: regnat <rg@regnat.ovh>	2024-01-26 00:34:58 -05:00
John Ericson	fcde5908d8	More CA derivations prep Again, with care not to change the schema in any way.	2024-01-25 21:32:22 -05:00
John Ericson	411e4d0c24	Let tests themselves intentionally leak temp dir (#1320) * Let tests themselves intentionally leak temp dir By default Yath will clean up temporary files, so the result is the same. But `--keep-dirs` can be passed to `yath test` telling Yath to not clean them up instead. This is very useful for debugging. * Update t/lib/HydraTestContext.pm Co-authored-by: Cole Helbling <cole.e.helbling@outlook.com>	2023-12-08 16:30:31 +00:00
Eelco Dolstra	ce001bb142	Relax time interval checks I saw one of these failing randomly.	2023-06-23 15:09:09 +02:00
Maximilian Bosch	fd765bc97a	Fix "My Jobs" tab in user dashboard Nowadays `Builds` doesn't reference `Project` directly anymore. This means that simply resolving both `jobset` and `project` with a single JOIN from `Builds` doesn't work anymore. Instead we need to resolve the relation to `jobset` first and then the relation to `project`. For similar fixes see e.g. `c7c4759600`.	2022-11-22 20:54:51 +01:00
Maximilian Bosch	d3fe4ffbf6	Job: expose `closuresize` and `size` (output size in the UI) as prometheus metrics	2022-09-22 10:47:22 +02:00
Eelco Dolstra	c72bed5cb4	Fix tests Use $NIX_REMOTE instead of the legacy environment variables.	2022-07-12 14:45:30 +02:00
ajs124	bb1f04ed86	AddBuilds: fix declarative jobsets with dynamic runcommand enabled $project->{enable_dynamic_run_command} is undefined	2022-06-30 01:49:30 +02:00
Kayla Firestack	065039beba	feat(t/evaluator/evaluate-oom): comment intentions	2022-05-02 15:26:26 -04:00
Kayla Firestack	87f610e7c1	fix(t/evaluator/evaluate-oom): use `test_context` to get path to ./t/jobs instead of relative paths	2022-05-02 15:14:46 -04:00
Kayla Firestack	013a1dcabc	fix(t/evaluator/evaluate-oom): check that the exit value of the `systemd-run` check is zero. Rework skip messages	2022-05-02 15:13:59 -04:00
Kayla Firestack	e917d9e546	fix(t/evaluator/evaluate-oom): convert systemd-run presence check to eval, fix indentation, show relationships between flags and commands with indentation	2022-05-02 14:40:13 -04:00
Kayla Firestack	01ec004108	feat(t/evaluator/evaluate-oom-job): skip test if systemd-run is not present	2022-05-02 14:08:50 -04:00
Kayla Firestack	2c909c038f	feat(t/evaluator/hydra-eval-jobs): add basic evaluation test for hydra-eval-jobs	2022-05-02 13:50:57 -04:00
Kayla Firestack	90769ab5ad	feat(t/jobs): add test job to cause an OOM	2022-05-02 13:49:32 -04:00
Graham Christensen	5c90edd19f	Merge pull request #1103 from DeterminateSystems/runcommand/dynamic Dynamic RunCommand	2022-04-19 10:09:47 -04:00
Graham Christensen	e1965250b5	Merge pull request #1173 from DeterminateSystems/queue-runner-exporter hydra-queue-runner metrics	2022-04-07 12:27:33 -04:00
Cole Helbling	edf3c348f2	hydra-queue-runner: make entire address configurable	2022-04-06 10:59:45 -07:00
Cole Helbling	9c1f36c47c	t/lib/HydraTestContext: set queue runner port to 0 This makes the exposer choose a random, available port.	2022-03-29 11:41:23 -07:00
Graham Christensen	e5393c2cf8	fixup: make id non-ambiguous	2022-03-19 23:56:47 -04:00
Graham Christensen	a582e4c485	HydraTestContext: add \n's to various dies	2022-03-19 14:46:53 -04:00
Graham Christensen	0c51de6334	hydra-evaluate-jobset: assert it logs errored constituents properly	2022-03-19 14:35:30 -04:00
Graham Christensen	25f6bae847	HydraTestContext: make it easy to create a jobset without evaluating	2022-03-19 14:34:43 -04:00
Graham Christensen	e0921eba0a	Create a basic test which verifies we can't delete the derivation of aggregate jobs	2022-02-20 12:28:40 -05:00
Graham Christensen	be46f02164	tests: relocate evaluator tests	2022-02-20 12:28:40 -05:00
Graham Christensen	5d169e3a2e	Add a test validating direct and indirect constituents	2022-02-20 12:28:40 -05:00
Graham Christensen	dfb3eccfaa	Merge pull request #1140 from Ma27/nix-update Update Nix to 2.6	2022-02-19 08:38:34 -05:00
Cole Helbling	a22a8fa62d	AddBuilds: reject declarative jobsets with dynamic runcommand enabled if disabled elsewhere	2022-02-11 14:35:52 -05:00

291 Commits