hydra

Author	SHA1	Message	Date
John Ericson	2e6ee28f9b	`Machine` -> `::Machine` so we don't conflict with Nix's	2024-01-23 11:03:19 -05:00
Pierre Bourdon	b7c864c515	queue-runner: only re-sort runnables by prio once per dispatch cycle. The previous implementation was O(N²lg(N)) due to sorting the full runnables priority list once per runnable being scheduled. While not confirmed, this is suspected to cause performance issues and bottlenecking with the queue runner when the runnable list gets large enough. This commit changes the dispatcher to instead only sort runnables per priority once per dispatch cycle. This has the drawback of being less reactive to runnable priority changes: the previous code would react immediately, while this might end up using "old" priorities until the next dispatch cycle. However, dispatch cycles are not supposed to take very long (seconds, not minutes/hours), so this is not expected to have much or any practical impact. Ideally runnables would be maintained in a sorted data structure instead of the current approach of copying + sorting in the scheduler. This would however be a much more invasive change to implement, and might have to wait until we can confirm where the queue runner bottlenecks actually lie.	2023-09-08 23:38:30 +02:00
Eelco Dolstra	9f69bb5c2c	Fix compilation against Nix 2.16	2023-06-23 15:06:55 +02:00
Graham Christensen	4acaf9c8b0	hydra-queue-runner: don't dispatch until the machines parser has completed one run. Periodically, I have seen tests fail because of out-of-order queue runner behavior: "checking the queue for builds > 0..."; "loading build 1 (tests:basic:empty_dir)"; "aborting unsupported build step '...-empty-dir.drv' (type 'x86_64-linux')"; "marking build 1 as failed"; "adding new machine ‘localhost’". This patch should prevent the dispatcher from running before any machines are made available.	2022-02-10 10:54:30 -05:00
Graham Christensen	87d46ad5d6	hydra-queue-runner: --build-one: correctly handle a cached build. Previously, the build ID would never flow through channels which exited. This patch tracks the buildOne state as part of State and exits, which avoids waiting forever for new work. The code around buildOnly is a bit rough, making this a bit weird to implement, but since it is only used for testing, the value of improving it on its own is a bit questionable.	2021-03-16 16:13:38 -04:00
Eelco Dolstra	ccd046ca3d	Keep track of the number of unsupported steps (cherry picked from commit `45ffe578b6`)	2020-03-31 22:19:03 +02:00
Eelco Dolstra	4417f9f260	Abort unsupported build steps. If we don't see a machine that supports a build step for 'max_unsupported_time' seconds, the step is aborted. The default is 0, which is appropriate for Hydra installations that don't provision missing machines dynamically. (cherry picked from commit `f5cdbfe21d`)	2020-03-31 22:19:01 +02:00
Eelco Dolstra	e4f5156c41	Build against nix-master (cherry picked from commit `e7f2139e25`)	2020-02-20 10:24:04 +01:00
Eelco Dolstra	423c0440ea	Typo	2018-12-20 12:07:02 +01:00
Eelco Dolstra	c0fac52872	Add some debug code	2018-03-07 10:23:43 +01:00
Eelco Dolstra	de9d7bcf25	hydra-queue-runner: Handle exceptions in the dispatcher thread, e.g. "resource unavailable" when creating new threads.	2016-11-08 11:25:43 +01:00
Eelco Dolstra	b1512a152a	Fix build failure on GCC 5.4	2016-09-30 17:05:07 +02:00
Eelco Dolstra	8c7edb1005	Fix narrowing conversion	2016-04-13 16:30:52 +02:00
Eelco Dolstra	b98a061c24	Add some instrumentation to keep track of dispatcher cost	2016-03-02 14:18:39 +01:00
Eelco Dolstra	6beee0ab49	Fix segfault sorting runnable steps. Same problem as `d744362e4a`. Stack trace excerpt: at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/predefined_ops.h:166 __last@entry=..., __comp=...) at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:1827 __comp=...) at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:4717	2016-03-02 13:59:24 +01:00
Eelco Dolstra	7e954aff03	Keep machine stats even when a machine is removed from the machines file. This is important for the Hydra provisioner, since it needs to be able to see whether a disabled machine still has jobs running on it.	2015-09-02 13:31:47 +02:00
Eelco Dolstra	092d60735b	Keep track of wait time per system type, i.e., how much time the currently runnable steps per system type have been waiting. This is useful for deciding whether to provision more machines.	2015-08-17 15:45:44 +02:00
Eelco Dolstra	ea1eb2e3fb	Keep track of requiredSystemFeatures in the machine stats. For example, steps that require the "kvm" feature may require a different kind of machine to be provisioned. This can also be used to require performance-sensitive tests to run on a particular kind of machine, e.g., by setting requiredSystemFeatures to something like "ec2-i2.8xlarge".	2015-08-17 14:37:57 +02:00
Eelco Dolstra	d571e44b86	Keep stats for the Hydra auto scaler. "hydra-queue-runner --status" now prints how many runnable and running build steps exist for each machine type. This allows additional machines to be provisioned based on the Hydra load.	2015-08-17 13:50:41 +02:00
Eelco Dolstra	576dc0c120	For completeness, re-implement meta.schedulingPriority	2015-08-12 12:05:43 +02:00
Eelco Dolstra	97f11baa8d	Revive jobset scheduling (i.e., taking the jobset scheduling share into account)	2015-08-11 01:31:56 +02:00
Eelco Dolstra	eb13007fe6	Allow build to be bumped to the front of the queue via the web interface. Builds now have a "Bump up" action. This will cause the queue runner to prioritise the steps of the build above all other steps.	2015-08-10 16:19:47 +02:00
Eelco Dolstra	27182c7c1d	Start steps in order of ascending build ID	2015-08-10 16:19:47 +02:00
Eelco Dolstra	593850b956	Fix potential race in dispatcher wakeup	2015-08-10 12:54:55 +02:00
Eelco Dolstra	6a1c950e94	Unindent	2015-08-10 11:33:22 +02:00
Eelco Dolstra	c18fb0ad74	Temporarily disable machines after a connection failure	2015-07-21 15:58:47 +02:00
Eelco Dolstra	7e026d35f7	Split hydra-queue-runner.cc more	2015-07-21 15:14:17 +02:00

27 Commits
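
The trade-off described in commit b7c864c515 (sorting the runnables once per dispatch cycle instead of once per scheduled runnable) can be illustrated with a minimal sketch. This is not Hydra's code: `Runnable`, `runsBefore`, `dispatchResortEachPick`, and `dispatchSortOnce` are hypothetical names invented for the illustration, and the ordering rule (higher priority first, then ascending ID) is an assumption loosely modelled on the messages above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a runnable build step (not Hydra's real type).
struct Runnable {
    int id;
    int priority; // assumption: a larger value is dispatched earlier
};

// Strict weak ordering: higher priority first, ties broken by ascending id.
bool runsBefore(const Runnable & a, const Runnable & b)
{
    if (a.priority != b.priority) return a.priority > b.priority;
    return a.id < b.id;
}

// Shape of the older approach: re-sort the full list before every pick.
// sortCount records how many sorts were performed.
std::vector<int> dispatchResortEachPick(std::vector<Runnable> runnables,
                                        std::size_t slots,
                                        std::size_t & sortCount)
{
    std::vector<int> started;
    while (started.size() < slots && !runnables.empty()) {
        std::sort(runnables.begin(), runnables.end(), runsBefore);
        ++sortCount;
        started.push_back(runnables.front().id);
        runnables.erase(runnables.begin());
    }
    return started;
}

// Shape of the newer approach: sort once, then walk the sorted list.
// Priorities that change mid-cycle are only seen in the next cycle.
std::vector<int> dispatchSortOnce(std::vector<Runnable> runnables,
                                  std::size_t slots,
                                  std::size_t & sortCount)
{
    std::sort(runnables.begin(), runnables.end(), runsBefore);
    ++sortCount;
    std::vector<int> started;
    for (std::size_t i = 0; i < runnables.size() && started.size() < slots; ++i)
        started.push_back(runnables[i].id);
    return started;
}
```

When priorities do not change during a cycle, both functions start the same steps in the same order; the second simply performs one sort instead of one per started step, which is the cost the commit message identifies.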