Add a plugin for backing up builds in S3
In your hydra config, you can add an arbitrary number of <s3config>
sections, with the following options (an example section is sketched
just below the list):
* name (required): Bucket name
* jobs (required): A regex matched against job names (in
  project:jobset:job format); matching jobs are backed up to this bucket
* compression_type: bzip2 (default), xz, or none
* prefix: String prepended to all hydra-created S3 keys (if this is
  meant to represent a directory, include the trailing slash,
  e.g. "cache/"). Default "".
After each build with an output (i.e. successful or failed-with-output
builds), the output paths and their closure are uploaded to the bucket
as .nar files, with corresponding .narinfo files so that the bucket can
be used as a binary cache (a sample .narinfo is sketched below).
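For reference, each generated .narinfo has the following shape (field
values are shown schematically; Deriver and System are only emitted
when the deriver is known):

  StorePath: /nix/store/<hash>-<name>
  URL: <hash>.nar
  Compression: <bzip2, xz or none>
  FileHash: sha256:<hash of the compressed nar>
  FileSize: <size of the compressed nar in bytes>
  NarHash: <hash of the uncompressed nar>
  NarSize: <size of the uncompressed nar in bytes>
  References: <basenames of the referenced store paths>
  Deriver: <basename of the deriver>
  System: <platform of the deriver>
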
This plugin requires that S3 credentials be available. It uses
Net::Amazon::S3; as of this commit, the version in nixpkgs can retrieve
S3 credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
environment variables, or from EC2 instance metadata when using an IAM
role (the environment-variable route is sketched below).
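For the environment-variable route it is enough that the Hydra process
running the plugin has something like the following in its environment
(the values are placeholders):

  export AWS_ACCESS_KEY_ID=<your access key id>
  export AWS_SECRET_ACCESS_KEY=<your secret access key>
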
This commit also adds a hydra-s3-backup-collect-garbage program, which
uses hydra's gc roots directory to determine which paths are live, and
then deletes all files except nix-cache-info and any .nar or .narinfo
files corresponding to live paths. hydra-s3-backup-collect-garbage
respects the prefix configuration option, so it won't delete anything
outside of the hierarchy you give it, and it has the same credential
requirements as the plugin. A timer unit that runs the garbage
collection periodically should probably be added to hydra-module.nix.
Note that two of the added tests fail, due to a bug in the interaction
between Net::Amazon::S3 and fake-s3. Those behaviors work against real
S3, though, so I'm committing this even with the broken tests.
Signed-off-by: Shea Levy <shea@shealevy.com>
package Hydra::Plugin::S3Backup;

use strict;
use warnings;
use parent 'Hydra::Plugin';
use File::Temp;
use File::Basename;
use Fcntl;
use IO::File;
use Net::Amazon::S3;
use Net::Amazon::S3::Client;
use Digest::SHA;
use Nix::Config;
use Nix::Store;
use Hydra::Model::DB;
use Hydra::Helper::CatalystUtils;

sub isEnabled {
    my ($self) = @_;
    return defined $self->{config}->{s3backup};
}

my $client;
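
# %compressors maps the "compression_type" option to a shell pipeline fragment
# (e.g. "| $Nix::Config::bzip2") that is appended to the `nix-store --dump`
# command used below to produce each NAR.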
my %compressors = ();

$compressors{"none"} = "";

if (defined($Nix::Config::bzip2)) {
    $compressors{"bzip2"} = "| $Nix::Config::bzip2";
}

if (defined($Nix::Config::xz)) {
    $compressors{"xz"} = "| $Nix::Config::xz";
}

my $lockfile = Hydra::Model::DB::getHydraPath . "/.hydra-s3backup.lock";
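# buildFinished takes a shared lock on this file while uploading;
# hydra-s3-backup-collect-garbage is expected to take an exclusive lock, so
# uploads and garbage collection do not run concurrently.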

sub buildFinished {
    my ($self, $build, $dependents) = @_;
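
    # buildstatus 0 = succeeded, 6 = failed but produced output; only such
    # builds have output paths worth copying to the binary cache.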
    return unless $build->buildstatus == 0 or $build->buildstatus == 6;

    my $jobName = showJobName $build;
    my $job = $build->job;

    my $cfg = $self->{config}->{s3backup};
    my @config = defined $cfg ? ref $cfg eq "ARRAY" ? @$cfg : ($cfg) : ();

    my @matching_configs = ();
    foreach my $bucket_config (@config) {
        push @matching_configs, $bucket_config if $jobName =~ /^$bucket_config->{jobs}$/;
    }
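    # The jobs pattern is wrapped in ^...$, so it is matched against the full
    # project:jobset:job name; e.g. a jobs setting of "nixpkgs:trunk:.*"
    # (hypothetical names) matches a $jobName of "nixpkgs:trunk:hello".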

    return unless @matching_configs;
    unless (defined $client) {
        $client = Net::Amazon::S3::Client->new( s3 => Net::Amazon::S3->new( retry => 1 ) );
    }

    # !!! Maybe should do per-bucket locking?
    my $lockhandle = IO::File->new;
    open($lockhandle, "+>", $lockfile) or die "Opening $lockfile: $!";
    flock($lockhandle, Fcntl::LOCK_SH) or die "Read-locking $lockfile: $!";

    my @needed_paths = ();
    foreach my $output ($build->buildoutputs) {
        push @needed_paths, $output->path;
    }

    my %narinfos = ();
    my %compression_types = ();
    foreach my $bucket_config (@matching_configs) {
        my $compression_type =
            exists $bucket_config->{compression_type} ? $bucket_config->{compression_type} : "bzip2";
        die "Unsupported compression type $compression_type" unless exists $compressors{$compression_type};
        if (exists $compression_types{$compression_type}) {
            push @{$compression_types{$compression_type}}, $bucket_config;
        } else {
            $compression_types{$compression_type} = [ $bucket_config ];
            $narinfos{$compression_type} = [];
        }
    }

    my $build_id = $build->id;
    my $tempdir = File::Temp->newdir("s3-backup-nars-$build_id" . "XXXXX", TMPDIR => 1);

    my %seen = ();
    # Upload nars and build narinfos
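    # (@needed_paths starts as the build's output paths and is extended below
    # with each path's references, so the entire runtime closure is walked.)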
    while (@needed_paths) {
        my $path = shift @needed_paths;
        next if exists $seen{$path};
        $seen{$path} = undef;
        my $hash = substr basename($path), 0, 32;
        my ($deriver, $narHash, $time, $narSize, $refs) = queryPathInfo($path, 0);
        my $system;
        if (defined $deriver and isValidPath($deriver)) {
            $system = derivationFromPath($deriver)->{platform};
        }
        foreach my $reference (@{$refs}) {
            push @needed_paths, $reference;
        }
        foreach my $compression_type (keys %compression_types) {
            my $configs = $compression_types{$compression_type};
            my @incomplete_buckets = ();
            # Don't do any work if all the buckets have this path
            foreach my $bucket_config (@{$configs}) {
                my $bucket = $client->bucket( name => $bucket_config->{name} );
                my $prefix = exists $bucket_config->{prefix} ? $bucket_config->{prefix} : "";
                push @incomplete_buckets, $bucket_config
                    unless $bucket->object( key => $prefix . "$hash.narinfo" )->exists;
            }
            next unless @incomplete_buckets;
            my $compressor = $compressors{$compression_type};
            system("$Nix::Config::binDir/nix-store --dump $path $compressor > $tempdir/nar") == 0 or die;
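            # With bzip2 configured this runs, roughly:
            #   <binDir>/nix-store --dump <store path> | <bzip2> > $tempdir/nar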
            my $digest = Digest::SHA->new(256);
            $digest->addfile("$tempdir/nar");
            my $file_hash = $digest->hexdigest;
            my @stats = stat "$tempdir/nar" or die "Couldn't stat $tempdir/nar";
            my $file_size = $stats[7];
            my $narinfo = "";
            $narinfo .= "StorePath: $path\n";
            $narinfo .= "URL: $hash.nar\n";
            $narinfo .= "Compression: $compression_type\n";
            $narinfo .= "FileHash: sha256:$file_hash\n";
            $narinfo .= "FileSize: $file_size\n";
            $narinfo .= "NarHash: $narHash\n";
            $narinfo .= "NarSize: $narSize\n";
            $narinfo .= "References: " . join(" ", map { basename $_ } @{$refs}) . "\n";
            if (defined $deriver) {
                $narinfo .= "Deriver: " . basename $deriver . "\n";
                if (defined $system) {
                    $narinfo .= "System: $system\n";
                }
            }
            push @{$narinfos{$compression_type}}, { hash => $hash, info => $narinfo };
            foreach my $bucket_config (@incomplete_buckets) {
                my $bucket = $client->bucket( name => $bucket_config->{name} );
                my $prefix = exists $bucket_config->{prefix} ? $bucket_config->{prefix} : "";
                my $nar_object = $bucket->object(
                    key => $prefix . "$hash.nar",
                    content_type => "application/x-nix-archive"
                );
                $nar_object->put_filename("$tempdir/nar");
            }
        }
    }

    # Upload narinfos
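    # (narinfos are only uploaded after all nars above, so a .narinfo never
    # becomes visible in the bucket before the nar it describes)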
    foreach my $compression_type (keys %narinfos) {
        my $infos = $narinfos{$compression_type};
        foreach my $bucket_config (@{$compression_types{$compression_type}}) {
            foreach my $info (@{$infos}) {
                my $bucket = $client->bucket( name => $bucket_config->{name} );
                my $prefix = exists $bucket_config->{prefix} ? $bucket_config->{prefix} : "";
                my $narinfo_object = $bucket->object(
                    key => $prefix . $info->{hash} . ".narinfo",
                    content_type => "text/x-nix-narinfo"
                );
                $narinfo_object->put($info->{info}) unless $narinfo_object->exists;
            }
        }
    }
}

1;