Title: How the OpenBSD -stable packages are built
Author: Solène
Date: 29 October 2020
Tags: openbsd
Description:
In this long blog post, I will write about the technical details
of the OpenBSD stable packages building infrastructure. I set up
the infrastructure with the help of Theo de Raadt, who provided me
with the hardware in summer 2019. Since then, OpenBSD users have been
able to upgrade their packages using `pkg_add -u` to get critical
updates that have been backported by the contributors. Many thanks to
them: without their work there would be no packages to build. Thanks
to pea@, who is my backup for operating this infrastructure in case
something happens to me.

**The total lines of code used is around 110 lines of shell.**
## Original design

In the original design, the process was as follows. It was done
separately on each machine (amd64, arm64, i386, sparc64).
### Updating ports

The first step is to update the ports tree using `cvs up` from a cron
job and capture its output. **If** there is any output, the ports tree
changed: the output itself is discarded and the process continues
with the next steps.

Because CVS works per directory and does not use a database like git
or svn, it is not possible to "poll" for an update other than by
checking every directory for new versions of files. This check is
done three times a day.
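The "continue only if `cvs up` printed something" check can be sketched as below. This is a minimal illustration and not the actual script: the `ports_updated` function, the `UPDATE_CMD` variable and the `-q` and `-Pd` flags are assumptions, as the post only says that `cvs up` is run from cron and that its output is captured.

```shell
#!/bin/sh
# Sketch only: UPDATE_CMD and ports_updated are hypothetical names.
# -q keeps cvs quiet so that only touched files produce output.
UPDATE_CMD=${UPDATE_CMD:-"cvs -q up -Pd"}

# Succeeds (exit 0) when the update command printed at least one line
# while running inside the directory given as first argument.
ports_updated() {
    out=$(cd "$1" && $UPDATE_CMD 2>/dev/null)
    [ -n "$out" ]
}

# Example use from a cron job (commented out, needs a CVS checkout):
# ports_updated /usr/ports || exit 0   # nothing changed, stop here
```

Testing for non-empty output rather than an exit status matches the description above, since `cvs up` succeeds whether or not anything changed.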
### Make a list of ports to compile

This step is the most complicated of the process and accounts for a
third of the total lines of code.

The script uses `cvs rdiff` between the cvs release and stable
branches to show what changed since the release. Its output is
passed through a few grep and awk scripts to retrieve only the
"pkgpaths" (the pkgpath of curl is **net/curl**) of the packages
that were updated since the last release.
From this raw output of cvs rdiff:

    File ports/net/dhcpcd/Makefile changed from revision 1.80 to 1.80.2.1
    File ports/net/dhcpcd/distinfo changed from revision 1.48 to 1.48.2.1
    File ports/net/dnsdist/Makefile changed from revision 1.19 to 1.19.2.1
    File ports/net/dnsdist/distinfo changed from revision 1.7 to 1.7.2.1
    File ports/net/icinga/core2/Makefile changed from revision 1.104 to 1.104.2.1
    File ports/net/icinga/core2/distinfo changed from revision 1.40 to 1.40.2.1
    File ports/net/synapse/Makefile changed from revision 1.13 to 1.13.2.1
    File ports/net/synapse/distinfo changed from revision 1.11 to 1.11.2.1
    File ports/net/synapse/pkg/PLIST changed from revision 1.10 to 1.10.2.1

The script will produce:

    net/dhcpcd
    net/dnsdist
    net/icinga/core2
    net/synapse
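The reduction from rdiff lines to pkgpaths could look like the sketch below. It is not the real grep and awk code, which is not shown in this post: `rdiff_to_pkgpaths` is a hypothetical helper, and the list of port subdirectories it strips (`pkg`, `patches`, `files`) is an assumption. Changes outside of a port directory, such as the infrastructure, would need extra filtering.

```shell
#!/bin/sh
# Sketch only: rdiff_to_pkgpaths is a hypothetical helper, not the
# function used in the real scripts.
# Reads "File ports/<pkgpath>/<file> changed ..." lines on stdin and
# prints each pkgpath once.
rdiff_to_pkgpaths() {
    awk '$1 == "File" {
        path = $2
        sub("^ports/", "", path)                      # drop the module prefix
        sub("/[^/]*$", "", path)                      # drop the file name
        sub("/(pkg|patches|files)(/.*)?$", "", path)  # drop port subdirectories
        print path
    }' | sort -u
}

# Example use (tag names are assumed, shown for the 6.8 release):
# cvs rdiff -s -r OPENBSD_6_8_BASE -r OPENBSD_6_8 ports | rdiff_to_pkgpaths
```

The final `sort -u` collapses the several files changed in one port into a single pkgpath.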
From here, for each pkgpath we have sorted out, the sqlports database
is queried to get the full list of pkgpaths for each package. This
includes all flavors, subpackages and multipackages.

This is important because an update in the `editors/vim` pkgpath will
trigger this long list of packages:

    editors/vim,-lang
    editors/vim,-main
    editors/vim,gtk2
    editors/vim,gtk2,-lang
    [...40 results hidden for readability...]
    editors/vim,no_x11,ruby
    editors/vim,no_x11,ruby,-lang
    editors/vim,no_x11,ruby,-main

Once we have gathered all the pkgpaths to build and stored them in a
file, the next step can start.
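The expansion of one pkgpath into all its variants could be sketched as below. The exact query used by the real script is not shown in this post: the `Ports` table name and the `FullPkgPath` column are assumptions about the sqlports schema, and `pkgpath_query` and `changed_pkgpaths` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: the Ports table and FullPkgPath column are assumptions
# about the sqlports schema, check them against your installed version.
SQLPORTS=${SQLPORTS:-/usr/local/share/sqlports}

# Print the SQL selecting a pkgpath and all its flavor/subpackage variants.
# Matching on "path" or "path,%" avoids catching a sibling port whose
# name merely starts with the same prefix.
pkgpath_query() {
    printf "SELECT FullPkgPath FROM Ports WHERE FullPkgPath = '%s' OR FullPkgPath LIKE '%s,%%' ORDER BY FullPkgPath;\n" "$1" "$1"
}

# Example use (needs sqlite3 and the sqlports package):
# while read -r p; do pkgpath_query "$p"; done < changed_pkgpaths |
#     sqlite3 "$SQLPORTS" > package_list
```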
### Preparing the environment

As the compilation is done on the real system (using PORTS_PRIVSEP
though) and not in a chroot, we need to remove all installed packages
except the minimum required for the build infrastructure, which are
rsync and sqlports.

`dpb(1)` can't be used because it didn't give good results for
building the delta of the packages between release and stable.

The various temporary directories used by the ports infrastructure
are cleaned to be sure the build starts in a clean environment.
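Removing everything except the two required packages could be done as in the sketch below. It assumes `pkg_delete -X`, which deletes all packages except the ones listed. The post does not say which command the real script uses, and `KEEP_PKGS`, `DRY_RUN`, `run` and `clean_packages` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: KEEP_PKGS, DRY_RUN and the helper names are hypothetical.
KEEP_PKGS=${KEEP_PKGS:-"rsync sqlports"}

# Print the command when DRY_RUN=1, run it otherwise.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove every installed package except the ones listed in KEEP_PKGS.
# pkg_delete -X deletes everything except the packages that follow.
clean_packages() {
    run pkg_delete -X $KEEP_PKGS
}
```

Running it once with `DRY_RUN=1` shows the command that would be executed before any package is actually removed.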
### Compiling and creating the packages

This step is really simple. The ports infrastructure is used
to build the list of packages we produced at step 2.

    env SUBDIRLIST=package_list BULK=yes make package

In the script there is some code to manage the logs of the previous
batch, but nothing more.

Every new run of the process goes over all the packages which
received a commit, but the ports infrastructure is smart enough to
avoid rebuilding ports which already have a package with the correct
version.
### Transfer the package to the signing team

Once the packages are built, we need to pass only the newly built
packages to the person who will manually sign them before
publishing them and letting the mirrors sync.

From the package list, the list of package files is generated and
reused by rsync to copy only the packages that were built.

    env SUBDIRLIST=package_list show=PKGNAMES make | grep -v "^=" | \
        grep ^. | tr ' ' '\n' | sed 's,$,\.tgz,' | sort -u

**The system has all the -release packages in
`${PACKAGE_REPOSITORY}/${MACHINE_ARCH}/all/` (like
`/usr/ports/packages/amd64/all`) to avoid rebuilding all dependencies
required for building a package update, thus we can't copy all the
packages from the directory where the packages are moved after
compilation.**
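One way to hand such a list to rsync is its `--files-from` option. The sketch below first drops the names of packages that are not present in the repository directory, for instance because their build failed. That filtering step is an assumption and is not described in this post, and `existing_packages`, `files.txt` and the destination are hypothetical.

```shell
#!/bin/sh
# Sketch only: existing_packages is a hypothetical helper.
# Reads package file names on stdin and prints those present in the
# directory given as first argument, so rsync is never asked to copy
# a package that failed to build.
existing_packages() {
    while IFS= read -r name; do
        [ -f "$1/$name" ] && printf '%s\n' "$name"
    done
    return 0
}

# Example use (destination host and path are placeholders):
# ... | existing_packages /usr/ports/packages/amd64/all > files.txt
# rsync -av --files-from=files.txt /usr/ports/packages/amd64/all/ host:/incoming/
```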
### Send a notification

The last step is to send an email containing the output of rsync. It
tells the people signing the packages which machine built which
package and that some packages are available.

As this process is done on each machine, and the machines don't
necessarily build the same packages (no firefox on sparc64) or
build at the same speed (arm64 is slower), mails from the four
machines could arrive at very different times, which led to a small
design change.

The whole process is automatic, from building to delivering the
packages for signature. The signature step does require a human,
but this is the price for security and privilege separation.
## Current design

In the original design, all the servers were running their own
cron job, updating their own cvs ports tree and doing a very long
cvs diff. The result worked but was not very practical for the
people signing, who received mails from each machine for each
batch.

The new design only changed one thing: one machine was chosen to
run the cron job and produce the package list, which it then copies
to the other machines. They update their ports tree and run the
build. Once all machines have finished building, the initiator
machine gathers the outputs and sends a single mail with a summary
for each machine. This makes it easier to compare the output of each
architecture, and receiving the email means every machine has
finished its job and the signing can be done.
Having the summary of all the building machines resulted in another
improvement: with the logic of the script, it is possible to send an
email saying that no package at all was built although the process
was triggered, which means something went wrong. From there, I
need to check the logs to understand why the last commit didn't
produce a package. The cause can be a failure such as a **distinfo**
file update forgotten in the commit.
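The summary and the "nothing was built" warning could be produced along these lines. The rewritten mail script is not shown in this post, so this is only a sketch: the one-file-per-machine layout (`<dir>/<arch>.list`, one package file name per line) and the `build_summary` name are assumptions.

```shell
#!/bin/sh
# Sketch only: the one-file-per-machine layout (<dir>/<arch>.list, one
# package file name per line) is an assumption made for illustration.
build_summary() {
    total=0
    for f in "$1"/*.list; do
        [ -f "$f" ] || continue
        arch=$(basename "$f" .list)
        count=$(grep -c . "$f")          # number of non-empty lines
        total=$((total + count))
        echo "$arch: $count package(s)"
        sed 's/^/  /' "$f"               # indent the package names
    done
    # A triggered run that built nothing points to a problem.
    [ "$total" -gt 0 ] || echo "WARNING: no package built on any machine"
    return 0
}

# Example use (directory and address are placeholders):
# build_summary /tmp/results | mail -s "stable packages ready" signers@example.org
```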
This also made it possible to fix one issue: as the distfiles are
shared through a common NFS mount point, if multiple machines try to
fetch a distfile at the same time, they will all fail to build. Now,
the initiator machine downloads all the required distfiles before
starting the build on every node.

All of the previous scripts were reused, except the one
sending the email, which had to be rewritten.