Title: How the OpenBSD -stable packages are built
Author: Solène
Date: 29 October 2020
Tags: openbsd
Description:
In this long blog post, I will write about the technical details
of the OpenBSD stable packages building infrastructure. I set up
the infrastructure with the help of Theo de Raadt, who provided
the hardware in summer 2019. Since then, OpenBSD users can upgrade
their packages using `pkg_add -u` for critical updates that have
been backported by the contributors. Many thanks to them: without
their work there would be no packages to build. Thanks also to pea@,
who is my backup for operating this infrastructure in case something
happens to me.

**The total amount of code is around 110 lines of shell.**
## Original design
In the original design, the process was the following. It was done
separately on each machine (amd64, arm64, i386, sparc64).
### Updating ports
The first step is to update the ports tree using `cvs up` from a cron
job and capture its output. **If** there is any output, the process
continues with the next steps; the output itself is discarded.

With CVS being per-directory and not using a database like git or
svn, it is not possible to "poll" for an update except by checking
every directory for a new version of its files. This check
is done three times a day.
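The trigger logic can be sketched as below. The `cvs_up` stub and the
message printed are illustrative; only the "any output means a change"
pattern comes from the description above.

```shell
#!/bin/sh
# Sketch of the cron job's trigger logic.  cvs_up is a stub standing in
# for running "cvs -q up -d" inside the ports tree: the real command
# prints one line per updated file, and nothing when the tree is
# already current.
cvs_up() {
    echo "U net/curl/Makefile"    # pretend one file was updated
}

changes=$(cvs_up)
if [ -n "$changes" ]; then
    # any output at all means the tree changed: start the next steps,
    # the content of the output itself is discarded
    echo "ports tree changed, starting the next steps"
fi
```

A crontab entry running such a script every eight hours would give the
three daily checks mentioned above.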
### Make a list of ports to compile
This step is the most complicated of the process and accounts for a
third of the total lines of code.

The script uses `cvs rdiff` between the cvs release and stable
branches to show what changed since release, and its output is
passed through a few grep and awk commands to retrieve only the
"pkgpaths" (the pkgpath of curl is **net/curl**) of the packages
that were updated since the last release.
From this raw output of cvs rdiff:

```
File ports/net/dhcpcd/Makefile changed from revision 1.80 to 1.80.2.1
File ports/net/dhcpcd/distinfo changed from revision 1.48 to 1.48.2.1
File ports/net/dnsdist/Makefile changed from revision 1.19 to 1.19.2.1
File ports/net/dnsdist/distinfo changed from revision 1.7 to 1.7.2.1
File ports/net/icinga/core2/Makefile changed from revision 1.104 to 1.104.2.1
File ports/net/icinga/core2/distinfo changed from revision 1.40 to 1.40.2.1
File ports/net/synapse/Makefile changed from revision 1.13 to 1.13.2.1
File ports/net/synapse/distinfo changed from revision 1.11 to 1.11.2.1
File ports/net/synapse/pkg/PLIST changed from revision 1.10 to 1.10.2.1
```
The script will produce:

```
net/dhcpcd
net/dnsdist
net/icinga/core2
net/synapse
```
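A minimal sketch of such a pipeline (the real script's exact grep and
awk invocations may differ): keep the changed file's path, strip the
leading `ports/` and the trailing file name, and fold `pkg/`
subdirectories back onto their port.

```shell
#!/bin/sh
# Hypothetical reconstruction of the pkgpath extraction.  Input is the
# raw "cvs rdiff" output, fed here from a here-document for
# illustration.
extract_pkgpaths() {
    awk '/^File/ { print $2 }' |        # keep the changed file path
        sed -e 's,^ports/,,' \
            -e 's,/[^/]*$,,' \
            -e 's,/pkg$,,' |            # drop file name and pkg/ dir
        sort -u
}

extract_pkgpaths <<'EOF'
File ports/net/dhcpcd/Makefile changed from revision 1.80 to 1.80.2.1
File ports/net/synapse/pkg/PLIST changed from revision 1.10 to 1.10.2.1
EOF
# prints:
#   net/dhcpcd
#   net/synapse
```

`sort -u` deduplicates, since a single port usually has several changed
files (Makefile, distinfo, PLIST) in one backport commit.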
From here, for each pkgpath we sorted out, the sqlports database
is queried to get the full list of pkgpaths for each package; this
includes all the flavors, subpackages and multipackages.
This is important because an update to the `editors/vim` pkgpath will
trigger this long list of packages:

```
editors/vim,-lang
editors/vim,-main
editors/vim,gtk2
editors/vim,gtk2,-lang
[...40 results hidden for readability...]
editors/vim,no_x11,ruby
editors/vim,no_x11,ruby,-lang
editors/vim,no_x11,ruby,-main
```
Once all the pkgpaths to build are gathered and stored in a file,
the next step can start.
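The expansion could be a single SQL query against sqlports. In this
sketch the table and column names (`Paths`, `PkgPath`, `FullPkgPath`)
are assumptions about the sqlports schema, not taken from the article;
check the real schema with `.schema` in sqlite3 before relying on them.

```shell
#!/bin/sh
# Hypothetical sqlports query expanding one pkgpath into every flavored
# and subpackage pkgpath.  "Paths", "PkgPath" and "FullPkgPath" are
# assumed names; verify them against the installed sqlports database.
DB=${DB:-/usr/local/share/sqlports}

expand_pkgpath() {
    sqlite3 "$DB" "SELECT FullPkgPath FROM Paths WHERE PkgPath = '$1';"
}
```

Something like `expand_pkgpath editors/vim >> full_package_list` would
then append all of vim's flavor and subpackage pkgpaths to the list.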
### Preparing the environment
As the compilation is done on the real system (using PORTS_PRIVSEP
though) and not in a chroot, we need to remove all installed packages
except the minimum required for the build infrastructure, which is
rsync and sqlports.

`dpb(1)` can't be used because it didn't give good results for
building the delta of the packages between release and stable.

The various temporary directories used by the ports infrastructure
are cleaned to be sure the build starts in a clean environment.
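A sketch of that cleanup step, guarded with an `echo` so it can be read
and tested without deleting anything; the directory list is
illustrative, not taken from the real script.

```shell
#!/bin/sh
# Cleanup sketch for a builder machine.  DRYRUN=echo keeps it harmless;
# remove it on a real builder.  The rm targets are illustrative
# examples of ports working directories, not the script's actual list.
DRYRUN=echo

# pkg_delete -X removes every installed package except the ones listed
$DRYRUN pkg_delete -X rsync sqlports

# wipe the ports working directories so the build starts clean
$DRYRUN rm -rf /usr/ports/pobj /usr/ports/logs
```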
### Compiling and creating the packages
This step is really simple. The ports infrastructure is used
to build the package list we produced at step 2.

```
env SUBDIRLIST=package_list BULK=yes make package
```

In the script there is some code to manage the logs of the previous
batch, but nothing more.

Every new run of the process will pass over all the packages which
received a commit, but the ports infrastructure is smart enough to
avoid rebuilding ports which already have a package with the correct
version.
### Transfer the packages to the signing team
Once the packages are built, we need to pass only the newly built
packages to the person who will manually sign them before publishing
and letting the mirrors sync.

From the package list, the package file names are generated and
given to rsync so only the freshly built packages are copied.
```
env SUBDIRLIST=package_list show=PKGNAMES make | grep -v "^=" | \
    grep ^. | tr ' ' '\n' | sed 's,$,\.tgz,' | sort -u
```
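To see what this pipeline does, here it is run on made-up
`show=PKGNAMES` output (the `===>` lines and package names are
invented for illustration, not taken from a real build):

```shell
#!/bin/sh
# Same pipeline as above, fed with fabricated PKGNAMES output instead
# of a live ports tree: "===>" directory banners are dropped, names are
# split one per line, and ".tgz" is appended to get the file names.
to_tgz_list() {
    grep -v "^=" | grep ^. | tr ' ' '\n' | sed 's,$,\.tgz,' | sort -u
}

to_tgz_list <<'EOF'
===> net/dhcpcd
dhcpcd-9.3.1
===> net/synapse
synapse-1.21.2
EOF
# prints:
#   dhcpcd-9.3.1.tgz
#   synapse-1.21.2.tgz
```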
**The system keeps all the -release packages in
`${PACKAGE_REPOSITORY}/${MACHINE_ARCH}/all/` (like
`/usr/ports/packages/amd64/all`) to avoid rebuilding all the
dependencies required for building a package update, so we can't
simply copy every package from the directory where packages are moved
after compilation.**
### Send a notification
The last step is to send an email with the output of rsync, telling
which machine built which packages, so the people signing know that
new packages are available.

As this process is done on each machine, and the machines don't
necessarily build the same packages (no firefox on sparc64) nor at
the same speed (arm64 is slower), mails from the four machines could
arrive at very different times, which led to a small design change.
The whole process is automatic, from building to delivering the
packages for signature. The signature step requires a human though,
but this is the price of security and privilege separation.
## Current design
In the original design, all the servers were running their own
cron job, updating their own cvs ports tree and doing a very long
cvs diff. This worked, but it was not very practical for the
people signing, who were receiving mails from each machine for each
batch.

The new design changes only one thing: one machine was chosen to
run the cron job, produce the package list and copy that list to
the other machines, which then update their ports tree and run
the build. Once all machines have finished building, the initiator
machine gathers their outputs and sends a single mail with a summary
for each machine. This makes it easier to compare the output of each
architecture, and receiving the email means every machine has
finished its job and the signing can be done.
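The orchestration follows the usual shell fan-out/`wait`/gather
pattern. In this skeleton, `run_build` is a stub for what would really
be an ssh call to each builder; the host names come from the article,
everything else is illustrative.

```shell
#!/bin/sh
# Illustrative fan-out/gather skeleton for the initiator machine.
# run_build stands in for something like: ssh $host build-script
HOSTS="amd64 arm64 i386 sparc64"

run_build() {
    echo "packages built on $1"
}

for host in $HOSTS; do
    run_build "$host" > "/tmp/log.$host" &   # all builders in parallel
done
wait    # the single summary is only sent once every builder is done

for host in $HOSTS; do
    echo "==== $host ===="
    cat "/tmp/log.$host"
done
```

The final loop is what would feed the unique summary mail, one section
per architecture.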
Having the summary of all the building machines resulted in another
improvement: the script can now send an email telling that absolutely
no package was built even though the process was triggered, which
means something went wrong. From there, I need to check the logs to
understand why the last commit didn't produce a package. This can be
a failure like a **distinfo** file update forgotten in the commit.
This also permitted fixing one issue: as the distfiles are shared
through a common NFS mount point, if multiple machines try to fetch
the same distfile at the same time, both will fail to build. Now, the
initiator machine downloads all the required distfiles before
starting the build on every node.
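The prefetch can reuse the same `SUBDIRLIST` mechanism with the ports
tree's `fetch` target; whether the real script does exactly this is an
assumption. The `echo` guard keeps the sketch harmless outside
`/usr/ports`.

```shell
#!/bin/sh
# Prefetch every distfile for the batch before the builders start, so
# two machines never race on the shared NFS distfiles directory.
# DRYRUN=echo only displays the command; remove it on the initiator.
DRYRUN=echo
$DRYRUN env SUBDIRLIST=package_list make fetch
```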
All of the previous scripts were reused, except the one sending the
email, which had to be rewritten.