This week on BSDNow, Allan and I are in Tokyo for AsiaBSDCon, but not to worry, we have a full episode lined up and ready to go. Hackathon reports
href="http://www.digitalocean.com/" title="DigitalOcean">
href="http://www.tarsnap.com/bsdnow" title="Tarsnap">
I have been working on and off for almost a year trying to get reproducible builds (the same source tree always builds an identical cdrom) on NetBSD. I did not think at the time it would take as long or be so difficult, so I did not keep a log of all the changes I needed to make. I was also not the only one working on this. Other NetBSD developers have been making improvements for the past 6 years. I would like to acknowledge the NetBSD build system (aka build.sh) which is a fully portable cross-build system. This build system has given us a head-start in the reproducible builds work.
I would also like to acknowledge the work done by the Debian folks who have provided a platform to run, test and analyze reproducible builds. Special mention to the diffoscope tool that gives an excellent overview of what's different between binary files, by finding out what they are (and if they are containers what they contain) and then running the appropriate formatter and diff program to show what's different for each file.
Finally other developers who have started, motivated and did a lot of work getting us here like Joerg Sonnenberger and Thomas Klausner for their work on reproducible builds, and Todd Vierling and Luke Mewburn for their work on build.sh.
Last week I gave a talk for the security class at Notre Dame based on features are faults but with some various commentary added. It was an exciting trip, with the opportunity to meet and talk with the computer vision group as well. Some other highlights include the Indiana skillet I had for breakfast, which came with pickles and was amazing, and explaining the many wonders of cvs to the Linux users group over lunch. After that came the talk, which went a little something like this.
I got started with OpenBSD back about the same time I started college, although I had a slightly different perspective then. I was using OpenBSD because it included so many security features, therefore it must be the most secure system, right? For example, at some point I acquired a second computer. What’s the first thing anybody does when they get a second computer? That’s right, set up a kerberos domain. The idea that more is better was everywhere. This was also around the time that ipsec was getting its final touches, and everybody knew ipsec was going to be the most secure protocol ever because it had more options than any other secure transport. We’ll revisit this in a bit.
There’s been a partial attitude adjustment since then, with more people recognizing that layering complexity doesn’t result in more security. It’s not an additive process. There’s a whole talk there, about the perfect security that people can’t or won’t use. OpenBSD has definitely switched directions, including less code, not more. All the kerberos code was deleted a few years ago.
Let’s assume about one bug per 100 lines of code. That’s probably on the low end. Now say your operating system has 100 million lines of code. If I’ve done the math correctly, that’s literally a million bugs. So that’s one reason to avoid adding features. But that’s a solveable problem. If we pick the right language and the right compiler and the right tooling and with enough eyeballs and effort, we can fix all the bugs. We know how to build mostly correct software, we just don’t care.
As we add features to software, increasing its complexity, new unexpected behaviors start to emerge. What are the bounds? How many features can you add before craziness is inevitable? We can make some guesses. Less than a thousand for sure. Probably less than a hundred? Ten maybe? I’ll argue the answer is quite possibly two. Interesting corollary is that it’s impossible to have a program with exactly two features. Any program with two features has at least a third, but you don’t know what it is
My first example is a bug in the NetBSD ftp client. We had one feature, we added a second feature, and just like that we got a third misfeature
Our story begins long ago. The origins of this bug are probably older than I am. In the dark times before the web, FTP sites used to be a pretty popular way of publishing files. You run an ftp client, connect to a remote site, and then you can browse the remote server somewhat like a local filesystem. List files, change directories, get files. Typically there would be a README file telling you what’s what, but you don’t need to download a copy to keep. Instead we can pipe the output to a program like more. Right there in the ftp client. No need to disconnect.
Fast forward a few decades, and http is the new protocol of choice. http is a much less interactive protocol, but the ftp client has some handy features for batch downloads like progress bars, etc. So let’s add http support to ftp. This works pretty well. Lots of code reused.
http has one quirk however that ftp doesn’t have. Redirects. The server can redirect the client to a different file. So now you’re thinking, what happens if I download http://somefile and the server sends back 302 http://|reboot. ftp reconnects to the server, gets the 200, starts downloading and saves it to a file called |reboot. Except it doesn’t. The function that saves files looks at the first character of the name and if it’s a pipe, runs that command instead. And now you just rebooted your computer. Or worse.
It’s pretty obvious this is not the desired behavior, but where exactly did things go wrong? Arguably, all the pieces were working according to spec. In order to see this bug coming, you needed to know how the save function worked, you needed to know about redirects, and you needed to put all the implications together.
What do we do about this? That’s a tough question. It’s much easier to poke fun at all the people who got things wrong. But we can try. My attitudes are shaped by experiences with the OpenBSD project, and I think we are doing a decent job of containing the complexity. Keep paring away at dependencies and reducing interactions. As a developer, saying “no” to all feature requests is actually very productive. It’s so much faster than implementing the feature. Sometimes users complain, but I’ve often received later feedback from users that they’d come to appreciate the simplicity.
There was a question about which of these vulnerabilities were found by researchers, as opposed to troublemakers. The answer was most, if not all of them, but it made me realize one additional point I hadn’t mentioned. Unlike the prototypical buffer overflow vulnerability, exploiting features is very reliable. Exploiting something like shellshock or imagetragick requires no customized assembly and is independent of CPU, OS, version, stack alignment, malloc implementation, etc. Within about 24 hours of the initial release of shellshock, I had logs of people trying to exploit it. So unless you’re on about a 12 hour patch cycle, you’re going to have a bad time.
The current code is written on top of GFS, a library with the generic support for writing filesystems, which was ported from Illumos. Because of significant differences between illumos VFS and FreeBSD VFS models, both the GFS and zfsctl code were heavily modified to work on FreeBSD. Nonetheless, they still contain quite a few ugly hacks and bugs.
This is a reimplementation of the zfsctl code where the VFS-specific bits are written from scratch and only the code that interacts with the rest of ZFS is reused.
Some ideas are picked from an independent work by Will (wca@)
The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It is no longer possible to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change.
TIL: On IllumOS, you can create, rename, and destroy snapshots, by manipulating the virtual directories in the .zfs/snapshots directory.
If enough people would find this feature useful, maybe it could be implemented (rm and rename have never existed on FreeBSD). At the same time, it seems like rather a lot of work, when the ZFS command line tools work so well. Although wca@ pointed out on IRC, it can be useful to be able to create a snapshot over NFS, or SMB.
With this change, all newly generated GELI keys will be approximately 2x as strong. Previously generated keys will take half as long to calculate, resulting in faster mounting of encrypted volumes. Users may choose to rekey, to generate a new key with the larger default number of iterations using the geli(8) setkey command. Security of existing data is not compromised, as ~1 second per brute force attempt is still a very high threshold.
I want OpenBSD and Debian to be able to obtain an IP via DHCP on their wired interfaces and I don't want external networking required for an NFS share to the VM. To accomplish this I need two interfaces since dhclient will erase any other IPv4 addresses already assigned. We can't assign an address directly to the bridge, but we can configure a virtual Ethernet device and add it.
MacObserver: Interview with Open Source Developer & Former Apple Manager Jordan Hubbard
2016 Google Summer of Code Mentor Summit and MeetBSD Trip Report: Gavin Atkinson