unixcat is like netcat, but for unix domain sockets.

But wait! Doesn’t netcat already support unix domain sockets? Yes, but only barely; read on.

Background

Recently, I was writing a program that (for various reasons) needed to connect to a unix domain socket and pass a file descriptor to another process.

“Passing a file descriptor” is a convenience feature for unix domain sockets. If you have a file open in one process and you’d like it to also be open in another process, you can send the file descriptor in a special message—the kernel will take care of deconflicting resources, and the receiving process gets a file descriptor that refers to the same file.
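As a rough illustration of what that special message looks like, here is a minimal sketch in C that uses the standard SCM_RIGHTS control message. The helper names send_fd and recv_fd are made up for this example and are not taken from unixcat.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one file descriptor over a connected
 * unix domain socket. Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd_to_send)
{
    assert(fd_to_send >= 0);

    /* Sending one byte of ordinary data alongside the control
     * message is the portable choice. */
    char byte = 'F';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    /* The union gives the control buffer suitable alignment. */
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one file descriptor.
 * Returns the new descriptor, or -1 if none arrived. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    /* Walk the control messages looking for SCM_RIGHTS. */
    for (struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_RIGHTS &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(int))) {
            int fd;
            memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
            return fd;
        }
    }
    return -1;
}
```

The receiver ends up with a new descriptor number that refers to the same open file, so closing the original in the sender does not affect it.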

As I was writing this program, I was also obliged to write a listener that received and parsed this message in order to test that I was doing it right. I found this to be rather annoying—when implementing some protocol, I always prefer to have a reference peer to make testing easy. Normally, my tool of choice for testing networking programs is netcat; but netcat, despite its vast array of features (at least in the version maintained by nmap.org—other versions have fewer features), does not support passing a file descriptor over a unix domain socket.

Without a reference implementation, I have made the same mistake when implementing both sides of a protocol, resulting in a hard-to-debug situation where both the sender and receiver are wrong by spec, but can communicate with each other just fine.

So I decided to implement my own tool that allows for passing file descriptors, as well as other things, from the command line. If you want to try it out or read the documentation, head over to the repo. The rest of this post will be some self-indulgent notes on the experience of writing unixcat.

It’s not just file descriptors

Messages that contain extra information (like file descriptors) are called “ancillary messages”. Passing a file descriptor is the only portable ancillary message, but the goal here was supporting as many systems, and all of their available ancillary messages, as possible.

Linux allows you to request your peer’s SELinux security context over a unix domain socket, but the real trick to implement was passing process credentials.

Three systems, five implementations

The concept of passing process credentials is pretty simple: provide the right combination of socket flags and the kernel will tell you the pid, uid, and gid of your peer (or some variation thereof, depending on the OS). However, almost none of the Unixes do it exactly the same way.

macOS, OpenBSD, and DragonFly BSD all disable passing credentials.

Disabling credential passing is a boring but safe decision. Providing these credentials seems secure, because the kernel can make sure that the peer process isn’t lying, but there are possible race conditions and unexpected situations where the received uid is the effective uid instead of the real uid.

NetBSD allows you to pass credentials, but only from the receiver side. If you want your peer’s credentials, you specify a socket option and the kernel fills in the credentials with the next recvmsg call. Easy enough.

Linux (expectedly) and FreeBSD (unexpectedly) are the two implementations that provided all of the complexity. Like NetBSD, Linux has a socket option that the receiver can specify to get peer credentials. (Unlike NetBSD, this option turns on receipt of credentials until it’s disabled—NetBSD’s version is one message and done.) But Linux also allows for sending credentials—a privileged process can fill in whatever values it likes in the credentials field. Unintuitively, sending credentials on Linux is only allowed if the receiver has specified the requisite socket option—no sending credentials without consent from the receiver.
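To make the Linux receiver side concrete, here is a minimal sketch, assuming glibc on Linux, where the option is SO_PASSCRED and the credentials arrive as an SCM_CREDENTIALS control message carrying a struct ucred. The helper names enable_passcred and recv_with_creds are invented for illustration and are not unixcat's own.

```c
#define _GNU_SOURCE /* glibc only exposes struct ucred with this defined */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to attach the sender's
 * credentials to messages received on sock until the option is
 * turned off again. */
static int enable_passcred(int sock)
{
    int on = 1;
    return setsockopt(sock, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on));
}

/* Hypothetical helper: receive one byte of data plus the peer's
 * credentials. Returns 0 and fills *out on success, -1 otherwise. */
static int recv_with_creds(int sock, struct ucred *out)
{
    assert(out != NULL);

    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

    /* The union gives the control buffer suitable alignment. */
    union {
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    /* Walk the control messages looking for SCM_CREDENTIALS. */
    for (struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_CREDENTIALS &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(struct ucred))) {
            memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
            return 0;
        }
    }
    return -1;
}
```

In this sketch the sender does nothing special: it uses a plain write, and the kernel fills in the pid, uid, and gid because the receiver asked for them.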

As for FreeBSD… it has the same “receive credentials” option as NetBSD, but NetBSD’s option only enables receiving credentials once, so FreeBSD also provides an option to receive credentials with every message, but that option provides slightly different credentials than the “receive only once” option, and therefore a different type of ancillary message; you can also send credentials on FreeBSD, though you can’t modify them like you can on Linux, and sending credentials provides yet another slightly different set of credentials, but not a different type of ancillary message, so the only way to differentiate between sent and received credentials is to check the length of the data; also, setting the receive option will override the sent message, in that you will get the received credential struct and the sent credential struct will be discarded.

Trying to fit this all in one program was a chore.

Sometimes old tools are the best tools

unixcat is meant to be a fairly old-school debugging tool that supports as many Unix-derived operating systems as possible. It turns out the best build system for this is still GNU Autotools. Prior to this project, I had only interacted with Autotools as a user, so it was an interesting experience to write a configure.ac script. I wouldn’t say that Autotools is easy to use, but it is battle-tested. If it’s possible to support what you want to do, Autotools has a way. Writing my own feature tests was reasonably straightforward—I was able to stumble through writing some M4 macros without really understanding the language.

So how did you use LLMs?

unixcat was the first time I used an LLM as anything more than a toy. “Agentic programming” became a buzzword a few months before I committed any code, so I put aside my natural tendency towards curmudgeonliness, bought $20 in Anthropic credits, and downloaded Claude Code. It worked pretty well! Not being an LLM power-user, I had bounced off using LLM web interfaces for programming. As far as I could tell, the key to getting them to work well was providing them with the correct context, and in the web interface that involved a lot of copy-pasting that I found awkward. Being able to run claude in my terminal and tell it to read certain files removed those barriers.

Don’t take anything I say about LLMs too seriously at the moment. My particular combination of stinginess and intransigence makes me a poor example to follow. Some day, I will have an educated and well-considered opinion about using LLMs for programming, but for now these are just anecdotes.

My most successful use of claude was writing autoconf tests and bash scripts. Whenever I had the feeling of “ugh, I really don’t want to write this,” getting an LLM to generate a starting point helped break through any writer’s block. Telling an LLM to “write some scripts to test out every possible combination of these three options” is a lot nicer than actually writing the scripts yourself.

I had far less success getting claude to write any feature code. For example, dealing with a subtle bug in my polling loop, I wrote a test, verified it failed, and told claude to fix my poll loop until the test passed. This is a textbook use of agentic programming, right? But the result was a hacky fix that addressed only the symptom of the problem; I hurriedly deleted it and went back to fixing it myself. It’s possible that further iteration would have created a better solution, but I already felt like I was burning money. (I told you I was stingy.)

For the curious: I was mishandling reading an EOF value from a socket and closing the connection too early. claude’s solution was… to just not close the connection until recvmsg had returned an EOF value twice.
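For comparison, the usual pattern is to treat the first zero-length read as EOF, but only after any pending data has been drained. The sketch below is a generic illustration of that pattern, not unixcat's actual loop or its actual fix; drain_until_eof is a hypothetical helper, and it uses recv rather than recvmsg to stay short.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from sock until the peer closes.
 * Returns the total number of bytes read, or -1 on error.
 * The caller remains responsible for closing sock. */
static long drain_until_eof(int sock)
{
    assert(sock >= 0);

    long total = 0;
    char buf[256];
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    for (;;) {
        int ready = poll(&pfd, 1, -1);
        if (ready < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, try again */
            return -1;
        }

        /* POLLHUP can be reported while unread data is still
         * queued, so read first instead of closing on POLLHUP. */
        if (pfd.revents & (POLLIN | POLLHUP)) {
            ssize_t n = recv(sock, buf, sizeof(buf), 0);
            if (n > 0) {
                total += n;
                continue;
            }
            if (n == 0)
                return total; /* first zero-length read is EOF */
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }

        if (pfd.revents & (POLLERR | POLLNVAL))
            return -1;
    }
}
```

Once recv has returned 0 on a stream socket, later calls keep returning 0, so there is nothing to gain from waiting for a second one.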

Cost: somewhere between $7–$8 over the course of two months. Benefits: less time spent looking up arcane syntax; probably more comprehensive tests on account of making them easier to write. Never let it be said I don’t change with the times.
