The eBPF Verifier: How it Works, How it Does Not Work, and How it Might Work

This was written in November 2024, as a final project for Dartmouth’s class COSC059 - Principles of Programming Languages, with small modifications for this blog. I want to thank Professor Sergey Bratus for his guidance, and a big part of the credit of this piece goes to him and his class. Introduction As the Linux kernel and ecosystem grew, user mode applications began to develop a growing need for adding and running safe kernel extensions. Consequently, the Berkeley Packet Filter (which, as its name indicates, was originally created for packet filtering in the kernel) was extended into eBPF, which allows safe execution of untrusted user-defined extensions in the kernel. These eBPF programs can be used for a wide variety of applications, including packet capturing and filtering, as well as tracing, instrumentation, and debugging, among others. They are typically written in high level programming languages, such as some subsets of C, Python, or Lua [7], and then compiled into a circa-x86 bytecode for the eBPF VM. This VM has eleven 64-bit registers (plus a special auxiliary register for ALU sanitation), a program counter, and a 512 byte fixed-size stack. This code is then loaded into the kernel with the bpf() syscall, and depending on the configuration, is either JIT-compiled into native instructions or ran in the eBPF interpreter [9]. In order to allow more flexibility and for optimization reasons, the eBPF programs also interact with the system using helper functions, which hook to other parts of the kernel. ...

November 30, 2024