How does the Linux kernel JIT compiler for BPF work?

A BPF program is compiled to an array of bytecode instructions for a register-based virtual machine that executes inside the kernel. To speed things up, there are JIT compilers for different architectures. These JIT compilers unpack the BPF bytecode and directly emit the binary instructions for the CPU.

As a result, the JIT compiler is both simple and complicated. It is simple because it is all in one file and you can follow what is happening. In contrast, JIT compilers for typical programming languages tend to be very complicated, with intermediate representations and many passes. The direct translation also means that no optimizations are done (as far as I can see).

It is complicated because the JIT compiler emits the machine instructions directly, without the help of an assembler. So you need to know the instruction encoding format of the hardware.
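
To give a feel for what "knowing the encoding format" involves, here is a minimal sketch of hand-encoding a 64-bit register-to-register x86-64 instruction. The helper name `emit_alu64_rr` is hypothetical and is not taken from the kernel source; only the REX prefix, opcode, and ModRM layout are standard x86-64.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's code): encode "op dst, src" for
 * two 64-bit registers. dst and src are x86-64 register numbers 0..15
 * (0 = rax, 6 = rsi, 13 = r13, ...). Returns the number of bytes written.
 */
static size_t emit_alu64_rr(uint8_t *buf, uint8_t opcode, int dst, int src)
{
    /* REX prefix 0100WRXB: W=1 selects 64-bit operands, R extends the
     * ModRM.reg field (src), B extends ModRM.rm (dst) to reach r8..r15. */
    buf[0] = (uint8_t)(0x48 | ((src >> 3) << 2) | (dst >> 3));

    /* Opcode in its "op r/m64, r64" form, e.g. 0x01 = add, 0x89 = mov. */
    buf[1] = opcode;

    /* ModRM: mod=11 (register direct), reg = low 3 bits of src,
     * rm = low 3 bits of dst. */
    buf[2] = (uint8_t)(0xC0 | ((src & 7) << 3) | (dst & 7));

    return 3;
}
```

For example, `emit_alu64_rr(buf, 0x01, 0, 6)` writes the three bytes of `add rax, rsi`. An assembler would normally hide these bit fields; here the compiler has to get them right itself.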

Most BPF instructions are simple, such as mov, add, sub, and xor, and have direct equivalents in x86, so the compiler usually just needs to look up a table to translate them. Similarly, BPF registers map one-to-one to x86 registers via a table.
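
The two lookups could be sketched roughly as follows. The array names are my own, and the register pairing reflects my understanding of the x86-64 JIT, so treat both as assumptions and check them against arch/x86/net/bpf_jit_comp.c before relying on them.

```c
#include <stdint.h>

/* BPF registers R0..R10 -> x86-64 register numbers (0..15).
 * Assumed pairing; verify against the kernel source. */
static const uint8_t bpf_to_x86_reg[11] = {
    0,  /* R0  -> rax (return value) */
    7,  /* R1  -> rdi (first argument) */
    6,  /* R2  -> rsi */
    2,  /* R3  -> rdx */
    1,  /* R4  -> rcx */
    8,  /* R5  -> r8  */
    3,  /* R6  -> rbx (callee saved) */
    13, /* R7  -> r13 (callee saved) */
    14, /* R8  -> r14 (callee saved) */
    15, /* R9  -> r15 (callee saved) */
    5,  /* R10 -> rbp (read-only frame pointer) */
};

/* BPF ALU operation (upper 4 bits of the opcode byte) -> x86 opcode in
 * its "op r/m64, r64" form. A 0 entry means "not a simple one-to-one
 * translation" in this sketch (mul, div, shifts, and so on). */
static const uint8_t bpf_alu_to_x86_opcode[16] = {
    [0x0] = 0x01, /* BPF_ADD -> add */
    [0x1] = 0x29, /* BPF_SUB -> sub */
    [0x4] = 0x09, /* BPF_OR  -> or  */
    [0x5] = 0x21, /* BPF_AND -> and */
    [0xA] = 0x31, /* BPF_XOR -> xor */
    [0xB] = 0x89, /* BPF_MOV -> mov */
};
```

With tables like these, translating `r7 ^= r7` comes down to two array reads: `bpf_to_x86_reg[7]` gives r13, and `bpf_alu_to_x86_opcode[0xAF >> 4]` gives the xor opcode `0x31`.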

The following figure shows the translation loop of the JIT compiler for x86. You can see the compiler iterating through the BPF instructions. For each instruction, it retrieves the opcode, the source and destination BPF registers, and other details. Then, based on the BPF opcode, it emits the corresponding x86 instruction with the proper encoding.
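
As a rough illustration of the shape of such a loop, here is a heavily simplified sketch, not the kernel's actual code. The function name `jit_compile` and the `reg_map` table are hypothetical, only four register-to-register ALU64 opcodes plus exit are handled, and exit is reduced to a bare `ret` where the real JIT emits a full epilogue.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the kernel's struct bpf_insn. */
struct bpf_insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination BPF register */
    uint8_t src_reg : 4; /* source BPF register */
    int16_t off;         /* signed offset (unused here) */
    int32_t imm;         /* signed immediate (unused here) */
};

/* Assumed BPF R0..R10 -> x86-64 register numbers. */
static const uint8_t reg_map[11] = {0, 7, 6, 2, 1, 8, 3, 13, 14, 15, 5};

/* Translate prog[0..n) into out. Returns the number of bytes emitted,
 * or -1 on an unsupported instruction or if out is too small. */
static int jit_compile(const struct bpf_insn *prog, size_t n,
                       uint8_t *out, size_t cap)
{
    size_t len = 0;

    for (size_t i = 0; i < n; i++) {
        const struct bpf_insn *insn = &prog[i];
        uint8_t op;

        if (insn->dst_reg > 10 || insn->src_reg > 10)
            return -1;
        uint8_t dst = reg_map[insn->dst_reg];
        uint8_t src = reg_map[insn->src_reg];

        switch (insn->code) {
        case 0x0F: op = 0x01; break; /* BPF_ALU64|BPF_ADD|BPF_X -> add */
        case 0x1F: op = 0x29; break; /* BPF_ALU64|BPF_SUB|BPF_X -> sub */
        case 0xAF: op = 0x31; break; /* BPF_ALU64|BPF_XOR|BPF_X -> xor */
        case 0xBF: op = 0x89; break; /* BPF_ALU64|BPF_MOV|BPF_X -> mov */
        case 0x95:                   /* BPF_JMP|BPF_EXIT */
            if (len + 1 > cap)
                return -1;
            out[len++] = 0xC3;       /* ret (simplified epilogue) */
            continue;                /* next BPF instruction */
        default:
            return -1;               /* not handled in this sketch */
        }

        if (len + 3 > cap)
            return -1;
        /* REX.W prefix, opcode, ModRM (register-direct). */
        out[len++] = (uint8_t)(0x48 | ((src >> 3) << 2) | (dst >> 3));
        out[len++] = op;
        out[len++] = (uint8_t)(0xC0 | ((src & 7) << 3) | (dst & 7));
    }
    return (int)len;
}
```

Feeding it the three-instruction program `r0 = r1; r0 += r2; exit` produces the seven bytes `48 89 f8 48 01 f0 c3`, which disassemble to `mov rax, rdi; add rax, rsi; ret`.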

Nov 4, 2024 at 3:37 AM