eBPF (extended Berkeley Packet Filter) is a technology that originated in the Linux kernel and runs sandboxed programs inside the OS kernel. It lets users inject user-defined programs into the kernel without modifying the kernel source code.
In this post, I will share some basics about eBPF and write a simple eBPF program.
eBPF originated in the Linux kernel as a flexible way to dispatch network packets to the right process. Linux runs many user processes at the same time, but there may be only one network interface to send and receive packets. When a packet arrives, the kernel needs to know which process should receive it. Polling every process to find out is inefficient, and adding user-specific logic to the kernel itself makes it more complex and harder to maintain. eBPF was introduced to solve this: it provides the flexibility and programmability to run user-defined programs in the kernel without harming the OS.
eBPF provides both high performance and extensibility:
- A JIT compiler translates eBPF bytecode into native machine code, so programs run in the kernel with high efficiency.
- New protocols and routing policies can be added easily without modifying the kernel.
eBPF programs are event-driven and must be loaded into the kernel before they can run. Events such as certain system calls trigger the corresponding eBPF program. The steps for loading and verifying an eBPF program in the kernel are as follows:
1. Write an eBPF program in C.
2. Compile the source code into bytecode.
3. Load the program into the kernel through an API, which issues a system call to load the program.
4. During the system call, the kernel first verifies that the program is safe to run in the kernel. The checks may include:
   a. making sure the program only accesses memory it is allowed to access;
   b. making sure the program contains no loops that could run forever and hang the kernel;
   c. limiting the number of instructions (e.g., 4096).
5. After verification, the kernel keeps the loaded program and returns a file descriptor that refers to it.
6. When a network packet arrives, the kernel checks whether a registered eBPF program should run.
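To make the flow above more concrete, here is a toy user-space model in C. It is not the kernel's verifier and it does not use the real eBPF instruction set: the `toy_insn` format, the `toy_verify`, `toy_load`, and `toy_dispatch` names, the 64-byte memory window, and the 8-slot registry are all made up for illustration. Only the 4096-instruction cap is taken from the list above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INSNS 4096 /* instruction cap, as in check c */
#define TOY_MEM_SIZE  64   /* hypothetical: bytes a program may touch */
#define TOY_MAX_PROGS 8    /* hypothetical: registry capacity */

enum toy_op { TOY_LOAD, TOY_JUMP, TOY_EXIT };

struct toy_insn {
    enum toy_op op;
    int arg; /* TOY_LOAD: byte offset; TOY_JUMP: relative distance */
};

struct toy_prog {
    const struct toy_insn *insns;
    size_t len;
};

static struct toy_prog registry[TOY_MAX_PROGS];
static size_t registry_len;

/* Returns 0 if the program passes every toy check, -1 otherwise. */
int toy_verify(const struct toy_prog *p)
{
    assert(p != NULL && p->insns != NULL);
    if (p->len == 0 || p->len > TOY_MAX_INSNS)
        return -1; /* check c: too many (or no) instructions */
    for (size_t i = 0; i < p->len; i++) {
        const struct toy_insn *in = &p->insns[i];
        if (in->op == TOY_LOAD && (in->arg < 0 || in->arg >= TOY_MEM_SIZE))
            return -1; /* check a: access outside the allowed memory */
        if (in->op == TOY_JUMP &&
            (in->arg <= 0 || i + (size_t)in->arg >= p->len))
            return -1; /* check b: forward, in-range jumps only, so no loops */
    }
    /* The last instruction must exit so execution cannot run off the end. */
    return p->insns[p->len - 1].op == TOY_EXIT ? 0 : -1;
}

/* Verify, then keep the program. Returns its slot index, or -1 on failure. */
int toy_load(const struct toy_prog *p)
{
    if (registry_len == TOY_MAX_PROGS || toy_verify(p) != 0)
        return -1;
    registry[registry_len] = *p;
    return (int)registry_len++;
}

/* Interpret a verified program; returns the number of instructions executed. */
int toy_run(const struct toy_prog *p)
{
    int steps = 0;
    size_t pc = 0;
    for (;;) {
        const struct toy_insn *in = &p->insns[pc];
        steps++;
        if (in->op == TOY_EXIT)
            return steps;
        /* A real program would read memory on TOY_LOAD; the verifier has
         * already shown the offset is in range, so the toy just moves on. */
        pc += (in->op == TOY_JUMP) ? (size_t)in->arg : 1;
    }
}

/* On an event, run every registered program; returns how many ran. */
int toy_dispatch(void)
{
    int ran = 0;
    for (size_t i = 0; i < registry_len; i++) {
        toy_run(&registry[i]);
        ran++;
    }
    return ran;
}
```

The point of the design is that all safety checks happen once, at load time. Because `toy_verify` only admits forward jumps that stay in range and a final exit instruction, `toy_run` needs no bounds checks of its own and always terminates.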
First, we need to write an eBPF program that defines which event we want to listen for and what to do when it happens. Then we need to load this program into the kernel.
Then we compile this file into bytecode:
clang -O2 -target bpf -c ebpf_prog.c \
After that, we load our program into the kernel:
Then we compile this file and run it:
clang -DHAVE_ATTR_TEST=0 -o monitor-exec -lelf \
You can then see the result on the command line.
eBPF is no longer used only for network packet filtering. It has developed and spread rapidly in recent years and is now a widely used technology covering a wide range of applications.
eBPF programs can hook into any system call, and they can filter network packets and socket operations. We can therefore enforce security controls by loading inspection or filter programs into the kernel with eBPF. This approach can offer better protection than running the checks in a user-level process.
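As a rough illustration of the kind of decision logic such a filter program might carry, here is a small sketch written as ordinary user-space C so it can be tried anywhere. Everything in it is hypothetical: the `toy_packet` and `toy_policy` structures, the `filter_packet` function, and the `VERDICT_PASS` / `VERDICT_DROP` values are invented for this example. A real eBPF filter would be attached to a kernel hook and return the verdict codes that hook defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts; a real hook defines its own return codes. */
enum verdict { VERDICT_PASS, VERDICT_DROP };

/* Hypothetical, already-parsed view of a packet. */
struct toy_packet {
    uint32_t src_ip;   /* IPv4 source address as a 32-bit number */
    uint16_t dst_port; /* destination port */
};

/* Hypothetical deny-list policy supplied by the operator. */
struct toy_policy {
    const uint32_t *blocked_ips;
    size_t n_ips;
    const uint16_t *blocked_ports;
    size_t n_ports;
};

/* Drop the packet if its source or destination port is on a deny list. */
enum verdict filter_packet(const struct toy_policy *pol,
                           const struct toy_packet *pkt)
{
    assert(pol != NULL && pkt != NULL);
    /* Both loops are bounded by the policy size, so they always finish. */
    for (size_t i = 0; i < pol->n_ips; i++)
        if (pkt->src_ip == pol->blocked_ips[i])
            return VERDICT_DROP;
    for (size_t i = 0; i < pol->n_ports; i++)
        if (pkt->dst_port == pol->blocked_ports[i])
            return VERDICT_DROP;
    return VERDICT_PASS;
}
```

The benefit of running logic like this in the kernel is that an unwanted packet is rejected at the point where it arrives, before any user-level process has to look at it.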
Compared with traditional metrics middleware, eBPF can compute user-defined metrics dynamically and efficiently.
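One way to picture a user-defined metric is a latency histogram that is updated in place each time an event fires, so only the compact summary has to be read out later. The sketch below shows that aggregation step in plain user-space C. The `latency_hist` structure, the `log2_slot` and `hist_record` functions, and the choice of 32 power-of-two slots are assumptions made for this example, not part of any eBPF API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SLOTS 32 /* hypothetical number of power-of-two buckets */

struct latency_hist {
    uint64_t slots[HIST_SLOTS]; /* slots[k] counts values in [2^k, 2^(k+1)) */
};

/* floor(log2(v)), clamped to the last slot; 0 and 1 both map to slot 0. */
unsigned log2_slot(uint64_t v)
{
    unsigned slot = 0;
    while (v > 1 && slot < HIST_SLOTS - 1) {
        v >>= 1;
        slot++;
    }
    return slot;
}

/* Record one latency sample by bumping the counter of its bucket. */
void hist_record(struct latency_hist *h, uint64_t latency_ns)
{
    assert(h != NULL);
    h->slots[log2_slot(latency_ns)]++;
}
```

Each sample costs a handful of shifts and one increment, and the whole summary fits in 32 counters no matter how many events are recorded.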
From my perspective, eBPF started as a Linux kernel technology, but it is now more of a programming idea. It shows that, with a sandbox model, a system can offer the flexibility and extensibility for external clients to inject logic into its internals, which both protects the system and improves efficiency. I will keep an eye on the development of eBPF and see how far it goes.
If you have any ideas, please feel free to share them with me. Thanks!