What is eBPF?
eBPF (extended Berkeley Packet Filter) is a Linux kernel technology that lets you run tiny, sandboxed programs inside the kernel safely, without writing or loading custom kernel modules. These programs attach to well-defined hooks (network path, syscalls, tracepoints, security layers, etc.) and can observe or act on live system events with very low overhead.
Why it matters
• Safety: a verifier analyzes each program before load to enforce memory safety, bounded loops, and guaranteed termination.
• Performance: running close to the event source avoids expensive context switches and captures high-fidelity data.
• Flexibility: dynamic attach/detach, no kernel rebuilds, and rich data structures (“maps”) for passing data to user space or sharing across programs.
Core building blocks
• Programs: compiled bytecode executed at kernel hooks (e.g., kprobe/tracepoint, XDP, tc, LSM, cgroup, uprobes, perf).
• Maps: kernel-resident key/value stores (hash, array, LRU, per-CPU, ring buffer, bloom filter, etc.) for metrics, state, and event passing.
• Helpers: kernel-provided functions to read context, access maps, adjust packets, emit events, and more.
• Verifier: statically checks safety and resource bounds before admitting a program.
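To make the verifier's memory-safety requirement more concrete, here is a minimal user-space C sketch of the check-bounds-before-access pattern that packet-parsing programs are expected to follow. It is a model, not a loadable eBPF program: `pkt_view` and `parse_ipv4_proto` are hypothetical names, and the `data`/`data_end` pair only imitates the packet bounds a real program receives through its context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the packet bounds a program gets from its context. */
struct pkt_view {
    const uint8_t *data;     /* first byte of the packet */
    const uint8_t *data_end; /* one past the last byte */
};

enum { ETH_HDR_LEN = 14, IPV4_MIN_HDR_LEN = 20, ETHERTYPE_IPV4 = 0x0800 };

/*
 * Return the IPv4 protocol number (6 = TCP, 17 = UDP, ...), or -1 when the
 * frame is too short or is not IPv4. Each header is read only after its full
 * length has been checked against the packet bounds.
 */
int parse_ipv4_proto(const struct pkt_view *pkt)
{
    assert(pkt != NULL && pkt->data != NULL && pkt->data <= pkt->data_end);

    size_t len = (size_t)(pkt->data_end - pkt->data);
    if (len < ETH_HDR_LEN)
        return -1;                       /* no room for an Ethernet header */

    const uint8_t *eth = pkt->data;
    unsigned int ethertype = ((unsigned int)eth[12] << 8) | eth[13];
    if (ethertype != ETHERTYPE_IPV4)
        return -1;

    if (len < ETH_HDR_LEN + IPV4_MIN_HDR_LEN)
        return -1;                       /* no room for a minimal IPv4 header */

    const uint8_t *ip = eth + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)               /* version nibble must be 4 */
        return -1;

    return ip[9];                        /* protocol field */
}
```

In a real program the same checks are usually written as pointer comparisons against `data_end`; the verifier rejects the program if any access is not provably covered by such a check.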
Where eBPF attaches
• Networking: XDP (very early ingress), tc (egress/ingress shaping), socket filters, cgroup hooks for per-service policy.
• Observability: kprobes/tracepoints, uprobes for user processes, perf events for profiling.
• Security: Linux Security Module (LSM) hooks to enforce fine-grained policies; audit-style telemetry; process/file/network monitoring.
Common use cases
• Networking: DDoS mitigation, load balancing, service mesh acceleration, flow metering.
• Observability: system and app tracing, syscall audit, latency histograms, flame graphs.
• Security: runtime policy, anomaly detection, prevention (e.g., block certain syscalls or IPs), forensics at source.
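The latency histograms mentioned above are often built with power-of-two buckets, because the bucket index can be computed with a short, bounded loop. The sketch below is plain C with hypothetical names (`log2_slot`, `hist_record`, `HIST_SLOTS`); in an eBPF program the `hist` array would typically live in a map rather than in ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SLOTS 32 /* hypothetical bucket count */

/*
 * Bucket index for value v: 0 for v <= 1, otherwise floor(log2(v)),
 * capped at HIST_SLOTS - 1. The loop runs at most HIST_SLOTS - 1 times.
 */
unsigned int log2_slot(uint64_t v)
{
    unsigned int slot = 0;

    while (v > 1 && slot < HIST_SLOTS - 1) {
        v >>= 1;
        slot++;
    }
    return slot;
}

/* Count one latency sample (in nanoseconds) in its bucket. */
void hist_record(uint64_t *hist, uint64_t latency_ns)
{
    assert(hist != NULL);
    hist[log2_slot(latency_ns)]++;
}
```

With this layout, bucket `n` covers latencies from `2^n` up to (but not including) `2^(n+1)` nanoseconds, so 32 counters span everything from nanoseconds to seconds.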
Development workflow (typical)
Identify a hook (e.g., tracepoint for sys_enter_execve or XDP for early packet handling).
Write a small C program using eBPF helper APIs and maps.
Compile with Clang/LLVM to BPF bytecode (-target bpf).
Load and attach via a user-space loader (libbpf-based), bpftool, or higher-level tools.
Stream events to user space (perf buffer or ring buffer) and visualize/act.
Mini example: trace process execs (C, libbpf-style)
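// prog.c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

// Runs on every entry to the execve() syscall.
SEC("tracepoint/syscalls/sys_enter_execve")
int on_execve(void *ctx) {
    // Writes a line to the kernel trace pipe (debugging aid only).
    bpf_printk("execve called\n");
    return 0;
}

// bpf_printk is a GPL-only helper, so the program declares a GPL license.
char LICENSE[] SEC("license") = "GPL";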
Build (example): clang -O2 -g -target bpf -c prog.c -o prog.o
Load/attach with a tiny libbpf-based loader or with bpftool (for tracepoints you typically use a loader that calls bpf_program__attach()). View logs via sudo cat /sys/kernel/debug/tracing/trace_pipe.
One-liners for fast exploration
• bpftrace (dynamic tracing):
bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%s\n", str(args->filename)); }'
• bpftool (introspection): list maps/programs, pin objects, dump info: sudo bpftool prog show, sudo bpftool map show.
Best practices
• Keep programs small and focused; push heavy work to user space.
• Prefer ring buffers for high-rate events; batch to reduce overhead.
• Use BTF-enabled kernels to simplify CO-RE (Compile Once, Run Everywhere) portability.
• Validate at scale: test under stress, capture edge cases, monitor verifier logs.
• Implement clear schemas for events (versioned structs) to avoid reader/writer drift.
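As one way to picture the versioned-struct advice, here is a minimal sketch of a user-space decoder that reads a small header first and only then interprets the payload. All names (`event_hdr`, `exec_event_v1`, `exec_event_v2`, `decode_exec_event`) and the field layout are assumptions made for illustration; the 16-byte `comm` field mirrors the kernel's task name length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXEC_EVENT_V1 1
#define EXEC_EVENT_V2 2

/* Every event starts with the same header so readers can dispatch safely. */
struct event_hdr { uint16_t version; uint16_t size; };

struct exec_event_v1 { struct event_hdr hdr; uint32_t pid; char comm[16]; };
struct exec_event_v2 { struct event_hdr hdr; uint32_t pid; char comm[16]; uint32_t uid; };

/* Version-independent view handed to the rest of the user-space pipeline. */
struct exec_info { uint32_t pid; uint32_t uid; int has_uid; char comm[16]; };

/* Decode one event from buf; return 0 on success, -1 if it cannot be trusted. */
int decode_exec_event(const void *buf, size_t len, struct exec_info *out)
{
    struct event_hdr hdr;

    assert(buf != NULL && out != NULL);
    if (len < sizeof hdr)
        return -1;                      /* too short to even hold a header */
    memcpy(&hdr, buf, sizeof hdr);
    if (hdr.size > len)
        return -1;                      /* writer claims more than we received */

    memset(out, 0, sizeof *out);
    if (hdr.version == EXEC_EVENT_V1 && hdr.size >= sizeof(struct exec_event_v1)) {
        struct exec_event_v1 e;
        memcpy(&e, buf, sizeof e);
        out->pid = e.pid;
        memcpy(out->comm, e.comm, sizeof out->comm);
    } else if (hdr.version == EXEC_EVENT_V2 && hdr.size >= sizeof(struct exec_event_v2)) {
        struct exec_event_v2 e;
        memcpy(&e, buf, sizeof e);
        out->pid = e.pid;
        out->uid = e.uid;
        out->has_uid = 1;
        memcpy(out->comm, e.comm, sizeof out->comm);
    } else {
        return -1;                      /* unknown version or truncated payload */
    }
    out->comm[sizeof out->comm - 1] = '\0';
    return 0;
}
```

Because unknown versions are rejected rather than guessed at, an older reader fails loudly when a newer writer is deployed instead of silently misreading fields.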
Performance & safety notes
• eBPF runs in hot paths. Minimize helper calls, avoid unbounded loops, and use per-CPU maps to reduce contention.
• The verifier is strict by design: simplify control flow, initialize variables, and keep stack usage within the 512-byte limit.
• Use CO-RE to avoid brittle kernel header dependencies; prefer libbpf skeletons for cleaner loaders.
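The per-CPU map advice above can be modeled in plain C. The idea is that each CPU increments only its own slot, so the hot path needs no locks or atomics, and user space sums the slots when it reads the value. This is a simplified sketch, not the kernel's implementation: `percpu_counter`, `percpu_inc`, `percpu_sum`, and `MAX_CPUS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 8 /* hypothetical; real systems report their CPU count */

/* One slot per CPU: writers never share a slot, so they never contend. */
struct percpu_counter {
    uint64_t slot[MAX_CPUS];
};

/* Hot path: the code running on `cpu` touches only its own slot. */
void percpu_inc(struct percpu_counter *c, unsigned int cpu, uint64_t delta)
{
    assert(c != NULL && cpu < MAX_CPUS);
    c->slot[cpu] += delta;
}

/* Cold path: the reader adds up the first `ncpu` slots to get the total. */
uint64_t percpu_sum(const struct percpu_counter *c, size_t ncpu)
{
    uint64_t total = 0;

    assert(c != NULL && ncpu <= MAX_CPUS);
    for (size_t i = 0; i < ncpu; i++)
        total += c->slot[i];
    return total;
}
```

The trade-off is that the total is only assembled at read time, which suits counters and histograms but not state that every CPU must see consistently.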
eBPF across platforms
Linux is the primary home of eBPF. There are emerging ports and projects for other OSes (e.g., eBPF for Windows), but feature parity and hook coverage vary. For production, target modern Linux distributions with a recent kernel (5.x or later) and BTF enabled.
Tooling landscape
• bpftool: official Swiss-army knife for loading/introspection.
• libbpf & CO-RE: low-level, fast path for portable BPF apps.
• BCC: Python/Lua front-ends; great for prototyping and learning.
• bpftrace: DTrace-like one-liners and scripts for rapid insights.
Security and policy with eBPF
LSM and cgroup hooks allow you to enforce decisions (deny operations, rate-limit, quarantine flows) rather than only observing them. Combine telemetry with policy for closed-loop protection (detect → decide → act).
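As an illustration of the "decide" step for rate limiting, here is a minimal token-bucket sketch in plain C. It is a model of the decision logic only, under assumed names (`token_bucket`, `tb_allow`); in an eBPF program the bucket state would typically be kept in a map and the timestamp would come from a kernel time helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-flow (or per-service) rate-limit state. */
struct token_bucket {
    uint64_t tokens;         /* tokens currently available (<= capacity) */
    uint64_t capacity;       /* burst size */
    uint64_t refill_per_sec; /* sustained rate */
    uint64_t last_ns;        /* time of the last refill */
};

/* Return 1 to allow the operation at time now_ns, 0 to deny it. */
int tb_allow(struct token_bucket *tb, uint64_t now_ns)
{
    assert(tb != NULL && now_ns >= tb->last_ns);

    /* Whole tokens earned since the last refill; a fractional remainder is
     * dropped for simplicity, and very long gaps could overflow the product. */
    uint64_t refill = (now_ns - tb->last_ns) * tb->refill_per_sec / NSEC_PER_SEC;
    if (refill > 0) {
        uint64_t room = tb->capacity - tb->tokens;
        tb->tokens += refill < room ? refill : room;
        tb->last_ns = now_ns;
    }

    if (tb->tokens == 0)
        return 0;            /* bucket empty: deny */
    tb->tokens--;
    return 1;                /* spend one token: allow */
}
```

A hook that enforces policy would translate the 0/1 result into its own verdict, for example dropping a packet or returning an error code.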
Gotchas to expect
• Kernel version drift: ensure your CI tests on representative kernels; CO-RE helps, but verify.
• Verifier rejections: simplify logic, break into helper functions, or split into multiple programs with tail calls.
• Packaging/permissions: loading eBPF programs requires capabilities (e.g., CAP_BPF on kernels 5.8 and later, CAP_SYS_ADMIN on older kernels); document prerequisites for operators.
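For the permissions gotcha, an agent can check its own capabilities at startup and print a clear message instead of failing with a bare "operation not permitted". The sketch below decodes the hexadecimal mask found on the `CapEff:` line of /proc/self/status. The helper names (`parse_cap_mask`, `has_cap`, `can_load_bpf`) are hypothetical; the bit numbers follow the Linux capability numbering. Treat the result as a hint only, since what a given program type actually needs (for example CAP_PERFMON or CAP_NET_ADMIN in addition) depends on the kernel and its configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Capability bit numbers as defined by Linux. */
enum { CAP_NUM_SYS_ADMIN = 21, CAP_NUM_PERFMON = 38, CAP_NUM_BPF = 39 };

/* Parse the hex value taken from a "CapEff:" line of /proc/<pid>/status. */
uint64_t parse_cap_mask(const char *hex)
{
    assert(hex != NULL);
    return (uint64_t)strtoull(hex, NULL, 16);
}

/* Return 1 if capability number `cap` is set in the mask, else 0. */
int has_cap(uint64_t mask, unsigned int cap)
{
    assert(cap < 64);
    return (int)((mask >> cap) & 1);
}

/* Rough pre-flight check: CAP_BPF on newer kernels, CAP_SYS_ADMIN otherwise. */
int can_load_bpf(uint64_t effective)
{
    return has_cap(effective, CAP_NUM_BPF) || has_cap(effective, CAP_NUM_SYS_ADMIN);
}
```

A loader could call `can_load_bpf(parse_cap_mask(value))` with the value it read from /proc/self/status and, on failure, tell the operator which capability to grant.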
Learning path
Start with bpftrace and BCC to explore. Move to libbpf + CO-RE for production-grade agents with stable performance, smaller footprints, and fewer runtime dependencies. Add a user-space pipeline that standardizes events, batches IO, and exports metrics/logs to your observability stack (Prometheus/OpenTelemetry/ELK).
Further reading & reference links
• Kernel eBPF docs: docs.kernel.org/bpf
• libbpf CO-RE guide (reference implementers): github.com/libbpf/libbpf
• BCC tools & tutorials: github.com/iovisor/bcc
• bpftrace docs: bpftrace.org
• bpftool manual: man7.org/linux/man-pages/man8/bpftool.8.html
• Cilium (networking with eBPF): cilium.io
• Brendan Gregg’s eBPF articles: brendangregg.com/ebpf.html
Closing thought
eBPF turns the kernel into a programmable platform for precise, low-overhead visibility and enforcement. With careful design and the right tooling, you can build high-performance networking, observability, and security features that adapt quickly—without kernel patches or restarts.