Seccomp-bpf for Autonomous Agents

By Team Berialabs • May 15, 2026

The first time one of our agents deleted something it shouldn't have was a Tuesday at 21:47. I remember the time because the disk-full alert came in just as I was turning off the screen. The agent, in the middle of a reasoning chain about a privilege-escalation CTF, had decided that the best way to "clean up temporary artifacts" was to run find / -name "*.tmp" -delete. It didn't touch anything critical because the container was isolated, but it did leave the base image we used for our fuzzing runners unusable. We had to rebuild it the following morning.

That night I wrote a sentence in my notebook that is still stuck to the monitor: "an LLM doesn't understand what irreversible means". From there grew the sandboxing model that Gwaihir CLI, our execution engine, uses today. And the piece that took us the most effort to tune was seccomp-bpf.

Why seccomp-bpf and not something else

The reasonable question is: if Docker, gVisor, Firecracker and Kata exist, why would we write BPF filters by hand? The short answer is that none of them fit the workflow of an agent that fires off offensive tools in bursts.

Docker in runc mode shares its kernel with the host, and its default seccomp profile lets through more than 300 syscalls. That's reasonable for a web service, not for something that decides at runtime to call ptrace or kexec_load. Firecracker (Agache et al., 2020) gave us a microVM per agent, but the cost of booting a VM for every invocation of nmap -sS broke our latency budget: 125 ms vs 4 ms for a fork+exec with seccomp applied.

gVisor was the strong candidate. Its Sentry intercepts syscalls in userspace and dramatically reduces the kernel surface. But Young et al.'s study at HotCloud '19 had already measured what we later confirmed on our own benches: file opens 216 times slower, syscalls 2.2 times slower. When an agent fires up ffuf with 200 threads against an endpoint, that penalty shows.

Seccomp-bpf, on the other hand, is almost free. The filter compiles to BPF, is attached to the task inside the kernel, and costs on the order of tens of nanoseconds per syscall. Most importantly, we control it rule by rule. As the kernel documentation puts it, system call filtering "isn't a sandbox"; it is "meant to be a tool for sandbox developers to use"^[1]. That is exactly what we needed.

Minimal profiles per tool

The model is per tool, not per agent. When Gwaihir decides to run nmap, it doesn't load the sqlmap profile. Each binary the agent can invoke has its own .scprofile file describing the minimal set of syscalls observed in a strace -c trace, broadened by several passes of real usage.

The build process is boring, which is why it works. We run the tool against a lab target while perf trace is running, we collect the syscall list, cross-check it against the documentation, and discard anything clearly dangerous even if it shows up (for example, nmap should never need unshare or mount). What remains is the baseline.
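The filtering step described above can be sketched in a few lines of Rust. This is a minimal illustration, not Gwaihir's actual tooling: the function name build_baseline and the NEVER_ALLOW list are assumptions (the list simply reuses the forbidden syscalls from the nmap excerpt later in this post), and the input is assumed to be one syscall name per observation, however it was collected.

```rust
use std::collections::BTreeSet;

/// Syscalls we refuse to allow even when a trace shows them.
/// Illustrative list, borrowed from the forbidden set in the nmap excerpt.
const NEVER_ALLOW: &[&str] = &[
    "ptrace", "kexec_load", "init_module",
    "delete_module", "reboot", "mount", "unshare",
];

/// Split observed syscall names into (baseline, rejected).
/// Duplicates collapse and both sets come back sorted, so the
/// resulting profile is stable from one run to the next.
pub fn build_baseline<'a, I>(observed: I) -> (BTreeSet<&'a str>, BTreeSet<&'a str>)
where
    I: IntoIterator<Item = &'a str>,
{
    let mut baseline = BTreeSet::new();
    let mut rejected = BTreeSet::new();
    for name in observed {
        let name = name.trim();
        if name.is_empty() {
            continue; // ignore blank lines in the trace output
        }
        if NEVER_ALLOW.contains(&name) {
            rejected.insert(name); // seen in the trace, but never allowed
        } else {
            baseline.insert(name);
        }
    }
    (baseline, rejected)
}
```

Returning the rejected set separately keeps the "it showed up but we dropped it" decisions visible for review instead of silently discarding them.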

For nmap, on a typical TCP SYN scan, the list fits on a single page. For sqlmap it's longer because the Python interpreter asks for more things. For ffuf, written in Go, there are surprises: the Go runtime calls rt_sigaction and mmap in industrial quantities, and forgetting one of them hangs the process with no clear diagnosis.

The negotiation protocol with Sentinel

This is where things get interesting for agents. A static profile is not enough. If the LLM decides mid-execution that it needs to resolve DNS to enrich a finding, it will call socket(AF_NETLINK, ...), which is not in the nmap profile. With no further mechanism, the process receives SIGSYS and dies.

What we did was put a seccomp notification channel (SECCOMP_RET_USER_NOTIF, available since kernel 5.0^[2]) between the child process and a supervisor we call Sentinel. When the filter encounters a syscall marked as "negotiable", the kernel suspends the calling thread and sends the request to Sentinel. Sentinel evaluates it against a declarative policy, optionally consults the model with a short prompt like "is this connect() to 10.10.0.0/8 justified by the current task?", and returns continue (SECCOMP_USER_NOTIF_FLAG_CONTINUE, which requires kernel 5.5 or later) or abort.
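The declarative half of that decision can be sketched as follows. Everything here is hypothetical: the names Verdict, Cidr and ConnectPolicy, and the idea of a policy made of address ranges plus ports, are assumptions for illustration and not Sentinel's real schema. The kernel side (reading the request from the notification file descriptor and sending the response) is left out. One caveat worth keeping in mind: seccomp_unotify(2) warns that arguments read from the target's memory, such as the sockaddr passed to connect(), can change between the supervisor's check and the kernel's own read when the call is allowed to continue.

```rust
use std::net::Ipv4Addr;

/// What the supervisor answers for a suspended syscall.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Continue,
    Abort,
}

/// An IPv4 range in CIDR form (hypothetical policy element).
#[derive(Debug, Clone, Copy)]
pub struct Cidr {
    pub base: Ipv4Addr,
    pub prefix: u8,
}

impl Cidr {
    /// True when `addr` falls inside this range.
    pub fn contains(&self, addr: Ipv4Addr) -> bool {
        let prefix = u32::from(self.prefix.min(32));
        // A /0 matches everything; shifting a u32 by 32 would overflow.
        let mask = if prefix == 0 { 0 } else { u32::MAX << (32 - prefix) };
        (u32::from(addr) & mask) == (u32::from(self.base) & mask)
    }
}

/// Declarative policy for negotiable connect() calls.
pub struct ConnectPolicy {
    pub allowed_ranges: Vec<Cidr>,
    pub allowed_ports: Vec<u16>,
}

impl ConnectPolicy {
    /// Continue only when both the destination and the port are allowed.
    pub fn evaluate(&self, dest: Ipv4Addr, port: u16) -> Verdict {
        let range_ok = self.allowed_ranges.iter().any(|r| r.contains(dest));
        let port_ok = self.allowed_ports.contains(&port);
        if range_ok && port_ok {
            Verdict::Continue
        } else {
            Verdict::Abort
        }
    }
}
```

With a policy of 10.10.0.0/16 and ports 80 and 443, a connect to 10.10.3.7:443 would continue while 10.11.3.7:443 or 10.10.3.7:22 would abort. A model consultation, if used, would sit behind this check rather than replace it.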

It's a pattern similar to AgentBound (Securing AI Agent Execution, 2025), but at the kernel level rather than at the MCP layer. The important difference is that the agent cannot bypass Sentinel: the BPF filter is installed, with no_new_privs set, before the execve. An installed seccomp filter cannot be removed and is inherited across fork and execve, so not even a malicious binary can undo it.

A real filter

This is a simplified excerpt of the profile we use for nmap, written with libseccomp in Rust (the Gwaihir runtime weighs 13.2 MB and is built from a mix of Rust for the logic and Zig for the hot paths of tracing).

use libseccomp::*;

fn build_nmap_profile() -> Result<ScmpFilterContext, Box<dyn std::error::Error>> {
    // Default action: deliver SIGSYS to the calling thread so a handler can log the syscall.
    let mut ctx = ScmpFilterContext::new(ScmpAction::Trap)?;

    // Basic I/O
    for sc in &["read", "write", "close", "fstat", "lseek",
                "openat", "pread64", "pwrite64"] {
        ctx.add_rule(ScmpAction::Allow, ScmpSyscall::from_name(sc)?)?;
    }

    // Memory
    for sc in &["mmap", "munmap", "mprotect", "brk", "madvise"] {
        ctx.add_rule(ScmpAction::Allow, ScmpSyscall::from_name(sc)?)?;
    }

    // Network: only what nmap needs for a SYN scan
    ctx.add_rule(ScmpAction::Allow, ScmpSyscall::from_name("socket")?)?;
    ctx.add_rule(ScmpAction::Allow, ScmpSyscall::from_name("setsockopt")?)?;
    ctx.add_rule(ScmpAction::Allow, ScmpSyscall::from_name("sendto")?)?;
    ctx.add_rule(ScmpAction::Allow, ScmpSyscall::from_name("recvfrom")?)?;

    // Negotiable: the agent can request these via Sentinel.
    ctx.add_rule(ScmpAction::Notify, ScmpSyscall::from_name("connect")?)?;
    ctx.add_rule(ScmpAction::Notify, ScmpSyscall::from_name("execve")?)?;

    // Explicitly forbidden.
    for sc in &["ptrace", "kexec_load", "init_module",
                "delete_module", "reboot", "mount", "unshare"] {
        ctx.add_rule(ScmpAction::KillProcess, ScmpSyscall::from_name(sc)?)?;
    }

    // Have libseccomp set no_new_privs when the filter is loaded.
    ctx.set_filter_attr(ScmpFilterAttr::CtlNnp, 1)?;
    Ok(ctx)
}

Three details matter. First, the default action is SCMP_ACT_TRAP, not SCMP_ACT_KILL. The reason is diagnostic: we want the SIGSYS handler to capture the syscall and log it before the process dies. Second, the negotiable syscalls use SCMP_ACT_NOTIFY, not SCMP_ACT_ALLOW; that's what opens the channel with Sentinel. Third, the forbidden ones get an explicit SCMP_ACT_KILL_PROCESS even though the default would already have blocked them, because we want that decision to be legible in a future audit and because, unlike a trap, a kill cannot be intercepted by a signal handler.

Trade-offs we learned the hard way

The first is portability. Syscall numbers change between architectures (x86_64, aarch64, riscv64), and libseccomp abstracts that fairly well, but there are edge cases: socketcall, which multiplexes the socket calls on some 32-bit ABIs; arch_prctl, which only exists on x86; and statx, which only appeared in kernel 4.11. We keep a test matrix per architecture.
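To make the portability problem concrete, here is a small sketch of resolving a name-based profile per architecture. The Arch enum and both function names are assumptions for illustration, and the table is a tiny hand-picked excerpt: the numbers come from the x86_64 syscall table and from the generic table that aarch64 and riscv64 share. In practice libseccomp performs this resolution for you.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arch {
    X86_64,
    Aarch64,
    Riscv64,
}

/// Look up a syscall number by name. Returns None when the syscall
/// does not exist on that architecture (or is not in this excerpt).
pub fn syscall_number(arch: Arch, name: &str) -> Option<u32> {
    match (arch, name) {
        (Arch::X86_64, "connect") => Some(42),
        (Arch::X86_64, "arch_prctl") => Some(158),
        (Arch::X86_64, "openat") => Some(257),
        (Arch::X86_64, "statx") => Some(332),
        // aarch64 and riscv64 both use the generic syscall table.
        (Arch::Aarch64 | Arch::Riscv64, "openat") => Some(56),
        (Arch::Aarch64 | Arch::Riscv64, "connect") => Some(203),
        (Arch::Aarch64 | Arch::Riscv64, "statx") => Some(291),
        _ => None,
    }
}

/// Resolve a profile written as names into numbers for one architecture,
/// returning the names that could not be resolved so a test can flag them.
pub fn resolve_profile<'a>(arch: Arch, names: &[&'a str]) -> (Vec<u32>, Vec<&'a str>) {
    let mut numbers = Vec::new();
    let mut missing = Vec::new();
    for &name in names {
        match syscall_number(arch, name) {
            Some(nr) => numbers.push(nr),
            None => missing.push(name),
        }
    }
    (numbers, missing)
}
```

A per-architecture test matrix can then assert that the missing list contains only the entries you expect, such as arch_prctl on anything that is not x86.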

The second is debugging. When a profile kills a process for a syscall that wasn't on the list, the message you see is Bad system call (core dumped). Nothing more. We had to write a wrapper that parses the siginfo_t of SIGSYS, extracts the syscall number, cross-references it against the syscall table, and prints something legible. The first time we saw "statx blocked by profile nmap.scprofile, suggest adding to baseline" instead of a core dump, we nearly cried with relief.
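The reporting half of such a wrapper might look like the sketch below. It assumes the syscall number has already been extracted from the si_syscall field of the SIGSYS siginfo_t, which is the part that needs platform bindings and is omitted here. The function name, the table excerpt (x86_64 numbers only) and the wording of the fallback message are illustrative assumptions.

```rust
/// Small excerpt of the x86_64 syscall table: (number, name).
const X86_64_NAMES: &[(u32, &str)] = &[
    (42, "connect"),
    (59, "execve"),
    (101, "ptrace"),
    (272, "unshare"),
    (332, "statx"),
    (435, "clone3"),
];

/// Turn a blocked syscall number into a message a human can act on.
pub fn describe_violation(nr: u32, profile: &str) -> String {
    match X86_64_NAMES.iter().find(|(n, _)| *n == nr) {
        Some((_, name)) => format!(
            "{} blocked by profile {}, suggest adding to baseline",
            name, profile
        ),
        // Unknown numbers are still reported so they can be looked up by hand.
        None => format!(
            "syscall #{} blocked by profile {} (not in lookup table)",
            nr, profile
        ),
    }
}
```

A fuller version would probably withhold the "suggest adding" hint for syscalls on the forbidden list, since nobody should be nudged into allowing ptrace.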

The third is false positives. When a libc update starts using clone3 instead of clone, all your profiles break at once. We learned that when glibc 2.34 landed in one of the base images. We now have a weekly job that regenerates baselines and warns us if the diff is suspiciously large.
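The comparison at the heart of that weekly job can be sketched like this. The names BaselineDiff, diff_baselines and is_suspicious are hypothetical, and the threshold is an assumption: what counts as "suspiciously large" depends on how often your base images change.

```rust
use std::collections::BTreeSet;

/// Syscalls that appeared in or disappeared from a regenerated baseline.
#[derive(Debug, PartialEq, Eq)]
pub struct BaselineDiff {
    pub added: Vec<String>,
    pub removed: Vec<String>,
}

/// Compare last week's baseline with the freshly generated one.
pub fn diff_baselines(old: &BTreeSet<String>, new: &BTreeSet<String>) -> BaselineDiff {
    BaselineDiff {
        added: new.difference(old).cloned().collect(),
        removed: old.difference(new).cloned().collect(),
    }
}

impl BaselineDiff {
    /// Flag the diff for human review when it touches more than
    /// `max_changes` syscalls in total.
    pub fn is_suspicious(&self, max_changes: usize) -> bool {
        self.added.len() + self.removed.len() > max_changes
    }
}
```

A clone to clone3 switch shows up as one addition and one removal, which is the kind of small, explainable diff you want to approve quickly.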

The fourth, and the one that surprised us the most, is the overhead of very large filters. BPF executes on every syscall, and a filter with hundreds of cascading rules makes itself felt. We measured a 3-4% penalty on syscall-heavy workloads with a 180-rule profile, versus 0.8% with one of 40. Reordering the rules so that frequent syscalls were evaluated first recovered most of that time.
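One way to derive that ordering is to rank syscalls by how often they appear in a trace and turn the rank into a priority. The sketch below assumes the counts come from something like a strace -c summary; the function name is hypothetical. The 0 to 255 range matches what libseccomp's seccomp_syscall_priority() accepts, where a higher value means the syscall is checked earlier.

```rust
use std::collections::HashMap;

/// Map observed call counts to priorities in the 0..=255 range,
/// giving the most frequent syscall the highest value.
pub fn priorities(counts: &HashMap<&str, u64>) -> Vec<(String, u8)> {
    let mut by_freq: Vec<(&str, u64)> = counts.iter().map(|(n, c)| (*n, *c)).collect();
    // Most frequent first; ties broken by name so the output is deterministic.
    by_freq.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    by_freq
        .iter()
        .enumerate()
        .map(|(rank, (name, _))| {
            // Ranks beyond 255 all share the lowest priority.
            let prio = 255u8.saturating_sub(rank.min(255) as u8);
            (name.to_string(), prio)
        })
        .collect()
}
```

Each pair can then be passed to the filter builder before the filter is loaded, so the hot syscalls are matched near the top of the generated BPF program.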

What would have saved us time to know earlier

Three things. The first: start with SCMP_ACT_LOG instead of TRAP during profile development. You log everything, run the tool against real workloads for a whole day, and then build the list. We did it the other way around and lost weeks chasing crashes.
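With SCMP_ACT_LOG, the kernel records each logged syscall through the audit subsystem, so building the list becomes a matter of reading those records. The sketch below assumes records that carry a syscall=N field and are tagged either type=SECCOMP (as auditd writes them) or type=1326 (the numeric form seen in the kernel log). The function name is hypothetical, and the exact record layout on your system should be checked before relying on it.

```rust
use std::collections::BTreeSet;

/// Collect the distinct syscall numbers mentioned in seccomp audit records.
pub fn logged_syscalls(log: &str) -> BTreeSet<u32> {
    log.lines()
        // Keep only seccomp records; other audit events are ignored.
        .filter(|line| line.contains("type=SECCOMP") || line.contains("type=1326"))
        .flat_map(|line| line.split_whitespace())
        // Pick out the syscall=N field and parse N.
        .filter_map(|field| field.strip_prefix("syscall="))
        .filter_map(|nr| nr.parse::<u32>().ok())
        .collect()
}
```

The resulting numbers still need translating to names for the target architecture before they go into a profile.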

The second: user notification (USER_NOTIF) is what turns seccomp into something useful for agents. Without it, a static profile is a corset. With it, you get dynamic policy without sacrificing privilege separation.

The third: measure the real cost from day one. Published benchmarks are useful but your workloads are weird. An agent that calls a short tool fifty times per second has a very different cost profile from a long-running server.
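A measurement harness for this does not need to be elaborate. The sketch below is one assumed shape, with hypothetical function names: time an operation over many runs, then multiply the mean by the call rate to see what fraction of each second goes to launching tools.

```rust
use std::time::{Duration, Instant};

/// Run `op` `runs` times and return the mean wall-clock time per run.
pub fn mean_latency<F: FnMut()>(runs: u32, mut op: F) -> Duration {
    assert!(runs > 0, "need at least one run");
    let start = Instant::now();
    for _ in 0..runs {
        op();
    }
    start.elapsed() / runs
}

/// Fraction of each second spent on launch overhead at a given call rate.
/// Values above 1.0 mean the launches alone cannot keep up sequentially.
pub fn budget_share(mean: Duration, calls_per_sec: f64) -> f64 {
    mean.as_secs_f64() * calls_per_sec
}

// Example use (spawns a trivial process; swap in the sandboxed launcher):
//
//     use std::process::Command;
//     let mean = mean_latency(50, || {
//         Command::new("true").status().expect("spawn failed");
//     });
//     println!("share of each second: {:.2}", budget_share(mean, 50.0));
```

Plugging in the figures quoted earlier, 4 ms per launch at fifty calls per second uses about 0.2 of each second, while 125 ms per launch would need 6.25 seconds of work per second, which is why the microVM option did not fit this workload.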

The filter doesn't protect against everything. It protects against the stupid. And it turns out the stupid is 90% of what an LLM gets wrong when you let it loose.

Today Gwaihir runs exploits proposed by models without any of us watching the screen waiting for a catastrophe. Not because we trust the model, but because we trust the filter. And that, after Tuesday 21:47, is a life-changing thing.

References

Team Berialabs

Member of Berialabs, specializing in AI-assisted offensive security.
