Logging Packets

In the previous chapter, our XDP application ran until Ctrl-C was hit and permitted all the traffic. Each time a packet was received, the BPF program created a log entry. Let's expand this program to log the traffic that is being permitted in the user-space application instead of the BPF program.

Source Code

Full code for the example in this chapter is available here

Getting Data to User-Space

Sharing Data

To get data from kernel-space to user-space we use an eBPF map. There are numerous types of maps to choose from, but in this example we'll be using a PerfEventArray.

While we could go all out and extract data all the way up to L7, we'll constrain our firewall to L3 and, to make things easier, IPv4 only. The data structure that we send to user-space needs to hold an IPv4 address and a Permit/Deny action. We'll encode each of them as a u32.

myapp-common/src/lib.rs
#![no_std]

#[repr(C)]
#[derive(Clone, Copy)]
pub struct PacketLog {
    pub ipv4_address: u32,
    pub action: u32,
}

#[cfg(feature = "user")]
unsafe impl aya::Pod for PacketLog {} // (1)
  1. We implement the aya::Pod trait for our struct since it is Plain Old Data and can be safely converted to a byte-slice and back.

Alignment, padding and verifier errors

At program load time, the eBPF verifier checks that all the memory used is properly initialized. This can be a problem if - to ensure alignment - the compiler inserts padding bytes between fields in your types.

Example:

#[repr(C)]
struct SourceInfo {
    source_port: u16,
    source_ip: u32,
}

let port = ...;
let ip = ...;
let si = SourceInfo { source_port: port, source_ip: ip };

In the example above, the compiler will insert two extra bytes between the struct fields source_port and source_ip to make sure that source_ip is correctly aligned to a 4-byte boundary (assuming mem::align_of::<u32>() == 4). Since padding bytes are typically not initialized by the compiler, this will result in the infamous invalid indirect read from stack verifier error.
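The padding can be observed from ordinary user-space Rust. The following is a minimal sketch, not eBPF code: it redefines SourceInfo locally and uses two hypothetical helpers, padding_bytes and source_ip_offset, that are not part of Aya. It assumes a target where u32 is 4-byte aligned.

```rust
use std::mem;

#[repr(C)]
#[allow(dead_code)]
struct SourceInfo {
    source_port: u16,
    source_ip: u32,
}

/// Hypothetical helper: number of bytes in SourceInfo that belong to no field.
fn padding_bytes() -> usize {
    let fields = mem::size_of::<u16>() + mem::size_of::<u32>();
    mem::size_of::<SourceInfo>() - fields
}

/// Hypothetical helper: byte offset of `source_ip` from the start of the struct.
fn source_ip_offset() -> usize {
    let si = SourceInfo { source_port: 0, source_ip: 0 };
    let base = &si as *const SourceInfo as usize;
    let field = &si.source_ip as *const u32 as usize;
    field - base
}

fn main() {
    println!("size of SourceInfo = {}", mem::size_of::<SourceInfo>());
    println!("padding bytes      = {}", padding_bytes());
    println!("source_ip offset   = {}", source_ip_offset());
}
```

Under that alignment assumption the struct occupies 8 bytes although its fields only need 6, and source_ip starts at offset 4. The two bytes at offsets 2 and 3 are the ones the verifier treats as uninitialized.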

To avoid the error, you can either manually ensure that all the fields in your types are correctly aligned (e.g. by explicitly adding padding or by making field types larger to enforce alignment) or use #[repr(packed)]. Since the latter comes with its own footguns and can perform less efficiently, explicitly adding padding or tweaking alignment is recommended.

Solution ensuring alignment using larger types:

#[repr(C)]
struct SourceInfo {
    source_port: u32,
    source_ip: u32,
}

let port = ...;
let ip = ...;
let si = SourceInfo { source_port: port, source_ip: ip };

Solution with explicit padding:

#[repr(C)]
struct SourceInfo {
    source_port: u16,
    padding: u16,
    source_ip: u32,
}

let port = ...;
let ip = ...;
let si = SourceInfo { source_port: port, padding: 0, source_ip: ip };
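One way to keep such a struct padding-free as it evolves is a compile-time size check. This is a sketch of a general Rust technique rather than something the Aya template provides; has_no_padding is a hypothetical helper, and in a no_std eBPF crate core::mem would take the place of std::mem.

```rust
use std::mem;

#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct SourceInfo {
    source_port: u16,
    padding: u16,
    source_ip: u32,
}

// Compilation fails if the struct is larger than the sum of its fields,
// which would mean the compiler had to insert padding somewhere.
const _: () = assert!(
    mem::size_of::<SourceInfo>() == mem::size_of::<u16>() * 2 + mem::size_of::<u32>()
);

/// Hypothetical runtime version of the same check.
fn has_no_padding() -> bool {
    mem::size_of::<SourceInfo>() == mem::size_of::<u16>() * 2 + mem::size_of::<u32>()
}

fn main() {
    println!("padding-free: {}", has_no_padding());
}
```

If someone later removes the explicit padding field, the constant assertion stops the build instead of leaving the problem to surface as a verifier error at load time.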

Writing Data

Generating Bindings To vmlinux.h

To get useful data to add to our maps, we first need some useful data structures to populate with data from the XdpContext. We want to log the Source IP Address of incoming traffic, so we'll need to:

  1. Read the Ethernet Header to determine if this is an IPv4 Packet
  2. Read the Source IP Address from the IPv4 Header

The two structs in the kernel for this are ethhdr from uapi/linux/if_ether.h and iphdr from uapi/linux/ip.h. If I were to use bindgen to generate Rust bindings for those headers, I'd be tied to the kernel version of the system that I'm developing on. This is where aya-tool comes into play. It can generate bindings for specific kernel structures using the BTF information in /sys/kernel/btf/vmlinux.

First, we must make sure that bindgen is installed.

cargo install bindgen

Once the bindings are generated and checked in to our repository, they shouldn't need to be regenerated unless we need to add a new struct.

Let's use xtask to automate this so we can easily reproduce this file in the future.

We'll add the following code:

xtask/src/codegen.rs
use aya_tool::generate::InputFile;
use std::{fs::File, io::Write, path::PathBuf};

pub fn generate() -> Result<(), anyhow::Error> {
    let dir = PathBuf::from("myapp-ebpf/src");
    let names: Vec<&str> = vec!["ethhdr", "iphdr"];
    let bindings = aya_tool::generate(
        InputFile::Btf(PathBuf::from("/sys/kernel/btf/vmlinux")),
        &names,
        &[],
    )?;
    // Write the bindings to myapp-ebpf/src/bindings.rs.
    let mut out = File::create(dir.join("bindings.rs"))?;
    write!(out, "{}", bindings)?;
    Ok(())
}

xtask/Cargo.toml
[package]
name = "xtask"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
anyhow = "1"
clap = { version = "3.1", features = ["derive"] }
aya-tool = { git = "https://github.com/aya-rs/aya", branch = "main" }

xtask/src/main.rs
mod build_ebpf;
mod run;
mod codegen;

use std::process::exit;

use clap::Parser;

#[derive(Debug, Parser)]
pub struct Options {
    #[clap(subcommand)]
    command: Command,
}

#[derive(Debug, Parser)]
enum Command {
    BuildEbpf(build_ebpf::Options),
    Run(run::Options),
    Codegen,
}

fn main() {
    let opts = Options::parse();

    use Command::*;
    let ret = match opts.command {
        BuildEbpf(opts) => build_ebpf::build_ebpf(opts),
        Run(opts) => run::run(opts),
        Codegen => codegen::generate(),
    };

    if let Err(e) = ret {
        eprintln!("{:#}", e);
        exit(1);
    }
}

Once we've generated our file by running cargo xtask codegen from the root of the project, we can access the bindings by including mod bindings in our eBPF code.

Getting Packet Data From The Context And Into the Map

The XdpContext contains two fields, data and data_end. data is a pointer to the start of the data in kernel memory and data_end, a pointer to the end of the data in kernel memory. In order to access this data and ensure that the eBPF verifier is happy, we'll introduce a helper function called ptr_at. This function will ensure that before we access any data, we check that it's contained between data and data_end. It is marked as unsafe because when calling the function, you must ensure that there is a valid T at that location or there will be undefined behaviour.
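The same bounds-checking idea can be tried out in user-space. In the sketch below a byte slice stands in for the region between data and data_end; bytes_at, read_u16_be and read_u32_be are hypothetical helpers written for this illustration, not part of Aya, and they use safe slice indexing instead of raw pointers.

```rust
/// Copies `N` bytes starting at `offset`, or fails if that range does not
/// fit inside `packet` (the analogue of checking against data_end).
fn bytes_at<const N: usize>(packet: &[u8], offset: usize) -> Result<[u8; N], ()> {
    // checked_add guards against overflow of offset + N.
    let end = offset.checked_add(N).ok_or(())?;
    let slice = packet.get(offset..end).ok_or(())?;
    let mut out = [0u8; N];
    out.copy_from_slice(slice);
    Ok(out)
}

/// Reads a big-endian (network byte order) u16 at `offset`.
fn read_u16_be(packet: &[u8], offset: usize) -> Result<u16, ()> {
    Ok(u16::from_be_bytes(bytes_at::<2>(packet, offset)?))
}

/// Reads a big-endian (network byte order) u32 at `offset`.
fn read_u32_be(packet: &[u8], offset: usize) -> Result<u32, ()> {
    Ok(u32::from_be_bytes(bytes_at::<4>(packet, offset)?))
}

fn main() {
    // A 14-byte buffer: two bytes at offset 12 fit, two bytes at offset 13 do not.
    let mut frame = [0u8; 14];
    frame[12] = 0x08;
    println!("{:?}", read_u16_be(&frame, 12));
    println!("{:?}", read_u16_be(&frame, 13));
    println!("{:?}", read_u32_be(&frame, 10));
}
```

In the eBPF program the check has to be written against the context's raw pointers so that the verifier can follow it, which is why ptr_at works with data and data_end directly instead of with a slice.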

With our helper function in place, we can:

  1. Read the Ethertype field to check if we have an IPv4 packet.
  2. Read the IPv4 Source Address from the IP header

To do this efficiently we'll add a dependency on memoffset = "0.6" in our myapp-ebpf/Cargo.toml.

Reading Fields Using offset_of!

As there is limited stack space, it's more memory efficient to use the offset_of! macro to read a single field from a struct, rather than reading the whole struct and accessing the field by name.
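To make the offsets concrete, here is a user-space sketch that computes them for hand-written stand-ins of the two headers. EthHdr and IpHdr are assumptions made for illustration: they follow the kernel's field order, but the generated bindings use different type names and represent the version/IHL bitfield differently. The helpers h_proto_offset and saddr_offset are hypothetical and use std::ptr::addr_of! rather than memoffset so the example has no dependencies.

```rust
use std::mem;
use std::ptr;

/// Stand-in for the kernel's `ethhdr` (assumed packed, 14 bytes).
#[repr(C, packed)]
#[derive(Clone, Copy, Default)]
#[allow(dead_code)]
struct EthHdr {
    h_dest: [u8; 6],
    h_source: [u8; 6],
    h_proto: u16,
}

/// Stand-in for `iphdr`; the version/IHL bitfield is collapsed into one byte.
#[repr(C)]
#[derive(Clone, Copy, Default)]
#[allow(dead_code)]
struct IpHdr {
    version_ihl: u8,
    tos: u8,
    tot_len: u16,
    id: u16,
    frag_off: u16,
    ttl: u8,
    protocol: u8,
    check: u16,
    saddr: u32,
    daddr: u32,
}

/// Byte offset of the EtherType field inside the Ethernet header.
fn h_proto_offset() -> usize {
    let hdr = EthHdr::default();
    // addr_of! avoids creating a reference to a packed (unaligned) field.
    ptr::addr_of!(hdr.h_proto) as usize - ptr::addr_of!(hdr) as usize
}

/// Byte offset of the source address inside the IPv4 header.
fn saddr_offset() -> usize {
    let hdr = IpHdr::default();
    ptr::addr_of!(hdr.saddr) as usize - ptr::addr_of!(hdr) as usize
}

fn main() {
    let eth_hdr_len = mem::size_of::<EthHdr>();
    println!("EtherType at frame byte {}", h_proto_offset());
    println!("source address at frame byte {}", eth_hdr_len + saddr_offset());
}
```

With these layouts the EtherType sits at byte 12 of the frame and the IPv4 source address at byte 14 + 12 = 26, so the program only needs to read 2 and 4 bytes at those positions rather than copying 14 and 20 bytes of headers onto the stack.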

Once we have our IPv4 source address, we can create a PacketLog struct and output it to our PerfEventArray.

The resulting code looks like this:

myapp-ebpf/src/main.rs
#![no_std]
#![no_main]
#![allow(nonstandard_style, dead_code)]

use aya_bpf::{
    bindings::xdp_action,
    macros::{map, xdp},
    maps::PerfEventArray,
    programs::XdpContext,
};

use core::mem;
use memoffset::offset_of;
use myapp_common::PacketLog;

mod bindings;
use bindings::{ethhdr, iphdr};

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe { core::hint::unreachable_unchecked() }
}

#[map(name = "EVENTS")] // (1)
static mut EVENTS: PerfEventArray<PacketLog> =
    PerfEventArray::<PacketLog>::with_max_entries(1024, 0);

#[xdp]
pub fn xdp_firewall(ctx: XdpContext) -> u32 {
    match try_xdp_firewall(ctx) {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

#[inline(always)] // (2)
unsafe fn ptr_at<T>(ctx: &XdpContext, offset: usize) -> Result<*const T, ()> {
    let start = ctx.data();
    let end = ctx.data_end();
    let len = mem::size_of::<T>();

    if start + offset + len > end {
        return Err(());
    }

    Ok((start + offset) as *const T)
}

fn try_xdp_firewall(ctx: XdpContext) -> Result<u32, ()> {
    let h_proto = u16::from_be(unsafe {
        *ptr_at(&ctx, offset_of!(ethhdr, h_proto))? // (3)
    });
    if h_proto != ETH_P_IP {
        return Ok(xdp_action::XDP_PASS);
    }
    let source = u32::from_be(unsafe {
        *ptr_at(&ctx, ETH_HDR_LEN + offset_of!(iphdr, saddr))?
    });

    let log_entry = PacketLog {
        ipv4_address: source,
        action: xdp_action::XDP_PASS,
    };
    unsafe {
        EVENTS.output(&ctx, &log_entry, 0); // (4)
    }
    Ok(xdp_action::XDP_PASS)
}

const ETH_P_IP: u16 = 0x0800;
const ETH_HDR_LEN: usize = mem::size_of::<ethhdr>();
  1. Create our map
  2. Here's ptr_at, which ensures packet access is bounds checked
  3. Using ptr_at to read the EtherType field of our Ethernet header
  4. Outputting the event to the PerfEventArray

Don't forget to rebuild your eBPF program!

Reading Data

To read from the AsyncPerfEventArray, we have to call AsyncPerfEventArray::open() for each online CPU and then poll the resulting file descriptor for events. While this is doable using PerfEventArray with mio or epoll, the code is harder to follow. Instead, we'll use tokio, which was added to our template for us.

We'll need to add a dependency on bytes = "1" to myapp/Cargo.toml since this will make it easier to deal with the chunks of bytes yielded by the AsyncPerfEventArray.

Here's the code:

myapp/src/main.rs
use aya::{include_bytes_aligned, Bpf};
use anyhow::Context;
use aya::programs::{Xdp, XdpFlags};
use aya::maps::perf::AsyncPerfEventArray;
use aya::util::online_cpus;
use bytes::BytesMut;
use std::net;
use clap::Parser;
use log::info;
use tokio::{signal, task};

use myapp_common::PacketLog;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Bpf::load_file` instead.
    #[cfg(debug_assertions)]
    let mut bpf = Bpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/debug/myapp"
    ))?;
    #[cfg(not(debug_assertions))]
    let mut bpf = Bpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/release/myapp"
    ))?;
    // (1)
    let program: &mut Xdp = bpf.program_mut("xdp").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    // (2)
    let mut perf_array = AsyncPerfEventArray::try_from(bpf.map_mut("EVENTS")?)?;

    for cpu_id in online_cpus()? {
        // (3)
        let mut buf = perf_array.open(cpu_id, None)?;

        // (4)
        task::spawn(async move {
            // (5)
            let mut buffers = (0..10)
                .map(|_| BytesMut::with_capacity(1024))
                .collect::<Vec<_>>();

            loop {
                // (6)
                let events = buf.read_events(&mut buffers).await.unwrap();
                for i in 0..events.read {
                    let buf = &mut buffers[i];
                    let ptr = buf.as_ptr() as *const PacketLog;
                    // (7)
                    let data = unsafe { ptr.read_unaligned() };
                    let src_addr = net::Ipv4Addr::from(data.ipv4_address);
                    // (8)
                    info!("LOG: SRC {}, ACTION {}", src_addr, data.action);
                }
            }
        });
    }
    signal::ctrl_c().await.expect("failed to listen for event");
    Ok::<_, anyhow::Error>(())
}
  1. Name was not defined in myapp-ebpf/src/main.rs, so use xdp instead of myapp
  2. Get a handle to our EVENTS map
  3. Call open() for each online CPU
  4. Spawn a tokio::task
  5. Create buffers
  6. Read events in to buffers
  7. Use read_unaligned to read our data into a PacketLog.
  8. Log the event to the console.
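The decoding in steps 7 and 8 can be exercised without a kernel. The sketch below is an illustration under assumptions: PacketLog is redefined locally, decode is a hypothetical helper that adds a length check the tutorial code leaves out, and encode only exists to fabricate an input buffer in native byte order.

```rust
use std::mem;
use std::net::Ipv4Addr;

/// Local copy of the shared struct so the sketch compiles on its own.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PacketLog {
    ipv4_address: u32,
    action: u32,
}

/// Hypothetical helper: decode one event, rejecting buffers that are too short.
fn decode(buf: &[u8]) -> Option<PacketLog> {
    if buf.len() < mem::size_of::<PacketLog>() {
        return None;
    }
    let ptr = buf.as_ptr() as *const PacketLog;
    // SAFETY: the length was checked above, and PacketLog consists of two u32
    // fields with no padding, so every bit pattern is a valid value.
    Some(unsafe { ptr.read_unaligned() })
}

/// Stand-in for the kernel side: lay the fields out in native byte order,
/// matching the in-memory form of the #[repr(C)] struct.
fn encode(log: &PacketLog) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(mem::size_of::<PacketLog>());
    bytes.extend_from_slice(&log.ipv4_address.to_ne_bytes());
    bytes.extend_from_slice(&log.action.to_ne_bytes());
    bytes
}

fn main() {
    let sent = PacketLog {
        ipv4_address: u32::from(Ipv4Addr::new(192, 168, 1, 21)),
        action: 2,
    };
    let received = decode(&encode(&sent)).expect("buffer too short");
    println!(
        "LOG: SRC {}, ACTION {}",
        Ipv4Addr::from(received.ipv4_address),
        received.action
    );
}
```

read_unaligned is used because the byte buffer gives no guarantee that its start address is 4-byte aligned. Ipv4Addr::from(u32) expects the address in host byte order, which is why the eBPF program applies u32::from_be before storing it.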

Running the program

As before, the interface can be overridden by providing the interface name as a parameter, for example, RUST_LOG=info cargo xtask run -- --iface wlp2s0.

$ RUST_LOG=info cargo xtask run
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 192.168.1.205, ACTION 2
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 192.168.1.21, ACTION 2
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 192.168.1.21, ACTION 2
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 18.168.253.132, ACTION 2
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 18.168.253.132, ACTION 2
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 18.168.253.132, ACTION 2
[2022-10-04T12:46:05Z INFO  myapp] LOG: SRC 140.82.121.6, ACTION 2
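The ACTION 2 in each line is the numeric value of XDP_PASS. If a symbolic name is preferred in the log, a small lookup along these lines could be used in the user-space application. action_name is a hypothetical helper; the numbers follow the kernel's xdp_action enum.

```rust
/// Hypothetical helper: map an XDP action code to its symbolic name.
fn action_name(action: u32) -> &'static str {
    match action {
        0 => "XDP_ABORTED",
        1 => "XDP_DROP",
        2 => "XDP_PASS",
        3 => "XDP_TX",
        4 => "XDP_REDIRECT",
        _ => "UNKNOWN",
    }
}

fn main() {
    println!("LOG: SRC 192.168.1.21, ACTION {}", action_name(2));
}
```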