This repo contains a tool that programs eBPF maps using only an eBPF binary. It consists of two scripts:
- bpfmap-info-extractor.py: extracts BTF info from a given eBPF binary and presents a flattened view of the map type information.
- ebpf-sdk-cli.py: uses the flattened type information generated by the map info extractor and provides an interface to read, write, and update map key/value pairs.
Below we show the ebpf user sdk working on an open-source eBPF module, https://github.com/openshift/ingress-node-firewall, which is an eBPF-based firewall implemented for OpenShift and Kubernetes environments. I tested my ebpf-user-sdk with this firewall and show the execution below.
At first look, the firewall code uses a map to store firewall rules with a key of the form,
struct lpm_ip_key_st {
__u32 prefixLen;
__u32 ingress_ifindex;
__u8 ip_data[16];
} __attribute__((packed));
and value of the form
struct rulesVal_st {
struct ruleType_st rules[MAX_RULES_PER_TARGET];
} __attribute__((packed));
which is an array of rules,
struct ruleType_st {
__u32 ruleId;
__u8 protocol;
__u16 dstPortStart;
__u16 dstPortEnd;
__u8 icmpType;
__u8 icmpCode;
__u8 action;
} __attribute__((packed));
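Because the structs above are declared packed, their sizes are just the sum of their field sizes. A small sketch with Python's struct module can confirm this; the format strings are a hand transcription of the C definitions above and are an assumption of this example, not output of the tool.

```python
import struct

# "<" means little-endian with no alignment padding, which mirrors
# __attribute__((packed)). The format strings are transcribed by hand
# from the C structs above (assumption of this sketch).
LPM_IP_KEY_FMT = "<II16s"    # prefixLen, ingress_ifindex, ip_data[16]
RULE_TYPE_FMT = "<IBHHBBB"   # ruleId, protocol, dstPortStart, dstPortEnd,
                             # icmpType, icmpCode, action

key_size = struct.calcsize(LPM_IP_KEY_FMT)
rule_size = struct.calcsize(RULE_TYPE_FMT)
print(key_size, rule_size)  # 24 12
```

These sizes are the number of bytes that have to be supplied for one key and one rule when the map is programmed.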
The final map looks like,
/*
* ingress_node_firewall_table_map: is LPM trie map type
* key is the ingress interface index and the sourceCIDR.
* lookup returns an array of rules with actions for the XDP program
* to process.
* Note: this map is pinned to specific path in bpffs.
*/
struct {
__uint(type, BPF_MAP_TYPE_LPM_TRIE);
__type(key, struct lpm_ip_key_st);
__type(value, struct rulesVal_st);
__uint(max_entries, MAX_TARGETS);
__uint(map_flags, BPF_F_NO_PREALLOC);
__uint(pinning, LIBBPF_PIN_BY_NAME);
} ingress_node_firewall_table_map SEC(".maps");
There are two other maps,
/*
* ingress_node_firewall_events_map: is perf event array map type
* key is the rule id, packet header is captured and used to generate events.
*/
struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
__type(key, __u32);
__type(value, __u32);
__uint(max_entries, MAX_CPUS);
} ingress_node_firewall_events_map SEC(".maps");
/*
* ingress_node_firewall_statistics_map: is per cpu array map type
* key is the rule id
* user space collects statistics per CPU and aggregate them.
*/
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, __u32); // ruleId
__type(value, struct ruleStatistics_st);
__uint(max_entries, MAX_TARGETS);
} ingress_node_firewall_statistics_map SEC(".maps");
These appear to be output-only maps (an event array and per-CPU statistics), so we can ignore them for this example.
Currently, ebpf user sdk uses bpftool underneath to program maps, which has two limitations with respect to the example above:
- bpftool does not seem to be fully compatible with BPF_MAP_TYPE_LPM_TRIE.
- For an array we need to provide all elements while programming via bpftool, which is not practical for a huge array.
So we are going to change the map definition to the one below, a hash map whose value is a single rule (just for this version of ebpf user sdk; future versions could work with all map types).
/*
* ingress_node_firewall_table_map: changed to hash map type for this demo
* key is the ingress interface index and the sourceCIDR.
* lookup returns a single rule with an action for the XDP program
* to process.
* Note: this map is pinned to specific path in bpffs.
*/
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, struct lpm_ip_key_st);
__type(value, struct ruleType_st);
__uint(max_entries, MAX_TARGETS);
__uint(map_flags, BPF_F_NO_PREALLOC);
__uint(pinning, LIBBPF_PIN_BY_NAME);
} ingress_node_firewall_table_map SEC(".maps");
Our goal here is to show that we can program the firewall table map using ebpf-client-sdk, so we will focus on that map.
After compiling the firewall into an eBPF ELF binary, we run bpfmap-info-extractor.py to extract the map information from the ELF, along with a "flatten" version of each type that we can show to the user to program.
$ python3 bpfmap-info-extractor.py --elf ./ingress_node_firewall_kernel.o --parsed infw_parsed.btf.json
2023-02-13:14:26:18,696 INFO [bpfmap-info-extractor.py:106] Supplied bpf object, extracting raw btf
2023-02-13:14:26:18,696 INFO [bpfmap-info-extractor.py:96] Executing cmd: bpftool btf dump file ./ingress_node_firewall_kernel.o -p > /tmp/bpf-client-sdk-raw-btf-dump.json
2023-02-13:14:26:18,699 INFO [bpfmap-info-extractor.py:163] output file - infw_parsed.btf.json
2023-02-13:14:26:18,700 INFO [bpfmap-info-extractor.py:166] parsed btf is dumped to infw_parsed.btf.json which is to be used as input to cli
The ebpf-client-sdk currently runs bpftool underneath and has written out a parsed BTF file, which we use in the next step to program the maps.
The contents of the generated file are,
$ cat infw_parsed.btf.json
{
"maps": [
{
"size": 32,
"name": "ingress_node_firewall_events_map",
"key": {
"name": "(anon)",
"kind": "PTR",
"type": {
"name": "__u32",
"kind": "TYPEDEF",
"type": {
"name": "unsigned int",
"kind": "INT",
"id": 7,
"size": 4,
"bits_offset": 0,
"nr_bits": 32,
"encoding": "(none)"
}
},
"flatten": {
"type_name": "unsigned int",
"size": 4,
"kind": "INT",
"input": null
}
},
"value": {
"name": "(anon)",
"kind": "PTR",
"type": {
"name": "__u32",
"kind": "TYPEDEF",
"type": {
"name": "unsigned int",
"kind": "INT",
"id": 7,
"size": 4,
"bits_offset": 0,
"nr_bits": 32,
"encoding": "(none)"
}
},
"flatten": {
"type_name": "unsigned int",
"size": 4,
"kind": "INT",
"input": null
}
},
"path": "n/a"
},
{
"size": 32,
"name": "ingress_node_firewall_statistics_map",
"key": {
"name": "(anon)",
"kind": "PTR",
"type": {
"name": "__u32",
"kind": "TYPEDEF",
"type": {
"name": "unsigned int",
"kind": "INT",
"id": 7,
"size": 4,
"bits_offset": 0,
"nr_bits": 32,
"encoding": "(none)"
}
},
"flatten": {
"type_name": "unsigned int",
"size": 4,
"kind": "INT",
"input": null
}
},
"value": {
"name": "(anon)",
"kind": "PTR",
"type": {
"name": "ruleStatistics_st",
"kind": "STRUCT",
"members": [
{
"name": "allow_stats",
"type": {
"name": "allow_stats_st",
"kind": "STRUCT",
"members": [
{
"name": "packets",
"type": {
"name": "__u64",
"kind": "TYPEDEF",
"type": {
"name": "unsigned long long",
"kind": "INT",
"id": 18,
"size": 8,
"bits_offset": 0,
"nr_bits": 64,
"encoding": "(none)"
}
}
},
{
"name": "bytes",
"type": {
"name": "__u64",
"kind": "TYPEDEF",
"type": {
"name": "unsigned long long",
"kind": "INT",
"id": 18,
"size": 8,
"bits_offset": 0,
"nr_bits": 64,
"encoding": "(none)"
}
}
}
]
}
},
{
"name": "deny_stats",
"type": {
"name": "deny_stats_st",
"kind": "STRUCT",
"members": [
{
"name": "packets",
"type": {
"name": "__u64",
"kind": "TYPEDEF",
"type": {
"name": "unsigned long long",
"kind": "INT",
"id": 18,
"size": 8,
"bits_offset": 0,
"nr_bits": 64,
"encoding": "(none)"
}
}
},
{
"name": "bytes",
"type": {
"name": "__u64",
"kind": "TYPEDEF",
"type": {
"name": "unsigned long long",
"kind": "INT",
"id": 18,
"size": 8,
"bits_offset": 0,
"nr_bits": 64,
"encoding": "(none)"
}
}
}
]
}
}
]
},
"flatten": {
"variable_name": "ruleStatistics_st",
"kind": "STRUCT",
"member": [
{
"variable_name": "allow_stats_st",
"kind": "STRUCT",
"member": [
{
"variable_name": "packets",
"type_name": "unsigned long long",
"size": 8,
"kind": "INT",
"input": null
},
{
"variable_name": "bytes",
"type_name": "unsigned long long",
"size": 8,
"kind": "INT",
"input": null
}
]
},
{
"variable_name": "deny_stats_st",
"kind": "STRUCT",
"member": [
{
"variable_name": "packets",
"type_name": "unsigned long long",
"size": 8,
"kind": "INT",
"input": null
},
{
"variable_name": "bytes",
"type_name": "unsigned long long",
"size": 8,
"kind": "INT",
"input": null
}
]
}
]
}
},
"path": "n/a"
},
{
"size": 48,
"name": "ingress_node_firewall_table_map",
"key": {
"name": "(anon)",
"kind": "PTR",
"type": {
"name": "lpm_ip_key_st",
"kind": "STRUCT",
"members": [
{
"name": "prefixLen",
"type": {
"name": "__u32",
"kind": "TYPEDEF",
"type": {
"name": "unsigned int",
"kind": "INT",
"id": 7,
"size": 4,
"bits_offset": 0,
"nr_bits": 32,
"encoding": "(none)"
}
}
},
{
"name": "ingress_ifindex",
"type": {
"name": "__u32",
"kind": "TYPEDEF",
"type": {
"name": "unsigned int",
"kind": "INT",
"id": 7,
"size": 4,
"bits_offset": 0,
"nr_bits": 32,
"encoding": "(none)"
}
}
},
{
"name": "ip_data",
"type": {
"name": "(anon)",
"kind": "ARRAY",
"type": {
"name": "__u8",
"kind": "TYPEDEF",
"type": {
"name": "unsigned char",
"kind": "INT",
"id": 30,
"size": 1,
"bits_offset": 0,
"nr_bits": 8,
"encoding": "(none)"
}
}
}
}
]
},
"flatten": {
"variable_name": "lpm_ip_key_st",
"kind": "STRUCT",
"member": [
{
"variable_name": "prefixLen",
"type_name": "unsigned int",
"size": 4,
"kind": "INT",
"input": null
},
{
"variable_name": "ingress_ifindex",
"type_name": "unsigned int",
"size": 4,
"kind": "INT",
"input": null
},
{
"variable_name": "ip_data",
"kind": "ARRAY",
"input": [
],
"member": {
"type_name": "unsigned char",
"size": 1,
"kind": "INT",
"input": null
},
"totalSize": null,
"fillWithZeros": true
}
]
}
},
"value": {
"name": "(anon)",
"kind": "PTR",
"type": {
"name": "ruleType_st",
"kind": "STRUCT",
"members": [
{
"name": "ruleId",
"type": {
"name": "__u32",
"kind": "TYPEDEF",
"type": {
"name": "unsigned int",
"kind": "INT",
"id": 7,
"size": 4,
"bits_offset": 0,
"nr_bits": 32,
"encoding": "(none)"
}
}
},
{
"name": "protocol",
"type": {
"name": "__u8",
"kind": "TYPEDEF",
"type": {
"name": "unsigned char",
"kind": "INT",
"id": 30,
"size": 1,
"bits_offset": 0,
"nr_bits": 8,
"encoding": "(none)"
}
}
},
{
"name": "dstPortStart",
"type": {
"name": "__u16",
"kind": "TYPEDEF",
"type": {
"name": "unsigned short",
"kind": "INT",
"id": 35,
"size": 2,
"bits_offset": 0,
"nr_bits": 16,
"encoding": "(none)"
}
}
},
{
"name": "dstPortEnd",
"type": {
"name": "__u16",
"kind": "TYPEDEF",
"type": {
"name": "unsigned short",
"kind": "INT",
"id": 35,
"size": 2,
"bits_offset": 0,
"nr_bits": 16,
"encoding": "(none)"
}
}
},
{
"name": "icmpType",
"type": {
"name": "__u8",
"kind": "TYPEDEF",
"type": {
"name": "unsigned char",
"kind": "INT",
"id": 30,
"size": 1,
"bits_offset": 0,
"nr_bits": 8,
"encoding": "(none)"
}
}
},
{
"name": "icmpCode",
"type": {
"name": "__u8",
"kind": "TYPEDEF",
"type": {
"name": "unsigned char",
"kind": "INT",
"id": 30,
"size": 1,
"bits_offset": 0,
"nr_bits": 8,
"encoding": "(none)"
}
}
},
{
"name": "action",
"type": {
"name": "__u8",
"kind": "TYPEDEF",
"type": {
"name": "unsigned char",
"kind": "INT",
"id": 30,
"size": 1,
"bits_offset": 0,
"nr_bits": 8,
"encoding": "(none)"
}
}
}
]
},
"flatten": {
"variable_name": "ruleType_st",
"kind": "STRUCT",
"member": [
{
"variable_name": "ruleId",
"type_name": "unsigned int",
"size": 4,
"kind": "INT",
"input": null
},
{
"variable_name": "protocol",
"type_name": "unsigned char",
"size": 1,
"kind": "INT",
"input": null
},
{
"variable_name": "dstPortStart",
"type_name": "unsigned short",
"size": 2,
"kind": "INT",
"input": null
},
{
"variable_name": "dstPortEnd",
"type_name": "unsigned short",
"size": 2,
"kind": "INT",
"input": null
},
{
"variable_name": "icmpType",
"type_name": "unsigned char",
"size": 1,
"kind": "INT",
"input": null
},
{
"variable_name": "icmpCode",
"type_name": "unsigned char",
"size": 1,
"kind": "INT",
"input": null
},
{
"variable_name": "action",
"type_name": "unsigned char",
"size": 1,
"kind": "INT",
"input": null
}
]
}
},
"path": "/sys/fs/bpf/ingress_node_firewall_table_map"
}
]
}
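The parsed file is plain JSON, so it is easy to inspect programmatically. The sketch below lists each map with its pinned path and the field names found in its "flatten" view. The helper name summarize_maps and the trimmed sample dictionary are illustrative assumptions; only the JSON keys ("maps", "name", "path", "key", "value", "flatten", "member", "variable_name", "type_name") come from the output above.

```python
import json

def summarize_maps(parsed):
    """Return (name, path, key fields, value fields) for every map entry."""
    def field_names(flatten):
        # A STRUCT lists its fields under "member"; a scalar only has a type name.
        if flatten.get("kind") == "STRUCT":
            return [m["variable_name"] for m in flatten["member"]]
        return [flatten["type_name"]]

    return [
        (m["name"], m["path"],
         field_names(m["key"]["flatten"]),
         field_names(m["value"]["flatten"]))
        for m in parsed["maps"]
    ]

# Trimmed stand-in for infw_parsed.btf.json (hypothetical sample for this sketch).
sample = json.loads("""
{"maps": [{"name": "ingress_node_firewall_events_map", "path": "n/a",
  "key":   {"flatten": {"type_name": "unsigned int", "size": 4, "kind": "INT", "input": null}},
  "value": {"flatten": {"type_name": "unsigned int", "size": 4, "kind": "INT", "input": null}}}]}
""")

summary = summarize_maps(sample)
print(summary)
```

To run it on the real file, load it with json.load(open("infw_parsed.btf.json")) instead of the inline sample.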
Currently ebpf-client-sdk needs to be told explicitly where the maps are pinned, which you can do by running,
$ python3 ebpf-sdk-cli.py --parsed_btf ./infw_parsed.btf.json --op enrich
2023-02-13:14:56:25,155 INFO [ebpf-sdk-cli.py:149] Welcome to ebpf-client-sdk-cli
2023-02-13:14:56:25,156 INFO [ebpf-sdk-cli.py:72] First we need to know where maps are
2023-02-13:14:56:25,156 INFO [ebpf-sdk-cli.py:74] Please tell the pinned location of the map - ingress_node_firewall_events_map
Please tell the pinned location of the map - ingress_node_firewall_events_map
/sys/fs/bpf/ingress_node_firewall_table_map
2023-02-13:14:58:15,791 INFO [ebpf-sdk-cli.py:74] Please tell the pinned location of the map - ingress_node_firewall_statistics_map
Please tell the pinned location of the map - ingress_node_firewall_statistics_map
n/a
2023-02-13:15:02:59,917 INFO [ebpf-sdk-cli.py:74] Please tell the pinned location of the map - ingress_node_firewall_table_map
Please tell the pinned location of the map - ingress_node_firewall_table_map
n/a
2023-02-13:15:03:01,893 INFO [ebpf-sdk-cli.py:76] Thanks
The events and statistics maps are not our focus right now, so we just enter n/a for them, while for the firewall table map we enter the correct path, /sys/fs/bpf/ingress_node_firewall_table_map.
Next we need to set up an environment for loading and testing the program. Because this program attaches via XDP, I'm just going to create a veth pair with one veth inside a namespace and attach the program to the veth in the root namespace, so we can send traffic from inside the namespace to outside, emulating a pod.
$ ./scripts/veth-pair-setup.sh
Now we load the infw firewall and attach it to the root veth port.
$ bpftool prog loadall ingress_node_firewall_kernel.o /sys/fs/bpf/xdp/ingress_node_firewall type xdp -d
$ bpftool net attach xdp id <prog-id> dev veth
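The <prog-id> placeholder above has to be looked up after loading. One possible way, sketched here, is to parse the JSON output of bpftool prog show and pick the XDP programs. The helper name find_prog_ids and the sample entries (ids and program names) are hypothetical; the sketch assumes each entry carries "id" and "type" fields, as bpftool's JSON output normally does.

```python
import json

def find_prog_ids(prog_show_json, prog_type="xdp"):
    """Return the ids of all loaded programs of the given type."""
    progs = json.loads(prog_show_json)
    return [p["id"] for p in progs if p.get("type") == prog_type]

# Hypothetical sample of `bpftool -j prog show` output; real ids and names differ.
sample_output = """
[{"id": 12, "type": "cgroup_skb", "name": "example_cgroup"},
 {"id": 87, "type": "xdp", "name": "example_xdp"}]
"""

xdp_ids = find_prog_ids(sample_output)
print(xdp_ids)  # [87]
```

On a live system the input string could come from subprocess.run(["bpftool", "-j", "prog", "show"], capture_output=True, text=True, check=True).stdout.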
To test traffic running through the infw firewall,
$ ip netns exec vns1 /bin/bash
$ ping 10.10.10.1
After we ping the root veth IP, we should see some logs from the firewall on the trace_pipe. Note: DEBUG needs to be enabled when compiling the firewall for this to show.
$ cat /sys/kernel/debug/tracing/trace_pipe
ksoftirqd/1-22 [001] d.s.. 13931.297569: bpf_trace_printk: Ingress node firewall action UNDEF
<idle>-0 [001] d.s.. 17470.186992: bpf_trace_printk: Ingress node firewall action UNDEF
ksoftirqd/1-22 [001] d.s.. 21140.185892: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24362.229686: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24363.251356: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24364.275233: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24365.302933: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24366.322915: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24367.347195: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24368.371545: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24369.395988: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24370.418847: bpf_trace_printk: Ingress node firewall action UNDEF
ping-403157 [001] d.s1. 24371.443260: bpf_trace_printk: Ingress node firewall action UNDEF
Note: On my system the firewall debug function was not compiling, possibly due to a kernel version mismatch, so I plugged a different bpf_trace_printk-based logging function in its place.
Now let's program the map so that we see the action being recorded by the firewall.
First we need to know what to put in the map. Since this test only sends ping traffic, we are going to use the values below, which match the IP ranges in our test setup.
key:
prefixLen: 64,
ingress_ifindex: 6,
ip_data: 10.10.10.2
value:
ruleId: 1,
protocol: 0,
dstPortStart: 0,
dstPortEnd: 0,
icmpType: 8,
icmpCode: 0,
action: 2
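The key values above can be derived rather than typed by hand. The sketch below assumes that prefixLen counts the 32 bits of ingress_ifindex plus the IP prefix bits, which is consistent with the value 64 used here for a single IPv4 host; the helper name lpm_key_fields is hypothetical.

```python
import ipaddress

def lpm_key_fields(ifindex, cidr):
    """Build the three key fields for an interface index and a source CIDR."""
    net = ipaddress.ip_network(cidr, strict=False)
    return {
        # Assumption: the prefix covers the 32-bit ifindex followed by the IP prefix bits.
        "prefixLen": 32 + net.prefixlen,
        "ingress_ifindex": ifindex,
        # Address bytes in network order, as entered into ip_data.
        "ip_data": list(net.network_address.packed),
    }

fields = lpm_key_fields(6, "10.10.10.2/32")
print(fields)
```

The interface index itself (6 in this setup) can be looked up with socket.if_nametoindex("veth") on the machine where the veth exists.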
Now, we use ebpf-sdk-cli.py to program the map,
$ python3 ebpf-sdk-cli.py --parsed_btf infw_parsed.btf.json --op create --map ingress_node_firewall_table_map
Because this command expects user input, it opens an editor and asks the user to provide the values in YAML format. After editing, the file looks like,
key:
variable_name: lpm_ip_key_st
kind: STRUCT
member:
- variable_name: prefixLen
type_name: unsigned int
size: 4
kind: INT
input: 64
- variable_name: ingress_ifindex
type_name: unsigned int
size: 4
kind: INT
input: 6
- variable_name: ip_data
kind: ARRAY
input: [10, 10, 10, 2]
member:
type_name: unsigned char
size: 1
kind: INT
input: null
totalSize: 16
fillWithZeros: true
value:
variable_name: ruleType_st
kind: STRUCT
member:
- variable_name: ruleId
type_name: unsigned int
size: 4
kind: INT
input: 1
- variable_name: protocol
type_name: unsigned char
size: 1
kind: INT
input: 0
- variable_name: dstPortStart
type_name: unsigned short
size: 2
kind: INT
input: 0
- variable_name: dstPortEnd
type_name: unsigned short
size: 2
kind: INT
input: 0
- variable_name: icmpType
type_name: unsigned char
size: 1
kind: INT
input: 8
- variable_name: icmpCode
type_name: unsigned char
size: 1
kind: INT
input: 0
- variable_name: action
type_name: unsigned char
size: 1
kind: INT
input: 2
byteOrder: reversed
Note: in the generated file, the fields where user input is expected are shown as null.
One special field to note is totalSize of the array ip_data: the array is 16 bytes, but we provide only 4, so the remaining bytes are set to zero via fillWithZeros.
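To make the role of size, totalSize and fillWithZeros concrete, here is a minimal sketch of how such a flattened description could be serialized to bytes. It is not the tool's btf.py; it assumes little-endian integers, packed structs, and totalSize expressed in bytes. The helpers encode and int_field are hypothetical names.

```python
def encode(node):
    """Serialize one flattened node (INT, ARRAY or STRUCT) to bytes."""
    kind = node["kind"]
    if kind == "INT":
        # Assumption: integers are written little-endian.
        return int(node["input"]).to_bytes(node["size"], "little")
    if kind == "ARRAY":
        elem_size = node["member"]["size"]
        data = b"".join(int(v).to_bytes(elem_size, "little") for v in node["input"])
        missing = node["totalSize"] - len(data)  # assumption: totalSize is in bytes
        if missing < 0 or (missing > 0 and not node.get("fillWithZeros")):
            raise ValueError("array input does not match totalSize")
        return data + bytes(missing)  # pad the tail with zero bytes
    if kind == "STRUCT":
        # Packed struct: members are concatenated with no padding.
        return b"".join(encode(m) for m in node["member"])
    raise ValueError(f"unsupported kind: {kind}")

def int_field(name, size, value):
    return {"variable_name": name, "size": size, "kind": "INT", "input": value}

key = {"variable_name": "lpm_ip_key_st", "kind": "STRUCT", "member": [
    int_field("prefixLen", 4, 64),
    int_field("ingress_ifindex", 4, 6),
    {"variable_name": "ip_data", "kind": "ARRAY", "input": [10, 10, 10, 2],
     "member": {"size": 1, "kind": "INT", "input": None},
     "totalSize": 16, "fillWithZeros": True},
]}
value = {"variable_name": "ruleType_st", "kind": "STRUCT", "member": [
    int_field("ruleId", 4, 1), int_field("protocol", 1, 0),
    int_field("dstPortStart", 2, 0), int_field("dstPortEnd", 2, 0),
    int_field("icmpType", 1, 8), int_field("icmpCode", 1, 0),
    int_field("action", 1, 2),
]}

key_hex = encode(key).hex()
value_hex = encode(value).hex()
print(key_hex)
print(value_hex)
```

With the inputs from the YAML above, the key comes out as 24 bytes and the value as 12 bytes.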
Finally, the output we get is,
$ python3 ebpf-sdk-cli.py --parsed_btf infw_parsed.btf.json --op create --map ingress_node_firewall_table_map
2023-02-16:16:38:40,875 INFO [ebpf-sdk-cli.py:170] Welcome to ebpf-client-sdk-cli
2023-02-16:16:38:40,876 INFO [ebpf-sdk-cli.py:180] create map entry for map
2023-02-16:16:38:40,876 INFO [ebpf-sdk-cli.py:126] map name passed is ingress_node_firewall_table_map
2023-02-16:16:38:40,876 INFO [ebpf-sdk-cli.py:141] Selected map is ingress_node_firewall_table_map
2023-02-16:16:38:40,880 DEBUG [ebpf-sdk-cli.py:24] vim /tmp/ebpf-sdk-cli-input.yaml
2023-02-16:16:38:57,147 INFO [ebpf-sdk-cli.py:154] Loaded user input
2023-02-16:16:38:57,148 INFO [btf.py:315] Converted STRUCT lpm_ip_key_st to 40000000060000000a0a0a02000000000000000000000000
2023-02-16:16:38:57,148 INFO [btf.py:315] Converted STRUCT ruleType_st to 010000000000000000080002
2023-02-16:16:38:57,148 INFO [ebpf-sdk-cli.py:160] Going to perform op create
2023-02-16:16:38:57,148 DEBUG [ebpf-sdk-cli.py:24] bpftool map update pinned /sys/fs/bpf/ingress_node_firewall_table_map key 0x40 0x00 0x00 0x00 0x06 0x00 0x00 0x00 0x0a 0x0a 0x0a 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 value 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x08 0x00 0x02
2023-02-16:16:38:57,154 INFO [ebpf-sdk-cli.py:167] op returned 0x
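The last DEBUG line shows the shape of the bpftool call: each byte of the converted key and value is passed as a separate 0x-prefixed argument. A small sketch of building that argument list from the hex strings in the log follows; the function name bpftool_update_cmd is hypothetical and this is not necessarily how the tool builds it.

```python
def bpftool_update_cmd(pinned_path, key_hex, value_hex):
    """Build the argv for `bpftool map update pinned ... key ... value ...`."""
    def hex_bytes(hex_string):
        # One "0xNN" token per byte, in the order the bytes appear.
        return [f"0x{b:02x}" for b in bytes.fromhex(hex_string)]

    return ["bpftool", "map", "update", "pinned", pinned_path,
            "key", *hex_bytes(key_hex),
            "value", *hex_bytes(value_hex)]

cmd = bpftool_update_cmd(
    "/sys/fs/bpf/ingress_node_firewall_table_map",
    "40000000060000000a0a0a02000000000000000000000000",
    "010000000000000000080002",
)
print(" ".join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) avoids shell quoting issues compared with running the joined string.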
Now the logs of ping traffic will look like below,
$ cat /sys/kernel/debug/tracing/trace_pipe
ping-1701245 [001] d.s1. 64656.190279: bpf_trace_printk: Ingress node firewall process IPv4 packet
ping-1701245 [001] d.s1. 64656.190293: bpf_trace_printk: saddr 34212362, ifId 6, prefixLen 64
ping-1701245 [001] d.s1. 64656.190294: bpf_trace_printk: Ingress node firewall action ALLOW -> XDP_PASS
ping-1701245 [001] d.s1. 64657.193547: bpf_trace_printk: Ingress node firewall process IPv4 packet
ping-1701245 [001] d.s1. 64657.193561: bpf_trace_printk: saddr 34212362, ifId 6, prefixLen 64
ping-1701245 [001] d.s1. 64657.193562: bpf_trace_printk: Ingress node firewall action ALLOW -> XDP_PASS
ping-1701245 [001] d.s1. 64658.218114: bpf_trace_printk: Ingress node firewall process IPv4 packet
ping-1701245 [001] d.s1. 64658.218139: bpf_trace_printk: saddr 34212362, ifId 6, prefixLen 64
ping-1701245 [001] d.s1. 64658.218141: bpf_trace_printk: Ingress node firewall action ALLOW -> XDP_PASS
ping-1701245 [001] d.s1. 64659.242114: bpf_trace_printk: Ingress node firewall process IPv4 packet
ping-1701245 [001] d.s1. 64659.242151: bpf_trace_printk: saddr 34212362, ifId 6, prefixLen 64
ping-1701245 [001] d.s1. 64659.242153: bpf_trace_printk: Ingress node firewall action ALLOW -> XDP_PASS
ping-1701245 [001] d.s1. 64660.266213: bpf_trace_printk: Ingress node firewall process IPv4 packet
ping-1701245 [001] d.s1. 64660.266254: bpf_trace_printk: saddr 34212362, ifId 6, prefixLen 64
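The saddr value in these logs is the source address printed as a plain integer. Assuming it is the network-order address read as a little-endian 32-bit value (which is what an unconverted print on a little-endian host would give), it can be turned back into dotted form as sketched below; the helper name saddr_to_ip is hypothetical.

```python
import socket
import struct

def saddr_to_ip(saddr):
    """Convert the integer printed in the trace log to dotted-quad form."""
    # Assumption: the integer is the 4 network-order bytes read as little-endian.
    return socket.inet_ntoa(struct.pack("<I", saddr))

ip = saddr_to_ip(34212362)
print(ip)  # 10.10.10.2
```

This matches the 10.10.10.2 address programmed into the key, which is why the lookup now hits and the firewall logs ALLOW.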