Welcome to the Energy Charts
The website for interactive graphs displaying electricity production and spot market prices in Germany

Electricity production in Germany
Installed power in Germany
Electricity generation in Germany
Import and export in Germany/Europe
Electricity production and spot market prices in Germany

The site allows you to interactively customize the graphs to your own needs: you can select one or more energy sources, or switch between graphs with absolute or percentage values. The numerical values displayed in the graphs can be viewed in a pop-up window. Furthermore, you can choose the time period to be viewed.

A legend on top of each graph shows the available parameters, which are activated by clicking. Open circles in the legend indicate that a parameter (e.g. export) is currently not shown but can be added by clicking on it.

Other useful hints for operation can be found next to each chart under »usage tips«.

By making this data available, we intend to promote transparent and objective discussion of all factors relevant to the energy transition in Germany.

The data is collected from various neutral sources by scientists at the Fraunhofer Institute for Solar Energy Systems ISE and covers the period from 2011 to the present.

Follow us on Twitter: @energy_charts

News 2018
Friday, 4. January 2019

Net Public Electricity Generation in Germany in 2018
update: second version
This report presents the data on German net electricity generation for public electricity supply. The figures thus represent the electricity mix that actually comes out of the socket at home and that is consumed in the household or is used to charge electric vehicles publicly. On the German electricity exchange EEX, only net electricity generation is traded and only net figures are measured for cross-border electricity flows.

The Difference between Gross and Net Production
Renewable Energy Sources: Solar and Wind
Renewable Energy Sources: Hydropower and Biomass
Non-renewable Electricity Generation
Export Surplus
Load, Exchange Electricity Prices and Market Value
Version History
For additional slides of our assessment of 2018 visit the Renewable Energy Data page of the Fraunhofer Institute for Solar Energy Systems ISE.

1 TWh = 1 terawatt-hour = 1,000 gigawatt-hours (GWh) = 1 million megawatt-hours (MWh) = 1 billion kilowatt-hours (kWh)
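The unit chain in this footnote can be sanity-checked in a few lines of Python (a quick sketch, not part of the original report):

```python
# 1 TWh expressed in watt-hours, checked against the smaller units
twh_in_wh = 10**12
assert twh_in_wh == 1_000 * 10**9        # 1,000 GWh
assert twh_in_wh == 1_000_000 * 10**6    # 1 million MWh
assert twh_in_wh == 10**9 * 10**3        # 1 billion kWh
print("1 TWh =", twh_in_wh, "Wh")
```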

Thursday, 03. January 2019

New charts
In addition to the warming stripes, temperatures from 1881 to 2018 are now available as line charts for Germany and all federal states. The fit curves show the long-term trends, and mouse-over tooltips show the values.

The annual sunshine duration and precipitation in Germany are also available as annual averages in our climate section.
Saturday, 22. December 2018

New chart
The Energy Charts now include the warming stripes for Germany and all federal states since 1881. The graphics are interactive and also show the annual temperatures when moving the mouse over the stripes.
Many thanks to Ed Hawkins for this good idea!

Tuesday, 9. October 2018

New chart
The Energy Charts now also offer graphs for Day Ahead and Intraday electricity spot prices in connection with the trading volume in MWh. Weekly, monthly and yearly charts are available.

Friday, 3. August 2018

New chart
The Energy Charts now show hourly values of the following DWD climate data for Germany in the climate section:

wind speed and wind direction from 300 stations
global and diffuse radiation from 26 stations
air temperature and humidity from 500 stations

Thursday, 24. May 2018

New features and changes regarding GDPR
The welcome page and the map of power plants have been reworked.
You will now find all news and the Twitter feed on the start page, giving you a quick overview of news, interesting topics and events that have been tweeted.

In addition, we have updated our privacy policy in accordance with the requirements of the GDPR.

Saturday, 19. May 2018

New chart
The Energy Charts now include graphs showing the filling levels of pumped storage and seasonal storage power stations in Germany, Austria and Switzerland at Storage filling levels.

Wednesday, 9. May 2018

New charts
In the scatter charts on spot market prices we have added charts showing:

the Day Ahead electricity price versus the sum of wind and solar power.
If wind plus solar power increases by one GW, the price drops by 0.75 Euro/MWh.
the Day Ahead electricity price versus the residual load (load minus solar minus wind).
If the residual load increases by one GW, the price increases by 1.10 Euro/MWh.

Tuesday, 8. May 2018

Power generation in Germany – assessment of 2017
update: fourth version
This report presents data on German net electricity generation for public power supply. The numbers thus represent the electricity mix that is actually consumed in households or used to charge electric vehicles. Only net electricity generation is traded on the German electricity exchange EEX, and only net figures are measured for cross-border electricity flows.

Difference between gross and net production
Renewable energy: solar and wind
Renewable energy: hydropower and biomass
Non-renewable generation
Export surplus
Version History
For additional slides of our assessment of 2017 visit the Renewable Energy Data page of the Fraunhofer Institute for Solar Energy Systems ISE.

1 TWh = 1 terawatt-hour = 1,000 gigawatt-hours (GWh) = 1 million megawatt-hours (MWh) = 1 billion kilowatt-hours (kWh)

Friday, 4. May 2018

New charts
The Energy Charts now offer scatter charts on spot market prices showing:

the electricity price versus wind power.
Without wind power, the average price is 47.74 Euro/MWh.
If wind power increases by one GW, the price drops by 0.91 Euro/MWh.
the electricity price versus solar power.
Without solar power, the average price is 41.71 Euro/MWh.
If solar power increases by one GW, the price drops by 0.58 Euro/MWh.
the electricity price versus the load.
If the load increases by one GW, the price increases by 0.82 Euro/MWh.
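These fits are plain linear relationships and are easy to apply. The sketch below (function names are mine; coefficients are taken from the text) shows how the reported slopes translate into prices; note that the fits describe average behaviour within the observed range only.

```python
def price_with_wind(wind_gw: float) -> float:
    """Average Day Ahead price (Euro/MWh) under the reported wind fit:
    47.74 Euro/MWh without wind, minus 0.91 Euro/MWh per GW of wind."""
    return 47.74 - 0.91 * wind_gw

def price_with_solar(solar_gw: float) -> float:
    """Average price under the reported solar fit:
    41.71 Euro/MWh without solar, minus 0.58 Euro/MWh per GW of solar."""
    return 41.71 - 0.58 * solar_gw

# 10 GW of wind lowers the average price by about 9.10 Euro/MWh
print(round(price_with_wind(10.0), 2))   # → 38.64
```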

Friday, 13. April 2018

New chart features
The Energy Charts now also show the planned power production of conventional power plants and the forecast for wind and solar under “all sources” at electricity production in Germany.
This feature was requested in the user survey.

Wednesday, 21. March 2018

Thank you very much! We are overwhelmed by your support!
Over the past four weeks, more than 1,600 users of the Energy Charts responded to our questionnaire. Many thanks for your feedback and the many comments and ideas. In the coming weeks we will evaluate the responses and work on new functionality and improvements, in order to make the Energy Charts more understandable, more informative and better suited to your needs. Stay in touch!

Wednesday, 21. February 2018

For a Better Understanding of the Energy Transition: Fraunhofer ISE Interviews Users of the Energy Charts
The Energy Charts, compiled by the Fraunhofer Institute for Solar Energy Systems ISE, are the most detailed database for energy and market data on power generation in Germany and thus an important source for journalists and decision-makers. The interpretation of the data and graphics, however, is to some extent complex. A current project of Fraunhofer ISE is to improve the display of data and graphics in order to make them easier to use for journalists. Other users of the site will also profit from these improvements.

Monday, 5. February 2018

New chart features
The Energy Charts now also show hourly values over a whole year at electricity production in Germany. Just select the empty field in “month” and “week”.
Attention: depending on the sources you select, the browser must render more than 100,000 values. This can take up to a minute.

News archive

DNS Servers You Should Have Memorized

The latest DNS server IPs are easier to remember and offer privacy and filtering functionality

If you’re a programmer, a systems administrator, or really any type of IT worker, you probably have your favorite go-to IP addresses for troubleshooting. And if you’re like me, you’ve probably been using the same ones for years.

Such IPs can be used for:

Testing ping connectivity
Checking DNS resolution using dig or nslookup
Updating a system’s permanent DNS settings
Most DNS servers allow you to ping them.

I like using DNS servers for this because you can use them for both connectivity and name-resolution testing, and for the longest time I used the Google DNS servers, 8.8.8.8 and 8.8.4.4. But they don't have any filtering enabled, and in recent years I've become less thrilled about sending Google all my DNS queries.

Alternatives to Google DNS
At some point I switched to using Cisco's Umbrella servers (Cisco bought OpenDNS, which is where Umbrella came from), because they do URL filtering for you: they maintain a list of dangerous URLs and block them automatically, which can help protect you from malware.

The OpenDNS servers are great, but I always have to look them up. Then, a few years ago, a new set of DNS servers came out that focused not only on speed and functionality, but also on memorability.

One of the first easy-to-remember options with filtering was IBM's Quad9, which, as you might expect, has an IP address of four nines: 9.9.9.9.

I tried Quad9 for a bit when it first came out, but found it a bit slow. I figured they were being overwhelmed at launch, or their filtering wasn't tuned yet. I imagine they have probably fixed that by now, but more on performance below.

Enter CloudFlare

So with Google, Cisco, and IBM providing interesting options with various functionality, we then saw CloudFlare enter the arena.

But rather than provide filtering, they instead focused on privacy.

Some other recursive DNS services may claim that their services are secure because they support DNSSEC. While this is a good security practice, users of these services are ironically not protected from the DNS companies themselves. Many of these companies collect data from their DNS customers to use for commercial purposes. Alternatively, 1.1.1.1 does not mine any user data. Logs are kept for 24 hours for debugging purposes, then they are purged.

CloudFlare Website

And perhaps coolest of all for me was their memorability rating, which is basically flawless: 1.1.1.1 abbreviates to 1.1, so you can literally test by typing ping 1.1.

How cool is that?

So they're not filtering your URLs, but they are consciously avoiding logging or tracking you in any way, which is excellent.

Norton ConnectSafe DNS
Norton also has a public DNS service, whose interesting feature is multiple levels of URL content filtering:

Block malicious and fraudulent sites

Block sexual content

Block mature content of many types

My recommendation
Performance also matters here, and that will vary based on where you are, but in recent testing I found all of these options to be fairly responsive.

To me it comes down to this:

If you care about privacy and speed and maximum memorability, I recommend CloudFlare: 1.1.1.1 and 1.0.0.1.

If you want URL filtering, I recommend Quad9 over Umbrella, simply because it's easier to remember and it seems to focus on having multiple threat intelligence sources.

I find the filtering claims by both companies to be too opaque for my tastes, with both of them feeling like borderline marketing, to be honest.

And if you want multiple levels of URL filtering, you can go with the Norton offering, but I think I personally prefer to just use Quad9 for that and be done with it. Still, Norton is a cool option for, say, protecting an entire school by forcing its DNS through the strictest option.
Final answer—if pressed—here are the two I recommend you remember.
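As a reference, the addresses discussed in this post can be kept in a small table. The sketch below validates them with Python's ipaddress module; the IPs are the providers' well-known public resolvers, listed here from memory as a convenience, so double-check them before deploying.

```python
import ipaddress

# Well-known public resolver addresses discussed above
# (listed from memory as a convenience; verify before deploying)
resolvers = {
    "Cloudflare (speed/privacy)": ["1.1.1.1", "1.0.0.1"],
    "Quad9 (filtering)": ["9.9.9.9"],
    "Google (no filtering)": ["8.8.8.8", "8.8.4.4"],
    "Cisco Umbrella / OpenDNS": ["208.67.222.222", "208.67.220.220"],
}

for provider, addrs in resolvers.items():
    for addr in addrs:
        ipaddress.ip_address(addr)  # raises ValueError on a malformed entry
    print(f"{provider}: {', '.join(addrs)}")
```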

For speed and privacy: 1.1.1.1
For filtering: 9.9.9.9
Daniel Miessler
About The Author
Daniel Miessler is a cybersecurity expert and author of The Real Internet of Things, based in San Francisco, California. Specializing in IoT and Application Security, he has 20 years of experience helping companies from early-stage startups to the Global 100. Daniel currently works at a leading tech company in the Bay Area, leads the OWASP Internet of Things Security Project, and can be found writing about the intersection of security, technology, and humanity.

© Daniel Miessler 1999-2019

Load XDP programs using the ip (iproute2) command
Lorenzo Fontana
Oct 8, 2018
Nota bene: if you don't know what XDP is yet, keep reading; I will explain it later.

My journey
This past week I was working on a piece of software that needed to define custom network paths to route HTTP traffic to multiple destinations, based on rules defined in a data store.

During the first phase I had the opportunity to go through different ideas, and since the end result needed to be very fast and able to handle high throughput, only one thing came to mind: XDP.
I had already used XDP on a similar project, and my knowledge of eBPF was good enough that my team agreed this was crazy, so we decided to try.

I’m aware of, and I usually use iovisor/bcc or libbpf directly, but this time, I wanted something I can use to validate some ideas first, debug some of that code and only later include my program in a more complete project.

Even though I usually pay a lot of attention to LKML, lwn.net and, in general, relevant projects in this area like Cilium, I didn't know until last week that the ip command can load XDP programs: I was lucky enough to be in man 8 ip-link looking for something else, and I found it! However, using it that way wasn't really straightforward (for me), and I had to spend quite some time putting all the pieces together, so I decided to write this walkthrough.

Wait wait wait.. What is XDP?
From the iovisor.org website:

XDP or eXpress Data Path provides a high performance, programmable network data path in the Linux kernel as part of the IO Visor Project. XDP provides bare metal packet processing at the lowest point in the software stack which makes it ideal for speed without compromising programmability. Furthermore, new functions can be implemented dynamically with the integrated fast path without kernel modification.
Cool, so XDP is a way to hook our eBPF programs very close to the network stack in order to do packet processing, encapsulation, de-encapsulation, metrics collection, etc.

The important thing to know is that you can write a program that gets loaded into the kernel, as if it were a module, but without modifying the kernel itself.

This kind of program is called an eBPF program. It is compiled to run against a special VM residing in the kernel, which verifies the program and then executes it in a way that cannot harm the running system.

Note that eBPF programs are not Turing complete: you can't write loops, for example.

You can look at the diagram below to visualize how the loading of eBPF programs works.

eBPF load diagram
That said, XDP programs are a specialized kind of eBPF program with the additional capability of going lower than kernel space, accessing driver space to act directly on packets.

So if we wanted to visualize the same diagram from an XDP point of view it will look like this.

In most cases, however, your hardware may not support XDP natively. You will still be able to load XDP programs using the xdpgeneric driver, so you still get the improvements of working at a lower level, but not the full gain of a network card that can offload the network processing instead of doing it on your CPU(s).

If this is still very unclear you can read more about XDP here.

What can I do with XDP?
This depends on how much imagination you have, some examples can be:

Monitoring the network packet flow by populating a map shared with userspace; look at this example if you need inspiration
Writing your own custom ingress/egress firewall, as in the examples here
Rewriting packet destinations and re-routing packets
Packet inspection, and security tools based on the packet flow
Let’s try to load a program.
You will need a Linux machine with kernel > 4.8, with clang (LLVM ≥ 3.7), iproute2 and Docker installed.
Docker is not needed to run XDP programs, but it is used here for three reasons:

Docker by default creates bridged network interfaces on the host and on the container when you create a new container. Sick! I can point my XDP program to one of the bridged interfaces on the host.
Since it creates network namespaces, I can access them again using the ip command, since we are talking about that command here, this is a bonus for this post for me.
I can run a web server featuring cats, the real force that powers the internet, and then block traffic using XDP without asking you to compile, install or run anything else.
Step 0: Create a docker container that can accept some HTTP traffic
So, let’s run caturday, we are not exposing the 8080 port for a reason!

# docker run --name httptest -d fntlnz/caturday
Step 1: Discover the ip address and network interface for httptest
Obtain the network namespace file descriptor from docker

# sandkey=$(docker inspect httptest -f "{{.NetworkSettings.SandboxKey}}")
Prepare the network namespace to be inspected with the ip command, so that we can look at its network interfaces without using docker exec. This also allows us to use any program we have in our root mount namespace against the container's network, which is needed because the image we used, fntlnz/caturday, does not contain iproute2.

# mkdir -p /var/run/netns
# ln -s $sandkey /var/run/netns/httpserver
Don’t worry too much about that symlink, it will go away at next reboot. Now, let’s show the interfaces inside the container:

# ip netns exec httpserver ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
71: eth0@if72: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
The IP address is 172.17.0.2.

Our interface eth0 in the container has id 71 and is paired with if72, so the ID on the host machine is just 72. Let's get the interface name on the host machine.

# ip a | grep 72:
Cool, the name is right there:

72: vethcaf7146@if71: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
So I have,

Interface name on the host: vethcaf7146
Step 2: Generate some traffic to go to the interface we just created with the container httptest
For this purpose, we can use hping3 or the browser or even curl. In this case I will be using hping3 because it continuously sends traffic to the interface and I can see how this changes after loading my XDP program.

# hping3 172.17.0.2 --fast
In another terminal open tcpdump to see the packet flow.

# tcpdump -i vethcaf7146
vethcaf7146 is the host-side interface name we found in Step 1.

Step 3: Compile an XDP program using clang
Let’s take this XDP program as an example (there’s a more complete example later!)

int main() {
  return XDP_DROP;
}
Compile it with:

$ clang -O2 -target bpf -c dropper.c -o dropper.o
Once loaded, this program drops every packet sent to the interface. If you replace XDP_DROP with XDP_TX, the packets are bounced back in the direction they came from; if you change it to XDP_PASS, the packets just continue flowing.

XDP_TX can also be used to transmit the packet to another destination: for example, after modifying the packet's destination IP address, MAC address and checksum.

Step 4: Load the dropper.o program
At the previous step, we compiled dropper.o from dropper.c. We can now use the ip command to load the program into the kernel.

# ip link set dev vethcaf7146 xdp obj dropper.o sec .text
The careful reader may have noticed that we used the xdp flag in the previous command. That flag means the kernel will do its best to load the XDP program natively on the network card. However, not all network cards support native XDP with hardware offloads; in that case, the kernel falls back to the generic implementation.

Hardware offloads happen when the network driver is attached to a network card that can process the networking work defined in the XDP program so that the server’s CPU doesn’t have to do that.

If you already know your card supports native XDP you can use xdpdrv; if you know it doesn't, you can use xdpgeneric.

Step 5: Test with traffic and unload
At this point, after loading the program, you will notice that the tcpdump you started in Step 2 stops receiving traffic because of the drop.

You can now stop the XDP program by unloading it

# ip link set dev vethcaf7146 xdp off
Step 6: Drop only UDP packets
For the very simple use case we had (drop everything), we didn't need to access the xdp_md struct to get the context of the current packet flow. Now, since we want to selectively drop specific packets, we do.

To do that, we need to declare a function that takes xdp_md *ctx as its first argument, and we need to load our program from a section other than .text. Let's look at the program below:
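The embedded gist does not survive here, so the following is a minimal sketch reconstructed from the description that follows; the exact original may differ, and the includes and verifier bounds checks are my assumptions.

```c
/* udp.c — sketch of an XDP program that drops all UDP packets */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>

/* Macro mapping a symbol into a named ELF section */
#define SEC(NAME) __attribute__((section(NAME), used))

char _license[] SEC("license") = "GPL";

SEC("dropper_main")
int dropper(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    /* Boilerplate: extract the Ethernet frame, with the bounds
     * checks the eBPF verifier requires before each access. */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != __constant_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *iph = (void *)(eth + 1);
    if ((void *)(iph + 1) > data_end)
        return XDP_PASS;

    /* Drop all UDP packets, let everything else through */
    if (iph->protocol == IPPROTO_UDP)
        return XDP_DROP;

    return XDP_PASS;
}
```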

Wow, a lot changed here!
First, we have a macro that lets us map a symbol to a specific ELF section name. We then use that macro in SEC("dropper_main") to place the dropper function, which accepts the xdp_md struct pointer, in a section named dropper_main. After that there is some boilerplate to extract the Ethernet frame, from which we read packet information such as the protocol, which we check in order to drop all UDP packets.

Let’s compile it!

$ clang -O2 -target bpf -c udp.c -o udp.o
And now we can verify the symbol table with objdump to see if our section is there.

$ objdump -t udp.o
udp.o: file format elf64-little
0000000000000050 l dropper_main 0000000000000000 LBB0_3
0000000000000000 g license 0000000000000000 _license
0000000000000000 g dropper_main 0000000000000000 dropper
Cool, let’s load the function using the section dropper_main always on the same interface we used before vethcaf7146.

# ip link set dev vethcaf7146 xdp obj udp.o sec dropper_main
Nice! So let’s now try to do a DNS query from the container to verify if UDP packets are being dropped:

# ip netns exec httpserver dig github.com
This should return nothing right now, and dig will hang for a while before exiting, because our XDP program is dropping the packets.

Now we can unload that program, and our DNS queries will work again:

sh-4.4# dig github.com
; <<>> DiG 9.13.3 <<>> github.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49708
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
; EDNS: version: 0, flags:; udp: 512
;github.com. IN A
github.com. 59 IN A
github.com. 59 IN A
;; Query time: 42 msec
;; WHEN: Sun Oct 07 23:10:53 CEST 2018
;; MSG SIZE rcvd: 71
When I first approached eBPF and XDP a year ago, I was very scared, and my brain was saying things like:

I’m not going to learn this you fool!! ~ My brain, circa June 2017
However, after some initial pain and some months of thinking about it, I really enjoyed working on this stuff.

But! As always, I'm still learning, and this topic is very exciting and still very, very obscure to me, so I will surely try to do more with this low-level unicorn.

I hope you enjoy working with XDP as much as I did, and I hope this post helps you move forward on your journey, as writing it did with mine.

I love this! ~ My brain, circa October 2018
Here are some references for you.



Click to access bertin_Netdev-XDP.pdf

Thanks for reading! You can find me on Twitter and on GitHub.

Ocean Beach some weeks ago. That week we had a hackathon at InfluxData HQ, where I had some interesting conversations with my colleagues about this topic. Thanks Leonardo Di Donato, Chris Goller, Greg Linton and Jeff Welding.
How to drop 10 million packets per second
Marek Majkowski 2018-07-06
Internally our DDoS mitigation team is sometimes called “the packet droppers”. When other teams build exciting products to do smart things with the traffic that passes through our network, we take joy in discovering novel ways of discarding it.


CC BY-SA 2.0 image by Brian Evans

Being able to quickly discard packets is very important to withstand DDoS attacks.

Dropping packets hitting our servers, as simple as it sounds, can be done on multiple layers. Each technique has its advantages and limitations. In this blog post we’ll review all the techniques we tried thus far.

Test bench
To illustrate the relative performance of the methods we’ll show some numbers. The benchmarks are synthetic, so take the numbers with a grain of salt. We’ll use one of our Intel servers, with a 10Gbps network card. The hardware details aren’t too important, since the tests are prepared to show the operating system, not hardware, limitations.

Our testing setup is prepared as follows:

We transmit a large number of tiny UDP packets, reaching 14Mpps (million packets per second).

This traffic is directed towards a single CPU on a target server.

We measure the number of packets handled by the kernel on that one CPU.

We’re not trying to maximize userspace application speed, nor packet throughput – instead, we’re trying to specifically show kernel bottlenecks.

The synthetic traffic is prepared to put maximum stress on conntrack – it uses random source IP and port fields. Tcpdump will show it like this:

$ tcpdump -ni vlan100 -c 10 -t udp and dst port 1234
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
IP > UDP, length 16
On the target side all of the packets are going to be forwarded to exactly one RX queue, therefore one CPU. We do this with hardware flow steering:

ethtool -N ext0 flow-type udp4 dst-ip dst-port 1234 action 2
Benchmarking is always hard. When preparing the tests we learned that having any active raw sockets destroys performance. It’s obvious in hindsight, but easy to miss. Before running any tests remember to make sure you don’t have any stale tcpdump process running. This is how to check it, showing a bad process active:

$ ss -A raw,packet_raw -l -p | cat
Netid State Recv-Q Send-Q Local Address:Port
p_raw UNCONN 525157 0 *:vlan100 users:(("tcpdump",pid=23683,fd=3))
Finally, we are going to disable the Intel Turbo Boost feature on the machine:

echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
While Turbo Boost is nice and increases throughput by at least 20%, it also drastically worsens the standard deviation in our tests. With turbo enabled we had ±1.5% deviation in our numbers. With Turbo off this falls down to manageable 0.25%.


Step 1. Dropping packets in application
Let’s start with the idea of delivering packets to an application and ignoring them in userspace code. For the test setup, let’s make sure our iptables don’t affect the performance:

iptables -I PREROUTING -t mangle -d -p udp --dport 1234 -j ACCEPT
iptables -I PREROUTING -t raw -d -p udp --dport 1234 -j ACCEPT
iptables -I INPUT -t filter -d -p udp --dport 1234 -j ACCEPT
The application code is a simple loop, receiving data and immediately discarding it in the userspace:

s = socket.socket(AF_INET, SOCK_DGRAM)
s.bind(("", 1234))
while True:
    s.recv(4096)  # read the datagram and immediately discard it
We prepared the code, to run it:

$ ./dropping-packets/recvmmsg-loop
packets=171261 bytes=1940176
This setup allows the kernel to receive a meagre 175kpps from the hardware receive queue, as measured by ethtool and using our simple mmwatch tool:

$ mmwatch 'ethtool -S ext0|grep rx_2'
rx2_packets: 174.0k/s
The hardware technically gets 14Mpps off the wire, but it’s impossible to pass it all to a single RX queue handled by only one CPU core doing kernel work. mpstat confirms this:

$ watch 'mpstat -u -I SUM -P ALL 1 1|egrep -v Aver'
01:32:05 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
01:32:06 PM 0 0.00 0.00 0.00 2.94 0.00 3.92 0.00 0.00 0.00 93.14
01:32:06 PM 1 2.17 0.00 27.17 0.00 0.00 0.00 0.00 0.00 0.00 70.65
01:32:06 PM 2 0.00 0.00 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00
01:32:06 PM 3 0.95 0.00 1.90 0.95 0.00 3.81 0.00 0.00 0.00 92.38
As you can see application code is not a bottleneck, using 27% sys + 2% userspace on CPU #1, while network SOFTIRQ on CPU #2 uses 100% resources.

By the way, using recvmmsg(2) is important. In these post-Spectre days, syscalls got more expensive and indeed, we run kernel 4.14 with KPTI and retpolines:

$ tail -n +1 /sys/devices/system/cpu/vulnerabilities/*
==> /sys/devices/system/cpu/vulnerabilities/meltdown <==
==> /sys/devices/system/cpu/vulnerabilities/spectre_v1 <==
==> /sys/devices/system/cpu/vulnerabilities/spectre_v2 <==

The core of our XDP program is a filter that matches the benchmark traffic and drops it:

if (iph->protocol == IPPROTO_UDP
    && (htonl(iph->daddr) & 0xFFFFFF00) == 0xC6120000 // 198.18.0.0/24
    && udph->dest == htons(1234)) {
    return XDP_DROP;
}
The XDP program needs to be compiled with a modern clang that can emit BPF bytecode. After this, we can load the program and verify that it is running:

$ ip link show dev ext0
4: ext0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc fq state UP mode DEFAULT group default qlen 1000
link/ether 24:8a:07:8a:59:8e brd ff:ff:ff:ff:ff:ff
prog/xdp id 5 tag aedc195cc0471f51 jited
And see the numbers in ethtool -S network card statistics:

$ mmwatch 'ethtool -S ext0|egrep "rx"|egrep -v ": 0"|egrep -v "cache|csum"'
rx_out_of_buffer: 4.4m/s
rx_xdp_drop: 10.1m/s
rx2_xdp_drop: 10.1m/s
Whooa! With XDP we can drop 10 million packets per second on a single CPU.

CC BY-SA 2.0 image by Andrew Filer

We repeated these measurements for both IPv4 and IPv6 and prepared this chart:

Generally speaking in our setup IPv6 had slightly lower performance. Remember that IPv6 packets are slightly larger, so some performance difference is unavoidable.

Linux has numerous hooks that can be used to filter packets, each with different performance and ease of use characteristics.

For DDoS purposes, it may be perfectly reasonable to just receive the packets in the application and process them in userspace. Properly tuned applications can get pretty decent numbers.

For DDoS attacks with random/spoofed source IP’s, it might be worthwhile disabling conntrack to gain some speed. Be careful though – there are attacks for which conntrack is very helpful.

In other circumstances it may make sense to integrate the Linux firewall into the DDoS mitigation pipeline. In such cases, remember to put the mitigations in a "-t raw PREROUTING" layer, since it’s significantly faster than the "filter" table.

For even more demanding workloads, we always have XDP. And boy, it is powerful. Here is the same chart as above, but including XDP:

If you want to reproduce these numbers, see the README where we documented everything.

Here at Cloudflare we are using… almost all of these techniques. Some of the userspace tricks are integrated with our applications. The iptables layer is managed by our Gatebot DDoS pipeline. Finally, we are working on replacing our proprietary kernel offload solution with XDP.

Want to help us drop more packets? We’re hiring for many roles, including packet droppers, systems engineers and more!

Special thanks to Jesper Dangaard Brouer for helping with this work.

All content © 2018 Cloudflare

hpi.de: Is someone spying on you?

Is someone spying on you?
Every day, personal data is stolen in criminal cyber attacks. A large part of the stolen information is subsequently made public on Internet databases, where it serves as the starting point for other illegal activities.

With the HPI Identity Leak Checker, it is possible to check whether your email address, along with other personal data (e.g. telephone number, date of birth or address), has been made public on the Internet where it can be misused for malicious purposes.

Please enter your email address here.
The email address you have entered will only be used for searching in our database and, when applicable, to subsequently send an email notification. It will be saved in an obfuscated way to protect you from potential email spam and is never given to a third party.
Our other services and research on IT security
HPI-VDB – Our database for IT attack analysis and self-diagnosis,
tele-TASK – Lectures, not only on IT security,
openHPI – Our interactive online educational program.
… and more about our research in the field of IT security.

Privacy Statement Contact – Disclaimer © Hasso Plattner Institute 2014-2019

oneplus.com: Downloads and Updates

Downloads and Updates
Download the latest Oxygen OS version for your OnePlus device.

OnePlus 6T

OnePlus 6

OnePlus 5T

OnePlus 5

OnePlus 3T

OnePlus 3

OnePlus X

OnePlus 2

OnePlus 1
© 2013 – 2018 OnePlus. All Rights Reserved.

Performance Implications of Packet Filtering with Linux eBPF


Performance Implications of Packet Filtering
with Linux eBPF
Dominik Scholz, Daniel Raumer, Paul Emmerich, Alexander Kurtz, Krzysztof Lesiak and Georg Carle
Chair of Network Architectures and Services, Department of Informatics,
Technical University of Munich
Abstract—Firewall capabilities of operating systems are traditionally provided by inflexible filter routines or hooks in the
kernel. These require privileged access to be configured and are
not easily extensible for custom low-level actions. Since Linux 3.0,
the Berkeley Packet Filter (BPF) allows user-written extensions in
the kernel processing path. The successor, extended BPF (eBPF),
improves flexibility and is realized via a virtual machine featuring
both a just-in-time (JIT) compiler and an interpreter running in
the kernel. It executes custom eBPF programs supplied by the
user, effectively moving kernel functionality into user space.
We present two case studies on the usage of Linux eBPF.
First, we analyze the performance of the eXpress Data Path
(XDP). XDP uses eBPF to process ingress traffic before the
allocation of kernel data structures which comes along with
performance benefits. In the second case study, eBPF is used to
install application-specific packet filtering configurations acting
on the socket level. Our case studies focus on performance aspects
and discuss benefits and drawbacks.
Index Terms—Linux, eBPF, XDP, Performance Measurements
Controlling and monitoring network packets traveling from
and to a program on a specific host and network is an important
element in computer networks. In theory, the application itself
could determine whether to interpret and process a packet
received from the network and what information to send back,
but current implementations lack this ability, for four reasons.
Our perspective focuses on Linux:
R1: Traffic should be filtered as early as possible to reduce
unnecessary overhead.
R2: Application developers know best about the packet filtering configuration concerning their application and therefore
should be able to ship policies together with their application
in an easy way.
R3: Even if the system administrator knew the requirements of each application, she would still have to manage huge numbers of slightly different and ever-changing configuration files, complicating central policy implementation and verification.
R4: Orthogonal to these three policy-related issues, modern applications have to cope with performance and latency requirements.
Over the past years, the state of the art was to have all of
the hosts’ packet filtering rules installed at a central location in
the kernel. Configuring and maintaining a centralized ruleset
using, e.g., iptables or nftables, requires root access, a scheme
not only used on UNIX-based systems. By the time a packet
is filtered in kernel space, rejected traffic already caused
overhead, as each single packet was copied to memory and
underwent basic processing. To circumvent these problems,
two trends for packet filtering can be observed: either the
filtering is moved to a lower level leveraging hardware support
(e.g., offloading features or FPGA-based NICs), or the ruleset
is broken up and parts are moved to user space.
In this paper we show how eBPF can be used to break up the
conventional packet filtering model in Linux, even in environments of 10 Gbit/s or more. Controlling the information flow
only at the network endpoints is appealing in its simplicity and
performance. Therefore, we use XDP (eXpress data path) for
coarse packet filtering before handing packets to the kernel.
This provides a first line of defense against traffic that is in
general unwanted by the host, e.g., spoofed addresses [1] or
Denial-of-Service flooding attacks. We contribute by comparing the achievable performance and latency of basic filtering
operations to those of classic approaches like iptables or nftables.
Additionally, policy rules become simpler if only one application on a host at a time has to be considered and the risk of
leaving an application exposed to the network unintentionally
is reduced if both packet filtering and application form a
single unit. Default packet filtering rules could be shipped
by application developers alongside their product, bringing us
closer to secure-by-default systems while allowing techniques
like port-knocking without root access. For that vision, we
propose our solution for socket attached packet filtering.
Because traffic unwanted by the host in general is already
handled on a lower level, per-application decisions regarding
the traffic of interest can be made at the socket level. We
show that both eBPF applications can be configured and used
from user space even in high-performance environments of
10 Gbit/s networks.
This paper is structured as follows. In Section II, we
present an overview of packet filtering on Linux machines
at different points in the processing chain. A short history
of eBPF in Linux is presented in Section III. In Section IV,
we present measurements of common kernel packet filters as
baseline for our case studies. Section V presents our XDP case
study. Section VI describes our proposal for socket attached
eBPF packet filtering and details the performance implications
thereof. Related work is discussed as part of each case study,
before we conclude with Section VII.
Figure 1: Different levels of packet filtering in Linux (XDP (Sec. V), socket eBPF (Sec. VI), poll routines, network stack, transport protocols, application firewalls, applications)
Packets can be filtered at different stages on their way from
the physical network interface until they reach the application.
Figure 1 presents a rough overview of this; every non-dashed
box presents a point where filters can be applied.
A. Hardware Level
The first, and from a performance perspective most attractive point for filtering as part of a firewall, is on the network
interface card (NIC) itself. Modern server NICs offer hardware offloading or filter capabilities, which can be configured
via driver parameters and tools like ethtool. Furthermore,
dedicated platforms based on FPGAs or SmartNICs for packet
filter offloading like HyPaFilter [2] have emerged. However,
the functionality is limited and depends on the vendor and the
specific NIC, resulting in a lack of general availability and no
common interface for configuration.
B. Network Level
We refer to network level firewalls as any kind of packet
filtering before the routing subsystem has started to process a
packet. From a performance perspective this is more attractive
than a system-level firewall, as fewer CPU cycles are consumed
for dropped packets.
In Section V, we analyze the benefits of early filtering via
XDP, e.g., against DoS attacks.
C. System Level
Packet filters on system level like iptables or the newer
nftables are widely used in Linux systems, achieving performance levels acceptable for today’s applications. They
hook into the processing at different locations, e.g., at the
routing subsystem of the Linux kernel and therefore before
the application. However, these firewalls require root access
and system-specific knowledge; different rules may interfere
and it is not possible to ship meaningful application-specific
packet filters with the application.
In Section IV, we use both iptables and nftables as baseline
for the performance measurements of our case studies.
D. Application Level
Application level firewalls look into traffic addressed to
a specific application. In our second case study, we add
application-specific eBPF-based packet filters to sockets using
systemd’s socket activation (cf. Section VI). This allows
application developers to ship packet filtering rules that can
be deployed together with their application. Those rules are
specific to an application and simplify the central system-level firewall configuration.
The increase of dynamic features is a general trend in the
Linux kernel: Virtual machines, byte code interpreters, and
JIT compilers help to abstract features and move complexity
into the user space. Both case studies in this paper (XDP and
our application level packet filters) are based on eBPF, which
is available in Linux 4.x. eBPF allows for high-performance
packet filtering controlled from user space. When filtering
sockets of specific applications, root access is not required.
A. Berkeley Packet Filter VM
BPF, developed in 1992 for UNIX [3], has the purpose
of packet filtering, improving the performance of network
monitoring applications such as tcpdump. BPF has long
been part of the Linux kernel. In the original paper [3], a
BPF program can be found, which loads a packet’s Ethernet
protocol type field and rejects everything except for type IP,
demonstrating the internal workings.
Since the release of Linux 3.0, the BPF VM has been
improved continuously [4]. A JIT compiler was added for
the filter, which enabled the Linux kernel to translate the
VM instructions to assembly code on the x86_64 architecture
in real time. In Linux 3.15 and 3.16, further performance
improvements and generalizations of BPF were implemented,
leading to the extended Berkeley Packet Filters (eBPF).
B. Extended Berkeley Packet Filters
eBPF is an extended form of the BPF VM with a network
specific architecture designed to be a general purpose filtering
system. While the features provided by or implemented with
eBPF continue to grow [5], it can already be used for
applications other than socket filtering, such as packet filtering,
traffic control/shaping, and tracing. The IO Visor Project [5]
lists eBPF features available by kernel version.
eBPF is a byte code with 64-bit instructions running on
a machine with ten 64-bit registers, either dynamically interpreted or compiled just-in-time [5]. The instructions of the
eBPF VM are usually mapped 1:1 to real assembly instructions
of the underlying hardware architecture [6][7]. When the
kernel runs an eBPF program, it sets up the memory this
program can access depending on the type of the program
(e.g., socket filtering or event tracing) and allows calling
a predefined and type-dependent set of functions provided
by the kernel [8].

Figure 2: Maximum throughput [Mpps] for iptables and nftables
over the number of rules.

eBPF programs are like regular programs
running on hardware with two exceptions: First, when an eBPF
program is loaded into the kernel, it is statically verified. This
ensures that the program contains no backward jumps (loops)
and does not exceed a maximum size of 4096 instructions.
This verifier cannot be disabled without root access, preventing
users from overloading the system with complex filtering rules.
Thus, a malicious eBPF program cannot compromise or block
the kernel, which makes it possible to allow non-root users to
use eBPF [9]. All memory accesses are bound-checked.
The second difference is the key-value stores, termed maps.
While regular programs can retain state and communicate with
other programs using a variety of methods, eBPF programs are
restricted to reading from and writing to maps. Maps are areas
in memory which have to be set up by a user space helper
before the eBPF program is loaded into the kernel. Both the
key and value sizes and the maximum number of entries are
determined at creation time. Data can then be
accessed in a secure manner by both user space and eBPF
kernel space using a file descriptor [5].
C. eXpress Data Path
Linux 4.8 integrated eXpress Data Path (XDP) [5] into the
kernel. XDP provides the possibility for packet processing at
the lowest level in the software stack. Functions implemented
in network drivers expose a common API for fast packet
processing across hardware from different vendors. Through
driver hooks, user-defined eBPF programs can access and
modify packets in the NIC driver’s DMA buffers. The result
is a stateless and fast programmable network data path in the
Linux kernel. It allows early discarding of packets, e.g., to
counteract DoS, and bypassing of the Linux network stack to
directly place packets in the egress buffers.
XDP is already actively used. Cloudflare integrated XDP
into their DoS mitigation pipeline [10]. Incentives are the low
cost for dropping packets and the ability to express rules in
a high level language. In addition to deploying their own
DoS protection solution based on XDP [11], Facebook has
published plans to use XDP as layer 4 load balancer [12].
In their proposed scheme, each end-host performs the load
balancing. This again is possible, because XDP hooks are
performed before any costly actions.
The iptables packet filter utility has been part of Linux since
kernel version 2.4, released in 2001. It was introduced together with
the netfilter framework that provides functionality to hook in
at different points in the Linux network stack.

Figure 3: Latency distribution for iptables and nftables at
0.03 Mpps. Maximum outlier (blue cross) and median (dotted
red line) are marked for better visibility.

iptables
rules trigger different behavior implemented in different kernel
modules. This approach has been shown to introduce drawbacks,
such as the need to implement additional modules for IPv6,
which, although copied in large parts from IPv4, form a separate
module in the Linux kernel.
In 2014, nftables [13] was introduced with Linux 3.13. It
also builds on the netfilter framework, but has only basic
primitives like compare operations, or functionality to load
information from the packets implemented in the kernel.
The actual packet filters are created in the user context via
translation of rules into a list of basic primitives that are
evaluated via pseudo-machines whenever a packet needs to be
checked. This, for instance, allows packet filtering
functionality to be extended to new protocols without kernel
updates, by updating the user space program for rule translation.
We consider both iptables and nftables, packet filters based
on the netfilter framework of the Linux kernel, as the default
way of packet filtering in Linux. Therefore, we use them
as baseline comparison for our subsequently described case
studies of eBPF-based packet filtering approaches.
We perform basic tests to measure the performance of
iptables and nftables by installing rules that do not apply to the
traffic, but are sequentially checked before the packet passes.
Figure 2 shows the maximum throughput for increasing rule
set sizes. Our results coincide with related work, showing that
iptables yields better performance [14]. Both approaches are
able to process up to 1.5 Mpps for small rule sets, but
quickly decline when increasing the number of traversed rules.
The test traffic consists of a single flow, processing was hence
restricted to a single 3.2 GHz CPU core. Figure 3 shows the
latency distribution without firewall and with 3000 installed
nftables and iptables rules at 0.03 Mpps, which equals 15 Mbit/s
at minimum packet size. iptables induces a five-fold increase in
latency compared to not using any packet filters, while the
median for nftables is further increased by roughly 20 µs. Both
applications show two peaks and a long tail which can stretch
up to 180 µs compared to 110 µs for no firewall.
Figure 4: Lower bound of CPU cycles per packet (log scale)
over the offered rate [Mpps] for an XDP program, standard
vs. JIT compiled.
XDP yields performance gains as processing happens before
allocation of meta-data structures like the kernel’s sk_buff
or buffering in software queues [15]. The programmability
aspect is realized with eBPF. As a result of the processing, a
packet can either be dropped, passed to the Linux TCP/IP
stack or transmitted back via the same NIC it was received
on. Besides functionality that is usually deployed in firewalls
on system level, this offers the possibility to use XDP for
packet processing, coarse filtering, e.g., to protect against
common DoS flooding attacks, stateful filtering, forwarding,
load balancing tasks, etc. In other words, XDP is useful to
protect the complete host from certain traffic, not to protect
single applications.
A. Measurement Setup
For all measurements we connected two hosts directly with
each other via a 10 Gigabit Ethernet (GbE) link. One host
is used as load generator and sink using the traffic generator
MoonGen [16]. The second host is the device under test (DuT)
and runs the packet filtering program. The DuT is equipped
with an Intel Xeon E3-1230 CPU (4 cores) @ 3.20 GHz and
16 GB DDR3 RAM and an Intel X540-T2 Network Adapter.
The DuT runs a live boot Debian Stretch image with a Linux
4.12 kernel supporting XDP for Intel NICs. We used the
Linux profiling utility perf for whitebox measurements of the
DuT. All incoming traffic was pinned to one core to reduce
the influence of jitter and side-effects on our measurements.
Furthermore, we statically set the CPU frequency to 100 % and
disabled Turbo Boost and Hyper-Threading. Each workload
was tested for 30 s.
1) XDP Sample Program: The sample program we used
for packet filtering with XDP is based on netoptimizer’s
prototype-kernel (samples/bpf/xdp_ddos01_blacklist_kern.c).
It consists of two parts: a command-line
application to add and remove filtering rules and an application
that attaches the XDP program to the network driver of the
specified interface. The processing of an incoming packet is
as follows. The XDP program first parses the Ethernet frame
and performs sanity checks. If the parsing fails, the packet is
passed on to the TCP/IP stack for further handling, otherwise
it attempts to parse the IPv4 header and extract the source
IP. If the extracted IP is contained in a blacklist, the packet
is dropped. If not, execution continues to check the layer 4
header, extract the ports and look up another eBPF map that
contains a list of blocked ports. If the port is blacklisted, the
program returns XDP_DROP, otherwise, the packet is passed
to the network stack.
We slightly modified the XDP program. Instead of passing
valid packets to the TCP/IP stack, we swap source and destination MAC addresses and return the packet to the sender with
XDP_TX. Including the network stack introduces complexity
and potential side effects. Analysis of the network stack can
be found in numerous related work [17], [18].
2) Filtering Using Kernel Bypass: In addition to comparing
the results with the baseline measurements presented in Section IV, we compare XDP with current state-of-the-art kernel
bypass technology. We use a packet filtering example based on
the libmoon/DPDK framework [19]. The filter sample program
was configured to align the number of threads and TX/RX
queues to the same amount used by the XDP sample.
3) MoonGen Load Generator: We use the MoonGen [16]
high-performance, software packet generator based on libmoon/DPDK for traffic generation. MoonGen is able to saturate a 10 Gbit/s link with minimum sized, user-customized
packets (14.88 Mpps) using a single core and offers precise
nanosecond-level hardware assisted latency measurements. All
measurements were performed using an example script sending minimum-sized 64 B packets.
B. Performance Results
1) eBPF Compiler Mode: At the time of writing this paper
(2018-03), the JIT compiler is still disabled by default as it is
deemed experimental and not mature enough. To analyze the
effect of the JIT compiler, we used the most basic XDP example, dropping every received packet. Our measurements show
that enabling the JIT can improve the performance by up to
45 %, resulting in 10 Mpps total processed packets. As shown
in Figure 4, the JIT increases the CPU cycles spent per packet
for low packet rates. The optimizations through the JIT reverse
this effect for higher rates. Processing packets with only 50 %
of the cycles yields the observed increase in performance. With
increased maturity and continuous optimization of the JIT this
performance gain compared to static compilation is likely to
increase further.
An XDP program uses the routines of the Linux network
stack for packet reception and transmission. The costs for the
operations of the NIC driver are known to be roughly 700
CPU cycles per packet [18]. Thus, we can estimate the lower
bound for executing the XDP code to roughly 300 CPU cycles
per packet when JIT compiled.
2) Maximum Processing Rate: Figure 5 shows the maximum performance when filtering packets with XDP with JIT
enabled. XDP can process up to 7.2 Mpps in the case of 90 %
packets dropped, a 28 % performance loss compared to the
basic drop example. When reaching the point of full CPU
utilization, the amount of processed packets remains constant,
i.e., excess packets cannot be processed and are dropped
(independent of eBPF rules and XDP actions).

Figure 5: Packet filtering performance (processed packets over
offered rate [Mpps]) for different fractions of filtered packets
(JIT enabled); series: libmoon, 90 % dropped, 50 % dropped,
10 % dropped.

The 3 Mpps
reduction in peak performance compared to the simple drop
example is because the filter program performs additional tasks
including the parsing and forwarding of packets. Depending on
the percentage distribution of dropped and forwarded packets,
the performance differs. Dropping a packet costs fewer cycles
than forwarding one. However, towards the worst case,
i.e., all packets are forwarded, less than a 10 % reduction in
performance is visible.
The results show a ~10-fold performance increase
compared to iptables and nftables. As expected, the
libmoon/DPDK-based kernel bypass approach was able to
process at 10 Gbit/s line-rate for all scenarios.
In the following, our analysis is limited to the case of 90 %
of the flows being passed. We chose this scenario to obtain more
samples for our latency analysis, as only packets that pass the
device can be timestamped.
3) Profiling: We use Linux’ perf record profiling utility to analyze the costs of different internal processing steps.
Linux’ perf record allows us to count CPU cycles spent
per function. As this results in hundreds of kernel functions,
we group the functions by category. bpf_prog contains
the eBPF program code itself (packet processing), while
bpf_helper are functions that can be called from eBPF
programs such as map lookups. Driver related functions of
the NIC are grouped in ixgbe, while kernel denotes all
other kernel functions. Finally, idle contains idle functions
and unused CPU cycles. Note that in order to get meaningful output when profiling a JIT-compiled eBPF program,
the net/core/bpf_jit_kallsyms kernel parameter has
to be set, exporting the program as a kernel symbol to perf.

Disregarding idle times for lower rates in Figure 6, the eBPF
program’s relative costs account for the highest share, with
approximately 60 % of CPU usage. The second highest is the
ixgbe driver code, requiring close to 20 % of the resources,
primarily for handling DMA descriptors (13 %). The BPF
helper functions consume approximately 10 %. This is almost
exclusively utilized for the execution of lookup functions. The
processing performed by the kernel amounts to less than 5 %,
attributed to various utility functions.
4) Latency: Figure 7 shows latency percentiles up to the
99.99th percentile for the XDP packet filtering example with
90 % packets passed. We compare the cases with and without
JIT to analyze the effect of the JIT compiler on latency.

Figure 6: Profiling XDP: CPU load [%] over offered rate
[Mpps], broken down into idle, kernel, ixgbe, bpf_helper and
bpf_prog.

Figure 7: Latency percentiles (50th, 90th, 99th, 99.9th and
99.99th) for XDP packet filtering (90 % passing traffic);
latency [log(µs)] over offered rate [Mpps].
As with the drop example, the JIT compiled code yields
3 Mpps more performance for the filter example. Two different
areas of latency can be observed. Overloading the device leads
to a high latency (800 µs to 1500 µs) caused by buffers. During
normal operation, the program shows a median latency of
roughly 50 µs. With increasing load the 99.99th percentile
extends up to the worst case latency. At both edges, latency
optima with median latency of 10 µs to 20 µs are observable.
For very low data rates this extends throughout the 99.99th
percentile.
Figure 8 shows the latency histograms for both optima and
the average case during normal operation. These latency figures are in the expected and normal range for in-kernel packet
processing with interrupt throttling (ixgbe ITR) [20], [21]. The
histograms for the optima show similarities, independent of the
JIT compiler. The difference is that the optimum for high data
rates shows a long tail with outliers between 100 to 300 µs.
During the average steady-state case (1.4 Mpps/3.9 Mpps) the
median latency rises to roughly 50 µs. Both cases have a long
tail, however, enabling JIT causes more and higher outliers up
to 900 µs.
In comparison to the iptables and nftables baseline measurements, XDP achieves slightly better median latencies for
the steady-state. However, the downside is the long tail,
which can appear for all data rates beyond 0.5 Mpps and
can induce 20-fold increased latencies.

Figure 8: Latency histograms (relative probability [%] over
latency [µs]) for XDP packet filtering with 90 % passing traffic:
(a) no JIT, at 0.2, 1.4 and 2.95 Mpps; (b) JIT enabled, at 0.4,
3.9 and 6.25 Mpps. Median (red dotted line) and outliers (blue
cross) are marked for better visibility.

The JIT compiler
increases this effect. This is a significant problem for modern
high-performance applications with a high demand for low and
stable latency [22]. Such outliers do not appear when running
an application on DPDK [21]; the trade-off is that the CPU is
always fully utilized by the poll-mode driver.
C. Discussion
Our measurements have shown that XDP provides a significant performance increase in comparison to iptables or
nftables. Enabling the eBPF JIT compiler further increases
the performance up to 10 Mpps. However, the line-rate performance of a kernel bypass application like DPDK cannot be
reached.

The mean latency of XDP is comparable to in-kernel filtering applications. Latency is dominated by interrupt throttling
in the driver (ixgbe ITR) and dynamic interrupt disabling and
polling in NAPI [20]. DPDK-based applications with a pure
poll-mode driver provide a consistent low latency at the cost
of 100% CPU load regardless of the offered load.
XDP offers a tradeoff. While performance is not as good
as dedicated high-performance frameworks that bypass the
kernel, it offers flexibility. It is fast enough to be deployed as
DoS protection, but also offers kernel integration, i.e., packets
can be passed through the network stack with all its benefits.
While being able to process 7.2 Mpps might seem insufficient
as it is only 50 % of 10 GbE line rate, this represents the
performance of a single core, i.e., it scales with the number
of CPU cores.
Our second case study aims at fine-grained filtering before
packets reach the application. The purpose is not DoS protection; this should in general be handled for the complete host
on a lower level, e.g., with XDP. Allowing packets to be filtered
on a per-socket basis has clear advantages. Application developers can bind to the wildcard address, circumventing startup
problems in case an address or interface is not yet available.
The developers can ship their program with packet filters to
Figure 9: Steps from C-based packet filter definition to effective socket-attached packet filter (compile to bytecode, load into the kernel VM, then forward or drop packets)
restrict the network exposure according to their requirements.
Also, the user can define application specific firewall rules
without privileges. This does not introduce security problems,
i.e., one application cannot interfere with the domain of another,
as rules can only be attached to a specific socket, impacting only
the application listening on that socket. Lastly, network tool
developers can implement complex firewall tools on top of this
approach, such as custom port-knocking solutions or traffic
analysis programs.
As a result, the complex firewall is decentralized, consisting
of smaller, easy-to-maintain rulesets per application. This approach is less prone to errors and requires less administration.
A. Technical Overview
We combine the eBPF features of the Linux kernel with
the systemd init daemon: systemd creates the socket, attaches an
eBPF filter program, and then passes the socket to the application via systemd socket activation. We implemented a proof of concept,
available as open source [23], for demonstration and use it to
analyze the performance penalty incurred by attaching
an eBPF filter to an application socket. Our implementation
offers a command line interface that makes it possible to quickly write
applications requiring even complex support from the firewall.
We demonstrate this with an implementation of port knocking
using our tool [23].
Figure 9 shows the steps to instantiate application-specific
packet filtering rules via eBPF. In the first step, C code
is compiled with clang into an eBPF program, which is then
attached as a filter to an application socket. The kernel VM now
runs the program to determine which packets are to be dropped
and which are to be forwarded to the application.
Figure 10: Baseline transmission speeds at different MTUs (series: Standard, JIT compiled, TB/HT)
1) BPF Compiler Collection: As programming with the
low-level eBPF instruction set (cf. [24]) can be cumbersome,
the BPF Compiler Collection (BCC) [25] provides the C compiler clang to compile restricted C code into eBPF programs.
The restrictions enforced by the bytecode verifier in the kernel include:
no loops, a limited number of instructions, no function calls,
etc. BCC wraps LLVM/clang and enhances it with support
for special syntax for defining eBPF maps inside the actual
C program code. The eBPF maps are used to store state
across multiple invocations of the eBPF program and for
communication with user space.
2) Socket Activation: Socket activation allows spawning
services on the arrival of incoming connections. This allows an
application to start only if and when a client actually connects
to it, and to stay active only for the duration of the connection.
While the first popular implementation, inetd, spawned a new process for each request, the systemd socket activation protocol
starts the main application only once and passes the listening
socket to it via a file descriptor with a specified number.
We take advantage of this capability in our implementation
to realize application-level packet filtering. The server socket
is created before the filter program is attached to it using
the BCC library. The socket is then passed to the application
via systemd socket activation. As both systemd and socket
activation have become popular, many applications support
preconfigured socket file descriptors. For those, our application
firewall can be added without changes.
3) Data Availability: When transferring a stream of data
using TCP sockets, user space only sees a packet
containing the TCP/UDP payload. The lack of Ethernet and
IP headers makes packet filtering impossible.
Fortunately, eBPF allows loads from negative memory addresses, which
implement certain Linux-only extensions such as getting the
current time or determining the CPU core the filter program
is currently running on [26]. Our implementation uses this mechanism to
read from the memory addresses containing the layer 2 and layer 3
headers, even if the filter is attached to a UDP or TCP socket.
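These special addresses are the SKF_* offsets defined in the kernel header <linux/filter.h>: a load from a negative offset is rewritten by the kernel into an ancillary operation or into a load relative to the link-layer or network-layer header. A sketch of the constants involved (values copied from the kernel header; the helper function is our illustration):

```python
# Special load offsets understood by the Linux socket filter loader,
# as defined in <linux/filter.h>.
SKF_AD_OFF  = -0x1000    # base for ancillary data (CPU id, timestamp, ...)
SKF_NET_OFF = -0x100000  # start of the network-layer (IP) header
SKF_LL_OFF  = -0x200000  # start of the link-layer (Ethernet) header

def net_header_offset(field_offset):
    """Offset to load a byte of the IP header from a socket filter,
    even when the filter only 'sees' the transport payload."""
    return SKF_NET_OFF + field_offset

print(hex(net_header_offset(0)))  # first byte of the IP header
```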
B. Measurement Setup
The measurement setup differs from the one in Section V-A. As benchmarking an application-level packet filter
requires all packets to pass through the kernel, the performance is
expected to be clearly below 10 GbE line rate on the same
hardware. Initial measurements using a simple echo daemon
without any packet filters confirmed that the peak throughput
is below 4 Gbit/s. To maximize the impact of our socket-activated packet filters, we decided to measure the performance
using the loopback interface of the DuT. Using a virtual link
accomplishes this, as the results are largely dominated by
how efficiently the eBPF filters (and the TCP/IP stack) work
on Linux, assuming that processor and memory speeds are
constant. Furthermore, we decided to use the Linux utility
iperf instead of MoonGen as traffic generator and sink. This
approach requires no dedicated networking hardware, allowing
for experiments that are simpler to reproduce, at the cost of not being
able to analyze the latency behavior. However, we expect that
latency would be primarily influenced by the same mechanisms
discussed previously.
All measurements are performed using IPv6/TCP traffic
generated by iperf. Instead of the packet rate, we measure
the throughput of the application, as this is the metric that matters
for application developers. We tested three different configurations. As in Section V-B1, we evaluated the impact of enabling
the JIT compiler for eBPF. Furthermore, we analyzed the
performance when also enabling dynamic frequency scaling
of the CPU, Turbo Boost, and Hyper-Threading (referred to as
TB/HT). Note that these are usually disabled for performance
measurements, as they introduce jitter and, for instance, raise
the CPU frequency above its nominal value. However, we argue that this
represents a realistic scenario for an application using a socket-attached
filter, as they are enabled by default on Linux-based systems.
C. Performance Measurements
The following discusses the performance results of our
socket-activated packet filtering tool.
1) Baseline: To determine the maximum achievable transmission speed of our setup, we ran a performance test with iperf
serving as both client and server at the default MTU of the
loopback interface (65536 bytes). The measurement showed a
maximum transmission speed of roughly 3.8 GB/s as the limit.
Figure 10 shows the peak throughput for increasing MTU
when using our client and server programs without attaching a
socket filter. Instead, they use classic systemd socket activation
to create and pass the socket.
The results show that our program without any filtering
rules is able to reach the maximum throughput starting at an MTU of
approximately 42 kB. As we are not using socket filters, this
is independent of the usage of the JIT compiler. For smaller
MTUs, the correlation between MTU size and transmission
speed appears sub-linear, suggesting a slightly increasing
overhead with growing packet size. We attribute the visible
steps in performance to side effects of the Linux memory management.
2) Subnet Filtering: The first scenario configures a socket
filter that compares all ingress traffic against a configurable set
of IPv6 subnets and only allows a packet to pass if a match
is found, i.e., a whitelist filter. The matching rule is placed
increasingly further back in the list of subnets given
to the socket filter, starting from index 0 up to index 22,
the largest index our filter supports due to
the maximum size of an eBPF program. We ran the test for
Figure 11: Subnet filtering performance at various MTUs and matching rule indices. Panels: (a) Standard, (b) JIT compiled, (c) TB/HT optimizations enabled; each plots the relative throughput [%] against the index of the rule matching the traffic, for MTUs of 65536, 9000, 1500, and 1280 bytes.
Figure 12: Interface filtering performance at various MTUs and matching rule indices. Panels: (a) Standard, (b) JIT compiled, (c) TB/HT optimizations enabled; each plots the relative throughput [%] against the index of the rule matching the traffic, for MTUs of 65536, 9000, 1500, and 1280 bytes.
four different MTUs typically encountered. 1280 bytes is the
minimum MTU allowed with IPv6 [27], 1500 bytes the default
MTU for Ethernet [28], 9000 bytes a commonly supported size
for Jumbo frames [29], and 65536 bytes is the default MTU
of the loopback interface on Linux [30]. The results relative
to the baseline speed at the corresponding MTU are displayed
in Figure 11.
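The behavior measured here follows from the sequential nature of the whitelist. The following user-space model (ours, using Python's ipaddress module; the actual filter performs the equivalent comparisons in restricted C inside the kernel VM) illustrates why a match at a higher index costs more work:

```python
import ipaddress

def whitelist_index(src_addr, subnets):
    """Return the index of the first whitelisted subnet containing
    src_addr, or None if the packet is to be dropped.  Rules are
    checked in order, so later matches require more comparisons."""
    addr = ipaddress.ip_address(src_addr)
    for i, net in enumerate(subnets):
        if addr in ipaddress.ip_network(net):
            return i
    return None

rules = ["2001:db8:1::/48", "2001:db8:2::/48", "2001:db8:3::/48"]
print(whitelist_index("2001:db8:3::1", rules))     # → 2 (matches last rule)
print(whitelist_index("2001:db8:ffff::1", rules))  # → None (dropped)
```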
For all configurations, the performance degrades the further
back the matching rule is in the set of whitelisted subnets.
This is expected, as rules are matched sequentially, i.e., non-matching rules before a match cost performance. Higher
MTUs generally lead to better performance, even though the
performance is shown relative to the baseline performance of the respective
MTU. This is because a higher MTU not only means fewer
packets for the TCP/IP stack to handle, but also implies fewer
invocations of the socket filter.
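The last point is easy to quantify: transferring the same payload with a smaller MTU requires proportionally more packets and therefore proportionally more runs of the socket filter. A rough sketch (ours; it ignores header overhead and retransmissions):

```python
import math

PAYLOAD = 10 * 1024**3  # transfer 10 GiB of payload

for mtu in (65536, 9000, 1500, 1280):
    # One socket-filter invocation per delivered packet.
    invocations = math.ceil(PAYLOAD / mtu)
    print(f"MTU {mtu:>5}: {invocations:>9,} filter invocations")
```

At an MTU of 1280 bytes the filter runs roughly 50 times as often as at the 65536 byte loopback default, which matches the stronger degradation observed for small MTUs.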
Enabling JIT compilation (cf. Figure 11b) yields significant performance
gains of up to 100 % for the worst case of the lowest possible MTU
and the highest possible index of the matching rule. Enabling
further optimizations (cf. Figure 11c) increases the
performance even more, such that the minimum performance is 80 % of
the baseline. As the baseline is slightly lower for the
1280 and 1500 byte MTUs, the relative throughput for the
9000 byte MTU is worse than for the other measured MTUs.
3) Interface Filtering: The second scenario filters ingress
packets depending on their incoming interface. The difference
from the first scenario is that the (byte) code for checking
whether a packet arrived via a particular interface (a 32 bit
integer) is shorter than the code required for a full IPv6
subnet match (a 128 bit integer plus the logic for subnet matching).
Consequently, the eBPF program can contain four times as
many rules, allowing us to further investigate the performance of
the match and action processing.
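The per-rule logic itself is the same match-and-return loop as in the subnet scenario, only with a cheaper comparison. A user-space model of the interface whitelist (our illustration; the real filter compares the interface index delivered by the kernel):

```python
def interface_index(pkt_ifindex, allowed_interfaces):
    """Return the index of the first rule matching the packet's
    incoming interface, or None to drop the packet.  Each rule is
    a single 32-bit comparison, far cheaper than a 128-bit IPv6
    subnet match, so roughly four times as many rules fit into the
    same maximum eBPF program size."""
    for i, ifindex in enumerate(allowed_interfaces):
        if pkt_ifindex == ifindex:
            return i
    return None

allowed = list(range(1, 101))          # 100 simple rules
print(interface_index(100, allowed))   # → 99 (last rule matches)
print(interface_index(4242, allowed))  # → None (dropped)
```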
The results of the worst case without JIT compilation in
Figure 12a are almost identical to Figure 11a, despite the
increase in the number of rules. Only when enabling JIT
(cf. Figure 12b) or all optimizations (cf. Figure 12c) does the
performance degrade to 50 % and 70 % of the baseline,
respectively, when more than 100 rules have to be processed. In
fact, when processing the same number of rules as in the
previous scenario (up to 22), the performance in this scenario
is equal or slightly better. This is due to the simpler packet
filter, i.e., matching an interface ID instead of an IPv6 subnet.
D. Discussion
Making packet filtering or general firewall decisions, at
least partially, at the application level instead of the network
or system level provides both better isolation between applications and more freedom for application developers and
users. While not all applications may require that freedom,
there are examples (such as local file sharing solutions or
remote management interfaces) whose overall security could
be increased if their network exposure could be limited in
a flexible and configurable way. Additionally, an application-level firewall does not have to care about the system-level
configuration and its limitations, thus reducing the risk of
creating an error-prone ruleset that affects other applications on
the host.
The idea of filtering network traffic at the application
level has in fact been implemented before, for instance with
TCP Wrappers [31]. However, these implementations typically
require explicit application support (linking against a shared
library [32]) and, more importantly, do not allow applications
to ship their own, arbitrary filtering rules. Instead, they are
restricted to what the system administrator has set up in the
global TCP Wrappers configuration file.
Our approach circumvents these issues. We have shown that
filtering can be done at high data rates in both scenarios, once with
a few complex rules and once with one hundred
simple rules. When using optimizations that are available and enabled
by default on modern systems, the performance loss is below
30 % on a single core.
We argue that the bottleneck is the limit of the eBPF
program size, i.e., the number and complexity of rules that
can be attached to the socket. Considering machines with a
realistic number of interfaces and therefore required socket
filter rules, this is likely to be acceptable for packet filtering
at application level. Besides packet filtering, our tool can also
be used for more complex firewalling operations, which we
demonstrated with a port-knocking application [23].
As shown by our two case studies, eBPF can be used for
versatile, high-performance packet filtering applications.
XDP, hooking in at the lowest level before the network stack, is
well suited for coarse packet filtering such as DoS prevention.
Our measurements have shown that XDP can yield four times
the performance compared to performing a similar task in
the kernel using common packet filtering tools. While latency
outliers exist, which will likely improve with the increasing
maturity of the XDP code base, the median latency also shows
improvements. JIT compiled code yields up to 45 % improved
performance at the cost of more and higher latency outliers.
Furthermore, eBPF and XDP are constantly being improved
and extended, e.g., with eBPF hardware offloading or, for XDP,
redirecting packets to another NIC, which will likely improve
the performance and support more use cases.
Our approach to socket-activated packet filtering shows that
eBPF provides flexibility. Application-specific firewall rules
can be set by each application individually without requiring
root access. Our tool grants the application developer the freedom
to restrict the network exposure of the application to
its needs. This can improve on the classic, error-prone, centralized
configuration schemes used with iptables and nftables. The
complexity of the global, system-level firewall is reduced
while security is improved through better isolation between
applications. Performance losses are below 20 % for the worst
case of the maximum number of rules with the minimum MTU when
enabling JIT compilation and modern performance features
like Turbo Boost.
The code of our eBPF demo application is available as free
and open source [23].
[1] P. Ferguson and D. Senie, “Network Ingress Filtering: Defeating Denial
of Service Attacks which employ IP Source Address Spoofing,” Internet
Requests for Comments, RFC Editor, RFC 2827, May 2000.
[2] A. Fiessler, S. Hager, B. Scheuermann, and A. W. Moore, "HyPaFilter: A
Versatile Hybrid FPGA Packet Filter," in ACM/IEEE Symposium on Architectures for Networking
and Communications Systems (ANCS). IEEE, 2016.
[3] S. McCanne and V. Jacobson, "The BSD Packet Filter: A New Architecture for User-level Packet Capture," in Proceedings of the USENIX
Winter 1993 Conference, ser. USENIX'93. Berkeley, CA, USA: USENIX
Association, 1993.
[4] Q. Monnet. (2017, 10) Dive into BPF: a list of reading material.
[Online]. Available: https://qmonnet.github.io/whirl-offload/2016/09/01/
[5] IO Visor Project. (2017, 09) BPF Features by Linux Kernel
Version. [Online]. Available: https://github.com/iovisor/bcc/blob/master/
[6] J. Corbet, “BPF: the universal in-kernel virtual machine,” May 2014.
[Online]. Available: https://lwn.net/Articles/599755
[7] J. Corbet, “A JIT for packet filters,” Apr. 2011. [Online]. Available:
[8] IO Visor Project. (2017, 09) bcc Reference Guide. [Online]. Available:
https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md
[9] J. Corbet. (2015, 10) Unprivileged bpf(). [Online]. Available:
[10] G. Bertin, “XDP in practice: integrating XDP into our DDoS mitigation
pipeline,” in Technical Conference on Linux Networking, Netdev, vol. 2,
[11] H. Zhou, D. Porter, R. Tierney, and N. Shirokov, “Droplet: DDoS
countermeasures powered by BPF + XDP,” in Technical Conference
on Linux Networking, Netdev, vol. 1, 2017.
[12] H. Zhou, Nikita, and M. Lau. (2017) XDP Production Usage: DDoS
Protection and L4LB. [Online]. Available: https://www.netdevconf.org/
[13] netfilter.org. The ”nftables” project. [Online]. Available: http://netfilter.
[14] P. Sutter. (2017) Benchmarking nftables. [Online]. Available: https:
[15] Linux Foundation. (2017, Jul.) eXpress Data Path. [Online]. Available:
[16] P. Emmerich, S. Gallenmüller, D. Raumer, F. Wohlfart, and G. Carle,
"MoonGen: A Scriptable High-Speed Packet Generator," in Internet
Measurement Conference (IMC) 2015, Tokyo, Japan, Oct. 2015.
[17] R. Bolla and R. Bruschi, “Linux software router: Data plane optimization
and performance evaluation,” Journal of Networks, vol. 2, no. 3, pp. 6–
17, 2007.
[18] D. Raumer, F. Wohlfart, D. Scholz, P. Emmerich, and G. Carle,
"Performance exploration of software-based packet processing systems,"
Leistungs-, Zuverlässigkeits- und Verlässlichkeitsbewertung von Kommunikationsnetzen und verteilten Systemen, 6. GI/ITG-Workshop, MMBnet,
[19] P. Emmerich. (2017) libmoon git repository. [Online]. Available:
[20] P. Emmerich, D. Raumer, A. Beifuß, L. Erlacher, F. Wohlfart, T. M.
Runge, S. Gallenmüller, and G. Carle, "Optimizing Latency and CPU
Load in Packet Processing Systems," in International Symposium on
Performance Evaluation of Computer and Telecommunication Systems
(SPECTS 2015), Chicago, IL, USA, Jul. 2015.
[21] P. Emmerich, D. Raumer, S. Gallenmüller, F. Wohlfart, and G. Carle,
"Throughput and Latency of Virtual Switching with Open vSwitch: A
Quantitative Analysis," Jul. 2017.
[22] G. Tene, “How not to measure latency,” 2013.
[23] A. Kurtz. alfwrapper – Application-level firewalling using systemd
socket activation and eBPF filters. GitHub project page. [Online]. Available:
[24] python sweetness. (2015, 07) Fun with BPF, or, shutting
down a TCP listening socket the hard way. [Online]. Available: http://pythonsweetness.tumblr.com/post/125005930662/
[25] BCC Project. (2017) Main repository. [Online]. Available: https:
[26] A. Starovoitov. (2017, 03) [iovisor-dev] Reading src/dst addresses
in reuseport programs. [Online]. Available: https://lists.iovisor.org/
[27] S. Deering and R. Hinden, “Internet Protocol, Version 6 (IPv6)
Specification,” Internet Requests for Comments, RFC Editor, RFC 8200,
Jul. 2017. [Online]. Available: http://www.rfc-editor.org/rfc/rfc8200.txt
[28] IEEE, “Standard for Ethernet,” IEEE Std 802.3-2015 (Revision of IEEE
Std 802.3-2012), March 2016.
[29] I. Cisco Systems. (2017) Jumbo/Giant Frame Support on Catalyst Switches Configuration Example. [Online]. Available: https://www.cisco.com/c/en/us/support/docs/switches/
[30] Linux 4.13 Source Code. (2017) drivers/net/loopback.c. [Online].
Available: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.
[31] W. Venema, "TCP Wrapper – Network monitoring, access control, and
booby traps," 1992. [Online]. Available: http://static.usenix.org/
publications/library/proceedings/sec92/full_papers/venema.pdf
[32] D. J. Barrett, R. E. Silverman, and R. G. Byrnes. (2017) SSH
Frequently Asked Questions: I’m trying to use TCP wrappers (libwrap),
but it doesn’t work. [Online]. Available: http://www.snailbook.com/faq/