David Gibson e3f70c05ba tcp_splice: Force TCP RST on abnormal close conditions
When we need to prematurely close a spliced connection, we use:
    conn_flag(conn, CLOSING);
This does destroy the flow, but does so in the same way as a clean close
from both ends.  That's not what we want in error conditions, or when one
side of the flow has signalled an abnormal exit with an EPOLLHUP event.

Replace all places where we close the connection - except for the happy
path close - with calls to a new tcp_splice_rst() function, which forces
the sockets to emit a TCP RST on each side.

Link: https://bugs.passt.top/show_bug.cgi?id=193
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2026-01-27 12:40:29 +01:00
contrib selinux: Enable open permissions on netns directory, operations on container_var_run_t 2026-01-16 17:22:44 +01:00
doc doc: Add test program verifying socket RST behaviour 2026-01-27 12:40:21 +01:00
hooks hooks/pre-push: Use mandoc(1) to get HTML anchors to command-line options 2026-01-17 15:36:02 +01:00
LICENSES passt: Relicense to GPL 2.0, or any later version 2023-04-06 18:00:33 +02:00
test test: Include sshd-auth in mbuto guest image 2026-01-10 19:27:47 +01:00
.clang-format clang: Add .clang-format file 2024-11-07 12:45:16 +01:00
.clang-tidy clang: Move clang-tidy configuration from Makefile to .clang-tidy 2024-11-07 12:46:19 +01:00
.clangd clang: Add rudimentary clangd configuration 2024-11-07 12:47:07 +01:00
.gitignore migrate: Fix several errors with passt-repair 2025-02-04 08:52:27 +01:00
.gitpublish Add git-publish configuration file 2022-10-22 03:45:50 +02:00
arch.c arch: Avoid explicit access to 'environ' 2024-11-07 12:46:29 +01:00
arch.h passt: Relicense to GPL 2.0, or any later version 2023-04-06 18:00:33 +02:00
arp.c arp/ndp: don't send messages on uninitialized tap interface 2025-11-27 22:29:25 +01:00
arp.h arp/ndp: send ARP announcement / unsolicited NA when neigbour entry added 2025-10-30 12:01:01 +01:00
checksum.c style: Add parentheses to function names in comments 2025-07-18 19:19:37 +02:00
checksum.h checksum: Don't export various functions 2025-03-07 02:21:24 +01:00
conf.c conf, pasta: Add --splice-only option 2026-01-19 09:12:27 +01:00
conf.h conf: Move mode detection into helper function 2025-03-12 23:08:33 +01:00
CONTRIBUTING.md Add reverse Christmas tree to CONTRIBUTING.md 2025-10-27 23:32:37 +01:00
dhcp.c treewide: Don't rely on terminator records in ip[46].dns arrays 2026-01-10 19:27:45 +01:00
dhcp.h dhcp: use iov_tail rather than pool 2025-09-03 20:43:39 +02:00
dhcpv6.c treewide: Fix more pointers which can be const 2026-01-14 01:07:51 +01:00
dhcpv6.h dhcpv6: use iov_tail rather than pool 2025-09-03 20:43:41 +02:00
epoll_ctl.c epoll_ctl: Extract epoll operations 2025-10-30 15:32:12 +01:00
epoll_ctl.h epoll_ctl: Move u64 variant first for safer initialisation 2026-01-14 01:07:51 +01:00
epoll_type.h netlink: add subscription on changes in NDP/ARP table 2025-10-30 12:00:49 +01:00
flow.c flow: Remove EPOLLFD_ID_INVALID 2026-01-20 19:37:53 +01:00
flow.h flow: Remove EPOLLFD_ID_INVALID 2026-01-20 19:37:53 +01:00
flow_table.h flow, fwd: Optimise forwarding rule lookup using epoll ref when possible 2026-01-18 12:48:09 +01:00
fwd.c conf, pasta: Add --splice-only option 2026-01-19 09:12:27 +01:00
fwd.h flow, fwd: Optimise forwarding rule lookup using epoll ref when possible 2026-01-18 12:48:09 +01:00
icmp.c flow: Remove EPOLLFD_ID_INVALID 2026-01-20 19:37:53 +01:00
icmp.h icmp: Remove vestiges of ICMP timer 2025-11-01 00:22:52 +01:00
icmp_flow.h icmp: Remove redundant id field from flow table entry 2024-07-19 18:33:06 +02:00
igmp.c igmp: Remove apparently unneeded suppression 2026-01-14 01:07:51 +01:00
inany.c conf, fwd: Check forwarding table for conflicting rules 2026-01-18 12:47:53 +01:00
inany.h conf, fwd: Check forwarding table for conflicting rules 2026-01-18 12:47:53 +01:00
iov.c iov: Fix coding style of basic (non-IOV_TAIL) parts 2025-12-08 04:47:57 +01:00
iov.h treewide: Fix places where we incorrectly indented with spaces 2026-01-11 01:31:50 +01:00
ip.c fwd, tcp, udp: Set up listening sockets based on forward table 2026-01-18 12:47:47 +01:00
ip.h ip: Add ipproto_name() function 2026-01-18 12:47:44 +01:00
isolation.c isolation: keep CAP_DAC_OVERRIDE initially 2025-10-09 10:11:27 +02:00
isolation.h passt, util: Close any open file that the parent might have leaked 2024-08-08 21:31:25 +02:00
lineread.c style: Fix 'Return' comment style 2025-07-18 19:19:24 +02:00
lineread.h lineread: Use ssize_t for line lengths 2024-06-07 20:44:44 +02:00
linux_dep.h cppcheck: Suppress the suppression of a suppression 2025-10-07 15:33:30 +02:00
log.c treewide: Flush pcap and log files, if used, before exiting 2025-08-19 16:29:52 +02:00
log.h treewide: Introduce passt_exit() helper 2025-12-12 22:20:02 +01:00
Makefile epoll_ctl: Extract epoll operations 2025-10-30 15:32:12 +01:00
migrate.c migrate: Don't use terminator element for versions[] array 2026-01-10 19:27:43 +01:00
migrate.h migrate: Skeleton of live migration logic 2025-02-12 19:47:07 +01:00
mld.c passt: Relicense to GPL 2.0, or any later version 2023-04-06 18:00:33 +02:00
ndp.c treewide: Don't rely on terminator records in ip[46].dns arrays 2026-01-10 19:27:45 +01:00
ndp.h arp/ndp: send ARP announcement / unsolicited NA when neigbour entry added 2025-10-30 12:01:01 +01:00
netlink.c epoll_ctl: Extract epoll operations 2025-10-30 15:32:12 +01:00
netlink.h netlink: add subscription on changes in NDP/ARP table 2025-10-30 12:00:49 +01:00
packet.c packet: Add support for multi-vector packets 2025-09-03 20:43:51 +02:00
packet.h packet: Add support for multi-vector packets 2025-09-03 20:43:51 +02:00
passt-repair.1 passt-repair: Add directory watch 2025-03-12 21:34:36 +01:00
passt-repair.c build: Fix errors of TCP_REPAIR_* undeclared 2025-09-02 09:31:27 +02:00
passt.1 conf, pasta: Add --splice-only option 2026-01-19 09:12:27 +01:00
passt.c conf, fwd: Move initialisation of auto port scanning out of conf() 2025-12-12 22:38:56 +01:00
passt.h conf, pasta: Add --splice-only option 2026-01-19 09:12:27 +01:00
pasta.c conf, pasta: Add --splice-only option 2026-01-19 09:12:27 +01:00
pasta.h pasta: Add fallback timer mechanism to check if namespace is gone 2024-02-16 08:47:14 +01:00
pcap.c tap: Use iov_tail with tap_add_packet() 2025-09-03 20:42:20 +02:00
pcap.h treewide: Flush pcap and log files, if used, before exiting 2025-08-19 16:29:52 +02:00
pif.c pif: Correctly set scope_id for guest-side link local addresses 2025-12-10 08:37:29 +01:00
pif.h inany: Let length of sockaddr_inany be implicit from the family 2025-12-02 23:07:14 +01:00
qrap.1 passt.1, qrap.1: align license description with SPDX identifier 2024-06-19 15:00:55 +02:00
qrap.c util: move IP stuff from util.[ch] to ip.[ch] 2024-03-06 08:03:38 +01:00
README.md tcp: Limit advertised window to available, not total sending buffer size 2025-12-08 08:03:23 +01:00
repair.c epoll_ctl: Extract epoll operations 2025-10-30 15:32:12 +01:00
repair.h flow, repair: Proper error handling for missing passt-repair helper on target 2025-06-06 10:46:40 +02:00
seccomp.sh seccomp: Fix build and operation on 32-bit musl targets 2025-12-07 23:17:25 +01:00
siphash.h style: Add parentheses to function names in comments 2025-07-18 19:19:37 +02:00
slirp4netns.sh passt: Relicense to GPL 2.0, or any later version 2023-04-06 18:00:33 +02:00
tap.c conf, pasta: Add --splice-only option 2026-01-19 09:12:27 +01:00
tap.h arp/ndp: don't send messages on uninitialized tap interface 2025-11-27 22:29:25 +01:00
tcp.c tcp: Properly propagate tap-side RST to socket side 2026-01-27 12:40:25 +01:00
tcp.h fwd, tcp, udp: Add forwarding rule to listening socket epoll references 2026-01-18 12:48:06 +01:00
tcp_buf.c tcp, udp: Pad batched frames to 60 bytes (802.3 minimum) in non-vhost-user modes 2025-12-08 04:47:22 +01:00
tcp_buf.h tcp: unify payload and flags l2 frames array 2024-11-07 12:47:41 +01:00
tcp_conn.h tcp: Adaptive interval based on RTT for socket-side acknowledgement checks 2025-12-08 09:15:36 +01:00
tcp_internal.h tcp, udp: Pad batched frames to 60 bytes (802.3 minimum) in non-vhost-user modes 2025-12-08 04:47:22 +01:00
tcp_splice.c tcp_splice: Force TCP RST on abnormal close conditions 2026-01-27 12:40:29 +01:00
tcp_splice.h flow, tcp: Flow based NAT and port forwarding for TCP 2024-07-19 18:33:29 +02:00
tcp_vu.c tcp, udp: Pad batched frames for vhost-user modes to 60 bytes (802.3 minimum) 2025-12-08 04:47:46 +01:00
tcp_vu.h vhost-user: add vhost-user 2024-11-27 16:47:32 +01:00
udp.c flow, fwd: Optimise forwarding rule lookup using epoll ref when possible 2026-01-18 12:48:09 +01:00
udp.h fwd, tcp, udp: Add forwarding rule to listening socket epoll references 2026-01-18 12:48:06 +01:00
udp_flow.c flow: Remove EPOLLFD_ID_INVALID 2026-01-20 19:37:53 +01:00
udp_flow.h flow, fwd: Optimise forwarding rule lookup using epoll ref when possible 2026-01-18 12:48:09 +01:00
udp_internal.h flow, fwd: Optimise forwarding rule lookup using epoll ref when possible 2026-01-18 12:48:09 +01:00
udp_vu.c udp_vu: Discard datagrams when RX virtqueue is not usable 2026-01-10 20:54:13 +01:00
udp_vu.h udp: Split spliced forwarding path from udp_buf_reply_sock_data() 2025-04-07 21:41:32 +02:00
util.c util: Be more defensive about buffer overruns in read_file() 2026-01-10 19:27:40 +01:00
util.h treewide: Introduce passt_exit() helper 2025-12-12 22:20:02 +01:00
vhost_user.c treewide: Fix more pointers which can be const 2026-01-14 01:07:51 +01:00
vhost_user.h style: Add parentheses to function names in comments 2025-07-18 19:19:37 +02:00
virtio.c treewide: Fix places where we incorrectly indented with spaces 2026-01-11 01:31:50 +01:00
virtio.h vhost-user: Fix VHOST_USER_GET_QUEUE_NUM to return number of queues 2025-09-09 21:13:59 +02:00
vu_common.c tcp, udp: Pad batched frames for vhost-user modes to 60 bytes (802.3 minimum) 2025-12-08 04:47:46 +01:00
vu_common.h treewide: Fix places where we incorrectly indented with spaces 2026-01-11 01:31:50 +01:00

passt: Plug A Simple Socket Transport

passt implements a translation layer between a Layer-2 network interface and native Layer-4 sockets (TCP, UDP, ICMP/ICMPv6 echo) on a host. It doesn't require any capabilities or privileges, and it can be used as a simple replacement for Slirp.

Overview diagram of passt

pasta: Pack A Subtle Tap Abstraction

pasta (same binary as passt, different command) offers equivalent functionality, for network namespaces: traffic is forwarded using a tap interface inside the namespace, without the need to create further interfaces on the host, hence not requiring any capabilities or privileges.

It also implements a tap bypass path for local connections: packets with a local destination address are moved directly between Layer-4 sockets, avoiding Layer-2 translations, using the splice(2) and recvmmsg(2)/sendmmsg(2) system calls for TCP and UDP, respectively.
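A minimal sketch of the splice(2) technique mentioned above, for the TCP case only. It is not taken from tcp_splice.c: the helper name `splice_forward()` and its parameters are hypothetical. Data is moved from one socket to another through a pipe, because splice(2) requires one end of each call to be a pipe:

```c
#define _GNU_SOURCE		/* for splice(2) and SPLICE_F_MOVE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>		/* socket types, for callers of this helper */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: forward up to 'len' bytes from socket 'from' to
 * socket 'to' via the pipe 'pipefd', without copying the payload into a
 * user-space buffer.
 * Returns bytes forwarded, 0 if the peer closed, -1 on error. */
ssize_t splice_forward(int from, int to, const int pipefd[2], size_t len)
{
	ssize_t in, out, done = 0;

	assert(pipefd && len > 0);

	/* Socket -> pipe */
	in = splice(from, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
	if (in <= 0)
		return in;

	/* Pipe -> socket: drain everything that was just queued */
	while (done < in) {
		out = splice(pipefd[0], NULL, to, NULL,
			     (size_t)(in - done), SPLICE_F_MOVE);
		if (out < 0 && errno == EINTR)
			continue;
		if (out <= 0)
			return -1;
		done += out;
	}

	return done;
}
```

A caller would create the pipe once with pipe(2) and reuse it for the lifetime of the connection. This sketch uses blocking file descriptors for brevity, whereas an event-driven forwarder would more likely use non-blocking sockets and handle EAGAIN.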

Overview diagram of pasta

See also the man page.

Motivation

passt

When container workloads are moved to virtual machines, the network traffic is typically forwarded by interfaces operating at data link level. Some components in the containers ecosystem (such as service meshes), however, expect applications to run locally, with visible sockets and processes, for the purposes of socket redirection, monitoring, and port mapping.

To solve this issue, user mode networking, as provided e.g. by libslirp, can be used. Existing solutions implement a full TCP/IP stack, replaying traffic on sockets that are local to the pod of the service mesh. This creates the illusion of application processes running on the same host, possibly separated by user namespaces.

While being almost transparent to the service mesh infrastructure, that kind of solution comes with a number of downsides:

  • three different TCP/IP stacks (guest, adaptation and host) need to be traversed for every service request
  • addressing needs to be coordinated to create the pretense of consistent addresses and routes between guest and host environments. This typically needs a NAT with masquerading, or some form of packet bridging
  • the traffic seen by the service mesh and observable externally is a distant replica of the packets forwarded to and from the guest environment:
    • TCP congestion windows and network buffering mechanisms in general operate differently from what would be naturally expected by the application
    • protocols carrying addressing information might pose additional challenges, as the applications don't see the same set of addresses and routes as they would if deployed with regular containers

passt implements a thinner layer between guest and host, one that implements only what's strictly needed to pretend processes are running locally. The TCP adaptation doesn't keep per-connection packet buffers, and reflects observed sending windows and acknowledgements between the two sides. This TCP adaptation is needed as passt runs without the CAP_NET_RAW capability: it can't create raw IP sockets on the pod, and therefore needs to map packets at Layer-2 to Layer-4 sockets offered by the host kernel.

See also a detailed illustration of the problem and what led to this approach.

pasta

On Linux, regular users can create network namespaces and run application services inside them. However, connecting namespaces to other namespaces and to external hosts requires the creation of network interfaces, such as veth pairs, which in turn requires elevated privileges or the CAP_NET_ADMIN capability. pasta, similarly to slirp4netns, solves this problem by creating a tap interface available to processes in the namespace, and mapping network traffic outside the namespace using native Layer-4 sockets.

Existing approaches typically implement a full, generic TCP/IP stack for this translation between data and transport layers, without the possibility of speeding up local connections, and usually requiring NAT. pasta:

  • avoids the need for a generic, full-fledged TCP/IP stack by coordinating TCP connection dynamics between sender and receiver
  • offers a fast bypass path for local connections: if a process connects to another process on the same host across namespaces, data is directly forwarded using pairs of Layer-4 sockets
  • with default options, maps routing and addressing information to the namespace, avoiding any need for NAT

Features

: done/supported, : out of scope, 🛠: in progress/being considered : nice-to-have, eventually

Protocols

  • IPv4
    • all features, except for
      • fragmentation
  • IPv6
    • all features, except for
      • fragmentation
      • jumbograms
  • TCP
  • UDP
  • ICMP/ICMPv6 Echo
  • IGMP/MLD proxy
  • SCTP

Portability

Security

  • no dynamic memory allocation (sbrk(2), brk(2), mmap(2) blocked via seccomp)
  • root operation not allowed outside user namespaces
  • all capabilities dropped, other than CAP_NET_BIND_SERVICE (if granted)
  • with default options, user, mount, IPC, UTS, PID namespaces are detached
  • no external dependencies (other than a standard C library)
  • restrictive seccomp profiles (34 syscalls allowed for passt, 43 for pasta on x86_64)
  • examples of AppArmor and SELinux profiles available
  • static checkers in continuous integration (clang-tidy, cppcheck)
  • clearly defined boundary-checked packet abstraction
  • 🛠️ ~5 000 LoC target
  • fuzzing, packetdrill tests
  • stricter synflood protection
  • 💡 add your ideas

Configurability

  • all addresses, ports, port ranges
  • optional NAT, not required
  • all protocols
  • pasta: auto-detection of bound ports
  • run-time configuration of port ranges without autodetection
  • configuration of port ranges for autodetection
  • 💡 add your ideas

Performance

  • maximum two (cache hot) copies on every data path
  • pasta: zero-copy for local connections by design (no configuration needed)
  • generalised coalescing and batching on every path for every supported protocol
  • 4 to 50 times IPv4 TCP throughput of existing, conceptually similar solutions depending on MTU (UDP and IPv6 hard to compare)
  • vhost-user support for maximum one copy on every data path and lower request-response latency
  • multithreading
  • raw IP socket support if CAP_NET_RAW is granted
  • eBPF support (might not improve performance over vhost-user)

Interfaces

Availability

Services

  • built-in ARP proxy
  • minimalistic DHCP server
  • minimalistic NDP proxy with router advertisements and SLAAC support
  • minimalistic DHCPv6 server
  • fine-grained configurability of DHCP, NDP, DHCPv6 options

Interfaces and Environment

passt exchanges packets with qemu via a UNIX domain socket, using the socket back-end in qemu. This has been supported since qemu 7.2.

For older versions, the qrap wrapper can be used to connect to a UNIX domain socket and to start qemu, which can now use the file descriptor that's already opened.

This approach, compared to using a tap device, doesn't require any security capabilities, as we don't need to create any interface.
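As a rough illustration of the first step described above (connecting to a UNIX domain socket and obtaining a file descriptor that can be handed over), here is a minimal sketch. It is not the qrap implementation: `connect_unix()` is a hypothetical helper, and the socket path is whatever the caller chooses:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to the UNIX domain stream socket at 'path'.
 * Returns a connected file descriptor, or -1 on error. No capabilities are
 * needed: this only uses the ordinary sockets API. */
int connect_unix(const char *path)
{
	struct sockaddr_un addr;
	int s;

	assert(path);

	/* The path, including its terminator, must fit in sun_path */
	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	s = socket(AF_UNIX, SOCK_STREAM, 0);
	if (s < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);	/* length checked above */

	if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(s);
		return -1;
	}

	return s;
}
```

The returned descriptor stays open across execve(2) unless it is marked close-on-exec, which is what allows a wrapper to pass it to a program it starts afterwards.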

pasta runs out of the box with any recent (post-3.8) Linux kernel.

Services

passt and pasta provide some minimalistic implementations of networking services:

  • ARP proxy, that resolves the address of the host (which is used as gateway) to the original MAC address of the host
  • DHCP server, a simple implementation handing out one single IPv4 address to the guest or namespace, namely, the same address as the first one configured for the upstream host interface, and passing the nameservers configured on the host
  • NDP proxy, which can also assign prefix and nameserver using SLAAC
  • DHCPv6 server: a simple implementation handing out one single IPv6 address to the guest or namespace, namely, the same address as the first one configured for the upstream host interface, and passing the nameservers configured on the host

Addresses

For IPv4, the guest or namespace is assigned, via DHCP, the same address as the upstream interface of the host, and the same default gateway as the default gateway of the host. Addresses are translated in case the guest is seen using a different address from the assigned one.

For IPv6, the guest or namespace is assigned, via SLAAC, a prefix derived from the address of the upstream interface of the host, the same default route as the default route of the host, and, if a DHCPv6 client is running in the guest or namespace, also the same address as the upstream address of the host. This means that, with a DHCPv6 client in the guest or namespace, addresses don't need to be translated. Should the client use a different address, the destination address is translated for packets going to the guest or to the namespace.

Local connections with passt

For UDP and TCP, for both IPv4 and IPv6, packets from the host addressed to a loopback address are forwarded to the guest with their source address changed to the address of the gateway or first hop of the default route. This mapping is reversed in the other direction.
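The mapping can be sketched as a pair of pure functions. This is an IPv4-only illustration with hypothetical names (`map_src_to_guest()`, `map_dst_from_guest()`), not the code passt uses, and it assumes the reverse mapping always targets 127.0.0.1:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* True if 'a' (network byte order) falls within 127.0.0.0/8 */
static int is_loopback4(struct in_addr a)
{
	return (ntohl(a.s_addr) >> 24) == 127;
}

/* Host to guest: a loopback source address is presented to the guest as
 * the gateway address 'gw', any other source is left untouched. */
struct in_addr map_src_to_guest(struct in_addr src, struct in_addr gw)
{
	assert(!is_loopback4(gw));

	return is_loopback4(src) ? gw : src;
}

/* Guest to host: the reverse mapping, a destination equal to the gateway
 * address becomes the loopback address (assumed here to be 127.0.0.1). */
struct in_addr map_dst_from_guest(struct in_addr dst, struct in_addr gw)
{
	struct in_addr lo = { .s_addr = htonl(INADDR_LOOPBACK) };

	return dst.s_addr == gw.s_addr ? lo : dst;
}
```

With a gateway of 192.0.2.1, a packet from 127.0.0.1 on the host would reach the guest with source 192.0.2.1, and a reply sent by the guest to 192.0.2.1 would be delivered to 127.0.0.1 on the host.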

Local connections with pasta

Packets addressed to a loopback address in either namespace are directly forwarded to the corresponding (or configured) port in the other namespace. Similarly to passt, packets from the non-init namespace addressed to the default gateway, which are therefore sent via the tap device, will have their destination address translated to the loopback address.

Protocols

passt and pasta support TCP, UDP and ICMP/ICMPv6 echo (requests and replies). More details about the TCP implementation are described in the theory of operation, and similarly for UDP.

An IGMP/MLD proxy is currently a work in progress.

Ports

passt

To avoid the need for explicit port mapping configuration, passt can bind to all unbound non-ephemeral (0-49152) TCP and UDP ports. Binding to low ports (0-1023) will fail without additional capabilities, and ports already bound (service proxies, etc.) will also not be used. Smaller subsets of ports, with port translations, are also configurable.

UDP ephemeral ports are bound dynamically, as the guest uses them.

If all ports are forwarded, service proxies and other services running in the container need to be started before passt starts.
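The "bind whatever is still free" behaviour can be illustrated with a short sketch. The helper names `try_bind_tcp()` and `bind_free_ports()` are hypothetical, and the sketch only covers TCP on the IPv4 loopback address, which is an assumption made to keep the example harmless to run. Ports that are already bound, or that need additional capabilities, simply fail to bind and are skipped:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: try to bind a TCP socket to 'port' on 127.0.0.1.
 * Returns the socket, or -1 if the port can't be bound (for instance
 * EADDRINUSE for a port already in use, EACCES for a privileged port). */
int try_bind_tcp(uint16_t port)
{
	struct sockaddr_in addr;
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);

	if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(s);
		return -1;
	}

	return s;
}

/* Bind every port in [first, last] that is still free, storing at most
 * 'max' sockets in 'fds'. Returns the number of ports bound. */
int bind_free_ports(unsigned first, unsigned last, int *fds, int max)
{
	unsigned port;
	int n = 0;

	assert(fds && first <= last && last <= 65535);

	for (port = first; port <= last && n < max; port++) {
		int s = try_bind_tcp((uint16_t)port);

		if (s >= 0)
			fds[n++] = s;
	}

	return n;
}
```

This also shows why start-up order matters: a service that tries to bind a port after such a loop has run would find the port already taken.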

pasta

With default options, pasta scans for bound ports on init and non-init namespaces, and automatically forwards them from the other side. Port forwarding is fully configurable with command line options.

Demo

pasta

passt

Continuous Integration

See also the test logs.

Performance

Try it

passt

  • build from source:

      git clone https://passt.top/passt
      cd passt
      make
    
    • alternatively, install one of the available packages

      Static binaries and packages are simply built with:

        make pkgs
      
  • have a look at the man page for synopsis and options:

      man ./passt.1
    
  • run the demo script, which detaches user and network namespaces, configures the new network namespace using pasta, starts passt and, optionally, qemu:

      doc/demo.sh
    
  • alternatively, you can use libvirt to start QEMU

  • and that's it, you should now have TCP connections, UDP, and ICMP/ICMPv6 echo working from/to the guest for IPv4 and IPv6

  • to connect to a service on the VM, just connect to the same port directly with the address of the current network namespace

pasta

  • build from source:

      git clone https://passt.top/passt
      cd passt
      make
    
    • alternatively, install one of the available packages

      Static binaries and packages are simply built with:

        make pkgs
      
  • have a look at the man page for synopsis and options:

      man ./pasta.1
    
  • start pasta with:

      ./pasta
    
    • alternatively, use it directly with Podman (since Podman 4.3.2, or with commit aa47e05ae4a0):

        podman run --net=pasta ...
      
  • you're now inside a new user and network namespace. For IPv6, SLAAC happens right away as pasta sets up the interface, but DHCPv6 support is available as well. For IPv4, configure the interface with a DHCP client:

      dhclient
    

    and, optionally:

      dhclient -6
    
    • alternatively, start pasta as:

        ./pasta --config-net
      

      to let pasta configure networking in the namespace by itself, using netlink

    • ...or run the demo script:

        doc/demo.sh
      
  • and that's it, you should now have TCP connections, UDP, and ICMP/ICMPv6 echo working from/to the namespace for IPv4 and IPv6

  • to connect to a service inside the namespace, just connect to the same port using the loopback address.

Contribute

Mailing Lists

  • Submit, review patches, and discuss development ideas on passt-dev. Please refer to the CONTRIBUTING.md file for details.

  • Ask your questions and discuss usage needs on passt-user

Bug Reports and Feature Requests

Chat

Weekly development meeting

  • Open to everybody! Feel free to join and propose a different time directly on the agenda.

Security and Vulnerability Reports

  • Please send an email to passt-sec, private list, no subscription required