passt

mirror of https://passt.top/passt synced 2025-06-21 14:25:34 +02:00

Author	SHA1	Message	Date
David Gibson	fc6ee68ad3	udp: Merge vhost-user and "buf" listening socket paths udp_buf_listen_sock_data() and udp_vu_listen_sock_data() now have effectively identical structure. The forwarding functions used for flow specific sockets (udp_buf_sock_to_tap(), udp_vu_sock_to_tap() and udp_sock_to_sock()) also now take a number of datagrams. This means we can re-use them for the listening socket path, just passing '1' so they handle a single datagram at a time. This allows us to merge both the vhost-user and flow specific paths into a single, simpler udp_listen_sock_data(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-04-07 21:43:52 +02:00
David Gibson	0304dd9c34	udp: Split spliced forwarding path from udp_buf_reply_sock_data() udp_buf_reply_sock_data() can handle forwarding data either from socket to socket ("splicing") or from socket to tap. It has a test on each datagram for which case we're in, but that will be the same for everything in the batch. Split out the spliced path into a separate udp_sock_to_sock() function. This leaves udp_{buf,vu}_reply_sock_data() handling only forwards from socket to tap, so rename and simplify them accordingly. This makes the code slightly longer for now, but will allow future cleanups to shrink it back down again. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Fix typos in comments to udp_sock_recv() and udp_vu_listen_sock_data()] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-04-07 21:41:32 +02:00
David Gibson	5221e177e1	udp: Parameterize number of datagrams handled by udp_*_reply_sock_data() Both udp_buf_reply_sock_data() and udp_vu_reply_sock_data() internally decide what the maximum number of datagrams they will forward is. We have some upcoming reasons to allow the caller to decide that instead, so make the maximum number of datagrams a parameter for both of them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-04-07 21:31:54 +02:00
David Gibson	84ab1305fa	udp: Polish udp_vu_sock_info() and remove from vu specific code udp_vu_sock_info() uses MSG_PEEK to look ahead at the next datagram to be received and gets its source address. Currently we only use it in the vhost-user path, but there's nothing inherently vhost-user specific about it. We have upcoming uses for it elsewhere so rename and move to udp.c. While we're there, polish its error reporting a little. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Drop excess newline before udp_sock_recv()] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-04-07 21:29:23 +02:00
David Gibson	76e554d9ec	udp: Simplify updates to UDP flow timestamp Since UDP has no built in knowledge of connections, the only way we know when we're done with a UDP flow is a timeout with no activity. To keep track of this struct udp_flow includes a timestamp to record the last time we saw traffic on the flow. For data from listening sockets and from tap, this is done implicitly via udp_flow_from_{sock,tap}() but for reply sockets it's done explicitly. However, that logic is duplicated between the vhost-user and "buf" paths. Make it common in udp_reply_sock_handler() instead. Technically this is a behavioural change: previously if we got an EPOLLIN event, but there wasn't actually any data we wouldn't update the timestamp, now we will. This should be harmless: if there's an EPOLLIN we expect there to be data, and even if there isn't the worst we can do is mildly delay the cleanup of a stale flow. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-04-02 11:30:26 +02:00
David Gibson	269cf6a12a	udp: Share more logic between vu and non-vu reply socket paths Share some additional miscellaneous logic between the vhost-user and "buf" paths for data on udp reply sockets. The biggest piece is error handling of cases where we can't forward between the two pifs of the flow. We also make common some more simple logic locating the correct flow and its parameters. This adds some lines of code due to extra comment lines, but nonetheless reduces logic duplication. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-03-26 21:34:28 +01:00
David Gibson	d924b7dfc4	udp_vu: Factor things out of udp_vu_reply_sock_data() loop At the start of every cycle of the loop in udp_vu_reply_sock_data() we: ASSERT that uflow is not NULL; check if the target pif is PIF_TAP; and initialize the v6 boolean. However, all of these depend only on the flow, which doesn't change across the loop. This is probably a duplication from udp_vu_listen_sock_data(), where the flow can be different for each packet. For the reply socket case, however, factor that logic out of the loop. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-03-26 21:34:26 +01:00
David Gibson	5a977c2f4e	udp: Simplify checking of epoll event bits udp_{listen,reply}_sock_handler() can accept both EPOLLERR and EPOLLIN events. However, unlike most epoll event handlers we don't check the event bits right there. EPOLLERR is checked within udp_sock_errs() which we call unconditionally. Checking EPOLLIN is still more buried: it is checked within both udp_sock_recv() and udp_vu_sock_recv(). We can simplify the logic and pass less extraneous parameters around by moving the checking of the event bits to the top level event handlers. This makes udp_{buf,vu}_{listen,reply}_sock_handler() no longer general event handlers, but specific to EPOLLIN events, meaning new data. So, rename those functions to udp_{buf,vu}_{listen,reply}_sock_data() to better reflect their function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-03-26 21:34:23 +01:00
David Gibson	89b203b851	udp: Common invocation of udp_sock_errs() for vhost-user and "buf" paths The vhost-user and non-vhost-user paths for both udp_listen_sock_handler() and udp_reply_sock_handler() are more or less completely separate. Both, however, start with essentially the same invocation of udp_sock_errs(), so that can be made common. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-03-26 21:34:11 +01:00
Jon Maloy	55431f0077	udp: create and send ICMPv4 to local peer when applicable When a local peer sends a UDP message to a non-existing port on an existing remote host, that host will return an ICMP message containing the error code ICMP_PORT_UNREACH, plus the header and the first eight bytes of the original message. If the sender socket has been connected, it uses this message to issue a "Connection Refused" event to the user. Until now, we have only read such events from the externally facing socket, but we don't forward them back to the local sender because we cannot read the ICMP message directly to user space. Because of this, the local peer will hang and wait for a response that never arrives. We now fix this for IPv4 by recreating and forwarding a correct ICMP message back to the internal sender. We synthesize the message based on the information in the extended error structure, plus the returned part of the original message body. Note that for the sake of completeness, we even produce ICMP messages for other error codes. We have noticed that at least ICMP_PROT_UNREACH is propagated as an error event back to the user. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Jon Maloy <jmaloy@redhat.com> [sbrivio: fix cppcheck warning: udp_send_conn_fail_icmp4() doesn't modify 'in', it can be declared as const] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2025-03-07 02:21:19 +01:00
Laurent Vivier	8996d183c5	udp_vu: update segment size In udp_vu_sock_recv(), collect a segment with a size defined to IP_MAX_MTU + ETH_HLEN + sizeof(struct virtio_net_hdr_mrg_rxbuf). The original version double counted the IP header: IP_MAX_MTU includes the IP header, and so did hdrlen. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-12-05 21:08:58 +01:00
David Gibson	67151090bc	iov, checksum: Replace csum_iov() with csum_iov_tail() We usually want to checksum only the tail part of a frame, excluding at least some headers. csum_iov() does that for a frame represented as an IO vector, not actually summing the entire IO vector. We now have struct iov_tail to explicitly represent this construct, so replace csum_iov() with csum_iov_tail() taking that representation rather than 3 parameters. We propagate the same change to csum_udp4() and csum_udp6() which take similar parameters. This slightly simplifies the code, and will allow some further simplifications as struct iov_tail is more widely used. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-11-28 14:03:16 +01:00
Laurent Vivier	28997fcb29	vhost-user: add vhost-user add virtio and vhost-user functions to connect with QEMU. $ ./passt --vhost-user and # qemu-system-x86_64 ... -m 4G \ -object memory-backend-memfd,id=memfd0,share=on,size=4G \ -numa node,memdev=memfd0 \ -chardev socket,id=chr0,path=/tmp/passt_1.socket \ -netdev vhost-user,id=netdev0,chardev=chr0 \ -device virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0 \ ... Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: as suggested by lvivier, include <netinet/if_ether.h> before including <linux/if_ether.h> as C libraries such as musl define __UAPI_DEF_ETHHDR in <netinet/if_ether.h> if they already have a definition of struct ethhdr] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-11-27 16:47:32 +01:00
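
Commit 67151090bc above describes checksumming only the tail of a frame held in an IO vector, skipping some leading headers. The following is a minimal sketch of that idea, not the code from the repository: struct tail_sketch and csum_tail_sketch() are hypothetical names invented for illustration, and the real struct iov_tail and csum_iov_tail() may well differ in layout, signature and byte order of the result.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the "IO vector plus offset" idea: the tail is
 * everything from byte offset 'off' to the end of the vector. */
struct tail_sketch {
	const struct iovec *iov;	/* buffers making up the frame */
	size_t cnt;			/* number of entries in iov */
	size_t off;			/* bytes to skip from the start */
};

/* Internet checksum (RFC 1071) over the tail only. Bytes are paired as
 * big-endian 16-bit words even when a pair straddles two buffers. The
 * result is returned as a host-order integer. */
static uint16_t csum_tail_sketch(const struct tail_sketch *t)
{
	uint64_t sum = 0;
	size_t skip = t->off;
	size_t pos = 0;		/* byte position within the tail */
	size_t i, j;

	for (i = 0; i < t->cnt; i++) {
		const uint8_t *p = t->iov[i].iov_base;
		size_t len = t->iov[i].iov_len;

		if (skip >= len) {	/* buffer lies entirely before the tail */
			skip -= len;
			continue;
		}

		for (j = skip; j < len; j++, pos++)
			sum += (pos & 1) ? p[j] : (uint64_t)p[j] << 8;
		skip = 0;
	}

	while (sum >> 16)	/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Passing one descriptor instead of a vector, a count and an offset keeps call sites short, which is the simplification the commit message refers to. With the RFC 1071 example bytes 00 01 f2 03 f4 f5 f6 f7 placed after a three-byte header and split across two buffers, this sketch returns 0x220d.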

13 commits