passt

Author	SHA1	Message	Date
David Gibson	5566386f5f	treewide: Standardise variable names for various packet lengths At various points we need to track the lengths of a packet including or excluding various different sets of headers. We don't always use the same variable names for doing so. Worse in some places we use the same name for different things: e.g. tcp_fill_headers[46]() use ip_len for the length including the IP headers, but then tcp_send_flag() which calls it uses it to mean the IP payload length only. To improve clarity, standardise on these names: dlen: L4 protocol payload length ("data length") l4len: plen + length of L4 protocol header l3len: l4len + length of IPv4/IPv6 header l2len: l3len + length of L2 (ethernet) header Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-05-02 16:13:23 +02:00
David Gibson	9e22c53aa9	checksum: Make csum_ip4_header() take a host endian length csum_ip4_header() takes the packet length as a network endian value. In general it's very error-prone to pass non-native-endian values as a raw integer. It's particularly bad here because this differs from other checksum functions (e.g. proto_ipv4_header_psum()) which take host native lengths. It turns out all the callers have easy access to the native endian value, so switch it to use host order like everything else. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-05-02 16:13:21 +02:00
David Gibson	1095a7b0c9	treewide: Remove misleading and redundant endianness notes In general, it's much less error-prone to have the endianness of values implied by the type, rather than just noting it in comments. We can't always easily avoid it, because C, but we can do so when possible. struct in_addr and in6_addr are always encoded network endian, so noting it explicitly isn't useful. Remove them. In some cases we also have endianness notes on uint8_t parameters, which doesn't make sense: for a single byte endianness is irrelevant. Remove those too. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-05-02 16:13:16 +02:00
David Gibson	c27ca91564	checksum: Use proto_ipv6_header_psum() for ICMPv6 as well `7df624e79` ("checksum: introduce functions to compute the header part checksum for TCP/UDP") introduced a helper to compute the partial checksum for the IPv6 pseudo-header used in L4 protocol checksums. It used it in csum_udp6() for UDP packets, but not in csum_icmp6() for the identical calculation in csum_icmp6() for ICMPv6 packets. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-05-02 16:13:03 +02:00
Laurent Vivier	7df624e79a	checksum: introduce functions to compute the header part checksum for TCP/UDP The TCP and UDP checksums are computed using the data in the TCP/UDP payload but also some informations in the IP header (protocol, length, source and destination addresses). We add two functions, proto_ipv4_header_psum() and proto_ipv6_header_psum(), to compute the checksum of the IP header part. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-ID: <20240303135114.1023026-8-lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-03-06 08:03:47 +01:00
Laurent Vivier	feb4900c25	checksum: use csum_ip4_header() in udp.c and tcp.c We can find the same function to compute the IPv4 header checksum in tcp.c, udp.c and tap.c Use the function defined for tap.c, csum_ip4_header(), but with the code used in tcp.c and udp.c as it doesn't need a fully initialiazed IPv4 header, only protocol, tot_len, saddr and daddr. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-ID: <20240303135114.1023026-7-lvivier@redhat.com> [dwg: Fix weird cppcheck regression; it appears to be a problem in pre-existing code, but somehow this patch is exposing it] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-03-06 08:03:44 +01:00
Laurent Vivier	e289d287c6	checksum: add csum_iov() Introduce the function csum_unfolded() that computes the unfolded 32-bit checksum of a data buffer, and call it from csum() that returns the folded value. Introduce csum_iov() that computes the checksum using csum_folded() on all vectors of the iovec array and returns the folded result. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-ID: <20240303135114.1023026-4-lvivier@redhat.com> [dwg: Fixed trivial cppcheck & clang-tidy regressions] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-03-06 08:03:36 +01:00
Laurent Vivier	907621eaae	checksum: align buffers If buffer is not aligned use sum_16b() only on the not aligned part, and then use csum_avx2() on the remaining part Remove unneeded now function csum_unaligned(). Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-ID: <20240303135114.1023026-3-lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-03-06 08:03:32 +01:00
David Gibson	4681ea09bc	checksum: Don't use linux/icmp.h when netinet/ip_icmp.h will do In most places where we need to get ICMP definitions, we get them from <netinet/ip_icmp.h>. However in checksum.c we instead include <linux/icmp.h>. Change it to use <netinet/ip_icmp.h> for consistency. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2023-12-27 19:29:45 +01:00
David Gibson	04b10a8d90	siphash, checksum: Move TBAA explanation to checksum.c A number of checksum and hash functions require workarounds for the odd behaviour of Type-Baased Alias Analysis. We have a detailed comment about this on siphash_8b() and other functions reference that. Move the main comment to csume_16b() instead, because we're going to reorganise things in siphash.c. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2023-09-30 12:40:51 +02:00
Stefano Brivio	ca2749e1bd	passt: Relicense to GPL 2.0, or any later version In practical terms, passt doesn't benefit from the additional protection offered by the AGPL over the GPL, because it's not suitable to be executed over a computer network. Further, restricting the distribution under the version 3 of the GPL wouldn't provide any practical advantage either, as long as the passt codebase is concerned, and might cause unnecessary compatibility dilemmas. Change licensing terms to the GNU General Public License Version 2, or any later version, with written permission from all current and past contributors, namely: myself, David Gibson, Laine Stump, Andrea Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2023-04-06 18:00:33 +02:00
Stefano Brivio	a48c5c2abf	treewide: Disable gcc strict aliasing rules as needed, drop workarounds Recently, commit `4ddbcb9c0c` ("tcp: Disable optimisations for tcp_hash()") worked around yet another issue we hit with gcc 12 and '-flto -O2': some stores affecting the input data to siphash_20b() were omitted altogether, and tcp_hash() wouldn't get the correct hash for incoming connections. Digging further into this revealed that, at least according to gcc's interpretation of C99 aliasing rules, passing pointers to functions with different types compared to the effective type of the object (for example, a uint8_t pointer to an anonymous struct, as it happens in tcp_hash()), doesn't guarantee that stores are not reordered across the function call. This means that, in general, our checksum and hash functions might not see parts of input data that was intended to be provided by callers. Not even switching from uint8_t to character types, which should be appropriate here, according to C99 (ISO/IEC 9899, TC3, draft N1256), section 6.5, "Expressions", paragraph 7: An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [...] — a character type. does the trick. I guess this is also subject to interpretation: casting passed pointers to character types, and then using those as different types, might still violate (dubious) aliasing rules. Disable gcc strict aliasing rules for potentially affected functions, which, in turn, disables gcc's Type-Based Alias Analysis (TBAA) optimisations based on those function arguments. Drop the existing workarounds. Also the (seemingly?) bogus 'maybe-uninitialized' warning on the tcp_tap_handler() > tcp_hash() > siphash_20b() path goes away with -fno-strict-aliasing on siphash_20b(). Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>	2023-02-27 18:54:13 +01:00
David Gibson	7c7b68dbe0	Use typing to reduce chances of IPv4 endianness errors We recently corrected some errors handling the endianness of IPv4 addresses. These are very easy errors to make since although we mostly store them in network endianness, we sometimes need to manipulate them in host endianness. To reduce the chances of making such mistakes again, change to always using a (struct in_addr) instead of a bare in_addr_t or uint32_t to store network endian addresses. This makes it harder to accidentally do arithmetic or comparisons on such addresses as if they were host endian. We introduce a number of IN4_IS_ADDR_*() helpers to make it easier to directly work with struct in_addr values. This has the additional benefit of making the IPv4 and IPv6 paths more visually similar. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-11-04 12:04:24 +01:00
Stefano Brivio	e67039f712	checksum: Fix calculation for ICMP checksum on IPv4 We need to zero out the checksum field before calculating the checksum, of course. I have no idea how this passed the "icmp" test set, looking into it. Reported-by: Paul Holzinger <pholzing@redhat.com> Fixes: `67ab617172` ("Add csum_icmp4() helper for calculating ICMP checksums") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>	2022-10-26 06:28:06 +02:00
David Gibson	f72b63e92f	Remove support for TCP packets from tap_ip_send() tap_ip_send() is never used for TCP packets, we're unlikely to use it for that in future, and the handling of TCP packets makes other cleanups unnecessarily awkward. Remove it. This is the only user of csum_tcp4(), so we can remove that as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:40 +02:00
David Gibson	3d8ccb44a6	Add csum_ip4_header() helper to calculate IPv4 header checksums We calculate IPv4 header checksums in at least two places, in dhcp() and in tap_ip_send. Add a helper to handle this calculation in both places. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:34 +02:00
David Gibson	bd4be308fc	Add csum_udp4() helper for calculating UDP over IPv4 checksums At least two places in passt fill in UDP over IPv4 checksums, although since UDP checksums are optional with IPv4 that just amounts to storing a 0 (in tap_ip_send()) or leaving a 0 from an earlier initialization (in dhcp()). For consistency, add a helper for this "calculation". Just for the heck of it, add the option (compile time disabled for now) to calculate real UDP checksums. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:32 +02:00
David Gibson	6905ac75ec	Add csum_udp6() helper for calculating UDP over IPv6 checksums Add a helper for calculating UDP checksums when used over IPv6 For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the UDP header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:29 +02:00
David Gibson	67ab617172	Add csum_icmp4() helper for calculating ICMP checksums Although tap_ip_send() is currently the only place calculating ICMP checksums, create a helper function for symmetry with ICMPv6. For future flexibility it allows the ICMPv6 header and payload to be in separate buffers. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:26 +02:00
David Gibson	7abd2b0d72	Add csum_icmp6() helper for calculating ICMPv6 checksums At least two places in passt calculate ICMPv6 checksums, ndp() and tap_ip_send(). Add a helper to handle this calculation in both places. For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the ICMPv6 header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:21 +02:00
Stefano Brivio	06aa26fcf3	Makefile: Hack for optimised-away store in ndp() before checksum calculation With gcc 11 and 12, passing -flto, or -flto=auto, and -O2, intra-procedural optimisation gets rid of a fundamental bit in ndp(): the store of hop_limit in the IPv6 header, before the checksum is calculated, which on x86_64 looks like this: ip6hr->hop_limit = IPPROTO_ICMPV6; b8c0: c6 44 24 35 3a movb $0x3a,0x35(%rsp) Here, hop_limit is temporarily set to the protocol number, to conveniently get the IPv6 pseudo-header for ICMPv6 checksum calculation in memory. With LTO, the assignment just disappears from the binary. This is rather visible as NDP messages get a wrong checksum, namely the expected checksum plus 58, and they're ignored by the guest or in the namespace, meaning we can't get any IPv6 routes, as reported by Wenli Quan. The issue affects a significant number of distribution builds, including the ones for CentOS Stream 9, EPEL 9, Fedora >= 35, Mageia Cauldron, and openSUSE Tumbleweed. As a quick workaround, declare csum_unaligned() as "noipa" for gcc 11 and 12, with -flto and -O2. This disables inlining and cloning, which causes the assignment to be compiled again. Leave a TODO item: we should figure out if a gcc issue has already been reported, and report one otherwise. There's no apparent justification as to why the store could go away. Reported-by: Wenli Quan <wquan@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2129713 Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-09-29 12:23:11 +02:00
Stefano Brivio	292c185553	passt: Address new clang-tidy warnings from LLVM 13.0.1 clang-tidy from LLVM 13.0.1 reports some new warnings from these checkers: - altera-unroll-loops, altera-id-dependent-backward-branch: ignore for the moment being, add a TODO item - bugprone-easily-swappable-parameters: ignore, nothing to do about those - readability-function-cognitive-complexity: ignore for the moment being, add a TODO item - altera-struct-pack-align: ignore, alignment is forced in protocol headers - concurrency-mt-unsafe: ignore for the moment being, add a TODO item Fix bugprone-implicit-widening-of-multiplication-result warnings, though, that's doable and they seem to make sense. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-01-30 02:59:12 +01:00
Stefano Brivio	627e18fa8a	passt: Add cppcheck target, test, and address resulting warnings ...mostly false positives, but a number of very relevant ones too, in tcp_get_sndbuf(), tcp_conn_from_tap(), and siphash PREAMBLE(). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 09:41:13 +02:00
Stefano Brivio	dd942eaa48	passt: Fix build with gcc 7, use std=c99, enable some more Clang checkers Unions and structs, you all have names now. Take the chance to enable bugprone-reserved-identifier, cert-dcl37-c, and cert-dcl51-cpp checkers in clang-tidy. Provide a ffsl() weak declaration using gcc built-in. Start reordering includes, but that's not enough for the llvm-include-order checker yet. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 04:26:08 +02:00
Stefano Brivio	087b5f4dbb	LICENSES: Add license text files, add missing notices, fix SPDX tags SPDX tags don't replace license files. Some notices were missing and some tags were not according to the SPDX specification, too. Now reuse --lint from the REUSE tool (https://reuse.software/) passes. Reported-by: Martin Hauke <mardnh@gmx.de> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-20 08:29:30 +02:00
Stefano Brivio	4b12cf94f0	checksum: Stream load into four registers at a time with > 128 bytes ...and further interleave register usage. This brings the csum() overhead reported by perf(1) for 30 seconds of 64KiB TCP IPv4 frames, host to guest, from 7.2% to 5.8%. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-15 17:04:46 +02:00
Stefano Brivio	74f29d3148	checksum: Interleave lo/hi sums while folding into 128-bit sums, drop TODO I left a TODO and never checked -- this actually seems to slightly improve CPIs on AMD Naples (two 128-bit FMA units glued together). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-15 16:59:12 +02:00
Stefano Brivio	17765f8de0	checksum: Introduce AVX2 implementation, unify helpers Provide an AVX2-based function using compiler intrinsics for TCP/IP-style checksums. The load/unpack/add idea and implementation is largely based on code from BESS (the Berkeley Extensible Software Switch) licensed as 3-Clause BSD, with a number of modifications to further decrease pipeline stalls and to minimise cache pollution. This speeds up considerably data paths from sockets to tap interfaces, decreasing overhead for checksum computation, with 16-64KiB packet buffers, from approximately 11% to 7%. The rest is just syscalls at this point. While at it, provide convenience targets in the Makefile for avx2, avx2_debug, and debug targets -- these simply add target-specific CFLAGS to the build. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-07-26 07:18:50 +02:00

28 commits