Introduce ip.[ch] file to encapsulate IP protocol handling functions and
structures. Modify various files to include the new header ip.h when
it's needed.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20240303135114.1023026-5-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently port_fwd.[ch] contains helpers related to port forwarding,
particular automatic port forwarding. We're planning to allow much more
flexible sorts of forwarding, including both port translation and NAT based
on the flow table. This will subsume the existing port forwarding logic,
so rename port_fwd.[ch] to fwd.[ch] with matching updates to all the names
within.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Add this helper to format an inany into either IPv4 or IPv6 text
format as appropriate.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Introduce functions to copy to/from a buffer from/to an iovec array,
to compute data length in in bytes of an iovec and to copy memory from
an iovec to another.
iov_from_buf(), iov_to_buf(), iov_size(), iov_copy().
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-ID: <20240217150725.661467-2-lvivier@redhat.com>
[dwg: Small changes to suppress cppcheck warnings]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Don't run cppcheck to find out if the --check-level=exhaustive option
is available, unless we're actually going to run cppcheck later.
To avoid this, move this check under the cppcheck target, and
implement it in shell script instead of using Makefile directives,
because we can't easily implement conditionals in recipes.
Reported-by: Rahil Bhimjiani <me@rahil.website>
Link: https://bugs.gentoo.org/920795
Fixes: 8640d62af7 ("cppcheck: Use "exhaustive" level checking when available")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Future debugging will want to identify a specific passt interface. We make
a distinction in these helpers between the name of the *type* of pif, and
name of the pif itself. For the time being these are always the same
thing, since we have at most instance of each type of pif. However, that
might change in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We want to generalise "connection" tracking to things other than true TCP
connections. Continue implenenting this by renaming the TCP connection
table to the "flow table" and moving it to flow.c. The definitions are
split between flow.h and flow_table.h - we need this separation to avoid
circular dependencies: the definitions in flow.h will be needed by many
headers using the flow mechanism, but flow_table.h needs all those protocol
specific headers in order to define the full flow table entry.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently TCP connections use a 1-bit selector, 'spliced', to determine the
rest of the contents of the structure. We want to generalise the TCP
connection table to other types of flows in other protocols. Make a start
on this by replacing the tcp_conn_common structure with a new flow_common
structure with an enum rather than a simple boolean indicating the type of
flow.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When we plan to use valgrind, we need to build passt a bit differently:
* We need debug symbols so that valgrind can match problems it finds to
meaningful locations
* We need to allow additional syscalls in the seccomp filter, because
valgrind's wrappers need them
Currently we also disable optimization (-O0). This is unfortunate, because
it will make valgrind tests even slower than they'd otherwise be. Worse,
it's possible that the asm behaviour without optimization might be
different enough that valgrind could miss a use of uninitialized variable
or other fault it would detect.
I suspect this was originally done because without it inlining could mean
that suppressions we use don't reliably match the places we want them to.
Alas, it turns out this is true even with -O0. We've now implemented a
more robust workaround for this (explicit ((noinline)) attributes when
compiled with -DVALGRIND). So, we can re-enable optimization for valgrind
builds, making them closer to regular builds.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
valgrind complains if we pass a NULL buffer to recv(), even if we use
MSG_TRUNC, in which case it's actually safe. For a long time we've had
a valgrind suppression for this. It singles out the recv() in
tcp_sock_consume(), the only place we use MSG_TRUNC.
However, tcp_sock_consume() only has a single caller, which makes it a
prime candidate for inlining. If inlined, it won't appear on the stack and
valgrind won't match the correct suppression.
It appears that certain compiler versions (for example gcc-13.2.1 in
Fedora 39) will inline this function even with the -O0 we use for valgrind
builds. This breaks the suppression leading to a spurious failure in the
tests.
There's not really any way to adjust the suppression itself without making
it overly broad (we don't want to match other recv() calls). So, as a hack
explicitly prevent inlining of this function when we're making a valgrind
build. To accomplish this add an explicit -DVALGRIND when making such a
build.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
clang-tidy from LLVM 17.0.3 (which is in Fedora 39) includes a new
"misc-include-cleaner" warning that tries to make sure that headers
*directly* provide the things that are used in the .c file. That sounds
great in theory but is in practice unusable:
Quite a few common things in the standard library are ultimately provided
by OS-specific system headers, but for portability should be accessed via
closer-to-standardised library headers. This will warn constantly about
such cases: e.g. it will want you to include <linux/limits.h> instead of
<limits.h> to get PATH_MAX.
So, suppress this warning globally in the Makefile.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
logmsg() takes printf like arguments, but because it's not a built in, the
compiler won't generate warnings if the format string and parameters don't
match. Enable those by using the format attribute.
Strictly speaking this is a gcc extension, but I believe it is also
supported by some other common compilers. We already use some other
attributes in various places. For now, just use it and we can worry about
compilers that don't support it if it comes up.
This exposes some warnings from existing callers, both in gcc and in
clang-tidy:
- Some are straight out bugs, which we correct
- It's occasionally useful to invoke the logging functions with an empty
string, which gcc objects to, so disable that specific warning in the
Makefile
- Strictly speaking the C standard requires that the parameter for a %p
be a (void *), not some other pointer type. That's only likely to cause
problems in practice on weird architectures with different sized
representations for pointers to different types. Nonetheless add the
casts to make it happy.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We have several possible ways of communicating with other entities. We use
sockets to communicate with the host and other network sites, but also in
a different context to communicate "spliced" channels to a namespace. We
also use a tuntap device or a qemu socket to communicate with the namespace
or guest.
For the time being these are just defined implicitly by how we structure
things. However, there are other communication channels we want to use in
future (e.g. virtio-user), and we want to allow more flexible forwarding
between those. To accomplish that we're going to want a specific way of
referring to those channels.
Introduce the concept of a "passt/pasta interface" or "pif" representing a
specific channel to communicate network data. Each pif is assumed to be
associated with a specific network namespace in the broad sense (that is
as a place where IP addresses have a consistent meaning - not the Linux
specific sense). But there could be multiple pifs communicating with the
same namespace (e.g. the spliced and tap interfaces in pasta).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The implementation of scanning /proc files to do automatic port forwarding
is a bit awkwardly split between procfs_scan_listen() in util.c,
get_bound_ports() and related functions in conf.c and the initial setup
code in conf().
Consolidate all of this into port_fwd.h, which already has some related
definitions, and a new port_fwd.c.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Recent enough cppcheck versions (at least as of cppcheck 2.12) give this
error processing conf.c:
conf.c:1179:1: information: ValueFlow analysis is limited in conf. Use --check-level=exhaustive if full analysis is wanted. [checkLevelNormal]
Adding --check-level=exhaustive doesn't seem to significantly increase the
cppcheck run time for us, so enable it when possible, suppressing that
warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We have a bunch of variants of the siphash functions for different data
sizes. The callers, in tcp.c, need to pack the various values they want to
hash into a temporary structure, then call the appropriate version. We can
avoid the copy into the temporary by directly using the incremental
siphash functions.
The length specific hash functions also have an undocumented constraint
that the data pointer they take must, in fact, be aligned to avoid
unaligned accesses, which may cause crashes on some architectures.
So, prefer the incremental approach and remove the length-specific
functions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We have several workarounds for a clang-tidy bug where the checker doesn't
recognize that a number of system calls write to - and therefore initialise
- a socket address. We can't neatly use a suppression, because the bogus
warning shows up some time after the actual system call, when we access
a field of the socket address which clang-tidy erroneously thinks is
uninitialised.
Consolidate these workarounds into one place by using macros to implement
wrappers around affected system calls which add a memset() of the sockaddr
to silence clang-tidy. This removes the need for the individual memset()
workarounds at the callers - and the somewhat longwinded explanatory
comments.
We can then use a #define to not include the hack in "real" builds, but
only consider it for clang-tidy.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
C11 has some features that will allow us to make some things a bit cleaner.
Alter the Makefile to tell the compiler to allow us to use C11 code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This reverts commit cc2a6bec3c: I
applied that patch by mistake.
Fixes: cc2a6bec3c ("MAKE: Fix parallel builds; .o files; .gitignore; new makedocs")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In practical terms, passt doesn't benefit from the additional
protection offered by the AGPL over the GPL, because it's not
suitable to be executed over a computer network.
Further, restricting the distribution under the version 3 of the GPL
wouldn't provide any practical advantage either, as long as the passt
codebase is concerned, and might cause unnecessary compatibility
dilemmas.
Change licensing terms to the GNU General Public License Version 2,
or any later version, with written permission from all current and
past contributors, namely: myself, David Gibson, Laine Stump, Andrea
Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian
Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
A cross-architecture build might pass a target-specific CC on 'make',
and not on 'make install', and this is what happens in Debian
cross-qa tests.
Given that we select binaries to be installed depending on the target
architecture, this means we would build AVX2 binaries in any case on
a x86_64 build machine.
By overriding TARGET in package build rules, we can tell the Makefile
about the target architecture, also for the 'install' (Makefile)
target.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Debian cross-building automatic checks:
http://crossqa.debian.net/src/passt
currently fail because we don't use the right target architecture and
compiler while building the system call lists and resolving their
numbers in seccomp.sh. Pass ARCH and CC to seccomp.sh and use them.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Recently, commit 4ddbcb9c0c ("tcp: Disable optimisations
for tcp_hash()") worked around yet another issue we hit with gcc 12
and '-flto -O2': some stores affecting the input data to siphash_20b()
were omitted altogether, and tcp_hash() wouldn't get the correct hash
for incoming connections.
Digging further into this revealed that, at least according to gcc's
interpretation of C99 aliasing rules, passing pointers to functions
with different types compared to the effective type of the object
(for example, a uint8_t pointer to an anonymous struct, as it happens
in tcp_hash()), doesn't guarantee that stores are not reordered
across the function call.
This means that, in general, our checksum and hash functions might
not see parts of input data that was intended to be provided by
callers.
Not even switching from uint8_t to character types, which should be
appropriate here, according to C99 (ISO/IEC 9899, TC3, draft N1256),
section 6.5, "Expressions", paragraph 7:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
[...]
— a character type.
does the trick. I guess this is also subject to interpretation:
casting passed pointers to character types, and then using those as
different types, might still violate (dubious) aliasing rules.
Disable gcc strict aliasing rules for potentially affected functions,
which, in turn, disables gcc's Type-Based Alias Analysis (TBAA)
optimisations based on those function arguments.
Drop the existing workarounds. Also the (seemingly?) bogus
'maybe-uninitialized' warning on the tcp_tap_handler() > tcp_hash() >
siphash_20b() path goes away with -fno-strict-aliasing on
siphash_20b().
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Future compilers will not support implicit ints by default, causing
the probe to always fail.
Link: https://bugs.passt.top/show_bug.cgi?id=42
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
These files are left around by emacs amongst other editors.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
If you run the build several times it will fail unnecessarily with:
ln -s passt pasta
ln: failed to create symbolic link 'pasta': File exists
make: *** [Makefile:134: pasta] Error 1
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently, when instructed to open an IPv6 socket, sock_l4() explicitly
sets the IPV6_V6ONLY socket option so that the socket will only respond to
IPv6 connections. Linux (and probably other platforms) allow "dual stack"
sockets: IPv6 sockets which can also accept IPv4 connections.
Extend sock_l4() to be able to make such sockets, by passing AF_UNSPEC as
the address family and no bind address (binding to a specific address would
defeat the purpose). We add a Makefile define 'DUAL_STACK_SOCKETS' to
indicate availability of this feature on the target platform.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
struct tcp_conn stores an address which could be IPv6 or IPv4 using a
union. We can do this without an additional tag by encoding IPv4 addresses
as IPv4-mapped IPv6 addresses.
This approach is useful wider than the specific place in tcp_conn, so
expose a new 'union inany_addr' like this from a new inany.h. Along with
that create a number of helper functions to make working with these "inany"
addresses easier.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently spliced and non-spliced connections use completely independent
tracking structures. We want to unify these, so as a preliminary step move
the definitions for both variants into a new tcp_conn.h header, shared by
tcp.c and tcp_splice.c.
This requires renaming some #defines with the same name but different
meanings between the two cases. In the process we correct some places that
are slightly out of sync between the comments and the code for various
event bit names.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
clang-tools 15.0.0 appears to have added a new warning that will always
complain about assignments in if statements, which we use in a number of
places in passt/pasta. Encountered on Fedora 37 with
clang-tools-extra-15.0.0-3.fc37.x86_64.
Suppress the new warning so that we can compile and test.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The AUDIT_ARCH defines in seccomp.h corresponding to HPPA are
AUDIT_ARCH_PARISC and AUDIT_ARCH_PARISC64.
Build error spotted in Debian's buildd log on
phantom.physik.fu-berlin.de.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
On mips64el, gcc -dumpmachine correctly reports mips64el as
architecture prefix, but for some reason seccomp.h defines
AUDIT_ARCH_MIPSEL64 and not AUDIT_ARCH_MIPS64EL. Mangle AUDIT_ARCH
accordingly.
Build error spotted in Debian's buildd logs from Loongson build.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Drop it from the internal FLAGS variable, but honour -O2 if passed in
CFLAGS. In Debian packages, dpkg-buildflags uses it as hardening
flag, and we get a QA warning if we drop it:
https://qa.debian.org/bls/bytag/W-dpkg-buildflags-missing.html
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
CPPFLAGS allow the user to pass pre-processor flags. This is unlikely
to be needed at the moment, but the Debian Hardening Walkthrough
reasonably requests it to be handled in order to fully support
hardened build flags:
https://wiki.debian.org/HardeningWalkthrough#Handling_dpkg-buildflags_in_your_upstream_build_system
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
In some environments, such as KubeVirt pods, we might not have a
system logger available. We could choose to run in foreground, but
this takes away the convenient synchronisation mechanism derived from
forking to background when interfaces are ready.
Add optional logging to file with -l/--log-file and --log-size.
Unfortunately, this means we need to duplicate features that are more
appropriately implemented by a system logger, such as rotation. Keep
that reasonably simple, by using fallocate() with range collapsing
where supported (Linux kernel >= 3.15, extent-based ext4 and XFS) and
falling back to an unsophisticated block-by-block moving of entries
toward the beginning of the file once we reach the (mandatory) size
limit.
While at it, clarify the role of LOG_EMERG in passt.c.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Logging to file is going to add some further complexity that we don't
want to squeeze into util.c.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
With gcc 11 and 12, passing -flto, or -flto=auto, and -O2,
intra-procedural optimisation gets rid of a fundamental bit in ndp():
the store of hop_limit in the IPv6 header, before the checksum is
calculated, which on x86_64 looks like this:
ip6hr->hop_limit = IPPROTO_ICMPV6;
b8c0: c6 44 24 35 3a movb $0x3a,0x35(%rsp)
Here, hop_limit is temporarily set to the protocol number, to
conveniently get the IPv6 pseudo-header for ICMPv6 checksum
calculation in memory.
With LTO, the assignment just disappears from the binary.
This is rather visible as NDP messages get a wrong checksum, namely
the expected checksum plus 58, and they're ignored by the guest or
in the namespace, meaning we can't get any IPv6 routes, as reported
by Wenli Quan.
The issue affects a significant number of distribution builds,
including the ones for CentOS Stream 9, EPEL 9, Fedora >= 35,
Mageia Cauldron, and openSUSE Tumbleweed.
As a quick workaround, declare csum_unaligned() as "noipa" for gcc
11 and 12, with -flto and -O2. This disables inlining and cloning,
which causes the assignment to be compiled again.
Leave a TODO item: we should figure out if a gcc issue has already
been reported, and report one otherwise. There's no apparent
justification as to why the store could go away.
Reported-by: Wenli Quan <wquan@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2129713
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Commit 1a563a0cbd ("passt: Address gcc 11 warnings") works around an
issue where the remote address passed to hash functions is seen as
uninitialised by gcc, with -flto and -O2.
It turns out we get the same exact behaviour on gcc 12.1 and 12.2, so
extend the applicability of the same workaround to gcc 12.
Don't go further than that, though: should the issue reported at:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78993
happen to be fixed in a later version of gcc, we won't need the
noinline attributes anymore. Otherwise, we'll notice.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
It's unclear what original suppressions these unmatchedSuppression
suppressions were supposed to go with. They don't trigger any warnings on
the current code that I can tell, so remove them. If we find a problem
with some cppcheck versions in future, replace them with inline
suppressions so it's clearer exactly where the issue is originating.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We have a couple of functions that are unused (for now) by design.
Although at least one has a flag so that gcc doesn't warn, cppcheck has its
own warnings about this. Add specific inline suppressions for these rather
than a blanket suppression in the Makefile.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
I can't get this warning to trigger, even without the suppression, so
remove it. If it shows up again on some cppcheck version, we can replace
it with inline suppressions so it's clear where the issue lay.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
I can't get these warnings to trigger on the cppcheck versions I have, so
remove them. If we find in future we need to replace these, they should be
replaced with inline suppressions so its clear what's the section of code
at issue.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
I can't get this warning to trigger, so I think this suppression must be
out of date. Whether that's because we've changed our code to no longer
have the problem, or because cppcheck itself has been updated to remove a
false positive I don't know.
If we find that we do need a suppression like this for some cppcheck
version, we should replace it with an inline suppression so it's clear
what exactly is triggering the warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
seccomp.sh generates seccomp.h, so if we change it, we should re-build
seccomp.h as well. Add this to the make dependencies so it happens.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Since bf95322f "conf: Make the argument to --pcap option mandatory" we
no longer call localtime() in pcap.c, so we no longer need the matching
cppcheck suppression.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>