517 commits

Author SHA1 Message Date
Stefano Brivio
33b1bdd079 seccomp: Add a number of alternate and per-arch syscalls
Depending on the C library, but not necessarily in all the
functions we use, statx() might be used instead of stat(),
getdents() instead of getdents64(), readlinkat() instead of
readlink(), openat() instead of open().

On aarch64, it's clone() and not fork(), and dup3() instead of
dup2() -- just allow the existing alternative instead of dealing
with per-arch selections.

Since glibc commit 9a7565403758 ("posix: Consolidate fork
implementation"), we need to allow set_robust_list() for
fork()/clone(), even in a single-threaded context.

On some architectures, epoll_pwait() is provided instead of
epoll_wait(), but never both. Same with newfstat() and
fstat(), sigreturn() and rt_sigreturn(), getdents64() and
getdents(), readlink() and readlinkat(), unlink() and
unlinkat(), whereas pipe() might not be available, but
pipe2() always is, exclusively or not.

Seen on Fedora 34: newfstatat() is used on top of fstat().

syslog() is an actual system call on some glibc/arch combinations,
instead of a connect()/send() implementation.

On ppc64 and ppc64le, _llseek(), recv(), send() and getuid()
are used. For ppc64 only: ugetrlimit() for the getrlimit()
implementation, plus sigreturn() and fcntl64().

On s390x, additionally, we need to allow socketcall() (on top
of socket()), and sigreturn() also for passt (not just for
pasta).

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
Stefano Brivio
be265eef06 tcp: Don't round down MSS to >= 64KiB page size, but clamp it in any case
On some architectures, the page size is bigger than the maximum size
of an Ethernet frame.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
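A minimal sketch of the idea in the commit above, not the actual passt code: round the MSS down to a multiple of the page size only when the page is smaller than 64KiB, and clamp the result in any case. The name mss_adjust() and the MSS_CAP value are assumptions made for this illustration.

```c
#include <stddef.h>

/* Hypothetical cap for this sketch: 65535 minus 40 bytes of IPv4 + TCP headers */
#define MSS_CAP			65495u

/* Round x down to a multiple of y, y must be a power of two */
#define ROUND_DOWN(x, y)	((x) & ~((y) - 1))

/* Hypothetical helper: adjust an MSS value for a given page size */
unsigned mss_adjust(unsigned mss, unsigned page_size)
{
	/* Rounding to pages of 64KiB or more would leave nothing usable */
	if (page_size < 65536u && mss > page_size)
		mss = ROUND_DOWN(mss, page_size);

	/* Clamp regardless of whether we rounded */
	if (mss > MSS_CAP)
		mss = MSS_CAP;

	return mss;
}
```

With 4KiB pages, an MSS of 9000 becomes 8192. With 64KiB pages, it's left at 9000 instead of being rounded down to zero.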
Stefano Brivio
34872fadec pasta: Check for zero d_reclen returned by getdents64() syscall
Seen on PPC with some older kernel versions: we seemingly have bytes
left to read from the returned array of dirent structs, but d_reclen
is zero: this, and all the subsequent entries, are not valid.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
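The check described above could look roughly like this. It's a sketch under assumptions: the record layout mirrors the kernel's linux_dirent64, while struct dirent64_rec and count_valid_entries() are hypothetical names, not the ones used in pasta.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of one record as returned by getdents64(), redefined for the sketch */
struct dirent64_rec {
	uint64_t	d_ino;
	int64_t		d_off;
	unsigned short	d_reclen;
	unsigned char	d_type;
	char		d_name[];
};

/* Hypothetical helper: count valid records in the first n bytes of buf */
size_t count_valid_entries(const char *buf, size_t n)
{
	size_t pos = 0, count = 0;

	while (pos + offsetof(struct dirent64_rec, d_name) <= n) {
		unsigned short reclen;

		memcpy(&reclen, buf + pos + offsetof(struct dirent64_rec, d_reclen),
		       sizeof(reclen));

		/* Zero d_reclen: this and all subsequent entries are not valid */
		if (!reclen)
			break;

		/* Record claims to extend past the data we actually got */
		if (reclen > n - pos)
			break;

		count++;
		pos += reclen;
	}

	return count;
}
```

Without the zero check, the loop would never advance past the bad record.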
Stefano Brivio
64f7d81d9a netlink: Fix swapped v4/v6-only flags in external interface detection
The effect of this typo became visible in an IPv6-only environment,
where passt wouldn't work at all.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
Stefano Brivio
caa22aa644 tcp, udp, util: Fixes for bitmap handling on big-endian, casts
Bitmap manipulating functions would otherwise refer to inconsistent
sets of bits on big-endian architectures. While at it, fix up a
couple of casts.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
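One way to keep bitmap helpers consistent across endianness is to address the map byte by byte, so that a given bit number always lands in the same byte. This is only an illustration of the problem, not necessarily how the commit above solves it, and the bm_*() names are made up for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: bit n always lives in byte n / 8, at position n % 8 */
void bm_set(uint8_t *map, size_t bit)
{
	map[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

void bm_clear(uint8_t *map, size_t bit)
{
	map[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}

int bm_isset(const uint8_t *map, size_t bit)
{
	return !!(map[bit / 8] & (1u << (bit % 8)));
}
```

Mixing accessors that index the same memory by words of different sizes gives the same result on little-endian only, which is how this kind of issue tends to go unnoticed.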
Stefano Brivio
4c7304db85 conf, pasta: Explicitly pass CLONE_{NEWUSER,NEWNET} to setns()
Only allow the intended types of namespaces to be joined via setns()
as a defensive measure.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
Stefano Brivio
1776de0140 tcp, netlink, HAS{BYTES_ACKED,MIN_RTT,GETRANDOM} and NETLINK_GET_STRICT_CHK
tcpi_bytes_acked and tcpi_min_rtt are only available on recent
kernel versions: provide fall-back paths (incurring some degree of
performance penalty).

Support for getrandom() was introduced in Linux 3.17 and glibc 2.25:
provide an alternate mechanism for that as well, reading from
/dev/random.

Also check if NETLINK_GET_STRICT_CHK is defined before using it:
it's not strictly needed, we'll filter out irrelevant results from
netlink anyway.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
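A fall-back for a missing getrandom() might be sketched as follows: read the requested number of bytes from a device node. random_fill() is a hypothetical name, and the path is a parameter only to keep the sketch generic (the commit above mentions /dev/random).

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with len bytes read from path, 0 on success */
int random_fill(const char *path, void *buf, size_t len)
{
	char *p = buf;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	while (len) {
		ssize_t n = read(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;	/* Interrupted by a signal: retry */

		if (n <= 0) {		/* Error or unexpected end of file */
			close(fd);
			return -1;
		}

		p += n;
		len -= (size_t)n;
	}

	close(fd);
	return 0;
}
```

The loop matters because read() on these devices may return fewer bytes than requested.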
Stefano Brivio
daf8d057ce seccomp: Introduce mechanism to allow per-arch syscalls
Some C library functions are commonly implemented by different syscalls
on different architectures. Add a mechanism to allow selected syscalls
for a single architecture, syntax in #syscalls comment is:

	#syscalls <arch>:<name>

e.g. s390x:socketcall, given that socketcall() is commonly used there
instead of socket().

This is now implemented by a compiler probe for syscall numbers:
auditd tools (ausyscall) are not required anymore as a result.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:29:34 +01:00
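The matching rule for the <arch>:<name> syntax can be illustrated in C. This is a sketch of the rule only: syscall_for_arch() is a hypothetical helper and not how the build tooling implements it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given one entry from a #syscalls comment, return the
 * syscall name if it applies to arch, NULL otherwise. Entries without an
 * <arch>: prefix apply to every architecture.
 */
const char *syscall_for_arch(const char *entry, const char *arch)
{
	const char *colon = strchr(entry, ':');
	size_t arch_len;

	if (!colon)
		return entry;

	arch_len = (size_t)(colon - entry);
	if (strlen(arch) == arch_len && !strncmp(entry, arch, arch_len))
		return colon + 1;

	return NULL;
}
```

For example, s390x:socketcall yields socketcall when building for s390x, and nothing on x86_64.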
Stefano Brivio
1eb14d7305 util: Fall-back definitions for SECCOMP_RET_KILL_PROCESS, ETH_{MAX,MIN}_MTU
They're not available on some older toolchains.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 07:57:09 +01:00
Stefano Brivio
fa7e2e7016 Makefile, seccomp: Fix build for i386, ppc64, ppc64le
On some distributions, on ppc64, ulimit -s returns 'unlimited': add a
reasonable default, and also make sure ulimit is invoked using the
default shell, which should ensure ulimit is actually implemented.

Also note that AUDIT_ARCH doesn't closely follow the naming reported
by 'uname -m': convert for i386 and ppc as needed.

While at it, move inclusion of seccomp.h after util.h: the former is
less generic (cosmetic/clang-tidy only).

Older kernel headers might lack a definition for AUDIT_ARCH_PPC64LE:
define that explicitly if it's not available.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 07:57:09 +01:00
Stefano Brivio
b93c2c1713 passt: Drop <linux/ipv6.h> include, carry own ipv6hdr and opt_hdr definitions
This is the only remaining Linux-specific include -- drop it to avoid
clang-tidy warnings and to make code more portable.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 07:57:09 +01:00
Stefano Brivio
f6d9787d30 tap, tcp: Fix two comparisons with different signedness reported by gcc 7
For some reason, those are not reported by recent versions of gcc.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 07:57:09 +01:00
Stefano Brivio
6040f16239 tcp: Cover all usages of tcpi_snd_wnd with HAS_SND_WND
...I forgot two of them.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 07:57:09 +01:00
Stefano Brivio
2c7431ffcf README: Feature list, links to lists, bugs, chat
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-23 12:28:50 +02:00
Stefano Brivio
a77c5ef93a README, perf_report: Markdown and CSS fixes
Updating md2html on the server needs a few adjustments.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-22 14:52:47 +02:00
Stefano Brivio
94c7c1dbcf slirp4netns.sh: Fix up usage, exit 0 on --help
Based on an original patch by Giuseppe Scrivano: there's no need
to pass $0 to usage, drop that everywhere, and make it consistent.

Don't exit with error on -h, --help.

Suggested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 20:37:51 +02:00
Stefano Brivio
1fc6416cf9 seccomp: Add newfstatat to list of allowed syscalls
...it looks like, on a recent Fedora installation, daemon() uses it.

Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 20:18:17 +02:00
Stefano Brivio
d36e429bc6 netlink: Fix length of address attribute
...I broke this while playing with clang-tidy, and didn't add
tests for pasta's --config-net yet.

Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 20:14:52 +02:00
Stefano Brivio
875550b973 passt: Fork into background also if not running from a terminal
This is actually annoying: there's no way to make it fork into
background when running from a script. However, it's always
possible to keep it in foreground with -f. Make it simpler, and
always fork into background if -f is not given.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 20:13:18 +02:00
Stefano Brivio
2d75a3d71c test/two_guests: Fix sleep command for DAD
An inline comment prefixed by a space doesn't mean the space
is dropped, and sleep(1) will get a blank in its argument.

Move the comment on its own line.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 20:10:23 +02:00
Stefano Brivio
e934da3c81 test/two_guests: Let the guests end DAD before starting the DHCPv6 client
They'll start DAD as we bring up the interface, and the DHCPv6
client might be unreasonably delayed if we start it too early.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 17:54:03 +02:00
Stefano Brivio
a2b86c5f90 tcp: Restore source address to network endianness before using it for hash table
This was actually fine "on the wire", but it's inconsistent with the
way we hash other addresses/protocols and also ends up with a wrong
endianness in captures in case we replace the address with our
default gateway.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 17:54:03 +02:00
Stefano Brivio
000ae818d4 pcap: Fix failure check on write() in pcapm()
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 17:36:05 +02:00
Stefano Brivio
73a4a6b7cd ndp: Don't send a DNS search list if we don't have a list of DNS servers
This is not explicitly forbidden, but it confuses the ISC's DHCP client,
and doesn't make sense anyway.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 17:34:42 +02:00
Stefano Brivio
af55c4e98f ndp: Don't sabotage DAD by replying to probing neighbour solicitation
If the solicitation comes from ::, it's the guest performing
duplicate address detection -- don't answer that.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 12:16:16 +02:00
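The test for a DAD probe comes down to checking for the unspecified source address. A minimal sketch, where ns_is_dad_probe() is a hypothetical name:

```c
#include <netinet/in.h>

/*
 * Hypothetical helper: a neighbour solicitation sent from :: is a
 * duplicate address detection probe, and must not be answered.
 */
int ns_is_dad_probe(const struct in6_addr *src)
{
	return IN6_IS_ADDR_UNSPECIFIED(src);
}
```

Answering such a probe would tell the guest that the address it's about to configure is already in use.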
Stefano Brivio
bf68270898 ndp: Set (ICMP) hop limit to 255 in router advertisement
Found while re-reading this part: zero works as well, but a
host might legitimately refuse a value that's below a given
threshold.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 12:16:16 +02:00
Stefano Brivio
a620de294d qrap: Silence bogus clang-tidy bugprone-suspicious-missing-comma warning
This is actually a concatenation -- mark it with an extra pair
of parentheses.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 12:16:16 +02:00
Stefano Brivio
685b50c3ce Makefile: cppcheck target: Suppress unmatchedSuppression, pass CFLAGS
Some of those warnings don't trigger even on systems with very
similar toolchains: suppress unmatchedSuppression warnings, as
they're basically useless.

While at it, pass CFLAGS to cppcheck.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 12:16:16 +02:00
Stefano Brivio
627e18fa8a passt: Add cppcheck target, test, and address resulting warnings
...mostly false positives, but a number of very relevant ones too,
in tcp_get_sndbuf(), tcp_conn_from_tap(), and siphash PREAMBLE().

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 09:41:13 +02:00
Stefano Brivio
c3f8e4d2cd test/perf: Actually load passt enough to test UDP performance
With recent improvements, we're not CPU-bound at all while testing
UDP performance. Give the VM more memory and CPUs, forward two
additional ports, start up to four threads in parallel, and give
single iperf3 threads higher bandwidth targets.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 04:54:05 +02:00
Stefano Brivio
1f3d6f96b5 test/lib/test: Wait a bit longer before terminating iperf3 processes
Sometimes tests run a few seconds longer than expected, wait a few
more seconds.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 04:51:39 +02:00
Stefano Brivio
27f5999677 udp: Avoid static initialiser for udp{4,6}_l2_buf
With the new UDP_TAP_FRAMES value, the binary size grows considerably.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 04:49:25 +02:00
Stefano Brivio
85a80f8f63 udp: Fix maximum payload size calculation for IPv4 buffers, bump UDP_TAP_FRAMES
The issue with a higher UDP_TAP_FRAMES was actually coming from a
payload size the guest couldn't digest. Fix that, and bump
UDP_TAP_FRAMES back to 128.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 04:42:09 +02:00
Stefano Brivio
dd942eaa48 passt: Fix build with gcc 7, use std=c99, enable some more Clang checkers
Unions and structs, you all have names now.

Take the chance to enable bugprone-reserved-identifier,
cert-dcl37-c, and cert-dcl51-cpp checkers in clang-tidy.

Provide a ffsl() weak declaration using gcc built-in.

Start reordering includes, but that's not enough for the
llvm-include-order checker yet.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 04:26:08 +02:00
Stefano Brivio
6257a2752e test/perf: Try sourcing maximum scaling frequency from cpufreq
On most recent CPUs, that's a better indication of all-core turbo
frequency, or non-turbo frequency, than /proc/cpuinfo.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 01:24:22 +02:00
Stefano Brivio
819d13bb92 seccomp.sh: Handle missing ausyscall(8) or unknown syscall number
...try sourcing it with the compiler from <sys/syscall.h> before
giving up.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 01:21:26 +02:00
Stefano Brivio
849308d207 Makefile, tcp: Don't try to use tcpi_snd_wnd from tcp_info on pre-5.3 kernels
Detect missing tcpi_snd_wnd in struct tcp_info at build time,
otherwise build fails with a pre-5.3 linux/tcp.h header.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 01:19:27 +02:00
Stefano Brivio
a20626fb35 util: Go to next non-empty line, skip newlines in line_read()
Otherwise, we'll stop returning lines at the first empty line
in a file -- this is not expected in case of e.g. /etc/resolv.conf.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 11:39:08 +02:00
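The behaviour described above, sketched on an in-memory buffer rather than on a file descriptor. next_nonempty_line() is a hypothetical stand-in and doesn't share the interface of line_read().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the next non-empty line starting from *pos,
 * terminate it in place, and advance *pos past it. NULL once nothing is left.
 */
char *next_nonempty_line(char **pos)
{
	char *line = *pos, *end;

	/* Skip newlines instead of returning an empty line */
	while (*line == '\n')
		line++;

	if (!*line) {
		*pos = line;
		return NULL;
	}

	end = strchr(line, '\n');
	if (end) {
		*end = '\0';
		*pos = end + 1;
	} else {
		*pos = line + strlen(line);
	}

	return line;
}
```

A caller looping until NULL now gets every entry of a file such as /etc/resolv.conf, even if entries are separated by blank lines.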
Stefano Brivio
9618d24700 ndp, dhcpv6, tcp, udp: Always use link-local as source if gateway isn't
This shouldn't happen on any sane configuration, but I just met an
example of that: the default IPv6 gateway on the host is configured
with a global unicast address, we use that as source for RA, DHCPv6
replies, and the guest ignores it. Same later on if we talk TCP or
UDP and the guest has no idea where that address comes from.

Use our link-local address in case the gateway address is global.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 11:10:23 +02:00
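The source address selection described above can be sketched like this. pick_src() and its parameter names are assumptions for the illustration, not identifiers taken from the code.

```c
#include <netinet/in.h>

/*
 * Hypothetical helper: use the gateway address as source only if it's
 * link-local (fe80::/10), otherwise fall back to our own link-local address.
 */
const struct in6_addr *pick_src(const struct in6_addr *gw,
				const struct in6_addr *our_ll)
{
	return IN6_IS_ADDR_LINKLOCAL(gw) ? gw : our_ll;
}
```

With a gateway configured as a global unicast address, RA and DHCPv6 replies would then come from our link-local address, which the guest accepts.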
Stefano Brivio
12cfa6444c passt: Add clang-tidy Makefile target and test, take care of warnings
Most are just about style and form, but a few were actually
serious mistakes (NDP-related).

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:34:22 +02:00
Stefano Brivio
7f1e7019cb test/demo: Don't wait for # after pasta is started by perf report
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:34:22 +02:00
Stefano Brivio
4f69efcfba README: .. doesn't actually work for comments in Markdown
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:34:02 +02:00
Stefano Brivio
7d24803fb3 conf: Always pass an empty buffer to line_read() in get_dns()
Given that get_dns() touches the buffer read by line_read(), we
can't optimise that by passing the existing buffer.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:29:30 +02:00
Stefano Brivio
b0b77118fe passt: Address warnings from Clang's scan-build
All false positives so far.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:29:30 +02:00
Stefano Brivio
1a563a0cbd passt: Address gcc 11 warnings
A mix of unchecked return values, a missing permission mask for
open(2) with O_CREAT, and some false positives from
-Wstringop-overflow and -Wmaybe-uninitialized.

Reported-by: Martin Hauke <mardnh@gmx.de>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:29:30 +02:00
Stefano Brivio
087b5f4dbb LICENSES: Add license text files, add missing notices, fix SPDX tags
SPDX tags don't replace license files. Some notices were missing and
some tags were not according to the SPDX specification, too.

Now reuse --lint from the REUSE tool (https://reuse.software/) passes.

Reported-by: Martin Hauke <mardnh@gmx.de>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:29:30 +02:00
Stefano Brivio
f154a0489a Makefile: Install man pages to /usr/share/man instead of /usr/man
Reported-by: Martin Hauke <mardnh@gmx.de>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:29:30 +02:00
Stefano Brivio
2725003d45 Makefile: Prefix installation paths with $(DESTDIR)
Martin reports that DESTDIR is ignored in install/uninstall targets,
see also:
	https://www.gnu.org/prep/standards/html_node/DESTDIR.html

Reported-by: Martin Hauke <mardnh@gmx.de>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-19 09:42:08 +02:00
Stefano Brivio
9df5027129 perf/passt_udp: Don't overshoot UDP bandwidth excessively on larger MTUs
...performance with 64KiB MTUs might look worse than with 9000 bytes
on some configurations.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-19 09:30:42 +02:00
Stefano Brivio
7aaff3387a perf/passt_tcp: Don't exceed typical L3 cache sizes with buffers
...we might see misleading rate drops with larger MTUs otherwise.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-19 09:28:44 +02:00