Commit graph

1696 commits

Author SHA1 Message Date
Stefano Brivio
a20d269630 test/distro: Set unprivileged_userns_clone on Debian Buster and earlier
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
081f7c8f4c test/lib: Consistent cols, rows, poster attributes for asciinema player
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
0bf6adc886 arch: Pointer to local outside scope, CWE-562
Reported by Coverity: if we fail to run the AVX2 version, once
execve() fails, we had already replaced argv[0] with the new
stack-allocated path string, and that's then passed back to
main(). Use a static variable instead.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
2b1fbf4631 udp: Out-of-bounds read, CWE-125 in udp_timer()
Not an actual issue due to how it's typically stored, but udp_act
can also be used for ports 65528-65535. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
5ab2e12f98 tcp: False "Out-of-bounds read" positive, CWE-125
Reported by Coverity: it doesn't see that tcp{4,6}_l2_buf_used are
set to zero by tcp_l2_data_buf_flush(), repeat that explicitly here.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
2a3b8dad33 tcp, tcp_splice: False "Negative array index read" positives, CWE-129
A flag or event bit is always set by callers. Reported by Coverity.

Signed-by-off: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
264d68edcf tcp_splice: Logically dead code, CWE-561
Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
71a00f1449 tcp: Dereference null return value, CWE-476
Not an issue with a sane kernel behaviour. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
ceddcac74a conf, tap: False "Buffer not null terminated" positives, CWE-170
Those strings are actually guaranteed to be NULL-terminated. Reported
by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
e46f67f152 conf: False "Assign instead of compare" positive, CWE-481
This really just needs to be an assignment before line_read() --
turn it into a for loop. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
eb3d3f367e treewide: Argument cannot be negative, CWE-687
Actually harmless. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
bb76470090 passt: Improper use of negative value (CWE-394)
Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
0786b2e60a conf, packet: Operands don't affect result, CWE-569
Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
48bc843d6e tap: Resource leak, CWE-404
Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
22ed4467a4 treewide: Unchecked return value from library, CWE-252
All instances were harmless, but it might be useful to have some
debug messages here and there. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
6a3f6df865 tcp: False "Untrusted loop bound" positive, CWE-606
Field doff in struct tcp_hdr is 4 bits wide, so optlen in
tcp_tap_handler() is already bound, but make that explicit.
Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-05 18:47:07 +02:00
Stefano Brivio
975ee8eb2b passt: Ignoring number of bytes read, CWE-252
Harmless, assuming sane kernel behaviour. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-05 18:47:07 +02:00
Stefano Brivio
dbd0a7035c treewide: Invalid type in argument to printf format specifier, CWE-686
Harmless except for two bad debugging prints.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-05 18:47:04 +02:00
Stefano Brivio
54f8bf8246 passt.1, qrap.1: Update links to qemu out-of-tree patch
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-01 10:10:46 +02:00
Stefano Brivio
8cc6c9b490 README: Fix link to contrib/debian
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 14:34:42 +02:00
Stefano Brivio
ba72c83d79 hooks: Copy .webp diagram versions too
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 14:34:16 +02:00
Stefano Brivio
baf79c033e README: Drop red notice about early development phase
Last famous words: it should be tested enough by now.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 05:52:39 +02:00
Stefano Brivio
bc925b1da4 contrib: Add example of Debian package files
...using dh_apparmor to ship and apply AppArmor profiles. Tried on
current Debian testing (Bookworm, 12).

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 05:52:39 +02:00
Stefano Brivio
81c2461408 contrib: Add example spec file for Fedora
...with SELinux package, too. Tested on Fedora 35, but it should
work on pretty much any version.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 05:52:39 +02:00
Stefano Brivio
8fd20ad99d tap: Re-read from tap in tap_handler_pasta() on buffer full
read() will return zero if we pass a zero length, which makes no
sense: instead, track explicitly that we exhausted the buffer, flush
packets to handlers and redo.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 05:50:17 +02:00
Stefano Brivio
8d85b6a99e tap: Allow ioctl() and openat() for tap_ns_tun() re-initialisation
If the tun interface disappears, we'll call tap_ns_tun() after the
seccomp profile is applied: add ioctl() and openat() to it.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-30 05:49:46 +02:00
Stefano Brivio
37c228ada8 tap, tcp, udp, icmp: Cut down on some oversized buffers
The existing sizes provide no measurable differences in throughput
and packet rates at this point. They were probably needed as batched
implementations were not complete, but they can be decreased quite a
bit now.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
1f4b7fa0d7 passt, pasta: Add examples of SELinux policy modules
These should cover any reasonably common use case in distributions.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
e9d573b14f passt, pasta: Add examples of AppArmor policies
These should cover any reasonably common use case in distributions.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
f643c69806 tcp: Fix warning by gcc 5.4 on ppc64le about comparison in CONN_OR_NULL()
...we don't really need two extra bits, but it's easier to organise
things differently than to silence this.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
052424d7f5 passt: Accurate error reporting for sandbox()
It's actually quite easy to make it fail depending on the
environment, accurately report errors here.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
33fc2dece2 Makefile: Allow implicit test for bugprone-suspicious-string-compare checker
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
62c3edd957 treewide: Fix android-cloexec-* clang-tidy warnings, re-enable checks
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
ad7f57a5b7 udp: Move flags before ts in struct udp_tap_port, avoid end padding
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
48582bf47f treewide: Mark constant references as const
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
965f603238 treewide: Add include guards
...at the moment, just for consistency with packet.h, icmp.h,
tcp.h and udp.h.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
bb70811183 treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.

Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.

This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
3e4c2d1098 util: Fix function declaration style of write_pidfile()
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
415ccf6116 tcp, tcp_splice: Use less awkward syntax to swap in/out sockets from pools
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
f41f0416b8 dhcp: Minimum option length implied by RFC 951 is 60 bytes, not 62
In section 3 ("Packet Format"), "vend" is 64 bytes long, minus the
magic that's 60 bytes, not 62.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
54d9df3903 tcp: Fit struct tcp_conn into a single 64-byte cacheline
...by:

- storing the chained-hash next connection pointer as numeric
  reference rather than as pointer

- storing the MSS as 14-bit value, and rounding it

- using only the effective amount of bits needed to store the hash
  bucket number

- explicitly limiting window scaling factors to 4-bit values
  (maximum factor is 14, from RFC 7323)

- scaling SO_SNDBUF values, and using a 8-bit representation for
  the duplicate ACK sequence

- keeping window values unscaled, as received and sent

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
bc4ec1a8e9 README: Update Interfaces and Availability sections
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
e80f608710 README: Avoid "here" links
They look a bit lame: rephrase sentences to avoid them.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
0ebaac9747 test/perf: Work-around for virtio_net hang before long streams from guest
I didn't have time to investigate the root cause for the virtio_net
TX hang yet. Add a quick work-around for the moment being.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
92074c16a8 tcp_splice: Close sockets right away on high number of open files
We can't take for granted that the hard limit for open files is
big enough as to allow to delay closing sockets to a timer.

Store the value of RTLIMIT_NOFILE we set at start, and use it to
understand if we're approaching the limit with pending, spliced
TCP connections. If that's the case, close sockets right away as
soon as they're not needed, instead of deferring this task to a
timer.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
be5bbb9b06 tcp: Rework timers to use timerfd instead of periodic bitmap scan
With a lot of concurrent connections, the bitmap scan approach is
not really sustainable.

Switch to per-connection timerfd timers, set based on events and on
two new flags, ACK_FROM_TAP_DUE and ACK_TO_TAP_DUE. Timers are added
to the common epoll list, and implement the existing timeouts.

While at it, drop the CONN_ prefix from flag names, otherwise they
get quite long, and fix the logic to decide if a connection has a
local, possibly unreachable endpoint: we shouldn't go through the
rest of tcp_conn_from_tap() if we reset the connection due to a
successful bind(2), and we'll get EACCES if the port number is low.

Suggested by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
3eb19cfd8a tcp, udp, util: Enforce 24-bit limit on socket numbers
This should never happen, but there are no formal guarantees: ensure
socket numbers are below SOCKET_MAX.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
66a95e331e test, seccomp, Makefile: Switch to valgrind runs for passt functional tests
Pass to seccomp.sh a list of additional syscalls valgrind needs as
EXTRA_SYSCALLS in a new 'valgrind' make target, and add corresponding
support in seccomp.sh itself.

In test setup functions, start passt with valgrind, but not for
performance tests.

Add tests checking that valgrind exits without errors after all the
other tests in the group are done.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
a3bcca864e test: Add asciinema(1) as requirement for CI in README
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-28 17:11:40 +02:00
Stefano Brivio
10f1787edf Makefile: Enable a few hardening flags
They don't have a measurable performance impact and make things a
bit safer.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-28 17:11:40 +02:00