seccomp: Add a number of alternate and per-arch syscalls

Depending on the C library, and not necessarily for every function
we use, statx() might be called instead of stat(), getdents()
instead of getdents64(), readlinkat() instead of readlink(), and
openat() instead of open().

On aarch64, it's clone() and not fork(), and dup3() instead of
dup2() -- just allow the existing alternative instead of dealing
with per-arch selections.
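
One way to picture "allow whichever alternative exists" is a
compile-time pick based on the `__NR_*` macros from `<sys/syscall.h>`.
This is only a sketch under that assumption, not how the seccomp
profile is actually generated here; the `NR_*_ALT` macros,
`alt_syscalls` and `alt_allowed()` are hypothetical names.

```c
#include <stddef.h>
#include <sys/syscall.h>

/* For each pair, take the first name the kernel headers define for this
 * architecture; -1 marks "neither is defined" (hypothetical convention).
 */
#if defined(__NR_open)
#define NR_OPEN_ALT	__NR_open
#elif defined(__NR_openat)
#define NR_OPEN_ALT	__NR_openat
#else
#define NR_OPEN_ALT	-1
#endif

#if defined(__NR_fork)
#define NR_FORK_ALT	__NR_fork
#elif defined(__NR_clone)
#define NR_FORK_ALT	__NR_clone
#else
#define NR_FORK_ALT	-1
#endif

#if defined(__NR_dup2)
#define NR_DUP2_ALT	__NR_dup2
#elif defined(__NR_dup3)
#define NR_DUP2_ALT	__NR_dup3
#else
#define NR_DUP2_ALT	-1
#endif

static const long alt_syscalls[] = { NR_OPEN_ALT, NR_FORK_ALT, NR_DUP2_ALT };

#define ALT_COUNT	(sizeof(alt_syscalls) / sizeof(alt_syscalls[0]))

/* Return 1 if syscall number nr is one of the selected alternatives */
static int alt_allowed(long nr)
{
	size_t i;

	for (i = 0; i < ALT_COUNT; i++)
		if (alt_syscalls[i] >= 0 && alt_syscalls[i] == nr)
			return 1;

	return 0;
}
```

On aarch64, where `__NR_fork` and `__NR_dup2` are not defined, the
second name of each pair would be selected instead.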

Since glibc commit 9a7565403758 ("posix: Consolidate fork
implementation"), we need to allow set_robust_list() for
fork()/clone(), even in a single-threaded context.
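
In terms of a raw seccomp-BPF allowlist, this means an explicit match
on `__NR_set_robust_list` next to the fork()/clone() entry. A minimal
sketch of such a fragment follows. It is not the generated filter: it
assumes clone() is the call actually issued, it skips the check on
`seccomp_data.arch` that a real filter should perform first, and
`fork_filter` is a hypothetical name.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

/* Allow clone() and set_robust_list(), kill the process on anything else */
static struct sock_filter fork_filter[] = {
	/* Load the syscall number from struct seccomp_data */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* clone(): skip one instruction to reach ALLOW, else fall through */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone, 1, 0),
	/* set_robust_list(): fall through to ALLOW, else skip to KILL */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_set_robust_list, 0, 1),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
};

#define FORK_FILTER_LEN	(sizeof(fork_filter) / sizeof(fork_filter[0]))
```

A fragment like this would be wrapped in a `struct sock_fprog` and
loaded with prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...). As it
stands it is far too strict to install: it does not even allow
exit_group().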

On some architectures, epoll_pwait() is provided instead of
epoll_wait(), but never both. The same goes for newfstat() and
fstat(), sigreturn() and rt_sigreturn(), getdents64() and
getdents(), readlink() and readlinkat(), unlink() and unlinkat().
pipe(), on the other hand, might not be available, whereas pipe2()
always is, whether or not pipe() is provided too.

Seen on Fedora 34: newfstatat() is used on top of fstat().

syslog() is an actual system call on some glibc/arch combinations,
instead of a connect()/send() implementation.

On ppc64 and ppc64le, _llseek(), recv(), send() and getuid()
are used. For ppc64 only: ugetrlimit() for the getrlimit()
implementation, plus sigreturn() and fcntl64().

On s390x, we also need to allow socketcall() (besides socket()),
and sigreturn() for passt too (not just for pasta).

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Stefano Brivio 2022-01-26 06:55:28 +01:00
parent be265eef06
commit 33b1bdd079
6 changed files with 15 additions and 10 deletions


@@ -233,7 +233,7 @@ speeding up local connections, and usually requiring NAT. _pasta_:
 * ✅ root operation not allowed outside user namespaces
 * ✅ all capabilities dropped, other than `CAP_NET_BIND_SERVICE` (if granted)
 * ✅ no external dependencies (other than a standard C library)
-* ✅ restrictive seccomp profiles (46 syscalls allowed for _passt_, 58 for
+* ✅ restrictive seccomp profiles (50 syscalls allowed for _passt_, 62 for
   _pasta_)
 * ✅ static checkers in continuous integration (clang-tidy, cppcheck)
 * 🛠️ rework of TCP state machine (flags instead of states), TCP timers, and code

2
conf.c

@@ -11,7 +11,7 @@
  * Copyright (c) 2020-2021 Red Hat GmbH
  * Author: Stefano Brivio <sbrivio@redhat.com>
  *
- * #syscalls stat
+ * #syscalls stat|statx
  */
 #include <arpa/inet.h>

14
passt.c

@@ -273,12 +273,16 @@ static void pid_file(struct ctx *c) {
  *
  * Return: non-zero on failure
  *
- * #syscalls read write open close fork dup2 exit chdir ioctl writev syslog
- * #syscalls prlimit64 epoll_ctl epoll_create1 epoll_wait accept4 accept listen
+ * #syscalls read write open|openat close fork|clone dup2|dup3 ioctl writev
  * #syscalls socket bind connect getsockopt setsockopt recvfrom sendto shutdown
- * #syscalls openat fstat fcntl lseek clone setsid exit_group getpid
- * #syscalls clock_gettime newfstatat
- * #syscalls:pasta rt_sigreturn
+ * #syscalls accept4 accept listen set_robust_list getrlimit setrlimit
+ * #syscalls openat fcntl lseek clone setsid exit exit_group getpid chdir
+ * #syscalls epoll_ctl epoll_create1 epoll_wait|epoll_pwait epoll_pwait
+ * #syscalls prlimit64 clock_gettime fstat|newfstat newfstatat syslog
+ * #syscalls ppc64le:_llseek ppc64le:recv ppc64le:send ppc64le:getuid
+ * #syscalls ppc64:_llseek ppc64:recv ppc64:send ppc64:getuid ppc64:ugetrlimit
+ * #syscalls s390x:socketcall s390x:sigreturn
+ * #syscalls:pasta rt_sigreturn|sigreturn ppc64:sigreturn ppc64:fcntl64
  */
 int main(int argc, char **argv)
 {


@@ -12,7 +12,8 @@
  * Author: Stefano Brivio <sbrivio@redhat.com>
  *
  * #syscalls:pasta clone unshare waitid kill execve exit_group rt_sigprocmask
- * #syscalls:pasta geteuid getdents64 readlink setsid nanosleep clock_nanosleep
+ * #syscalls:pasta geteuid getdents64|getdents readlink|readlinkat setsid
+ * #syscalls:pasta nanosleep clock_nanosleep
  */
 #include <sched.h>

2
tap.c

@@ -772,7 +772,7 @@ restart:
  * tap_sock_init_unix() - Create and bind AF_UNIX socket, wait for connection
  * @c: Execution context
  *
- * #syscalls:passt unlink
+ * #syscalls:passt unlink|unlinkat
  */
 static void tap_sock_init_unix(struct ctx *c)
 {

2
tcp.c

@@ -304,7 +304,7 @@
  * - SPLICE_FIN_TO: FIN (EPOLLRDHUP) seen from connected socket
  * - SPLICE_FIN_BOTH: FIN (EPOLLRDHUP) seen from both sides
  *
- * #syscalls pipe pipe2
+ * #syscalls pipe|pipe2 pipe2
  */
 #include <sched.h>