Both the -D (--dns) and -S (--search) options take an optional argument.
If the argument is omitted the option is disabled entirely. However,
handling the optional argument requires some ugly special case handling if
it's the last option on the command line, and has potential ambiguity with
non-option arguments used with pasta. It can also make it more confusing
to read command lines.
Simplify the logic here by replacing the non-argument versions with an
explicit "-D none" or "-S none".
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Reworked logic to exclude redundant/conflicting options]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The --pcap or -p option can be used with or without an argument. If given,
the argument gives the name of the file to save a packet trace to. If
omitted, we generate a default name in /tmp.
Generating the default name isn't particularly useful though, since making
a suitable name can easily be done by the caller. Remove this feature.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reflect the changes from commit 4b2e018d70 ("Allow different
external interfaces for IPv4 and IPv6 connectivity") into the manual.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This is useful in environments where we want to forward a large
number of ports, or all non-ephemeral ones, and some other service
running on the host needs a few selected ports.
I'm using ~ as prefix for the specification of excluded ranges and
ports to avoid the need for explicit command line quoting.
Ranges and ports can be excluded from given ranges by adding them
in the comma-separated list, prefixed by ~. Some quick examples:
-t 5000-6000,~5555: forward ports 5000 to 6000, but not 5555
-t ~20000-20010: forward all non-ephemeral, allowed ports, except
for ports 20000 to 20010
...more details in usage message and man page.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
A lot of tests and examples invoke qemu with the command "kvm". However,
as far as I can tell, "kvm" being aliased to the appropriate qemu system
binary is Debian specific. The binary names from qemu upstream -
qemu-system-$ARCH - also aren't universal, but they are more common (they
should be good for both Debian and Fedora at least).
In order to still get KVM acceleration when available, we use the option
"-M accel=kvm:tcg" to tell qemu to try using either KVM or TCG in that
order
A number of the places we invoked "kvm" are expecting specifically an x86
guest, and so it's also safer to explicitly invoke qemu-system-x86_64.
Some others appear to be independent of the target arch (just wanting the
same arch as the host to allow KVM acceleration). Although I suspect there
may be more subtle x86 specific options in the qemu command lines, attempt
to preserve arch independence by using $(uname -m).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For some reason, the passt/pasta tests and examples use dhclient for
DHCPv6, but in most cases use udhcpc for DHCPv4. Change it to use dhclient
for both DHCPv4 and DHCPv6. This means one less tool we need for testing,
plus dhclient is easily available on Fedora whereas udhcpc is not.
Note that the passt tests still rely on udhcpc indirectly because mbuto
wants to put it into the guest images it generates.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A number of tests and examples use dhclient in both IPv4 and IPv6 modes.
We use "dhclient -6" for IPv6, but usually just "dhclient" for IPv4. Add
an explicit "-4" argument to make it more clear and explicit.
In addition, when dhclient is run from within pasta it usually won't be
"real" root, and so will not have access to write the default global pid
file. This results in a mostly harmless but irritating error:
Can't create /var/run/dhclient.pid: Permission denied
We can avoid that by using the --no-pid flag to dhclient.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On some systems, user and group "nobody" might not be available. The
new --runas option allows to override the default "nobody" choice if
started as root.
Now that we allow this, drop the initgroups() call that was used to
add any additional groups for the given user, as that might now
grant unnecessarily broad permissions. For instance, several
distributions have a "kvm" group to allow regular user access to
/dev/kvm, and we don't need that in passt or pasta.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This feature is available in slirp4netns but was missing in passt and
pasta.
Given that we don't do dynamic memory allocation, we need to bind
sockets while parsing port configuration. This means we need to
process all other options first, as they might affect addressing and
IP version support. It also implies a minor rework of how TCP and UDP
implementations bind sockets.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
--debug can be a bit too noisy, especially as single packets or
socket messages are logged: implement a new option, --trace,
implying --debug, that enables all debug messages.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
If we statically configure a default route, and also advertise it for
SLAAC, the kernel will try moments later to add the same route:
ICMPv6: RA: ndisc_router_discovery failed to add default route
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This should be convenient for users managing filesystem-bound network
namespaces: monitor the base directory of the namespace and exit if
the namespace given as PATH or NAME target is deleted. We can't add
an inotify watch directly on the namespace directory, that won't work
with nsfs.
Add an option to disable this behaviour, --no-netns-quit.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
For compatibility with libslirp/slirp4netns users: introduce a
mechanism to map, in the UDP routines, an address facing guest or
namespace to the first IPv4 or IPv6 address resulting from
configuration as resolver. This can be enabled with the new
--dns-forward option.
This implies that sourcing and using DNS addresses and search lists,
passed via command line or read from /etc/resolv.conf, is not bound
anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just
want to use addresses from /etc/resolv.conf as mapping target, while
not passing DNS options via DHCP.
Reflect this in all the involved code paths by differentiating
DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new
options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns,
--no-dhcp-search for passt.
This should be the last bit to enable substantial compatibility
between slirp4netns.sh and slirp4netns(1): pass the --dns-forward
option from the script too.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
To reach (at least) a conceptually equivalent security level as
implemented by --enable-sandbox in slirp4netns, we need to create a
new mount namespace and pivot_root() into a new (empty) mountpoint, so
that passt and pasta can't access any filesystem resource after
initialisation.
While at it, also detach IPC, PID (only for passt, to prevent
vulnerabilities based on the knowledge of a target PID), and UTS
namespaces.
With this approach, if we apply the seccomp filters right after the
configuration step, the number of allowed syscalls grows further. To
prevent this, defer the application of seccomp policies after the
initialisation phase, before the main loop, that's where we expect bad
things to happen, potentially. This way, we get back to 22 allowed
syscalls for passt and 34 for pasta, on x86_64.
While at it, move #syscalls notes to specific code paths wherever it
conceptually makes sense.
We have to open all the file handles we'll ever need before
sandboxing:
- the packet capture file can only be opened once, drop instance
numbers from the default path and use the (pre-sandbox) PID instead
- /proc/net/tcp{,v6} and /proc/net/udp{,v6}, for automatic detection
of bound ports in pasta mode, are now opened only once, before
sandboxing, and their handles are stored in the execution context
- the UNIX domain socket for passt is also bound only once, before
sandboxing: to reject clients after the first one, instead of
closing the listening socket, keep it open, accept and immediately
discard new connection if we already have a valid one
Clarify the (unchanged) behaviour for --netns-only in the man page.
To actually make passt and pasta processes run in a separate PID
namespace, we need to unshare(CLONE_NEWPID) before forking to
background (if configured to do so). Introduce a small daemon()
implementation, __daemon(), that additionally saves the PID file
before forking. While running in foreground, the process itself can't
move to a new PID namespace (a process can't change the notion of its
own PID): mention that in the man page.
For some reason, fork() in a detached PID namespace causes SIGTERM
and SIGQUIT to be ignored, even if the handler is still reported as
SIG_DFL: add a signal handler that just exits.
We can now drop most of the pasta_child_handler() implementation,
that took care of terminating all processes running in the same
namespace, if pasta started a shell: the shell itself is now the
init process in that namespace, and all children will terminate
once the init process exits.
Issuing 'echo $$' in a detached PID namespace won't return the
actual namespace PID as seen from the init namespace: adapt
demo and test setup scripts to reflect that.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This is actually annoying: there's no way to make it fork into
background when running from a script. However, it's always
possible to keep it in foreground with -f. Make it simpler, and
always fork into background if -f is not given.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
SPDX tags don't replace license files. Some notices were missing and
some tags were not according to the SPDX specification, too.
Now reuse --lint from the REUSE tool (https://reuse.software/) passes.
Reported-by: Martin Hauke <mardnh@gmx.de>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The behaviour without tcpi_snd_wnd changed: the only difference now
is the advertised window, which corresponds to the queried sending
buffer size.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Move netlink routines to their own file, and use netlink to configure
or fetch all the information we need, except for the TUNSETIFF ioctl.
Move pasta-specific functions to their own file as well, add
parameters and calls to configure the tap interface in the namespace.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Based on a patch from Giuseppe Scrivano, this adds the ability to:
- specify paths and names of target namespaces to join, instead of
a PID, also for user namespaces, with --userns
- request to join or create a network namespace only, without
entering or creating a user namespace, with --netns-only
- specify the base directory for netns mountpoints, with --nsrun-dir
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
[sbrivio: reworked logic to actually join the given namespaces when
they're not created, implemented --netns-only and --nsrun-dir,
updated pasta demo script and man page]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Detecting bound ports at start-up time isn't terribly useful: do this
periodically instead, if configured.
This is only implemented for TCP at the moment, UDP is somewhat more
complicated: leave a TODO there.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>