After we start the iperf3 server in the background, we have a sleep to
make sure it's ready to receive connections. We can simplify this slightly
by using the -D option to have iperf3 background itself rather than
backgrounding it manually. That won't return until the server is ready to
use.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The -m option controls the MTU, that is the maximum transmissible L3
datagram, not including L2 headers. We currently limit it to ETH_MAX_MTU
which sounds like it makes sense. But ETH_MAX_MTU is confusing: it's not
consistently used as to whether it means the maximum L3 datagram size or
the maximum L2 frame size. Even within conf() we explicitly account for
the L2 header size when computing the default --mtu value, but not when
we compute the maximum --mtu value.
Clean this up by reworking the maximum MTU computation to be the minimum of
IP_MAX_MTU (65535) and the maximum sized IP datagram which can fit into
our L2 frames when we account for the L2 header. The latter can vary
depending on our tap backend, although it doesn't right now.
Link: https://bugs.passt.top/show_bug.cgi?id=66
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The pcap header includes a value indicating how much of each frame is
captured. We always capture the entire frame, so we want to set this to
the maximum possible frame size. Currently we do that by setting it to
ETH_MAX_MTU, but that's a confusingly named constant which might not always
be correct depending on the details of our tap backend.
Instead add a tap_l2_max_len() function that explicitly returns the maximum
frame size for the current mode and use that to set snaplen. While we're
there, there's no particular need for the pcap header to be defined in a
global; make it local to pcap_init() instead.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We define the size of pkt_buf as large enough to hold 128 maximum size
packets. Well, approximately, since we round down to the page size. We
don't have any specific reliance on how many packets can fit in the buffer,
we just want it to be big enough to allow reasonable batching. The
current definition relies on the confusingly named ETH_MAX_MTU and adds
in sizeof(uint32_t) rather non-obviously for the pseudo-physical header
used by the qemu socket (passt mode) protocol.
Instead, just define it to be 8MiB, which is what that complex calculation
works out to.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently in tap.c we (mostly) use ETH_MAX_MTU as the maximum length of
an L2 frame. This define comes from the kernel, but it's badly named and
used confusingly.
First, it doesn't really have anything to do with Ethernet, which has no
structural limit on frame lengths. It comes more from either a) IP which
imposes a 64k datagram limit or b) from internal buffers used in various
places in the kernel (and in passt).
Worse, MTU generally means the maximum size of the IP (L3) datagram which
may be transferred, _not_ counting the L2 headers. In the kernel
ETH_MAX_MTU is sometimes used that way, but sometimes seems to be used as
a maximum frame length, _including_ L2 headers. In tap.c we're mostly
using it in the second way.
Finally, each of our tap backends could have different limits on the frame
size imposed by the mechanisms they're using.
Start clearing up this confusion by replacing it in tap.c with new
L2_MAX_LEN_* defines which specifically refer to the maximum L2 frame
length for each backend.
Signed-off-by: David Gibson <dgibson@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we define both TAP_BUF_BYTES and PKT_BUF_BYTES as essentially
the same thing. They'll be different only if TAP_BUF_BYTES is negative,
which makes no sense. So, remove TAP_BUF_BYTES and just use PKT_BUF_BYTES.
In addition, most places we use this to just mean the size of the main
packet buffer (pkt_buf) for which we can just directly use sizeof.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We verify that every packet we store in a pool (and every partial packet
we retreive from it) has a length no longer than UINT16_MAX. This
originated in the older packet pool implementation which stored packet
lengths in a uint16_t. Now, that packets are represented by a struct
iovec with its size_t length, this check serves only as a sanity / security
check that we don't have some wildly out of range length due to a bug
elsewhere.
We have may reasons to (slightly) increase this limit in future, so in
preparation, give this quantity an explicit name - PACKET_MAX_LEN.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We detect our operating mode in conf_mode(), unless we're using vhost-user
mode, in which case we change it later when we parse the --vhost-user
option. That means we need to delay parsing the --repair-path option (for
vhost-user only) until still later.
However, there are many other places in the main option parsing loop which
also rely on mode. We get away with those, because they happen to be able
to treat passt and vhost-user modes identically. This is potentially
confusing, though. So, move setting of MODE_VU into conf_mode() so
c->mode always has its final value from that point onwards.
To match, we move the parsing of --repair-path back into the main option
parsing loop.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
One of the first things we need to do is determine if we're in passt mode
or pasta mode. Currently this is open-coded in main(), by examining
argv[0]. We want to complexify this a bit in future to cover vhost-user
mode as well. Prepare for this, by moving the mode detection into a new
conf_mode() function.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we rely on detecting our mode first and use different sets of
(single character) options for each. This means that if you use an option
valid in only one mode in another you'll get the generic usage() message.
We can give more helpful errors with little extra effort by combining all
the options into a single value of the option string and giving bespoke
messages if an option for the wrong mode is used; in fact we already did
this for some single mode options like '-1'.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
...and time out after that. This will be needed because of an upcoming
change to passt-repair enabling it to start before passt is started,
on both source and target, by means of an inotify watch.
Once the inotify watch triggers, passt-repair will connect right away,
but we have no guarantees that the connection completes before we
start the migration process, so wait for it (for a reasonable amount
of time).
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
It might not be feasible for users to start passt-repair after passt
is started, on a migration target, but before the migration process
starts.
For instance, with libvirt, the guest domain (and, hence, passt) is
started on the target as part of the migration process. At least for
the moment being, there's no hook a libvirt user (including KubeVirt)
can use to start passt-repair before the migration starts.
Add a directory watch using inotify: if PATH is a directory, instead
of connecting to it, we'll watch for a .repair socket file to appear
in it, and then attempt to connect to that socket.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
We have some functions in our headers which are definitely there on
purpose. However, they're not yet used outside the files in which they're
defined. That causes sufficiently recent cppcheck versions (2.17) to
complain they should be static.
Suppress the errors for these "logically" exported functions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
vhost-user added several functions which are exposed in headers, but not
used outside the file where they're defined. I can't tell if these are
really internal functions, or of they're logically supposed to be exported,
but we don't happen to have anything using them yet.
For the time being, just remove the exports. We can add them back if we
need to.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
tcp_update_csum() is exposed in tcp_internal.h, but is only used in tcp.c.
Remove the unneded prototype and make it static.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Several of the exposed functions in checksum.h are no longer directly used.
Remove them from the header, and make static. In particular sum_16b()
should not be used outside: generally csum_unfolded() should be used which
will automatically use either the AVX2 optimized version or sum_16b() as
necessary.
csum_fold() and csum() could have external uses, but they're not used right
now. We can expose them again if we need to.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
passt_vsyslog() is an exposed function in log.h. However it shouldn't
be called from outside log.c: it writes specifically to the system log,
and most code should call passt's logging helpers which might go to the
syslog or to a log file.
Make passt_vsyslog() local to log.c. This requires a code motion to avoid
a forward declaration.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This marks static a number of functions which are only used in their .c
file, have no prototypes in a .h and were never intended to be globally
exposed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When a local peer sends a UDP message to a non-existing port on an
existing remote host, that host will return an ICMPv6 message containing
the error code ICMP6_DST_UNREACH_NOPORT, plus the IPv6 header, UDP header
and the first 1232 bytes of the original message, if any. If the sender
socket has been connected, it uses this message to issue a
"Connection Refused" event to the user.
Until now, we have only read such events from the externally facing
socket, but we don't forward them back to the local sender because
we cannot read the ICMP message directly to user space. Because of
this, the local peer will hang and wait for a response that never
arrives.
We now fix this for IPv6 by recreating and forwarding a correct ICMP
message back to the internal sender. We synthesize the message based
on the information in the extended error structure, plus the returned
part of the original message body.
Note that for the sake of completeness, we even produce ICMP messages
for other error types and codes. We have noticed that at least
ICMP_PROT_UNREACH is propagated as an error event back to the user.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
[sbrivio: fix cppcheck warning, udp_send_conn_fail_icmp6() doesn't
modify saddr which can be declared as const]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We will need to build the UDP header at other locations than in function
tap_udp6_send(), so we break that part out to a separate function.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When a local peer sends a UDP message to a non-existing port on an
existing remote host, that host will return an ICMP message containing
the error code ICMP_PORT_UNREACH, plus the header and the first eight
bytes of the original message. If the sender socket has been connected,
it uses this message to issue a "Connection Refused" event to the user.
Until now, we have only read such events from the externally facing
socket, but we don't forward them back to the local sender because
we cannot read the ICMP message directly to user space. Because of
this, the local peer will hang and wait for a response that never
arrives.
We now fix this for IPv4 by recreating and forwarding a correct ICMP
message back to the internal sender. We synthesize the message based
on the information in the extended error structure, plus the returned
part of the original message body.
Note that for the sake of completeness, we even produce ICMP messages
for other error codes. We have noticed that at least ICMP_PROT_UNREACH
is propagated as an error event back to the user.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
[sbrivio: fix cppcheck warning: udp_send_conn_fail_icmp4() doesn't
modify 'in', it can be declared as const]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We will need to build the UDP header at other locations than in function
tap_udp4_send(), so we break that part out to a separate function.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we reject the -m option if given a value less than ETH_MIN_MTU
(68). That define is derived from the kernel, but its name is misleading:
it doesn't really have anything to do with Ethernet per se, but is rather
the minimum payload any L2 link must be able to handle in order to carry
IPv4. For IPv6, it's not sufficient: that requires an MTU of at least
1280.
Newer kernels have better named constants IPV4_MIN_MTU and IPv6_MIN_MTU.
Copy and use those constants instead, along with some more specific error
messages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently, if a non-SYN TCP packet arrives which doesn't match any existing
connection, we simply ignore it. However RFC 9293, section 3.10.7.1 says
we should respond with an RST to a non-SYN, non-RST packet that's for a
CLOSED (i.e. non-existent) connection.
This can arise in practice with migration, in cases where some error means
we have to discard a connection. We destroy the connection with tcp_rst()
in that case, but because the guest is stopped, we may not be able to
deliver the RST packet on the tap interface immediately. This change
ensures an RST will be sent if the guest tries to use the connection again.
A similar situation can arise if a passt/pasta instance is killed or
crashes, but is then replaced with another attached to the same guest.
This can leave the guest with stale connections that the new passt instance
isn't aware of. It's better to send an RST so the guest knows quickly
these are broken, rather than letting them linger until they time out.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
To allow more batching, we group together related packets into "seqs" in
the tap layer, before passing them to the L4 protocol layers. Currently
we consider the IP protocol, both IP addresses and also the L4 ports when
grouping things into seqs. We ignore the IPv6 flow label.
We have some future cases where we want to consider the the flow label in
the L4 code, which is awkward if we could be given a single batch with
multiple labels. Add the flow label to tap6_l4_t and group by it as well
as the other criteria. In future we could possibly use the flow label
_instead_ of peeking into the L4 header for the ports, but we don't do so
for now.
The guest should use the same flow label for all packets in a low, but if
it doesn't this change won't break anything, it just means we'll batch
things a bit sub-optimally.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The flow label is a 20-bit field in the IPv6 header. The length and
alignment make it awkward to pass around as is. Obviously, it can be
packed into a 32-bit integer though, and we do this in two places. We
have some further upcoming places where we want to manipulate the flow
label, so make some helpers for marshalling and unmarshalling it to an
integer.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In tcp_flow_migrate_target(), if we're unable to create and bind the new
socket, we print an error, cancel the flow and carry on. This seems to
make sense based on our policy of generally letting the migration complete
even if some or all flows are lost in the process. But it doesn't quite
work: the flow_alloc_cancel() means that the flows in the target's flow
table are no longer one to one match to the flows which the source is
sending data for. This means that data for later flows will be mismatched
to a different flow. Most likely that will cause some nasty error later,
but even worse it might appear to succeed but lead to data corruption due
to incorrectly restoring one of the flows.
Instead, we should leave the flow in the table until we've read all the
data for it, *then* discard it. Technically removing the
flow_alloc_cancel() would be enough for this: if tcp_flow_repair_socket()
fails it leaves conn->sock == -1, which will cause the restore functions
in tcp_flow_migrate_target_ext() to fail, discarding the flow. To make
what's going on clearer (and with less extraneous error messages), put
several explicit tests for a missing socket later in the migration path to
read the data associated with the flow but explicitly discard it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
tcp_rst() attempts to send an RST packet to the guest, and if that succeeds
moves the flow to CLOSED state. However, even if the tcp_send_flag() fails
the flow is still dead: we've usually closed the socket already, and
something has already gone irretrievably wrong. So we should still mark
the flow as CLOSED. That will cause it to be cleaned up, meaning any
future packets from the guest for it won't match a flow, so should generate
new RSTs (they don't at the moment, but that's a separate bug).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
There are two small bugs in error returns from tcp_low_repair_socket(),
which is supposed to return a negative errno code:
1) On bind() failures, wedirectly pass on the return code from bind(),
which is just 0 or -1, instead of an error code.
2) In the caller, tcp_flow_migrate_target() we call strerror_() directly
on the negative error code, but strerror() requires a positive error
code.
Correct both of these.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If
passt-repair is not started, our failure mode is pretty ugly though: we'll
attempt the migration, hitting various problems when we can't enter repair
mode. In some cases we may not roll back these changes properly, meaning
we break network connections on the source.
Our general approach is not to completely block migration if there are
problems, but simply to break any flows we can't migrate. So, if we have
no connection from passt-repair carry on with the migration, but don't
attempt to migrate any TCP connections.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We could get a migration request when we have no active flows; or at least
none that we need or are able to migrate. In this case after sending or
receiving the number of flows we continue to step through various lists.
In the target case, this could include communication with passt-repair. If
passt-repair wasn't started that could cause further errors, but of course
they shouldn't matter if we have nothing to repair.
Make it more obvious that there's nothing to do and avoid such errors by
short-circuiting flow_migrate_{source,target}() if there are no migratable
flows.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Here are a bunch of workarounds and a couple of fixes for libvirt
usage which are rather hard to split into single logical patches
as there appear to be some obscure dependencies between some of them:
- passt-repair needs to have an exec_type typeattribute (otherwise
the policy for lsmd(1) causes a violation on getattr on its
executable) file, and that typeattribute just happened to be there
for passt as a result of init_daemon_domain(), but passt-repair
isn't a daemon, so we need an explicit corecmd_executable_file()
- passt-repair needs a workaround, which I'll revisit once
https://github.com/fedora-selinux/selinux-policy/issues/2579 is
solved, for usage with libvirt: allow it to use qemu_var_run_t
and virt_var_run_t sockets
- add 'bpf' and 'dac_read_search' capabilities for passt-repair:
they are needed (for whatever reason I didn't investigate) to
actually receive socket files via SCM_RIGHTS
- passt needs further workarounds in the sense of
https://github.com/fedora-selinux/selinux-policy/issues/2579:
allow it to use map and use svirt_tmpfs_t (not just svirt_image_t):
it depends on where the libvirt guest image is
- ...it also needs to map /dev/null if <access mode='shared'/> is
enabled in libvirt's XML for the memoryBacking object, for
vhost-user operation
- and 'ioctl' on the TCP socket appears to be actually needed, on top
of 'getattr', to dump some socket parameters
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When printing list of allowed syscalls the width of terminal is
obtained for nicer output (see commit below). The width is
obtained by running 'stty'. While this works when building from a
console, it doesn't work during rpmbuild/emerge/.. as stdout is
usually not a console but a logfile and stdin is usually
/dev/null or something. This results in stty reporting errors
like this:
stty: 'standard input': Inappropriate ioctl for device
Redirect stty's stderr to /dev/null to silence it.
Fixes: 712ca32353 ("seccomp.sh: Try to account for terminal width while formatting list of system calls")
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When studying the Linux source code and Wireshark dumps it seems like
the no_frag flag in the IPv4 header is always set. Following discussions
in the Internet on this subject indicates that modern routers never
fragment packets, and that it isn't even supported in many cases.
Adding to this that incoming messages forwarded on the tap interface
never even pass through a router it seems safe to always set this flag.
This makes the IPv4 headers of forwarded messages identical to those
sent by the external sockets, something we must consider desirable.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Otherwise we build it, but we don't install it. Not an issue that
warrants a a release right away as it's anyway usable.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
...instead of unconditionally trying to enable both: mmap2() is the
32-bit ARM variant for mmap() (and perhaps for other architectures),
bot if mmap() is available, valgrind will use that one.
This avoids seccomp.sh warning us about missing mmap2() if mmap() is
present, and is consistent with what we do in vhost-user code.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
On the command line -m 0 means "don't assign an MTU" (letting the guest use
its default. However, internally we use (c->mtu == -1) to represent that
state. We use (c->mtu == 0) to represent "the user didn't specify on the
command line, so use the default" - but this is only used during conf(),
never afterwards.
This is unnecessarily confusing. We can instead just initialise c->mtu to
its default (65520) before parsing options and use 0 on both the command
line and internally to represent the "don't assign" special case. This
ensures that c->mtu is always 0..65535, so we can store it in a uint16_t
which is more natural.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We're a bit sloppy with parsing MTU which can lead to some surprising,
though fairly harmless, results:
* Passing a non-number like '-m xyz' will not give an error and act like
-m 0
* Junk after a number (e.g. '-m 1500pqr') will be ignored rather than
giving an error
* We parse the MTU as a long, then immediately assign to an int, so on
some platforms certain ludicrously out of bounds values will be
silently truncated, rather than giving an error
Be a bit more thorough with the error checking to avoid that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The migration code introduced a number of 'foreach' macros to traverse the
flow table. These aren't inherently tied to migration, so polish up their
naming, move them to flow_table.h and also use in flow_defer_handler()
which is the other place we need to traverse the whole table.
For now we keep foreach_established_tcp_flow() as is.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The foreach macros used to step through flows each take a 'bound' parameter
to only scan part of the flow table. Only one place actually passes a
bound different from FLOW_MAX. So we can simplify every other invocation
by having that one case manually handle the bound.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The foreach macros are odd in that they take two loop counters: an integer
index, and a pointer to the flow. We nearly always want the latter, not
the former, and we can get the index from the pointer trivially when we
need it. So, rearrange the macros not to need the integer index.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Our general logging helpers include a number of _perror() variants which,
like perror(3) include the description of the current errno. We didn't
have those for our flow specific logging helpers, though. Fill this gap
with flow_perror() and flow_dbg_perror(), and use them where it's useful.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
tcp_flow_migrate_source_ext() is passed both the index of the flow it
operates on and the pointer to the connection structure. However, the
former is trivially derived from the latter. Simplify the interface.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This function existed in drafts of the migration code, but not the final
version. Get rid of the prototype.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
tcp_flow_migrate_target_ext() takes a raw union flow *, although it is TCP
specific, and requires a FLOW_TYPE_TCP entry. Our usual convention is that
such functions should take a struct tcp_tap_conn * instead. Convert it to
do so.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
head_cnt is a global variable which tracks how many entries in head[] are
currently used. The fact that it's global obscures the fact that the
lifetime over which it has a meaningful value is quite short: a single
call to of tcp_vu_data_from_sock().
Make it a local to tcp_vu_data_from_sock() to make that lifetime clearer.
We keep the head[] array global for now - although technically it has the
same valid lifetime - because it's large enough we might not want to put
it on the stack.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The uses of this macro were removed in d4598e1d18 ("udp: Use the same
buffer for the L2 header for all frames").
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Fundamentally what packet_check_range() does is to check whether a given
memory range is within the allowed / expected memory set aside for packets
from a particular pool. That range could represent a whole packet (from
packet_add_do()) or part of a packet (from packet_get_do()), but it doesn't
really matter which.
However, we pass the start of the range as two parameters: @start which is
the start of the packet, and @offset which is the offset within the packet
of the range we're interested in. We never use these separately, only as
(start + offset). Simplify the interface of packet_check_range() and
vu_packet_check_range() to directly take the start of the relevant range.
This will allow some additional future improvements.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we have a dummy pkt[1] array, which we alias with an array of
a different size via various macros. However, we already require C11 which
includes flexible array members, so we can do better.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>