Commit graph

4 commits

Author SHA1 Message Date
Stefano Brivio
087b5f4dbb LICENSES: Add license text files, add missing notices, fix SPDX tags
SPDX tags don't replace license files. Some notices were missing and
some tags were not according to the SPDX specification, too.

Now reuse --lint from the REUSE tool (https://reuse.software/) passes.

Reported-by: Martin Hauke <mardnh@gmx.de>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-20 08:29:30 +02:00
Stefano Brivio
4b12cf94f0 checksum: Stream load into four registers at a time with > 128 bytes
...and further interleave register usage. This brings the csum()
overhead reported by perf(1) for 30 seconds of 64KiB TCP IPv4
frames, host to guest, from 7.2% to 5.8%.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-15 17:04:46 +02:00
Stefano Brivio
74f29d3148 checksum: Interleave lo/hi sums while folding into 128-bit sums, drop TODO
I left a TODO and never checked -- this actually seems to slightly
improve CPIs on AMD Naples (two 128-bit FMA units glued together).

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-15 16:59:12 +02:00
Stefano Brivio
17765f8de0 checksum: Introduce AVX2 implementation, unify helpers
Provide an AVX2-based function using compiler intrinsics for
TCP/IP-style checksums. The load/unpack/add idea and implementation
is largely based on code from BESS (the Berkeley Extensible Software
Switch) licensed as 3-Clause BSD, with a number of modifications to
further decrease pipeline stalls and to minimise cache pollution.

This speeds up considerably data paths from sockets to tap
interfaces, decreasing overhead for checksum computation, with
16-64KiB packet buffers, from approximately 11% to 7%. The rest is
just syscalls at this point.

While at it, provide convenience targets in the Makefile for avx2,
avx2_debug, and debug targets -- these simply add target-specific
CFLAGS to the build.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-07-26 07:18:50 +02:00