linux/Documentation/networking
Shahab Vahedi f122668ddc ARC: Add eBPF JIT support
Add eBPF JIT support for the 32-bit ARCv2 processors. The
implementation was validated by running the BPF tests on a Synopsys HSDK
board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU.

The test_bpf.ko module reports 2- to 10-fold improvements in the
execution time of its tests. For instance:

test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS
test_bpf: #33 tcpdump port 22 jited:1 120  224  260 PASS

test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS
test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1  23 PASS

test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS
test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS
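
The speed-up behind each pair of lines above can be worked out with a
small host-side helper. This is only an illustrative sketch and not part
of the patch: it assumes that the numbers between "jited:N" and "PASS"
are per-run execution times, and the names LOG, LINE and speedups() are
made up for this example.

```python
import re

# The sample lines quoted above, verbatim.
LOG = """\
test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS
test_bpf: #33 tcpdump port 22 jited:1 120  224  260 PASS
test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS
test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1  23 PASS
test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS
test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS
"""

# Test number, jited flag, and the run of timing columns before PASS.
LINE = re.compile(r"test_bpf: #(\d+) .* jited:([01])((?:\s+\d+)+) PASS")


def speedups(log):
    """Map test number -> list of interpreter/JIT time ratios."""
    times = {}
    for m in LINE.finditer(log):
        test_id, jited = int(m.group(1)), int(m.group(2))
        times.setdefault(test_id, {})[jited] = [int(t) for t in m.group(3).split()]
    return {
        test_id: [slow / fast for slow, fast in zip(t[0], t[1])]
        for test_id, t in times.items()
        if 0 in t and 1 in t
    }


if __name__ == "__main__":
    for test_id, ratios in speedups(LOG).items():
        print(f"#{test_id}:", " ".join(f"{r:.1f}x" for r in ratios))
```

For the three samples this gives ratios of roughly 5.9x to 8.1x for #33,
10.3x for #141 and 2.0x for #776, which matches the 2- to 10-fold range
quoted above.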

Deployment and structure
------------------------
The related code is added to "arch/arc/net":

- bpf_jit.h       -- The interface that a back-end translator must provide
- bpf_jit_core.c  -- Knows how to handle the input eBPF byte stream
- bpf_jit_arcv2.c -- The back-end code that knows the translation logic

The bpf_int_jit_compile() function at the end of bpf_jit_core.c is the
entry point to the whole process. Normally, the translation is done in
one pass, namely the "normal pass". If some relocations are not known
during this pass, some data (arc_jit_data) is allocated for the next
pass. This possible next (and last) pass is called the "extra pass".

1. Normal pass       # The necessary pass
     1a. Dry run       # Get the whole JIT length, epilogue offset, etc.
     1b. Emit phase    # Allocate memory and start emitting instructions
2. Extra pass        # Only needed if there are relocations to be fixed
     2a. Patch relocations
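
The pass structure above can be modelled with a toy translator. The
sketch below is not the ARC implementation: the four-instruction ISA,
its byte encodings, and the names SIZES, dry_run(), emit() and
extra_pass() are all invented for illustration. It only shows why a dry
run is needed (jump displacements depend on the final offsets) and how
an unknown call address is left as a relocation for a later pass.

```python
# Toy instruction sizes in bytes (hypothetical encoding).
SIZES = {"mov": 2, "jmp": 2, "call": 5, "exit": 1}


def dry_run(prog):
    """1a: compute each instruction's offset and the total length."""
    offsets, pos = [], 0
    for op, *_ in prog:
        offsets.append(pos)
        pos += SIZES[op]
    return offsets, pos


def emit(prog, offsets, length, symbols):
    """1b: allocate the buffer and emit; record unresolved calls."""
    buf = bytearray(length)
    relocs = []
    for (op, *args), off in zip(prog, offsets):
        if op == "mov":
            buf[off:off + 2] = bytes([0x01, args[0]])
        elif op == "jmp":
            # Displacement is relative to the end of the jump itself.
            disp = offsets[args[0]] - (off + 2)
            buf[off:off + 2] = bytes([0x02, disp & 0xFF])
        elif op == "call":
            addr = symbols.get(args[0])
            if addr is None:
                # Address not known yet: leave zeros and remember the spot.
                relocs.append((off + 1, args[0]))
                addr = 0
            buf[off:off + 5] = bytes([0x03]) + addr.to_bytes(4, "little")
        elif op == "exit":
            buf[off] = 0x04
    return buf, relocs


def extra_pass(buf, relocs, symbols):
    """2a: patch the recorded relocations in place."""
    for pos, name in relocs:
        buf[pos:pos + 4] = symbols[name].to_bytes(4, "little")


if __name__ == "__main__":
    prog = [("mov", 7), ("jmp", 3), ("call", "helper"), ("exit",)]
    offsets, length = dry_run(prog)
    buf, relocs = emit(prog, offsets, length, symbols={})
    if relocs:  # the extra pass runs only when something is unresolved
        extra_pass(buf, relocs, {"helper": 0x1234})
    print(buf.hex())
```

If every symbol is already known during the emit phase, relocs stays
empty and the extra pass is skipped, which mirrors the "only needed if
there are relocations to be fixed" note above.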

Support status
--------------
The JIT compiler supports BPF instructions up to "cpu=v4". However, it
does not yet provide support for:

- Tail calls
- Atomic operations
- 64-bit division/remainder
- BPF_PROBE_MEM* (exception table)

The result of the "test_bpf" test suite on an HSDK board is:

hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf

  test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]

All the failing test cases are ones that were not JIT'ed. By category,
they break down as:

  .-----------.------------.-------------.
  | test type |   opcodes  | # of cases  |
  |-----------+------------+-------------|
  | atomic    | 0xC3, 0xDB |         149 |
  | div64     | 0x37, 0x3F |          22 |
  | mod64     | 0x97, 0x9F |          15 |
  `-----------^------------+-------------|
                           | (total) 186 |
                           `-------------'
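
As a quick consistency check, the per-category counts in the table can
be compared against the FAILED figure in the summary line. The snippet
below is a host-side sketch only; parse_summary() and the dictionary
names are hypothetical, and the bracketed JIT figures are kept exactly
as reported without interpreting them.

```python
import re

# Summary line and table figures copied from the text above.
SUMMARY = "test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]"
FAILED_BY_TYPE = {"atomic": 149, "div64": 22, "mod64": 15}

SUMMARY_RE = re.compile(
    r"Summary: (\d+) PASSED, (\d+) FAILED, \[(\d+)/(\d+) JIT'ed\]"
)


def parse_summary(line):
    """Extract the counters from a test_bpf summary line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not a test_bpf summary line")
    passed, failed, jit_a, jit_b = map(int, m.groups())
    # The bracketed pair is returned as-is, without interpretation.
    return {"passed": passed, "failed": failed, "jit": (jit_a, jit_b)}


if __name__ == "__main__":
    summary = parse_summary(SUMMARY)
    table_total = sum(FAILED_BY_TYPE.values())
    print("table total:", table_total, "summary FAILED:", summary["failed"])
```

Both figures come out as 186, so the three categories account for every
failure in the summary.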

Setup: build config
-------------------
The following configs must be set to have a working JIT test:

  CONFIG_BPF_JIT=y
  CONFIG_BPF_JIT_ALWAYS_ON=y
  CONFIG_TEST_BPF=m
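
Whether a given build has these options set can be checked against the
generated .config. The helper below is a sketch rather than a kernel
tool: REQUIRED and missing_options() are names chosen for this example,
and it assumes the usual .config layout of "CONFIG_FOO=value" lines,
with disabled options appearing only as comments.

```python
# Options listed above, with the values they are expected to have.
REQUIRED = {
    "CONFIG_BPF_JIT": "y",
    "CONFIG_BPF_JIT_ALWAYS_ON": "y",
    "CONFIG_TEST_BPF": "m",
}


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are absent or set differently."""
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        found[key] = value
    return {k: v for k, v in required.items() if found.get(k) != v}


if __name__ == "__main__":
    sample = """\
CONFIG_BPF_JIT=y
# CONFIG_BPF_JIT_ALWAYS_ON is not set
CONFIG_TEST_BPF=m
"""
    for option, value in missing_options(sample).items():
        print(f"{option} should be set to {value}")
```

In practice the text would be read from the .config file in the build
directory instead of the inline sample.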

The following options are not necessary for the tests module,
but are good to have:

  CONFIG_DEBUG_INFO=y             # prerequisite for below
  CONFIG_DEBUG_INFO_BTF=y         # so bpftool can generate vmlinux.h

  CONFIG_FTRACE=y                 #
  CONFIG_BPF_SYSCALL=y            # all these options lead to
  CONFIG_KPROBE_EVENTS=y          # having CONFIG_BPF_EVENTS=y
  CONFIG_PERF_EVENTS=y            #

Some BPF programs provide data through /sys/kernel/debug:

  CONFIG_DEBUG_FS=y

  arc # mount -t debugfs debugfs /sys/kernel/debug

Setup: elfutils
---------------
The libdw.{so,a} library that pahole uses for processing the final
binary must come from elfutils 0.189 or newer. ARCv2 [1] has been
supported since that version.

[1]
https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7

Setup: pahole
-------------
The line below in linux/scripts/Makefile.btf must be commented out:

pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats

Otherwise, the build will fail:

$ make V=1
  ...
  BTF     .btf.vmlinux.bin.o
pahole -J --btf_gen_floats                    \
       -j --lang_exclude=rust                 \
       --skip_encoding_btf_inconsistent_proto \
       --btf_gen_optimized .tmp_vmlinux.btf
Complex, interval and imaginary float types are not supported
Encountered error while encoding BTF.
  ...
  BTFIDS  vmlinux
./tools/bpf/resolve_btfids/resolve_btfids vmlinux
libbpf: failed to find '.BTF' ELF section in vmlinux
FAILED: load BTF from vmlinux: No data available

This happens because the ARC toolchains generate "complex float" DIE
entries in libgcc, and at the moment pahole cannot handle such entries.

Running the tests
-----------------
host$ scp /bld/linux/lib/test_bpf.ko arc:
arc # sysctl net.core.bpf_jit_enable=1
arc # insmod test_bpf.ko test_suite=test_bpf
      ...
      test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS
      test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]

Acknowledgments
---------------
- Claudiu Zissulescu for his unwavering support
- Yuriy Kolerov for testing and troubleshooting
- Vladimir Isaev for the pahole workaround
- Sergey Matyukevich for paving the road by adding the interpreter support

Signed-off-by: Shahab Vahedi <shahab@synopsys.com>
Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:51:36 -07:00
caif
device_drivers net/mlx5e: Introduce timestamps statistic counter for Tx DMA layer 2024-04-05 22:24:09 -07:00
devlink ice: Document tx_scheduling_layers parameter 2024-04-22 13:05:19 -07:00
dsa net: dsa: Rename IFLA_DSA_MASTER to IFLA_DSA_CONDUIT 2023-10-24 13:08:14 -07:00
mac80211_hwsim
net_cachelines net: move dev->state into net_device_read_txrx group 2024-03-19 10:47:47 +01:00
netlink_spec Documentation: Document each netlink family 2023-11-24 01:16:56 +00:00
pse-pd net: pse-pd: Add support for PSE PIs 2024-04-18 18:27:33 -07:00
6lowpan.rst
6pack.rst
af_xdp.rst xsk: document ability to redirect to any socket bound to the same umem 2024-02-05 20:01:15 -08:00
alias.rst
arcnet-hardware.rst
arcnet.rst
atm.rst
ax25.rst Documentation: netdev: fix dead link in ax25.rst 2023-09-18 12:56:58 +01:00
bareudp.rst
batman-adv.rst
bonding.rst bonding: Add independent control state machine 2024-02-06 13:17:54 +01:00
bridge.rst Documentation: update mailing list addresses 2024-02-21 13:44:21 -07:00
can.rst can: bcm: add recvmsg flags for own, local and remote traffic 2024-02-12 16:55:17 +01:00
can_ucan_protocol.rst
cdc_mbim.rst
checksum-offloads.rst
dccp.rst
dctcp.rst
dns_resolver.rst dns_resolver: correct module name in dns resolver documentation 2024-03-26 10:15:36 +01:00
driver.rst
eql.rst
ethtool-netlink.rst net: ethtool: pse-pd: Expand pse commands with the PSE PoE interface 2024-04-18 18:27:02 -07:00
failover.rst
fib_trie.rst
filter.rst ARC: Add eBPF JIT support 2024-05-12 16:51:36 -07:00
gen_stats.rst
generic-hdlc.rst
generic_netlink.rst
gtp.rst
ieee802154.rst
ila.rst
index.rst ethtool: Expand Ethernet Power Equipment with c33 (PoE) alongside PoDL 2024-04-18 18:27:01 -07:00
ioam6-sysctl.rst
ip-sysctl.rst net: ipv6/addrconf: clamp preferred_lft to the minimum required 2024-02-15 15:34:40 +01:00
ip_dynaddr.rst
ipsec.rst
ipv6.rst
ipvlan.rst
ipvs-sysctl.rst
j1939.rst
kapi.rst
kcm.rst
l2tp.rst PPPoL2TP: Add more code snippets 2024-02-21 17:13:21 -08:00
lapb-module.rst
mac80211-auth-assoc-deauth.txt
mac80211-injection.rst
mctp.rst
mpls-sysctl.rst
mptcp-sysctl.rst mptcp: add a new sysctl for make after break timeout 2023-10-25 12:23:33 -07:00
msg_zerocopy.rst docs: net: description of MSG_ZEROCOPY for AF_VSOCK 2023-10-15 13:19:42 +01:00
multi-pf-netdev.rst docs: networking: fix indentation errors in multi-pf-netdev 2024-03-14 13:19:27 +01:00
multiqueue.rst
napi.rst
net_dim.rst
net_failover.rst
netconsole.rst net: netconsole: Add continuation line prefix to userdata messages 2024-03-11 14:07:57 -07:00
netdev-features.rst
netdevices.rst net: remove stale mentions of dev_base_lock in comments 2024-02-14 11:20:13 +00:00
netfilter-sysctl.rst
netif-msg.rst
nexthop-group-resilient.rst
nf_conntrack-sysctl.rst netfilter: set default timeout to 3 secs for sctp shutdown send and recv state 2023-08-16 00:05:15 +02:00
nf_flowtable.rst
nfc.rst
openvswitch.rst
operstates.rst
packet_mmap.rst mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
page_pool.rst net: page_pool: expose page pool stats via netlink 2023-11-28 15:48:39 +01:00
phonet.rst
phy.rst net: phy: Introduce PSGMII PHY interface mode 2023-08-14 08:12:53 +01:00
pktgen.rst pktgen: Introducing 'SHARED' flag for testing with non-shared skb 2023-09-28 16:25:14 +02:00
plip.rst
ppp_generic.rst
proc_net_tcp.rst
radiotap-headers.rst
rds.rst
regulatory.rst
representors.rst Documentation: Add documentation for eswitch attribute 2024-03-28 18:20:08 -07:00
rxrpc.rst
scaling.rst net: ethtool: add support for symmetric-xor RSS hash 2023-12-13 22:07:16 -08:00
sctp.rst
secid.rst
seg6-sysctl.rst
segmentation-offloads.rst
sfp-phylink.rst doc: sfp-phylink: update the porting guide with PCS handling 2024-03-07 15:27:05 +01:00
skbuff.rst
smc-sysctl.rst net/smc: add sysctl for max conns per lgr for SMC-R v2.1 2023-11-24 12:13:14 +00:00
snmp_counter.rst docs: automarkup: linkify git revs 2023-11-17 13:13:24 -07:00
statistics.rst netdev: add per-queue statistics 2024-03-07 21:13:25 -08:00
strparser.rst
switchdev.rst
sysfs-tagging.rst
tc-actions-env-rules.rst
tc-queue-filters.rst
tcp-thin.rst
tcp_ao.rst Documentation/tcp: Fix an obvious typo 2023-12-06 12:36:55 +01:00
team.rst
timestamping.rst docs: networking: timestamping: mention MSG_EOR flag 2023-12-13 18:19:39 -08:00
tipc.rst
tls-handshake.rst
tls-offload-layers.svg
tls-offload-reorder-bad.svg
tls-offload-reorder-good.svg
tls-offload.rst
tls.rst
tproxy.rst
tuntap.rst
udplite.rst
vrf.rst
vxlan.rst
x25-iface.rst
x25.rst
xdp-rx-metadata.rst xdp: Add VLAN tag hint 2023-12-13 16:16:40 -08:00
xfrm_device.rst xfrm: generalize xdo_dev_state_update_curlft to allow statistics update 2024-02-05 16:45:49 -08:00
xfrm_proc.rst
xfrm_sync.rst
xfrm_sysctl.rst
xsk-tx-metadata.rst xsk: Add missing SPDX to AF_XDP TX metadata documentation 2023-12-05 15:08:50 +01:00