linux/net/smc
Guangguan Wang bfc6c67ec2 net/smc: use the correct ndev to find pnetid by pnetid table
When SMC looks up a pnetid through smc_pnet, it searches only the
base_ndev of the netdev hierarchy (for both the HW PNETID and the
user-defined sw pnetid). This does not work in some scenarios where
SMC is used in containers in a cloud environment.
In a container, there are several choices of container network,
such as using the host network directly, an IPVLAN virtual network,
veth, etc. Different container networks have different netdev
hierarchies. Examples of netdev hierarchies are shown below. (eth0 and
eth1 in the host below are the netdevs directly related to the physical
device.)
            _______________________________
           |   _________________           |
           |  |POD              |          |
           |  |                 |          |
           |  | eth0_________   |          |
           |  |____|         |__|          |
           |       |         |             |
           |       |         |             |
           |   eth1|base_ndev| eth0_______ |
           |       |         |    | RDMA  ||
           | host  |_________|    |_______||
           ---------------------------------
     netdev hierarchy if directly using host network
           ________________________________
           |   _________________           |
           |  |POD  __________  |          |
           |  |    |upper_ndev| |          |
           |  |eth0|__________| |          |
           |  |_______|_________|          |
           |          |lower netdev        |
           |        __|______              |
           |   eth1|         | eth0_______ |
           |       |base_ndev|    | RDMA  ||
           | host  |_________|    |_______||
           ---------------------------------
            netdev hierarchy if using IPVLAN
            _______________________________
           |   _____________________       |
           |  |POD        _________ |      |
           |  |          |base_ndev||      |
           |  |eth0(veth)|_________||      |
           |  |____________|________|      |
           |               |pairs          |
           |        _______|_              |
           |       |         | eth0_______ |
           |   veth|base_ndev|    | RDMA  ||
           |       |_________|    |_______||
           |        _________              |
           |   eth1|base_ndev|             |
           | host  |_________|             |
           ---------------------------------
             netdev hierarchy if using veth
For some reason, eth1 in the host is not an RDMA-attached netdevice, so
a pnetid is needed to map eth1 (in host) to the RDMA device so that the
POD can do SMC-R. However, eth1 (in host) is managed by a CNI plugin
(such as Terway, a network management plugin for container
environments). In a cloud environment the eth (in host) can be
dynamically inserted by the CNI when a POD is created, and dynamically
removed by the CNI when the POD is destroyed and no POD is related to
the eth (in host) anymore. This makes it hard to configure the pnetid
on eth1 (in host), but it is easy to configure the pnetid on the
netdevice that is visible in the POD. When doing SMC-R, both the
container using the host network directly and the container using the
veth network can successfully match the RDMA device, because the netdev
configured with the pnetid is a base_ndev. But the container using
IPVLAN cannot match the RDMA device, and a 0x03030000 fallback happens,
because the netdev configured with the pnetid is not a base_ndev.
Additionally, configuring the pnetid on eth1 (in host) does not work
either for matching the RDMA device when using the veth network and
doing SMC-R in the POD.

To resolve the problems listed above, this patch extends the search to
look for the user-defined sw pnetid in the clc handshake ndev when no
pnetid can be found in the base_ndev. The base_ndev takes precedence
over the handshake ndev for backward compatibility. This patch also
unifies the pnetid setup across the different container network choices
listed above (configure the user-defined sw pnetid on the netdevice
that can be seen in the POD).
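
The lookup order described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
code in smc_pnet.c: the model_* types and functions are hypothetical,
devices are matched by a plain name string, and each device has at most
one lower device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device: a name plus a link to the
 * device directly below it (NULL when the device is itself a base device). */
struct model_ndev {
	const char *name;
	const struct model_ndev *lower;
};

/* Hypothetical entry of a user-defined (software) pnetid table. */
struct model_pnet_entry {
	const char *ndev_name;
	const char *pnetid;
};

/* Walk down the hierarchy until the lowest device is reached. */
static const struct model_ndev *model_base_ndev(const struct model_ndev *ndev)
{
	assert(ndev != NULL);
	while (ndev->lower)
		ndev = ndev->lower;
	return ndev;
}

/* Return the pnetid configured for ndev, or NULL if there is none. */
static const char *model_table_lookup(const struct model_pnet_entry *tbl,
				      size_t n, const struct model_ndev *ndev)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i].ndev_name, ndev->name) == 0)
			return tbl[i].pnetid;
	return NULL;
}

/* Lookup order from the commit message: try the base_ndev first (backward
 * compatible behaviour), then fall back to the handshake ndev itself. */
static const char *model_find_pnetid(const struct model_pnet_entry *tbl,
				     size_t n,
				     const struct model_ndev *handshake_ndev)
{
	const struct model_ndev *base = model_base_ndev(handshake_ndev);
	const char *pnetid = model_table_lookup(tbl, n, base);

	if (!pnetid && base != handshake_ndev)
		pnetid = model_table_lookup(tbl, n, handshake_ndev);
	return pnetid;
}
```

In this model, the IPVLAN case corresponds to a handshake device whose
lower pointer refers to the host device. An entry configured only for
the POD-visible device is then found through the fallback step, while an
entry configured for the base device still wins when both exist.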

Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-14 12:54:40 +00:00
af_smc.c net: better track kernel sockets lifetime 2025-02-21 16:00:58 -08:00
Kconfig net/smc: introduce loopback-ism for SMC intra-OS shortcut 2024-04-30 13:24:48 +02:00
Makefile net/smc: Introduce IPPROTO_SMC 2024-06-17 13:14:09 +01:00
smc.h net/smc: Address spelling errors 2024-10-10 09:05:20 -07:00
smc_cdc.c net/smc: adapt cursor update when sndbuf and peer DMB are merged 2024-04-30 13:24:48 +02:00
smc_cdc.h net/smc: fix kernel panic caused by race of smc_sock 2021-12-28 12:42:45 +00:00
smc_clc.c net/smc: check return value of sock_recvmsg when draining clc data 2024-12-15 12:34:59 +00:00
smc_clc.h net/smc: check smcd_v2_ext_offset when receiving proposal msg 2024-12-15 12:34:59 +00:00
smc_close.c net/smc: put sk reference if close work was canceled 2023-11-06 10:01:07 +00:00
smc_close.h
smc_core.c net/smc: delete pointless divide by one 2025-01-11 13:08:54 -08:00
smc_core.h net/smc: support SMC-R V2 for rdma devices with max_recv_sge equals to 1 2024-12-12 13:50:00 +01:00
smc_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-01-25 14:20:08 -08:00
smc_ib.c net/smc: support SMC-R V2 for rdma devices with max_recv_sge equals to 1 2024-12-12 13:50:00 +01:00
smc_ib.h net/smc: fix smc clc failed issue when netdevice not in init_net 2023-10-13 16:52:02 -07:00
smc_inet.c net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC 2024-10-10 08:48:11 -07:00
smc_inet.h net/smc: Introduce IPPROTO_SMC 2024-06-17 13:14:09 +01:00
smc_ism.c net/smc: add operations to merge sndbuf with peer DMB 2024-04-30 13:24:48 +02:00
smc_ism.h net/smc: add operations to merge sndbuf with peer DMB 2024-04-30 13:24:48 +02:00
smc_llc.c net/smc: support SMC-R V2 for rdma devices with max_recv_sge equals to 1 2024-12-12 13:50:00 +01:00
smc_llc.h net/smc: Introduce a specific sysctl for TEST_LINK time 2022-09-22 12:58:21 +02:00
smc_loopback.c net/smc: implement DMB-merged operations of loopback-ism 2024-04-30 13:24:49 +02:00
smc_loopback.h net/smc: remove unreferenced header in smc_loopback.h file 2024-07-31 11:48:58 +01:00
smc_netlink.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
smc_netlink.h net/smc: add support for user defined EIDs 2021-09-14 12:49:10 +01:00
smc_netns.h net/smc: introduce list of pnetids for Ethernet devices 2020-09-28 15:19:03 -07:00
smc_pnet.c net/smc: use the correct ndev to find pnetid by pnetid table 2025-03-14 12:54:40 +00:00
smc_pnet.h net/smc: Use a mutex for locking "struct smc_pnettable" 2022-02-24 09:09:33 -08:00
smc_rx.c net/smc: fix data error when recvmsg with MSG_PEEK flag 2025-01-13 18:59:00 -08:00
smc_rx.h net/smc: fix data error when recvmsg with MSG_PEEK flag 2025-01-13 18:59:00 -08:00
smc_stats.c net/smc: introduce statistics for ringbufs usage of net namespace 2024-08-20 11:38:23 +02:00
smc_stats.h net/smc: introduce statistics for ringbufs usage of net namespace 2024-08-20 11:38:23 +02:00
smc_sysctl.c net/smc: add sysctl for smc_limit_hs 2024-09-10 12:11:04 +02:00
smc_sysctl.h net/smc: add sysctl for max conns per lgr for SMC-R v2.1 2023-11-24 12:13:14 +00:00
smc_tracepoint.c net/smc: Introduce tracepoint for smcr link down 2021-11-01 13:39:14 +00:00
smc_tracepoint.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
smc_tx.c net/smc: remove unneeded atomic operations in smc_tx_sndbuf_nonempty 2023-11-24 15:00:47 +00:00
smc_tx.h smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES 2023-06-24 15:50:12 -07:00
smc_wr.c net/smc: support SMC-R V2 for rdma devices with max_recv_sge equals to 1 2024-12-12 13:50:00 +01:00
smc_wr.h net/smc: Use percpu ref for wr tx reference 2023-03-17 08:59:01 +00:00