linux/tools/testing/selftests/bpf/progs/mptcp_subflow.c

129 lines
2.9 KiB
C
Raw Permalink Normal View History

selftests/bpf: Add mptcp subflow example Move Nicolas' patch into bpf selftests directory. This example adds a different mark (SO_MARK) on each subflow, and changes the TCP CC only on the first subflow. From the userspace, an application can do a setsockopt() on an MPTCP socket, and typically the same value will be propagated to all subflows (paths). If someone wants to have different values per subflow, the recommended way is to use BPF. So it is good to add such example here, and make sure there is no regressions. This example shows how it is possible to: Identify the parent msk of an MPTCP subflow. Put different sockopt for each subflow of a same MPTCP connection. Here especially, two different behaviours are implemented: A socket mark (SOL_SOCKET SO_MARK) is put on each subflow of a same MPTCP connection. The order of creation of the current subflow defines its mark. The TCP CC algorithm of the very first subflow of an MPTCP connection is set to "reno". This is just to show it is possible to identify an MPTCP connection, and set socket options, from different SOL levels, per subflow. "reno" has been picked because it is built-in and usually not set as default one. It is easy to verify with 'ss' that these modifications have been applied correctly. That's what the next patch is going to do. Nicolas' code comes from: commit 4d120186e4d6 ("bpf:examples: update mptcp_set_mark_kern.c") from the MPTCP repo https://github.com/multipath-tcp/mptcp_net-next (the "scripts" branch), and it has been adapted by Geliang. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76 Co-developed-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-1-d26029e15cdd@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-09-26 19:30:22 +02:00
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2020, Tessares SA. */
/* Copyright (c) 2024, Kylin Software */
/* vmlinux.h, bpf_helpers.h and other 'define' */
#include "bpf_tracing_net.h"
selftests/bpf: Add getsockopt to inspect mptcp subflow This patch adds a "cgroup/getsockopt" way to inspect the subflows of an MPTCP socket, and verify the modifications done by the same BPF program in the previous commit: a different mark per subflow, and a different TCP CC set on the second one. This new hook will be used by the next commit to verify the socket options set on each subflow. This extra "cgroup/getsockopt" prog walks the msk->conn_list and use bpf_core_cast to cast a pointer for readonly. It allows to inspect all the fields of a structure. Note that on the kernel side, the MPTCP socket stores a list of subflows under 'msk->conn_list'. They can be iterated using the generic 'list' helpers. They have been imported here, with a small difference: list_for_each_entry() uses 'can_loop' to limit the number of iterations, and ease its use. Because only data need to be read here, it is enough to use this technique. It is planned to use bpf_iter, when BPF programs will be used to modify data from the different subflows. mptcp_subflow_tcp_sock() and mptcp_for_each_stubflow() helpers have also be imported. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-2-d26029e15cdd@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-09-26 19:30:23 +02:00
#include "mptcp_bpf.h"
selftests/bpf: Add mptcp subflow example Move Nicolas' patch into bpf selftests directory. This example adds a different mark (SO_MARK) on each subflow, and changes the TCP CC only on the first subflow. From the userspace, an application can do a setsockopt() on an MPTCP socket, and typically the same value will be propagated to all subflows (paths). If someone wants to have different values per subflow, the recommended way is to use BPF. So it is good to add such example here, and make sure there is no regressions. This example shows how it is possible to: Identify the parent msk of an MPTCP subflow. Put different sockopt for each subflow of a same MPTCP connection. Here especially, two different behaviours are implemented: A socket mark (SOL_SOCKET SO_MARK) is put on each subflow of a same MPTCP connection. The order of creation of the current subflow defines its mark. The TCP CC algorithm of the very first subflow of an MPTCP connection is set to "reno". This is just to show it is possible to identify an MPTCP connection, and set socket options, from different SOL levels, per subflow. "reno" has been picked because it is built-in and usually not set as default one. It is easy to verify with 'ss' that these modifications have been applied correctly. That's what the next patch is going to do. Nicolas' code comes from: commit 4d120186e4d6 ("bpf:examples: update mptcp_set_mark_kern.c") from the MPTCP repo https://github.com/multipath-tcp/mptcp_net-next (the "scripts" branch), and it has been adapted by Geliang. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76 Co-developed-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-1-d26029e15cdd@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-09-26 19:30:22 +02:00
char _license[] SEC("license") = "GPL";
char cc[TCP_CA_NAME_MAX] = "reno";
selftests/bpf: Add getsockopt to inspect mptcp subflow This patch adds a "cgroup/getsockopt" way to inspect the subflows of an MPTCP socket, and verify the modifications done by the same BPF program in the previous commit: a different mark per subflow, and a different TCP CC set on the second one. This new hook will be used by the next commit to verify the socket options set on each subflow. This extra "cgroup/getsockopt" prog walks the msk->conn_list and use bpf_core_cast to cast a pointer for readonly. It allows to inspect all the fields of a structure. Note that on the kernel side, the MPTCP socket stores a list of subflows under 'msk->conn_list'. They can be iterated using the generic 'list' helpers. They have been imported here, with a small difference: list_for_each_entry() uses 'can_loop' to limit the number of iterations, and ease its use. Because only data need to be read here, it is enough to use this technique. It is planned to use bpf_iter, when BPF programs will be used to modify data from the different subflows. mptcp_subflow_tcp_sock() and mptcp_for_each_stubflow() helpers have also be imported. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-2-d26029e15cdd@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-09-26 19:30:23 +02:00
int pid;
selftests/bpf: Add mptcp subflow example Move Nicolas' patch into bpf selftests directory. This example adds a different mark (SO_MARK) on each subflow, and changes the TCP CC only on the first subflow. From the userspace, an application can do a setsockopt() on an MPTCP socket, and typically the same value will be propagated to all subflows (paths). If someone wants to have different values per subflow, the recommended way is to use BPF. So it is good to add such example here, and make sure there is no regressions. This example shows how it is possible to: Identify the parent msk of an MPTCP subflow. Put different sockopt for each subflow of a same MPTCP connection. Here especially, two different behaviours are implemented: A socket mark (SOL_SOCKET SO_MARK) is put on each subflow of a same MPTCP connection. The order of creation of the current subflow defines its mark. The TCP CC algorithm of the very first subflow of an MPTCP connection is set to "reno". This is just to show it is possible to identify an MPTCP connection, and set socket options, from different SOL levels, per subflow. "reno" has been picked because it is built-in and usually not set as default one. It is easy to verify with 'ss' that these modifications have been applied correctly. That's what the next patch is going to do. Nicolas' code comes from: commit 4d120186e4d6 ("bpf:examples: update mptcp_set_mark_kern.c") from the MPTCP repo https://github.com/multipath-tcp/mptcp_net-next (the "scripts" branch), and it has been adapted by Geliang. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76 Co-developed-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-1-d26029e15cdd@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-09-26 19:30:22 +02:00
/* Associate a subflow counter to each token */
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
__uint(max_entries, 100);
} mptcp_sf SEC(".maps");
SEC("sockops")
int mptcp_subflow(struct bpf_sock_ops *skops)
{
__u32 init = 1, key, mark, *cnt;
struct mptcp_sock *msk;
struct bpf_sock *sk;
int err;
if (skops->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
return 1;
sk = skops->sk;
if (!sk)
return 1;
msk = bpf_skc_to_mptcp_sock(sk);
if (!msk)
return 1;
key = msk->token;
cnt = bpf_map_lookup_elem(&mptcp_sf, &key);
if (cnt) {
/* A new subflow is added to an existing MPTCP connection */
__sync_fetch_and_add(cnt, 1);
mark = *cnt;
} else {
/* A new MPTCP connection is just initiated and this is its primary subflow */
bpf_map_update_elem(&mptcp_sf, &key, &init, BPF_ANY);
mark = init;
}
/* Set the mark of the subflow's socket based on appearance order */
err = bpf_setsockopt(skops, SOL_SOCKET, SO_MARK, &mark, sizeof(mark));
if (err < 0)
return 1;
if (mark == 2)
err = bpf_setsockopt(skops, SOL_TCP, TCP_CONGESTION, cc, TCP_CA_NAME_MAX);
return 1;
}
selftests/bpf: Add getsockopt to inspect mptcp subflow This patch adds a "cgroup/getsockopt" way to inspect the subflows of an MPTCP socket, and verify the modifications done by the same BPF program in the previous commit: a different mark per subflow, and a different TCP CC set on the second one. This new hook will be used by the next commit to verify the socket options set on each subflow. This extra "cgroup/getsockopt" prog walks the msk->conn_list and use bpf_core_cast to cast a pointer for readonly. It allows to inspect all the fields of a structure. Note that on the kernel side, the MPTCP socket stores a list of subflows under 'msk->conn_list'. They can be iterated using the generic 'list' helpers. They have been imported here, with a small difference: list_for_each_entry() uses 'can_loop' to limit the number of iterations, and ease its use. Because only data need to be read here, it is enough to use this technique. It is planned to use bpf_iter, when BPF programs will be used to modify data from the different subflows. mptcp_subflow_tcp_sock() and mptcp_for_each_stubflow() helpers have also be imported. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-2-d26029e15cdd@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-09-26 19:30:23 +02:00
static int _check_getsockopt_subflow_mark(struct mptcp_sock *msk, struct bpf_sockopt *ctx)
{
struct mptcp_subflow_context *subflow;
int i = 0;
mptcp_for_each_subflow(msk, subflow) {
struct sock *ssk;
ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow,
struct mptcp_subflow_context));
if (ssk->sk_mark != ++i) {
ctx->retval = -2;
break;
}
}
return 1;
}
static int _check_getsockopt_subflow_cc(struct mptcp_sock *msk, struct bpf_sockopt *ctx)
{
struct mptcp_subflow_context *subflow;
mptcp_for_each_subflow(msk, subflow) {
struct inet_connection_sock *icsk;
struct sock *ssk;
ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow,
struct mptcp_subflow_context));
icsk = bpf_core_cast(ssk, struct inet_connection_sock);
if (ssk->sk_mark == 2 &&
__builtin_memcmp(icsk->icsk_ca_ops->name, cc, TCP_CA_NAME_MAX)) {
ctx->retval = -2;
break;
}
}
return 1;
}
SEC("cgroup/getsockopt")
int _getsockopt_subflow(struct bpf_sockopt *ctx)
{
struct bpf_sock *sk = ctx->sk;
struct mptcp_sock *msk;
if (bpf_get_current_pid_tgid() >> 32 != pid)
return 1;
if (!sk || sk->protocol != IPPROTO_MPTCP ||
(!(ctx->level == SOL_SOCKET && ctx->optname == SO_MARK) &&
!(ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION)))
return 1;
msk = bpf_core_cast(sk, struct mptcp_sock);
if (msk->pm.subflows != 1) {
ctx->retval = -1;
return 1;
}
if (ctx->optname == SO_MARK)
return _check_getsockopt_subflow_mark(msk, ctx);
return _check_getsockopt_subflow_cc(msk, ctx);
}