net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 2)

One synchronize_net() call is currently done while holding RTNL.

This is a source of RTNL contention in workloads adding and deleting
many network namespaces per second, because synchronize_rcu()
and synchronize_rcu_expedited() can take 60+ ms in some cases.

For cleanup_net() use, temporarily release RTNL
while calling the last synchronize_net().

This should be safe, because devices are no longer visible
to other threads after unlist_netdevice() call
and setting dev->reg_state to NETREG_UNREGISTERING.
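
The ordering described above can be sketched in userspace. This is only a
model, not kernel code: the pthread mutex stands in for RTNL, every model_*
name and the rtnl_held / in_cleanup_net flags are hypothetical, and the fake
grace-period wait merely records whether the lock was held when it ran.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool rtnl_held;           /* tracked by hand, for the demo only */
static bool in_cleanup_net;      /* models "called from cleanup_net()" */
static bool waited_without_rtnl; /* set by the fake grace-period wait */

static void model_rtnl_lock(void)
{
	pthread_mutex_lock(&rtnl);
	rtnl_held = true;
}

static void model_rtnl_unlock(void)
{
	rtnl_held = false;
	pthread_mutex_unlock(&rtnl);
}

/* Release the lock only on the (modelled) cleanup_net() path. */
static void model_drop_if_cleanup_net(void)
{
	if (in_cleanup_net)
		model_rtnl_unlock();
}

/* Re-take the lock only on the (modelled) cleanup_net() path. */
static void model_acquire_if_cleanup_net(void)
{
	if (in_cleanup_net)
		model_rtnl_lock();
}

/* Stand-in for synchronize_net(): records whether the lock was held. */
static void model_synchronize_net(void)
{
	waited_without_rtnl = !rtnl_held;
}

/*
 * Ordering after this patch: the backlog flush and the grace-period
 * wait both sit inside the unlocked window on the cleanup_net() path.
 * Returns true if the wait happened without the lock held.
 */
static bool model_unregister_many(bool from_cleanup)
{
	in_cleanup_net = from_cleanup;
	model_rtnl_lock();		/* caller enters with the lock held */
	/* devices are assumed already unlisted and marked unregistering */
	model_drop_if_cleanup_net();
	/* the backlog flush would run here */
	model_synchronize_net();
	model_acquire_if_cleanup_net();
	assert(rtnl_held);		/* the rest runs under the lock again */
	model_rtnl_unlock();
	return waited_without_rtnl;
}
```

In this model, model_unregister_many(true) performs the wait unlocked, while
model_unregister_many(false) keeps the lock held throughout, mirroring the
fact that only the cleanup_net() path drops RTNL.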

In any case, the new netdev_lock() / netdev_unlock()
infrastructure that we are adding should allow us
to fix potential issues, with a combination
of a per-device mutex and dev->reg_state awareness.
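
As a rough illustration of what "a per-device mutex and dev->reg_state
awareness" could look like, here is a userspace sketch. It is an assumption
about the general pattern, not the kernel's netdev_lock() implementation:
struct model_netdev, the MODEL_* states and both helper functions are
hypothetical names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified registration states. */
enum model_reg_state { MODEL_REGISTERED, MODEL_UNREGISTERING };

/* Hypothetical device with its own mutex guarding state and config. */
struct model_netdev {
	pthread_mutex_t lock;
	enum model_reg_state reg_state;
	int mtu;
};

/*
 * A configuration path that does not rely on a global lock: it takes
 * the per-device mutex, then refuses to touch a device that is already
 * being unregistered. Returns true if the change was applied.
 */
static bool model_set_mtu(struct model_netdev *dev, int mtu)
{
	bool ok = false;

	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->reg_state == MODEL_REGISTERED) {
		dev->mtu = mtu;
		ok = true;
	}
	pthread_mutex_unlock(&dev->lock);
	return ok;
}

/* The teardown path flips the state under the same per-device mutex. */
static void model_mark_unregistering(struct model_netdev *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	dev->reg_state = MODEL_UNREGISTERING;
	pthread_mutex_unlock(&dev->lock);
}
```

Because both paths serialize on the device's own mutex, a writer either
completes before the state flips or observes the unregistering state and
backs off, without needing the global lock to be held across the wait.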

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Link: https://patch.msgid.link/20250114205531.967841-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet 2025-01-14 20:55:31 +00:00 committed by Jakub Kicinski
parent ae646f1a0b
commit 83419b61d1

@@ -11628,9 +11628,8 @@ void unregister_netdevice_many_notify(struct list_head *head,
 	rtnl_drop_if_cleanup_net();
 	flush_all_backlogs();
-	rtnl_acquire_if_cleanup_net();
-	/* TODO: move this before the prior rtnl_acquire_if_cleanup_net() */
 	synchronize_net();
+	rtnl_acquire_if_cleanup_net();
 	list_for_each_entry(dev, head, unreg_list) {
 		struct sk_buff *skb = NULL;