No description
Find a file
Kuniyuki Iwashima b46f4eaa4f af_unix: Clear stale u->oob_skb.
syzkaller started to report deadlock of unix_gc_lock after commit
4090fa373f ("af_unix: Replace garbage collection algorithm."), but
it just uncovers the bug that has been there since commit 314001f0bf
("af_unix: Add OOB support").

The repro basically does the following.

  from socket import *
  from array import array

  c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
  c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB)
  c2.recv(1)  # blocked as no normal data in recv queue

  c2.close()  # done async and unblock recv()
  c1.close()  # done async and trigger GC

A socket sends its file descriptor to itself as OOB data and tries to
receive normal data, but finally recv() fails due to async close().

The problem here is wrong handling of OOB skb in manage_oob().  When
recvmsg() is called without MSG_OOB, manage_oob() is called to check
if the peeked skb is OOB skb.  In such a case, manage_oob() pops it
out of the receive queue but does not clear unix_sock(sk)->oob_skb.
This is wrong in terms of uAPI.

Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB.
The 'o' is handled as OOB data.  When recv() is called twice without
MSG_OOB, the OOB data should be lost.

  >>> from socket import *
  >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0)
  >>> c1.send(b'hello', MSG_OOB)  # 'o' is OOB data
  5
  >>> c1.send(b'world')
  5
  >>> c2.recv(5)  # OOB data is not received
  b'hell'
  >>> c2.recv(5)  # OOB date is skipped
  b'world'
  >>> c2.recv(5, MSG_OOB)  # This should return an error
  b'o'

In the same situation, TCP actually returns -EINVAL for the last
recv().

Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always set
EPOLLPRI even though the data has passed through by previous recv().

To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing
it from recv queue.

The reason why the old GC did not trigger the deadlock is because the
old GC relied on the receive queue to detect the loop.

When it is triggered, the socket with OOB data is marked as GC candidate
because file refcount == inflight count (1).  However, after traversing
all inflight sockets, the socket still has a positive inflight count (1),
thus the socket is excluded from candidates.  Then, the old GC lose the
chance to garbage-collect the socket.

With the old GC, the repro continues to create true garbage that will
never be freed nor detected by kmemleak as it's linked to the global
inflight list.  That's why we couldn't even notice the issue.

Fixes: 314001f0bf ("af_unix: Add OOB support")
Reported-by: syzbot+7f7f201cc2668a8fd169@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7f7f201cc2668a8fd169
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240405221057.2406-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-08 19:58:48 -07:00
arch Including fixes from netfilter, bluetooth and bpf. 2024-04-04 14:49:10 -07:00
block block-6.9-20240329 2024-03-29 09:40:22 -07:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto This push fixes a regression that broke iwd as well as a divide by 2024-03-25 10:48:23 -07:00
Documentation Including fixes from netfilter, bluetooth and bpf. 2024-04-04 14:49:10 -07:00
drivers net: ks8851: Handle softirqs at the end of IRQ thread to fix hang 2024-04-08 19:48:48 -07:00
fs bcachefs repair code for 6.9-rc3 2024-04-04 14:36:32 -07:00
include geneve: fix header validation in geneve[6]_xmit_skb 2024-04-08 11:51:04 +01:00
init init: open /initrd.image with O_LARGEFILE 2024-03-26 11:07:19 -07:00
io_uring io_uring/sqpoll: early exit thread if task_context wasn't allocated 2024-03-18 20:22:42 -06:00
ipc sysctl changes for v6.9-rc1 2024-03-18 14:59:13 -07:00
kernel Including fixes from netfilter, bluetooth and bpf. 2024-04-04 14:49:10 -07:00
lib lib: checksum: hide unused expected_csum_ipv6_magic[] 2024-04-08 11:03:05 +01:00
LICENSES
mm Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
net af_unix: Clear stale u->oob_skb. 2024-04-08 19:58:48 -07:00
rust Kbuild updates for v6.9 2024-03-21 14:41:00 -07:00
samples Tracing updates for 6.9: 2024-03-18 15:11:44 -07:00
scripts Four small documentation fixes. 2024-04-02 12:44:09 -07:00
security security: Place security_path_post_mknod() where the original IMA call was 2024-04-03 10:21:32 -07:00
sound sound fixes for 6.9-rc2 2024-03-28 14:54:49 -07:00
tools Including fixes from netfilter, bluetooth and bpf. 2024-04-04 14:49:10 -07:00
usr Kbuild updates for v6.8 2024-01-18 17:57:07 -08:00
virt KVM Xen and pfncache changes for 6.9: 2024-03-11 10:42:55 -04:00
.clang-format clang-format: Update with v6.7-rc4's for_each macro list 2023-12-08 23:54:38 +01:00
.cocciconfig
.editorconfig Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: create a list of all built DTB files 2024-02-19 18:20:39 +09:00
.mailmap Including fixes from bpf, WiFi and netfilter. 2024-03-28 13:09:37 -07:00
.rustfmt.toml
COPYING
CREDITS Not a ton of stuff happening in the clk framework in this pull request. We got 2024-03-15 11:48:01 -07:00
Kbuild
Kconfig
MAINTAINERS MAINTAINERS: Drop Li Yang as their email address stopped working 2024-04-08 11:44:06 +01:00
Makefile Linux 6.9-rc2 2024-03-31 14:32:39 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.