linux/net
Pau Espin Pedrol e00431bc93 tcp: accept RST if SEQ matches right edge of right-most SACK block
RFC 5961 advises to only accept RST packets containing a seq number
matching the next expected seq number instead of the whole receive
window in order to avoid spoofing attacks.

However, this situation is not optimal in the case SACK is in use at the
time the RST is sent. I recently run into a scenario in which packet
losses were high while uploading data to a server, and userspace was
willing to frequently terminate connections by sending a RST. In
this case, the ACK sent on the receiver side (rcv_nxt) is frozen waiting
for a lost packet retransmission and SACK blocks are used to let the
client continue uploading data. At some point later on, the client sends
the RST (snd_nxt), which matches the next expected seq number of the
right-most SACK block on the receiver side which is going forward
receiving data.

In this scenario, as RFC 5961 defines, the RST SEQ doesn't match the
frozen main ACK at receiver side and thus gets dropped and a challenge
ACK is sent, which gets usually lost due to network conditions. The main
consequence is that the connection stays alive for a while even if it
made sense to accept the RST. This can get really bad if lots of
connections like this one are created in few seconds, allocating all the
resources of the server easily.

For security reasons, not all SACK blocks are checked (there could be a
big amount of SACK blocks => acceptable SEQ numbers). Furthermore, it
wouldn't make sense to check for RST in blocks other than the right-most
received one because the sender is not expected to be sending new data
after the RST. For simplicity, only up to the 4 most recently updated
SACK blocks (selective_acks[4] field) are compared to find the
right-most block, as usually those are the ones with bigger probability
to contain it.

This patch was tested in a 3.18 kernel and probed to improve the
situation in the scenario described above.

Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 00:36:18 -07:00
..
6lowpan 6lowpan: move mac802154 header 2016-04-13 10:41:10 +02:00
9p remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
802
8021q vlan: Propagate MAC address to VLANs 2016-05-31 11:56:48 -07:00
appletalk
atm net/atm: sk_err_soft must be positive 2016-05-23 13:51:10 -07:00
ax25
batman-adv batman-adv: initialize ELP orig address on secondary interfaces 2016-05-18 11:49:44 +08:00
bluetooth net_sched: transform qdisc running bit into a seqcount 2016-06-07 16:37:13 -07:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
caif
can
ceph libceph: make ceph_osdc_wait_request() uninterruptible 2016-05-26 01:15:40 +02:00
core net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
dcb
dccp dccp: do not assume DCCP code is non preemptible 2016-05-02 17:02:25 -04:00
decnet decnet: Do not build routes to devices without decnet private data. 2016-04-10 23:01:30 -04:00
dns_resolver KEYS: Add a facility to restrict new links into a keyring 2016-04-11 22:37:37 +01:00
dsa net: dsa: Add new binding implementation 2016-06-04 14:29:55 -07:00
ethernet
hsr net/hsr: Use setup_timer and mod_timer. 2016-05-16 14:00:43 -04:00
ieee802154 net_sched: transform qdisc running bit into a seqcount 2016-06-07 16:37:13 -07:00
ipv4 tcp: accept RST if SEQ matches right edge of right-most SACK block 2016-06-08 00:36:18 -07:00
ipv6 net: vrf: ipv6 support for local traffic to local addresses 2016-06-08 00:25:38 -07:00
ipx
irda TTY and Serial driver update for 4.7-rc1 2016-05-20 20:57:27 -07:00
iucv
kcm kcm: fix a signedness in kcm_splice_read() 2016-05-19 11:26:51 -07:00
key
l2tp net_sched: transform qdisc running bit into a seqcount 2016-06-07 16:37:13 -07:00
l3mdev net: l3mdev: Allow send on enslaved interface 2016-05-09 22:33:52 -04:00
lapb net/lapb: tuse %*ph to dump buffers 2016-05-29 22:33:25 -07:00
llc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
mac80211 Some more work for 4.7, notably: 2016-05-12 11:46:58 -04:00
mac802154
mpls skbuff: introduce skb_gso_validate_mtu 2016-06-03 19:37:21 -04:00
netfilter net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
netlabel
netlink netlink: Fix dump skb leak/double free 2016-05-16 22:05:15 -04:00
netrom
nfc nfc: nci: Add nci_nfcc_loopback to the nci core 2016-05-04 01:48:16 +02:00
openvswitch ovs: set name assign type of internal port 2016-06-02 18:05:47 -04:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-04-23 18:51:33 -04:00
phonet
qrtr Merge tag 'qcom-soc-for-4.7-2' into net-next 2016-05-17 14:11:19 -04:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-20 20:01:26 -07:00
rfkill
rose
rxrpc rxrpc: Use pr_<level> and pr_fmt, reduce object size a few KB 2016-06-03 19:41:31 -04:00
sched net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
sctp sctp: Fix warning in sctp_packet_transmit_chunk() 2016-06-03 22:53:26 -07:00
sunrpc NFS client updates for Linux 4.7 2016-05-26 10:33:33 -07:00
switchdev switchdev: pass pointer to fib_info instead of copy 2016-05-17 13:58:49 -04:00
tipc tipc: fix potential null pointer dereferences in some compat functions 2016-05-25 12:33:52 -07:00
unix
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
wimax
wireless mm/page_ref: use page_ref helper instead of direct modification of _count 2016-05-19 19:12:14 -07:00
x25 net: fix a kernel infoleak in x25 module 2016-05-09 22:45:33 -04:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
compat.c
Kconfig bpf: add generic constant blinding for use in jits 2016-05-16 13:49:32 -04:00
Makefile net: Add Qualcomm IPC router 2016-05-08 23:46:14 -04:00
socket.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
sysctl_net.c