linux/net/ipv4
Wei Wang cf1ef3f071 net/tcp_fastopen: Disable active side TFO in certain scenarios
Middlebox firewall issues can potentially cause server's data being
blackholed after a successful 3WHS using TFO. Following are the related
reports from Apple:
https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf
Slide 31 identifies an issue where the client ACK to the server's data
sent during a TFO'd handshake is dropped.
C ---> syn-data ---> S
C <--- syn/ack ----- S
C (accept & write)
C <---- data ------- S
C ----- ACK -> X     S
		[retry and timeout]

https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf
Slide 5 shows a similar situation that the server's data gets dropped
after 3WHS.
C ---- syn-data ---> S
C <--- syn/ack ----- S
C ---- ack --------> S
S (accept & write)
C?  X <- data ------ S
		[retry and timeout]

This is the worst failure b/c the client can not detect such behavior to
mitigate the situation (such as disabling TFO). Failing to proceed, the
application (e.g., SSL library) may simply timeout and retry with TFO
again, and the process repeats indefinitely.

The proposed solution is to disable active TFO globally under the
following circumstances:
1. client side TFO socket detects out of order FIN
2. client side TFO socket receives out of order RST

We disable active side TFO globally for 1hr at first. Then if it
happens again, we disable it for 2h, then 4h, 8h, ...
And we reset the timeout to 1hr if a client side TFO sockets not opened
on loopback has successfully received data segs from server.
And we examine this condition during close().

The rational behind it is that when such firewall issue happens,
application running on the client should eventually close the socket as
it is not able to get the data it is expecting. Or application running
on the server should close the socket as it is not able to receive any
response from client.
In both cases, out of order FIN or RST will get received on the client
given that the firewall will not block them as no data are in those
frames.
And we want to disable active TFO globally as it helps if the middle box
is very close to the client and most of the connections are likely to
fail.

Also, add a debug sysctl:
  tcp_fastopen_blackhole_detect_timeout_sec:
    the initial timeout to use when firewall blackhole issue happens.
    This can be set and read.
    When setting it to 0, it means to disable the active disable logic.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24 14:27:17 -04:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-15 21:16:30 -04:00
af_inet.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
ah4.c IPsec: do not ignore crypto err in ah4 input 2017-01-16 12:57:48 +01:00
arp.c neighbour: fix nlmsg_pid in notifications 2017-03-22 10:48:49 -07:00
cipso_ipv4.c netlabel: out of bound access in cipso_v4_validate() 2017-02-04 19:44:22 -05:00
datagram.c
devinet.c net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
esp4.c esp: Use a synchronous crypto algorithm on offloading. 2017-04-14 10:07:19 +02:00
esp4_offload.c esp4/6: Fix GSO path for non-GSO SW-crypto packets 2017-04-19 07:48:57 +02:00
fib_frontend.c net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
fib_lookup.h
fib_notifier.c ipv4: fib: Remove redundant argument 2017-03-10 09:45:09 -08:00
fib_rules.c ipv4: fib_rules: Dump FIB rules when registering FIB notifier 2017-03-16 10:18:34 -07:00
fib_semantics.c net: ipv4: add support for ECMP hash policy choice 2017-03-21 15:27:19 -07:00
fib_trie.c ipv4: fib: Remove redundant argument 2017-03-10 09:45:09 -08:00
fou.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
gre_demux.c
gre_offload.c net: add recursion limit to GRO 2016-10-20 14:32:22 -04:00
icmp.c net: ipv4: add support for ECMP hash policy choice 2017-03-21 15:27:19 -07:00
igmp.c igmp, mld: Fix memory leak in igmpv3/mld_del_delrec() 2017-02-09 16:43:45 -05:00
inet_connection_sock.c net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
inet_diag.c tcp: remove early retransmit 2017-01-13 22:37:16 -05:00
inet_fragment.c
inet_hashtables.c inet: kill smallest_size and smallest_port 2017-01-18 13:04:29 -05:00
inet_timewait_sock.c ipv4: Namespaceify tcp_tw_recycle and tcp_max_tw_buckets knob 2016-12-29 11:38:31 -05:00
inetpeer.c
ip_forward.c ipv4: allow local fragmentation in ip_finish_output_gso() 2016-11-03 16:10:26 -04:00
ip_fragment.c inet: frag: release spinlock before calling icmp_send() 2017-03-22 15:40:45 -07:00
ip_gre.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ip_input.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
ip_options.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
ip_output.c udp: avoid ufo handling on IP payload compression packets 2017-03-09 18:28:42 -08:00
ip_sockglue.c net-timestamp: avoid use-after-free in ip_recv_error 2017-04-17 12:59:22 -04:00
ip_tunnel.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ip_tunnel_core.c netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
ip_vti.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ipcomp.c
ipconfig.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-06 08:24:51 -07:00
ipip.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-20 10:35:33 -04:00
Kconfig Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2017-02-16 21:25:49 -05:00
Makefile ipv4: fib: Move FIB notification code to a separate file 2017-03-10 09:45:09 -08:00
netfilter.c netfilter: use skb_to_full_sk in ip_route_me_harder 2017-02-28 12:49:36 +01:00
ping.c ping: implement proper locking 2017-03-24 20:50:28 -07:00
proc.c tcp: remove tcp_tw_recycle 2017-03-16 20:33:56 -07:00
protocol.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
raw.c ipv4: fix a deadlock in ip_ra_control 2017-04-17 12:46:50 -04:00
raw_diag.c net: ip, raw_diag -- Use jump for exiting from nested loop 2016-11-03 15:25:26 -04:00
route.c net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
syncookies.c syncookies: use SipHash in place of SHA1 2017-01-09 13:58:57 -05:00
sysctl_net_ipv4.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp_bbr.c tcp_bbr: add a state transition diagram and accompanying comment 2016-10-29 17:12:43 -04:00
tcp_bic.c
tcp_cdg.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
tcp_cong.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-22 13:27:16 -05:00
tcp_cubic.c tcp_cubic: fix typo in module param description 2017-04-20 16:16:44 -04:00
tcp_dctcp.c Revert "dctcp: update cwnd on congestion event" 2016-12-06 11:34:24 -05:00
tcp_diag.c net: diag: Fix refcnt leak in error path destroying socket 2016-08-23 23:11:36 -07:00
tcp_fastopen.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp_highspeed.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_htcp.c
tcp_hybla.c tcp: make undo_cwnd mandatory for congestion modules 2016-11-21 13:20:17 -05:00
tcp_illinois.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_input.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp_ipv4.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp_lp.c tcp: make undo_cwnd mandatory for congestion modules 2016-11-21 13:20:17 -05:00
tcp_metrics.c tcp: remove per-destination timestamp cache 2017-03-16 20:33:56 -07:00
tcp_minisocks.c tcp: Record Rx hash and NAPI ID in tcp_child_process 2017-03-24 20:49:30 -07:00
tcp_nv.c
tcp_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
tcp_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-15 21:16:30 -04:00
tcp_probe.c tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" 2017-02-21 13:26:03 -05:00
tcp_rate.c tcp: export data delivery rate 2016-09-21 00:23:00 -04:00
tcp_recovery.c tcp: fix lost retransmit SNMP under-counting 2017-04-05 18:41:27 -07:00
tcp_scalable.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_timer.c tcp: fix various issues for sockets morphing to listen state 2017-03-07 13:58:33 -08:00
tcp_vegas.c tcp: make undo_cwnd mandatory for congestion modules 2016-11-21 13:20:17 -05:00
tcp_vegas.h
tcp_veno.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_westwood.c tcp_westwood: fix tcp_westwood_info() style mistakes 2017-03-16 20:23:28 -07:00
tcp_yeah.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tunnel4.c
udp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-02-07 16:29:30 -05:00
udp_diag.c net: inet: diag: expose the socket mark to privileged processes. 2016-09-08 16:13:09 -07:00
udp_impl.h udplite: call proper backlog handlers 2016-11-24 15:32:14 -05:00
udp_offload.c net: add recursion limit to GRO 2016-10-20 14:32:22 -04:00
udp_tunnel.c
udplite.c udplite: call proper backlog handlers 2016-11-24 15:32:14 -05:00
xfrm4_input.c esp: Add a software GRO codepath 2017-02-15 11:04:11 +01:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c xfrm: Add encapsulation header offsets while SKB is not encrypted 2017-04-14 10:07:39 +02:00
xfrm4_mode_tunnel.c xfrm: Add encapsulation header offsets while SKB is not encrypted 2017-04-14 10:07:39 +02:00
xfrm4_output.c xfrm: Add an IPsec hardware offloading API 2017-04-14 10:06:10 +02:00
xfrm4_policy.c xfrm: policy: make policy backend const 2017-02-09 10:22:19 +01:00
xfrm4_protocol.c xfrm: input: constify xfrm_input_afinfo 2017-02-09 10:22:17 +01:00
xfrm4_state.c xfrm: remove unused function 2017-01-10 10:57:12 +01:00
xfrm4_tunnel.c