Network Slowdown, by Design
This article was originally published on LinkedIn, 21 Feb 2022.
How are you measuring your network's performance?
Real-world networks encounter all kinds of problems: duplicate IP addresses, misconfigured routers, inadequate buffering, fast-path/slow-path transitions, IP fragmentation, and more.
As originally designed, TCP coped poorly when the path between the endpoints became congested, a condition that manifests itself as packet loss and delay. TCP was later extended with algorithms that detect signs of congestion on the path and avoid sending more traffic into it.
When TCP detects certain round-trip delay and loss conditions, the sending side of the connection starts slowing down. That slowdown can be quite severe (and sudden), and TCP stacks are supposed to recover only slowly. This is the TCP congestion-avoidance algorithm (discussed on Wikipedia at):
https://en.wikipedia.org/wiki/TCP_congestion_control
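To make this concrete, here is a minimal sketch (in Python, purely illustrative, and not any particular stack's implementation) of the additive-increase/multiplicative-decrease idea at the heart of classic congestion avoidance: the sending window grows while things look good and is cut sharply when loss is detected.

    # A minimal sketch of additive-increase/multiplicative-decrease (AIMD),
    # the behavior at the heart of classic TCP congestion avoidance.
    # Illustrative only; not any real stack's implementation.

    MSS = 1460  # maximum segment size in bytes (a typical Ethernet value)

    def next_cwnd(cwnd, ssthresh, loss_detected):
        """Return (new_cwnd, new_ssthresh) after one round trip."""
        if loss_detected:
            # Multiplicative decrease: cut the window roughly in half.
            ssthresh = max(cwnd // 2, 2 * MSS)
            return ssthresh, ssthresh
        if cwnd < ssthresh:
            # Slow start: the window roughly doubles each round trip.
            return cwnd * 2, ssthresh
        # Congestion avoidance: additive increase of about one segment per round trip.
        return cwnd + MSS, ssthresh

    # Watch the window collapse when a loss is detected at round trip 6,
    # then recover only one segment per round trip afterwards.
    cwnd, ssthresh = MSS, 64 * 1024
    for rtt in range(1, 11):
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, loss_detected=(rtt == 6))
        print(f"RTT {rtt:2d}: cwnd = {cwnd} bytes")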
The effect can be surprising! One can have a very fast network yet get very little throughput, with very high perceived delays.
For faster networks, especially over paths with significant delay, another element can come into play: TCP window scaling. This can greatly increase the amount of data that TCP has "in flight" at any given time, with potentially significant side effects (discussed somewhat tersely on Wikipedia at):
https://en.wikipedia.org/wiki/TCP_window_scale_option
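A quick back-of-the-envelope calculation shows why scaling matters. The numbers below are made up, but the arithmetic is the standard bandwidth-delay product: the amount of data a sender must keep in flight to fill a fast, long path.

    # Bandwidth-delay product: how much data must be "in flight" to fill the pipe.
    # Hypothetical numbers for illustration.

    link_bps = 10_000_000_000   # a 10 Gb/s link
    rtt_s = 0.050               # 50 ms round-trip time

    bdp_bytes = (link_bps / 8) * rtt_s
    print(f"Bandwidth-delay product: {bdp_bytes / 1e6:.1f} MB")          # 62.5 MB

    # Without the window scale option, the advertised receive window tops out
    # at 64 KiB, so throughput is capped at roughly window / RTT no matter
    # how fast the link is.
    max_unscaled_window = 65_535
    print(f"Cap without scaling: "
          f"{max_unscaled_window * 8 / rtt_s / 1e6:.1f} Mb/s")           # ~10.5 Mb/s

    # The window scale option (RFC 7323) left-shifts the advertised window by
    # up to 14 bits, allowing windows of up to about 1 GiB.
    scale = 10   # a shift large enough that the scaled window covers this BDP
    print(f"Scaled window (shift {scale}): "
          f"{(max_unscaled_window << scale) / 1e6:.1f} MB")              # ~67.1 MB

Keeping tens of megabytes in flight is precisely what can make the side effects significant when loss or excessive buffering shows up.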
The TCP stacks of many popular operating systems have incorporated these algorithms for years. However, both ends of the TCP connection often have to cooperate for the algorithms to work properly.
NOTE: Bufferbloat also causes the TCP connection to begin backing off. Bufferbloat refers to excessive buffering on the path between the TCP endpoints, which inflates queuing delay. It often occurs when a Wi-Fi or other radio interface is on the path (discussed on Wikipedia at):
https://en.wikipedia.org/wiki/Bufferbloat
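The arithmetic behind bufferbloat is simple: the delay an over-filled buffer adds is its occupancy divided by the rate at which it drains. The numbers below are invented, purely to show the scale of the effect.

    # Queuing delay added by an over-sized buffer: delay = buffered bytes / drain rate.
    # Invented numbers: a 3 MB buffer draining onto a 20 Mb/s wireless uplink.

    buffer_bytes = 3_000_000
    drain_bps = 20_000_000

    queuing_delay_s = buffer_bytes * 8 / drain_bps
    print(f"Added delay when the buffer is full: {queuing_delay_s * 1000:.0f} ms")   # 1200 ms

More than a second of added round-trip time is exactly the kind of delay condition that pushes TCP into backing off.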
If you are using a tool such as iperf to measure throughput, the results can be quite deceptive. That's because iperf reports the effective throughput of data at the transport layer, and thus masks what is actually going on underneath. See:
https://www.iwl.com/idocs/does-iperf-tell-white-lies
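To illustrate the gap, the sketch below compares transport-layer goodput (the kind of figure an iperf-style report shows) with an estimate of what actually crossed the wire once retransmissions and per-packet headers are counted. The numbers are invented and the accounting is simplified; this is not iperf's own arithmetic.

    # Transport-layer goodput vs. approximate wire-level throughput.
    # Invented numbers; simplified accounting.

    payload_bytes = 1_000_000_000      # application data delivered successfully
    retransmitted_bytes = 40_000_000   # data that had to be sent more than once
    elapsed_s = 10.0

    # Per-packet overhead for Ethernet + IPv4 + TCP headers (no TCP options),
    # ignoring the Ethernet FCS, preamble, and inter-frame gap.
    mss = 1460
    overhead_per_packet = 14 + 20 + 20
    packets = (payload_bytes + retransmitted_bytes) / mss

    goodput_bps = payload_bytes * 8 / elapsed_s
    wire_bps = (payload_bytes + retransmitted_bytes
                + packets * overhead_per_packet) * 8 / elapsed_s

    print(f"Goodput (what the tool reports): {goodput_bps / 1e6:.0f} Mb/s")   # 800 Mb/s
    print(f"Approximate wire throughput:     {wire_bps / 1e6:.0f} Mb/s")      # ~863 Mb/s

The retransmissions and header overhead hidden in that gap are often the first visible symptom of the congestion behavior described above.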
So installing higher-capacity hardware (going from 1 Gb/s to 100 Gb/s, say) does not guarantee higher throughput; if TCP detects congestion, it will back off and slow down.