Localhost TCP traffic stalled
Posted on April 28th, 2021
After installing a 4G/LTE USB stick on a Windows 7 desktop, TCP communication on localhost has effectively stalled. Interestingly, remote traffic is not affected. Local services can be reached by remote clients and remote services can be accessed from the desktop; only localhost (127.0.0.1) is affected. Ping (ICMP) still works on localhost as before. Capturing the traffic reveals that every localhost service that communicates over TCP has become extremely slow. Each TCP session starts with a zero-window packet in both directions right after the SYN, SYN/ACK handshake. After that, only 1 byte of payload arrives every 20-60 seconds. Most clients give up after 30 seconds and return an error. The problem persists after removing all installed software and turning off the Windows Defender service. It doesn’t matter which port the service is listening on.
So it seems the stick’s software modifies the TCP stack somehow. Examining the global TCP configuration, one setting stands out: Receive Window Auto-Tuning Level is set to Experimental.
Setting it back to normal immediately solves the problem. Whoa!
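For reference, here is a minimal Python sketch of how the check and the fix could be scripted around the `netsh interface tcp` commands. The function names are my own, and the parser assumes the English-language output line `Receive Window Auto-Tuning Level : <value>`; a localized Windows would need a different pattern.

```python
import re
import subprocess
from typing import List, Optional

# Values netsh accepts for autotuninglevel.
AUTOTUNING_LEVELS = ("disabled", "highlyrestricted", "restricted", "normal", "experimental")

# Command that prints the global TCP parameters.
SHOW_GLOBAL = ["netsh", "interface", "tcp", "show", "global"]


def build_set_command(level: str) -> List[str]:
    """Return the netsh argument list that sets the auto-tuning level."""
    if level not in AUTOTUNING_LEVELS:
        raise ValueError(f"unknown auto-tuning level: {level!r}")
    return ["netsh", "interface", "tcp", "set", "global", f"autotuninglevel={level}"]


def parse_autotuning_level(netsh_output: str) -> Optional[str]:
    """Extract the level from 'show global' output, or None if the line is missing.

    Assumption: English output containing a line such as
    'Receive Window Auto-Tuning Level    : normal'.
    """
    match = re.search(r"Auto-Tuning Level\s*:\s*(\S+)", netsh_output, re.IGNORECASE)
    return match.group(1).lower() if match else None


def read_current_level() -> Optional[str]:
    """Run netsh (Windows only) and return the current auto-tuning level."""
    result = subprocess.run(SHOW_GLOBAL, capture_output=True, text=True, check=True)
    return parse_autotuning_level(result.stdout)
```

On the affected machine I would expect `read_current_level()` to report `experimental`. Running `subprocess.run(build_set_command("normal"), check=True)` from an elevated prompt should restore the default.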
To be sure that no other change made by the stick’s software causes this extreme slowdown, I ran a test on a clean, unaffected machine. I set autotuninglevel to experimental, which immediately ruined local TCP communication. Now I’m confident this setting is the culprit. One question remains: why does this setting mess up localhost traffic only?
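A simple way to repeat this kind of before/after comparison is to time a small transfer over 127.0.0.1. The following is only a sketch: the function name, the 64 KiB payload, and the one-byte acknowledgement are my own choices, and the 30-second default timeout simply mirrors the point where most clients gave up.

```python
import socket
import threading
import time


def loopback_transfer_seconds(payload_size: int = 64 * 1024, timeout: float = 30.0) -> float:
    """Send payload_size bytes over a loopback TCP connection and return the elapsed time."""
    payload = b"x" * payload_size
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = bytearray()

    def serve() -> None:
        # Accept one client, read the whole payload, then acknowledge with one byte.
        conn, _ = server.accept()
        with conn:
            conn.settimeout(timeout)
            while len(received) < payload_size:
                chunk = conn.recv(65536)
                if not chunk:
                    break
                received.extend(chunk)
            conn.sendall(b"k")

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()

    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
        client.sendall(payload)
        ack = client.recv(1)  # raises a timeout error if the transfer stalls
    elapsed = time.perf_counter() - start

    worker.join(timeout)
    server.close()
    if ack != b"k" or len(received) != payload_size:
        raise RuntimeError("loopback transfer incomplete")
    return elapsed
```

On a healthy machine the call should return in a small fraction of a second. With the symptoms described above I would expect it to end in a timeout instead.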
According to Microsoft’s documentation, the “experimental” setting is for “extreme scenarios”, whatever that means. It effectively sets the TCP receive window scale factor to 14, which is the maximum possible value. This means the advertised window size is multiplied by 16384 (2^14).
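To put the factor of 16384 into perspective, here is the arithmetic as a small Python sketch. The TCP header carries a 16-bit window field, and the window scale option shifts it left by up to 14 bits. The function name is my own.

```python
MAX_RAW_WINDOW = 0xFFFF  # largest value of the 16-bit window field in the TCP header
MAX_SHIFT = 14           # largest shift count allowed for the window scale option


def scaled_window(raw_window: int, shift: int) -> int:
    """Return the effective receive window in bytes for a raw window and a scale shift."""
    if not 0 <= raw_window <= MAX_RAW_WINDOW:
        raise ValueError("raw window must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return raw_window << shift  # same as raw_window * 2**shift
```

With a shift of 14, `scaled_window(1, 14)` gives 16384 and `scaled_window(0xFFFF, 14)` gives 1,073,725,440 bytes, which is just under 1 GiB.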
Why would anyone want to change auto-tuning to this extreme level? It may help to achieve high throughput on long fat networks (high bandwidth combined with high RTT), e.g. satellite links. However, the USB stick is a mobile communication device with 4G capability. 4G has a similar or slightly worse RTT compared to DOCSIS networks (20-30 ms), which work fine with the default receive window scaling. There is no reason to change the scaling. It was a mistake by the stick’s manufacturer.
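The usual way to reason about this is the bandwidth-delay product (BDP): the receive window only needs to cover the amount of data in flight during one round trip. A rough sketch follows; the 100 Mbit/s bandwidth and the 600 ms satellite RTT are illustrative assumptions of mine, not measurements, and the helper names are hypothetical.

```python
MAX_RAW_WINDOW = 0xFFFF  # 16-bit window field
MAX_SHIFT = 14           # maximum window scale shift


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: bits per second * seconds / 8."""
    return bandwidth_bps * rtt_ms // 8000


def min_window_shift(bdp: int) -> int:
    """Smallest scale shift whose maximum window covers the given BDP (capped at 14)."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_RAW_WINDOW << shift) >= bdp:
            return shift
    return MAX_SHIFT


# Assumed example links: (name, bandwidth in bit/s, RTT in ms).
for name, bps, rtt in [("4G, 25 ms", 100_000_000, 25), ("satellite, 600 ms", 100_000_000, 600)]:
    bdp = bdp_bytes(bps, rtt)
    print(f"{name}: BDP = {bdp} bytes, minimum shift = {min_window_shift(bdp)}")
```

Under these assumptions the 4G link needs a window of about 312,500 bytes (shift 3), and even the satellite link needs only about 7.5 MB (shift 7). Both are far below what a shift of 14 allows.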
I don’t understand why this auto-tuning option affects localhost TCP communication at all. Localhost RTT is always the same regardless of the installed physical links, so applying the setting there seems like a bad design decision in the OS. But that still doesn’t answer the question: why does only localhost suffer from this high scaling factor? Without the source code of Microsoft Windows we cannot be sure. It may be a problem with the implementation of the localhost interface.
Based on my research, the Windows 10 operating system is not affected. You can safely set the auto-tuning level to experimental on both 32-bit and 64-bit Windows 10.