Quantcast
Channel: Hacker News
Viewing all articles
Browse latest Browse all 25817

The Ethernet PAUSE frame

$
0
0

This is a bizarre one. It all started when the internet seemed to go out at my house. My desktop, phone, TV, everything stopped working. The usual solution at a time like this is to power cycle the modem and router. While this fixed the situation temporarily, soon after the problem returned. What made me think this was more than just ISP flakiness was that for some reason Chrome actually locked up; good ol’ Windows “this program stopped responding” so like any enterprising engineer I busted open Wireshark.

Some odd frames
Some odd frames

After some clever deductive reasoning, a.k.a randomly unplugging cables from the router, I determined that my TV was sending these mystery frames (yes, my TV — I have a Sony X805D Android TV). After power cycling the TV the problem went away but of course I wanted to figure out what was actually happening. You’d be forgiven if the above frames aren’t immediately recognizable — their definition is buried deep in Appendix 31B of the IEEE 802.3 Ethernet standard.

The type of an Ethernet frame is determined by it’s EtherType, which is a two byte identifier that comes after two six byte MAC addresses denoting source and destination. The mystery frame’s EtherType was 0x8808, which is for Ethernet flow control.

The very existence of Ethernet flow control may come as a shock, especially since protocols like TCP have explicit flow control mechanisms, presumably to compensate for Ethernet’s lack of one. However, on page 752 of the Ethernet spec we find a section dedicated to (rudimentary) flow control. The frame structure is fairly bare-bones: a two byte “opcode”, which in this case is 0x0001 for “PAUSE” and a two byte “pause_time”, denoting increments of 512 bit times (here’s a great diagram of the frame).

To test out the behavior of pause frames more thoroughly I wrote a simple libpcap (or WinPcap) program that transmits a PAUSE frame every ten milliseconds.

Sure enough, sending this frame repeatedly killed all traffic on my home network. (You can check out the full code on GitHub). What’s interesting is that this may have arisen from a bug in my home router (TP-Link AC750 Archer C2 running firmware 0.9.1.3.2). According to the Ethernet spec (31B.1)

The globally assigned 48-bit multicast address 01-80-C2-00-00-01 has been reserved for use in MAC Control PAUSE frames for inhibiting transmission of data frames from a DTE in a full duplex mode IEEE 802.3 LAN. IEEE 802.1D-conformant bridges will not forward frames sent to this multicast destination address, regardless of the state of the bridge’s ports, or whether or not the bridge implements the MAC Control sublayer.

It would appear that there is a clause that specifically attempts to deal with this scenario: nodes sending PAUSE message to the special multicast address 01:80:C2:00:00:01 are instructing the switch to not send them any more frames. My switch seems to honor this, but also forwards the frames to the other nodes on the network, in effect telling THEM to pause in sending frames, which would explain the observed behavior.

I did some digging — my router uses a MediaTek MT7620A router SoC which relies on a Realtek RTL8367RB to perform switch duties. Unfortunately I couldn’t find the data sheet for this specific chip, although the source code for the router is GPL, so the driver itself is perusable. A data sheet for the RTL8366 (a 6/9 port version of the chip) says on page 22:

Frames with group MAC address 01-80-c2-00-00-01 (802.3x Pause) and 01-80-c2-00-00-02 (802.3ad LCAP) will always be filtered. MAC address 01-80-c2-00-00-03 will always be forwarded. This function is controlled by pin strapping of EN_MLT_FWD (pin 32) upon power reset. After power on, the configuration may be changed by MLTID_ST[15:0][1:0] in Register MCPCR0-1 (0x000F – 0x0010).

Have we reached the end of the road? The above seems to suggest that the forwarding of PAUSE frames is controlled both by a pin and a register on the Ethernet switch chip. It would appear on my router this specific (standard conformant) feature was accidentally disabled, either by a floating pin or a zeroed out register leading to a “frame of death” that was forwarded by my switch, killing the network. It’s amazing what you find when you dig!

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Viewing all articles
Browse latest Browse all 25817

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>