We've been getting reports about "bad WiFi" at work. It's often hard or
impossible for users to tell if this really is a WiFi issue, or
something else entirely, so -- after having worked remotely for almost 3
years in a row -- I went to the office again on a regular basis in an
effort to investigate these reports.
The first thing I did was just take my laptop, walk around, and run
things like iperf3 or just join a video chat. I didn't notice any issues
whatsoever, aside from lots of overlap between the access points, so
maybe roaming could be an issue. (It later turned out that roaming is
indeed a problem for most of our clients.)
Now, I have a ThinkPad and it runs Linux. Most of our users have some
MacBook or other. Okay then, let's grab a Mac and test again.
Oops. Ping spikes and bad bandwidth drops. Not *always*, but it was bad
enough for me to conclude: "Yep, something's wrong here."
I then spent a lot of time trying to find the cause. Along the way, I
found a faulty switch, but that didn't solve everything. I also found
other issues which had nothing to do with WiFi at all. I learned a lot
about 802.11 WiFi and whatnot, but I couldn't tell where the ping
spikes and bandwidth drops came from. The fact that my ThinkPad didn't
show the same symptoms kept bugging me.
I was *this* close to buying equipment to do spectrum analysis and other
things, when I finally noticed what was going on.
I almost never ran tests like iperf3 on their own -- I also kept an eye
on which BSSID I was connected to. How do you do that on a Mac? You press
"Option" and click on the WiFi icon in the menu bar. And that was my
mistake.
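Opening this menu also triggers scans for other access points, which, I
presume, means the radio hops across channels and everything. While it
is away from the channel your access point is on, it can't send or
receive your traffic, so of course this kills performance. I was not
aware of this behavior. :( As soon as I stopped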
opening that menu, the ping spikes and bandwidth drops were gone.
(That menu doesn't even show you the results of the scans. You have to
open another submenu to see them.)
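
If you want to see the effect for yourself without staring at any GUI, something like the following might help. It's only a rough sketch: `flag_spikes` is a name I made up, and it assumes ping prints the usual `time=12.3 ms` field, which it does on macOS and Linux as far as I know. It reads ping output and prints only the replies that took longer than a threshold in milliseconds.

```sh
#!/bin/sh
# flag_spikes THRESHOLD_MS  (hypothetical helper, not a standard tool)
# Reads ping output on stdin and prints only replies slower than the threshold.
# Assumes each reply line contains a field like "time=12.3".
flag_spikes() {
    awk -v limit="$1" '
        match($0, /time=[0-9.]+/) {
            # Skip the leading "time=" (5 characters) to get the RTT in ms.
            rtt = substr($0, RSTART + 5, RLENGTH - 5) + 0
            if (rtt > limit + 0) {
                print "spike:", rtt, "ms"
                fflush()
            }
        }
    '
}
```

Run it as `ping 192.0.2.1 | flag_spikes 100` (the address is a placeholder, use your gateway), then open and close the menu a few times and watch what gets printed.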
We later found out that opening the Mac's system settings and visiting
a tab related to networking does the same thing. At least one user had
his system settings open in the background and he forgot about it (why
wouldn't he), which caused issues in video chats.
It's a bit embarrassing that it took me so long to notice. Then again,
these scans don't show up anywhere, especially not in the tcpdumps that I
took (including ones made in "monitor" mode). The only thing I saw was
silence and delays. If you want to check for scans in progress, you can
do something like this (doesn't work on all macOS versions, some just
don't tell you about scans):
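    # Poll the interface state in a tight loop and print a dot
    # every time it reports a scan in progress.
    while sleep 0.0001
    do
        sudo /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport en0 -I |
            grep -qF 'state: scanning' && printf .
    done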
And for checking which AP you're connected to, just look at the BSSID
line in the output of `airport en0 -I`. No more GUI.
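
Since roaming was one of the things I wanted to watch, here is a rough sketch of how that could be done from a terminal. The names `AIRPORT`, `extract_bssid` and `watch_bssid` are my own, and it assumes the output contains a line of the form `BSSID: aa:bb:cc:dd:ee:ff`; if your macOS version formats it differently or hides the value, this won't work as is.

```sh
#!/bin/sh
# Same path as in the scan check; AIRPORT is just this sketch's variable name.
AIRPORT=/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport

# extract_bssid: reads `airport -I` output on stdin and prints the BSSID.
# Assumes a line like "BSSID: aa:bb:cc:dd:ee:ff" (leading spaces are fine).
extract_bssid() {
    awk '$1 == "BSSID:" { print $2 }'
}

# watch_bssid [INTERFACE]: polls once per second and prints a timestamped
# line whenever the BSSID differs from the previous poll.
watch_bssid() {
    iface=${1:-en0}
    last=
    while sleep 1
    do
        # Prefix with sudo here if your system needs it.
        now=$("$AIRPORT" "$iface" -I | extract_bssid)
        if [ "$now" != "$last" ]
        then
            printf '%s %s\n' "$(date +%H:%M:%S)" "${now:-not associated}"
            last=$now
        fi
    done
}
```

Source the file, run `watch_bssid en0` in a second terminal and walk around: every new line is a roam. Polling once per second doesn't trigger scans as far as I can tell, but that is an observation, not a guarantee.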