* * * * *
“Don't Panic!”
While Mark [1] and I were doing a fast recovery of a customer machine [2], we
received a call from John, the paper millionaire of a dotcom company and
former member of a Grateful Dead cover band, to say he couldn't get to his
servers, located in the very same co-location facility we were currently at.
Mark goes over to John's machines. All servers are up, but he can't ping out.
In fact, he can't get past the first hop. Mark then heads over to the core
room, I remain in the co-location room, and we all get on a conference call.
Network seems okay—link light is on at both ends of the connection. No
traffic. Jiggle the cord. Oh! A few packets. Then major lossage again.
Repeat.
John is freaking out because he needs to be on a plane early and it's now
3:30 am or thereabouts. He finally conferences in the main sysadmin for
Atlantic Internet [3] because Mark and I can't figure out what's going on.
Neither could the sysadmin. Everything seems okay. Only there's no traffic.
John, panicking, is yelling at Mark. Mark is yelling back at John not to panic.
Meanwhile we can barely hear the sysadmin over the conference call.
Pandemonium reigns.
I quickly grab the network analyzer they have (way too cool) and hook it to
John's side of the connection. It lights up like a Christmas tree. Low
utilization, high collisions and an even larger rate of errors. I then take
the unit to the Atlantic Internet side. Nothing. Normal traffic from John's
servers.
We then plug the network analyzer into the Cisco Catalyst 5000 which is
serving as the main switch. Actually, it's more like three switched hubs than
a real switch—there are 24 ports grouped into three sections. Each section is
a hub, but switched between sections.
The network analyzer lights up like a Christmas tree.
The consensus seems to be that the Catalyst is hosed. It probably didn't
survive a DoS attack a few days previously and was slowly going bad. So it
was some quick work to rerun a few cables to nearby switches and remove the
Catalyst from service.
Mark and I didn't leave the office until 5 am.
[1] http://www.conman.org/people/myg/
[2] gopher://gopher.conman.org/0Phlog:2000/04/14.1
[3] http://www.aibusiness.net/