Grasp the root causes in the data plane: Diagnosing latency problems with SpiderMon
W. Wang, , A. Chen, T.S.E. Ng
Published in Association for Computing Machinery, Inc
2020
Pages: 55 - 61
Abstract
Unexplained performance degradation is one of the most severe problems in data center networks. The increasing scale of the network makes it even harder to maintain good performance for all users with a low-cost solution. Our system SpiderMon monitors network performance and debugs performance failures inside the network with little overhead. SpiderMon provides a two-phase solution that runs in the data plane. In the monitoring phase, it keeps track of the performance of every flow in the network; upon detecting performance problems, it triggers a debugging phase using a causality analyzer to find out the root cause of performance degradation. To implement these two phases, SpiderMon exploits the capabilities of high-speed programmable switches (e.g., per-packet monitoring, stateful memory). We prototype SpiderMon using the BMv2 model of P4, and our preliminary evaluation shows that SpiderMon is able to quickly find the root cause of performance degradation problems with minimal overhead. SpiderMon achieves near-zero overhead during the monitoring phase and efficiently collects relevant data from switches during the debugging phase. © 2020 Association for Computing Machinery.
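The two-phase idea described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SpiderMon's actual design or its P4 implementation: the `Switch` class, the `LATENCY_THRESHOLD_US` value, and the "rank co-located flows by bytes at the slowest hop" heuristic are all hypothetical stand-ins for the paper's per-flow monitoring and causality analyzer.

```python
from dataclasses import dataclass, field

# Hypothetical trigger threshold (microseconds); not a value from the paper.
LATENCY_THRESHOLD_US = 500


@dataclass
class Switch:
    """Toy stand-in for a programmable switch with per-flow stateful memory."""
    name: str
    delay_us: dict = field(default_factory=dict)    # flow -> worst queueing delay seen
    bytes_seen: dict = field(default_factory=dict)  # flow -> bytes forwarded

    def observe(self, flow, queue_delay_us, size_bytes):
        # Monitoring phase: update local state only; nothing is exported.
        self.delay_us[flow] = max(self.delay_us.get(flow, 0), queue_delay_us)
        self.bytes_seen[flow] = self.bytes_seen.get(flow, 0) + size_bytes


def end_to_end_delay(path, flow):
    """Sum the worst per-hop delay recorded for `flow` along its path."""
    return sum(sw.delay_us.get(flow, 0) for sw in path)


def debug(path, victim):
    """Debugging phase (assumed heuristic): pull state only from switches on the
    victim's path, pick the slowest hop, and rank the other flows there by bytes."""
    worst = max(path, key=lambda sw: sw.delay_us.get(victim, 0))
    suspects = sorted(
        (f for f in worst.bytes_seen if f != victim),
        key=lambda f: worst.bytes_seen[f],
        reverse=True,
    )
    return worst.name, suspects


# Example: the victim flow is delayed at s2, where a large burst shares the queue.
s1, s2 = Switch("s1"), Switch("s2")
s1.observe("victim", 20, 1500)
s2.observe("victim", 900, 1500)
s2.observe("burst", 850, 90000)
s2.observe("mouse", 10, 500)

path = [s1, s2]
hop, suspects = None, []
if end_to_end_delay(path, "victim") > LATENCY_THRESHOLD_US:
    hop, suspects = debug(path, "victim")
```

In this toy run, no data leaves the switches until the victim's delay crosses the threshold, which mirrors the abstract's claim of low monitoring overhead with on-demand collection during debugging.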
About the journal
Journal: SOSR 2020 - Proceedings of the 2020 Symposium on SDN Research
Publisher: Association for Computing Machinery, Inc