Header menu link for other important links
X

Variance Reduction in Frequency Estimators via Control Variates Method

, Kulkarni R.
Published in ML Research Press
2021
Volume: 161
   
Pages: 183 - 193
Abstract
Generating succinct summaries (also known as sketches) of massive data streams is becoming increasingly important. Such a task typically requires fast, accurate, and small space algorithms in order to support the downstream applications, mainly in areas such as data analysis, machine learning and data mining. A fundamental and well-studied problem in this context is that of estimating the frequencies of the items appearing in a data stream. The Count-Min-Sketch Cormode and Muthukrishnan [2005] and Count-Sketch Charikar et al. [2004] are two known classical algorithms for this purpose. However, a limitation of these techniques is that the variance of their estimate tends to be large. In this work, we address this problem and suggest a technique that reduces the variance in their respective estimates, at the cost of little computational overhead. Our technique relies on the classical Control-Variate trick Lavenberg and Welch [1981] used for reducing variance in Monte-Carlo simulation. We present a theoretical analysis of our proposal by carefully choosing the control variates and complement them with experiments on synthetic as well as real-world datasets. © 2021 Proceedings of Machine Learning Research. All Rights Reserved.
About the journal
JournalProceedings of Machine Learning Research
PublisherML Research Press
ISSN26403498
Open AccessNo