Hashing algorithms are widely used for large-scale learning and similarity search, and computationally cheaper, more accurate algorithms are proposed every year. In this paper we focus on hashing algorithms that estimate a distance measure d(x_i, x_j) between two vectors x_i and x_j. Such hashing algorithms require the generation of random variables, and we propose two approaches to reduce the variance of the hashed estimates: control variates and maximum likelihood estimates. We explain how these approaches can be applied immediately to a wide subset of hashing algorithms, and we evaluate the impact of these methods on various datasets. Finally, we run empirical simulations to verify our results. © 2021 K. Kang, S. Kushnarev, W.P. Wong, R. Pratap, H. Yeo & Y. Chen.
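To illustrate the control variates idea mentioned in the abstract, the following is a minimal sketch and not necessarily the construction used in the paper. It assumes Gaussian random projections, an inner-product estimate between two unit vectors, and known vector norms; the control variate v^2 + w^2, the function names, and the simulation parameters are all illustrative assumptions. The coefficient c is estimated from the same samples, which is a common practical shortcut that introduces a small bias.

```python
import numpy as np


def plain_estimate(v, w):
    """Plain random-projection estimate of <x, y> from projected vectors v, w."""
    return float(np.mean(v * w))


def cv_estimate(v, w, norm_x_sq, norm_y_sq):
    """Control-variate estimate of <x, y> (illustrative assumption, not the paper's code).

    The control variate is z = v^2 + w^2, whose expectation under Gaussian
    projections is ||x||^2 + ||y||^2, assumed known here.
    """
    prod = v * w
    z = v ** 2 + w ** 2
    mu_z = norm_x_sq + norm_y_sq
    var_z = np.var(z, ddof=1)
    if var_z == 0.0:
        return float(np.mean(prod))
    # Empirical coefficient c = Cov(prod, z) / Var(z), estimated from the samples.
    c = np.cov(prod, z)[0, 1] / var_z
    return float(np.mean(prod) - c * (np.mean(z) - mu_z))


def simulate(trials=500, d=50, k=64, seed=0):
    """Compare mean squared error of both estimators over repeated projections."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)
    y = x + 0.05 * rng.normal(size=d)  # y is strongly correlated with x
    y /= np.linalg.norm(y)
    true_ip = float(x @ y)

    err_plain, err_cv = [], []
    for _ in range(trials):
        R = rng.normal(size=(k, d))  # k Gaussian projection directions
        v, w = R @ x, R @ y
        err_plain.append((plain_estimate(v, w) - true_ip) ** 2)
        err_cv.append((cv_estimate(v, w, 1.0, 1.0) - true_ip) ** 2)
    return true_ip, float(np.mean(err_plain)), float(np.mean(err_cv))


if __name__ == "__main__":
    true_ip, mse_plain, mse_cv = simulate()
    print(f"true inner product: {true_ip:.4f}")
    print(f"MSE plain: {mse_plain:.6f}  MSE with control variate: {mse_cv:.6f}")
```

Under these assumptions the correction is most effective when the two vectors are highly similar, because the product v*w is then strongly correlated with the control variate.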

Rameshwar Pratap

Department of Computer Science and Engineering

K. Kang

S. Kushnarev

W.P. Wong

H. Yeo

Y. Chen

IIT Hyderabad

Proceedings of Machine Learning Research

Improving Hashing Algorithms for Similarity Search via MLE and the Control Variates Trick

Journal	Proceedings of Machine Learning Research
Publisher	ML Research Press
ISSN	2640-3498