We present sketching algorithms for sparse binary datasets that keep the sketched dataset binary while simultaneously preserving multiple similarity measures (Jaccard Similarity, Cosine Similarity, Inner Product, and Hamming Distance) on the same sketch. A major advantage of our algorithms is that they are randomness efficient: they require significantly fewer random bits for sketching, logarithmic in the dimension, whereas competing algorithms require a number of random bits linear in the dimension. Our proposed algorithms are efficient, offer a compact sketch of the dataset, and can be readily deployed in a distributed setting. We present a theoretical analysis of our approach and complement it with extensive experiments on public datasets. For analysis purposes, our algorithms require a natural assumption on the dataset. We empirically verify the assumption and observe that it holds on several real-world datasets. © 2020 R. Pratap, K. Revanuru, A. Ravi & R. Kulkarni.
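As an illustration of the four similarity measures named in the abstract, the following minimal sketch computes them exactly on two sparse binary vectors, each stored as the set of its nonzero indices. It is not the paper's sketching algorithm. The function name `similarities`, the set-based representation, and the handling of empty vectors are assumptions made for this example. For binary vectors, all four measures depend only on the two vector sizes and the size of their intersection.

```python
import math


def similarities(a, b):
    """Exact similarity measures for two binary vectors.

    Each vector is given as an iterable of its nonzero indices
    (a hypothetical sparse representation chosen for this example).
    """
    a, b = set(a), set(b)
    inter = len(a & b)                 # positions where both vectors are 1
    union = len(a) + len(b) - inter    # positions where at least one is 1
    return {
        # Inner product of 0/1 vectors counts the shared nonzero positions.
        "inner_product": inter,
        # Hamming distance counts the positions where the vectors differ.
        "hamming": len(a) + len(b) - 2 * inter,
        # Jaccard similarity; two empty vectors are treated as identical
        # (a convention assumed here, not taken from the source).
        "jaccard": inter / union if union else 1.0,
        # Cosine similarity; defined as 0.0 here if either vector is empty.
        "cosine": inter / math.sqrt(len(a) * len(b)) if a and b else 0.0,
    }


u = {0, 3, 7, 9}
v = {3, 4, 9}
print(similarities(u, v))
```

Because every measure is a function of the same three counts, an estimate of the intersection size together with the vector sizes is enough to estimate all four at once.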

Rameshwar Pratap

Department of Computer Science and Engineering

K. Revanuru

A. Ravi

R. Kulkarni

Indian Institute of Technology Hyderabad (IITH)

Journal	Proceedings of Machine Learning Research
Publisher	ML Research Press
ISSN	2640-3498