Projection-SVM: Distributed Kernel Support Vector Machine for Big Data using Subspace Partitioning
D. Singh,
Published in Institute of Electrical and Electronics Engineers Inc.
2019
Pages: 74 - 83
Abstract
Training a kernel support vector machine (SVM) is computationally expensive for large datasets where the number of samples runs into the millions. This is because the kernel matrix, which is in general not sparse, is both computationally expensive and memory intensive. Existing methods rarely achieve linear scaling and suffer from high approximation loss. We propose Projection-SVM, a distributed implementation of the kernel support vector machine for large datasets using subspace partitioning. In subspace partitioning, a decision tree is constructed on the projection of the data along the direction of maximum variance (i.e., the dominant eigenvector) to obtain smaller partitions (i.e., subspaces) of the dataset. A kernel SVM is then trained independently on each of these partitions over a cluster, thereby reducing the overall training time. Partitioning also reduces the prediction time significantly. We demonstrate the efficacy of the proposed approach on eight standard large datasets from various application domains, including mnist8m, kddcup99, and webspam, where Projection-SVM is on average 150 times faster than sequential SVM while maintaining classification accuracy. The experimental results also show the superiority of Projection-SVM over state-of-the-art approaches for distributed kernel SVMs, such as DCSVM, CASVM, and DTSVM. © 2018 IEEE.
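The subspace partitioning step described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes power iteration to approximate the dominant eigenvector, and it replaces the decision tree with a simple recursive median split on the one-dimensional projection. All function names and the `max_size` parameter are hypothetical. Training a kernel SVM on each resulting partition is left to any standard solver and is not shown.

```python
import bisect
import math
import random


def dominant_direction(X, iters=200, seed=0):
    """Approximate the dominant eigenvector of the covariance of X (power iteration)."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in X]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        # Compute C v as X_c^T (X_c v) / n without forming the covariance matrix.
        proj = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        w = [sum(centered[i][j] * proj[i] for i in range(n)) / n for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate data: keep the current direction
        v = [x / norm for x in w]
    return mean, v


def project(x, mean, v):
    """Scalar projection of one sample onto the direction v."""
    return sum((xj - mj) * vj for xj, mj, vj in zip(x, mean, v))


def split_thresholds(values, max_size):
    """Assumed stand-in for the decision tree: recursive median splits
    until every part holds at most max_size projected points."""
    values = sorted(values)
    if len(values) <= max_size:
        return []
    mid = len(values) // 2
    t = (values[mid - 1] + values[mid]) / 2
    left = [z for z in values if z <= t]
    right = [z for z in values if z > t]
    if not left or not right:
        return []  # ties prevent a useful split
    return split_thresholds(left, max_size) + [t] + split_thresholds(right, max_size)


def route(x, mean, v, thresholds):
    """Index of the partition that a sample falls into."""
    return bisect.bisect_left(thresholds, project(x, mean, v))


def partition(X, y, max_size):
    """Split (X, y) into partitions along the direction of maximum variance."""
    mean, v = dominant_direction(X)
    thresholds = split_thresholds([project(x, mean, v) for x in X], max_size)
    parts = [([], []) for _ in range(len(thresholds) + 1)]
    for xi, yi in zip(X, y):
        k = route(xi, mean, v, thresholds)
        parts[k][0].append(xi)
        parts[k][1].append(yi)
    return mean, v, thresholds, parts
```

In this sketch, each entry of `parts` would be handed to a separate worker for independent kernel SVM training. At prediction time, `route` selects a single partition, so only that partition's model needs to be evaluated, which is one plausible reading of why the abstract reports reduced prediction time.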