Class Specific TF-IDF Boosting for Short-text Classification: Application to Short-texts Generated during Disasters
Published in Association for Computing Machinery, Inc
2018
Pages: 1629 - 1637
Abstract
Proper feature formulation plays an important role in short-text classification because very little text is available per document. In the literature, Term Frequency-Inverse Document Frequency (TF-IDF) is commonly used to create feature vectors for such tasks. However, the TF-IDF formulation does not use the class information available in supervised learning. For classification problems, if terms that strongly distinguish among classes can be identified, those terms can be given more weight during the feature construction phase. Incorporating this extra class-label information may improve classifier performance. We propose a supervised feature construction method for classifying tweets posted during different disaster scenarios according to the actionable information they might contain. Improved classifier performance on such tasks can be helpful in rescue and relief operations. We used three benchmark datasets containing tweets posted during the Nepal and Italy earthquakes of 2015 and 2016, respectively. Experimental results show that the proposed method obtains better classification performance on these benchmark datasets. © 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
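The general idea of weighting class-discriminative terms more heavily can be illustrated with a minimal sketch. The abstract does not state the paper's actual boosting formula, so the boost used here (1 plus the largest gap between a term's document share inside one class and outside it) is purely an assumption for illustration, as are the function names `tfidf_vectors`, `class_boost`, and `boosted_tfidf` and the toy tweets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Plain TF-IDF with tf = raw count and idf = log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def class_boost(docs, labels):
    """Hypothetical boost factor per term (NOT the paper's formula):
    1 + max over classes of (share of in-class docs containing the term
    minus share of out-of-class docs containing the term)."""
    vocab = {term for doc in docs for term in doc}
    boost = {}
    for term in vocab:
        best = 0.0
        for cls in set(labels):
            inside = [term in d for d, y in zip(docs, labels) if y == cls]
            outside = [term in d for d, y in zip(docs, labels) if y != cls]
            p_in = sum(inside) / len(inside)
            p_out = sum(outside) / len(outside) if outside else 0.0
            best = max(best, p_in - p_out)
        boost[term] = 1.0 + best
    return boost


def boosted_tfidf(docs, labels):
    """Scale each TF-IDF weight by the term's class-based boost."""
    boost = class_boost(docs, labels)
    return [{t: w * boost[t] for t, w in vec.items()}
            for vec in tfidf_vectors(docs)]


# Toy, made-up tokenized tweets; labels are illustrative only.
docs = [
    ["need", "water", "nepal"],
    ["need", "food", "nepal"],
    ["pray", "for", "nepal"],
    ["thoughts", "with", "nepal"],
]
labels = ["actionable", "actionable", "other", "other"]

features = boosted_tfidf(docs, labels)
```

In this toy data, "need" appears in every actionable tweet and in no other tweet, so it receives the largest boost, while "nepal" appears in every tweet and is left unboosted. The boost is computed from training labels only; at prediction time the same per-term factors would be applied to unlabeled text.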
About the journal
Journal: The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018
Publisher: Association for Computing Machinery, Inc