Term Specific TF-IDF Boosting for Detection of Rumours in Social Networks

U. Bhattacharjee; Srijith P. K.; Maunendra Sankar Desarkar

doi:10.1109/COMSNETS.2019.8711427

The spread of rumours about a social event interferes with the propagation of true information about that event. Separating rumours from informative posts is important because posts that contain rumours try to convey information that sounds similar to what actually happened. Rumour detection in social media can be treated as a text classification problem, but it involves several challenges. Rumours and non-rumours may contain similar textual items (words, sentences, etc.) about an actual event; what distinguishes a rumour from a non-rumour post is the context and the way those items are presented. In our work, we show that in many such cases standard methods of feature construction do not capture these patterns. We therefore propose a novel approach to feature construction that reweights the TF-IDF scores of particular terms, taking into account the label information of the training data. This yields a more separable set of features than the standard TF-IDF representation. A classifier with TF-IDF boosted features performs better than state-of-the-art machine learning algorithms, such as LightGBM, Gradient Boosting, and SVM, trained on standard TF-IDF scores. This is experimentally validated on three different social events that gave rise to rumours. We also show that our model gives performance comparable to a deep learning model such as an LSTM with GloVe word embeddings, with much less training time, making it convenient for detecting rumours at an early stage. © 2019 IEEE.
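The abstract does not give the reweighting formula, so the following is only an illustrative sketch of label-aware TF-IDF boosting, not the authors' method. It assumes binary labels (1 = rumour, 0 = non-rumour), at least one training post of each class, and a hypothetical rule: terms whose document frequency differs strongly between the two classes have their TF-IDF score multiplied by a constant. The function names, the `threshold` and `factor` parameters, and the toy posts are all made up for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Plain TF-IDF: raw term count times log(N / document frequency)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def class_skew(docs, labels):
    """Per-term |P(term in post | rumour) - P(term in post | non-rumour)|.

    Hypothetical selection criterion; assumes both classes are non-empty.
    """
    pos = [set(d) for d, y in zip(docs, labels) if y == 1]
    neg = [set(d) for d, y in zip(docs, labels) if y == 0]
    vocab = set().union(*pos, *neg)
    return {
        t: abs(sum(t in d for d in pos) / len(pos)
               - sum(t in d for d in neg) / len(neg))
        for t in vocab
    }


def boost(vectors, skew, threshold=0.75, factor=2.0):
    """Multiply the TF-IDF score of strongly class-skewed terms by `factor`."""
    return [
        {t: w * (factor if skew.get(t, 0.0) >= threshold else 1.0)
         for t, w in vec.items()}
        for vec in vectors
    ]


# Toy training set of tokenised posts (invented for illustration).
docs = [
    ["bridge", "collapsed", "unconfirmed"],
    ["bridge", "collapsed", "officials"],
    ["bridge", "unconfirmed", "claim"],
    ["bridge", "officials", "statement"],
]
labels = [1, 0, 1, 0]

plain = tfidf(docs)
skew = class_skew(docs, labels)
boosted = boost(plain, skew)
```

In this toy data, "unconfirmed" occurs only in rumour posts and "officials" only in non-rumour posts, so those two terms are boosted, while "collapsed" (equally frequent in both classes) keeps its standard score. The boosted vectors could then be fed to any classifier in place of the standard TF-IDF features.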

Journal	2019 11th International Conference on Communication Systems and Networks, COMSNETS 2019
Publisher	Institute of Electrical and Electronics Engineers Inc.