Hate Speech Detection on Twitter approaching the Indonesian Election Using Machine Learning

Dayang Putri Nur Lyrawati; Elly Matul Imah

Open Conference Systems, MISEIC 2019

Last modified: 2019-07-18

Abstract

Twitter is one of the social media platforms most widely used for sentiment analysis, or opinion mining, and hate speech detection is one application of sentiment analysis. This study detects hate speech in Twitter data by classifying the text of posted tweets. Classification is performed with the Generalized Relevance Learning Vector Quantization (GRLVQ) algorithm, which has the advantage of being able to select the features that are relevant for classification. Two data sets are used: presidential election data annotated by several students, and annotated DKI election data. Pre-processing consists of tokenizing, stemming, cleansing, filtering, and string conversion. With the GRLVQ algorithm, the highest accuracy on the presidential election data was 70% with a time of 0.02 seconds, and the highest accuracy on the DKI election data was 78.87% with a time of 0.035 seconds. With the SVM algorithm, accuracy was 61.67% with a time of 0.96 seconds on the presidential election data and 69.2308% with a time of 1.22 seconds on the DKI election data. The experimental results show that, on these data, hate speech detection with GRLVQ is better than with SVM in both accuracy and time.

Keywords: GRLVQ, Hate speech, Sentiment Analysis, SVM, Twitter.
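
To illustrate the feature-selection property the abstract attributes to GRLVQ, the following is a minimal sketch of one common formulation of the algorithm: one prototype per class, a relevance-weighted squared Euclidean distance, and stochastic gradient updates with constant factors absorbed into the learning rates. It is not the authors' implementation. The function names, learning rates, epoch count, and the two-feature toy data are all assumptions made for illustration; in the study the inputs would be feature vectors derived from pre-processed tweets.

```python
def weighted_sq_dist(x, w, rel):
    """Relevance-weighted squared Euclidean distance."""
    return sum(r * (xi - wi) ** 2 for r, xi, wi in zip(rel, x, w))


def train_grlvq(X, y, epochs=50, lr_proto=0.05, lr_rel=0.01):
    """Sketch of GRLVQ with one prototype per class (hypothetical settings)."""
    classes = sorted(set(y))
    dim = len(X[0])
    # Initialise each prototype at the mean of its class.
    protos = {}
    for c in classes:
        members = [x for x, label in zip(X, y) if label == c]
        protos[c] = [sum(col) / len(members) for col in zip(*members)]
    rel = [1.0 / dim] * dim  # uniform relevance to start
    for _ in range(epochs):
        for x, label in zip(X, y):
            w_pos = protos[label]  # prototype of the correct class
            wrong = min((c for c in classes if c != label),
                        key=lambda c: weighted_sq_dist(x, protos[c], rel))
            w_neg = protos[wrong]  # closest prototype of a wrong class
            d_pos = weighted_sq_dist(x, w_pos, rel)
            d_neg = weighted_sq_dist(x, w_neg, rel)
            denom = (d_pos + d_neg) ** 2
            if denom == 0.0:
                continue  # sample coincides with both prototypes
            g_pos = d_neg / denom
            g_neg = d_pos / denom
            for i in range(dim):
                diff_pos = x[i] - w_pos[i]
                diff_neg = x[i] - w_neg[i]
                # Attract the correct prototype, repel the wrong one.
                w_pos[i] += lr_proto * g_pos * rel[i] * diff_pos
                w_neg[i] -= lr_proto * g_neg * rel[i] * diff_neg
                # Raise relevance of features that separate the classes.
                rel[i] -= lr_rel * (g_pos * diff_pos ** 2 - g_neg * diff_neg ** 2)
            # Keep relevances non-negative and summing to one.
            rel = [max(r, 0.0) for r in rel]
            total = sum(rel)
            rel = [r / total for r in rel] if total > 0 else [1.0 / dim] * dim
    return protos, rel


def predict(x, protos, rel):
    """Assign the class of the nearest prototype under the learned relevances."""
    return min(protos, key=lambda c: weighted_sq_dist(x, protos[c], rel))


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.0], [0.0, 1.0], [0.1, 0.5],
     [1.0, 0.0], [1.0, 1.0], [0.9, 0.5]]
y = [0, 0, 0, 1, 1, 1]

protos, rel = train_grlvq(X, y)
predictions = [predict(x, protos, rel) for x in X]
```

On this toy data the learned relevance of the informative feature ends up larger than that of the noise feature, which is the sense in which GRLVQ can be said to select features. With text data, each relevance value would correspond to one term or other feature of the tweet representation.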