Open Conference Systems, MISEIC 2018

Classification of Schizophrenia Data Using Support Vector Machine (SVM)
Theresia Veronika Rampisela

Last modified: 2018-07-07

Abstract


Schizophrenia is a severe, chronic mental illness characterized by disturbances in thinking, perception, and behaviour. Because these disturbances may lead people with schizophrenia to attempt or die by suicide, they have a lower life expectancy than the general population. Schizophrenia is difficult to diagnose because no physical test exists for it and its symptoms closely resemble those of several other mental illnesses, such as major depression and bipolar disorder.

This study aims to distinguish people who have schizophrenia from those who do not by classifying schizophrenia data and measuring the accuracy of the classification. The classification uses a machine learning method, the Support Vector Machine (SVM). An SVM constructs a hyperplane that separates the two groups of people such that the distance between the hyperplane and the nearest data points on either side (the margin) is maximised. The hyperplane is learned from the given data through a training process with optimised parameters, and the fitted model is then used to classify new data.
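As a brief illustration of the margin maximisation described above, the standard hard-margin SVM problem for linearly separable data can be written as follows. The abstract does not state which kernel or margin formulation was used, so the notation here is an assumption: feature vectors $\mathbf{x}_i$, class labels $y_i \in \{-1, +1\}$, weight vector $\mathbf{w}$, and offset $b$.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n
\end{aligned}
```

Under these constraints the nearest points lie at distance $1 / \lVert \mathbf{w} \rVert$ from the hyperplane $\mathbf{w}^{\top} \mathbf{x} + b = 0$, so minimising $\lVert \mathbf{w} \rVert$ maximises the margin width $2 / \lVert \mathbf{w} \rVert$. A new observation $\mathbf{x}$ is then assigned to the class given by the sign of $\mathbf{w}^{\top} \mathbf{x} + b$.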

The data used in the preparation of this article were obtained from the Northwestern University Schizophrenia Data and Software Tool (NUSDAST) database. The dataset consists of 393 observations with 64 features, comprising clinician-completed questionnaire data from the Scale for the Assessment of Positive Symptoms (SAPS) and the Scale for the Assessment of Negative Symptoms (SANS), as well as each person's personal details. Simulations were performed in Matlab R2017a with different percentages of training and testing data, and the accuracy and running time of each simulation were measured. Table 1 shows the results of the simulations (please see the extended abstract file).
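The simulation procedure above (vary the training percentage, then record accuracy and running time) can be sketched roughly as follows. This is not the study's Matlab code and it does not use the NUSDAST data. It is a minimal, standard-library Python sketch that assumes a synthetic two-class dataset and a linear SVM fitted with a Pegasos-style stochastic subgradient method. All function names, the regularisation value `lam`, and the number of epochs are hypothetical choices made for illustration.

```python
import random
import time


def make_synthetic_data(n, n_features, seed=0):
    """Generate a toy two-class dataset (an assumption; NOT the NUSDAST data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice([-1, 1])
        # Each feature is centred at -1 or +1 depending on the class, with unit noise.
        x = [rng.gauss(float(label), 1.0) for _ in range(n_features)]
        data.append((x, label))
    return data


def train_linear_svm(train, lam=0.01, epochs=20, seed=0):
    """Fit a linear soft-margin SVM by Pegasos-style stochastic subgradient descent.

    A constant feature 1.0 is appended to every input so that the last weight
    acts as the offset; this regularises the offset too, which is a simplification.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(train[0][0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(train)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x, y = train[i]
            xa = x + [1.0]
            margin = y * sum(wj * xj for wj, xj in zip(w, xa))
            # Shrink the weights (regularisation part of the objective).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move towards classifying this point correctly.
                w = [wj + eta * y * xj for wj, xj in zip(w, xa)]
    return w


def accuracy(w, test):
    """Fraction of test observations whose predicted sign matches the label."""
    correct = 0
    for x, y in test:
        score = sum(wj * xj for wj, xj in zip(w, x + [1.0]))
        predicted = 1 if score >= 0 else -1
        correct += predicted == y
    return correct / len(test)


def run_split_experiment(data, train_fractions, seed=0):
    """For each training fraction, record (fraction, accuracy, elapsed seconds)."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    results = []
    for frac in train_fractions:
        n_train = int(round(frac * len(shuffled)))
        train, test = shuffled[:n_train], shuffled[n_train:]
        start = time.perf_counter()
        w = train_linear_svm(train)
        acc = accuracy(w, test)
        elapsed = time.perf_counter() - start  # covers training and testing
        results.append((frac, acc, elapsed))
    return results


if __name__ == "__main__":
    data = make_synthetic_data(n=400, n_features=10)
    for frac, acc, elapsed in run_split_experiment(data, [0.2, 0.5, 0.9]):
        print(f"train={frac:.0%}  accuracy={acc:.3f}  time={elapsed:.4f}s")
```

The figures printed by this sketch depend on the synthetic data and bear no relation to the results reported in Table 1. The sketch only shows the shape of the experiment: a smaller training fraction means fewer training updates and so typically a shorter running time, at the possible cost of accuracy.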

In conclusion, the schizophrenia data are best classified using 90% of the dataset as training data, which yields an accuracy of up to 92.5%. However, when running time is a concern, the model trained on 20% of the data could also be considered, as it is more than twice as fast and has an accuracy of 90.4%. The SVM method could therefore be used to classify schizophrenia data, as it achieves a high accuracy rate. For further research, it is recommended to try other machine learning methods or to use data from other schizophrenia-related questionnaires.


Keywords


binary classification, machine learning, mental illness, schizophrenia, Support Vector Machine