Open Conference Systems, MISEIC 2019

Comparison of Fast Fuzzy Clustering Based on Kernel and Support Vector Machines in the Classification of Schizophrenia
Sri Hartini, Zuherman Rustam

Last modified: 2019-07-10

Abstract


Schizophrenia is a chronic and severe mental disorder that affects thinking, feelings, and behavior. It is characterized by unusual thoughts or experiences, disorganized speech or behavior, and decreased participation in social life. Treatment is usually lifelong and often involves a combination of medications, psychotherapy, and coordinated specialty care services. Early detection of schizophrenia is therefore desirable.

 

This research therefore aims to compare fast fuzzy clustering based on kernel with support vector machines on the Northwestern University Schizophrenia dataset, which consists of 171 schizophrenia and 221 non-schizophrenia samples. Each instance is described by 64 numerical and categorical features: gender, dominant hand, ethnicity, race, age, 25 items from the Scale for the Assessment of Negative Symptoms (SANS), and 34 items from the Scale for the Assessment of Positive Symptoms (SAPS).

 

Fast fuzzy clustering, or KC-means clustering, is a modified method that combines k-means and fuzzy c-means. The combination makes fuzzy c-means run faster while keeping performance close to that of fuzzy c-means alone. Fast fuzzy clustering based on kernel means that every distance in the algorithm is replaced by the distance between the two input vectors in the feature space induced by a kernel function. Several kernel functions are available; this paper uses the Gaussian radial basis function kernel.
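The kernel substitution described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the kernel is assumed to be parameterized as K(x, y) = exp(-||x - y||^2 / (2σ^2)), which the abstract does not state. The squared distance in feature space follows from the standard expansion K(x, x) - 2K(x, y) + K(y, y).

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel, assuming the form exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_sq_distance(x, y, sigma):
    """Squared distance between x and y in the kernel-induced feature space.

    Uses ||phi(x) - phi(y)||^2 = K(x, x) - 2 K(x, y) + K(y, y), so the
    feature map phi never has to be computed explicitly.
    """
    return (
        rbf_kernel(x, x, sigma)
        - 2.0 * rbf_kernel(x, y, sigma)
        + rbf_kernel(y, y, sigma)
    )


# Hypothetical two-feature points, for illustration only.
d2 = kernel_sq_distance([0.0, 0.0], [3.0, 4.0], sigma=5.0)
```

Because K(x, x) = 1 for this kernel, the squared distance reduces to 2(1 - K(x, y)) and always lies between 0 and 2, whatever the scale of the input features.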

 

The performance of support vector machines is then examined for comparison in terms of accuracy, sensitivity, precision, specificity, F1-score, and running time. Vapnik et al. originally proposed this method for problems with exactly two classes. It searches for the hyperplane that best separates the data points into the two classes; the data points closest to that hyperplane are the support vectors. Because a simple hyperplane is not always a useful separator for binary classification problems, the same kernel function is also used in the support vector machine, which retains nearly all the simplicity of a separating hyperplane.
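The evaluation measures named above have standard definitions in terms of confusion-matrix counts, sketched below. The function name and the example counts are hypothetical and are not taken from the experiments; the schizophrenia class is assumed to be the positive class.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives. Every denominator is assumed to be nonzero.
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)     # share of positive predictions that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2.0 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts for illustration only.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```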

 

In the experiments, fast fuzzy clustering based on kernel performed best with kernel parameter σ=0.05, giving an average accuracy of 75.85 percent, sensitivity of 87.45 percent, precision of 86.79 percent, specificity of 84.40 percent, and F1-score of 85.70 percent. Support vector machines achieved higher accuracy, sensitivity, precision, specificity, and F1-score, at the cost of a slower running time. Their highest averages were obtained with σ=0.0001: accuracy of 88.44 percent, sensitivity of 99.89 percent, precision of 88.52 percent, specificity of 89.75 percent, and F1-score of 93.86 percent.

 

Both methods already perform well in this classification task, but there is always room for improvement. Future research is therefore encouraged to explore new models or methods that achieve better performance. These methods could also be applied to other datasets, so that the model is examined in various settings before it is considered an appropriate method for accurate diagnosis.


Keywords


Classification; KC-means; Kernel Function; Schizophrenia; Support Vector Machines.