Last modified: 2018-07-07
Abstract
Esti Latifah1, Sarini Abdullah2 and Saskya Mary Soemartojo3
1, 2 Department of Mathematics, Universitas Indonesia, INDONESIA.
(E-mail: esti.latifah@sci.ui.ac.id, sarini@sci.ui.ac.id)
3 Department of Mathematics, Universitas Indonesia, INDONESIA.
(E-mail: saskya@sci.ui.ac.id)
ABSTRACT
Parkinson’s disease (PD) is a neurodegenerative disorder caused by a lack of dopamine in a specific area of the brain called the substantia nigra. It is a long-term degenerative disorder of the central nervous system that mainly affects the motor system, with effects such as difficulty in speech, problems with swallowing and dressing, trouble with handwriting and other daily activities, and tremor. To address this problem, this study uses the Parkinson’s Progression Markers Initiative (PPMI) database to classify patients into two subtypes: tremor dominant (TD) and postural instability gait difficulty (PIGD). Identifying the factors that characterize the PD subtypes is crucial for choosing the appropriate therapy for PD patients. It also yields the characteristics of patients classified as TD or PIGD. A classification method is used to identify these factors.
A classification method assigns data to categories (classes). Classification methods such as Support Vector Machine (SVM), Neural Network, Decision Tree, and Random Forest have been widely used in various fields. SVM tends to have difficulty handling a large number of samples and features. Neural Network, on the other hand, can handle a large number of samples and features but requires many iterations to reach a good model. Decision Tree also performs well on problems of this size. However, the model it produces is unstable and has high variance, which leads to overfitting. Because of these limitations, Random Forest is a suitable method for this problem.
Random Forest is an ensemble of decision trees: it builds a forest consisting of a number of trees. Each tree has the same structure as an ordinary decision tree, but it is grown differently. Bagging and random feature selection are the two main techniques of Random Forest. In bagging, each tree is trained on a bootstrap sample of the training data. Instead of using all features, Random Forest randomly selects a subset of features as split candidates at each node when growing a tree. Each tree generates its own prediction, and these separate predictions are combined by majority vote. This technique is used to obtain a robust model and to reduce overfitting. To assess prediction performance, Random Forest uses a type of cross-validation called out-of-bag (OOB) estimation.
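The mechanics described above (bootstrap sampling, a random feature subset at each node, majority vote, and OOB evaluation) can be illustrated with a minimal sketch in plain Python. This is not the implementation used in the study. The function names, the depth limit, the default of about sqrt(p) candidate features per node, and the toy data are all assumptions made for illustration only.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, rows, features):
    """Best (feature, threshold) among the candidate features, by weighted Gini."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [i for i in rows if X[i][f] <= t]
            right = [i for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best

def grow_tree(X, y, rows, mtry, rng, depth=0, max_depth=5):
    """Grow one tree; a leaf is a class label, an inner node is a tuple."""
    labels = [y[i] for i in rows]
    if depth >= max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]
    # Random feature selection: only mtry features are candidates at this node.
    features = rng.sample(range(len(X[0])), mtry)
    split = best_split(X, y, rows, features)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = split
    return (f, t,
            grow_tree(X, y, left, mtry, rng, depth + 1, max_depth),
            grow_tree(X, y, right, mtry, rng, depth + 1, max_depth))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def fit_forest(X, y, n_trees=50, mtry=None, seed=0):
    """Bagging: each tree is trained on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    mtry = mtry or max(1, int(len(X[0]) ** 0.5))  # assumed default: about sqrt(p)
    forest = []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        oob = set(range(n)) - set(boot)              # rows this tree never saw
        forest.append((grow_tree(X, y, boot, mtry, rng), oob))
    return forest

def predict(forest, x):
    """Majority vote over the predictions of all trees."""
    votes = Counter(predict_tree(tree, x) for tree, _ in forest)
    return votes.most_common(1)[0][0]

def oob_error(forest, X, y):
    """Each row is scored only by the trees for which it was out of bag."""
    wrong = counted = 0
    for i in range(len(X)):
        votes = Counter(predict_tree(tree, X[i])
                        for tree, oob in forest if i in oob)
        if votes:
            counted += 1
            wrong += votes.most_common(1)[0][0] != y[i]
    return wrong / counted if counted else float("nan")

# Toy data (made up, NOT PPMI): four hypothetical item scores on a 0-4 scale.
data_rng = random.Random(42)
X = ([[data_rng.choice([2, 3, 4]) for _ in range(4)] for _ in range(20)]
     + [[data_rng.choice([0, 1]) for _ in range(4)] for _ in range(20)])
y = ["TD"] * 20 + ["PIGD"] * 20

forest = fit_forest(X, y, n_trees=50, seed=1)
err = oob_error(forest, X, y)
print("OOB error on toy data:", err)
```

Because each bootstrap sample leaves out roughly a third of the rows, the OOB estimate needs no separate hold-out set, which is why the sketch records the out-of-bag rows alongside each tree.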
Random Forest was applied to 207 PD samples with 47 variables obtained from Parts II and III of the Movement Disorder Society-Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) at event V12, and the factors of the PD subtypes were identified. The important variables based on mean decrease Gini include constancy of rest tremor, rest tremor amplitude (RUE), tremor, walking and balance, freezing, postural stability, postural tremor of the right hand, and rest tremor amplitude (LUE). The overall accuracy is about 94.48%; about 91.43% of the PIGD class and about 97.33% of the TD class are classified correctly. PD patients classified as PIGD have lower values for constancy of rest tremor, rest tremor amplitude (RUE), tremor, rest tremor amplitude (LUE), and postural tremor of the right hand than PD patients with TD, and higher values for postural stability, walking and balance, and freezing.
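For readers unfamiliar with the mean decrease Gini criterion, one common formulation is sketched below. The notation is an assumption introduced here, not taken from the study: p_k(t) is the share of class k in node t, n_t is the number of samples in t, and t_L and t_R are its two child nodes.

```latex
% Gini impurity of a node t (assumed notation, see lead-in)
\[
G(t) = 1 - \sum_{k \in \{\mathrm{TD},\,\mathrm{PIGD}\}} p_k(t)^2
\]
% Decrease in impurity achieved by splitting t into t_L and t_R
\[
\Delta G(t) = G(t) - \frac{n_{t_L}}{n_t}\,G(t_L) - \frac{n_{t_R}}{n_t}\,G(t_R)
\]
% Made-up example: a node with 4 TD and 4 PIGD samples, split perfectly
\[
G(t) = 1 - (0.5^2 + 0.5^2) = 0.5, \qquad
\Delta G(t) = 0.5 - \tfrac{4}{8}\cdot 0 - \tfrac{4}{8}\cdot 0 = 0.5
\]
```

Under this formulation, the importance of a variable is obtained by summing the decreases over all nodes that split on that variable and averaging over the trees in the forest, so variables that often produce clean TD/PIGD separations receive higher values.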