Open Conference Systems, MISEIC 2018

Comparative Study of Application of Algorithm C4.5 and Naïve Bayes as Supporting Money Loan Decision (Case Study at Employee Cooperative PT Karyamitra Budisentosa Pandaan)
Akas Bagus Setiawan, Febriliyan Samopa

Last modified: 2018-07-07

Abstract


The money lending system of the employee cooperative of PT. Karyamitra Budisentosa plays a central role: the cooperative has 100 members, and loan applications must be analyzed to determine which members are eligible for a loan. Non-performing loans lead to bad debts, which disrupt the cooperative's financial system and business processes. Based on data collected from 2013 to 2017 at the Employee Cooperative of PT. Karyamitra Budisentosa, non-performing loans, when compared with loans with paid-off status, amounted to 57.69% in 2013, 50% in 2014, 52.38% in 2015, 71.4% in 2016, and 72.72% in 2017.

This research uses RapidMiner 8.0, a machine learning tool, to learn from the historical loan data with the C4.5 and Naïve Bayes methods; the method with the higher AUC (Area Under the Curve) value is then selected. The AUC value can be interpreted as the average sensitivity over all possible specificity values, and it is used here as a general measure of diagnostic performance on the historical data. Accuracy, precision, and recall are reported alongside it. Naïve Bayes is a method that calculates class probabilities from how frequently attribute values occur together with each class in the data. The C4.5 algorithm is one of several decision tree algorithms; it converts the data into a decision tree, from which rules are then derived.
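As a rough illustration of what the AUC measures, the sketch below computes it as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. This is a generic textbook formulation, not RapidMiner's internal implementation; the function name `auc_from_scores` and the toy labels and scores are invented for the example and are not taken from the cooperative's data.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = positive class) and classifier scores, for illustration only.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the higher value is preferred when comparing the two methods.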

Both methods were compared under several test scenarios, each using 5-fold cross-validation. Naïve Bayes achieved an AUC of 0.875 with Automatic sampling, 0.896 with Shuffled sampling, 0.875 with Stratified sampling, and 0.819 with Linear sampling. C4.5 achieved an AUC of 0.779 with Automatic sampling, 0.781 with Shuffled sampling, 0.781 with Stratified sampling, and 0.803 with Linear sampling.
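The sampling scenarios differ in how records are assigned to the five folds. The sketch below shows one plausible reading of three of them: linear sampling keeps consecutive blocks in the original record order, shuffled sampling assigns records at random, and stratified sampling assigns at random while keeping the class proportions similar in every fold. The function names and the toy labels are assumptions made for this illustration; they are not RapidMiner's API, and the exact partitioning RapidMiner performs may differ in detail.

```python
import random
from collections import defaultdict


def linear_folds(n, k):
    """Split indices 0..n-1 into k consecutive blocks, preserving record order."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def shuffled_folds(n, k, seed=0):
    """Shuffle the indices, then deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def stratified_folds(labels, k, seed=0):
    """Shuffle within each class, then deal each class round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


# Hypothetical loan statuses, sorted by class to show why the scenarios can differ.
toy_status = ["paid"] * 10 + ["bad"] * 10
lin = linear_folds(len(toy_status), 5)
shuf = shuffled_folds(len(toy_status), 5)
strat = stratified_folds(toy_status, 5)
print(lin[0])                                  # first linear fold: consecutive records
print([toy_status[i] for i in strat[0]])       # stratified fold: both classes present
```

With records sorted by class, as in the toy data, a linear fold can contain a single class only, whereas a stratified fold contains both; this is one reason the AUC values can vary between sampling scenarios.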

The Linear sampling scenario is considered the best because the AUC values of the C4.5 and Naïve Bayes methods are nearly balanced, although Naïve Bayes still performs better. The results of this test are as follows.

Table 1. Comparison of accuracy, precision, recall, and AUC for the C4.5 and Naïve Bayes methods (Table 1 is in the supplementary file because it failed to upload on this sheet)

Table 1 shows that the Naïve Bayes method has the highest accuracy, recall, and AUC, with values of 76.77%, 74.42%, and 0.840, respectively. The highest precision, 76.47%, is achieved by the C4.5 method. The results in Table 1 are plotted in Figure 1 below.

Figure 1. Comparison of performance test results for the 5-fold Linear sampling scenario (Figure 1 is in the supplementary file because it failed to upload on this sheet)
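For readers unfamiliar with the metrics compared in Table 1, the sketch below shows how accuracy, precision, and recall are conventionally derived from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's confusion matrices, which are not given in this abstract.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                      # share of all correct predictions
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # correct among actual positives
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts for illustration only.
toy_metrics = classification_metrics(tp=30, fp=10, fn=10, tn=50)
print(toy_metrics)
```

Because precision and recall penalize different kinds of errors, one method can lead on precision while another leads on recall, as happens with C4.5 and Naïve Bayes here.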

Based on this research, Naïve Bayes is the recommended method for implementation in a Decision Support System.


Keywords


Algorithm C4.5, Naïve Bayes, AUC, Loan money, RapidMiner 8.0, machine learning, data history.