TY - BOOK
AU - Guzman, Jhessa Crizelle P. and Santiago, Jonathan C.
TI - An Enhancement of the C4.5 Algorithm Applied in Credit Risk Management
AV - QA76.9 G89 2016
U1 - .
PY - 2016///
CY - .
PB - .
KW - academic writing
N1 - ABSTRACT: Credit risk management is a crucial issue for banks and financial institutions, which need to know how likely a credit applicant is to default on a financial obligation. Credit scoring supports a precise judgment of applicants' creditworthiness and helps financial institutions grant credit while minimizing possible losses. C4.5 is a data mining algorithm that generates a decision tree usable as a statistical classifier, and it serves as a credit scoring technique to help institutions decide whether to grant credit. It takes a training set as input and uses information entropy to choose the attribute that most effectively splits the set, namely the attribute with the highest information gain. Although C4.5 is an improvement on the ID3 algorithm, the researchers found three limitations in the existing algorithm. First, it cannot detect noisy data, which lessens its accuracy in decision making. Second, it cannot determine attribute correlation. Third, it needs to scan all values of a continuous attribute to find the threshold. To improve the algorithm, the researchers devised solutions to these three problems.
The researchers created an enhanced C4.5 algorithm that detects noisy data for more accurate decision making, determines attribute correlation using the Pearson correlation coefficient to reduce the number of attributes evaluated each time it chooses a splitting attribute, and reduces the computation needed to find the threshold of a continuous attribute ; F
ER -
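The abstract refers to information entropy, information gain, and the Pearson correlation coefficient without showing how they are computed. The following is a minimal Python sketch of the textbook definitions only; it is not the researchers' enhanced algorithm, and the record does not say how they apply the correlation coefficient. The toy applicant rows, the attribute names (employed, owns_home, default), and the function names are all hypothetical. C4.5 as commonly described also normalizes information gain into a gain ratio, which this sketch leaves out.

```python
from collections import Counter
from math import log2, sqrt


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Target entropy minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy training set; the values are illustrative, not real data.
APPLICANTS = [
    {"employed": "yes", "owns_home": "yes", "default": "no"},
    {"employed": "yes", "owns_home": "no", "default": "no"},
    {"employed": "no", "owns_home": "yes", "default": "yes"},
    {"employed": "no", "owns_home": "no", "default": "yes"},
]

# Pick the candidate attribute with the highest information gain.
best = max(
    ["employed", "owns_home"],
    key=lambda name: information_gain(APPLICANTS, name, "default"),
)
print(best)
```

In this toy set, splitting on employed separates the two classes completely while owns_home does not separate them at all, so employed is the attribute selected for the split.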