
Neural Network Survival Analysis for Personal Loan Data

Traditionally, the primary goal of credit scoring was to distinguish good customers from bad customers without taking into account when customers tend to default. The latter issue is, however, increasingly becoming a key research question, since the whole process of credit granting and repayment is more and more viewed as dynamic rather than static. The advantages of having models that estimate when customers default are

  • the ability to compute the profitability over a customer’s lifetime and perform profit scoring;
  • the ability to give the bank an estimate of default levels over time, which is useful for debt provisioning;
  • the ability to help decide upon the term of the loan;
  • the ability to incorporate changes in economic conditions more easily.

To date, few authors have addressed the issue of predicting customer default times. Narain was one of the first authors to use survival analysis methods for credit scoring. He analysed a data set of 1242 applicants accepted for a 24-month loan between mid-1986 and mid-1988. The data was analysed using the Kaplan-Meier method and by fitting exponential regression models. The results obtained were reported to be encouraging and reasonable.
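To make the Kaplan-Meier method mentioned above concrete, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the six-loan toy data set are hypothetical illustrations, not Narain's data or code; the estimator itself is the standard one, S(t) = Π (1 - d_i / n_i) over the observed default times.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct observed default time.

    durations: months on book until default or censoring
    observed:  True if a default was observed, False if censored
    """
    defaults = Counter(t for t, d in zip(durations, observed) if d)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = defaults.get(t, 0)
        if d:
            # Product-limit step: multiply by the share surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Defaulted and censored loans both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six loans, not taken from any study cited here.
durations = [3, 5, 5, 8, 12, 12]
observed = [True, True, False, True, False, False]

for t, s in kaplan_meier(durations, observed):
    print(f"month {t:2d}: S(t) = {s:.3f}")
```

Censored loans (for example, those still performing at the end of the observation window) contribute to the number at risk up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimate from a simple default-rate tabulation.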

Banasik et al. report on the use of the proportional hazards model for predicting when borrowers default or pay off early. They use personal loan data from a major U.K. financial institution, consisting of application information for 50000 loans accepted between June 1994 and March 1997 together with their monthly performance description for the period up to July 1997. The data was analysed using the non-parametric proportional hazards model (no baseline hazard assumption), two parametric proportional hazards models using exponential and Weibull baseline hazards, and an ordinary logistic regression approach. Stepanova and Thomas continue the research by Banasik et al. and try to further improve the performance of the estimated proportional hazards models.

In other work, Stepanova and Thomas perform behavioral scoring using PHAB (proportional hazards analysis behavior scores) models. The authors conclude that the PHAB scores are useful as indicators of both risk and profit.

Although the proportional hazards model is the most frequently used model for survival analysis, it still suffers from a number of drawbacks. A first drawback is that the functional form of the inputs remains linear or some mild extension thereof. If more complex terms are to be included (e.g. interaction terms between inputs, quadratic terms, ...), they must be specified somewhat arbitrarily by the user. Furthermore, this linear form produces extreme hazard rates for subjects with outlying input values. Finally, in the standard proportional hazards model, the baseline hazard function is assumed to be uniform across the entire population, resulting in proportional hazards. Although time-varying inputs and stratification allow for non-proportionality, these extensions might not provide the best way to model the baseline variation. In this paper, we will discuss how neural networks may offer an answer to these problems. This will be investigated in a credit scoring context by studying when customers tend to default or pay off their loan early.
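The sensitivity to outlying inputs described above can be illustrated with a short sketch. In the standard proportional hazards form, h(t | x) = h0(t) exp(βx), the hazard relative to the baseline depends on the input only through exp(βx). The coefficient value and the input values below are hypothetical, chosen purely for illustration, and the helper name `relative_hazard` is an assumption of this sketch.

```python
import math


def relative_hazard(x, beta):
    """Hazard ratio exp(beta * x) relative to a subject with x = 0."""
    return math.exp(beta * x)


beta = 0.8  # hypothetical coefficient for a single standardised input

# The ratio grows exponentially as the input moves away from zero,
# so an outlying subject receives an extreme hazard rate.
for x in (0.0, 1.0, 2.0, 5.0):
    print(f"x = {x:3.1f}  hazard ratio = {relative_hazard(x, beta):7.2f}")
```

Because the ratio between two subjects does not depend on t, their hazard curves stay proportional over the whole loan term, which is the proportionality assumption that the paragraph above refers to.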

This paper is organised as follows. In section 2, we briefly discuss the proportional hazards model. Section 3 presents a literature overview on the use of neural networks for survival analysis. The empirical setup and the results are reported in section 4. Section 5 concludes the paper.
