Posted on Kamakura's blog by Donald R. van Deventer:
As the credit crisis is subsiding, attention is turning to the different techniques used to model and simulate default rates for all classes of credits. There is a special interest in mortgage default models, however, because of the central role that mortgage defaults have played in the 2007 to 2009 credit crisis. This blog post discusses two different techniques for mortgage default modeling: reduced form macro factor modeling and roll rate modeling. We then apply both techniques to a dataset of mortgage defaults in the United States.
Reduced form macro factor models estimate default rates by using logistic regression to link defaults with movements in macro factors of interest. An alternative approach is to use so-called “roll rate” models. Roll rate models vary slightly in their implementation, but, in the classic version, a roll rate model specifies the period-to-period probabilities that a loan moves from each delinquency state (typically “current,” 30 days past due, and 90 days past due) or from the default state to each of the other states. In this way the loan “rolls” through the delinquency and default process. Some specifications are slightly more complex, but this is the general structure. Roll rate models are very similar to the transition matrix approach used for corporate credit portfolio management, where the “states” are defined by agency ratings or internal ratings.
Obviously, time-and-state invariant transition probabilities are not the most realistic default modeling approach. We expect that more delinquencies will turn into defaults in periods of high unemployment and rapid home price decline than in periods of low unemployment and home price appreciation. However, to make up for what they lack in realism, these fixed transition probabilities allow for very simple modeling and forecasting, even though the models have no explicit link to macro-economic variables like home price appreciation, interest rates, and unemployment. There are similarities, for example, with the simplifications used in corporate credit analysis using the copula method. The copula method normally assumes that cross-company default correlations are fixed over time and states (i.e. different macro factor environments). These assumptions are highly unrealistic but attractive to analysts who are willing to sacrifice accuracy for simplicity of analysis. In the case of roll rate models, the fixed transition probabilities turn roll rate models into Markov chains where well known solution techniques can be employed.
Markov chains are a very concise way of modeling transition probabilities over discrete time intervals. We start with the current state of the portfolio. Take the case where the percentage of current loans is 80%, the percentage of delinquent loans is 15%, and the percentage of loans in default is 5%. We can represent those starting probabilities as a vector, ordered as (current, delinquent, default):
$$\pi_0 = \begin{pmatrix} 0.80 \\ 0.15 \\ 0.05 \end{pmatrix}$$
Further, assume that we have good estimates of the fraction of loans that move, over a given length of time, from each state to each of the other states:
- Movements from “current” into “delinquency” or “default,”
- Movements from “delinquent” to “current” or “default,” and
- Movements from “default” to “delinquent” or “current.” Many analysts assume that movements out of the default state are rare, but it is possible for judicial action to halt the auctioning of foreclosed properties in some circumstances.
We can represent these transition probabilities as a matrix whose i,jth entry is the probability of moving from state j to state i.
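To be concrete, we assume that in the subsequent period, 1% of the loans that were current moved into default, and 2% moved into delinquency. We assume that 40% of the loans in delinquency moved into default and 30% became current. Finally, we assume 20% of the loans in default became current, and none of the loans in default can move directly into delinquency. The remaining loans in each state stay where they are, so that the probabilities of leaving or remaining in a state sum to one. With rows and columns ordered as (current, delinquent, default), this set of transition probabilities can be depicted by the matrix P given by:
$$P = \begin{pmatrix} 0.97 & 0.30 & 0.20 \\ 0.02 & 0.30 & 0.00 \\ 0.01 & 0.40 & 0.80 \end{pmatrix}$$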
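So, writing $\pi_0$ for the starting vector and $\pi_n$ for the distribution after n periods, the distribution of probabilities next period and in the nth period is given by:
$$\pi_1 = P\,\pi_0, \qquad \pi_n = P^n\,\pi_0$$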
What is analytically convenient about this specification is that we have completely described the system with this one matrix. The analyst who wishes to simulate default experience on a portfolio of these assets simply has to reapply P to the initial vector of probabilities and then apply payoffs to each state. However, this process can be even further simplified.
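The reapplication of P described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the stay-in-state probabilities (97%, 30%, 80%) are inferred from the requirement that each column sum to one, and the per-state payoffs are purely hypothetical values chosen to show the mechanics.

```python
# Illustrative roll rate chain using the numbers quoted in the text.
# State order is (current, delinquent, default). P[i][j] is the probability
# of moving from state j to state i, so each column sums to one.
STATES = ("current", "delinquent", "default")

P = [
    [0.97, 0.30, 0.20],  # into current
    [0.02, 0.30, 0.00],  # into delinquent
    [0.01, 0.40, 0.80],  # into default
]
pi0 = [0.80, 0.15, 0.05]  # starting portfolio shares


def step(P, pi):
    """Apply the transition matrix once and return P times pi."""
    return [sum(P[i][j] * pi[j] for j in range(len(pi))) for i in range(len(P))]


def distribution_after(P, pi, n):
    """Reapply P n times to obtain the distribution in period n."""
    for _ in range(n):
        pi = step(P, pi)
    return pi


def expected_payoff(pi, payoffs):
    """Probability-weighted payoff across the states."""
    return sum(p * v for p, v in zip(pi, payoffs))


# Hypothetical payoff per unit of balance in each state (assumption).
PAYOFFS = [1.00, 0.90, 0.40]

pi1 = step(P, pi0)
pi4 = distribution_after(P, pi0, 4)
value1 = expected_payoff(pi1, PAYOFFS)
```

With these inputs, one application of P moves the portfolio from (80%, 15%, 5%) to (83.1%, 6.1%, 10.8%). Because P never changes, the entire projection is determined by the starting vector and the number of periods.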
Simple roll rate models with time and state independent transition probabilities may be unattractive based on the modeling assumptions, but those assumptions provide quite a bit of tractability and analytical convenience.
Reduced form macro factor models are more complex but dramatically more accurate, and they provide concrete links between macro-economic factors and default probabilities. This allows the analyst to calculate macro-factor stress tests like those mandated under the Supervisory Capital Assessment Program in early 2009 in the United States.
Instead of assigning fixed transition probabilities, the relevant historical default experience is fitted to a series of time and state varying macro-factor values and borrower attributes. Typical explanatory variables include things like home prices, interest rates, initial loan to value ratios, current loan to value ratios, credit scores, and so on. Default (and/or prepayment and delinquency) is modeled as:
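A standard logistic specification consistent with the definitions that follow is sketched here. Folding the intercept into x_t and writing the residual additively are assumptions made for illustration, and the fitted models may differ in detail:

```latex
y_t = \frac{1}{1 + \exp\!\left(-x_t^{\top}\beta\right)} + \varepsilon_t
```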
where yt is a variable that represents default/no default, xt is a vector of relevant macro factors, and εt represents orthogonal risk. These sorts of equations are often estimated through logistic regression, a maximum likelihood technique for binary dependent variables. For an example of how to use logistic regression to simulate correlated default in a corporate default context, please see our September 24, 2009 blog post “Modeling Correlated Default in a Reduced Form Model: A Worked Example” on www.kamakuraco.com. The logistic form can also be applied to default rates on grouped data, even though the time series of the default variable is not a vector of zeros and ones. To estimate the logistic coefficients in this context, we transform the default rate time series and apply generalized linear models for estimation.
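The grouped-data idea can be illustrated with a small sketch. It assumes the transformation is the logit of the default rate and uses ordinary least squares with a single regressor as a simple stand-in for the generalized linear model estimation mentioned above. The series below is synthetic, generated from known coefficients so the recovery can be checked; it is not Mortgage Bankers Association data, and all names are hypothetical.

```python
import math


def logit(p):
    """Map a rate in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def logistic(z):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic quarterly series (assumption, not real data): home price returns
# and default rates generated without noise from known coefficients.
TRUE_ALPHA, TRUE_BETA = -3.0, -8.0
hpi_return = [0.03, 0.02, 0.01, 0.00, -0.02, -0.04, -0.05, -0.03]
default_rate = [logistic(TRUE_ALPHA + TRUE_BETA * r) for r in hpi_return]

# Transform the grouped default rate, then fit a linear model to it.
alpha_hat, beta_hat = fit_line(hpi_return, [logit(d) for d in default_rate])

# Map fitted values back to default rates with the logistic formula.
fitted_rate = [logistic(alpha_hat + beta_hat * r) for r in hpi_return]
```

Because the synthetic rates contain no noise, the fit recovers the generating coefficients up to floating point error. Real data would require several explanatory variables and a proper treatment of the error term.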
Once we have determined the explanatory variables and their coefficients, we can then simulate the forward values of the explanatory variables xt and use the β coefficients in the logistic formula to calculate the probabilities. This is a slightly more involved process than the roll rate model, but it has a number of attractive features. The probability of default depends directly on things like home prices, interest rates, and creditworthiness. That is, default probabilities vary over time and across the business cycle. The probability of default is much higher when home prices fall than when they rise. This can be captured using this reduced form modeling technique, but it is beyond the scope of the “common practice” roll rate model discussed above. The merits of this approach are easy to summarize:
- We can directly stress test the default probabilities by varying the explanatory variables xt. Since the roll rate model is independent of the explanatory variables xt, this is not possible in roll rate models. For the same reason, the “transitional matrix” approach to corporate default modeling cannot be easily stress tested for changes in the macro-economic environment. Using the reduced form approach to mortgage default modeling, we can directly answer the question “how will the default probability change if home prices (or other variables of interest) fall an additional X%?”
- The assumptions allow for default probabilities to change over time (because mortgage age is a common explanatory variable) and with the economic environment
- Simulated defaults are more accurate because the state of the economy is allowed to directly impact the results
- As we simulate default across a variety of asset classes, the correlation between asset classes varies over time. As we saw in the 2007-2009 credit crisis, the default probabilities of Washington Mutual, Lehman Brothers and home mortgages were highly correlated because of a common dependence on the same macro-factor: home prices.
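The stress testing point in the first bullet can be made concrete with a short sketch. The intercept, coefficients, factor names, and scenario values below are all hypothetical and chosen only to show the mechanics; they are not estimates from any Kamakura model.

```python
import math


def default_probability(alpha, betas, factors):
    """Logistic default probability for one macro factor scenario."""
    z = alpha + sum(betas[name] * factors[name] for name in betas)
    return 1.0 / (1.0 + math.exp(-z))


def stress_home_prices(factors, extra_decline):
    """Return a copy of the scenario with home prices falling further."""
    stressed = dict(factors)
    stressed["home_price_return"] -= extra_decline
    return stressed


# Hypothetical coefficients (assumption): defaults rise when home prices
# fall and when unemployment or mortgage rates rise.
ALPHA = -4.0
BETAS = {
    "home_price_return": -6.0,
    "unemployment_rate": 15.0,
    "mortgage_rate": 5.0,
}

# Hypothetical base scenario, all values expressed as decimals.
base = {
    "home_price_return": -0.05,
    "unemployment_rate": 0.06,
    "mortgage_rate": 0.065,
}

pd_base = default_probability(ALPHA, BETAS, base)
# "How will the default probability change if home prices fall an additional 10%?"
pd_stressed = default_probability(ALPHA, BETAS, stress_home_prices(base, 0.10))
```

The same function answers the question for any other factor by changing the scenario dictionary, which is exactly what a fixed transition matrix cannot do.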
Comparing the Accuracy of Reduced Form and Roll-Rate Mortgage Default Models
In this section, we compare the relative performance of reduced form and roll rate mortgage default models on a common data set. This model comparison is consistent with the standards set forth in the Basel II capital calculations of the Basel Committee on Banking Supervision. The typical procedures for credit model testing are outlined in van Deventer, Li and Wang (2007, The Basel Handbook, second edition, Michael Ong, editor). We report our testing results in abbreviated form here using quarterly mortgage default data from the Mortgage Bankers Association.
The first question we posed in model testing is this: “Does the incorporation of 30 day and 90 day delinquency rates in a well-specified reduced form model improve one’s ability to predict the default rates supplied by the Mortgage Bankers Association?” Note that this is not the same question as asking whether or not the delinquency status of a specific loan is a statistically significant predictor of the probability of default of that particular loan. Our focus here is 10 years of quarterly data supplied by the Mortgage Bankers Association, not whether or not Fred Flintstone’s loan defaults next month.
Our base model for model testing is the suite of mortgage models updated quarterly by Kamakura. These models are documented in the “Kamakura Risk Information Services Default Probability Functions Technical Guide, Mortgage Bankers Association Default Rate, Version 20090331-2.” These models forecast default rates for four classes of mortgages:
- Prime fixed rate loans
- Prime adjustable rate loans
- Subprime fixed rate loans
- Subprime adjustable rate loans
Mortgage default rates are fitted to a logistic function with explanatory variables that include interest rates, the state of the economy, and home price returns. The first graph below shows that the in-sample accuracy of the subprime adjustable rate model was a 97.17% adjusted R-squared on the transformed subprime adjustable rate default rate. The second graph below shows that, even when the model is estimated on data only through March 2007, the integrity of the model is such that out of sample performance was excellent even on the subprime adjustable rate mortgage sector. As is well known, it was this sector at the heart of the 2007-2009 crisis.
In order to answer the question posed above, we added 30 day and 90 day delinquencies reported by the Mortgage Bankers Association to Kamakura’s standard subprime adjustable rate mortgage default model. The results will surprise many: after we account for the effects of macro factors, the percentage of mortgages that are delinquent by one month or three months had no additional explanatory power in predicting the aggregate mortgage default rate. To be clear, the 30 day and 90 day lagged delinquency rates were not statistically significant when combined with the statistically significant macro factors in the standard Kamakura models. This result can be best understood by one of the bullet points above: things become more and more correlated as macro factors move. Movements in macro factors that cause defaults to increase (say a decrease in home prices) are very likely to cause delinquencies to increase as well. This effect is so strong that the marginal explanatory power of the level of delinquencies given the movements in the other factors is indistinguishable from zero.
Another important test of a model is its performance out of sample. It is important to rigorously test all models out of sample to ensure their stability and reliability looking forward. To that end, we wanted to see how well a model estimated solely on pre-credit crisis data would perform during the crisis, using both the roll rate and reduced form approaches.
There are many ways to estimate transition probabilities, but they presented particular problems on the 1998-2007 Mortgage Bankers Association default dataset. In particular, the constraints that the probabilities remain positive and add up to one were binding: transition probabilities could not be directly estimated without imposing additional constraints. Instead, we incorporate the transition probabilities from the Federal Reserve Bank of New York’s credit conditions release for first lien subprime mortgages [1]. They provide the following 90 day transition probability matrix as of March 2007:
Given the initial probabilities in March of 2007 of 0.93, 0.0367, and 0.0332 for Current, Delinquent, and Default respectively, we can use the transition probabilities in P to trace out a predicted path for defaults over the next several years. We can compare this prediction to the actual default experience over that period and the predicted default experience for our reduced form macro factor model that is estimated only on data from March 2007 and earlier. The following graph contains the results.
As the graph shows, the roll rate model (while very simple to implement) produces projected defaults that are far too low. Why? First, since roll rate models are based on historical aggregated relationships, they understate the amount of defaults during periods with big macro factor movements (like large declines in home prices during the credit crisis). We can see this by the fact that the green line lies roughly ten percentage points below the actual default experience at every date. Second, the path of the roll rate models is very smooth. Actual defaults, by contrast, show some volatility as the composition of the underlying pool is affected by new mortgage originations added to the pool, by defaults, and by prepayments.
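The roll rate projection discussed above can be sketched as follows. The function checks the positivity and sum-to-one constraints and then traces the default share quarter by quarter. The New York Fed matrix is not reproduced in this post, so the matrix below is only a placeholder (the illustrative P from the earlier example), and the resulting path says nothing about actual subprime experience. The matrix follows this post's convention that the (i, j) entry is the probability of moving from state j to state i.

```python
def is_column_stochastic(P, tol=1e-9):
    """True if all entries are nonnegative and every column sums to one."""
    n = len(P)
    nonnegative = all(P[i][j] >= 0.0 for i in range(n) for j in range(n))
    sums_to_one = all(
        abs(sum(P[i][j] for i in range(n)) - 1.0) <= tol for j in range(n)
    )
    return nonnegative and sums_to_one


def project_default_path(P, pi0, periods, default_index=2):
    """Trace the default share over time under a fixed transition matrix."""
    if not is_column_stochastic(P):
        raise ValueError("transition matrix must be column stochastic")
    total = sum(pi0)
    pi = [p / total for p in pi0]  # remove rounding in the reported shares
    path = [pi[default_index]]
    for _ in range(periods):
        pi = [sum(P[i][j] * pi[j] for j in range(len(pi))) for i in range(len(P))]
        path.append(pi[default_index])
    return path


# Placeholder matrix (assumption): NOT the New York Fed transition matrix.
P_PLACEHOLDER = [
    [0.97, 0.30, 0.20],
    [0.02, 0.30, 0.00],
    [0.01, 0.40, 0.80],
]

# Initial shares quoted in the text for March 2007
# (current, delinquent, default); they sum to 0.9999 before renormalizing.
pi_march_2007 = [0.93, 0.0367, 0.0332]

default_path = project_default_path(P_PLACEHOLDER, pi_march_2007, periods=8)
```

Each step is a weighted average of the previous shares with fixed weights, so nothing in the calculation responds to home prices or unemployment. That is the structural reason the projected path is smooth.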
Because of these problems, many practitioners who use roll rate models allow the transition probabilities to vary over time according to some functional relationship that drives the transition probabilities. This removes much of the analytical convenience of the roll rate models in exchange for limited improvements in realism and accuracy.
The question then becomes “how does one allow transition probabilities to vary over time and across states?” This naturally leads back to our first model test on this subject: incorporating information on last period’s default and delinquency rates into a reduced form macro factor model. As we discussed above, adding information on default and delinquency to Kamakura’s best practice reduced form model provides no additional explanatory power. The extension to turn roll rate models into more realistic tools does two things: first, it removes the analytical tractability and parsimony of Markov chains, and second, it provides no additional explanatory power (and no additional accuracy) compared to reduced form macro factor modeling.
Markov chain roll rate models are a convenient, but unrealistic, depiction of reality. The lack of realism manifests through projected default rates that are often biased, and which exhibit unreasonable behavior. We find that mortgage default modeling can be done much more accurately on grouped default rate data if macro factors are linked directly to the default rate itself. We look forward to re-addressing this issue on loan level mortgage default data in another blog entry. The answer is different but closely related to our results on the Mortgage Bankers Association default rate data series.
[1] This can be found at http://data.newyorkfed.org/creditconditions/