
What We Know About Mortgage
Lending Discrimination In America

Stage 3: The Loan Approval or Disapproval Decision

The decision about whether to accept or reject a mortgage loan application has been the subject of an impressive amount of sophisticated statistical analysis. The primary information used in these studies is a repository of data compiled as a consequence of the 1975 Home Mortgage Disclosure Act (HMDA). HMDA mandated the annual reporting of information, by all mortgage lending institutions with at least $10 million in assets, on the number and dollar amount of both home mortgage and home improvement loans, by census tract or county. Since passage of Section 1211 of the Financial Institutions Reform, Recovery, and Enforcement Act of 1989, HMDA data have also included the race, gender, and income of mortgage loan applicants.

HMDA data are routinely used to compare a lender's denial rates for minority and white loan applicants, as a measure of its lending performance with regard to minorities. These analyses typically show that mortgage applications by minorities are rejected at higher rates than applications by whites, even controlling for such factors as applicant income.
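
As a rough illustration of the comparison described above, the sketch below computes denial rates by group and the ratio between them. The function name `denial_rates` and the sample records are hypothetical and were invented for this example; they are not HMDA figures. The sketch applies no controls for income or creditworthiness, so it shows only the raw comparison.

```python
from collections import Counter

def denial_rates(applications):
    """Return {group: denial rate} from (group, denied) records."""
    totals, denials = Counter(), Counter()
    for group, denied in applications:
        totals[group] += 1
        denials[group] += int(denied)
    return {group: denials[group] / totals[group] for group in totals}

# Hypothetical records, invented for illustration only (not HMDA data).
sample = (
    [("minority", True)] * 30 + [("minority", False)] * 70
    + [("white", True)] * 15 + [("white", False)] * 85
)

rates = denial_rates(sample)
# Ratio of the minority denial rate to the white denial rate.
disparity_ratio = rates["minority"] / rates["white"]
print(rates, disparity_ratio)
```

A ratio above 1 in such a comparison signals a disparity in outcomes; by itself it says nothing about the cause.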

Although many useful studies are based on HMDA data, these data alone cannot prove or disprove the existence of lending discrimination, because they do not provide enough information to control for all relevant differences between white and minority borrowers. Even though HMDA data now include borrowers' race, ethnicity, and income, they do not include critical information on the wealth and debt levels of loan applicants, their credit histories, the characteristics of properties serving as collateral, the terms of loans for which applications were submitted, or the underwriting criteria used to determine eligibility. Herein lies a good part of the story behind the fierce analytical debate about what can and cannot be said about discrimination in mortgage lending.

The seminal work in the debate over discrimination at the loan approval stage is a study by researchers at the Federal Reserve Bank of Boston that was initially released in 1992 and published in final form in 1996. Other recent studies make valuable methodological and substantive contributions, but the lack of a data set comparable to the one collected for the Boston Fed Study casts a shadow over all this other research and makes the results difficult to interpret. Only the body of work that includes the Boston Fed Study itself and the contributions of its numerous critics makes it possible to derive a credible estimate of discrimination in the loan approval process at the regional level.

The Boston Fed Study began with the HMDA data, but assembled the additional information needed to measure discrimination through a survey of Boston-area lenders. This survey collected 38 additional variables for each application in the Boston Fed sample, covering the whole array of information needed to control for legitimate differences in applicant creditworthiness. That the Boston Fed sponsored the study and gained the cooperation of area lenders could suggest that the lending community did not expect the study to find statistically compelling evidence of discrimination. But it did just that--finding that minority status was indeed a statistically significant and fairly large influence in lending decisions, even when a mass of detailed information systematically related to the lending decision was controlled for in the statistical analysis. The Boston Fed's basic model found that the probability of loan denial in the Boston area was about 80 percent higher for a black or Hispanic applicant than for a white applicant, after loan, property, and applicant characteristics were all controlled for.

The findings of the Boston Fed Study had an explosive effect on the mortgage lending discrimination debate, initially stimulating extensive soul searching by the industry followed by a great deal of analytic scrutiny of both the study's data quality and its methodological approaches. The findings have emerged remarkably intact in the face of most of this scrutiny. But certain complex analytical questions remain that some analysts conclude are enough to undermine the credibility of the original findings. Specifically:

    • Omitted Variables. Key variables that affect the lending decision, and that are correlated with race or ethnicity, may have been omitted from the Boston Fed analysis. If so, the estimated impact of minority status on the approval decision is overstated, because it partly reflects the impact of other, legitimate factors that vary with race and ethnicity.
    • Data Errors. Mistakes in data entry or data coding may have distorted the Boston Fed analysis, possibly leading to over-estimates of the importance of minority status. In addition, some loans in the Boston Fed data set may have been incorrectly classified as approved or disapproved.
    • Incorrect Specification. When analysts "specify" a predictive equation, they have to make assumptions about how different factors interact to influence the approval or denial decision. If the Boston Fed's equations were incorrectly specified, they might again overstate the importance of minority status.
    • Endogenous Explanatory Variables. Some of the variables in the Boston Fed equations that are used to help explain or predict loan approval may in fact be decided at the same time as--or in conjunction with--the loan approval decision. If this is the case, the independent effects of minority status on loan outcomes cannot be disentangled without more complex models of the interactive process.
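
The mechanics of the omitted-variables critique can be sketched with a small constructed example. All counts below are hypothetical and chosen only to make the arithmetic visible; they say nothing about the Boston Fed data. In this invented population, denial depends only on a credit indicator, yet the raw comparison shows a gap between groups because the indicator is distributed differently across them. Whether anything like this occurs in real data is an empirical question.

```python
# (group, credit, applicants, denied): hypothetical counts for illustration.
cells = [
    ("minority", "poor", 500, 200),
    ("minority", "good", 500, 50),
    ("white", "poor", 200, 80),
    ("white", "good", 800, 80),
]

def denial_rate(group, credit=None):
    """Denial rate for a group, optionally within one credit stratum."""
    rows = [c for c in cells if c[0] == group and credit in (None, c[1])]
    return sum(r[3] for r in rows) / sum(r[2] for r in rows)

# Gap when the credit indicator is omitted from the comparison.
raw_gap = denial_rate("minority") - denial_rate("white")

# Gaps once the credit indicator is held fixed.
gap_poor = denial_rate("minority", "poor") - denial_rate("white", "poor")
gap_good = denial_rate("minority", "good") - denial_rate("white", "good")

print(raw_gap, gap_poor, gap_good)
```

Here the raw gap is positive while both within-stratum gaps are zero, which is the pattern the critics have in mind when they argue that an omitted variable could inflate the estimated effect of minority status.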

Because of the importance of the Boston Fed Study, this project undertook a comprehensive review and re-analysis to assess these critiques. In some cases, re-analysis shows that the critics are simply wrong; the problem they identify does not exist or the bias involved is empirically insignificant. In several cases, however, we agree with the critics that a limitation in the Boston Fed Study could potentially lead to a serious overstatement of discrimination, and we have explored these cases in detail. Moreover, we find that the critical literature has raised several important issues concerning the interpretation of the Boston Fed Study's results. This analysis leads to the following major conclusions:

    • The large differences in loan denial rates between minority and white applicants found by the Boston Fed cannot be explained away by data errors, omitted variables, or inter-relationships between factors that influence loan approval (endogeneity).
    • The Boston Fed Study results do not definitively prove either the presence or the absence of differential treatment discrimination in loan approval, nor do they definitively prove either the presence or the absence of disparate impact discrimination.
    • The Boston Fed Study results provide such strong evidence of differential denial rates (other things being equal) that they establish a presumption that discrimination exists, and effectively shift the "burden of proof" to those who would argue that these differences are entirely due to legitimate underwriting criteria that reflect an applicant's creditworthiness and therefore serve a business necessity.
    • The best way to determine whether the observed minority-white differences in loan denials are the result of underwriting practices justified by business necessity would be to replicate the Boston Fed Study with the addition of loan performance data, while the best and possibly only way to distinguish differential treatment discrimination from disparate impact discrimination at the loan approval stage is to conduct paired testing.

The remainder of this section explains these conclusions in greater detail and then turns to other evidence regarding potential discrimination at the loan approval stage.

What We Can Learn from the Boston Fed Study

Critics of the Boston Fed Study have argued that key variables that affect the lending decision, and that are correlated with race, may have been omitted from the analysis. If so, the estimated impact of race on the approval decision would be overstated because it would incorporate the impacts of other, legitimate factors that co-vary with race. Others present evidence of data errors in loan terms, application characteristics, or loan outcomes in the Boston Fed database that they think lead to upwardly biased estimates of discrimination. Finally, critics argue that single-equation models of loan denial are inevitably biased because many loan terms--in particular the loan-to-value ratio, which changes if the applicant changes the downpayment--are actually the result of negotiation or participation in a special loan program. In other words, key explanatory variables in the Boston Fed equations may in fact be endogenous--decided at the same time or in conjunction with the outcome they are supposed to help explain or predict. If this is the case, the independent effect of race may be incorrectly estimated.

We have examined all of these arguments and, where possible, tested them with the public-use version of the Boston Fed Study's data. We conclude that the large differences in loan denial rates between minority and white applicants that are identified by the Boston Fed Study cannot be explained by data errors, omitted variables, or the endogeneity of loan terms. No reasonable procedure for solving any of these potential problems eliminates the large positive impact of minority status on loan denial. In particular, some scholars have made the reasonable argument that a variable indicating whether a loan application meets the lenders' underwriting guidelines should be included in the Boston Fed's loan denial equation to correct for aspects of applicants' credit histories that are omitted from other explanatory variables. If this variable does indeed capture such omitted elements, however, then the unobserved factors influencing "meets guidelines" will be correlated with unobserved factors influencing loan denial. We show that this is not the case. It follows that the "meets guidelines" variable does not correct for omitted variables. In addition, we find that accounting for the endogeneity of various loan terms never results in a substantial reduction in the estimated minority-status coefficient, and in several particularly plausible cases, this step actually makes that coefficient larger.

Based on our review of the Boston Fed Study and its critics, we conclude that no study has demonstrated either the presence or the absence of differential treatment discrimination in loan approval, at least not in a large sample of lenders. This conclusion puts the authors of this report at odds with the authors of the Boston Fed Study, who claim that they measure differential treatment discrimination, but also with several of their critics, who claim that there is no discrimination at all. The Boston Fed Study results constitute differential treatment discrimination only under the assumption that all lenders use the same underwriting guidelines. With this assumption, any group-based difference in treatment after controlling for underwriting variables implies that the guidelines are applied differentially across groups, which is, by definition, differential treatment discrimination. Because virtually all lenders sell some of their loans in the secondary mortgage market, they have some incentive to use the underwriting guidelines that institutions in that market, such as Fannie Mae and Freddie Mac, have established. However, not all loans are sold in the secondary market, and the lending process often involves many individuals in the same lending institution, who may not always apply guidelines uniformly. No evidence currently exists to determine the extent to which underwriting standards vary across lenders.

Moreover, we find that no study has conclusively demonstrated either the presence or the absence of disparate impact discrimination in loan approval. The Boston Fed Study results measure disparate impact discrimination only under the assumptions that 1) different lenders use different underwriting guidelines, 2) existing guidelines are accurately linked to loan profitability, on average, and 3) deviations from average guidelines cannot be justified on the basis of business necessity. These assumptions could be satisfied, for example, if underwriting guidelines vary across lenders solely for idiosyncratic reasons or if some lenders purposefully develop guidelines that have a disparate impact on minority applicants. However, no existing study sheds light on whether or not these assumptions are met.

Although the Boston Fed Study does not definitively prove the existence of either differential treatment or disparate impact discrimination, it clearly establishes the presumption that one or both exists. This presumption can be rebutted only with evidence that observed differences in loan approval between minorities and whites can be entirely explained by profit-based differences in the underwriting guidelines used by the lenders to which minorities and whites apply. To use the legal term, the Boston Fed Study makes a prima facie case that discrimination exists. If such a case were made in a courtroom setting, the burden of proof would shift to lenders. To escape the conclusion that they are discriminating, lenders would have to prove that their actions were based on "business necessity"; that is, that they used underwriting guidelines with a clear connection to the return on loans, that they applied these guidelines equally to all groups, and that no equally profitable guidelines without a disparate impact on minority applicants were available. In our view, no scholar has come close to showing that the observed inter-group differences in loan approval in Boston can be justified in business terms.

In fact, the available evidence suggests that business necessity is unlikely to explain a large share of the observed minority-white difference in loan denial. In particular, legitimate differences in underwriting guidelines must be associated with real differences in lenders' experiences and are therefore most likely to arise between lenders that specialize in groups of borrowers with different average creditworthiness. Thus, if differences across lenders in legitimate underwriting criteria have a major impact on the observed minority-white difference in loan denial, then allowing underwriting criteria to vary across lenders should dramatically lower the estimated minority-status coefficient. This turns out not to be the case. The Boston Fed Study rejects the hypothesis that the underwriting model is different for single-family houses, multi-family houses, or condominiums. Moreover, analysts have found little evidence that individual underwriting variables receive different weights for minority and white applicants. In addition, the estimated coefficient on minority status is virtually the same when separate regressions (and hence separate underwriting guidelines) are estimated for lenders that specialize in lending to minorities and for other lenders. Finally, the minority-status coefficient is literally unaffected if one excludes two minority lenders, who together account for half of the minority applications in the Boston Fed sample.

As explained earlier, the "meets guidelines" variable might be related to the issue of business necessity. Under the assumption that minority households do a poorer job than white households in selecting lenders that meet their credit needs, including the "meets guidelines" variable (and treating it as endogenous) can be interpreted as a way to account for legitimate differences in underwriting guidelines across the lenders visited by minorities and whites. In this case, we find that roughly 27 percent of the minority-white difference in loan denial is due to business necessity, not discrimination. However, this assumption is not consistent with the results summarized in the previous paragraph. If minority households simply do a poorer job finding just the right lender, then, contrary to this evidence, the minority-white difference in loan approval should be larger for lenders that specialize in lending to minorities. This is not the case.

The best way to determine whether the observed minority-white differences in loan denial rates are the result of underwriting practices justified by business necessity would be to replicate the Boston Fed Study in other cities, with the addition of loan performance data. This approach would make it possible to determine which observed application characteristics are accurate predictors of loan returns and therefore which underwriting guidelines are legitimate. Minority-white differences in loan denial that remain after accounting for legitimate underwriting guidelines are evidence of discrimination. Unfortunately, however, this approach would not be able to distinguish between differential treatment and disparate impact discrimination. A combination of application data (including credit history) and performance data should make it possible to identify legitimate underwriting guidelines, and even to determine if those guidelines vary by location or by some other variable. However, these data would contain only a few observations for each individual lender and therefore could not be used to identify each lender's actual underwriting guidelines. As a result, it would be impossible to determine whether remaining minority-white differences in loan denial for an individual lender are due to that lender's use of different guidelines for minorities and whites (differential treatment discrimination) or its use of illegitimate guidelines that place minority applicants at a disadvantage (disparate impact discrimination).

The best, and possibly the only, way to isolate differential treatment discrimination in loan approvals is with the paired testing methodology. Specifically, two applicants, one minority and one white, with the same credit histories and in need of the same type of loan would apply for a mortgage at the same lender. Differential treatment discrimination exists if minority applicants are systematically treated less favorably in a large sample of cases. Research of this type would shed no light on disparate impact discrimination because it would compare the treatment of identically qualified minority and white applicants at the same lender. Thus, observed differences in treatment could not be due to underwriting guidelines that illegitimately magnify differences in credit characteristics between minorities and whites, that is, to disparate impact discrimination.
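
One way to summarize paired-test results is sketched below. The choice of summary statistics (a net measure and an exact two-sided sign test on the pairs with unequal treatment) is an assumption made for illustration, not a method prescribed in this report, and the outcome counts are hypothetical.

```python
from math import comb

def sign_test_p_value(white_favored, minority_favored):
    """Exact two-sided sign test on pairs with unequal treatment.

    Under the null hypothesis of no systematic difference, each such
    pair is equally likely to favor either tester.
    """
    n = white_favored + minority_favored
    k = max(white_favored, minority_favored)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

# Hypothetical outcomes from 100 paired tests (invented numbers).
white_favored, minority_favored, treated_same = 30, 12, 58
total_tests = white_favored + minority_favored + treated_same

# Net measure: share of tests favoring the white tester minus the
# share favoring the minority tester.
net_measure = (white_favored - minority_favored) / total_tests
p_value = sign_test_p_value(white_favored, minority_favored)
print(net_measure, p_value)
```

A positive net measure with a small p-value would indicate that less favorable treatment of minority testers is systematic rather than the result of chance variation across tests.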

Unfortunately, a paired testing study of loan approval faces major challenges. Perhaps the most important is that it would be difficult, and might even be illegal, to assign false credit characteristics to testers as a means of ensuring that teammates have identical loan qualifications. This step would be difficult because it would require the cooperation of the firms that maintain the credit records used by lenders. It might be illegal because laws prohibit false statements on credit applications with intent to defraud. We do not believe that testing is a fraudulent activity, because testers would never actually close the loan transaction. But the courts have not yet ruled on this matter and any group that pushes paired testing into the loan-approval stage of the mortgage process might face high legal bills, if not worse. It might be possible to conduct tests using people's actual credit characteristics, but this approach would be administratively difficult because testers would still have to be matched to have the same credit qualifications. As a result, a very large pool of potential testers would be required.

Using Defaults to Measure Discrimination in Loan Approvals

Some researchers have used information on differential default rates as a strategy for determining whether or not discrimination occurs at the loan approval stage. This approach is premised on the argument that lenders who discriminate against minority applicants do so by effectively raising their underwriting standards--rejecting minorities who meet the standard required of whites, and only accepting minorities who meet a higher standard. If this is the case, minorities who receive loans will be less likely to default than whites. Therefore, if minority default rates are the same or higher than those of whites (other things being equal), lenders must not be discriminating.

As it turns out, this simple and intuitively appealing argument runs into severe methodological hurdles when used to measure discrimination in mortgage lending. The first problem is that equations estimating the impact of race or ethnicity on default are biased if key underwriting variables are omitted from the analysis. Specifically, the default approach yields results that are biased against finding discrimination unless it includes all variables that a) influence default, b) are observed by the lender at the time of loan approval, and c) are correlated with minority status. The second problem is that the default approach cannot detect discrimination unless some relevant underwriting variables are omitted from the estimates of differential default rates, because the analysis requires at least some variables that a) influence default, b) are observed by the lender at the time the loan is approved, and c) are not correlated with minority status. Even if all variables that influence default and are observed by the lender are available for analysis (solving problem number one), it is virtually certain that some of these will be correlated with minority status (exacerbating problem number two). In the more likely case that the researcher does not have all this information, there is no way to rule out the likelihood that some of the omitted variables are correlated with minority status, which means that the estimates of discrimination will be understated (problem number one). Even if these two fundamental sources of bias can be avoided, the analyst might still not be able to obtain unbiased estimates of discrimination using the default approach, because the characteristics of the borrower that are unobserved by the lender and by analysts (characteristics, as noted earlier, that on average can give lenders an incentive to practice economic discrimination) can also introduce bias--generally a downward bias.
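
The premise of the default approach, and the first problem described above, can be illustrated with a stylized calculation. Everything in this sketch is an assumption made for illustration: the 0 to 100 score scale, the default probability of (100 - score) / 100, the approval cutoffs, and the two score distributions. The score stands in for an underwriting variable that the lender observes but the analyst's data omit.

```python
def approved_default_rate(cutoff, weight):
    """Expected default rate among applicants approved at a score cutoff.

    weight(score) gives the relative number of applicants at each score.
    Default probability is assumed to be (100 - score) / 100.
    """
    approved = [s for s in range(101) if s >= cutoff]
    total = sum(weight(s) for s in approved)
    return sum(weight(s) * (100 - s) / 100 for s in approved) / total

def uniform(score):
    return 1

def skewed_low(score):
    # More applicants at lower scores (hypothetical distribution).
    return 101 - score

# A discriminating lender: cutoff of 50 for whites, 60 for minorities.
white_rate = approved_default_rate(50, uniform)

# Case A: both groups have the same score distribution.
minority_rate_same = approved_default_rate(60, uniform)

# Case B: the omitted score is correlated with minority status.
minority_rate_skewed = approved_default_rate(60, skewed_low)

print(white_rate, minority_rate_same, minority_rate_skewed)
```

In Case A the higher cutoff produces a lower default rate among approved minority borrowers, as the default approach assumes. In Case B the lender applies the same higher cutoff, yet approved minority borrowers default at a higher rate than whites, so an analyst without the score would wrongly conclude that no discrimination occurred.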

A new specification of the default approach asks a new question in an effort to overcome the problems just discussed. The new question is this: Is the minority-majority default difference greater in locations where the lending industry is more concentrated, a situation that presumably gives the lender more leeway to discriminate? However, this new specification does not save the default approach because it depends on two assumptions that are virtually incompatible with each other: a) that if lenders discriminate at all they do it more severely when market concentration is higher, but b) that lenders do not alter any other aspect of their underwriting procedures in the presence of more concentration. The study that uses this new approach explicitly violates the second assumption by showing, in another context, that lenders ration credit more aggressively in more concentrated markets. Thus, even this new approach cannot answer the question of how much discrimination exists in mortgage markets.


In addition to potential discrimination against minority borrowers, lenders may discriminate against minority neighborhoods (either through differential treatment or disparate impacts). Discrimination based on location is often referred to as redlining, because historically, some lending institutions were found to have maps with red lines delineating neighborhoods within which they would not do business. Redlining is typically measured in two ways. The first focuses on the case-by-case process of approving or denying loans. Redlining is said to occur when otherwise comparable loans are more likely to be denied for houses in minority neighborhoods than for houses in white neighborhoods, even though all credit-relevant characteristics of applicants, properties, and loans are the same. Studies of this kind face the same basic challenge as studies of discrimination against individual loan applicants, namely to find a data set with adequate information on loans and applicants, including applicant credit history. The only studies of redlining with such information turn out to be based on the Boston Fed Study's data. Two of these studies find no evidence of redlining but a third, which accounts for the relationship between redlining and private mortgage insurance, finds redlining against low-income neighborhoods, which in Boston are almost all largely black.

The second approach to the measurement of redlining focuses on aggregate lending outcomes. In this context, redlining is said to occur when minority neighborhoods receive a smaller volume of mortgage loan funds than white neighborhoods that are comparable in all relevant respects. This approach has received more empirical attention than the individual-level approach. Most studies focus on outcomes by census tract, while one attempts to isolate the role of lenders. Many studies in this literature find signs of redlining, but others do not, and no consensus has emerged on the extent of redlining or appropriate methods for measuring it.

Negotiating Loan Terms

At the loan approval stage, lenders not only decide whether to make a loan. They also set the terms of the loan, including the interest rate, loan fees, maturity, loan-to-value ratio, and loan type (conventional, adjustable rate, FHA, and so on). This is an important issue, because fair housing complaints often involve unfair terms and conditions for loans, and there is reason to believe that the lending industry may be in the process of shifting from "credit rationing"--where customers perceived to be high-risk are denied loans--toward "risk-based pricing"--where these same customers are simply charged a higher price for loans.

One early analytic study found discrimination against blacks and Hispanics in interest rates and loan fees, but not in loan maturities. Another also found discrimination against blacks in the setting of interest rates. Both used extensive statistical controls to isolate the effect of race and ethnicity from the effects of other factors. Two more recent studies reviewed for this report examine discrimination in overages, defined as the excess of the final contractual interest rate over the lender's official rate when it first commits to a loan. Both of these studies find cases in which the overages charged to black and Hispanic borrowers are higher than those charged to white customers by a small but statistically significant amount.

With respect to type of loan, several research studies have examined the probability that a borrower will receive an FHA loan instead of a conventional loan. Both borrowers and lenders have an interest in this choice. FHA guidelines are relatively flexible, and may qualify borrowers who do not meet conventional underwriting standards. This makes them attractive to both borrowers and lenders. But FHA loans may cost more than conventional loans, and may also permit higher fees to the lenders. It is clear that minority borrowers, in fact, rely more heavily on FHA loans than do white borrowers. What the analytical literature shows is that, controlling for borrower, property, and loan characteristics, minorities are still more likely than whites to receive FHA loans. One plausible explanation is that minorities are steered in the FHA direction because of discrimination in the market for conventional mortgages.


Content Archived: January 20, 2009

U.S. Department of Housing and Urban Development