What We Know About Mortgage Lending Discrimination In America
Stage 3: The Loan Approval or Disapproval Decision
The decision about whether to accept or reject a mortgage loan application
has been the subject of an impressive amount of sophisticated statistical
analysis. The primary information used in these studies is a repository of data
compiled as a consequence of the 1975 Home Mortgage Disclosure Act (HMDA). HMDA
mandated the annual reporting of information, by all mortgage lending
institutions with at least $10 million in assets, on the number and dollar
amount of both home mortgage and home improvement loans, by census tract or
county. Since passage of Section 1211 of the Financial Institutions Reform,
Recovery, and Enforcement Act of 1989, HMDA data have also included the race,
gender, and income of mortgage loan applicants.
HMDA data are routinely used to compare a lender's denial rates for
minority and white loan applicants, as a measure of the lender's performance with
regard to minorities. These analyses typically show that mortgage applications
by minorities are rejected at higher rates than those by whites, even controlling for
such factors as applicant income.
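The comparison described above can be sketched in a few lines. This is a minimal illustration only: the record layout, the group labels, and the counts are hypothetical and are not HMDA figures.

```python
from collections import defaultdict

def denial_rates(applications):
    """Return {group: share of applications denied} for (group, denied) records."""
    totals = defaultdict(int)
    denials = defaultdict(int)
    for group, denied in applications:
        totals[group] += 1
        if denied:
            denials[group] += 1
    return {g: denials[g] / totals[g] for g in totals}

# Hypothetical records, NOT real HMDA data: (applicant group, was the loan denied?)
sample = (
    [("minority", True)] * 30 + [("minority", False)] * 70
    + [("white", True)] * 15 + [("white", False)] * 85
)

rates = denial_rates(sample)
# Ratio of the minority denial rate to the white denial rate.
ratio = rates["minority"] / rates["white"]
print(rates, ratio)
```

A raw ratio of this kind controls for nothing. It describes a disparity in outcomes but cannot, by itself, say why the disparity arises.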
Although many useful studies are based on HMDA data, these data alone cannot
prove or disprove the existence of lending discrimination, because they do not
provide enough information to control for all relevant differences between white
and minority borrowers. Even though HMDA data now include borrowers' race,
ethnicity, and income, they do not include critical information on the wealth
and debt levels of loan applicants, their credit histories, the characteristics
of properties serving as collateral, the terms of loans for which applications
were submitted, or the underwriting criteria used to determine eligibility.
Herein lies a good part of the story behind the fierce analytical debate about
what can and cannot be said about discrimination in mortgage lending.
The seminal study in the debate over discrimination at the loan approval
stage was conducted by researchers at the Federal Reserve Bank of Boston; it was
initially released in 1992 and published in final form in 1996. Other recent
studies make valuable methodological and substantive contributions, but the lack
of a data set comparable to the one collected for the Boston Fed Study casts a
shadow over all this other research and makes the results difficult to
interpret. Only the body of work that includes the Boston Fed Study itself and
the contributions of its numerous critics makes it possible to derive a credible
estimate of discrimination in the loan approval process at the regional
level.
The Boston Fed Study began with the HMDA data, but assembled the additional
information needed to measure discrimination through a survey of Boston-area
lenders. This survey collected 38 additional variables for each application in
the Boston Fed sample, covering the whole array of information needed to control
for legitimate differences in applicant creditworthiness. That the Boston Fed
sponsored the study and gained the cooperation of area lenders could suggest
that the lending community did not expect the study to find statistically
compelling evidence of discrimination. But it did just that--finding that
minority status was indeed a statistically significant and fairly large
influence in lending decisions, even when a mass of detailed information
systematically related to the lending decision was controlled for in the
statistical analysis. The Boston Fed's basic model found that the
probability of loan denial in the Boston area was about 80 percent higher for a
black or Hispanic applicant than for a white applicant, after loan, property,
and applicant characteristics were all controlled for.
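In stylized form, a single-equation denial model of this kind might be written as follows. This is a sketch under stated assumptions: the logistic functional form and all symbols are illustrative notation introduced here, not taken from the study.

```latex
\Pr(D_i = 1 \mid X_i, M_i) = \Lambda\left(X_i'\beta + \gamma M_i\right),
\qquad
\Lambda(z) = \frac{1}{1 + e^{-z}}
```

Here $D_i = 1$ if application $i$ is denied, $X_i$ collects the loan, property, and applicant characteristics, and $M_i = 1$ for a black or Hispanic applicant. In this notation, a denial probability that is "about 80 percent higher" corresponds to a ratio $\Pr(D_i = 1 \mid X_i, M_i = 1) / \Pr(D_i = 1 \mid X_i, M_i = 0)$ of roughly 1.8 at the same $X_i$, and the debate summarized in this section concerns whether the estimate of $\gamma$ is biased.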
The findings of the Boston Fed Study had an explosive effect on the mortgage
lending discrimination debate, initially stimulating extensive soul searching by
the industry followed by a great deal of analytic scrutiny of both the
study's data quality and its methodological approaches. The findings have
emerged remarkably intact in the face of most of this scrutiny. But certain
complex analytical questions remain that some analysts conclude are enough to
undermine the credibility of the original findings. Specifically:
- Omitted Variables. Key variables that affect the lending
decision, and that are correlated with race or ethnicity, may have been
omitted from the Boston Fed analysis. If so, the estimated impact of
minority status on the approval decision is overstated, because it
partly reflects the impact of other, legitimate factors that vary with
race and ethnicity.
- Data Errors. Mistakes in data entry or data coding may have
distorted the Boston Fed analysis, possibly leading to over-estimates of
the importance of minority status. In addition, some loans in the Boston
Fed data set may have been incorrectly classified as approved or
disapproved.
- Incorrect Specification. When analysts "specify" a
predictive equation, they have to make assumptions about how different
factors interact to influence the approval or denial decision. If the
Boston Fed's equations were incorrectly specified, they might
again overstate the importance of minority status.
- Endogenous Explanatory Variables. Some of the variables in
the Boston Fed equations that are used to help explain or predict loan
approval may in fact be decided at the same time as--or in
conjunction with--the loan approval decision. If this is the case,
the independent effects of minority status on loan outcomes cannot be
disentangled without more complex models of the interactive
process.
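The omitted-variables critique can be made concrete with a toy example. The sketch below uses invented counts in which denial depends only on credit history, yet a comparison that leaves credit history out attributes a gap to minority status. It illustrates the mechanism the critics describe and nothing more; it says nothing about whether that mechanism operates in the Boston Fed data. All names and numbers are hypothetical.

```python
def denial_rate(records, **conditions):
    """Share denied among records matching all given field=value conditions."""
    matched = [r for r in records if all(r[k] == v for k, v in conditions.items())]
    return sum(r["denied"] for r in matched) / len(matched)

def make(minority, bad_credit, n, n_denied):
    """Build n hypothetical applications, n_denied of which are denied."""
    return [
        {"minority": minority, "bad_credit": bad_credit, "denied": i < n_denied}
        for i in range(n)
    ]

# Hypothetical data: denial depends ONLY on credit history (50% vs 10%),
# but bad credit is more common among minority applicants in this toy sample.
records = (
    make(True, True, 60, 30) + make(True, False, 40, 4)
    + make(False, True, 20, 10) + make(False, False, 80, 8)
)

# Omitting credit history: the raw gap looks like an effect of minority status.
raw_gap = denial_rate(records, minority=True) - denial_rate(records, minority=False)

# Controlling for credit history: the gap disappears within each stratum.
gap_bad = (denial_rate(records, minority=True, bad_credit=True)
           - denial_rate(records, minority=False, bad_credit=True))
gap_good = (denial_rate(records, minority=True, bad_credit=False)
            - denial_rate(records, minority=False, bad_credit=False))
print(raw_gap, gap_bad, gap_good)
```

In this constructed sample the raw gap is 16 percentage points while the gap within each credit-history group is zero, which is the pattern an omitted variable correlated with minority status would produce.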
Because of the importance of the Boston Fed Study, this project undertook a
comprehensive review and re-analysis to assess these critiques. In some cases,
re-analysis shows that the critics are simply wrong; the problem they identify
does not exist or the bias involved is empirically insignificant. In several
cases, however, we agree with the critics that a limitation in the Boston Fed
Study could potentially lead to a serious overstatement of discrimination, and
we have explored these cases in detail. Moreover, we find that the critical
literature has raised several important issues concerning the interpretation of
the Boston Fed Study's results. This analysis leads to the following major
conclusions:
- The large differences in loan denial rates between minority and
white applicants found by the Boston Fed cannot be explained away by
data errors, omitted variables, or inter-relationships between factors
that influence loan approval (endogeneity).
- The Boston Fed Study results do not definitively prove either the
presence or the absence of differential treatment discrimination in loan
approval, nor do they definitively prove either the presence or the
absence of disparate impact discrimination.
- The Boston Fed Study results provide such strong evidence of
differential denial rates (other things being equal) that they establish
a presumption that discrimination exists, and effectively shift the
"burden of proof" to those who would argue that these
differences are entirely due to legitimate underwriting criteria that
reflect an applicant's creditworthiness and therefore serve a
business necessity.
- The best way to determine whether the observed minority-white
differences in loan denials are the result of underwriting practices
justified by business necessity would be to replicate the Boston Fed
Study with the addition of loan performance data, while the best and
possibly only way to distinguish differential treatment discrimination
from disparate impact discrimination at the loan approval stage is to
conduct paired testing.
The remainder of this section explains these conclusions in
greater detail and then turns to other evidence regarding potential
discrimination at the loan approval stage.
What We Can Learn from the Boston Fed Study
Critics of the Boston Fed Study have argued that key variables that affect the
lending decision, and that are correlated with race, may have been omitted from
the analysis. If so, the estimated impact of race on the approval decision would
be overstated because it would incorporate the impacts of other, legitimate
factors that co-vary with race. Others present evidence of data errors in loan
terms, application characteristics, or loan outcomes in the Boston Fed data base
that they think lead to upwardly biased estimates of discrimination. Finally,
critics argue that single-equation models of loan denial are inevitably biased
because many loan terms--in particular the loan-to-value ratio, which
changes if the applicant changes the downpayment--are actually the result
of negotiation or participation in a special loan program. In other words, key
explanatory variables in the Boston Fed equations may in fact be
endogenous--decided at the same time as or in conjunction with the
outcome they are supposed to help explain or predict. If this is the case, the
independent effect of race may be incorrectly estimated.
We have examined all of these arguments and, where possible, tested them with
the public-use version of the Boston Fed Study's data. We conclude that the
large differences in loan denial rates between minority and white applicants
that are identified by the Boston Fed Study cannot be explained by data errors,
omitted variables, or the endogeneity of loan terms. No reasonable procedure for
solving any of these potential problems eliminates the large positive impact of
minority status on loan denial. In particular, some scholars have made the
reasonable argument that a variable indicating whether a loan application meets
the lenders' underwriting guidelines should be included in the Boston
Fed's loan denial equation to correct for aspects of applicants'
credit histories that are omitted from other explanatory variables. If this
variable does indeed capture such omitted elements, however, then the unobserved
factors influencing "meets guidelines" will be correlated with
unobserved factors influencing loan denial. We show that this is not the case.
It follows that the "meets guidelines" variable does not correct for
omitted variables. In addition, we find that accounting for the endogeneity of
various loan terms never results in a substantial reduction in the estimated
minority-status coefficient, and in several particularly plausible cases, this
step actually makes that coefficient larger.
Based on our review of the Boston Fed Study and its critics, we conclude that
no study has demonstrated either the presence or the absence of
differential treatment discrimination in loan approval, at least not in a large
sample of lenders. This conclusion puts the authors of this report at odds with
the authors of the Boston Fed Study, who claim that they measure differential
treatment discrimination, but also with several of their critics, who claim that
there is no discrimination at all. The Boston Fed Study results constitute
differential treatment discrimination only under the assumption that all lenders
use the same underwriting guidelines. With this assumption, any group-based
difference in treatment after controlling for underwriting variables implies
that the guidelines are applied differentially across groups, which is, by
definition, differential treatment discrimination. Because virtually all lenders
sell some of their loans in the secondary mortgage market, they have some
incentive to use the underwriting guidelines that institutions in that market,
such as Fannie Mae and Freddie Mac, have established. However, not all loans are
sold in the secondary market, and the lending process often involves many
individuals in the same lending institution, who may not always apply guidelines
uniformly. No evidence currently exists to determine the extent to which
underwriting standards vary across lenders.
Moreover, we find that no study has conclusively demonstrated either the
presence or the absence of disparate impact discrimination in loan
approval. The Boston Fed Study results measure disparate impact discrimination
only under the assumptions that 1) different lenders use different underwriting
guidelines, 2) existing guidelines are accurately linked to loan profitability,
on average, and 3) deviations from average guidelines cannot be justified on the
basis of business necessity. These assumptions could be satisfied, for example,
if underwriting guidelines vary across lenders solely for idiosyncratic reasons
or if some lenders purposefully develop guidelines that have a disparate impact
on minority applicants. However, no existing study sheds light on whether or not
these assumptions are met.
Although the Boston Fed Study does not definitively prove the existence of
either differential treatment or disparate impact discrimination, it clearly
establishes the presumption that one or both exists. This presumption can be
rebutted only with evidence that observed differences in loan approval between
minorities and whites can be entirely explained by profit-based differences in
the underwriting guidelines used by the lenders to which minorities and whites
apply. To use the legal term, the Boston Fed Study makes a prima facie
case that discrimination exists. If such a case were made in a courtroom
setting, the burden of proof would shift to lenders. To escape the conclusion
that they are discriminating, lenders would have to prove that their actions
were based on "business necessity"; that is, that they used
underwriting guidelines with a clear connection to the return on loans, that
they applied these guidelines equally to all groups, and that no equally
profitable guidelines without a disparate impact on minority applicants were
available. In our view, no scholar has come close to showing that the observed
inter-group differences in loan approval in Boston can be justified in business
terms.
In fact, the available evidence suggests that business necessity is unlikely
to explain a large share of the observed minority-white difference in loan
denial. In particular, legitimate differences in underwriting guidelines must be
associated with real differences in lenders' experiences and are therefore most
likely to arise between lenders that specialize in groups of borrowers with
different average creditworthiness. Thus, if differences across lenders in
legitimate underwriting criteria have a major impact on the observed
minority-white difference in loan denial, then allowing underwriting criteria to
vary across lenders should dramatically lower the estimated minority-status
coefficient. This turns out not to be the case. The Boston Fed Study rejects the
hypothesis that the underwriting model is different for single-family houses,
multi-family houses, or condominiums. Moreover, analysts have found little
evidence that individual underwriting variables receive different weights for
minority and white applicants. In addition, the estimated coefficient on
minority status is virtually the same when separate regressions (and hence
separate underwriting guidelines) are estimated for lenders that specialize in
lending to minorities and for other lenders. Finally, the minority-status
coefficient is literally unaffected if one excludes two minority lenders, who
together account for half of the minority applications in the Boston Fed
sample.
As explained earlier, the "meets guidelines" variable might be
related to the issue of business necessity. Under the assumption that minority
households do a poorer job than white households in selecting lenders that meet
their credit needs, including the "meets guidelines" variable (and
treating it as endogenous) can be interpreted as a way to account for legitimate
differences in underwriting guidelines across the lenders visited by minorities
and whites. In this case, we find that roughly 27 percent of the minority-white
difference in loan denial is due to business necessity, not discrimination.
However, this assumption is not consistent with the results summarized in the
previous paragraph. If minority households simply do a poorer job finding just
the right lender, then, contrary to this evidence, the minority-white difference
in loan approval should be larger for lenders that specialize in lending to
minorities. This is not the case.
The best way to determine whether the observed minority-white differences in
loan-denial rates are the result of underwriting practices justified by business
necessity would be to replicate the Boston Fed Study in other cities, with the
addition of loan performance data. This approach would make it possible to
determine which observed application characteristics are accurate predictors of
loan returns and therefore which underwriting guidelines are legitimate.
Minority-white differences in loan denial that remain after accounting for
legitimate underwriting guidelines are evidence of discrimination. Unfortunately,
however, this approach would not be able to distinguish between differential
treatment and disparate impact discrimination. A combination of application data
(including credit history) and performance data should make it possible to
identify legitimate underwriting guidelines, and even to determine if those
guidelines vary by location or by some other variable. However, these data would
contain only a few observations for each individual lender and therefore could
not be used to identify each lender's actual underwriting guidelines. As a
result, it would be impossible to determine whether remaining minority-white
differences in loan denial for an individual lender are due to that lender's use
of different guidelines for minorities and whites (differential treatment
discrimination) or its use of illegitimate guidelines that place minority
applicants at a disadvantage (disparate impact discrimination).
The best, and possibly the only, way to isolate differential treatment
discrimination in loan approvals is the paired testing methodology.
Specifically, two applicants, one minority and one white, with the same credit
histories and in need of the same type of loan would apply for a mortgage at the
same lender. Differential
treatment discrimination exists if minority applicants are systematically
treated less favorably in a large sample of cases. Research of this type would
shed no light on disparate impact discrimination because it would compare the
treatment of identically qualified minority and white applicants at the same
lender. Thus, observed differences in treatment could not be due to underwriting
guidelines that illegitimately magnify differences in credit characteristics
between minorities and whites, that is, to disparate impact discrimination.
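One way to judge whether minority testers are "systematically" treated less favorably is an exact sign test on the pairs in which the two testers were treated differently. The choice of test, the function name, and the counts below are assumptions made for illustration; the text does not prescribe a particular statistic.

```python
from math import comb

def sign_test_p_value(white_favored, minority_favored):
    """One-sided exact binomial p-value that the white tester is favored more
    often than chance, ignoring pairs where both testers were treated alike."""
    n = white_favored + minority_favored
    # P(X >= white_favored) when X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(white_favored, n + 1))
    return tail / 2 ** n

# Hypothetical test outcomes, not results from any actual study.
p = sign_test_p_value(white_favored=15, minority_favored=5)
print(p)
```

A small p-value indicates that the imbalance in favor of white testers would be unlikely if treatment were even-handed. Pairs with identical treatment carry no information about the direction of any difference and are set aside by this test.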
Unfortunately, a paired testing study of loan approval faces major
challenges. Perhaps the most important is that it would be difficult, and might
even be illegal, to assign false credit characteristics to testers as a means of
ensuring that teammates have identical loan qualifications. This step would be
difficult because it would require the cooperation of the firms that maintain
the credit records used by lenders. It might be illegal because laws prohibit
false statements on credit applications with intent to defraud. We do not
believe that testing is a fraudulent activity, because testers would never
actually close the loan transaction. But the courts have not yet ruled on this
matter and any group that pushes paired testing into the loan-approval stage of
the mortgage process might face high legal bills, if not worse. It might be
possible to conduct tests using people's actual credit characteristics,
but this approach would be administratively difficult because testers would
still have to be matched to have the same credit qualifications. As a result, a
very large pool of potential testers would be required.
Using Defaults to Measure Discrimination in Loan Approvals
Some researchers have used information on differential default rates as a
strategy for determining whether or not discrimination occurs at the loan
approval stage. This approach is premised on the argument that lenders who
discriminate against minority applicants do so by effectively raising
their underwriting standards--rejecting minorities who meet the standard
required of whites, and only accepting minorities who meet a higher standard. If
this is the case, minorities who receive loans will be less likely to default
than whites. Therefore, if minority default rates are the same as or higher than
those of whites (other things being equal), lenders must not be
discriminating.
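The premise of the default approach can be illustrated with a toy calculation. The sketch assumes a single hypothetical credit score, a default risk that falls linearly with that score, and identical score distributions in both groups; none of these assumptions comes from the studies discussed here.

```python
def approved_default_rate(scores, cutoff):
    """Average default probability among applicants at or above the cutoff.

    Assumes (hypothetically) that default risk falls linearly with the score.
    """
    approved = [s for s in scores if s >= cutoff]
    return sum(1 - s / 100 for s in approved) / len(approved)

# Both groups are given the SAME hypothetical score distribution (0..99).
scores = list(range(100))

white_rate = approved_default_rate(scores, cutoff=60)
# A discriminating lender holds minority applicants to a higher cutoff.
minority_rate = approved_default_rate(scores, cutoff=70)
print(white_rate, minority_rate)
```

Under these assumptions the approved minority borrowers default less often than the approved white borrowers, which is the pattern the default approach looks for. The result depends entirely on the two groups having the same distribution of every characteristic that affects default.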
As it turns out, this simple and intuitively appealing argument runs into
severe methodological hurdles when used to measure discrimination in mortgage
lending. The first problem is that equations estimating the impact of race or
ethnicity on default are biased if key underwriting variables are omitted from
the analysis. Specifically, the default approach yields results that are biased
against finding discrimination unless it includes all variables
that a) influence default, b) are observed by the lender at the time of loan
approval, and c) are correlated with minority status. The second problem is that
the default approach cannot detect discrimination unless some relevant
underwriting variables are omitted from the estimates of differential
default rates, because the analysis requires at least some variables that a)
influence default, b) are observed by the lender at the time the loan is
approved, and c) are not correlated with minority status. Even if all
variables that influence default and are observed by the lender are available
for analysis (solving problem number one), it is virtually certain that some of
these will be correlated with minority status (exacerbating problem number two).
In the more likely case that the researcher does not have all this information,
there is no way to rule out the likelihood that some of the omitted variables
are correlated with minority status, which means that the estimates of
discrimination will be understated (problem number one). Even if these two
fundamental sources of bias can be avoided, the analyst might still not be able
to obtain unbiased estimates of discrimination using the default approach,
because the characteristics of the borrower that are unobserved by the lender
and by analysts (characteristics, as noted earlier, that on average can give
lenders an incentive to practice economic discrimination) can also introduce
bias--generally a downward bias.
A new specification of the default approach asks a new question in an effort
to overcome the problems just discussed. The new question is this: Is the
minority-majority default difference greater in locations where the lending
industry is more concentrated, a situation that presumably gives the lender more
leeway to discriminate? However, this new specification does not save the
default approach because it depends on two assumptions that are virtually
implausible in combination: a) that if lenders discriminate at all they do it more severely
when market concentration is higher, but b) that lenders do not alter any other
aspect of their underwriting procedures in the presence of more concentration.
The study that uses this new approach explicitly violates the second assumption
by showing, in another context, that lenders ration credit more aggressively in
more concentrated markets. Thus, even this new approach cannot answer the
question of how much discrimination exists in mortgage markets.
Redlining
In addition to potential discrimination against minority borrowers,
lenders may discriminate against minority neighborhoods (either through
differential treatment or disparate impacts). Discrimination based on location
is often referred to as redlining, because historically, some lending
institutions were found to have maps with red lines delineating neighborhoods
within which they would not do business. Redlining is typically measured in two
ways. The first focuses on the case-by-case process of approving or denying
loans. Redlining is said to occur when otherwise comparable loans are more
likely to be denied for houses in minority neighborhoods than for houses in
white neighborhoods, even though all credit-relevant characteristics of
applicants, properties, and loans are the same. Studies of this kind face the
same basic challenge as studies of discrimination against individual loan
applicants, namely to find a data set with adequate information on loans and
applicants, including applicant credit history. The only studies of redlining
with such information turn out to be based on the Boston Fed Study's data.
Two of these studies find no evidence of redlining but a third, which accounts
for the relationship between redlining and private mortgage insurance, finds
redlining against low-income neighborhoods, which in Boston are almost all
largely black.
The second approach to the measurement of redlining focuses on aggregate
lending outcomes. In this context, redlining is said to occur when minority
neighborhoods receive a smaller volume of mortgage loan funds than white
neighborhoods that are comparable in all relevant respects. This approach has
received more empirical attention than the individual-level approach. Most
studies focus on outcomes by census tract, while one attempts to isolate the
role of lenders. Many studies in this literature find signs of redlining, but
others do not, and no consensus has emerged on the extent of redlining or
appropriate methods for measuring it.
Negotiating Loan Terms
At the loan approval stage, lenders not only decide whether to make a loan.
They also set the terms of the loan, including the interest rate, loan fees,
maturity, loan-to-value ratio, and loan type (conventional, adjustable rate,
FHA, and so on). This is an important issue, because fair housing complaints
often involve unfair terms and conditions for loans, and there is reason to
believe that the lending industry may be in the process of shifting from
"credit rationing"--where customers perceived to be high-risk are
denied loans--toward "risk-based pricing"--where these same
customers are simply charged a higher price for loans.
One early analytic study found discrimination against blacks and Hispanics in
interest rates and loan fees, but not in loan maturities. Another also found
discrimination against blacks in the setting of interest rates. Both used
extensive statistical controls to isolate the effect of race and ethnicity from
the effects of other factors. Two more recent studies reviewed for this report
examine discrimination in overages, defined as the excess of the final
contractual interest rate over the lender's official rate when it first
commits to a loan. Both of these studies find cases in which the overages
charged to black and Hispanic borrowers are higher than those charged white
customers by a small but statistically significant amount.
With respect to type of loan, several research studies have examined the
probability that a borrower will receive an FHA loan instead of a conventional
loan. Both borrowers and lenders have an interest in this choice. FHA guidelines
are relatively flexible, and may qualify borrowers who do not meet conventional
underwriting standards. This makes them attractive to both borrowers and
lenders. But FHA loans may cost more than conventional loans, and may also
permit higher fees to the lenders. It is clear that minority borrowers, in fact,
rely more heavily on FHA loans than do white borrowers. What the analytical
literature shows is that, controlling for borrower, property, and loan
characteristics, minorities are still more likely than whites to receive FHA
loans. One plausible explanation is that minorities are steered in the FHA
direction because of discrimination in the market for conventional
mortgages.
Content Archived: January 20, 2009