Emory Law Journal

Beyond Context: Social Facts as Case-Specific Evidence
Gregory Mitchell,
Laurens Walker,
John Monahan

* Gregory Mitchell is Mortimer M. Caplin Professor of Law & Class of 1948 Research Professor, University of Virginia. Please direct correspondence to Greg Mitchell at University of Virginia School of Law, 580 Massie Road, Charlottesville, VA 22903-1738, greg_mitchell@virginia.edu. We appreciate the helpful input of Craig Callen, George Cohen, Brandon Garrett, Gregory Mandel, and workshop participants at Michigan State University College of Law and Temple University Beasley School of Law, and we appreciate the research assistance of Myra Chapman, Katherine Foster, Elizabeth Kade, Leah McLaughlin, and Adam Pollet. Laurens Walker is T. Munford Boyd Professor of Law, University of Virginia. John Monahan is John S. Shannon Distinguished Professor of Law & Horace W. Goldsmith Research Professor of Law, University of Virginia.

Abstract

Experts often seek to apply social science to the facts of a particular case. Sometimes experts link social science findings to cases using only their expert judgment, and other times experts conduct case-specific research using social science principles and methods to produce case-specific evidence. This Article argues against expert judgment as the means of linking general social science to specific cases, and for the use of methodologically rigorous case-specific research to produce “social facts,” or case-specific evidence derived from social science principles. We explain the many ways that social fact studies can be conducted to yield reliable case-specific opinions, and we dispel the view that litigation poses insurmountable barriers to the conduct of case-specific empirical research. Social fact studies are feasible for both plaintiffs and defendants, and they provide much sounder conclusions about the relevance of social science to a litigated case than does linkage via expert judgment.

Introduction

Since Louis Brandeis first introduced it in court over a century ago, 1 See Muller v. Oregon, 208 U.S. 412, 419 & n.1 (1908) (“[T]he brief filed by Mr. Louis D. Brandeis, for the defendant . . . . [included] extracts from over ninety reports of committees, bureaus of statistics, commissioners of hygiene, inspectors of factories, both in this country and in Europe, to the effect that long hours of labor are dangerous for women, primarily because of their special physical organization.”). evidence drawn from social science research has played an important role in many forms of litigation. 2 See generally John Monahan & Laurens Walker, Social Science in Law (7th ed. 2010) (providing history and discussion of social science as used in a variety of legal contexts). Perhaps the most prominent use of social science evidence occurs in employment discrimination class actions, where plaintiffs often rely on the testimony of an expert to identify factors within the defendant organization that social science studies suggest could pose a common risk of harm to all class members. The prototypical example is found in Dukes v. Wal-Mart Stores, Inc., a nationwide gender discrimination class action involving hundreds of thousands of women. 3 603 F.3d 571 (9th Cir.) (en banc), cert. granted, 131 S. Ct. 795 (2010). The likely size of the class was a matter of some dispute between the majority and dissent in the en banc opinion issued by the Ninth Circuit. Id. at 578 n.3. For information about the proposed class and the relief sought, see Plaintiffs’ Third Amended Complaint at 21–25, Dukes v. Wal-Mart Stores, Inc., 222 F.R.D. 137 (N.D. Cal. 2004) (No. C-01-2252 MJJ), available at http://www.impactfund.org/documents/cat_95-100/Third_Amended_Complaint.pdf. Professor Nagareda describes Dukes as “the largest class action in history under Title VII of the Civil Rights Act of 1964.” Richard A. Nagareda, Class Certification in the Age of Aggregate Proof, 84 N.Y.U. L. Rev. 97, 102 (2009) (footnote omitted). A sociology expert reviewed the case record on Wal-Mart’s employment practices in light of “what social science research shows to be factors that create and sustain bias and those that minimize bias” 4Declaration of William T. Bielby, Ph.D. in Support of Plaintiffs’ Motion for Class Certification at 5, Dukes, 222 F.R.D. 137 (No. C-01-2252 MJJ), available at http://www.walmartclass.com/staticdata/reports/r3.html. and concluded that Wal-Mart’s practices “contribute to disparities between men and women in their compensation and career trajectories at the company.” 5Id. at 41. The district court relied on this expert’s opinions in granting the plaintiffs’ class certification motion. Dukes, 222 F.R.D. at 153–54, aff’d sub nom. Dukes v. Wal-Mart, Inc., 509 F.3d 1168 (9th Cir. 2007), aff’d in part on reh’g sub nom. Dukes v. Wal-Mart Stores, Inc., 603 F.3d 571 (9th Cir.) (en banc), cert. granted, 131 S. Ct. 795 (2010). The initial appellate panel also allowed the expert’s opinions as evidence of commonality. Dukes, 509 F.3d at 1179–80. The Ninth Circuit granted rehearing en banc, and a split court upheld the district court’s certification decision. Dukes, 603 F.3d at 628. We discuss in greater detail the opinions of the expert in Dukes—and our concerns about those opinions—in two recent publications. See John Monahan, Laurens Walker & Gregory Mitchell, Contextual Evidence of Gender Discrimination: The Ascendance of “Social Frameworks,” 94 Va. L. Rev. 1715, 1742–48 (2008) [hereinafter Monahan et al., Ascendance of Social Frameworks] (arguing that the expert, although claiming to present a social framework, testified about social facts specific to the defendant); John Monahan, Laurens Walker & Gregory Mitchell, The Limits of Social Framework Evidence, 8 Law, Probability & Risk 307, 311–19 (2009) [hereinafter Monahan et al., Limits of Social Frameworks] (noting a lack of objective measurements to ensure facts were true and representative of all of defendant’s locations).

The approach the expert used in Dukes has come to be known as “social framework analysis”—so called because an expert uses social science research as a framework for analyzing the facts of a particular case. 6Professors Susan Fiske and Eugene Borgida first used this phrase as an extension of Walker and Monahan’s “social frameworks” concept, which involves using general social science evidence to provide a frame of reference or background information to assist the factfinder deciding issues in a specific case. Compare Susan T. Fiske & Eugene Borgida, Social Framework Analysis as Expert Testimony in Sexual Harassment Suits, in Sexual Harassment in the Workplace: Proceedings of New York University 51st Annual Conference on Labor 575, 577 (Samuel Estreicher ed., 1999) (“A social framework analysis uses general conclusions from tested, reliable, peer-reviewed social science research and applies it to the case at hand.”), with Laurens Walker & John Monahan, Social Frameworks: A New Use of Social Science in Law, 73 Va. L. Rev. 559, 559 (1987) (“[G]eneral research results are used to construct a frame of reference or background context for deciding factual issues crucial to the resolution of a specific case. We call this . . . social frameworks.”). Whereas Walker and Monahan envisioned judicial instructions on reliable social science propositions that would provide jurors with new general information to help them make sense of the case-specific evidence, Fiske and Borgida described a “newer methodology of social framework analysis” in which “[c]onclusions aggregated from the research literature are applied to particular cases” by expert witnesses. Fiske & Borgida, supra, at 575–77. As we have discussed previously, in our view case-specific opinions based on Fiske and Borgida’s “social framework analysis” violate basic rules of expert evidence and scientific reliability and should not be confused with Walker and Monahan’s conception of social framework evidence. See Monahan et al., Ascendance of Social Frameworks, supra note 5, at 1746 n.84 (“Whereas Walker and Monahan expressly argued that any inferences to be drawn from the general research to the specific case should be the province of the fact-finder working within a court’s instructions, Fiske and Borgida expressly advocated that experts make such linkages for the fact-finder . . . .”); Monahan et al., Limits of Social Frameworks, supra note 5, at 311–19 (arguing that any opinion based on intuition, instead of reliable methods, falls short of the rigorous post-Daubert standard for expert testimony). For a recent opinion excluding testimony by Dr. Borgida based on “social framework analysis,” see EEOC v. Bloomberg L.P., No. 07 Civ. 8383(LAP), slip op. at 14–15 (S.D.N.Y. Aug. 31, 2010) (stating, among other reasons for excluding Dr. Borgida’s opinions, that “he relied on insufficient facts and data” and “the opinions in [his] report are supported by what appears to be a ‘because I said so’ explanation”). To avoid confusion, we place “social framework analysis” in quotation marks wherever the phrase is used to refer to experts using their personal judgment rather than scientific methods to link social science to specific cases. In this approach, the expert uses her judgment, rather than traditional empirical methods, to link social science propositions to a particular case. 7 See Monahan et al., Limits of Social Frameworks, supra note 5, at 315 (noting that experts using “social framework analysis” fill gaps with causal judgments that are not based on accepted methods of causal testing). For instance, the expert in Dukes conducted no observational, statistical, or experimental tests to determine whether any particular employment practice of Wal-Mart actually contributed to sex disparities in pay; he simply reviewed discovery materials and judged Wal-Mart’s practices to contain features that social science studies suggest can be associated with intergroup bias. 8 Id. at 314–17. This method has been developed by experts exclusively for use in litigation. As the expert for the Dukes plaintiffs testified in a subsequent case:

[S]ocial framework analysis is a legal term and not a scientific term. It’s a label that’s been applied to what social scientists do when they come into a litigation context. Issues of causality in the social sciences have a long and rich methodological tradition that has nothing to do with social framework analysis. 9Videotaped Deposition of William T. Bielby, Ph.D., Taken 01-15-08, at 105–06, EEOC v. Wal-Mart Stores, Inc., No. 6:01-CV-339-KKC, 2010 WL 583681 (E.D. Ky. Feb. 16, 2010), 2008 WL 6858762. Dr. Bielby’s opinions in this case were recently excluded, in part, because those opinions, which were based on “social framework analysis,” were not sufficiently connected to the case at hand. See Wal-Mart Stores, 2010 WL 583681, at *4 (“The burden . . . is on the plaintiff to prove that intentional discrimination occurred at this particular distribution center, not just that gender stereotyping or intentional discrimination is prevalent in the world. Dr. Bielby does not opine on whether intentional discrimination occurred at the distribution center.”).

In stark contrast to those experts who engage in conjectural “social framework analysis,” other experts do use traditional social scientific techniques to assess conditions directly relevant to the case at hand. For instance, a psychologist may conduct an experiment to determine whether a photo lineup suggested the defendant as the perpetrator, 10For a web-based demonstration of such a study, see Consultation and Expert Testimony, Eyewitness Identification Research Lab., http://eyewitness.utep.edu/consult01.html (last visited May 20, 2011). or an economist may estimate the impact of alleged monopolistic practices on consumer prices using econometric analyses of market data. 11 See, e.g., California v. Infineon Techs. AG, No. C 06-4333 PJH, 2008 WL 4155665, at *7 (N.D. Cal. Sept. 5, 2008) (“As is the norm in complex antitrust cases, the parties have weighed in on both sides of this question with reference to the testimony of supporting experts, who present conflicting econometric models in support of their contrasting conclusions.”). In these instances, the expert applies scientific principles and methods to case-specific data in the same way that the expert would use scientific principles and methods to analyze data outside the litigation context. When social scientific principles and methods are used to develop opinions about the parties, practices, or behaviors involved in a particular case, such evidence has been referred to as “social facts.” 12 See Laurens Walker & John Monahan, Social Facts: Scientific Methodology as Legal Precedent, 76 Calif. L. Rev. 877, 881–82 & n.26 (1988). 
Thus, when we speak here of social facts or social fact studies, we mean case-specific evidence produced through the application of reliable social science principles and methods to case-specific data, and when we speak of “social framework analysis,” we mean case-specific evidence produced through an expert’s application of social science findings to a particular case using expert judgment rather than traditional empirical methods.

This Article addresses the scientific and legal merits of using social science techniques to develop evidence specific to a particular case and argues for the use of methodologically rigorous social fact studies whenever experts seek to link social science principles to particular cases. 13It is important to note that our rejection of “social framework analysis” applies with equal force to all parties to litigation. Although experts for plaintiffs appear to use this approach more commonly in civil cases than defendants, defense experts have used this approach. See, e.g., Memorandum in Support of Admissibility of Expert Testimony Regarding the Lack of Race Based Preferential Treatment by Thompson or Skipper at 2, Rice v. Or. Dep’t of Corr., No. 04C-19412 (Or. Cir. Ct. Feb. 12, 2008), 2008 WL 3886606 (“Defendants anticipate the testimony of the following expert witness: Dr. M. Kahlil Zonoozy will testify as an expert about social framework evidence based upon his review of witness testimony. Dr. Zonoozy will opine that there were inaccurate perceptions at issue in the workplace at Santiam in 2003 and 2004. These inaccurate perceptions related to race. He will further opine that he saw no evidence of race-based preferential treatment by Superintendent Thompson of Security Manager Carter and Officer Skipper.”). We find this approach unacceptable regardless of the offering party. We explain the many ways that social fact studies can be conducted to yield reliable case-specific opinions, and we seek to dispel the view that litigation poses insurmountable barriers to the conduct of case-specific empirical research. We conclude that social fact studies are feasible for both plaintiffs and defendants—with or without special access to the parties involved in a case—and provide much sounder conclusions about the relevance of social science to a litigated case than does “social framework analysis.”

We proceed in three Parts. In Part I, we situate social facts in the context of other uses of social science evidence for legal purposes, and we describe a number of research methodologies that generate social facts. In Part II, we explain that social fact studies, when done properly, possess scientific reliability and “fit” the facts of a particular case, 14 See, e.g., Daubert v. Merrell Dow Pharm., Inc., 509 U.S. 579, 589–92 (1993) (noting that Federal Rule of Evidence 702 requires expert evidence to be both relevant and reliable). as required of any expert evidence under current Federal Rules. 15Kumho Tire Co. v. Carmichael, 526 U.S. 137, 147–48 (1999) (determining that Rule 702 applies to all expert testimony, not just that which is “scientific”). We pay particular attention to the evidentiary benefits of social facts under the fit requirement, and we discuss how to assess the fit of social science research offered in a particular case. In Part III, we examine key legal and ethical issues concerning access to the data needed to conduct case-specific studies and the protection of study participants. Although the benefits of social fact research are already appreciated in several domains, 16The benefits of social fact studies are perhaps best appreciated in the domain of trademark litigation. An American court first admitted a consumer confusion survey in a trademark dispute in 1940. See Oneida, Ltd. v. Nat’l Silver Co., 25 N.Y.S.2d 271, 286 (Sup. Ct. 1940) (explaining how females were asked to identify the maker of silverware to measure possible confusion between the two products). Parties to trademark disputes now routinely rely on survey evidence to show consumer confusion or lack thereof. See Neal Miller, Facts, Expert Facts, and Statistics: Descriptive and Experimental Research Methods in Litigation, 40 Rutgers L. Rev. 101, 137 (1987); Natalie-Claire Woods, Survey, Survey Evidence in Lanham Act Violations, 15 Trinity L. Rev. 67, 71 (2008) (“In fact, out-of-court consumer polling is perhaps the most well received method of introducing, either directly or as an expert witness opinion, evidence regarding the reactions of the public to the trademarks at issue. These surveys and polls are used to determine the aforementioned issues of confusion, secondary meaning, and suggestiveness or generic nature of a trademark. Results of the surveys are offered into evidence directly or as opinion of an expert witness.” (footnotes omitted)). We discuss a number of examples of social facts in trademark cases in Part II below. the feasibility and potential value of social fact research commend its use across many types of cases for many kinds of questions. Proper social fact studies should replace “social framework analysis.”

I. Social Facts Defined and Described

Social facts, as defined by Walker and Monahan, 17 See Walker & Monahan, supra note 12, at 881 & n.26 (contrasting their definition of “social fact” with that of Marvell, whose definition of the term was closer to Davis’s definition of “legislative fact” (quoting Kenneth Culp Davis, An Approach to Problems of Evidence in the Administrative Process, 55 Harv. L. Rev. 364, 423–25 (1942)) (internal quotation marks omitted)). are a type of “adjudicative fact[],” as defined by Kenneth Culp Davis almost seventy years ago. 18 See generally Davis, supra note 17, at 402 (“When an agency finds facts concerning immediate parties—what the parties did, what the circumstances were, what the background conditions were—the agency is performing an adjudicative function, and the facts may conveniently be called adjudicative facts.”). The Federal Rules of Evidence incorporated Davis’s definition of adjudicative facts into Rule 201’s provisions on the kinds of facts that may be judicially noticed, and the Advisory Committee’s Note to Rule 201 provides guidance on the meaning and scope of adjudicative facts—those “which relate to the parties”:

When a court or an agency finds facts concerning the immediate parties—who did what, where, when, how, and with what motive or intent—the court or agency is performing an adjudicative function, and the facts are conveniently called adjudicative facts. . . .

Stated in other terms, the adjudicative facts are those to which the law is applied in the process of adjudication. They are the facts that normally go to the jury in a jury case. They relate to the parties, their activities, their properties, their businesses. 19 Fed. R. Evid. 201 advisory committee’s note (alteration in original) (quoting 2 Kenneth Culp Davis, Administrative Law Treatise 353 (1958)).

Social facts, then, are a special type of adjudicative fact, produced by applying social science techniques to case-specific data in order to help prove some issue in the case. 20Walker & Monahan, supra note 12, at 881. One of the most common examples of a social fact is statistical evidence in a discrimination case used to prove that a protected category status was a reliable predictor of employment outcomes at a particular organization. 21 See id. at 880 (identifying the common use of social science research as a fact-finding tool in Title VII cases). In such cases, a statistician, economist, or other social scientist with statistical training analyzes case-specific applicant or employee data using reliable statistical techniques to assess the impact of various employee characteristics (e.g., race, education level, prior experience) on outcomes and to estimate the likelihood that any observed disparities associated with the protected category variable (race, in this example) would arise from chance after controlling for disparities associated with other variables (education and experience, in this example). 22Of course, there will be differences of opinion as to the best model to use to estimate these effects, but so long as a defensible, reliable approach is applied to adequate data, these differences of opinion will not necessarily render the statistical opinions inadmissible. See, e.g., Steven L. Willborn & Ramona L. Paetzold, Statistics Is a Plural Word, 122 Harv. L. Rev. F. 48, 56 (2008), http://www.harvardlawreview.org/media/pdf/willborn_paetzold.pdf (“All statistical methods involve a host of underlying assumptions. In an ideal world, whenever a particular method is used, all of its underlying assumptions would be perfectly and fully met, especially in situations involving issues as important as civil rights. But in practice, with real-world data, this simply does not happen. For this reason, among others, experts have to make choices. . . . What is important in both social science and litigation is that the expert reveal the choices that were made and the extent to which assumptions are met.” (footnote omitted)).
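To make the logic of such an analysis concrete, the following is a minimal sketch, offered only as an illustration and not as the method any expert in the cited cases used. It swaps in a stratified permutation test for the regression models experts more commonly fit: employees are compared only with others who share the same education and experience, and group labels are reshuffled within those strata to estimate how often a gap as large as the observed one would arise by chance. The field names (`group`, `education`, `experience`, `pay`), the function names, and the synthetic data are all hypothetical assumptions.

```python
import random
from collections import defaultdict


def stratified_gap(records):
    """Size-weighted average of within-stratum pay gaps (group A mean minus group B mean).

    A stratum is one (education, experience) combination; strata that lack
    either group are skipped because no like-for-like comparison exists.
    """
    strata = defaultdict(lambda: {"A": [], "B": []})
    for r in records:
        strata[(r["education"], r["experience"])][r["group"]].append(r["pay"])
    total, weight = 0.0, 0
    for cell in strata.values():
        if cell["A"] and cell["B"]:
            n = len(cell["A"]) + len(cell["B"])
            gap = sum(cell["A"]) / len(cell["A"]) - sum(cell["B"]) / len(cell["B"])
            total += gap * n
            weight += n
    return total / weight if weight else 0.0


def permutation_p_value(records, n_permutations=2000, seed=0):
    """Return (observed gap, two-sided permutation p-value).

    Group labels are shuffled only among employees in the same stratum, so the
    control variables stay fixed while the protected-category label is randomized.
    """
    rng = random.Random(seed)
    observed = stratified_gap(records)
    by_stratum = defaultdict(list)
    for i, r in enumerate(records):
        by_stratum[(r["education"], r["experience"])].append(i)
    extreme = 0
    for _ in range(n_permutations):
        shuffled = [dict(r) for r in records]
        for indices in by_stratum.values():
            labels = [records[i]["group"] for i in indices]
            rng.shuffle(labels)
            for i, label in zip(indices, labels):
                shuffled[i]["group"] = label
        if abs(stratified_gap(shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


def make_example_records(seed=1):
    """Synthetic (hypothetical) data: group B is paid about 3 units less in every stratum."""
    rng = random.Random(seed)
    records = []
    for education in ("HS", "BA"):
        for experience in ("0-5", "6+"):
            base = 30 + (10 if education == "BA" else 0) + (5 if experience == "6+" else 0)
            for group in ("A", "B"):
                for _ in range(15):
                    penalty = 3.0 if group == "B" else 0.0
                    records.append({
                        "group": group,
                        "education": education,
                        "experience": experience,
                        "pay": base - penalty + rng.gauss(0, 2),
                    })
    return records


if __name__ == "__main__":
    gap, p = permutation_p_value(make_example_records())
    print(f"adjusted gap: {gap:.2f}, permutation p-value: {p:.4f}")
```

As footnote 22 suggests, the choices embedded in such a sketch (which controls to use, how to form strata, which test statistic to report) are exactly the modeling decisions an expert would be expected to disclose and defend.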

A. Comparing Social Facts to Social Authority and Social Frameworks

Social fact evidence can be contrasted with “social authority” 23 See generally John Monahan & Laurens Walker, Social Authority: Obtaining, Evaluating, and Establishing Social Science in Law, 134 U. Pa. L. Rev. 477, 478 (1986) (proposing a shift to consideration of empirical data as “social authority”). and “social framework” 24 See generally Walker & Monahan, supra note 6, at 559 (describing the use of general research results to construct a frame of reference for analyzing factual issues). evidence, the other two broad categories of evidence derived from social science research. Social science research serves as social authority when general-causation principles (e.g., the impact of segregation on educational achievement) or descriptions drawn from general social science research (e.g., aggregated responses to attitude surveys) serve as the basis for law making. 25 See Monahan & Walker, supra note 23, at 499 (“Courts should place confidence in a piece of scientific research to the extent that the research . . . is generalizable to the case at issue . . . .”). Social authority is similar to Davis’s conception of “legislative facts.” See Davis, supra note 17, at 402. Monahan and Walker argued that social science used as the basis for legislative facts should be seen as a source of legal authority rather than as a source of facts and hence the label “social authority.” Monahan & Walker, supra note 23, at 488. Conceiving social science as legal authority rather than factual information affects how social science should be presented to courts or other law-making bodies, how social science should be evaluated by those bodies, and how resistant to change laws based on social science should be (i.e., the precedential value of social science and the conditions under which social authority should be altered). See id. at 495–516 (suggesting ways in which courts should treat social science evidence). 
For instance, congressional committees considered social science research on discrimination against older workers, and the enacted version of the Age Discrimination in Employment Act contained findings drawn from this research to show the need for greater protection of older workers. 26 29 U.S.C. § 621(a) (2006). Reconceiving social science used for general law-making purposes as social authority instead of legislative facts, as it is treated by Davis, 27 See Davis, supra note 17, at 402 (classifying evidence that informs legislative judgment as “legislative facts”). encourages courts to treat social science as an analogue to legal precedent that can be revisited as the social science evolves. 28For a discussion of the implications of viewing social science as social authority rather than as legislative facts, see John Monahan & Laurens Walker, Empirical Questions Without Empirical Answers, 1991 Wis. L. Rev. 569, 573–74.

Social science research serves as social framework evidence when general-causation principles or descriptive information drawn from general social science research provides a frame of reference or context that may help jurors understand the meaning of case-specific evidence admitted at trial. 29Walker & Monahan, supra note 6, at 559. For instance, research on the relationship between eyewitness confidence and accuracy may help a juror evaluate eyewitness testimony in a case. 30 See, e.g., State v. Chapple, 660 P.2d 1208, 1223–24 (Ariz. 1983) (holding that the exclusion of expert testimony on the factors that affect the reliability of eyewitness identifications was reversible error). Similarly, information about the frequency with which abuse victims continue to interact with their abusers may be used to counter possible misconceptions about the relationship between a defendant and the alleged victim. 31 See, e.g., People v. McGuiness, 665 N.Y.S.2d 752, 754 (App. Div. 1997) (finding no error in the admission of expert testimony “explaining behavior that would otherwise appear unusual to the average juror; for example, why a victim of sexual abuse might not immediately report such abuse, as is the case here, or why a child would continue contact and maintain a relationship with the abuser”). Social frameworks inhabit a middle ground between social authority and social facts: social frameworks involve general propositions drawn from social science research (as does social authority), but these general propositions are used by the factfinder to help resolve a dispute in a specific case (as are social facts). 32Monahan et al., Ascendance of Social Frameworks, supra note 5, at 1725–27.

Social facts thus differ from social authority and social frameworks in two significant and interrelated respects: (1) social facts involve case-specific descriptive or causal claims, whereas social authority and social frameworks involve general propositions about causation or about the prevalence of certain behaviors, characteristics, or outcomes in the aggregate; and (2) because social facts involve case-specific claims, social facts require the application of sound methods and principles to case-specific data to reach descriptive and causal conclusions about the case at hand. As alluded to in the Introduction, “social framework analysis” of the kind performed by the plaintiffs’ expert in Dukes conflates what we call social frameworks and social facts because the expert, basing his opinions only on general social science research, offers case-specific claims without conducting case-specific research using generally accepted social science methods. 33 See Dukes v. Wal-Mart Stores, Inc., 222 F.R.D. 137, 152 (N.D. Cal. 2004) (describing the expert’s conclusions about Wal-Mart’s practices following a review of the evidence and organizational research on the topic), aff’d sub nom. Dukes v. Wal-Mart, Inc., 509 F.3d 1168 (9th Cir. 2007), aff’d in part on reh’g sub nom. Dukes v. Wal-Mart Stores, Inc., 603 F.3d 571 (9th Cir.) (en banc), cert. granted, 131 S. Ct. 795 (2010). If general social science principles are to be linked to a specific case by a social science expert, then those linkage opinions need to be based on a social fact study and not on a naked claim of expert judgment that is the equivalent of ipse dixit. 34 See Monahan et al., Limits of Social Frameworks, supra note 5, at 317–18 (noting that a reliance on experience, alone, lacks the necessary scientific rigor for admission); David L. Faigman, Evidentiary Incommensurability: A Preliminary Exploration of the Problem of Reasoning from General Scientific Data to Individualized Legal Decision-Making, 75 Brook. L. Rev. 1115, 1135 (2010) (“Put another way, scientist-experts are limited to testifying about what their respective field’s research can validly add to fact-finders’ deliberations—and nothing more. This injunction, however, is not always followed. In particular, experts frequently seek to comment not simply on the import of general research findings, but on whether a particular case fits those findings. Scientific research that permits a valid description of a general phenomenon, however, does not invariably give experts the capacity to validly determine whether an individual case is an instance of that general phenomenon.”).

B. Types of Social Facts

A wide variety of social science methods can be used to produce social facts. What a party hopes to learn should drive the design of a social fact study because different research designs have different possibilities and limitations. In some cases, the goal will be description. In other cases, learning why some outcomes or behaviors occurred, and why others did not, may be the goal. In yet other cases, testing the parties’ case-specific hypotheses may be the goal.

1. Obtaining Descriptive Information

Typically, parties rely on their own discovery statements and testimony, the testimony of Federal Rule of Civil Procedure 30(b)(6) representatives, and the testimony of nonparty witnesses to provide descriptive information about a case. In many cases, a descriptive social fact study could provide more reliable information about facts relevant to a case. For instance, parties to a trademark dispute can present the testimony of a few consumers who were and were not confused about the source of a product, or they can present the results of a study that systematically assessed confusion among consumers. 35 See, e.g., Pharmacia Corp. v. Alcon Labs., Inc., 201 F. Supp. 2d 335, 368 (D.N.J. 2002) (looking to a survey of ophthalmologists and optometrists to determine that they could readily distinguish between the pharmaceuticals at issue). Courts have recognized the superiority of the latter evidence, so much so that a failure to introduce a consumer confusion survey can lead to an adverse inference against the claimant. 36 See id. at 373 (“The Court is aware that Pharmacia is not legally required to conduct a confusion survey. But under the circumstances of this case, Pharmacia’s failure to conduct any confusion survey weighs against its request for a preliminary injunction. Such a failure, particularly when the trademark owner is financially able, justifies an inference ‘that the plaintiff believes the results of the survey will be unfavorable.’” (quoting Charles Jacquin et Cie, Inc. v. Destileria Serralles, Inc., 921 F.2d 467, 475 (3d Cir. 1990))). Another example of a survey used for descriptive social fact purposes is found in high-profile cases where attorneys seek to buttress their arguments that unfavorable press justifies a change of venue with surveys designed to measure the negative impact of pretrial publicity on the attitudes and beliefs of potential jurors. 37 See Christina A. Studebaker et al., Assessing Pretrial Publicity Effects: Integrating Content Analytic Results, 24 Law & Hum. Behav. 317, 319–20 (2000) (“Public opinion surveying has been referred to as ‘the technique of choice for showing that a likelihood of prejudice exists’ because a large number of people in the relevant community can be contacted relatively quickly in order to assess the amount of knowledge people have about a case (presumably derived from pretrial publicity) and their opinions about the defendant.” (quoting Michael T. Nietzel & Ronald C. Dillehay, Psychologists as Consultants for Changes of Venue: The Use of Public Opinion Surveys, 7 Law & Hum. Behav. 309, 312 (1983))). Community surveys often figure prominently as well in obscenity cases, where they are used to gauge local views toward the materials in question. 38 See, e.g., State v. Williams, 598 N.E.2d 1250, 1257 (Ohio Ct. App. 1991) (“[A] properly conducted opinion poll may be relevant to a determination of whether the particular film in question is obscene. On the issue of relevance, the poll must be relevant to a determination of both community standards in general and the community’s acceptance of viewing the particular film in question.”). Other methods can also be used to count or summarize relevant data. For instance, descriptive summary statistics based on an analysis of employee records may be used to give substance to anecdotal evidence of discrimination in a disparate treatment case (along with statistics-based causal claims about the sources of any disparity). 39 See, e.g., Hazelwood Sch. Dist. v. United States, 433 U.S. 299, 307–08 (1977) (“Where gross statistical disparities can be shown, they alone may in a proper case constitute prima facie proof of a pattern or practice of discrimination.”).
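As a rough illustration of why a systematic survey conveys more than the testimony of a few consumers, the sketch below computes the share of respondents who gave a particular answer (for example, who misidentified the source of a product) together with a Wilson score interval expressing the sampling uncertainty around that share. The function name and the respondent counts are hypothetical, and a real confusion or venue survey would also need to address questionnaire design, control conditions, and the selection of the relevant population, none of which this sketch attempts.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical counts: 62 of 400 surveyed consumers named the wrong source.
    confused, surveyed = 62, 400
    low, high = wilson_interval(confused, surveyed)
    print(f"confusion rate {confused / surveyed:.1%}, interval {low:.1%} to {high:.1%}")
```

The interval narrows as the number of respondents grows, which is one way to see why a handful of individual witnesses cannot support the same descriptive inference as a properly sampled survey.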

Where the goal is descriptive, the particular research design will depend on the things to be counted or described. If the researcher seeks to tabulate or describe only historical facts, then a research design that ensures reliable collection and review of adequate case-specific historical data will be needed. 40The key is to develop a systematic protocol for categorizing the data based on the facts of interest using a consistent level of analysis. See generally Gary King et al., Designing Social Inquiry: Scientific Inference in Qualitative Research (1994) (discussing the importance of rigorous analysis of qualitative data). If the researcher seeks to describe ongoing behaviors, outcomes, beliefs, or other present facts, then historical data will need to be supplemented with new data, either via a survey or observational study. 41An example of an observational social fact study conducted for descriptive purposes is found in Sepulveda v. Wal-Mart Stores, Inc., a wage-and-hour case, where the defendant offered an observational study of the work of assistant managers as evidence of time spent in exempt versus nonexempt activities. 237 F.R.D. 229, 236 (C.D. Cal. 2006), aff’d in part, rev’d in part, 275 F. App’x 672 (9th Cir. 2008). In either case, it may be appropriate to sample the available evidence, rather than attempt to tabulate every fact, depending on the amount of data involved and the amount of data needed to reach reliable descriptive inferences. 42For a discussion of the uses of sampling techniques to gather evidence in complex cases, see Laurens Walker & John Monahan, Sampling Evidence at the Crossroads, 80 S. Cal. L. Rev. 969 (2007). For instance, state revenue agencies often use statistically reliable samples of sales or use records in connection with tax audits, and this evidence can play a key role in administrative or court proceedings. 43 See, e.g., Aerostructures Corp. v. Revenue Comm’r, No. 03-1412-III, 2004 WL 3528278, at *2 (Tenn. Ch. Ct. Nov. 
8, 2004) (“The Court finds from the proof that this was an appropriate case in which to perform a sample audit. The Court finds that the volume of the taxpayer’s records was too large to audit them. . . . [T]he Court finds that all the criteria the Department requires for a sample audit were present, and the sample audit was a reasonable method for the Department to use in this case to determine tax liability.”).
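The logic of a sample audit can be illustrated with a short sketch. The Python example below is a minimal illustration under stated assumptions, not a description of any revenue agency's procedure: it assumes a simple random sample drawn without replacement, a normal-approximation interval, and an invented ledger of 10,000 invoices. The function name and all figures are hypothetical.

```python
import math
import random

def estimate_total_from_sample(sample_values, population_size, z=1.96):
    """Estimate a population total, with an approximate 95% interval,
    from a simple random sample drawn without replacement."""
    n = len(sample_values)
    mean = sum(sample_values) / n
    # Sample variance (n - 1 in the denominator).
    variance = sum((v - mean) ** 2 for v in sample_values) / (n - 1)
    # Finite population correction: the sample is a real share of the records.
    fpc = (population_size - n) / population_size
    se_mean = math.sqrt(variance / n * fpc)
    total = population_size * mean
    margin = z * population_size * se_mean
    return total, total - margin, total + margin

# Hypothetical ledger: 10,000 invoices with made-up tax amounts.
rng = random.Random(0)
ledger = [round(rng.uniform(0, 50), 2) for _ in range(10_000)]

# Audit 400 randomly chosen invoices instead of all 10,000.
sample = rng.sample(ledger, 400)
estimate, low, high = estimate_total_from_sample(sample, len(ledger))
print(f"Estimated total: {estimate:,.2f} (interval {low:,.2f} to {high:,.2f})")
```

The width of the interval, not the point estimate alone, is what tells a court whether the sample was large enough to support a reliable descriptive inference.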

In all cases, getting the facts right is important, but doing so can be particularly difficult when the relevant facts are in the possession of large numbers of nonparties, as in trademark disputes, or where the data is voluminous, as in many class actions or business cases involving a tremendous number of transactions recorded electronically. In such cases, a social fact study may be necessary to ensure a reliable factual basis for the court’s legal conclusions or to help the court navigate through complex and contested evidence. 44Such a study could be performed by experts for the parties or by a court-appointed expert. See Manual for Complex Litigation (Fourth) § 11.51, at 112 (2004) (discussing the benefits of court-appointed experts). A court-appointed expert can assemble her own facts and is not limited to considering the evidence presented by the parties and their experts. See id. § 11.51, at 113.

2. Obtaining Explanatory Information

Where a party seeks explanatory information, or to gain a better understanding of the issues in a case, the nature of the outcomes or behaviors to be studied will determine the nature of the research design. If the outcome, event, or behavior to be explained is embedded in a rich social environment, then an interview, survey, or observational study may be appropriate for learning why people seem to behave in particular ways and whether others have proper expectations about behavior. For example, an observational study of work performance may be particularly useful to an expert’s understanding of the job relatedness of items on a selection test. 45 See, e.g., Wayne F. Cascio, Sex Discrimination in the Workplace: Lessons from Two High-Profile Cases, in Sex Discrimination in the Workplace 143, 145 (Faye J. Crosby et al. eds., 2007) (relating testimony based on observations of how firefighters performed rescues). Similarly, if the goal is to explore the impact of changes in conditions on behaviors or outcomes across a range of assumptions, then a computer simulation may be appropriate. 46Computer simulations may be particularly appropriate when the relevant characteristics of agents or organizational practices can be defined using clear parameters, and the effect of one characteristic on another can be mathematically modeled or reduced to symbolic logic. See generally Eliot R. Smith & Frederica R. Conrey, Agent-Based Modeling: A New Approach for Theory Building in Social Psychology, 11 Personality & Soc. Psychol. Rev. 87, 88 (2007). Thomas Schelling’s study of neighborhood segregation illustrates the effective use of agent-based modeling in assessing the impact of environmental conditions on individual behaviors. Id. at 89. Schelling’s model assumed that each individual would avoid neighborhoods in which he would have minority status. Id. 
Thus, where a party’s theory of the case can be converted into clear behavioral or organizational rules that should have certain impacts on some dependent measure, a computer simulation may be used to test the theory or, more likely, to estimate the range of effects that should be observed in the actual case. If the thing to be explained is a historical event unlikely to recur, then an interview or survey study may be appropriate as well as an experiment designed to simulate the event (to the extent possible).
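A simulation of this kind can be quite small. The following Python sketch is a simplified, hypothetical variant of a Schelling-style model, offered only to illustrate how a behavioral rule can be converted into a testable simulation: the grid size, vacancy rate, random-relocation rule, and function names are our assumptions and are not drawn from Schelling's study or from any litigated case. Each agent relocates when fewer than half of its neighbors share its group, and the simulation reports the average share of like neighbors before and after.

```python
import random

def neighbors(grid, r, c):
    """Return the group labels of occupied cells adjacent to (r, c)."""
    size = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < size and 0 <= cc < size and grid[rr][cc] is not None:
                found.append(grid[rr][cc])
    return found

def like_share(grid, r, c):
    """Share of an agent's neighbors that belong to the agent's own group."""
    near = neighbors(grid, r, c)
    if not near:
        return 1.0
    return sum(1 for g in near if g == grid[r][c]) / len(near)

def mean_like_share(grid):
    size = len(grid)
    shares = [like_share(grid, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] is not None]
    return sum(shares) / len(shares)

def simulate(size=20, vacancy=0.1, threshold=0.5, rounds=30, seed=1):
    """Run the model and return (mean like-share before, mean like-share after)."""
    rng = random.Random(seed)
    n_cells = size * size
    n_empty = int(n_cells * vacancy)
    n_agents = n_cells - n_empty
    cells = ["A"] * (n_agents // 2) + ["B"] * (n_agents - n_agents // 2)
    cells += [None] * n_empty
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_like_share(grid)
    for _ in range(rounds):
        moved = False
        for r in range(size):
            for c in range(size):
                if grid[r][c] is None or like_share(grid, r, c) >= threshold:
                    continue
                # A dissatisfied agent relocates to a randomly chosen vacancy.
                empties = [(i, j) for i in range(size) for j in range(size)
                           if grid[i][j] is None]
                i, j = rng.choice(empties)
                grid[i][j], grid[r][c] = grid[r][c], None
                moved = True
        if not moved:
            break
    return before, mean_like_share(grid)

before, after = simulate()
print(f"Mean share of like neighbors: {before:.2f} before, {after:.2f} after")
```

Rerunning the sketch across a range of thresholds and seeds is one way to estimate the range of effects a party's theory would predict, which can then be compared with what was actually observed.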

An experimental simulation will be particularly appropriate for simple historical events. A good example is found in the case of Gil v. Mazzuca, which involved the manslaughter prosecution of a man who threw a bucket of hardened plaster off an apartment building roof, killing a police officer on the street below. 47 91 F. Supp. 2d 586, 587–88 (S.D.N.Y. 2000). Mr. Gil claimed that he meant for the bucket to hit an empty sidewalk near the police officer and did not believe the bucket could reach him. 48 Id. at 588. To support his case, Gil offered the testimony of a clinical psychologist who examined Gil and found cognitive deficits that affected his judgment, a physicist who testified about the trajectory of a bucket under specified conditions, and an experimental psychologist who studied people’s beliefs about physics to testify about lay misconceptions. 49 Id. The experimental psychologist conducted an experiment in which individuals threw a bucket off a roof under physical conditions designed to simulate those in Gil’s case and found that the great majority of participants overshot their target, underestimating the distance the bucket would travel. 50Brief Amicus Curiae of the American Psychological Ass’n in Support of Appellant at 11–12, People v. Gil, 674 N.Y.S.2d 651 (App. Div. 1998) (No. 10639/93), available at http://www.apa.org/about/offices/ogc/amicus/people.pdf (“Dr. McCloskey had nineteen men do exactly what appellant did—throw a bucket of plaster from a building. The results were exactly what one would expect [in light of Dr. McCloskey’s prior intuitive physics research]: seventeen of 19 men threw the bucket of cement beyond the target—some as much as 25 feet beyond—despite the fact that almost all of the respondents thought they had hit the target or fallen short. (More than half thought they had fallen short.)”). 
The trial court admitted the testimony of the clinical psychologist and the physicist but excluded the testimony of the experimental psychologist on grounds that participants in the academic and social fact experiment were not acting under equivalent circumstances, namely, that none of the study participants were throwing the bucket under conditions of agitation or anger (i.e., the testimony was barred due to a lack of fit or external validity). 51 Gil, 91 F. Supp. 2d at 588–89 & n.1. While this ruling seems well within the judge’s discretion, 52 See id. at 591 (“Nowhere in all of petitioner’s 57-page brief in support of this writ, or in oral argument before this court, or in our own independent research, were we able to find any cases that stand for the proposition that the expert testimony offered here is constitutionally mandated. On the contrary, the trial judge’s evaluation that the proposed expert testimony differs enough from the facts of this case that it might confuse the jury is precisely the sort of discretionary judgment that courts are permitted to make.”). we could imagine another judge admitting the results of the study to support the defendant’s theory of factual mistake if there were evidence indicating that the intended landing area was a safe zone. 53The district court considering Gil’s habeas petition emphasized that even if Gil had intended to land the bucket on the sidewalk, such behavior was still likely reckless in light of the number of people in the area with immediate access to the sidewalk. See id. at 592 (“[W]e concur with the trial and appellate courts’ evaluation that the issue of whether petitioner intended the bucket to hit the sidewalk or the street does not assist the jury in determining whether such an act is reckless. 
Even assuming that petitioner did in fact make sure that no one was on the sidewalk when he ‘lobbed’ the bucket, any number of scenarios could have resulted in someone from petitioner’s building or the crowd of onlookers entering the zone of danger once the bucket had left petitioner’s hands.”). The larger point is that the case illustrates how a simple event can be simulated to support or counter what a party claims to have thought or understood about the event in question.

In some cases, a social fact study may help to determine what potential jurors believe about social science phenomena relevant to the case and whether social framework testimony on the phenomena might assist the jurors. For instance, to bolster the argument for admission of expert testimony on the fallibility of memory in I. Lewis (“Scooter”) Libby’s obstruction of justice trial, the defense submitted evidence from a survey of jury-eligible citizens in the District of Columbia on their beliefs about the workings of memory. 54United States v. Libby, 461 F. Supp. 2d 3, 10 (D.D.C. 2006). The court concluded that this survey and other evidence did not establish that jurors were so unaware of the foibles of memory that they would be aided by the expert testimony. 55 Id. at 18 (“Based on the foregoing, the Court cannot conclude that the defendant has satisfied his burden of establishing that the expert testimony of Dr. Bjork will be helpful to the jury. Not only are the studies offered by the defendant inapposite to the situation here, but the theories upon which Dr. Bjork would testify are not beyond the ken of the average juror.”). Had the survey been more closely tailored to the issues of the case and shown greater disparity between lay beliefs and social science findings, the testimony might have been admitted. 56 See id. at 16 (“[E]ven if this Court could accept the proposition that these research studies support the defendant’s proposition that jurors do not have an understanding of memory errors such as the errors that allegedly occurred in this case, which it cannot do, the Court declines to accept the findings of these studies for a more basic reason—the reliability of these studies as applied to this case is questionable.”).

Implementing social fact studies to gain an understanding of the causes and consequences of behaviors may be particularly helpful for remedial purposes, especially in structural reform litigation, where courts are often seen as insufficiently situated institutionally to gather the information needed to understand the nature of a problem and formulate effective policies. 57 See Michael C. Dorf, Legal Indeterminacy and Institutional Design, 78 N.Y.U. L. Rev. 875, 941–42 (2003) (“Even the most enthusiastic defenders of structural reform litigation recognize that courts are at best ‘sub optimal decision makers’ in [prison reform].”). Dorf notes that “problem-solving courts,” especially some drug courts, have recognized the benefits of systematic review of case outcomes by persons with expertise in program evaluation. See id. at 939 (noting the Center for Court Innovation’s efforts to systematize monitoring of treatment providers and the potential for such monitoring regimes to ratchet up performance benchmarks nationwide). An example of a study undertaken for litigation purposes that ultimately had far-reaching remedial impact was the observational study of traffic stops performed for the case of State v. Soto. 58 734 A.2d 350 (N.J. Super. Ct. Law Div. 1996). In Soto, a group of African-American plaintiffs alleged that the New Jersey State Police had engaged in discriminatory enforcement of traffic laws from 1988 to 1991 and sought suppression of evidence gathered during their traffic stops. 59 Id. at 352. To support their claim, the plaintiffs introduced a psychologist-designed survey of traffic stops along a segment of the New Jersey Turnpike involving three exits. 60 Id. The research team observed the race both of motorists stopped and of violators not stopped by police, and found that black motorists were stopped at a disproportionately high rate. 61 Id. at 352–54. 
The trial court found this evidence to be proof of a de facto policy of the police to target black motorists, 62 Id. at 360–61. but the study’s impact was felt far beyond the suppression of evidence in Soto—this case-specific study led to a systemic review of police practices in New Jersey. 63 See Commonwealth v. Lora, 886 N.E.2d 688, 701 n.29 (Mass. 2008) (“The Soto decision had far-ranging effects within New Jersey. It led to a review of the law enforcement practices of the New Jersey State police, which led the New Jersey Attorney General to conclude ‘that defendants perceived to be African-American, Black or Hispanic are entitled to discovery [regarding racial profiling] for motor vehicle stops that originated as a result of observations made by [New Jersey] State Troopers.’ In 2000, the Supreme Court of New Jersey issued an administrative order, at the request of the New Jersey Attorney General, assigning one judge to hear all motions for discovery relating to racial profiling by the New Jersey State police to ensure ‘centralized judicial management’ of the rapidly emerging issue.” (alterations in original) (citations omitted) (quoting State v. Lee, 920 A.2d 80, 82, 85 (N.J. 2007))). Accordingly, this study, which was designed for social fact purposes (i.e., to assess whether the New Jersey police engaged in the practice of racial profiling that was likely to have affected the plaintiffs adversely), wound up being used as social authority.

3. Testing Specific Hypotheses

Where a litigant desires to test a case-specific hypothesis, the research design should emphasize control over the independent variables (“a cause in a causal relationship” 64 Donald P. Schwab, Research Methods for Organizational Studies 303 (2d ed. 2005). ) and accurate measurement of the dependent variables (“the consequence in a causal relationship” 65 Id. at 301. ), so that valid conclusions about causation can be reached. 66A hypothesis need not be causal in nature—it may posit some other association among variables without specifying a causal relation. We focus more attention on causal hypotheses, however, as they are often the hypotheses of interest in cases. Many studies undertaken for testing purposes will lead to explanatory information, and many studies undertaken to better understand a phenomenon may involve testing of a case-specific hypothesis. But the latter need not involve testing of a hypothesis, causal or otherwise, for it may be designed to lead to alternative theories of a case or to understand how pieces of evidence fit together in the minds of the actors involved. The ideal means of testing a causal hypothesis is through use of an experiment in which participants are randomly assigned to different experimental conditions and their behaviors are recorded to assess how changes in experimental conditions affect the behavior in question. 67 See Stephen G. West, Alternatives to Randomized Experiments, 18 Current Directions Psychol. Sci. 299, 299 (2009) (“The randomized experiment (RE) enjoys a reputation as the gold standard of research designs. When this design can be properly implemented and its assumptions are met, it enables making strong, transparent inferences about causality that are unrivaled by those produced by other designs.”). Ideally, an experiment would also involve random selection of participants, but true random selection of experimental participants is rare. 
Geoffrey Keppel & Sheldon Zedeck, Data Analysis for Research Designs 16 (1989). Random assignment is typically an adequate control for differences that individual participants bring to the study that could affect how they react to the experimental tests if sample sizes are sufficient. See id. at 16–17 (“Random assignment is critical to the assumption that the groups formed prior to the introduction of the experimental treatments are equal . . . .”). Individual differences of participants may be an important influence on how individuals respond to various experimental conditions, but these differences should cancel one another out with random assignment and adequate sample size so that an influence of the independent variables on the dependent variable can be detected. See id. (“If subjects can be considered equal at the outset, then any differences that occur after introduction of a treatment can be attributed to the experimenter’s intervention.”). Nevertheless, more detailed testing may reveal that individual differences interact with the experimental variables to produce different patterns of results for different groups. See, e.g., Gregory Mitchell, Why Law and Economics’ Perfect Rationality Should Not Be Traded for Behavioral Law and Economics’ Equal Incompetence, 91 Geo. L.J. 67, 140–42 (2002) (discussing sex differences in risk taking for choices framed as gains versus losses). For a discussion of how to select sample size for an experiment, see generally John A. List et al., So You Want to Run an Experiment, Now What? Some Simple Rules of Thumb for Optimal Experimental Design 7–12 (Nat’l Bureau of Econ. Research, Working Paper No. 15701, 2010).

Perhaps the best example of a social fact experiment is a “tester study,” in which potential plaintiffs from a protected group and comparable others outside the group pose as job or loan applicants, renters, or home buyers to test whether members of the different groups are treated differently by employers, lenders, landlords, or sellers. 68 See Devah Pager, The Use of Field Experiments for Studies of Employment Discrimination: Contributions, Critiques, and Directions for the Future, 609 Annals Am. Acad. Pol. & Soc. Sci. 104, 105, 125 (2007) (discussing the use of field experiments to detect patterns of discrimination). Such studies are often called “audit studies” within academic research. Id. at 109. In correspondence tester studies, matched sets of application materials are compiled that differ only with respect to indicators of the sex or race of the applicant (the independent variable), employers are randomly selected and assigned to receive one set of application materials, and the number of interviews granted to the matched applicants (the dependent variable) is measured. 69 Id. at 109–10. With in-person tester studies, members of two races or opposite sexes, who are matched for appearance and qualifications and are trained to answer queries equivalently, apply in person for jobs, with employer treatment of each being recorded shortly after each interaction. 70 Id. at 111–12. Correspondence tests are cheaper and easier to implement than in-person tests, but it can be difficult to send clear cues about race in correspondence materials (clearer expectations about the sex associated with certain names make this less problematic for gender tester studies), and a limited range of jobs use electronic or mail solicitations for candidates. 71 Id. at 110–11. 
In-person tests avoid the problem of signaling demographic status because of the personal interactions involved, and in-person tests gather a greater range of dependent measures, but they are costlier and more difficult to implement than correspondence tests. 72 Id. at 112. Either type of study when done well, however, will provide reliable information about the likelihood of a particular defendant to discriminate on the basis of sex, race, or some other protected trait. 73 See id. at 109 (“In the case of employment discrimination, two main types of audit studies offer useful approaches: correspondence tests and in-person audits.”). Where a single potential defendant is the target of inquiry, multiple tester applications (in person or by mail) to different locations or managers or by a variety of matched testers will be necessary to establish a pattern. Id. at 125.
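The core of a correspondence tester study, random assignment followed by a comparison of interview rates, can be sketched in a few lines. The Python example below is a minimal illustration under stated assumptions: the employer identifiers and interview counts are invented, the function names are hypothetical, and the two-proportion z-test is one common way to compare rates rather than the method prescribed by any of the sources cited above.

```python
import math
import random

def assign_conditions(employers, rng):
    """Randomly split employers into two groups, one per set of application materials."""
    shuffled = employers[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def two_proportion_z(callbacks_a, n_a, callbacks_b, n_b):
    """Two-sided z-test for a difference between two interview (callback) rates."""
    p_a = callbacks_a / n_a
    p_b = callbacks_b / n_b
    pooled = (callbacks_a + callbacks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical sampling frame of 400 employers.
rng = random.Random(42)
employers = [f"employer_{i}" for i in range(400)]
group_a, group_b = assign_conditions(employers, rng)

# Invented results: 30 interviews for materials A, 15 for materials B.
z, p_value = two_proportion_z(30, len(group_a), 15, len(group_b))
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Because the materials differ only in the signal of group membership and employers are assigned at random, a difference in interview rates of this size is difficult to attribute to anything other than that signal.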

Another common social fact experiment occurs in trademark disputes where the researcher tests for the effects of source confusion on consumer purchases. In one case of alleged consumer confusion between two brands of non-cola soda, people entering a store were given a coupon for fifty cents off any brand of soda, with the exception of cola-flavored soda, to encourage them to purchase one of the products involved in the dispute. 74Squirt Co. v. Seven-Up Co., 207 U.S.P.Q. (BNA) 12, 24 (E.D. Mo. 1979). After checking out and paying for their purchases, consumers were asked by the researchers if they had used the coupon they had received, and each consumer who answered yes was then asked, “What brands of six-pack 12-ounce cans of non-alcoholic beverage other than cola flavor did you buy?” 75 Id. (internal quotation marks omitted). The researchers recorded the consumers’ answers and then compared them with the actual brands of soda in their shopping bags. 76 Id. The consumers were asked, “May I see the six-pack(s) of [each brand] you bought? I know it is an inconvenience to unpack your groceries so I will give you a $3.00 gift certificate good on any purchase of $3.00 or more at this store through July 30th if you will show me the six-pack(s) you bought.” Id. (internal quotation marks omitted).

Sometimes a social fact experiment will be designed simply to reject the null hypothesis of a chance relationship between two variables without directly testing for the cause of any nonrandom relationship, as with the experiment conducted in the case of James Newsome. 77Newsome v. McCabe, 319 F.3d 301, 306 (7th Cir. 2003) (permitting the introduction of a chi-square analysis to calculate the probability of three eyewitnesses independently committing chance error in a lineup identification). After being cleared of a murder conviction on actual innocence grounds fifteen years after being imprisoned, Mr. Newsome filed a civil rights action against the City of Chicago and its three police officers who allegedly induced three witnesses to falsely identify Newsome as the assailant in a murder case. 78 Id. at 302. To support his argument that the police did not use fair lineup procedures and encouraged witnesses to identify him falsely, Newsome retained Dr. Gary Wells, an expert on eyewitness identifications, to conduct an experiment on the likelihood that three individuals would select Newsome over the alleged true perpetrator, Dennis Emerson, from a pictorial lineup containing both men:

[Wells] showed two panels of subjects different pictures of Emerson for 15 seconds then, after some time had passed, showed them pictures of the men in the lineup and asked them to choose the one they had seen in the initial photograph. Of 500 members on the first panel, none selected Newsome’s photo; of 500 members on the second panel (which was shown a different photo of Emerson), 15 chose Newsome’s photo. Performing a chi-square test, Wells calculated that the probability of all three eyewitnesses independently picking Newsome out of a lineup by chance error was substantially less than one in 1,000, implying that the officers must have manipulated their identifications. 79 Id. at 305–06.

The appellate court upheld the district court’s admission of the results of Wells’s experiment, noting that the experiment yielded relevant information despite its failure to test for the causal influence of the police on identifications:

Chicago does not contend that there was a better way to find out whether [the witnesses] would have identified Newsome without the coaching. Instead it insists that Wells’ testimony was irrelevant because he did not determine how the witnesses had been induced to believe that they saw Newsome commit the murder. Yet testimony need not prove everything in order to be useful. As we have said, the jury had to consider the possibility that unhappy chance rather than malfeasance led to the mistaken conviction. Wells provided information valuable in this endeavor. 80 Id. at 306.
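The intuition behind this kind of result can be shown with simple arithmetic. The Python sketch below is not a reconstruction of Wells’s chi-square analysis; it is a simplified illustration that assumes the three witnesses chose independently and that each faced the chance error rate observed on the second panel (15 of 500). The function name is hypothetical.

```python
def chance_all_pick(selected, panel_size, witnesses):
    """Probability that every witness independently makes the same chance
    error, given the error rate observed on an experimental panel."""
    rate = selected / panel_size
    return rate ** witnesses

# Second panel: 15 of 500 subjects chose Newsome's photo by mistake.
p = chance_all_pick(15, 500, 3)
print(f"Probability of three independent chance errors: {p:.6f}")
print(f"Less than one in 1,000? {p < 1 / 1000}")
```

On these assumptions the probability is roughly 27 in a million, well below one in 1,000. The independence assumption carries most of the weight in the calculation.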

Several alternatives to experiments exist for testing case-specific hypotheses. We have already mentioned the most common means of testing hypotheses in employment cases: statistical analyses of applicant or employee records. 81 See supra notes 21–22 and accompanying text. Hypotheses can also be tested using observational studies 82 See West, supra note 67, at 302–03 (discussing the ways in which observational studies allow researchers to assess causal effect). or using qualitative data from case records by systematically coding the records (or a sample of the records) according to a predetermined protocol designed to identify and categorize or measure instances of the variables of interest. 83 See Laura A. King, Measures and Meanings: The Use of Qualitative Data in Social and Personality Psychology, in The SAGE Handbook of Methods in Social Psychology 173, 178–89 (Carol Sansone et al. eds., 2004) (discussing considerations of framing, designing, recruiting, coding, and interpreting in qualitative research).

The point that social fact studies can be performed on existing documentary evidence and thus do not require the collection of new data deserves emphasis. Unlike data in the physical sciences, much social science data begins as qualitative data that is converted to quantitative data or is otherwise systematized for descriptive or causal inference purposes. Records available in litigation may present a rich qualitative data source that can be systematically reviewed using reliable qualitative and quantitative social science techniques. 84For an extended discussion of the use of qualitative data for scientific inference purposes, see King et al., supra note 40. Thus, it is never the case that the litigation context necessitates reliance on “expert judgment” in the guise of “social framework analysis” as the means of linking social science to a particular case. If data cannot be analyzed reliably using qualitative or quantitative methods (due, say, to concerns about bias in the selection of case records), then no scientifically reliable inferences can be drawn from the data. “Social framework analysis” cannot render inadequate data adequate.

II. The Admissibility of Social Facts

Because social facts involve the application of social scientific techniques to a particular case, social facts are typically introduced through an expert witness who conducted or oversaw the social fact research. That means the expert’s opinions must satisfy the threshold requirements for admissibility, which in federal court are: (1) that the opinions help the factfinder understand other evidence in the case or determine a fact at issue; (2) that the opinions be based on sufficient data and reliable methods; and (3) that the opinions apply reliably to the facts of the case. 85 Fed. R. Evid. 702. Rule 702 was amended “in response to Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993), and to the many cases applying Daubert.” Id. advisory committee’s note. Social fact studies, if properly conceived to shed light on the issues in dispute in a particular case, should satisfy the helpfulness requirement because social fact studies generate useful information that the factfinder would otherwise lack. 86 See supra Part I.B (discussing how social fact studies provide descriptive information, explain the occurrence of behaviors, and test hypotheses). Social fact studies, if properly performed, should likewise satisfy the scientific reliability and fit requirements 87 See Daubert, 509 U.S. at 589 (“[A]ll scientific testimony or evidence . . . [must be] not only relevant, but reliable.”); id. at 591 (“Rule 702 further requires that the evidence or testimony ‘assist the trier of fact to understand the evidence or to determine a fact in issue.’ This condition goes primarily to relevance. . . . The consideration has been aptly described by Judge Becker as one of ‘fit.’ ‘Fit’ is not always obvious, and scientific validity for one purpose is not necessarily scientific validity for other, unrelated purposes.” (citation omitted) (quoting Fed. R. Evid. 702 and United States v. Downing, 753 F.2d 1224, 1242 (3d Cir. 1985))). 
Social-science-based evidence equivalent to what we call social facts was admitted prior to Daubert on grounds that such evidence was helpful to a jury and, sometimes, on the additional grounds that the expert used generally accepted principles or methods to reach her case-specific opinions. For example, before Daubert, statistical evidence was typically examined just for relevance and prejudice and was not barred by Frye’s general acceptance standard. See David H. Kaye et al., The New Wigmore: A Treatise on Evidence § 11.3.1, at 371 (2004) (discussing Frye v. United States, 293 F. 1013 (D.C. Cir. 1923)). And psychiatric assessments, if subjected to any special scrutiny at all, see id. § 7.8.2, at 270, usually survived scrutiny so long as generally accepted principles or procedures served as the basis for the opinion. See, e.g., United States v. Gould, 741 F.2d 45, 49–50 (4th Cir. 1984) (“[W]henever an insanity defense is sought to be raised by the proffer of evidence of a newly-identified mental ‘disease or defect,’ the proffered evidence is relevant for the purpose only if there is shown to be substantial acceptance within the relevant discipline of the general hypothesis that the disorder may deprive some persons of the substantial capacity either to appreciate the wrongfulness of the particular conduct in issue or to conform their conduct to the particular requirements of law in issue.”), superseded by statute, Insanity Defense Reform Act, Pub. L. No. 98-473, 98 Stat. 2057 (1984), as recognized in United States v. Worrell, 313 F.3d 867, 872 (4th Cir. 2002). After Daubert, the admission of properly performed social fact studies should not be in question on scientific reliability or fit grounds, even if the social fact study involves a novel application of social scientific principles or methods. Such an approach could well be barred under the Frye standard, however. See, e.g., Commonwealth v. Crews, 640 A.2d 395, 402 (Pa. 
1994) (“[I]t is the conclusions to be drawn from the statistical information accumulated to date regarding DNA matches that has not achieved widespread acceptance within the scientific community. . . . Therefore the trial court properly refused to entertain statistical information regarding the match. The expert, however, was permitted to testify that the match of three out of four loci made it more probable than not that the sperm was that of the defendant. This type of expert opinion testimony does not violate Frye, and thus was properly admitted.”). because social fact studies, by definition, involve the application of reliable social science principles and techniques to the facts of a particular case.

The reliability and fit requirements for expert opinions parallel scientific concerns with the internal and external validity of a study’s findings. 88 See, e.g., John Brewer & Albert Hunter, Multimethod Research 158 (Sage Library of Soc. Research 175, 1989) (“Research to generate and test causal hypotheses is usually judged in terms of two standards: internal and external validity.”). Examining research for internal validity involves asking “whether the methods and analyses employed were sound enough to justify the inferences drawn by the researcher.” 89 Monahan & Walker, supra note 2, at 68; accord Thomas D. Cook & Donald T. Campbell, Quasi-Experimentation: Design & Analysis Issues for Field Settings 37 (1979) (“Internal validity refers to the approximate validity with which we infer that a relationship between two variables is causal or that the absence of a relationship implies the absence of cause.”); Robert M. Lawless et al., Empirical Methods in Law 36 (2010) (“Internal validity refers specifically to the extent to which the research design allows the drawing of valid inferences about the relationships between variables.”). Examining research for external validity involves asking “whether the inferences drawn from the study can be applied to groups beyond those actually studied.” 90 Monahan & Walker, supra note 2, at 68–69; accord Cook & Campbell, supra note 89, at 37 (“External validity refers to the approximate validity with which we can infer that the presumed causal relationship can be generalized to and across alternate measures of the cause and effect and across different types of persons, settings, and times.”). As we discuss in the succeeding sections, although social fact studies raise special concerns about internal validity due to the possible contamination of results from the associated litigation, such studies will tend to have greater external validity for the case at hand than general social science studies.

A. Internal Validity

Different methods and datasets pose different internal validity concerns, but standard solutions exist for many of these problems. 91For example, although intentional and unintentional destruction of records is a concern for any social fact study, the problem of missing data is not new to social science research. See Daniel A. Newman, Missing Data Techniques and Low Response Rates: The Role of Systematic Nonresponse Parameters, in Statistical and Methodological Myths and Urban Legends: Doctrine, Verity and Fable in the Organizational and Social Sciences 7, 8 (Charles E. Lance & Robert J. Vandenberg eds., 2009) (noting the difficulties faced by a data analyst when sampled individuals do not respond to a survey or survey item). A variety of methods exist for dealing with this problem and estimating the impact of the missing data on study conclusions. See id. at 12 tbl.1.2 (identifying the three levels of missing data and the corresponding data-analytic methods for handling each). Rather than discuss threats to internal validity generally, 92For such a discussion, see Monahan & Walker, supra note 2, at 68–72. we focus here on two special concerns that may arise due to the litigation context in which social fact studies occur: biased expert witnesses and contamination from the associated litigation. We first explain why expert witness bias presents no greater concern for social fact testimony than for other types of case-specific testimony, and we then discuss a variety of ways of dealing with data contamination concerns. 93Our focus here is on issues that may affect the internal validity of a social fact. In Part III below, we deal with questions of legal access to the data needed for a social fact study and the ethics of conducting social fact studies.
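
By way of illustration only, one simple way to gauge how much missing records or nonresponses could matter is to compute worst-case bounds on an estimated proportion. The sketch below is not drawn from the sources cited above; the function name `proportion_bounds` and the counts are hypothetical and were invented solely for this example.

```python
def proportion_bounds(yes: int, no: int, missing: int) -> tuple[float, float]:
    """Worst-case bounds on the share of 'yes' answers when some answers are missing."""
    total = yes + no + missing
    if total == 0:
        raise ValueError("no sampled individuals")
    lower = yes / total              # assumes every missing answer would have been 'no'
    upper = (yes + missing) / total  # assumes every missing answer would have been 'yes'
    return lower, upper


# Hypothetical counts, invented purely for illustration.
low, high = proportion_bounds(yes=120, no=60, missing=20)
print(f"estimate from observed answers only: {120 / (120 + 60):.3f}")
print(f"worst-case range allowing for missing answers: {low:.3f} to {high:.3f}")
```

If a study's conclusion holds across the entire range, the missing data are unlikely to have driven it; if the conclusion changes within the range, more refined missing-data techniques would be needed.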

1. Bias

As with any form of expert testimony, 94 See, e.g., Samuel R. Gross, Expert Evidence, 1991 Wis. L. Rev. 1113, 1231 (“Expert testimony is a sizeable cottage industry that is geared entirely to provide effective partisan evidence.”). social fact studies present potential problems of expert bias and overclaiming. However, the influence of expert biases and pressure from lawyers to give favorable opinions should pose less of a problem for social fact testimony than other forms of social-science-based testimony that go beyond simply reporting on research performed for purely academic purposes 95Of course, many researchers expect their academic work to have applied purposes and undertake academic research to try to influence public policy debates. Thus, the academic-purpose/litigation-purpose distinction may adhere to an idealistic view of pure scientific research that is hard to find in practice. See Daubert v. Merrell Dow Pharm., Inc., 43 F.3d 1311, 1317 (9th Cir. 1995) (“One very significant fact to be considered is whether the experts are proposing to testify about matters growing naturally and directly out of research they have conducted independent of the litigation, or whether they have developed their opinions expressly for purposes of testifying. That an expert testifies for money does not necessarily cast doubt on the reliability of his testimony, as few experts appear in court merely as an eleemosynary gesture. But in determining whether proposed expert testimony amounts to good science, we may not ignore the fact that a scientist’s normal workplace is the lab or the field, not the courtroom or the lawyer’s office. That an expert testifies based on research he has conducted independent of the litigation provides important, objective proof that the research comports with the dictates of good science.” (footnote omitted)). 
because an essential requirement for internal validity is that the researcher document the steps taken with sufficient detail for another researcher to replicate her results. 96 See Gregory Mitchell, Empirical Legal Scholarship as Scientific Dialogue, 83 N.C. L. Rev. 167, 176 (2004) (noting the ability of intersubjective testing and review to increase objectivity of empirical research). Indeed, one of the strongest points in favor of social facts, compared to “social framework analysis,” is transparency in the process whereby social facts were formed. If an expert cannot present the data and methods that led to social fact opinions in a particular instance for another social scientist to review, then the social fact fails to meet reliability requirements. 97 See Monahan et al., Limits of Social Frameworks, supra note 5, at 315–17 (discussing the scientific requirement that the bases for a researcher’s inferences be made public).

2. Contamination

A more serious threat to the internal validity of a social fact study is that ongoing or threatened litigation may somehow alter the behavior of those being observed. For instance, in EEOC v. Dial Corp., a researcher retained by the EEOC administered a questionnaire to assess whether a hostile work environment existed within the defendant corporation. 98No. CIV.A. 99-C-3356, 2002 WL 31061088, at *1–3 (N.D. Ill. Sept. 17, 2002). Potential respondents included a number of plaintiff class members, and respondents were notified of the study’s purpose but were told that their responses would be confidential. 99 Id. at *4–5. The defendant moved to exclude expert testimony based on the questionnaire, and the court ruled that, among other problems with the study, apparent bias in responses made the questionnaire results unreliable. 100 Id. at *9 (“[T]he inclusion of a large number of class members in the survey appears to have strongly influenced the overall results, which further supports the defendant’s position that the survey data do not reliably reflect the views or experiences of the overall population of relevant employees.”).

Where there are valid concerns that participant confidentiality or anonymity will not remove the threat of litigation contamination, alternative approaches should be taken to minimize this threat. One alternative is to employ nonreactive methods (e.g., analysis of archival records, computer simulations, or truly unobtrusive observational studies where the fact of observation is not apparent). 101 See, e.g., Jerry M. Newman, Discrimination in Recruitment: An Empirical Analysis, 32 Indus. & Lab. Rel. Rev. 15, 17 (1978) (distributing fictitious résumés to employers unaware of the experiment to assess racial dimensions of their employment decisions). Another is to employ a design that studies third parties rather than the parties to the lawsuit.

Yet another option, where the parties or their agents need to serve as participants, is to conduct the study in such a way as to conceal the fact of the study or at least the study’s purpose, as is the case with tester studies, and if possible, to use persons who are blind to the study’s purpose to administer the study. 102 See, e.g., Vita-Mix Corp. v. Basic Holding, Inc., 581 F.3d 1317, 1325 (Fed. Cir. 2009) (double-blind study of blender users with respect to their manner of use of stir stick in patent infringement case); Marlo v. UPS, Inc., No. CV 03-04336DDP(RZX), 2005 WL 6197774, at *10 (C.D. Cal. Mar. 1, 2005) (double-blind survey of employees regarding their duties in wage-and-hour case). A number of unobtrusive methods other than the tester study exist for measuring even such socially charged topics as racism and discrimination. For instance, e-mail experiments can be conducted where the apparent race, ethnicity, or gender of the correspondent is systematically varied and responses to requests within the e-mails are measured. 103 See, e.g., Brad J. Bushman & Angelica M. Bonacci, You’ve Got Mail: Using E-mail to Examine the Effect of Prejudiced Attitudes on Discrimination Against Arabs, 40 J. Experimental Soc. Psychol. 753, 758 (2004) (“The current study used a novel procedure, the lost e-mail technique, to demonstrate that prejudiced individuals discriminate against Arabs when they can remain anonymous.”). Or, an experiment can be embedded in an observational study, where the race or sex of an interacting partner is systematically varied and the interactions are recorded unobtrusively to test for disparate treatment. 104 See, e.g., Samuel L. Gaertner et al., Race of Victim, Nonresponsive Bystanders, and Helping Behavior, 117 J. Soc. Psychol. 69, 73–77 (1982) (assessing how a victim’s race and the presence of bystanders impacted not just the decision to help, but also latency and heart rate measures).

A fourth alternative is to conduct a study with similarly situated persons who are not involved in the lawsuit. This approach was employed in Whiteway v. FedEx Kinko’s Office and Print Services, Inc., a wage-and-hour class action covering center managers employed in California. 105No. C 05-2320 SBA, 2007 WL 2408872, *1–2 (N.D. Cal. Aug. 21, 2007), rev’d, 319 F. App’x 688 (9th Cir. 2009). Because agents of the defendant were not supposed to have contact with class members, an expert for the defendant conducted a study of the exempt and nonexempt duties performed by a sample of branch managers in other western states. 106 Id. at *8. The plaintiff challenged the study on grounds that it did not examine the activities of the actual class members (which is an external validity challenge on the basis of participants’ characteristics, a topic we address below), but the court rejected this challenge: “FedEx argues, and Whiteway does not effectively rebut, that there is no operational/functional difference between the centers in California and the centers in other western states surveyed.” Id. (citation omitted). This approach should be possible in any large organization where similarly situated teams, units, or branches can be observed or assigned to different conditions of a study. 107 See, e.g., id. at *9 (“[T]here remains no evidence[] that . . . the job duties/responsibilities of any Center Manager . . . are any different than another.”).

Yet another possible check on demand effects arising from participant knowledge of the lawsuit would be the inclusion within each study condition of a subgroup of participants who are expressly told that the study is connected to the lawsuit. This would allow researchers to examine whether such knowledge leads to differences in behavior within any of the conditions. Whether the added cost and complexity of such an added condition are warranted will depend on the nature of the study and the likelihood of response bias in the event the study’s connection to the lawsuit is suspected. To prevent interference by a party or employees of a party, efforts should be made to limit knowledge of the study or its purposes to a small control group. In cases of severe concern, checks for party interference could be built into the study (e.g., use of confederates within the participant pool or misinformation to the company about what effects would be expected to vindicate a party’s theory of the case to check for potential managerial tampering). 108 See, e.g., Cara Laney et al., The Red Herring Technique: A Methodological Response to the Problem of Demand Characteristics, 72 Psychol. Res. 362, 364 (2008) (“The Red Herring technique allows naturally curious subjects to ‘figure out’ what the study is about without actually figuring out what the study is about (and thus becoming subject to demand). . . . [I]t is applicable to a wide range of studies in psychology, especially those involving deception.”); id. (“The Red Herring is an extra layer of information (separate from both what subjects were told and what we were actually studying), intended to provide a plausible explanation for the tasks subjects are asked to complete. That is, we are doubly deceiving subjects.”).
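
The subgroup check described above can be sketched in a few lines. The example is only a minimal illustration under stated assumptions: the condition labels, the outcome scores, and the function name `knowledge_effect` are hypothetical, and a real study would pair the comparison with an appropriate significance test.

```python
from statistics import mean


def knowledge_effect(responses):
    """For each condition, mean outcome of informed minus uninformed participants."""
    effects = {}
    for condition, groups in responses.items():
        effects[condition] = mean(groups["informed"]) - mean(groups["uninformed"])
    return effects


# Hypothetical outcome scores, invented purely for illustration.
# "informed" participants were told that the study is connected to the lawsuit.
data = {
    "condition_a": {"informed": [4, 5, 5, 4], "uninformed": [4, 5, 4, 5]},
    "condition_b": {"informed": [2, 1, 2, 1], "uninformed": [3, 4, 3, 4]},
}

for condition, gap in knowledge_effect(data).items():
    print(condition, gap)
```

A gap near zero in a condition suggests that knowledge of the lawsuit did not alter behavior there, whereas a sizable gap would flag possible demand effects in that condition.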

Given these solutions, social fact studies should not be dismissed out of hand on grounds of contamination from the associated litigation, as some have done. 109Tuli v. Brigham & Women’s Hosp., Inc., 592 F. Supp. 2d 208, 214 (D. Mass. 2009) (“While one could attempt to perform [case-specific] tests, their scientific integrity would be fatally compromised when conducted within the context of a lawsuit against those individuals or the corporation that employs them. Social scientific research into the basic principles of sex stereotyping normally involves voluntary participants who are assured (and can rely on these assurances) of the complete anonymity and confidentiality of their responses. It is unlikely that researchers could obtain candid and uncensored self-reports of attitudes from employees who are aware that the research is related to a pending lawsuit against the organization that employs them. Thus, concerns about scientific validity . . . do not recommend mounting an organizational investigation using standard social science techniques.” (quoting expert report of Dr. Peter Glick)); Jennifer S. Hunt et al., Scientific Status, in 2 Modern Scientific Evidence: The Law and Science of Expert Testimony § 18:14 (David L. Faigman et al. eds., 2008) (“In addition, the argument that contract research is the appropriate method of determining whether gender stereotyping occurred ignores several important limitations of conducting such research. One serious problem with the contract approach is that there is no way of determining the extent to which employees’ responses will be tainted by knowledge of the litigation underway, the sponsors of the survey, or the potential ramifications of their responses. It is likely that employees will try to give unbiased responses, even if they are not accurate. 
Moreover, research indicates that gender stereotyping often occurs outside of conscious awareness, so even if employees are completely honest, their responses may not reveal the actual occurrence of gender stereotyping.” (footnote omitted)). The potential influence of the researcher and research setting on results is a pervasive problem in the behavioral sciences. 110 See, e.g., Austin Lee Nichols & Jon K. Maner, The Good-Subject Effect: Investigating Participant Demand Characteristics, 135 J. Gen. Psychol. 151, 151 (2008) (“Researchers are often concerned with the presence of demand characteristics, cues that make participants aware of what the experimenter expects to find or how participants are expected to behave, and the researchers typically use methods for reducing the demand.”). Thus, the question is not whether an expert who seeks to apply social science to a particular case should rely on general social science studies possessing no potential demand characteristics or social fact studies possessing potential demand characteristics. Rather, the question is whether adequate protections were taken against demand effects in whatever research forms the basis for the opinions. We see no reason to expect demand effects to threaten findings from social fact studies any more than many other academic studies, where the participants often know they are being studied and often know the general purpose of the study before any behaviors are measured. Where such threats exist, the researcher should take precautions to minimize the risks of contamination whatever the research setting.

B. External Validity

While internal validity threats may be greater for some social fact studies due to the litigation connection, concerns about external validity threats will actually be lower precisely because the study is tailored to address questions specific to the litigation. Where the parties themselves are studied or case-specific data is analyzed using social science tools, the superior fit of social facts to findings from generic social science studies is obvious. But as we have discussed, not every social fact study will be a direct test of case-specific hypotheses using case-specific data. 111 See supra Part I.B.1–2. In some cases, the social fact study may simulate the conditions involved in the litigation or seek to gather relevant information from third parties via surveys or experiments. 112 See supra Part I.B.1–2. In these cases, degree of fit can be assessed along five dimensions that social scientists commonly consider the determinants of external validity: (1) the independent variables studied; (2) the dependent variables measured; (3) the persons studied; (4) the settings studied; and (5) the time frame studied. 113 See, e.g., Lawless et al., supra note 89, at 410 (defining external validity as the extent to which findings can be generalized to different people, settings, times, and measures); Marilynn B. Brewer, Research Design and Issues of Validity, in Handbook of Research Methods in Social and Personality Psychology 3, 4 (Harry T. Reis & Charles M. Judd eds., 2000) (“External validity . . . refer[s] to the generalizability of the causal finding, that is, whether it can be concluded that the same cause–effect relationship would be obtained across different subjects, settings, and methods.”); T. D. Cook, Generalization: Conceptions in the Social Sciences, in 9 International Encyclopedia of the Social & Behavioral Sciences 6037, 6037 (Neil J. Smelser & Paul B. Baltes et al. 
eds., 2001) (“[S]ocial scientists typically draw conclusions about four not completely independent entities—human populations, physical settings, causes, and observables . . . . Time, understood as historical period, might also be added.”). If a change along one of these dimensions would lead to a change in the behaviors or causal relations observed before the change, then the study’s findings will not be externally valid for the changed domain. 114 See Cook, supra note 113, at 6037 (noting the threats to external validity posed by extrapolating findings to different age groups, geographic areas, and institutional settings).

A researcher designing a social fact study should be conscious of differences between the study and the case at hand lest the study’s results be deemed inadmissible on lack-of-fit grounds. 115 See, e.g., Cnty. of Kenosha v. C & S Mgmt., Inc., 588 N.W.2d 236, 253–54 (Wis. 1999) (excluding survey of community views in an obscenity trial where “the innocuous description of the types of activities the survey respondent was to consider [wa]s too far removed from the graphic scenes of sexual activity in [the videotape in question] to be relevant on the question of whether that particular video is obscene”). A judge considering the fit of research to the case should ask whether surface differences between the case facts and research design are meaningful differences that render inferences from the research misleading or inapplicable.

1. Independent Variables: Do the Hypothesized Causal Variables in the Research Approximate the Causal Variables in the Plaintiff’s and Defendant’s Theories of the Case?

The significance of the choice of independent variables to the external validity of a social fact study is apparent. If the expert fails to study the variables hypothesized by the parties to have caused or contributed to the outcomes in the case, then the social fact study will not fit the facts of the case. For instance, in a gender discrimination case based on a failure to hire, the independent variable under the plaintiff’s theory of the case may be applicant sex, whereas the defendant may argue that applicant qualifications caused the outcome in the case. 116In most such cases, sex will be treated as a dichotomous independent variable (i.e., with the categories male versus female), but in cases where failure to conform to gender stereotypes is the basis for a sex discrimination claim, degree of fit with gender stereotypes might be the proper independent variable. Thus, for a social fact study of gender discrimination to be externally valid on the independent variable dimension in this case, the study must examine the impact of applicant sex and qualifications on employment outcomes. 117An expert may choose to test only one of the independent variables, in which case the study will be externally valid but its statistical conclusion and internal validities are threatened by the failure to control for a third variable that could explain or affect the covariation between the independent and dependent variables studied.
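
A small worked example may clarify why both parties’ hypothesized causes belong in the analysis. The sketch below uses entirely hypothetical applicant records (the `make` helper, the field names, and the counts are invented for illustration) and compares hire rates by sex alone with hire rates by sex within each qualification level.

```python
from collections import defaultdict


def hire_rates(applicants, by):
    """Hire rate for each combination of the fields named in `by`."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for applicant in applicants:
        key = tuple(applicant[field] for field in by)
        total[key] += 1
        hired[key] += applicant["hired"]
    return {key: hired[key] / total[key] for key in total}


def make(sex, qualified, n, n_hired):
    """Build n hypothetical applicant records, the first n_hired of them hired."""
    return [{"sex": sex, "qualified": qualified, "hired": int(i < n_hired)}
            for i in range(n)]


# Hypothetical data, invented purely for illustration.
applicants = (
    make("F", True, 20, 16) + make("F", False, 80, 16)
    + make("M", True, 80, 64) + make("M", False, 20, 4)
)

print(hire_rates(applicants, ["sex"]))               # sex considered alone
print(hire_rates(applicants, ["sex", "qualified"]))  # sex within qualification level
```

In this invented dataset the gap between the sexes that appears when sex is considered alone vanishes once qualifications are held constant. With other data the gap might persist or widen, which is why a study that omits one party’s hypothesized cause cannot distinguish between the competing theories.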

To be externally valid, the independent variable need not perfectly mirror the variable alleged to be the direct cause of harm to the plaintiff or alleged to provide an excuse for the defendant’s conduct. For instance, in the pictorial lineup experiment conducted in the Newsome case, 118 See supra notes 77–80 and accompanying text. the appellate court upheld the district court’s admission of the experimental results despite surface differences between the case and the experiment:

Experiments of the kind that Wells performed are the norm in this branch of science and have met the standard for scholarly publication and acceptance. There were of course potential problems. For example, Wells assumed that Emerson is the killer, so that the witnesses saw him; if anyone other than Emerson committed the murder, the test is invalid. Wells was candid about this vital assumption, which was open to probing and argument by the defendants. Wells also assumed that two-dimensional images (pictures) yield the same effects on memory as three-dimensional views (live action in the victim’s grocery store; lineups in the police station; identifications in open court). This may or may not hold, but the claim of equivalence was open to exploration at trial, and it is hard to see what else Wells could have done. Even if he could have conscripted Emerson and the lineup participants for an experiment, time has so altered their appearance since the events of October 1979 that the results would have been unreliable. . . . Yet testimony need not prove everything in order to be useful. . . . Wells’ testimony was not a distraction in this civil proceeding but went to an important ingredient of the plaintiff’s claim. 119Newsome v. McCabe, 319 F.3d 301, 306 (7th Cir. 2003).

2. Dependent Variables: Are the Outcomes Studied in the Research Similar to the Outcomes at Issue in the Litigation?

Deciding on the proper dependent variable for a social fact study will be straightforward in many cases, as in a failure-to-hire case where the fact of hiring (or not) is the proper dependent measure in a statistical analysis of applicant data or another form of social fact study. In some cases, the ultimate legal question of interest may be more difficult to operationalize. For instance, in Lanham Act cases, in which social fact studies are frequently used, 120 See Siegrun D. Kane, Trademark Law § 16:6, at 16-14 (PLI Intellectual Prop. Law Library No. G1-8804, 4th ed. 2006) (“The judicial attitude toward surveys has moved 180 degrees in past decades. While many courts used to exclude survey evidence as inadmissible hearsay or give it little weight, surveys are now generally looked on as significant evidence.”); supra note 16 (discussing the use of surveys in trademark cases). the key legal question is whether consumers or potential consumers are likely to be confused by two specific products or services. 121Squirt Co. v. Seven-Up Co., 207 U.S.P.Q. (BNA) 12, 19 (E.D. Mo. 1979). Turning this question into a testable proposition may be more difficult than it seems. 122Framing the question properly can be seen as a problem of both external validity and construct validity. “Construct validity involves making inferences from the sampling particulars of a study to the higher-order constructs they represent.” William R. Shadish et al., Experimental and Quasi-Experimental Designs for Generalized Causal Inference 65 (Kathi Prancan ed., 2002). “Construct validity and external validity are related to each other in two ways. First, both are generalizations. . . . Second, valid knowledge of the constructs that are involved in a study can shed light on external validity questions, especially if a well-developed theory exists that describes how various constructs and instances are related to each other.” Id. at 93. 
For example, answers to a survey question that asks, “What is the first thing that comes to mind when looking at [the allegedly infringing trademark]?,” have been uniformly deemed inadmissible because “calling to mind” differs from “likelihood of confusion.” 1236 J. Thomas McCarthy, McCarthy on Trademarks and Unfair Competition § 32:176, at 32-375 to -376 (4th ed. 2008) (internal quotation marks omitted). As Judge Rich observed:

The very fact of calling to mind may indicate that the mind is distinguishing, rather than being confused by, two marks. . . . Seeing a yellow traffic light immediately “calls to mind” the green that has gone and the red that is to come, or vice versa; that does not mean that confusion is being caused. As we are conditioned, it means exactly the opposite. 124 In re Ferrero, 479 F.2d 1395, 1397 (C.C.P.A. 1973).

Similarly, a study investigating if the Ghostbusters movie logo is “reminiscent” of the Casper, the Friendly Ghost character does not provide sufficient evidence that audiences will be confused about whether the movie logo and cartoon character have a common source. 125Harvey Cartoons v. Columbia Pictures Indus., Inc., 645 F. Supp. 1564, 1573 (S.D.N.Y. 1986). “[I]f the survey questions are not congruent with the issues in the case, the results will not only be irrelevant, but may also be prejudicially misleading to a jury . . . .” 126 McCarthy, supra note 123, § 32:170, at 32-351; accord Starter Corp. v. Converse, Inc., 170 F.3d 286, 297 (2d Cir. 1999) (holding that it was proper for district court to exclude research because the survey questions were “little more than a memory test” and, therefore, were not probative of the likelihood of confusion); J & J Snack Foods, Corp. v. Earthgrains Co., 220 F. Supp. 2d 358, 370 (D.N.J. 2002) (“Above all, the survey’s design must fit the issue which is to be decided by the jury, and not some inaccurate restatement of the issue, lest the survey findings inject confusion or inappropriate definitions into evidence, confounding rather than assisting the jury.”); Franklin Res., Inc. v. Franklin Credit Mgmt. Corp., 988 F. Supp. 322, 335 (S.D.N.Y. 1997) (“Surveys which do nothing more than demonstrate the respondents’ ability to read are not probative on the issue of likelihood of consumer confusion.”); Jacob Jacoby, Survey and Field Experimental Evidence, in The Psychology of Evidence and Trial Procedure 175, 186 (Saul M. Kassin & Lawrence S. Wrightsman eds., 1985) (“[C]ourts raise two points regarding the questions posed to respondents: (1) Do these questions address the legal issues that are relevant to the case? (2) If so, are the questions posed in a clear and unbiased manner?”). 
The risk of wasting resources due to an ill-conceived dependent variable argues strongly for the social fact expert to work closely with the attorneys to ensure that the study examines proper variables.

3. Persons: Are the Research Participants Similar to the Persons Involved in the Case?

Under the Lanham Act, potential confusion only matters for the “consumers and potential consumers” of the products or services in question. 127 See, e.g., Sizes Unlimited, Inc. v. Sizes to Fit, Inc., 871 F. Supp. 1558, 1561 (E.D.N.Y. 1994) (“Consumers and potential consumers of a product must, on the basis of the mark at issue, associate the goods or services at issue with a single source, even if that source is anonymous.”). It is not surprising, therefore, that an initial task in designing social fact research for use in trademark litigation is to determine the universe of people whose level of confusion is to be estimated. 128 McCarthy, supra note 123, § 32:159, at 32-319. “Selection of the proper universe is a crucial step, for even if the proper questions are asked in a proper manner, if the wrong persons are asked, the results are likely to be irrelevant.” 129 Id.; accord Robert C. Bird, Streamlining Consumer Survey Analysis: An Examination of the Concept of Universe in Consumer Surveys Offered in Intellectual Property Litigation, 88 Trademark Rep. 269, 276–77 (1998) (“Determination of the universe represents one of the most significant challenges a survey expert will face in drafting a consumer survey. A misaligned universe can doom otherwise competent research and trigger an adverse decision by the court.”); Shari Seidman Diamond, Survey Research, in 1 Modern Scientific Evidence, supra note 109, § 8:10 (“One of the first steps in designing a survey or in deciding whether an existing survey is relevant is to identify the target population (or universe).”); Lawrence E. Evans, Jr. & David M. Gunn, Trademark Surveys, 79 Trademark Rep. 
1, 31 (1989) (“Errors in [selecting the universe] are more likely to prove fatal than errors in the content of the questions, for there is some value in a slanted question asked of the right witness, but no value in asking the right question of the wrong witness.”); Jacoby, supra note 126, at 179–80 (“It has become axiomatic in trademark case law that the key consideration in the design of a survey is whether the appropriate universe was tested. More surveys are held inadmissible or given no weight for having employed an improper universe than for any other reason.” (citations omitted)).

The precise population of consumers or potential consumers whose confusion is at issue in a given trademark dispute, of course, will vary according to the types of products or services being purchased. 130 See McCarthy, supra note 123, § 23:5, at 23-23 (“[C]ustomers may be consumers, professional purchasers or wholesalers or retailers. A potential customer is one who might someday purchase this kind of product or service and pays attention to brands in that market.” (footnote omitted)). For example, in one case, the proper population for a survey relating to confusion between fishing reels was persons over fourteen years who had fished in fresh water in the prior twelve months. 131Brunswick Corp. v. Spinit Reel Co., 832 F.2d 513, 523 n.6 (10th Cir. 1987). In another case, the proper population for a survey on confusion between prescription medications for glaucoma was found to be ophthalmologists and optometrists who, as the prescribers, “make the ultimate determination as to which medications pharmacists will dispense and end-users—patients—will receive.” 132Pharmacia Corp. v. Alcon Labs., Inc., 201 F. Supp. 2d 335, 365 (D.N.J. 2002). In the landmark case of Zippo Manufacturing Co. v. Rogers Imports, Inc., involving dueling cigarette lighters, the relevant population was determined to be “all smokers aged eighteen years and older residing in the continental United States.” 133216 F. Supp. 670, 681 (S.D.N.Y. 1963).

A focus on the wrong population can be fatal. In Amstar Corp. v. Domino’s Pizza, Inc., the researcher surveyed women at home during daylight hours. 134615 F.2d 252, 264 (5th Cir. 1980). Responses from this sample were found to be inadequate for proving a likelihood of confusion between defendant’s Domino’s pizza restaurants and plaintiff’s Domino sugar because the defendant’s customers were mainly young, single, male college students, whereas the plaintiff’s customers were mainly middle-aged women. 135 Id. The women surveyed, having little exposure to Domino’s pizza trademark, were not the appropriate population for assessing confusion. 136 Id.

The risk of different results between participant samples extends to the general social science literature as well. For example, consumer research involving student samples has been found to produce different results from research involving nonstudent samples in a number of respects. 137 See Robert A. Peterson, On the Use of College Students in Social Science Research: Insights from a Second-Order Meta-Analysis, 28 J. Consumer Res. 450, 458 (2001) (“[T]he present research suggests that, by relying on college student subjects, researchers may be constrained regarding what might be learned about consumer behavior and in certain instances may even be misinformed.”). Thus, off-the-rack research, even if it directly addresses the independent and dependent variables implicated in a case, may lack external validity due to differences between the people studied and the relevant population for a case.

The particular characteristics of individuals that will pose external validity threats differ by type of case. For instance, in an obscenity case, the community from which the sample is drawn will be important. In eyewitness identification studies, the race of participants will be important due to potential bias in same- versus other-race identifications. 138On differences in identifications by race, see Christian A. Meissner & John C. Brigham, Thirty Years of Investigating the Own-Race Bias in Memory for Faces: A Meta-Analytic Review, 7 Psychol. Pub. Pol’y & L. 3, 21 (2001). A particular concern with extrapolating from general social science research to a specific discrimination case, as happens when an expert performs “social framework analysis,” is that many subjects of personnel experiments possess little work experience and are less personally interested in or motivated to attend to the tasks at hand than actual job applicants, employees, and managers at many companies. 139 See supra note 6 and accompanying text. Further, these differences have been shown to affect the likelihood that bias will be observed in personnel decisions. 140 See, e.g., Randall A. Gordon & Richard D. Arvey, Age Bias in Laboratory and Field Settings: A Meta-Analytic Investigation, 34 J. Applied Soc. Psychol. 468, 485–86 (2004) (“For the most part, the results from analyses that examined the issue of generalizability show that greater and more relevant information and greater and more relevant experience among raters, judges, or supervisors leads to less age bias.”); Cynthia M. Marlowe et al., Gender and Attractiveness Biases in Hiring Decisions: Are More Experienced Managers Less Biased?, 81 J. Applied Psychol. 11, 18 (1996) (“Managers of all experience levels exhibited bias in the rating conditions, despite the fact that all of our applicant photographs were rated as being at least somewhat attractive. . . . 
These biases tended to decrease as managerial experience increased, except that less attractive women were routinely judged to be the worst applicants.”); Dianna L. Stone et al., Methodological Problems Associated with Research on Unfair Discrimination Against Racial Minorities, 18 Hum. Resource Mgmt. Rev. 243, 251 (2008) (“Interestingly, an analysis of the average effect size estimates revealed that studies using non-representative samples had a larger effect size (r = .24) than those using representative samples (r = .14). These results suggest that race may have less of an effect on personnel decisions in actual organizational settings than in contrived settings.”). A major advantage of a social fact study over “social framework analysis” is that social fact studies may include job applicants, employees, and managers drawn from the pool of persons involved in the case or persons matched on levels of education, experience, training, and demographic characteristics to the persons involved in the litigation. 141 See, e.g., supra notes 58–62 and accompanying text.

4. Setting: Are the Research Setting and Tasks Similar to Those in the Case?

In trademark cases, the setting in which the social fact study has been conducted affects the admissibility of the study and its ability to persuade: “[T]he closer the survey methods mirror the situation in which the ordinary person would encounter the trademark, the greater the evidentiary weight of the survey results.” 142 McCarthy, supra note 123, § 32:163, at 32-333; accord Kane, supra note 120, § 16:6.1, at 16-20 (“The more remote the survey is from actual marketplace conditions, the less persuasive it will be.”). For example, in the case of alleged consumer confusion between two brands of non-cola soda where actual consumer behavior was observed, 143 See supra notes 74–76 and accompanying text. the court commented favorably on this market survey:

[A]n actual purchaser is asked to list the brands he has just purchased, and then asked to display the brands he has named, in order to determine the accuracy of his listed purchases. . . . [T]he fact that the survey was conducted in a live market environment and measured actual consumer purchasing behavior as opposed to being conducted in the home and measuring consumer opinion, lends greater reliability to the survey results. 144Squirt Co. v. Seven-Up Co., 207 U.S.P.Q. (BNA) 12, 25–26 (E.D. Mo. 1979).

Some courts have discounted or rejected surveys that did not reproduce the state of mind of consumers in an actual retail setting. As Judge Wyzanski famously commented:

If the interviewee is not in a buying mood but is just in a friendly mood answering a pollster, his degree of attention is quite different from what it would be had he his wallet in his hand. Many men do not take the same trouble to avoid confusion when they are responding to sociological investigators as when they spend their cash. 145Am. Luggage Works, Inc. v. U.S. Trunk Co., 158 F. Supp. 50, 53 (D. Mass. 1957).

Others, however, have taken the view that it is not necessary that a survey be administered only to people with wallet in hand in a live market environment. 146 See McCarthy, supra note 123, § 32:163, at 32-334 (“To require that a survey be taken ‘during the buying decision’ is an impossible requirement tantamount to rejecting all survey evidence.”). As Judge Feinberg stated in Zippo:

While it may be that in general the store is the best place to measure the state of mind at the time of purchase, it would be virtually impossible to obtain a representative national sample if stores were used. An interview at a respondent’s home is probative of his state of mind at the time of purchase, although the deviation from the actual purchase situation should be considered in weighing the force of this evidence. 147Zippo Mfg. Co. v. Rogers Imports, Inc., 216 F. Supp. 670, 685 (S.D.N.Y. 1963) (footnotes omitted).

The setting dimension presents one of the strongest arguments for social fact studies in discrimination cases because it can be difficult to simulate the relevant conditions of a particular organization in a lab or to find another field setting that matches the defendant organization in key respects. In post-hire discrimination cases (and many hiring cases), managers almost always have more personalized information about the employees (or applicants) in question and operate under stronger potential penalties for discrimination than, for example, college students playing the role of managers in laboratory simulations and making decisions about hypothetical or role-playing employees. Importantly, both of these factors—degree and type of personalized knowledge and accountability pressures on decisionmakers—are known to moderate the likelihood that prejudicial attitudes or stereotypes will bias judgments or decisions about women or minorities, and a host of other organization-specific factors may dampen or magnify the prospect of unbiased decision making. 148 See, e.g., Frank J. Landy, Stereotypes, Bias, and Personnel Decisions: Strange and Stranger, 1 Indus. & Organizational Psychol. 379, 380, 384 (2008) (noting the impact of adequate safeguards and individuating information on workplace discrimination); Philip E. Tetlock et al., The Challenge of Debiasing Personnel Decisions: Avoiding Both Under- and Overcorrection, 1 Indus. & Organizational Psychol. 439, 440 (2008) (finding empirical and theoretical support for the proposition that accountability pressures will push decisionmakers to value individuating over implicit biases). See generally Philip E. Tetlock & Gregory Mitchell, Implicit Bias and Accountability Systems: What Must Organizations Do to Prevent Discrimination?, 29 Res. Organizational Behav. 3, 11–18 (2009) (critiquing the argument that equal opportunity is impossible in a society with inequality of result). 
Furthermore, local labor market conditions will likely impact how organizations hire, promote, and pay their employees; conditions within the industry and the region in which the organization is located may also impact employment practices. 149For example, within the United States, racial biases are correlated with geographic region, income levels, and educational attainment. See, e.g., Peter Burns & James G. Gimpel, Economic Insecurity, Prejudicial Stereotypes, and Public Opinion on Immigration Policy, 115 Pol. Sci. Q. 201, 212–17 (2000) (discussing the impact of contextual and personal factors on racial attitudes in 1992 and 1996); cf. Nakajima v. Gen. Motors Corp., 857 F. Supp. 100, 105 (D.D.C. 1994) (“Where, as here, the expert’s opinion is based on an incorrect assumption about the country in which a plaintiff will reside, the testimony should not be permitted because it fails to serve its purpose of aiding the trier of fact in its determination of lost future earnings.”); id. (“Additionally, plaintiff’s contention that the use of United States’ statistical and economic data is necessary because comparable Japanese data is not available is not supported by the record. A review of [an expert’s] deposition and the Year Book of Labour Statistics, published by the Japanese Ministry of Labour, shows that adequate Japanese data on the factors considered under District of Columbia law exists. Therefore, the testimony of [the expert], insofar as it is based on a presumption of Nakajima’s future residence in the United States, is excluded.” (citation omitted)).

Laboratory research based on college students making low-stakes or hypothetical decisions in limited interactions may yield important findings on basic cognitive and motivational processes involved in impression formation and intergroup behavior, but these findings may provide little guidance on the behaviors likely to be observed under the conditions of the real workplace involved in the case. Accordingly, the ability of a social fact study to be customized to include particular job classifications, particular divisions in a firm, particular stores or factories owned by a firm, particular tasks, and particular processes, or to analyze existing case-specific data using systematic qualitative or quantitative techniques, presents a considerable advantage over “social framework analysis” and expert speculation about the relevance of studies conducted in the lab or in different organizational settings for the case at hand.

5. Timing: Has Time Changed Anything?

The time dimension will rarely pose a severe threat to the external validity of a social fact study in a civil case due to statutes of limitations that encourage plaintiffs to file their claims within a relatively short time of the challenged conduct. 150For instance, Title VII provides that charges of discrimination should be submitted to the EEOC within 180 days of the alleged discrimination in states with no state agency devoted to handling discrimination claims or, in states with such agencies, within 300 days of the alleged discrimination or 30 days of termination of state agency proceedings (whichever date is earlier). 42 U.S.C. § 2000e–5(e)(1) (2006). Considerable time may pass, however, before the EEOC decides whether to pursue the case or issues a right-to-sue letter or between the start of a suit and involvement of an expert. Plus, the continuing violation theory of discrimination may further widen the gap between the date of alleged discrimination and the filing of suit, or the plaintiff may seek to introduce evidence of discriminatory conduct outside the limitations period to support her case. See Kyle Graham, The Continuing Violations Doctrine, 43 Gonz. L. Rev. 271, 275 (2007/08) (“The first sort of continuing violation aggregates multiple allegedly wrongful acts, failures to act, or decisions such that the limitations period begins to run on this collected malfeasance only when the defendant ceases its improper conduct. The second type of continuing violation divides what might otherwise represent a single, time-barred cause of action into several separate claims, at least one of which accrues within the limitations period prior to suit.” (footnote omitted)). Thus, where there are concerns about memory or changes due to intervening events, the researcher should take these concerns into account and seek to examine the impact of this passage of time on any newly collected data. See Barbara A. 
Gutek, My Experience as an Expert Witness in Sex Discrimination and Sexual Harassment Litigation, in Sex Discrimination in the Workplace, supra note 45, at 131, 137 (“But given the long time frame of this case and many other class actions, it is important to find out when relevant behaviors occurred; surely it makes a difference if all objectionable behavior occurred more than five years ago or if the amount of potentially harassing behavior increased over time.”). In contrast, “social framework analysis” may rely on general social science research conducted many years ago, meaning that temporal differences create a greater external validity risk:

At some point, many social research findings—no matter how valid at the time they were obtained—can be criticized as being outdated or “stale.” They no longer reflect the current situation. A study finding massive discrimination against women in law schools in the 1950’s, for example, may not be generalizable as evidence in a Title VII suit against a law school in the 2000’s. 151 Monahan & Walker, supra note 2, at 74. Research has found that the likelihood of finding age bias in personnel decisions decreased over the last generation. Gordon & Arvey, supra note 140, at 479. Likewise, gender and racial attitudes have liberalized considerably over the last fifty years. See, e.g., Lawrence D. Bobo & Camille Z. Charles, Race in the American Mind: From the Moynihan Report to the Obama Candidacy, 621 Annals Am. Acad. Pol. & Soc. Sci. 243, 245 (2009) (“Overall, . . . these improvements in whites’ racial attitudes are sweeping and robust . . . .”); Clem Brooks & Catherine Bolzendahl, The Transformation of US Gender Role Attitudes: Cohort Replacement, Social-Structural Change, and Ideological Learning, 33 Soc. Sci. Res. 106, 107 (2004) (“Highly restrictive attitudes, characterized by negative beliefs about women in non-domestic roles, an unwillingness to support women’s rights across a wide range of institutions, and a tendency to endorse gender-based differences in power and responsibility have evolved into seemingly more liberal attitudes.”).

The crucial question on the temporal dimension will be whether significant organizational, societal, political, or other events intervened between the date of the contested conduct and the date of the study, raising concerns about generalizing backwards in time. 152As an example of an important societal change, consider the impact of 9/11 on study outcomes: if a Muslim brought a religious discrimination case arising from events occurring before September 11, 2001, we might be concerned about the external validity of a study into the impact of religious affiliation on employment outcomes conducted after September 11, 2001. For instance, perhaps the organization instituted new managerial training or oversight after the lawsuit was filed that sensitized managers to problems of workplace discrimination (or at least to the risk of misbehavior detection). If so, use of a social fact method that involves systematic analysis of historical case records, rather than an observational or field study that focuses on current relations or behavior, may be more appropriate.

III. Legal and Ethical Issues in Social Fact Studies

All social fact studies occur in a context of general legal rules, and the opportunity to investigate a case using social fact research depends, ultimately, on this context. In this Part, we describe the different contexts in which research may occur and identify the legal and ethical issues that arise in each. We adopt as our central organizing variable the timing of the research in relation to the filing of suit because prefiling and postfiling research raise different concerns. We also distinguish between research on one of the parties or their representatives and research on third parties. Our goals are to suggest judicial interpretations of applicable rules that permit greater use of social fact studies and to provide guidance to practitioners and researchers contemplating a possible social fact study.

A. Prefiling Research

Both potential plaintiffs and defendants may be interested in gathering and analyzing their own data, either to support or defend against a future filing, or to assess the likelihood that a claim could succeed should it be filed. Plaintiffs commonly make some effort to gather information before filing to determine the potential value of bringing a lawsuit. With the recent reinterpretations of Federal Rule of Civil Procedure 8 regarding the level of factual detail and support needed to survive a motion to dismiss for failure to state a claim, 153 See Ashcroft v. Iqbal, 129 S. Ct. 1937, 1949 (2009) (“To survive a motion to dismiss, a complaint must contain sufficient factual matter, accepted as true, to ‘state a claim to relief that is plausible on its face.’ A claim has facial plausibility when the plaintiff pleads factual content that allows the court to draw the reasonable inference that the defendant is liable for the misconduct alleged.” (citation omitted) (quoting Bell Atlantic Corp. v. Twombly, 550 U.S. 544, 570 (2007))); Twombly, 550 U.S. at 562–63 (“Conley’s ‘no set of facts’ language has been questioned, criticized, and explained away long enough. . . . [A]fter puzzling the profession for 50 years, this famous observation has earned its retirement. The phrase is best forgotten as an incomplete, negative gloss on an accepted pleading standard: once a claim has been stated adequately, it may be supported by showing any set of facts consistent with the allegations in the complaint.”). such prefiling fact investigation is likely to increase. A social fact study, such as a tester study in a failure-to-hire case showing systemic differential treatment of male and female applicants, could be an invaluable tool for plaintiffs seeking to gain additional factual support for their complaints.

1. Observational Studies

The propriety of a social fact study depends on the manner in which the study is conducted and the source of information used. If the research is done by systematically observing the potential defendant’s behavior in a public or semipublic place (provided no violations occur with respect to the right to enter and observe the location or premises in question), then the study should pose no legal problems. Although conducted postfiling, the study in State v. Soto provides an excellent example of an observational study that could have been conducted prefiling using wholly public observations. 154 See supra notes 58–62 and accompanying text.

2. Surveys and Experiments

Opportunities such as that found in Soto are limited, 155One can nonetheless imagine unobtrusive observational studies providing useful data in important types of cases, in particular wage-and-hour class actions, where observers could randomly and systematically sample the behavior of different types of workers in organizations permitting public access (e.g., restaurants, retail stores). and therefore much prefiling research by potential plaintiffs will require either the use of third parties as the study participants or the use of deception in interactions with the potential defendant to gain information (explicit cooperation by a potential defendant in a study to investigate potential wrongdoing is unlikely, of course, unless the potential defendant sees some advantage to facilitating a systematic study of relevant conditions or behavior). A common form of third-party study is a survey of consumer confusion in a trademark dispute. 156 See supra notes 74–76 and accompanying text. Such studies can be performed prefiling without raising any special legal concerns. 157In a related vein, the FTC could also utilize such studies to investigate claims of false or deceptive advertising, and reputational damage caused to high-profile figures by allegedly defamatory statements could be assessed prefiling using surveys, structured interviews, and, in the case of celebrities, market analysis. As noted below, such prefiling studies on third parties may present discovery issues. See infra notes 168–77. However, to the extent a consulting expert is used to conduct the study in anticipation of litigation, the work product doctrine will provide some insulation from discovery of unfavorable study results or “false starts” that have to be scrapped. Furthermore, so long as the researcher uses informed consent or other safeguards against participant harm, there should be no special ethical constraints on such a study either.

Where the plaintiff attempts to gain prefiling data from the defendant, deception or concealment will sometimes be used by the plaintiff, and the defendant may respond by accusing the plaintiff of fraud. For instance, in Education/Instruccion, Inc. v. Copley Management and Development Corp., the plaintiffs based their discrimination claims on the results of tester studies, and the defendant responded with a counterclaim for misrepresentation, which the court deemed valid under state law. 158No. 81-532-Z, 1982 U.S. Dist. LEXIS 16667, at *2 (D. Mass. Oct. 14, 1982). The tort of misrepresentation in Massachusetts requires: “First, the tortfeasor must make a false statement which he knows to be false. Second, he must intend to incude [sic] his victim to rely on that statement. Third, his victim must, in fact, so rely.” Id. at *1–2. Under Havens Realty Corp. v. Coleman, testers are defined as “individuals who, without an intent to rent or purchase a home or apartment, pose as renters or purchasers for the purpose of collecting evidence of unlawful steering practices.” 455 U.S. 363, 373 (1982). The court then ruled that “the state law, as applied to testers, [ran] afoul of the federal constitution’s Supremacy Clause.” 159 Education/Instruccion, 1982 U.S. Dist. LEXIS 16667, at *2. In another tester case, J.K. Guardian Security Services filed suit against the testers alleging that they had committed fraud by creating fictitious résumés and making false representations about their willingness to work for the defendant, but the complaint was dismissed for failure to plead “more specific damages.” 160Robert Thomas Roos, Note, No Harm, No Fraud: The Invalidity of State Fraud Claims Brought Against Employment Testers, 53 Vand. L. Rev. 1687, 1692–93 (2000) (discussing Kyles v. J.K. Guardian Sec. Servs., Inc., 222 F.3d 289 (7th Cir. 2000)). 
Although these cases ended in pro-tester results, it appears that “the issue of potential fraud claims brought against employment testers remains unsettled in the vast majority of American jurisdictions.” 161 Id. at 1693.

Challenges to tester standing are unlikely to succeed, at least in federal court. The Supreme Court has ruled that testers have standing to pursue a discrimination claim under the Fair Housing Act, 162 Havens, 455 U.S. at 373–74; see also Kyles, 222 F.3d at 292 (noting that testers “have been used for years to assess compliance with the nation’s fair housing laws”); id. at 299 (“The fact that testers have no interest in a job does not diminish the deterrent role they play by filing suit under Title VII. In that regard, testers are situated similarly to unlawfully discharged employees who are ineligible for reinstatement because of wrongdoing discovered after they were fired. Evidence of such wrongdoing limits the relief they may obtain under Title VII, but it does not bar them from bringing suit.”); Molovinsky v. Fair Emp’t Council of Greater Washington, Inc., 683 A.2d 142, 146 (D.C. 1996) (“Violation of a plaintiff’s statutory rights may itself constitute an ‘actual or threatened injury’ sufficient to confer Article III standing.” (quoting Havens, 455 U.S. at 373)). But see Fair Emp’t Council of Greater Washington, Inc. v. BMC Mktg. Corp., 28 F.3d 1268, 1274 (D.C. Cir. 1994) (“[T]he facts as alleged in the complaint do not come close to indicating that either tester ‘will again be subjected to the alleged illegality’. The tester plaintiffs therefore lack standing to seek prospective relief.” (citation omitted) (quoting City of L.A. v. Lyons, 461 U.S. 95, 109 (1983))). and Title VII authorizes the EEOC to accept charges of employment discrimination “filed by or on behalf of a person claiming to be aggrieved.” 16342 U.S.C. § 2000e–5(b) (2006). Further, the EEOC has classified testers as aggrieved persons on grounds that a discriminatory rejection of employment constitutes injury regardless of whether an actual employment loss occurred. 164EEOC Compliance Manual (BNA) No. 915.002 (May 22, 1996), available at http://www.eeoc.gov/policy/docs/testers.html.

Another means of gaining prefiling information would be via the plaintiff’s systematic survey of former or existing employees. While such a survey may, from a scientific perspective, raise selection bias worries, the legal and ethical worries arise from placing present and past employees in the position of disclosing potentially restricted information and from contacting employees of a represented party. The particular risks of conducting such a study will depend on the contracts by which any employees may be bound and on the jurisdiction’s particularized privacy and ethical regulations. Under the Model Rules of Professional Conduct, counsel for one party is not supposed to contact anyone whom counsel knows or should know supervises or regularly consults with the opposing party’s counsel concerning the matter, has authority to bind the opposing party, or whose act or omission could be imputed to the other party for purposes of civil or criminal liability. 165 Model Rules of Prof’l Conduct R. 4.2 cmt. 7 (1983). Thus, so long as the contacts are with nonparticipant, low-level employees, such contacts are unlikely to be ethical violations, depending on the particular jurisdiction’s interpretation of the applicable ethical rules. 166 See id. R. 4.2 cmt. 8 (“The prohibition on communications with a represented person only applies in circumstances where the lawyer knows that the person is in fact represented in the matter to be discussed. This means that the lawyer has actual knowledge of the fact of the representation; but such actual knowledge may be inferred from the circumstances.”). Though studies by potential defendants are far less common, they do occur. In particular, applicants and employees may be subjected to testing as a condition of employment, and employees seen as posing risks in the workplace may be subjected to psychological assessments and treatment as a condition of continued employment. 
167 See Constance Weisner et al., Substance Use, Symptom, and Employment Outcomes of Persons with a Workplace Mandate for Chemical Dependency Treatment, 60 Psychiatric Services 646 (2009) (noting that employer mandates pressure individuals to enter chemical dependency treatment programs). Whether the results of these studies can be used in future litigation will depend on the contractual terms of the particular employment relationship and applicable privacy regulations. Thus, the same difficulties discussed with respect to plaintiff’s prefiling studies of defendants would be present in the converse situation.

3. Discoverability

When plaintiffs or defendants engage in studies of themselves, the concern is not about the legality and ethics of gathering the data but rather what should happen should the study yield unfavorable results. For instance, in Hudson v. General Dynamics Corp., counsel for potential plaintiffs distributed questionnaires to two groups of individuals: (1) plaintiffs represented by or seeking representation by counsel; and (2) current and former employees of the defendant identified as potential witnesses. 168186 F.R.D. 271, 275 (D. Conn. 1999). Defendants sought to discover the questionnaire responses, but plaintiffs sought protection under both the attorney–client privilege and the work product privilege. 169 Id. at 273. The court ruled that questionnaires administered initially to offer legal assistance were protected by the attorney–client privilege, 170 Id. at 276. but questionnaires sent to former employees solely to solicit witness statements were not protected, even though some of the respondents later sought representation from the attorneys. 171 Id. at 277. Because these initial questionnaires were completed not for the purpose of obtaining legal advice and prior to the existence of any attempt by the recipient to create an attorney–client relationship, responses to the initial questionnaire fell outside any attorney–client privilege. 172 Id. The court ruled that there could be no retroactive application of the attorney–client privilege. Id.

Similarly, the Hudson court ruled that the work product doctrine applied to the original questionnaires received from plaintiffs seeking representation, except with respect to questionnaires to potential witnesses who later became the attorneys’ clients. 173 Id. at 276–77. The court ruled that defendants failed to show a “substantial need” for the first set of questionnaires 174 Id. at 276 (quoting Fed. R. Civ. P. 26(b)(3)(A)(ii)) (internal quotation marks omitted). but that the second set of “questionnaires [were] simply witness statements with none of the indicia or purpose of any privilege.” 175 Id. at 277. The court also ruled that “[a]ny claim of work product as to the blank questionnaire itself was waived by plaintiffs’ attorney when the questionnaire was sent out to third party former employees.” Id. Studies by a corporate defendant of the behavior of its own employees raise similar risks of disclosure if care is not taken to ensure the attorney–client or work product privilege applies. 176 See, e.g., Paul E. Starkman, Tips and Traps for the Unwary When Auditing and Measuring the Effectiveness of Corporate Compliance and Ethics Programs: An Outside Counsel’s Perspective, in Corporate Compliance and Ethics Institute 2009, at 695, 708–09 (PLI Corporate Law & Practice, Course Handbook Ser. No. 18176, 2009) (discussing ways to protect audit results from discovery, and suggesting that the “self-evaluative” privilege, the attorney–client privilege, or the work product privilege may protect results if audits were performed in anticipation of litigation). In our view, greater use of internal social fact studies should be encouraged by giving greater protection to studies undertaken for self-critical purposes. 177 Accord Greg Mitchell, Good Scholarly Intentions Do Not Guarantee Good Policy, 95 Va. L. Rev. Brief 109, 115 (2010), http://www.virginialawreview.org/inbrief/2010/02/28/mitchell.pdf (“Companies are in the best position to detect and correct workforce disparities. We need to create a set of legal rules and policies that reward internal monitoring and self-correction and that penalize deliberate ignorance about the discriminatory impact of a company’s personnel policies.”).

B. Postfiling Research

In the postfiling period, the interest of a party in conducting a social fact study of the opposing party will, in almost every case, be determined as an aspect of the discovery process. The Federal Rules of Civil Procedure were originally drafted at a time when social fact research was relatively uncommon, but a number of the Rules can be read to authorize social fact studies as part of the discovery process.

1. Social Facts Under the Federal Rules of Civil Procedure

The most straightforward means for a plaintiff, for instance, to acquire relevant documents from a defendant is through Federal Rule of Civil Procedure 26. The Rule requires a defendant to produce relevant documents relating, for example, to its employment practices, 178 See Fed. R. Civ. P. 26(a)(1)(A)(ii) (describing a party’s duty to disclose). and it serves as the foundation (along with Rules 33 and 34) for the exchange of information now regularly used as the basis for statistical social fact studies of candidate or employee data as well as the case documents and deposition testimony used in social framework analysis.

Another source of data for a social fact study may be an observational study conducted under Federal Rule of Civil Procedure 34, which provides access to party premises “so that the requesting party may inspect, measure, survey, photograph, test, or sample the property or any designated object or operation on it.” 179 Id. 34(a)(2) (emphasis added). Thus, an expert who seeks to observe the allegedly discriminatory employers on-site in the “ordinary course of business” could invoke Rule 34 as authority. 180 See, e.g., Coleman v. Schwarzenegger, Nos. CIV S-90-0520 LKK JFM P, C01-1351 TEH, 2007 WL 3231706, at *4 (E.D. Cal. & N.D. Cal. Oct. 30, 2007) (allowing experts to enter a prison to confer with and interview staff during on-site inspection, as long as staff members were “in the ordinary course of business” and “reasonably available”); Morales v. Turman, 59 F.R.D. 157, 159 (E.D. Tex. 1972) (allowing experts in a participant observation study to speak with inmates and staff of a prison to “observe the operations of special treatment centers and other locations where inmates are incarcerated”). Observational studies may be useful in wage-and-hour lawsuits. See, e.g., Whiteway v. FedEx Kinkos Office & Print Servs., Inc., No. C 05-2320 SBA, 2010 WL 1980229, at *4 (N.D. Cal. May 17, 2010) (admitting a survey showing that defendant’s other Center Managers met the company’s expectations); Sepulveda v. Wal-Mart Stores, Inc., 237 F.R.D. 229, 236 (C.D. Cal. 2006) (admitting a survey based, in part, on the observation of eighteen assistant managers in different California stores), aff’d in part, rev’d in part, 275 F. App’x 672 (9th Cir. 2008). Rule 34 would be most useful for a plaintiff conducting unobtrusive observational studies, but it could be extended to include experimental studies and surveys that directly relate to the issues in contention. 
181 Rule 33 interrogatories could also serve as a foundation for survey requests to defendant-employees, but relief from the numerical limits under Rule 33 would likely be necessary. See Fed. R. Civ. P. 33(a)(1) (“Unless otherwise stipulated or ordered by the court, a party may serve on any other party no more than 25 written interrogatories . . . .”). Regardless of whether the studies require public or private access to company records, sites, or employees, Rule 34 should afford the plaintiff a court-mandated opportunity to access the needed data.

Also, under Federal Rule of Civil Procedure 35, any party whose mental condition is in controversy may be compelled to submit to a mental examination by an expert. 182 Id. 35(a)(1). In an employment discrimination case, if the plaintiff were to charge, and the defendant to deny, that defendant’s employers acted with a subjective and unconscious bias, then perhaps a mental examination could provide support for the plaintiff’s claim. 183 See, e.g., Schlagenhauf v. Holder, 379 U.S. 104, 114–22 (1964) (suggesting that Rule 35 may extend to a defendant asserting his mental condition in defense of a claim). But to compel a party to submit to a mental examination, that party’s mental state must be “in controversy,” and there must be “good cause” for the examination. 184 Fed. R. Civ. P. 35(a). One party cannot put another party’s mental state in controversy. 185 Koch v. Cox, 489 F.3d 384, 391 (D.C. Cir. 2007). Whether a mental condition is actually in controversy may be difficult to determine at times. For example, if a defense expert disputes the underlying science and application of the science to the case by a plaintiff’s expert who asserts that discriminatory “implicit bias” was likely at work in a company, does this count as an affirmative denial that puts the condition into dispute? In such cases, the trial judge must make a discretionary determination of whether the “in controversy” and “good cause” requirements have been met. 186 Schlagenhauf, 379 U.S. at 119; see also Lowe v. Phila. Newspapers, Inc., 101 F.R.D. 296, 299 (E.D. Pa. 1983) (“Good cause has been shown under Rule 35(a) for such an examination. Plaintiff’s emotional and mental state of health has clearly been put in issue by plaintiff.”); Brandenberg v. El Al Isr. Airlines, 79 F.R.D. 543, 546 (S.D.N.Y. 1978) (“In view of the allegations of injury and damage in the complaint, a psychiatric examination of the plaintiff under Rule 35(a) is clearly appropriate.”).

Even if an issue is deemed to be in controversy and there is good cause to compel an examination, the person to be examined must be an actual party to the action under Rule 35. 187 Fed. R. Civ. P. 35(a)(1) (“The court . . . may order a party . . . .” (emphasis added)). Typically, one who is not a party to an action, but merely an agent of the named defendant, is not covered by Rule 35. 188 See Schlagenhauf, 379 U.S. at 115 n.12 (“Although petitioner was an agent of [the defendant], he was himself a party to the action. He is to be distinguished from one who is not a party but is, for example, merely the agent of a party.”); Kropp v. Gen. Dynamics Corp., 202 F. Supp. 207, 208 (E.D. Mich. 1962) (holding that the court lacked jurisdiction to compel a truck driver, a nonparty and agent of corporate defendant, to submit to a physical examination under Rule 35(a)). But, in Beach v. Beach, the court ruled that “[o]ne who is not a party in form may be, for various purposes, a party in substance.” 189 114 F.2d 479, 481 (D.C. Cir. 1940). Likewise, in Dinsel v. Pennsylvania Railroad Co., the court relied on its inherent power to order the examination of an employee of a party. 190 144 F. Supp. 880, 882 (W.D. Pa. 1956). Therefore, it appears that there is at least a possibility of a court compelling an agent of a defendant to submit to a mental examination, even if the agent is not a named party who has affirmatively placed his mental state in controversy.

Together, Rules 26, 33, 34, and 35 provide a textual foundation for court-ordered access by a party to another party or representatives of that party for purposes of conducting surveys, observational studies, experimental studies, and perhaps individualized mental examinations of opponents. 191 Where parties conduct social fact studies on themselves or third parties postfiling, discovery concerns should be greatly reduced so long as the study was conducted by a consulting expert for trial preparation purposes. See Fed. R. Civ. P. 26(b)(4)(B) (“Ordinarily, a party may not . . . discover facts known or opinions held by an expert who has been retained or specially employed by another party in anticipation of litigation . . . .”). A purpose-based justification for discovery in aid of social fact studies can also be supplied, for the clear policy of the Federal Rules of Civil Procedure is to encourage the exchange of relevant, nonprivileged information among parties as was the practice in equity prior to adoption of the Rules. 192 See Fleming James, Jr., Discovery, 38 Yale L.J. 746, 746 (1929) (“Equity would . . . entertain jurisdiction over a bill of discovery in aid of an intended action at law or defense therein, or a defense in an equity suit.”).

The equitable origins of modern discovery rules are clear. 193 See generally id. at 746–49 (discussing the history of the equitable bill of discovery). The common law courts had no procedure to compel the disclosure of potential evidence before trial. 194 Id. at 746. Equity, on the other hand, permitted a bill of discovery and required a sworn response. 195 Id. For instance, in Kurtz v. Brown, the court delineated both the origins and purpose of discovery when it ruled:

As the object of this jurisdiction in cases of bills of discovery is to assist and promote the administration of public justice in other courts, they are greatly favored in equity, and will be sustained in all cases where some well-founded objection does not exist against the exercise of the jurisdiction. 196 152 F. 372, 375 (3d Cir. 1906) (quoting 2 Joseph Story, Story’s Equity Jurisprudence § 1488 (Fred B. Rothman & Co. 1988) (1866)).

These practices continued in the United States in federal (and most state) courts until 1938, when the Federal Rules of Civil Procedure were adopted and became available as a model for change in state court procedures. 197 Jack H. Friedenthal et al., Civil Procedure § 7.1, at 397 (4th ed. 2005). The new Federal Rules eliminated the law–equity distinction and embraced liberal discovery procedures. 198 See id. (“As one scholar has noted, broad discovery has transcended its role as a ‘mere procedural rule’ and become the cornerstone of American civil litigation.” (quoting Geoffrey C. Hazard, Jr., From Whom No Secrets Are Hid, 76 Tex. L. Rev. 1665, 1694 (1998))).

Equally certain is that the Federal Rules of Civil Procedure are meant to encourage the pretrial exchange of information. 199 See Stephen N. Subrin, Fishing Expeditions Allowed: The Historical Background of the 1938 Federal Discovery Rules, 39 B.C. L. Rev. 691, 697 (1998) (“It is probable that no procedural process offers greater opportunities for increasing the efficiency of the administration of justice than that of discovery before trial.” (quoting Edson R. Sunderland, Foreword to George Ragland, Jr., Discovery Before Trial, at iii, iii (1932))). In 1947, in Hickman v. Taylor, the Supreme Court recognized this objective:

[T]he deposition-discovery rules are to be accorded a broad and liberal treatment. No longer can the time-honored cry of “fishing expedition” serve to preclude a party from inquiring into the facts underlying the opponent’s case. Mutual knowledge of all the relevant facts gathered by both parties is essential to proper litigation . . . . 200 Id. at 691 (alterations in original) (quoting 329 U.S. 495, 507 (1947)).

The federal judiciary continues to honor this view. 201 E.g., Beale v. District of Columbia, 545 F. Supp. 2d 8, 15 (D.D.C. 2008) (“The histories, vagaries and progress of each case are unique, and the judge managing discovery is in the best position to weigh the equities.”). Moreover, the modern discovery rules recognize that complete, reliable information is needed for more accurate outcomes in trials and early resolution of cases that should not go to trial. Parties thus have a duty to undertake reasonable efforts to respond fully and accurately to discovery requests, 202 See, e.g., Fed. R. Civ. P. 26(g)(1) (“By signing, . . . [a] party certifies that[,] . . . with respect to a disclosure, it is complete and correct as of the time it is made . . . .”); id. 37(a)(4) (“[A]n evasive or incomplete disclosure, answer, or response must be treated as a failure to disclose, answer, or respond.”). and to supplement and correct their past responses as new information becomes available. 203 Id. 26(e)(1)(A).

We therefore propose that the modern question under the Federal Rules should be whether, under the circumstances of the case and in light of potential costs and benefits, the equities favor granting access to the persons, locations, or data requested for social fact purposes. In making this assessment, the potential benefits of a social fact study, which promises reliable science-based data directly relevant to the case, should weigh strongly in favor of granting access.

2. Ethical Issues

A final concern is whether ethical principles permit a judge to order a social fact investigation without the prior consent of the participants, as may be necessary in some instances to avoid contamination of study results. To our knowledge, no direct authority on this point exists, but the conclusions of the Federal Judicial Center’s Advisory Committee on Experimentation in the Law strongly suggest that judicial endorsement of a social fact study without prior consent raises no ethical problems if the potential harms to participants are slight or the potential gains are substantial 204 See Advisory Comm. on Experimentation in the Law, Fed. Judicial Ctr., Experimentation in the Law 4247 (1981). The Committee’s report was focused on the use of programmatic experiments by courts for rule-making purposes. Id. at 3. The potential harms associated with assigning litigants to alternative rule regimes are likely to be much greater than the potential harms to participants in any of the social fact studies discussed in this Article. Accord Pager, supra note 68, at 127 (“Implicit in [the court rulings permitting the use of tester studies] . . . is the belief that the misrepresentation involved in testing is worth the unique benefit this practice can provide in uncovering discrimination and enforcing civil rights laws.”). and any possible harm can be minimized with procedures other than prior informed consent. 205 See Advisory Comm. on Experimentation in the Law, supra note 204, at 118–21 (identifying procedural and statistical methods that protect the privacy of individual subjects). Indeed, because judges have at their disposal a set of protective measures greater than those of Institutional Review Boards, use of deception and potential invasions of privacy should raise less concern in postfiling social fact studies than in social science studies conducted in academic settings. 
In particular, the judge can put in place protective orders guaranteeing participant anonymity and confidentiality, ordering the findings’ use in court only under seal, limiting the uses to which the data can be put, mandating that participants receive particular poststudy disclosures in circumstances where prior consent was not possible, and providing for avenues of complaint and relief to aggrieved or concerned participants. 206 See Fed. R. Civ. P. 26(c)(1) (listing the circumstances in which a court may issue a protective order). Rule 26 gives trial judges broad power to regulate the discovery process through protective orders, including protections extended to nonparties from whom information is sought. See id. (“The court may, for good cause, issue an order to protect a party or person . . . .” (emphasis added)). In cases of heightened concern about potential harm from deception, the court can authorize a small-scale pilot study in which participants are debriefed immediately and any harms assessed before the full study is authorized.

Conclusion

Social fact studies can be performed scientifically and ethically. The legal rules for admissibility of social science evidence pose no insurmountable hurdle to the use of social facts, as courts today regularly accept social fact evidence in a variety of forms. The legal rules likewise permit the collection of the data needed for social fact studies, and in any case, the data needed for such studies will often be little different from the data already exchanged in discovery.

The proposal that expert witnesses carry out original empirical research relevant to disputed facts in litigation, rather than rely on generic social science studies that the experts speculatively link to the case at bar, has been criticized on the grounds that conducting case-specific research is “expensive, time-consuming,” and unlikely to be approved by the parties. 207 Melissa Hart & Paul M. Secunda, A Matter of Context: Social Framework Evidence in Employment Discrimination Class Actions, 78 Fordham L. Rev. 37, 53 (2009). Professors Hart and Secunda provide no support for their characterization of social fact studies as expensive or time consuming and provide no information about the relative costs of social fact studies versus “social framework analysis.” We see no a priori reason why social fact studies will be more burdensome on time or money dimensions. The cost of each approach depends on the scope of the study or analysis undertaken. For example, a systematic coding of case records, wed with a quantitative analysis, could be conducted more reliably and perhaps more efficiently than a “social framework analysis,” depending on who serves as the document coders and the scope of the task given to the coders. We understand that in some cases having an expert witness engage in speculation may be cheaper, quicker, and simpler than conducting original research, but those savings come at the cost of scientific reliability. If experts go beyond providing context for a case through a description of general social science research to make claims about the meaning of social scientific principles for a particular case, then those case-specific claims should be the product of reliable case-specific research. 208 See Fed. R. Evid. 702 advisory committee’s note (“If the expert purports to apply principles and methods to the facts of the case, it is important that this application be conducted reliably.”); cf. 
Monahan et al., Ascendance of Social Frameworks, supra note 5, at 1738 (“We recognize that ‘social fact’ studies of the kind that would survive Rule 702 scrutiny might be costly and might require judicial involvement to ensure access to company personnel. But this possibility does not, in our view, justify the acceptance of unscientific speculation in the form of ‘social framework analysis.’”); Monahan et al., Limits of Social Frameworks, supra note 5, at 318 n.58 (“In any event, we are aware of no court rulings that excuse expert witness reliability requirements because compliance with those requirements would be difficult or costly.”).

Footnotes

* Gregory Mitchell is Mortimer M. Caplin Professor of Law & Class of 1948 Research Professor, University of Virginia. Please direct correspondence to Greg Mitchell at University of Virginia School of Law, 580 Massie Road, Charlottesville, VA 22903-1738, greg_mitchell@virginia.edu. We appreciate the helpful input of Craig Callen, George Cohen, Brandon Garrett, Gregory Mandel, and workshop participants at Michigan State University College of Law and Temple University Beasley School of Law, and we appreciate the research assistance of Myra Chapman, Katherine Foster, Elizabeth Kade, Leah McLaughlin, and Adam Pollet. Laurens Walker is T. Munford Boyd Professor of Law, University of Virginia. John Monahan is John S. Shannon Distinguished Professor of Law & Horace W. Goldsmith Research Professor of Law, University of Virginia.

1 See Muller v. Oregon, 208 U.S. 412, 419 & n.1 (1908) (“[T]he brief filed by Mr. Louis D. Brandeis, for the defendant . . . . [included] extracts from over ninety reports of committees, bureaus of statistics, commissioners of hygiene, inspectors of factories, both in this country and in Europe, to the effect that long hours of labor are dangerous for women, primarily because of their special physical organization.”).

2 See generally John Monahan & Laurens Walker, Social Science in Law (7th ed. 2010) (providing history and discussion of social science as used in a variety of legal contexts).

3 603 F.3d 571 (9th Cir.) (en banc), cert. granted, 131 S. Ct. 795 (2010). The likely size of the class was a matter of some dispute between the majority and dissent in the en banc opinion issued by the Ninth Circuit. Id. at 578 n.3. For information about the proposed class and the relief sought, see Plaintiffs’ Third Amended Complaint at 21–25, Dukes v. Wal-Mart Stores, Inc., 222 F.R.D. 137 (N.D. Cal. 2004) (No. C-01-2252 MJJ), available at http://www.impactfund.org/documents/cat_95-100/Third_Amended_Complaint.pdf. Professor Nagareda describes Dukes as “the largest class action in history under Title VII of the Civil Rights Act of 1964.” Richard A. Nagareda, Class Certification in the Age of Aggregate Proof, 84 N.Y.U. L. Rev. 97, 102 (2009) (footnote omitted).

4 Declaration of William T. Bielby, Ph.D. in Support of Plaintiffs’ Motion for Class Certification at 5, Dukes, 222 F.R.D. 137 (No. C-01-2252 MJJ), available at http://www.walmartclass.com/staticdata/reports/r3.html.

5 Id. at 41. The district court relied on this expert’s opinions in granting the plaintiffs’ class certification motion. Dukes, 222 F.R.D. at 153–54, aff’d sub nom. Dukes v. Wal-Mart, Inc., 509 F.3d 1168 (9th Cir. 2007), aff’d in part on reh’g sub nom. Dukes v. Wal-Mart Stores, Inc., 603 F.3d 571 (9th Cir.) (en banc), cert. granted, 131 S. Ct. 795 (2010). The initial appellate panel also allowed the expert’s opinions as evidence of commonality. Dukes, 509 F.3d at 1179–80. The Ninth Circuit granted rehearing en banc, and a split court upheld the district court’s certification decision. Dukes, 603 F.3d at 628. We discuss in greater detail the opinions of the expert in Dukes—and our concerns about those opinions—in two recent publications. See John Monahan, Laurens Walker & Gregory Mitchell, Contextual Evidence of Gender Discrimination: The Ascendance of “Social Frameworks,” 94 Va. L. Rev. 1715, 1742–48 (2008) [hereinafter Monahan et al., Ascendance of Social Frameworks] (arguing that the expert, although claiming to present a social framework, testified about social facts specific to the defendant); John Monahan, Laurens Walker & Gregory Mitchell, The Limits of Social Framework Evidence, 8 Law, Probability & Risk 307, 311–19 (2009) [hereinafter Monahan et al., Limits of Social Frameworks] (noting a lack of objective measurements to ensure facts were true and representative of all of defendant’s locations).

6 Professors Susan Fiske and Eugene Borgida first used this phrase as an extension of Walker and Monahan’s “social frameworks” concept, which involves using general social science evidence to provide a frame of reference or background information to assist the factfinder deciding issues in a specific case. Compare Susan T. Fiske & Eugene Borgida, Social Framework Analysis as Expert Testimony in Sexual Harassment Suits, in Sexual Harassment in the Workplace: Proceedings of New York University 51st Annual Conference on Labor 575, 577 (Samuel Estreicher ed., 1999) (“A social framework analysis uses general conclusions from tested, reliable, peer-reviewed social science research and applies it to the case at hand.”), with Laurens Walker & John Monahan, Social Frameworks: A New Use of Social Science in Law, 73 Va. L. Rev. 559, 559 (1987) (“[G]eneral research results are used to construct a frame of reference or background context for deciding factual issues crucial to the resolution of a specific case. We call this . . . social frameworks.”). Whereas Walker and Monahan envisioned judicial instructions on reliable social science propositions that would provide jurors with new general information to help them make sense of the case-specific evidence, Fiske and Borgida described a “newer methodology of social framework analysis” in which “[c]onclusions aggregated from the research literature are applied to particular cases” by expert witnesses. Fiske & Borgida, supra, at 575–77. As we have discussed previously, in our view case-specific opinions based on Fiske and Borgida’s “social framework analysis” violate basic rules of expert evidence and scientific reliability and should not be confused with Walker and Monahan’s conception of social framework evidence. 
See Monahan et al., Ascendance of Social Frameworks, supra note 5, at 1746 n.84 (“Whereas Walker and Monahan expressly argued that any inferences to be drawn from the general research to the specific case should be the province of the fact-finder working within a court’s instructions, Fiske and Borgida expressly advocated that experts make such linkages for the fact-finder . . . .”); Monahan et al., Limits of Social Frameworks, supra note 5, at 311–19 (arguing that any opinion based on intuition, instead of reliable methods, falls short of the rigorous post-Daubert standard for expert testimony). For a recent opinion excluding testimony by Dr. Borgida based on “social framework analysis,” see EEOC v. Bloomberg L.P., No. 07 Civ. 8383(LAP), slip op. at 14–15 (S.D.N.Y. Aug. 31, 2010) (stating, among other reasons for excluding Dr. Borgida’s opinions, that “he relied on insufficient facts and data” and “the opinions in [his] report are supported by what appears to be a ‘because I said so’ explanation”). To avoid confusion, we place “social framework analysis” in quotation marks wherever the phrase is used to refer to experts using their personal judgment rather than scientific methods to link social science to specific cases.

7 See Monahan et al., Limits of Social Frameworks, supra note 5, at 315 (noting that experts using “social framework analysis” fill gaps with causal judgments that are not based on accepted methods of causal testing).

8 Id. at 314–17.

9 Videotaped Deposition of William T. Bielby, Ph.D., Taken 01-15-08, at 105–06, EEOC v. Wal-Mart Stores, Inc., No. 6:01-CV-339-KKC, 2010 WL 583681 (E.D. Ky. Feb. 16, 2010), 2008 WL 6858762. Dr. Bielby’s opinions in this case were recently excluded, in part, because those opinions, which were based on “social framework analysis,” were not sufficiently connected to the case at hand. See Wal-Mart Stores, 2010 WL 583681, at *4 (“The burden . . . is on the plaintiff to prove that intentional discrimination occurred at this particular distribution center, not just that gender stereotyping or intentional discrimination is prevalent in the world. Dr. Bielby does not opine on whether intentional discrimination occurred at the distribution center.”).

10 For a web-based demonstration of such a study, see Consultation and Expert Testimony, Eyewitness Identification Research Lab., http://eyewitness.utep.edu/consult01.html (last visited May 20, 2011).

11 See, e.g., California v. Infineon Techs. AG, No. C 06-4333 PJH, 2008 WL 4155665, at *7 (N.D. Cal. Sept. 5, 2008) (“As is the norm in complex antitrust cases, the parties have weighed in on both sides of this question with reference to the testimony of supporting experts, who present conflicting econometric models in support of their contrasting conclusions.”).

12 See Laurens Walker & John Monahan, Social Facts: Scientific Methodology as Legal Precedent, 76 Calif. L. Rev. 877, 881–82 & n.26 (1988). Thus, when we speak here of social facts or social fact studies, we mean case-specific evidence produced through the application of reliable social science principles and methods to case-specific data, and when we speak of “social framework analysis,” we mean case-specific evidence produced through an expert’s application of social science findings to a particular case using expert judgment rather than traditional empirical methods.

13 It is important to note that our rejection of “social framework analysis” applies with equal force to all parties to litigation. Although experts for plaintiffs appear to use this approach more commonly in civil cases than defendants, defense experts have used this approach. See, e.g., Memorandum in Support of Admissibility of Expert Testimony Regarding the Lack of Race Based Preferential Treatment by Thompson or Skipper at 2, Rice v. Or. Dep’t of Corr., No. 04C-19412 (Or. Cir. Ct. Feb. 12, 2008), 2008 WL 3886606 (“Defendants anticipate the testimony of the following expert witness: Dr. M. Kahlil Zonoozy will testify as an expert about social framework evidence based upon his review of witness testimony. Dr. Zonoozy will opine that there were inaccurate perceptions at issue in the workplace at Santiam in 2003 and 2004. These inaccurate perceptions related to race. He will further opine that he saw no evidence of race-based preferential treatment by Superintendent Thompson of Security Manager Carter and Officer Skipper.”). We find this approach unacceptable regardless of the offering party.

14 See, e.g., Daubert v. Merrell Dow Pharm., Inc., 509 U.S. 579, 589–92 (1993) (noting that Federal Rule of Evidence 702 requires expert evidence to be both relevant and helpful).

15 Kumho Tire Co. v. Carmichael, 526 U.S. 137, 147–48 (1999) (determining that Rule 702 applies to all expert testimony, not just that which is “scientific”).

16 The benefits of social fact studies are perhaps best appreciated in the domain of trademark litigation. An American court first admitted a consumer confusion survey in a trademark dispute in 1940. See Oneida, Ltd. v. Nat’l Silver Co., 25 N.Y.S.2d 271, 286 (Sup. Ct. 1940) (explaining how females were asked to identify the maker of silverware to measure possible confusion between the two products). Parties to trademark disputes now routinely rely on survey evidence to show consumer confusion or lack thereof. See Neal Miller, Facts, Expert Facts, and Statistics: Descriptive and Experimental Research Methods in Litigation, 40 Rutgers L. Rev. 101, 137 (1987); Natalie-Claire Woods, Survey, Survey Evidence in Lanham Act Violations, 15 Trinity L. Rev. 67, 71 (2008) (“In fact, out-of-court consumer polling is perhaps the most well received method of introducing, either directly or as an expert witness opinion, evidence regarding the reactions of the public to the trademarks at issue. These surveys and polls are used to determine the aforementioned issues of confusion, secondary meaning, and suggestiveness or generic nature of a trademark. Results of the surveys are offered into evidence directly or as opinion of an expert witness.” (footnotes omitted)). We discuss a number of examples of social facts in trademark cases in Part II below.

17 See Walker & Monahan, supra note 12, at 881 & n.26 (contrasting their definition of “social fact” with that of Marvell, whose definition of the term was closer to Davis’s definition of “legislative fact” (quoting Kenneth Culp Davis, An Approach to Problems of Evidence in the Administrative Process, 55 Harv. L. Rev. 364, 423–25 (1942)) (internal quotation marks omitted)).

18 See generally Davis, supra note 17, at 402 (“When an agency finds facts concerning immediate parties—what the parties did, what the circumstances were, what the background conditions were—the agency is performing an adjudicative function, and the facts may conveniently be called adjudicative facts.”).

19 Fed. R. Evid. 201 advisory committee’s note (alteration in original) (quoting 2 Kenneth Culp Davis, Administrative Law Treatise 353 (1958)).

20 Walker & Monahan, supra note 12, at 881.

21 See id. at 880 (identifying the common use of social science research as a fact-finding tool in Title VII cases).

22 Of course, there will be differences of opinion as to the best model to use to estimate these effects, but so long as a defensible, reliable approach is applied to adequate data, these differences of opinion will not necessarily render the statistical opinions inadmissible. See, e.g., Steven L. Willborn & Ramona L. Paetzold, Statistics Is a Plural Word, 122 Harv. L. Rev. F. 48, 56 (2008), http://www.harvardlawreview.org/media/pdf/willborn_paetzold.pdf (“All statistical methods involve a host of underlying assumptions. In an ideal world, whenever a particular method is used, all of its underlying assumptions would be perfectly and fully met, especially in situations involving issues as important as civil rights. But in practice, with real-world data, this simply does not happen. For this reason, among others, experts have to make choices. . . . What is important in both social science and litigation is that the expert reveal the choices that were made and the extent to which assumptions are met.” (footnote omitted)).

23 See generally John Monahan & Laurens Walker, Social Authority: Obtaining, Evaluating, and Establishing Social Science in Law, 134 U. Pa. L. Rev. 477, 478 (1986) (proposing a shift to consideration of empirical data as “social authority”).

24 See generally Walker & Monahan, supra note 6, at 559 (describing the use of general research results to construct a frame of reference for analyzing factual issues).

25 See Monahan & Walker, supra note 23, at 499 (“Courts should place confidence in a piece of scientific research to the extent that the research . . . is generalizable to the case at issue . . . .”). Social authority is similar to Davis’s conception of “legislative facts.” See Davis, supra note 17, at 402. Monahan and Walker argued that social science used as the basis for legislative facts should be seen as a source of legal authority rather than as a source of facts and hence the label “social authority.” Monahan & Walker, supra note 23, at 488. Conceiving social science as legal authority rather than factual information affects how social science should be presented to courts or other law-making bodies, how social science should be evaluated by those bodies, and how resistant to change laws based on social science should be (i.e., the precedential value of social science and the conditions under which social authority should be altered). See id. at 495–516 (suggesting ways in which courts should treat social science evidence).

26 29 U.S.C. § 621(a) (2006).

27 See Davis, supra note 17, at 402 (classifying evidence that informs legislative judgment as “legislative facts”).

28 For a discussion of the implications of viewing social science as social authority rather than as legislative facts, see John Monahan & Laurens Walker, Empirical Questions Without Empirical Answers, 1991 Wis. L. Rev. 569, 573–74.

29 Walker & Monahan, supra note 6, at 559.

30 See, e.g., State v. Chapple, 660 P.2d 1208, 1223–24 (Ariz. 1983) (holding that the exclusion of expert testimony on the factors that affect the reliability of eyewitness identifications was reversible error).

31 See, e.g., People v. McGuiness, 665 N.Y.S.2d 752, 754 (App. Div. 1997) (finding no error in the admission of expert testimony “explaining behavior that would otherwise appear unusual to the average juror; for example, why a victim of sexual abuse might not immediately report such abuse, as is the case here, or why a child would continue contact and maintain a relationship with the abuser”).

32 Monahan et al., Ascendance of Social Frameworks, supra note 5, at 1725–27.

33 See Dukes v. Wal-Mart Stores, Inc., 222 F.R.D. 137, 152 (N.D. Cal. 2004) (describing the expert’s conclusions about Wal-Mart’s practices following a review of the evidence and organizational research on the topic), aff’d sub nom. Dukes v. Wal-Mart, Inc., 509 F.3d 1168 (9th Cir. 2007), aff’d in part on reh’g sub nom. Dukes v. Wal-Mart Stores, Inc., 603 F.3d 571 (9th Cir.) (en banc), cert. granted, 131 S. Ct. 795 (2010).

34 See Monahan et al., Limits of Social Frameworks, supra note 5, at 317–18 (noting that a reliance on experience, alone, lacks the necessary scientific rigor for admission); David L. Faigman, Evidentiary Incommensurability: A Preliminary Exploration of the Problem of Reasoning from General Scientific Data to Individualized Legal Decision-Making, 75 Brook. L. Rev. 1115, 1135 (2010) (“Put another way, scientist-experts are limited to testifying about what their respective field’s research can validly add to fact-finders’ deliberations—and nothing more. This injunction, however, is not always followed. In particular, experts frequently seek to comment not simply on the import of general research findings, but on whether a particular case fits those findings. Scientific research that permits a valid description of a general phenomenon, however, does not invariably give experts the capacity to validly determine whether an individual case is an instance of that general phenomenon.”).

35 See, e.g., Pharmacia Corp. v. Alcon Labs., Inc., 201 F. Supp. 2d 335, 368 (D.N.J. 2002) (looking to a survey of ophthalmologists and optometrists to determine that they could readily distinguish between the pharmaceuticals at issue).

36 See id. at 373 (“The Court is aware that Pharmacia is not legally required to conduct a confusion survey. But under the circumstances of this case, Pharmacia’s failure to conduct any confusion survey weighs against its request for a preliminary injunction. Such a failure, particularly when the trademark owner is financially able, justifies an inference ‘that the plaintiff believes the results of the survey will be unfavorable.’” (quoting Charles Jacquin et Cie, Inc. v. Destileria Serralles, Inc., 921 F.2d 467, 475 (3d Cir. 1990))).

37 See Christina A. Studebaker et al., Assessing Pretrial Publicity Effects: Integrating Content Analytic Results, 24 Law & Hum. Behav. 317, 319–20 (2000) (“Public opinion surveying has been referred to as ‘the technique of choice for showing that a likelihood of prejudice exists’ because a large number of people in the relevant community can be contacted relatively quickly in order to assess the amount of knowledge people have about a case (presumably derived from pretrial publicity) and their opinions about the defendant.” (quoting Michael T. Nietzel & Ronald C. Dillehay, Psychologists as Consultants for Changes of Venue: The Use of Public Opinion Surveys, 7 Law & Hum. Behav. 309, 312 (1983))).

38 See, e.g., State v. Williams, 598 N.E.2d 1250, 1257 (Ohio Ct. App. 1991) (“[A] properly conducted opinion poll may be relevant to a determination of whether the particular film in question is obscene. On the issue of relevance, the poll must be relevant to a determination of both community standards in general and the community’s acceptance of viewing the particular film in question.”).

39 See, e.g., Hazelwood Sch. Dist. v. United States, 433 U.S. 299, 307–08 (1977) (“Where gross statistical disparities can be shown, they alone may in a proper case constitute prima facie proof of a pattern or practice of discrimination.”).

40 The key is to develop a systematic protocol for categorizing the data based on the facts of interest using a consistent level of analysis. See generally Gary King et al., Designing Social Inquiry: Scientific Inference in Qualitative Research (1994) (discussing the importance of rigorous analysis of qualitative data).

41 An example of an observational social fact study conducted for descriptive purposes is found in Sepulveda v. Wal-Mart Stores, Inc., a wage-and-hour case, where the defendant offered an observational study of the work of assistant managers as evidence of time spent in exempt versus nonexempt activities. 237 F.R.D. 229, 236 (C.D. Cal. 2006), aff’d in part, rev’d in part, 275 F. App’x 672 (9th Cir. 2008).

42 For a discussion of the uses of sampling techniques to gather evidence in complex cases, see Laurens Walker & John Monahan, Sampling Evidence at the Crossroads, 80 S. Cal. L. Rev. 969 (2007).

43 See, e.g., Aerostructures Corp. v. Revenue Comm’r, No. 03-1412-III, 2004 WL 3528278, at *2 (Tenn. Ch. Ct. Nov. 8, 2004) (“The Court finds from the proof that this was an appropriate case in which to perform a sample audit. The Court finds that the volume of the taxpayer’s records was too large to audit them. . . . [T]he Court finds that all the criteria the Department requires for a sample audit were present, and the sample audit was a reasonable method for the Department to use in this case to determine tax liability.”).

44 Such a study could be performed by experts for the parties or by a court-appointed expert. See Manual for Complex Litigation (Fourth) § 11.51, at 112 (2004) (discussing the benefits of court-appointed experts). A court-appointed expert can assemble her own facts and is not limited to considering the evidence presented by the parties and their experts. See id. § 11.51, at 113.

45 See, e.g., Wayne F. Cascio, Sex Discrimination in the Workplace: Lessons from Two High-Profile Cases, in Sex Discrimination in the Workplace 143, 145 (Faye J. Crosby et al. eds., 2007) (relating testimony based on observations of how firefighters performed rescues).

46 Computer simulations may be particularly appropriate when the relevant characteristics of agents or organizational practices can be defined using clear parameters, and the effect of one characteristic on another can be mathematically modeled or reduced to symbolic logic. See generally Eliot R. Smith & Frederica R. Conrey, Agent-Based Modeling: A New Approach for Theory Building in Social Psychology, 11 Personality & Soc. Psychol. Rev. 87, 88 (2007). Thomas Schelling’s study of neighborhood segregation illustrates the effective use of agent-based modeling in assessing the impact of environmental conditions on individual behaviors. Id. at 89. Schelling’s model assumed that each individual would avoid neighborhoods in which he would have minority status. Id. Thus, where a party’s theory of the case can be converted into clear behavioral or organizational rules that should have certain impacts on some dependent measure, a computer simulation may be used to test the theory or, more likely, to estimate the range of effects that should be observed in the actual case.

47 91 F. Supp. 2d 586, 587–88 (S.D.N.Y. 2000).

48 Id. at 588.

49 Id.

50 Brief Amicus Curiae of the American Psychological Ass’n in Support of Appellant at 11–12, People v. Gil, 674 N.Y.S.2d 651 (App. Div. 1998) (No. 10639/93), available at http://www.apa.org/about/offices/ogc/amicus/people.pdf (“Dr. McCloskey had nineteen men do exactly what appellant did—throw a bucket of plaster from a building. The results were exactly what one would expect [in light of Dr. McCloskey’s prior intuitive physics research]: seventeen of 19 men threw the bucket of cement beyond the target—some as much as 25 feet beyond—despite the fact that almost all of the respondents thought they had hit the target or fallen short. (More than half thought they had fallen short.)”).

51 Gil, 91 F. Supp. 2d at 588–89 & n.1.

52 See id. at 591 (“Nowhere in all of petitioner’s 57-page brief in support of this writ, or in oral argument before this court, or in our own independent research, were we able to find any cases that stand for the proposition that the expert testimony offered here is constitutionally mandated. On the contrary, the trial judge’s evaluation that the proposed expert testimony differs enough from the facts of this case that it might confuse the jury is precisely the sort of discretionary judgment that courts are permitted to make.”).

53 The district court considering Gil’s habeas petition emphasized that even if Gil had intended to land the bucket on the sidewalk, such behavior was still likely reckless in light of the number of people in the area with immediate access to the sidewalk. See id. at 592 (“[W]e concur with the trial and appellate courts’ evaluation that the issue of whether petitioner intended the bucket to hit the sidewalk or the street does not assist the jury in determining whether such an act is reckless. Even assuming that petitioner did in fact make sure that no one was on the sidewalk when he ‘lobbed’ the bucket, any number of scenarios could have resulted in someone from petitioner’s building or the crowd of onlookers entering the zone of danger once the bucket had left petitioner’s hands.”).

54 United States v. Libby, 461 F. Supp. 2d 3, 10 (D.D.C. 2006).

55 Id. at 18 (“Based on the foregoing, the Court cannot conclude that the defendant has satisfied his burden of establishing that the expert testimony of Dr. Bjork will be helpful to the jury. Not only are the studies offered by the defendant inapposite to the situation here, but the theories upon which Dr. Bjork would testify are not beyond the ken of the average juror.”).

56 See id. at 16 (“[E]ven if this Court could accept the proposition that these research studies support the defendant’s proposition that jurors do not have an understanding of memory errors such as the errors that allegedly occurred in this case, which it cannot do, the Court declines to accept the findings of these studies for a more basic reason—the reliability of these studies as applied to this case is questionable.”).

57 See Michael C. Dorf, Legal Indeterminacy and Institutional Design, 78 N.Y.U. L. Rev. 875, 941–42 (2003) (“Even the most enthusiastic defenders of structural reform litigation recognize that courts are at best ‘sub optimal decision makers’ in [prison reform].”). Dorf notes that “problem-solving courts,” especially some drug courts, have recognized the benefits of systematic review of case outcomes by persons with expertise in program evaluation. See id. at 939 (noting the Center for Court Innovation’s efforts to systematize monitoring of treatment providers and the potential for such monitoring regimes to ratchet up performance benchmarks nationwide).

58 734 A.2d 350 (N.J. Super. Ct. Law Div. 1996).

59 Id. at 352.

60 Id.

61 Id. at 352–54.

62 Id. at 360–61.

63 See Commonwealth v. Lora, 886 N.E.2d 688, 701 n.29 (Mass. 2008) (“The Soto decision had far-ranging effects within New Jersey. It led to a review of the law enforcement practices of the New Jersey State police, which led the New Jersey Attorney General to conclude ‘that defendants perceived to be African-American, Black or Hispanic are entitled to discovery [regarding racial profiling] for motor vehicle stops that originated as a result of observations made by [New Jersey] State Troopers.’ In 2000, the Supreme Court of New Jersey issued an administrative order, at the request of the New Jersey Attorney General, assigning one judge to hear all motions for discovery relating to racial profiling by the New Jersey State police to ensure ‘centralized judicial management’ of the rapidly emerging issue.” (alterations in original) (citations omitted) (quoting State v. Lee, 920 A.2d 80, 82, 85 (N.J. 2007))). Accordingly, this study, which was designed for social fact purposes (i.e., to assess whether the New Jersey police engaged in the practice of racial profiling that was likely to have affected the plaintiffs adversely), wound up being used as social authority.

64 Donald P. Schwab, Research Methods for Organizational Studies 303 (2d ed. 2005).

65 Id. at 301.

66A hypothesis need not be causal in nature—it may posit some other association among variables without specifying a causal relation. We focus more attention on causal hypotheses, however, as they are often the hypotheses of interest in cases. Many studies undertaken for testing purposes will lead to explanatory information, and many studies undertaken to better understand a phenomenon may involve testing of a case-specific hypothesis. But the latter need not involve testing of a hypothesis, causal or otherwise, for it may be designed to lead to alternative theories of a case or to understand how pieces of evidence fit together in the minds of the actors involved.

67 See Stephen G. West, Alternatives to Randomized Experiments, 18 Current Directions Psychol. Sci. 299, 299 (2009) (“The randomized experiment (RE) enjoys a reputation as the gold standard of research designs. When this design can be properly implemented and its assumptions are met, it enables making strong, transparent inferences about causality that are unrivaled by those produced by other designs.”). Ideally, an experiment would also involve random selection of participants, but true random selection of experimental participants is rare. Geoffrey Keppel & Sheldon Zedeck, Data Analysis for Research Designs 16 (1989). Random assignment is typically an adequate control for differences that individual participants bring to the study that could affect how they react to the experimental tests if sample sizes are sufficient. See id. at 16–17 (“Random assignment is critical to the assumption that the groups formed prior to the introduction of the experimental treatments are equal . . . .”). Individual differences of participants may be an important influence on how individuals respond to various experimental conditions, but these differences should cancel one another out with random assignment and adequate sample size so that an influence of the independent variables on the dependent variable can be detected. See id. (“If subjects can be considered equal at the outset, then any differences that occur after introduction of a treatment can be attributed to the experimenter’s intervention.”). Nevertheless, more detailed testing may reveal that individual differences interact with the experimental variables to produce different patterns of results for different groups. See, e.g., Gregory Mitchell, Why Law and Economics’ Perfect Rationality Should Not Be Traded for Behavioral Law and Economics’ Equal Incompetence, 91 Geo. L.J. 67, 140–42 (2002) (discussing sex differences in risk taking for choices framed as gains versus losses). For a discussion of how to select sample size for an experiment, see generally John A. List et al., So You Want to Run an Experiment, Now What? Some Simple Rules of Thumb for Optimal Experimental Design 7–12 (Nat’l Bureau of Econ. Research, Working Paper No. 15701, 2010).

68 See Devah Pager, The Use of Field Experiments for Studies of Employment Discrimination: Contributions, Critiques, and Directions for the Future, 609 Annals Am. Acad. Pol. & Soc. Sci. 104, 105, 125 (2007) (discussing the use of field experiments to detect patterns of discrimination). Such studies are often called “audit studies” within academic research. Id. at 109.

69 Id. at 109–10.

70 Id. at 111–12.

71 Id. at 110–11.

72 Id. at 112.

73 See id. at 109 (“In the case of employment discrimination, two main types of audit studies offer useful approaches: correspondence tests and in-person audits.”). Where a single potential defendant is the target of inquiry, multiple tester applications (in person or by mail) to different locations or managers or by a variety of matched testers will be necessary to establish a pattern. Id. at 125.

74 Squirt Co. v. Seven-Up Co., 207 U.S.P.Q. (BNA) 12, 24 (E.D. Mo. 1979).

75 Id. (internal quotation marks omitted).

76 Id. The consumers were asked, “May I see the six-pack(s) of [each brand] you bought? I know it is an inconvenience to unpack your groceries so I will give you a $3.00 gift certificate good on any purchase of $3.00 or more at this store through July 30th if you will show me the six-pack(s) you bought.” Id. (internal quotation marks omitted).

77 Newsome v. McCabe, 319 F.3d 301, 306 (7th Cir. 2003) (permitting the introduction of a chi-square analysis to calculate the probability of three eyewitnesses independently committing chance error in a lineup identification).

78 Id. at 302.

79 Id. at 305–06.

80 Id. at 306.

81 See supra notes 21–22 and accompanying text.

82 See West, supra note 67, at 302–03 (discussing the ways in which observational studies allow researchers to assess causal effect).

83 See Laura A. King, Measures and Meanings: The Use of Qualitative Data in Social and Personality Psychology, in The SAGE Handbook of Methods in Social Psychology 173, 178–89 (Carol Sansone et al. eds., 2004) (discussing considerations of framing, designing, recruiting, coding, and interpreting in qualitative research).

84 For an extended discussion of the use of qualitative data for scientific inference purposes, see King et al., supra note 40.

85 Fed. R. Evid. 702. Rule 702 was amended “in response to Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993), and to the many cases applying Daubert.” Id. advisory committee’s note.

86 See supra Part I.B (discussing how social fact studies provide descriptive information, explain the occurrence of behaviors, and test hypotheses).

87 See Daubert, 509 U.S. at 589 (“[A]ll scientific testimony or evidence . . . [must be] not only relevant, but reliable.”); id. at 591 (“Rule 702 further requires that the evidence or testimony ‘assist the trier of fact to understand the evidence or to determine a fact in issue.’ This condition goes primarily to relevance. . . . The consideration has been aptly described by Judge Becker as one of ‘fit.’ ‘Fit’ is not always obvious, and scientific validity for one purpose is not necessarily scientific validity for other, unrelated purposes.” (citation omitted) (quoting Fed. R. Evid. 702 and United States v. Downing, 753 F.2d 1224, 1242 (3d Cir. 1985))). Social-science-based evidence equivalent to what we call social facts was admitted prior to Daubert on grounds that such evidence was helpful to a jury and, sometimes, on the additional grounds that the expert used generally accepted principles or methods to reach her case-specific opinions. For example, before Daubert, statistical evidence was typically examined just for relevance and prejudice and was not barred by Frye’s general acceptance standard. See David H. Kaye et al., The New Wigmore: A Treatise on Evidence § 11.3.1, at 371 (2004) (discussing Frye v. United States, 293 F. 1013 (D.C. Cir. 1923)). And psychiatric assessments, if subjected to any special scrutiny at all, see id. § 7.8.2, at 270, usually survived scrutiny so long as generally accepted principles or procedures served as the basis for the opinion. See, e.g., United States v. Gould, 741 F.2d 45, 49–50 (4th Cir. 1984) (“[W]henever an insanity defense is sought to be raised by the proffer of evidence of a newly-identified mental ‘disease or defect,’ the proffered evidence is relevant for the purpose only if there is shown to be substantial acceptance within the relevant discipline of the general hypothesis that the disorder may deprive some persons of the substantial capacity either to appreciate the wrongfulness of the particular conduct in issue or to conform their conduct to the particular requirements of law in issue.”), superseded by statute, Insanity Defense Reform Act, Pub. L. No. 98-473, 98 Stat. 2057 (1984), as recognized in United States v. Worrell, 313 F.3d 867, 872 (4th Cir. 2002). After Daubert, the admission of properly performed social fact studies should not be in question on scientific reliability or fit grounds, even if the social fact study involves a novel application of social scientific principles or methods. Such an approach could well be barred under the Frye standard, however. See, e.g., Commonwealth v. Crews, 640 A.2d 395, 402 (Pa. 1994) (“[I]t is the conclusions to be drawn from the statistical information accumulated to date regarding DNA matches that has not achieved widespread acceptance within the scientific community. . . . Therefore the trial court properly refused to entertain statistical information regarding the match. The expert, however, was permitted to testify that the match of three out of four loci made it more probable than not that the sperm was that of the defendant. This type of expert opinion testimony does not violate Frye, and thus was properly admitted.”).

88 See, e.g., John Brewer & Albert Hunter, Multimethod Research 158 (Sage Library of Soc. Research 175, 1989) (“Research to generate and test causal hypotheses is usually judged in terms of two standards: internal and external validity.”).

89 Monahan & Walker, supra note 2, at 68; accord Thomas D. Cook & Donald T. Campbell, Quasi-Experimentation: Design & Analysis Issues for Field Settings 37 (1979) (“Internal validity refers to the approximate validity with which we infer that a relationship between two variables is causal or that the absence of a relationship implies the absence of cause.”); Robert M. Lawless et al., Empirical Methods in Law 36 (2010) (“Internal validity refers specifically to the extent to which the research design allows the drawing of valid inferences about the relationships between variables.”).

90 Monahan & Walker, supra note 2, at 68–69; accord Cook & Campbell, supra note 89, at 37 (“External validity refers to the approximate validity with which we can infer that the presumed causal relationship can be generalized to and across alternate measures of the cause and effect and across different types of persons, settings, and times.”).

91 For example, although intentional and unintentional destruction of records is a concern for any social fact study, the problem of missing data is not new to social science research. See Daniel A. Newman, Missing Data Techniques and Low Response Rates: The Role of Systematic Nonresponse Parameters, in Statistical and Methodological Myths and Urban Legends: Doctrine, Verity and Fable in the Organizational and Social Sciences 7, 8 (Charles E. Lance & Robert J. Vandenberg eds., 2009) (noting the difficulties faced by a data analyst when sampled individuals do not respond to a survey or survey item). A variety of methods exist for dealing with this problem and estimating the impact of the missing data on study conclusions. See id. at 12 tbl.1.2 (identifying the three levels of missing data and the corresponding data-analytic methods for handling each).

92 For such a discussion, see Monahan & Walker, supra note 2, at 68–72.

93 Our focus here is on issues that may affect the internal validity of a social fact. In Part III below, we deal with questions of legal access to the data needed for a social fact study and the ethics of conducting social fact studies.

94 See, e.g., Samuel R. Gross, Expert Evidence, 1991 Wis. L. Rev. 1113, 1231 (“Expert testimony is a sizeable cottage industry that is geared entirely to provide effective partisan evidence.”).

95 Of course, many researchers expect their academic work to have applied purposes and undertake academic research to try to influence public policy debates. Thus, the academic-purpose/litigation-purpose distinction may adhere to an idealistic view of pure scientific research that is hard to find in practice. See Daubert v. Merrell Dow Pharm., Inc., 43 F.3d 1311, 1317 (9th Cir. 1995) (“One very significant fact to be considered is whether the experts are proposing to testify about matters growing naturally and directly out of research they have conducted independent of the litigation, or whether they have developed their opinions expressly for purposes of testifying. That an expert testifies for money does not necessarily cast doubt on the reliability of his testimony, as few experts appear in court merely as an eleemosynary gesture. But in determining whether proposed expert testimony amounts to good science, we may not ignore the fact that a scientist’s normal workplace is the lab or the field, not the courtroom or the lawyer’s office. That an expert testifies based on research he has conducted independent of the litigation provides important, objective proof that the research comports with the dictates of good science.” (footnote omitted)).

96 See Gregory Mitchell, Empirical Legal Scholarship as Scientific Dialogue, 83 N.C. L. Rev. 167, 176 (2004) (noting the ability of intersubjective testing and review to increase objectivity of empirical research).

97 See Monahan et al., Limits of Social Frameworks, supra note 5, at 315–17 (discussing the scientific requirement that the bases for a researcher’s inferences be made public).

98 No. CIV.A. 99-C-3356, 2002 WL 31061088, at *1–3 (N.D. Ill. Sept. 17, 2002).

99 Id. at *4–5.

100 Id. at *9 (“[T]he inclusion of a large number of class members in the survey appears to have strongly influenced the overall results, which further supports the defendant’s position that the survey data do not reliably reflect the views or experiences of the overall population of relevant employees.”).

101 See, e.g., Jerry M. Newman, Discrimination in Recruitment: An Empirical Analysis, 32 Indus. & Lab. Rel. Rev. 15, 17 (1978) (distributing fictitious résumés to employers unaware of the experiment to assess racial dimensions of their employment decisions).

102 See, e.g., Vita-Mix Corp. v. Basic Holding, Inc., 581 F.3d 1317, 1325 (Fed. Cir. 2009) (double-blind study of blend users with respect to their manner of use of stir stick in patent infringement case); Marlo v. UPS, Inc., No. CV 03-04336DDP(RZX), 2005 WL 6197774, at *10 (C.D. Cal. Mar. 1, 2005) (double-blind survey of employees regarding their duties in wage-and-hour case).

103 See, e.g., Brad J. Bushman & Angelica M. Bonacci, You’ve Got Mail: Using E-mail to Examine the Effect of Prejudiced Attitudes on Discrimination Against Arabs, 40 J. Experimental Soc. Psychol. 753, 758 (2004) (“The current study used a novel procedure, the lost e-mail technique, to demonstrate that prejudiced individuals discriminate against Arabs when they can remain anonymous.”).

104 See, e.g., Samuel L. Gaertner et al., Race of Victim, Nonresponsive Bystanders, and Helping Behavior, 117 J. Soc. Psychol. 69, 73–77 (1982) (assessing how a victim’s race and the presence of bystanders impacted not just the decision to help, but also latency and heart rate measures).

105 No. C 05-2320 SBA, 2007 WL 2408872, at *1–2 (N.D. Cal. Aug. 21, 2007), rev’d, 319 F. App’x 688 (9th Cir. 2009).

106 Id. at *8. The plaintiff challenged the study on grounds that it did not examine the activities of the actual class members (which is an external validity challenge on the basis of participants’ characteristics, a topic we address below), but the court rejected this challenge: “FedEx argues, and Whiteway does not effectively rebut, that there is no operational/functional difference between the centers in California and the centers in other western states surveyed.” Id. (citation omitted).

107 See, e.g., id. at *9 (“[T]here remains no evidence[] that . . . the job duties/responsibilities of any Center Manager . . . are any different than another.”).

108 See, e.g., Cara Laney et al., The Red Herring Technique: A Methodological Response to the Problem of Demand Characteristics, 72 Psychol. Res. 362, 364 (2008) (“The Red Herring technique allows naturally curious subjects to ‘figure out’ what the study is about without actually figuring out what the study is about (and thus becoming subject to demand). . . . [I]t is applicable to a wide range of studies in psychology, especially those involving deception.”); id. (“The Red Herring is an extra layer of information (separate from both what subjects were told and what we were actually studying), intended to provide a plausible explanation for the tasks subjects are asked to complete. That is, we are doubly deceiving subjects.”).

109 Tuli v. Brigham & Women’s Hosp., Inc., 592 F. Supp. 2d 208, 214 (D. Mass. 2009) (“While one could attempt to perform [case-specific] tests, their scientific integrity would be fatally compromised when conducted within the context of a lawsuit against those individuals or the corporation that employs them. Social scientific research into the basic principles of sex stereotyping normally involves voluntary participants who are assured (and can rely on these assurances) of the complete anonymity and confidentiality of their responses. It is unlikely that researchers could obtain candid and uncensored self-reports of attitudes from employees who are aware that the research is related to a pending lawsuit against the organization that employs them. Thus, concerns about scientific validity . . . do not recommend mounting an organizational investigation using standard social science techniques.” (quoting expert report of Dr. Peter Glick)); Jennifer S. Hunt et al., Scientific Status, in 2 Modern Scientific Evidence: The Law and Science of Expert Testimony § 18:14 (David L. Faigman et al. eds., 2008) (“In addition, the argument that contract research is the appropriate method of determining whether gender stereotyping occurred ignores several important limitations of conducting such research. One serious problem with the contract approach is that there is no way of determining the extent to which employees’ responses will be tainted by knowledge of the litigation underway, the sponsors of the survey, or the potential ramifications of their responses. It is likely that employees will try to give unbiased responses, even if they are not accurate. Moreover, research indicates that gender stereotyping often occurs outside of conscious awareness, so even if employees are completely honest, their responses may not reveal the actual occurrence of gender stereotyping.” (footnote omitted)).

110 See, e.g., Austin Lee Nichols & Jon K. Maner, The Good-Subject Effect: Investigating Participant Demand Characteristics, 135 J. Gen. Psychol. 151, 151 (2008) (“Researchers are often concerned with the presence of demand characteristics, cues that make participants aware of what the experimenter expects to find or how participants are expected to behave, and the researchers typically use methods for reducing the demand.”).

111 See supra Part I.B.1–2.

112 See supra Part I.B.1–2.

113 See, e.g., Lawless et al., supra note 89, at 410 (defining external validity as the extent to which findings can be generalized to different people, settings, times, and measures); Marilynn B. Brewer, Research Design and Issues of Validity, in Handbook of Research Methods in Social and Personality Psychology 3, 4 (Harry T. Reis & Charles M. Judd eds., 2000) (“External validity . . . refer[s] to the generalizability of the causal finding, that is, whether it can be concluded that the same cause–effect relationship would be obtained across different subjects, settings, and methods.”); T. D. Cook, Generalization: Conceptions in the Social Sciences, in 9 International Encyclopedia of the Social & Behavioral Sciences 6037, 6037 (Neil J. Smelser & Paul B. Baltes et al. eds., 2001) (“[S]ocial scientists typically draw conclusions about four not completely independent entities—human populations, physical settings, causes, and observables . . . . Time, understood as historical period, might also be added.”).

114 See Cook, supra note 113, at 6037 (noting the threats to external validity posed by extrapolating findings to different age groups, geographic areas, and institutional settings).

115 See, e.g., Cnty. of Kenosha v. C & S Mgmt., Inc., 588 N.W.2d 236, 253–54 (Wis. 1999) (excluding survey of community views in an obscenity trial where “the innocuous description of the types of activities the survey respondent was to consider [wa]s too far removed from the graphic scenes of sexual activity in [the videotape in question] to be relevant on the question of whether that particular video is obscene”).

116 In most such cases, sex will be treated as a dichotomous independent variable (i.e., with the categories male versus female), but in cases where failure to conform to gender stereotypes is the basis for a sex discrimination claim, degree of fit with gender stereotypes might be the proper independent variable.

117 An expert may choose to test only one of the independent variables, in which case the study will be externally valid but its statistical conclusion and internal validities are threatened by the failure to control for a third variable that could explain or affect the covariation between the independent and dependent variables studied.

118 See supra notes 77–80 and accompanying text.

119 Newsome v. McCabe, 319 F.3d 301, 306 (7th Cir. 2003).

120 See Siegrun D. Kane, Trademark Law § 16:6, at 16-14 (PLI Intellectual Prop. Law Library No. G1-8804, 4th ed. 2006) (“The judicial attitude toward surveys has moved 180 degrees in past decades. While many courts used to exclude survey evidence as inadmissible hearsay or give it little weight, surveys are now generally looked on as significant evidence.”); supra note 16 (discussing the use of surveys in trademark cases).

121 Squirt Co. v. Seven-Up Co., 207 U.S.P.Q. (BNA) 12, 19 (E.D. Mo. 1979).

122 Framing the question properly can be seen as a problem of both external validity and construct validity. “Construct validity involves making inferences from the sampling particulars of a study to the higher-order constructs they represent.” William R. Shadish et al., Experimental and Quasi-Experimental Designs for Generalized Causal Inference 65 (Kathi Prancan ed., 2002). “Construct validity and external validity are related to each other in two ways. First, both are generalizations. . . . Second, valid knowledge of the constructs that are involved in a study can shed light on external validity questions, especially if a well-developed theory exists that describes how various constructs and instances are related to each other.” Id. at 93.

123 6 J. Thomas McCarthy, McCarthy on Trademarks and Unfair Competition § 32:176, at 32-375 to -376 (4th ed. 2008) (internal quotation marks omitted).

124 In re Ferrero, 479 F.2d 1395, 1397 (C.C.P.A. 1973).

125 Harvey Cartoons v. Columbia Pictures Indus., Inc., 645 F. Supp. 1564, 1573 (S.D.N.Y. 1986).

126 McCarthy, supra note 123, § 32:170, at 32-351; accord Starter Corp. v. Converse, Inc., 170 F.3d 286, 297 (2d Cir. 1999) (holding that it was proper for district court to exclude research because the survey questions were “little more than a memory test” and, therefore, were not probative of the likelihood of confusion); J & J Snack Foods, Corp. v. Earthgrains Co., 220 F. Supp. 2d 358, 370 (D.N.J. 2002) (“Above all, the survey’s design must fit the issue which is to be decided by the jury, and not some inaccurate restatement of the issue, lest the survey findings inject confusion or inappropriate definitions into evidence, confounding rather than assisting the jury.”); Franklin Res., Inc. v. Franklin Credit Mgmt. Corp., 988 F. Supp. 322, 335 (S.D.N.Y. 1997) (“Surveys which do nothing more than demonstrate the respondents’ ability to read are not probative on the issue of likelihood of consumer confusion.”); Jacob Jacoby, Survey and Field Experimental Evidence, in The Psychology of Evidence and Trial Procedure 175, 186 (Saul M. Kassin & Lawrence S. Wrightsman eds., 1985) (“[C]ourts raise two points regarding the questions posed to respondents: (1) Do these questions address the legal issues that are relevant to the case? (2) If so, are the questions posed in a clear and unbiased manner?”).

127 See, e.g., Sizes Unlimited, Inc. v. Sizes to Fit, Inc., 871 F. Supp. 1558, 1561 (E.D.N.Y. 1994) (“Consumers and potential consumers of a product must, on the basis of the mark at issue, associate the goods or services at issue with a single source, even if that source is anonymous.”).

128 McCarthy, supra note 123, § 32:159, at 32-319.

129 Id.; accord Robert C. Bird, Streamlining Consumer Survey Analysis: An Examination of the Concept of Universe in Consumer Surveys Offered in Intellectual Property Litigation, 88 Trademark Rep. 269, 276–77 (1998) (“Determination of the universe represents one of the most significant challenges a survey expert will face in drafting a consumer survey. A misaligned universe can doom otherwise competent research and trigger an adverse decision by the court.”); Shari Seidman Diamond, Survey Research, in 1 Modern Scientific Evidence, supra note 109, § 8:10 (“One of the first steps in designing a survey or in deciding whether an existing survey is relevant is to identify the target population (or universe).”); Lawrence E. Evans, Jr. & David M. Gunn, Trademark Surveys, 79 Trademark Rep. 1, 31 (1989) (“Errors in [selecting the universe] are more likely to prove fatal than errors in the content of the questions, for there is some value in a slanted question asked of the right witness, but no value in asking the right question of the wrong witness.”); Jacoby, supra note 126, at 179–80 (“It has become axiomatic in trademark case law that the key consideration in the design of a survey is whether the appropriate universe was tested. More surveys are held inadmissible or given no weight for having employed an improper universe than for any other reason.” (citations omitted)).

130 See McCarthy, supra note 123, § 23:5, at 23-23 (“[C]ustomers may be consumers, professional purchasers or wholesalers or retailers. A potential customer is one who might someday purchase this kind of product or service and pays attention to brands in that market.” (footnote omitted)).

131 Brunswick Corp. v. Spinit Reel Co., 832 F.2d 513, 523 n.6 (10th Cir. 1987).

132 Pharmacia Corp. v. Alcon Labs., Inc., 201 F. Supp. 2d 335, 365 (D.N.J. 2002).

133 216 F. Supp. 670, 681 (S.D.N.Y. 1963).

134 615 F.2d 252, 264 (5th Cir. 1980).

135 Id.

136 Id.

137 See Robert A. Peterson, On the Use of College Students in Social Science Research: Insights from a Second-Order Meta-Analysis, 28 J. Consumer Res. 450, 458 (2001) (“[T]he present research suggests that, by relying on college student subjects, researchers may be constrained regarding what might be learned about consumer behavior and in certain instances may even be misinformed.”).

138 On differences in identifications by race, see Christian A. Meissner & John C. Brigham, Thirty Years of Investigating the Own-Race Bias in Memory for Faces: A Meta-Analytic Review, 7 Psychol. Pub. Pol’y & L. 3, 21 (2001).

139 See supra note 6 and accompanying text.

140 See, e.g., Randall A. Gordon & Richard D. Arvey, Age Bias in Laboratory and Field Settings: A Meta-Analytic Investigation, 34 J. Applied Soc. Psychol. 468, 485–86 (2004) (“For the most part, the results from analyses that examined the issue of generalizability show that greater and more relevant information and greater and more relevant experience among raters, judges, or supervisors leads to less age bias.”); Cynthia M. Marlowe et al., Gender and Attractiveness Biases in Hiring Decisions: Are More Experienced Managers Less Biased?, 81 J. Applied Psychol. 11, 18 (1996) (“Managers of all experience levels exhibited bias in the rating conditions, despite the fact that all of our applicant photographs were rated as being at least somewhat attractive. . . . These biases tended to decrease as managerial experience increased, except that less attractive women were routinely judged to be the worst applicants.”); Dianna L. Stone et al., Methodological Problems Associated with Research on Unfair Discrimination Against Racial Minorities, 18 Hum. Resource Mgmt. Rev. 243, 251 (2008) (“Interestingly, an analysis of the average effect size estimates revealed that studies using non-representative samples had a larger effect size (r = .24) than those using representative samples (r = .14). These results suggest that race may have less of an effect on personnel decisions in actual organizational settings than in contrived settings.”).

141 See, e.g., supra notes 58–62 and accompanying text.

142 McCarthy, supra note 123, § 32:163, at 32-333; accord Kane, supra note 120, § 16:6.1, at 16-20 (“The more remote the survey is from actual marketplace conditions, the less persuasive it will be.”).

143 See supra notes 74–76 and accompanying text.

144 Squirt Co. v. Seven-Up Co., 207 U.S.P.Q. (BNA) 12, 25–26 (E.D. Mo. 1979).

145 Am. Luggage Works, Inc. v. U.S. Trunk Co., 158 F. Supp. 50, 53 (D. Mass. 1957).

146 See McCarthy, supra note 123, § 32:163, at 32-334 (“To require that a survey be taken ‘during the buying decision’ is an impossible requirement tantamount to rejecting all survey evidence.”).

147 Zippo Mfg. Co. v. Rogers Imports, Inc., 216 F. Supp. 670, 685 (S.D.N.Y. 1963) (footnotes omitted).

148 See, e.g., Frank J. Landy, Stereotypes, Bias, and Personnel Decisions: Strange and Stranger, 1 Indus. & Organizational Psychol. 379, 380, 384 (2008) (noting the impact of adequate safeguards and individuating information on workplace discrimination); Philip E. Tetlock et al., The Challenge of Debiasing Personnel Decisions: Avoiding Both Under- and Overcorrection, 1 Indus. & Organizational Psychol. 439, 440 (2008) (finding empirical and theoretical support for the proposition that accountability pressures will push decisionmakers to value individuating over implicit biases). See generally Philip E. Tetlock & Gregory Mitchell, Implicit Bias and Accountability Systems: What Must Organizations Do to Prevent Discrimination?, 29 Res. Organizational Behav. 3, 11–18 (2009) (critiquing the argument that equal opportunity is impossible in a society with inequality of result).

149 For example, within the United States, racial biases are correlated with geographic region, income levels, and educational attainment. See, e.g., Peter Burns & James G. Gimpel, Economic Insecurity, Prejudicial Stereotypes, and Public Opinion on Immigration Policy, 115 Pol. Sci. Q. 201, 212–17 (2000) (discussing the impact of contextual and personal factors on racial attitudes in 1992 and 1996); cf. Nakajima v. Gen. Motors Corp., 857 F. Supp. 100, 105 (D.D.C. 1994) (“Where, as here, the expert’s opinion is based on an incorrect assumption about the country in which a plaintiff will reside, the testimony should not be permitted because it fails to serve its purpose of aiding the trier of fact in its determination of lost future earnings.”); id. (“Additionally, plaintiff’s contention that the use of United States’ statistical and economic data is necessary because comparable Japanese data is not available is not supported by the record. A review of [an expert’s] deposition and the Year Book of Labour Statistics, published by the Japanese Ministry of Labour, shows that adequate Japanese data on the factors considered under District of Columbia law exists. Therefore, the testimony of [the expert], insofar as it is based on a presumption of Nakajima’s future residence in the United States, is excluded.” (citation omitted)).

150 For instance, Title VII provides that charges of discrimination should be submitted to the EEOC within 180 days of the alleged discrimination in states with no state agency devoted to handling discrimination claims or, in states with such agencies, within 300 days of the alleged discrimination or 30 days of termination of state agency proceedings (whichever date is earlier). 42 U.S.C. § 2000e–5(e)(1) (2006). Considerable time may pass, however, before the EEOC decides whether to pursue the case or issues a right-to-sue letter or between the start of a suit and involvement of an expert. Plus, the continuing violation theory of discrimination may further widen the gap between the date of alleged discrimination and the filing of suit, or the plaintiff may seek to introduce evidence of discriminatory conduct outside the limitations period to support her case. See Kyle Graham, The Continuing Violations Doctrine, 43 Gonz. L. Rev. 271, 275 (2007/08) (“The first sort of continuing violation aggregates multiple allegedly wrongful acts, failures to act, or decisions such that the limitations period begins to run on this collected malfeasance only when the defendant ceases its improper conduct. The second type of continuing violation divides what might otherwise represent a single, time-barred cause of action into several separate claims, at least one of which accrues within the limitations period prior to suit.” (footnote omitted)). Thus, where there are concerns about memory or changes due to intervening events, the researcher should take these concerns into account and seek to examine the impact of this passage of time on any newly collected data. See Barbara A. Gutek, My Experience as an Expert Witness in Sex Discrimination and Sexual Harassment Litigation, in Sex Discrimination in the Workplace, supra note 45, at 131, 137 (“But given the long time frame of this case and many other class actions, it is important to find out when relevant behaviors occurred; surely it makes a difference if all objectionable behavior occurred more than five years ago or if the amount of potentially harassing behavior increased over time.”).

151 Monahan & Walker, supra note 2, at 74. Research has found that the likelihood of finding age bias in personnel decisions decreased over the last generation. Gordon & Arvey, supra note 140, at 479. Likewise, gender and racial attitudes have liberalized considerably over the last fifty years. See, e.g., Lawrence D. Bobo & Camille Z. Charles, Race in the American Mind: From the Moynihan Report to the Obama Candidacy, 621 Annals Am. Acad. Pol. & Soc. Sci. 243, 245 (2009) (“Overall, . . . these improvements in whites’ racial attitudes are sweeping and robust . . . .”); Clem Brooks & Catherine Bolzendahl, The Transformation of US Gender Role Attitudes: Cohort Replacement, Social-Structural Change, and Ideological Learning, 33 Soc. Sci. Res. 106, 107 (2004) (“Highly restrictive attitudes, characterized by negative beliefs about women in non-domestic roles, an unwillingness to support women’s rights across a wide range of institutions, and a tendency to endorse gender-based differences in power and responsibility have evolved into seemingly more liberal attitudes.”).

152 As an example of an important societal change, consider the impact of 9/11 on study outcomes: if a Muslim brought a religious discrimination case arising from events occurring before September 11, 2001, we might be concerned about the external validity of a study into the impact of religious affiliation on employment outcomes conducted after September 11, 2001.

153 See Ashcroft v. Iqbal, 129 S. Ct. 1937, 1949 (2009) (“To survive a motion to dismiss, a complaint must contain sufficient factual matter, accepted as true, to ‘state a claim to relief that is plausible on its face.’ A claim has facial plausibility when the plaintiff pleads factual content that allows the court to draw the reasonable inference that the defendant is liable for the misconduct alleged.” (citation omitted) (quoting Bell Atlantic Corp. v. Twombly, 550 U.S. 544, 570 (2007))); Twombly, 550 U.S. at 562–63 (“Conley’s ‘no set of facts’ language has been questioned, criticized, and explained away long enough. . . . [A]fter puzzling the profession for 50 years, this famous observation has earned its retirement. The phrase is best forgotten as an incomplete, negative gloss on an accepted pleading standard: once a claim has been stated adequately, it may be supported by showing any set of facts consistent with the allegations in the complaint.”).

154 See supra notes 58–62 and accompanying text.

155 One can, however, imagine unobtrusive observational studies providing useful data in important types of cases, particularly wage-and-hour class actions, where observers could randomly and systematically sample the behavior of different types of workers in organizations permitting public access (e.g., restaurants, retail stores).

156 See supra notes 74–76 and accompanying text.

157 In a related vein, the FTC could also utilize such studies to investigate claims of false or deceptive advertising, and reputational damage caused to high-profile figures by alleged defamatory statements could be assessed prefiling using surveys and structured interviews and market analysis in the case of celebrities. As noted below, such prefiling studies on third parties may present discovery issues. See infra notes 168–77. However, to the extent a consulting expert is used to conduct the study in anticipation of litigation, the work product doctrine will provide some insulation from discovery of unfavorable study results or “false starts” that have to be scrapped.

158 No. 81-532-Z, 1982 U.S. Dist. LEXIS 16667, at *2 (D. Mass. Oct. 14, 1982). The tort of misrepresentation in Massachusetts requires: “First, the tortfeasor must make a false statement which he knows to be false. Second, he must intend to incude [sic] his victim to rely on that statement. Third, his victim must, in fact, so rely.” Id. at *1–2. Under Havens Realty Corp. v. Coleman, testers are defined as “individuals who, without an intent to rent or purchase a home or apartment, pose as renters or purchasers for the purpose of collecting evidence of unlawful steering practices.” 455 U.S. 363, 373 (1982).

159 Education/Instruccion, 1982 U.S. Dist. LEXIS 16667, at *2.

160 Robert Thomas Roos, Note, No Harm, No Fraud: The Invalidity of State Fraud Claims Brought Against Employment Testers, 53 Vand. L. Rev. 1687, 1692–93 (2000) (discussing Kyles v. J.K. Guardian Sec. Servs., Inc., 222 F.3d 289 (7th Cir. 2000)).

161 Id. at 1693.

162 Havens, 455 U.S. at 373–74; see also Kyles, 222 F.3d at 292 (noting that testers “have been used for years to assess compliance with the nation’s fair housing laws”); id. at 299 (“The fact that testers have no interest in a job does not diminish the deterrent role they play by filing suit under Title VII. In that regard, testers are situated similarly to unlawfully discharged employees who are ineligible for reinstatement because of wrongdoing discovered after they were fired. Evidence of such wrongdoing limits the relief they may obtain under Title VII, but it does not bar them from bringing suit.”); Molovinsky v. Fair Emp’t Council of Greater Washington, Inc., 683 A.2d 142, 146 (D.C. 1996) (“Violation of a plaintiff’s statutory rights may itself constitute an ‘actual or threatened injury’ sufficient to confer Article III standing.” (quoting Havens, 455 U.S. at 373)). But see Fair Emp’t Council of Greater Washington, Inc. v. BMC Mktg. Corp., 28 F.3d 1268, 1274 (D.C. Cir. 1994) (“[T]he facts as alleged in the complaint do not come close to indicating that either tester ‘will again be subjected to the alleged illegality’. The tester plaintiffs therefore lack standing to seek prospective relief.” (citation omitted) (quoting City of L.A. v. Lyons, 461 U.S. 95, 109 (1983))).

163 42 U.S.C. § 2000e–5(b) (2006).

164 EEOC Compliance Manual (BNA) No. 915.002 (May 22, 1996), available at http://www.eeoc.gov/policy/docs/testers.html.

165 Model Rules of Prof’l Conduct R. 4.2 cmt. 7 (1983).

166 See id. R. 4.2 cmt. 8 (“The prohibition on communications with a represented person only applies in circumstances where the lawyer knows that the person is in fact represented in the matter to be discussed. This means that the lawyer has actual knowledge of the fact of the representation; but such actual knowledge may be inferred from the circumstances.”).

167 See Constance Weisner et al., Substance Use, Symptom, and Employment Outcomes of Persons with a Workplace Mandate for Chemical Dependency Treatment, 60 Psychiatric Services 646 (2009) (noting that employer mandates pressure individuals to enter chemical dependency treatment programs).

168 186 F.R.D. 271, 275 (D. Conn. 1999).

169 Id. at 273.

170 Id. at 276.

171 Id. at 277.

172 Id. The court ruled that there could be no retroactive application of the attorney–client privilege. Id.

173 Id. at 276–77.

174 Id. at 276 (quoting Fed. R. Civ. P. 26(b)(3)(A)(ii)) (internal quotation marks omitted).

175 Id. at 277. The court also ruled that “[a]ny claim of work product as to the blank questionnaire itself was waived by plaintiffs’ attorney when the questionnaire was sent out to third party former employees.” Id.

176 See, e.g., Paul E. Starkman, Tips and Traps for the Unwary When Auditing and Measuring the Effectiveness of Corporate Compliance and Ethics Programs: An Outside Counsel’s Perspective, in Corporate Compliance and Ethics Institute 2009, at 695, 708–09 (PLI Corporate Law & Practice, Course Handbook Ser. No. 18176, 2009) (discussing ways to protect audit results from discovery, and suggesting that the “self-evaluative” privilege, the attorney–client privilege, or the work product privilege may protect results if audits were performed in anticipation of litigation).

177 Accord Greg Mitchell, Good Scholarly Intentions Do Not Guarantee Good Policy, 95 Va. L. Rev. Brief 109, 115 (2010), http://www.virginialawreview.org/inbrief/2010/02/28/mitchell.pdf (“Companies are in the best position to detect and correct workforce disparities. We need to create a set of legal rules and policies that reward internal monitoring and self-correction and that penalize deliberate ignorance about the discriminatory impact of a company’s personnel policies.”).

178 See Fed. R. Civ. P. 26(a)(1)(A)(ii) (describing a party’s duty to disclose).

179 Id. 34(a)(2) (emphasis added).

180 See, e.g., Coleman v. Schwarzenegger, Nos. CIV S-90-0520 LKK JFM P, C01-1351 TEH, 2007 WL 3231706, at *4 (E.D. Cal. & N.D. Cal. Oct. 30, 2007) (allowing experts to enter a prison to confer with and interview staff during on-site inspection, as long as staff members were “in the ordinary course of business” and “reasonably available”); Morales v. Turman, 59 F.R.D. 157, 159 (E.D. Tex. 1972) (allowing experts in a participant observation study to speak with inmates and staff of a prison to “observe the operations of special treatment centers and other locations where inmates are incarcerated”). Observational studies may be useful in wage-and-hour lawsuits. See, e.g., Whiteway v. FedEx Kinkos Office & Print Servs., Inc., No. C 05-2320 SBA, 2010 WL 1980229, at *4 (N.D. Cal. May 17, 2010) (admitting a survey showing that defendant’s other Center Managers met the company’s expectations); Sepulveda v. Wal-Mart Stores, Inc., 237 F.R.D. 229, 236 (C.D. Cal. 2006) (admitting a survey based, in part, on the observation of eighteen assistant managers in different California stores), aff’d in part, rev’d in part, 275 F. App’x 672 (9th Cir. 2008).

181 Rule 33 interrogatories could also serve as a foundation for survey requests to defendant-employees, but relief from the numerical limits under Rule 33 would likely be necessary. See Fed. R. Civ. P. 33(a)(1) (“Unless otherwise stipulated or ordered by the court, a party may serve on any other party no more than 25 written interrogatories . . . .”).

182 Id. 35(a)(1).

183 See, e.g., Schlagenhauf v. Holder, 379 U.S. 104, 114–22 (1964) (suggesting that Rule 35 may extend to a defendant asserting his mental condition in defense of a claim).

184 Fed. R. Civ. P. 35(a).

185 Koch v. Cox, 489 F.3d 384, 391 (D.C. Cir. 2007).

186 Schlagenhauf, 379 U.S. at 119; see also Lowe v. Phila. Newspapers, Inc., 101 F.R.D. 296, 299 (E.D. Pa. 1983) (“Good cause has been shown under Rule 35(a) for such an examination. Plaintiff’s emotional and mental state of health has clearly been put in issue by plaintiff.”); Brandenberg v. El Al Isr. Airlines, 79 F.R.D. 543, 546 (S.D.N.Y. 1978) (“In view of the allegations of injury and damage in the complaint, a psychiatric examination of the plaintiff under Rule 35(a) is clearly appropriate.”).

187 Fed. R. Civ. P. 35(a)(1) (“The court . . . may order a party . . . .” (emphasis added)).

188 See Schlagenhauf, 379 U.S. at 115 n.12 (“Although petitioner was an agent of [the defendant], he was himself a party to the action. He is to be distinguished from one who is not a party but is, for example, merely the agent of a party.”); Kropp v. Gen. Dynamics Corp., 202 F. Supp. 207, 208 (E.D. Mich. 1962) (holding that the court lacked jurisdiction to compel a truck driver, a nonparty and agent of corporate defendant, to submit to a physical examination under Rule 35(a)).

189 114 F.2d 479, 481 (D.C. Cir. 1940).

190 144 F. Supp. 880, 882 (W.D. Pa. 1956).

191 Where parties conduct social fact studies on themselves or third parties postfiling, discovery concerns should be greatly reduced so long as the study was conducted by a consulting expert for trial preparation purposes. See Fed. R. Civ. P. 26(b)(4)(B) (“Ordinarily, a party may not . . . discover facts known or opinions held by an expert who has been retained or specially employed by another party in anticipation of litigation . . . .”).

192 See Fleming James, Jr., Discovery, 38 Yale L.J. 746, 746 (1929) (“Equity would . . . entertain jurisdiction over a bill of discovery in aid of an intended action at law or defense therein, or a defense in an equity suit.”).

193 See generally id. at 746–49 (discussing the history of the equitable bill of discovery).

194 Id. at 746.

195 Id.

196 152 F. 372, 375 (3d Cir. 1906) (quoting 2 Joseph Story, Story’s Equity Jurisprudence § 1488 (Fred B. Rothman & Co. 1988) (1866)).

197 Jack H. Friedenthal et al., Civil Procedure § 7.1, at 397 (4th ed. 2005).

198 See id. (“As one scholar has noted, broad discovery has transcended its role as a ‘mere procedural rule’ and become the cornerstone of American civil litigation.” (quoting Geoffrey C. Hazard, Jr., From Whom No Secrets Are Hid, 76 Tex. L. Rev. 1665, 1694 (1998))).

199 See Stephen N. Subrin, Fishing Expeditions Allowed: The Historical Background of the 1938 Federal Discovery Rules, 39 B.C. L. Rev. 691, 697 (1998) (“It is probable that no procedural process offers greater opportunities for increasing the efficiency of the administration of justice than that of discovery before trial.” (quoting Edson R. Sunderland, Foreword to George Ragland, Jr., Discovery Before Trial, at iii, iii (1932))).

200 Id. at 691 (alterations in original) (quoting 329 U.S. 495, 507 (1947)).

201 E.g., Beale v. District of Columbia, 545 F. Supp. 2d 8, 15 (D.D.C. 2008) (“The histories, vagaries and progress of each case are unique, and the judge managing discovery is in the best position to weigh the equities.”).

202 See, e.g., Fed. R. Civ. P. 26(g)(1) (“By signing, . . . [a] party certifies that[,] . . . with respect to a disclosure, it is complete and correct as of the time it is made . . . .”); id. 37(a)(4) (“[A]n evasive or incomplete disclosure, answer, or response must be treated as a failure to disclose, answer, or respond.”).

203 Id. 26(e)(1)(A).

204 See Advisory Comm. on Experimentation in the Law, Fed. Judicial Ctr., Experimentation in the Law 42–47 (1981). The Committee’s report was focused on the use of programmatic experiments by courts for rule-making purposes. Id. at 3. The potential harms associated with assigning litigants to alternative rule regimes are likely to be much greater than the potential harms to participants in any of the social fact studies discussed in this Article. Accord Pager, supra note 68, at 127 (“Implicit in [the court rulings permitting the use of tester studies] . . . is the belief that the misrepresentation involved in testing is worth the unique benefit this practice can provide in uncovering discrimination and enforcing civil rights laws.”).

205 See Advisory Comm. on Experimentation in the Law, supra note 204, at 118–21 (identifying procedural and statistical methods that protect the privacy of individual subjects).

206 See Fed. R. Civ. P. 26(c)(1) (listing the circumstances in which a court may issue a protective order). Rule 26 gives trial judges broad power to regulate the discovery process through protective orders, including protections extended to nonparties from whom information is sought. See id. (“The court may, for good cause, issue an order to protect a party or person . . . .” (emphasis added)).

207 Melissa Hart & Paul M. Secunda, A Matter of Context: Social Framework Evidence in Employment Discrimination Class Actions, 78 Fordham L. Rev. 37, 53 (2009). Professors Hart and Secunda provide no support for their characterization of social fact studies as expensive or time consuming and provide no information about the relative costs of social fact studies versus “social framework analysis.” We see no a priori reason why social fact studies will be more burdensome on time or money dimensions. The cost of each approach depends on the scope of the study or analysis undertaken. For example, a systematic coding of case records, wed with a quantitative analysis, could be conducted more reliably and perhaps more efficiently than a “social framework analysis,” depending on who serves as the document coders and the scope of the task given to the coders.

208 See Fed. R. Evid. 702 advisory committee’s note (“If the expert purports to apply principles and methods to the facts of the case, it is important that this application be conducted reliably.”); cf. Monahan et al., Ascendance of Social Frameworks, supra note 5, at 1738 (“We recognize that ‘social fact’ studies of the kind that would survive Rule 702 scrutiny might be costly and might require judicial involvement to ensure access to company personnel. But this possibility does not, in our view, justify the acceptance of unscientific speculation in the form of ‘social framework analysis.’”); Monahan et al., Limits of Social Frameworks, supra note 5, at 318 n.58 (“In any event, we are aware of no court rulings that excuse expert witness reliability requirements because compliance with those requirements would be difficult or costly.”).