
Ethical Issues in Developing a Haplotype Map with Socially Defined Populations

Morris W. Foster, Ph.D.
Department of Anthropology
University of Oklahoma, Norman, Oklahoma

Categories such as race and ethnicity are useful as heuristic starting-points for the investigation of biological relatedness. In attempting to approximate the range of human genetic variation, social categories such as these offer a practical way of approaching the study of that diversity and of defining inclusive criteria for participant recruitment. It is this practical consideration and the imperfect (sometimes very imperfect) fit between social and genetic definitions of a population that give rise to a series of scientific and ethical issues in contemplating the development of a human haplotype map with identified populations.

An important preliminary question is whether it is necessary to use separate, defined populations for such a project or whether, alternatively, a unified, non-population-specific haplotype map could be developed that would provide essentially the same information. This topic will be discussed at the haplotype map meeting.

A haplotype map project would not directly raise all the issues involved in other types of genetic studies with identified populations, even if racial and ethnic identifiers were to be retained with the sample set. This is because the map, once developed, would become essentially only a resource for use in other (subsequent) genetic studies (such as association studies) and would not itself have direct clinical use. Nevertheless, the proposed haplotype map project cannot be considered in isolation from the more general, ongoing discussion of the implications of using socially constructed identities in genetic research. Nor can it be considered apart from prior efforts to catalogue human genetic diversity and the controversies that surrounded them.

Population Definition and History

The ways in which we are accustomed to group people socially (race, ethnicity, nationality, religious affiliation, geographic locality, etc.) are not random. The lines that exist between those social categories constitute partial barriers to interaction (and reproduction). In addition, many socially defined populations - especially small ones - are affected by founder events, genetic drift, population bottlenecks and admixture. For these reasons, socially defined populations often do have some actual genetic meaning.

However, social mechanisms exist that can make socially defined populations imperfect as proxies for biological relatedness when developing a haplotype map or conducting other genetic studies. These include: migration; intermarriage; cultural preferences for claiming only one's father's identity or one's mother's; fictive and adoptive kinship; instrumental and situational choices in asserting alternative identities; colonialist, racist, and nationalist ideologies that impose new identities on subjugated populations or population subgroups; and economic, religious, and other barriers to interaction (and reproduction) within larger social categories. Thus, even after a human haplotype map is constructed, socially defined populations still will differ from genetically defined populations - a fact that is well understood by geneticists but not necessarily by the general public. A genetic or biological view of population history will not be the same as a social view of population history, although the two versions will involve some of the same members.

Another confounding variable is that many genetic studies rely solely or primarily on participant self-report to assess membership in a socially defined population. While this standard is appropriate for studying social identities as social phenomena, it does not allow for the critical scientific investigation of their accuracy.

Risk and Reification

Using broad, socially defined populations to structure participant recruitment for a haplotype map project, and retaining racial and ethnic identifiers, may lead non-scientists who become aware of the project to reify those social categories as biological constructs, fostering an unintended genetic essentialism in the way the public understands such categories as race and ethnicity. That essentialism could obscure the important fact that the "boundaries" between groups are highly fluid and that most genetic variation exists within all groups - not between them. Even the smallest socially defined population will have multiple haplotypes, and haplotypes will be shared among different, socially defined groups.

We know that the association of racial, ethnic or other social identities with genetic findings about disease may be exclusionary and stigmatizing. Diagnoses of sickle cell disease and cystic fibrosis, for instance, often are missed or delayed when patients are not identified as being members of higher risk racial or ethnic populations. Being a member of what is publicly labeled as a higher risk population also can lead to a shared stigmatization (as happened historically in the case of sickle cell trait), even though one is neither a carrier nor affected. While disease genes and genetically targeted therapies will not be discovered or developed by haplotype mapping itself, the association of socially defined populations with genetic definitions of populations may contribute to the reification of the former as well-defined biological categories, with implications for public perceptions of health and other measures of social status.

Genetic interpretations of population history also are not without risk, having the potential for unintended uses in political and legal settings, as well as nationalist and racist re-interpretations by non-scientists. A haplotype map that associates racial, ethnic, and other socially constructed identities with specific ancestral haplotypes could even potentially be used in legislation or a lawsuit to determine which identity an individual can or must use. The United States, for instance, has historically used lay interpretations of biological ancestry as a statutory basis for determining which individuals are legally African American or Native American. Indigenous and colonized populations are most vulnerable to these misuses, but minority populations (for instance, those in Europe) also could be at risk.

Vulnerabilities such as these are not always self-evident because risk tends to be culturally defined. In the cases of many non-majority communities, researchers and institutional review boards will be unable to anticipate all culturally specific risks (such as those associated with the creation of immortalized cell lines and public databases). Outsiders often are unable to fully appreciate community-specific risks of these kinds even once they have been identified for them, in part because minority community members' perceptions of these risks may have been heightened by their historical experiences of being economically and politically disadvantaged with respect to the majority society (of which biomedical research is a tangible manifestation). The difference in power between researchers (as representatives of the Euro-American political economy) and some socially defined populations lacking in significant economic and political resources may affect the latter's ability to fully conceptualize and negotiate the conditions for sample donation and to take effective action on any subsequent concerns about sample misuse or adverse interpretations of genetic findings. For these reasons, community involvement and consultation are essential in planning genetic research with socially defined populations (see separate background paper on Community Involvement and Consultation).


Naming a socially defined population in a disease gene association study means that all members may be affected by the findings of the study (because of the social identifiability of those populations to others), including members who did not consent to or take part in the study. Who is potentially affected will depend on which social labels researchers choose to apply to a given sample of donor participants. In particular, the use of parent-child trios in a publicly accessible database or cell-line collection raises questions about the degree to which both families and individuals may be identifiable, depending on the size of the community and on the uniqueness of the social identities involved.

Among members of smaller populations that are, in effect, extended pedigrees, there will be fewer differences in conserved haplotypes than among members of larger social categories. Those smaller population studies will, essentially, be pedigree studies and may require more stringent human subjects protections to preserve family and individual privacy. In general, smaller or more isolated populations will require stronger protections because of their greater potential for identifiability.

In a larger, socially defined population in which there is considerable difference in members' haplotypes, a small number of donor participants may not represent the full range of genetic variation present in the population. Ascertaining the intra-population range of variation will thus be important for any downstream benefits that haplotype mapping may have for persons with that social categorization.

Benefits and Justice

Although much more study is needed, it is currently believed that the genetic basis for many common, complex disorders is essentially the same across populations. The rationale for using different populations in the development of a human haplotype map is that different haplotype structures in different groups provide different types of information. For example, large haplotype blocks are most useful for initial identification of particular chromosome regions of interest. On the other hand, smaller haplotype blocks are more useful for localizing genetic effects to smaller regions, and sometimes even to specific genes. In either case, however, the information provided by a haplotype map will be useful for finding genes that contribute to disease in all populations - not just in the particular groups sampled for the map. Since the ultimate purpose of a haplotype map is to facilitate further genetic research that may eventually result in diagnostic, therapeutic, or other health benefits for individuals in all population groups, there may be little direct or immediate benefit to members of the specific groups included in the resource.

Nevertheless, the inclusion of members of a particular group in sampling for a haplotype map project may provide an important "spillover" benefit to members of that group - if only because it increases the probability that the disease genes of greatest interest to them will be found expeditiously, thus enabling further research on the relevant genetic and environmental contributions to those diseases to move forward. Correspondingly, the decision to exclude certain groups from a haplotype map project could have the effect of depriving them of this potential (albeit indirect) benefit.

Thus, the decision as to which populations to include in or exclude from a haplotype map project may have important ramifications for the setting of downstream research priorities, which in turn could either exacerbate or ameliorate health disparities between groups. While it is therefore important to take into account the special vulnerability of identified populations (especially non-majority populations) in planning a haplotype map project and cell-line collection, it is equally important to consider the possible consequences of excluding them.

Community Involvement and Consultation with Identified Populations in Genetic Research

Morris W. Foster, Ph.D.
Department of Anthropology
University of Oklahoma, Norman, Oklahoma

Genetic research - particularly involving members of identified populations - can present unique challenges for the protection of human subjects, raising difficult issues of collective risks and benefits that cannot be addressed fully through individual informed consent and evaluation by Institutional Review Boards. This issue has catalyzed the development of innovative efforts to involve communities in the design and review of genetic studies.

Initial experiences in conducting community consultations suggest that outsiders often cannot anticipate local perceptions of risks and benefits. Hence, it is important to engage communities to discover how members perceive the implications of a specific study as well as to identify ways to minimize perceived risks. Frequently, the more prominent local concerns about a genetic study focus on how its conduct and findings may disrupt relationships among community members, rather than on how information about the community may be used by outsiders.

Some genetics researchers have criticized community involvement as requiring too much time and too many resources. However, when viewed as the basis for a long-term collaboration between researchers and communities, the initial expense of this trust-building process can be understood as an investment rather than as an unreasonable cost.

Communities: Communities are composed of individuals with shared interests and patterns of interaction that constitute an internal social dynamic, one that permits (among other things) collective decision-making. Communities and populations may overlap in a number of different ways. For example, a population may be composed of more than one community (e.g., a research population described generally as "African Americans" that is composed of multiple smaller communities) or may fail to correspond with any existing community.

Types of communities: Some communities, such as Native-American tribal communities, are politically organized in such a way that they provide formal mechanisms for those communities to give (or withhold) consent to conduct genetic studies among their members. Other kinds of communities, however, lack similar central public authorities. In those latter cases, different forms of community involvement (ranging from establishing a dialogue between researchers and members to a more formal survey of representative members' views) can be undertaken to give community members a voice in how a study is designed and how individual participants and supporting communities are protected. While the opportunity for a full airing of differing views within a community may have the effect of exacerbating pre-existing social divisions, just having such information about differing perceptions within a community can be important in considering a study's implications. Thus, community involvement and consultation have considerable value even in the absence of a communal consensus.

Potential Problems: Community involvement sometimes is complicated by: hierarchically nested identities that may necessitate multiple levels of consultation; geographic dispersion of some socially defined populations into multiple communities; other populations that are composed of geographically overlapping communities; and persons who share a social identity but do not choose to be members of communities that are organized around that label. Community consultation must also take into account the value of respecting the autonomous choice of individuals to participate in a study or not.

Types of Issues Commonly Negotiated: Among the issues commonly negotiated with communities in the context of genetic research are: control of future uses of DNA samples; reporting of study findings to supporting communities; incorporation of research questions of interest to community members; funding of community organizations and individuals to conduct parts of the study; provision of other opportunities for training to community members; provision of health benefits to the community; and sharing of any commercial royalties from study discoveries. With respect to DNA repositories, evolving experience with minority communities suggests that representatives should be involved in the coordination of sample collections (including the selection of populations for inclusion), in the development of repository policies for storing and distributing biological materials, in the evaluation of proposed studies using stored materials, and in the disposition of stored materials.

Last Reviewed: February 22, 2012

Last updated: February 22, 2012