Exploring Diversity In DNA Research
- Joshua Lim
- Mar 10, 2021
Since the completion of the Human Genome Project in 2003, the comparison of DNA among different groups of people has become an important tool in genetic discoveries. One of the most widely used approaches is the genome-wide association study (GWAS). A GWAS compares single nucleotide polymorphisms (SNPs), genomic locations where a single nucleotide is known to vary (1), in people with a disease to the same SNPs in people without the disease. Critically, GWAS allows for the identification of SNPs that can act as biomarkers for a disease and is now often used to predict a person’s risk for certain diseases. However, over 80 percent of the DNA sequences used in current research come from people of European ancestry (2). Here, I discuss how limited representation impacts genomic findings and how including more diverse populations benefits research and human health.
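To make the case-versus-control comparison concrete, here is a minimal sketch of how a single SNP might be tested for association. It assumes a simple Pearson chi-square test on allele counts, which is only one of several statistics used in practice, and the counts and the function name `allelic_chi_square` are hypothetical values chosen for illustration rather than data from any cited study.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are the alternate and reference allele.
    Returns the chi-square statistic and its p-value.
    """
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were identical in both groups.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one SNP (500 cases and 500 controls, two alleles each).
chi2, p = allelic_chi_square(case_alt=240, case_ref=760,
                             control_alt=180, control_ref=820)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

A real GWAS repeats a test like this across hundreds of thousands of SNPs or more, so the threshold for calling a result significant is far stricter than the conventional 0.05.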
In the human genome, there are approximately 60 million SNPs, and about 15 percent are known to vary between ethnic populations (3), complicating genetic comparisons. If European control groups are compared with patients of other ethnicities, it becomes difficult to tell whether a SNP variation reflects a difference in ethnicity or the disease itself. In other words, studies that rely on overrepresented European DNA sequence information can produce false-positive and false-negative results in minority groups. For example, clinicians use DNA sequences to measure a patient's risk for hypertrophic cardiomyopathy, a condition in which the heart muscle becomes abnormally thick (4). However, because the data are based on research done mainly on European DNA sequences, African Americans are often misdiagnosed with the condition (4). False negatives can also arise, as certain diseases are linked to different SNPs depending on ethnicity (5). Genomic research that uses only European DNA can miss biomarkers unique to other ethnic groups, leaving members of those groups undiagnosed even when they have the disease. An example is the SNP rs1051730, which is associated with an increased risk of lung cancer in African Americans, but not in White Americans (6). If biomarkers such as rs1051730 are overlooked, lung cancer risk assessments in African Americans will produce false-negative results. Thus, the overrepresentation of sequence information from a single ethnic group can lead to inaccurate risk assessments.
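The false-positive problem described above can be illustrated with a small worked example. The sketch below assumes two ancestry groups, A and B, that carry a SNP at different frequencies, and assumes the SNP has no effect on the disease in either group. All numbers and the helper name `pooled_frequency` are hypothetical.

```python
def pooled_frequency(groups):
    """Allele frequency of a cohort built from (n_people, allele_frequency) pairs."""
    total_people = sum(n for n, _ in groups)
    return sum(n * freq for n, freq in groups) / total_people


# Hypothetical allele frequencies. The SNP does NOT influence disease in either group.
freq_a, freq_b = 0.10, 0.40

# Patients are mostly from ancestry B, but the control group is entirely ancestry A.
cases = [(100, freq_a), (900, freq_b)]
mismatched_controls = [(1000, freq_a)]

# Controls with the same ancestry mix as the patients.
matched_controls = [(100, freq_a), (900, freq_b)]

case_freq = pooled_frequency(cases)
mismatched_gap = case_freq - pooled_frequency(mismatched_controls)
matched_gap = case_freq - pooled_frequency(matched_controls)

print(f"gap vs. mismatched controls: {mismatched_gap:.2f}")
print(f"gap vs. matched controls:    {matched_gap:.2f}")
```

With mismatched controls the SNP looks far more common in patients, even though the entire gap comes from ancestry rather than disease. With matched controls the gap disappears, which is why the ancestry makeup of a control group matters.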
Research done solely with European DNA can also limit the effectiveness of clinical interventions in other ethnic groups. For instance, CYP2D6 is a gene whose variants differ in frequency depending on ethnicity. These variants can affect a person’s ability to safely take certain drugs. By researching variants common to different ethnicities, researchers can determine the safety and efficacy of drugs in different populations with greater accuracy and equity. Because of the way research is currently conducted, White populations can make decisions about treatment with confidence. Minorities, however, must make the same decisions with less confidence and greater risk (2), which contributes to inequalities in healthcare.
Fortunately, including different ethnicities in genetic research can lead to more accurate diagnoses and improved therapeutic strategies in all populations. Because the vast majority of GWAS research depends on European DNA, much of the existing predictive disease-tracking technology is unavailable to people of non-European ancestry. 23andMe, a company that provides genetic testing for disease risk, offers 22 different tests, 16 of which are not available to those with non-European ancestry (4). However, by utilizing more inclusive data, disease testing can become more accessible and accurate for minorities. Simulations show that having just 20 people of African ancestry in a control group of 200 people would give clinicians a 50 percent probability of correctly identifying a variant as a biomarker (7). Furthermore, increasing ethnic diversity can lead to the discovery of new associations between SNPs and diseases. For example, the study of SNP variants in people of African descent led to the discovery of PCSK9’s role in the regulation of blood cholesterol. This discovery not only benefited people of African descent, but also fed into research for a drug with global utility (8). By identifying more genes such as PCSK9, we can better understand diseases and develop more effective therapeutic approaches for all ethnicities.
As the importance and use of genomic technology grow, more genomic data will continue to be collected. However, adding more sequencing data only from European groups will limit potential discoveries, disadvantaging the entire human population. Inaccuracies rooted in research from limited genomic resources will create medical problems, add expenses, and perpetuate inequality. Diversity in genomics is imperative to creating an equitable future that unlocks the full potential of genomics.
References:
1. Tam, Vivian, et al. “Benefits and Limitations of Genome-Wide Association Studies.” Nature Reviews Genetics, vol. 20, no. 8, 2019, pp. 467–484, doi:10.1038/s41576-019-0127-1.
2. Popejoy, Alice B., and Stephanie M. Fullerton. “Genomics Is Failing on Diversity.” Nature, vol. 538, no. 7624, 12 Oct. 2016, pp. 161–164, doi:10.1038/538161a.
3. Huang, Tao, et al. “Genetic Differences among Ethnic Groups.” BMC Genomics, vol. 16, no. 1, 2015, doi:10.1186/s12864-015-2328-0.
4. Haga, Susanne B. “Impact of Limited Population Diversity of Genome-Wide Association Studies.” Genetics in Medicine, vol. 12, no. 2, 2009, pp. 81–84, doi:10.1097/gim.0b013e3181ca2bbf.
5. Varela, Miguel A., et al. “Transfer of Genetic Therapy across Human Populations: Molecular Targets for Increasing Patient Coverage in Repeat Expansion Diseases.” European Journal of Human Genetics, vol. 24, no. 2, 2015, pp. 271–276, doi:10.1038/ejhg.2015.94.
6. Schwartz, Ann G., et al. “Racial Differences in the Association Between SNPs on 15q25.1, Smoking Behavior, and Risk of Non-Small Cell Lung Cancer.” Journal of Thoracic Oncology, vol. 4, no. 10, 1 Oct. 2009, pp. 1195–1201, doi:10.1097/jto.0b013e3181b244ef.
7. Manrai, Arjun K., et al. “Genetic Misdiagnoses and the Potential for Health Disparities.” New England Journal of Medicine, vol. 375, no. 7, 18 Aug. 2016, pp. 655–665, doi:10.1056/nejmsa1507092.
8. Sirugo, Giorgio, et al. “The Missing Diversity in Human Genetic Studies.” Cell, vol. 177, no. 1, 21 Mar. 2019, pp. 26–31, doi:10.1016/j.cell.2019.02.048.