Sequencing an entire human genome was once an enormous undertaking requiring years of effort and billions of dollars; it can now be done in approximately two days for about $1,000.(a) The ease with which we can now obtain genomic information has the potential to transform medicine, allowing doctors to more accurately diagnose diseases and tailor treatment to suit an individual’s genomic profile.
Unfortunately, our ability to acquire genetic data has significantly outpaced our capacity to interpret and use it. In the current medical landscape, scenarios like the following are not uncommon:
After waiting months to get a medical genetics appointment and then weeks to hear the results, a family is hopeful for answers about their two-year-old son’s mysterious developmental problems. He was born with a heart abnormality and a missing kidney, and as he grew up, he appeared different from other children and was not meeting developmental milestones. His doctors determine that he has a rare chromosome deletion, but cannot say whether or how it is related to his condition because they aren’t aware of anyone else with this genetic abnormality. While they may have found a clue in the child’s genome, they don’t yet have the tools or information to understand what it means.
Like this child, everyone has variants in their genetic code, but only some are associated with health problems; the rest constitute normal human genetic variation.(b) In order to decipher the true meaning of genetic variants, large amounts of information (more than any one laboratory alone can generate) must be collected and shared by laboratories, clinicians, and researchers. Genetic databases address this problem by aggregating genetic test results to catalogue gene variants.
While it may be years before we fully understand rare conditions like the one suffered by the child in the story above, access to a genetic database could help his doctors identify other children with the same genetic variant, determine if they have similar symptoms, and learn about treatments and therapies that have worked for these patients. In the long run, databases can also assist researchers seeking to understand a gene’s role in a particular disease. They will help the medical community transition from merely possessing massive amounts of genomic data to understanding and using this information.
Deciphering the Secrets in Our DNA
Researchers and clinicians have just scratched the surface in understanding how individual genes cause traditional “genetic diseases” like cystic fibrosis or sickle cell anemia (genetics), as well as how multiple genes in an individual’s genome contribute to complex diseases like diabetes and cardiovascular disease (genomics).(c) Only about 4,000 of the estimated 20,000 genes in the human genome have been linked to specific diseases.1 Even genes that have been studied for many years, such as the CFTR gene associated with cystic fibrosis, are not fully understood. Since discovering CFTR in 1989, scientists have learned a great deal about specific variants of the gene and how they cause disease, but there are still numerous variants that cannot be clearly categorized as disease-causing (pathogenic) or harmless (benign).2
While researchers continue to conduct large-scale studies on how genes affect health and cause disease, doctors are beginning to incorporate genomic data into clinical practice. When they see patients with health issues that may have an underlying genetic cause, clinicians often order testing of specific genes, a set of genes, or the patient’s entire genome. The resulting genetic information may be used to make or confirm a diagnosis, advise a patient on their prognosis, determine the most effective course of treatment, or decide how a patient’s health should be monitored and what other tests they should receive. Test results are also used in genetic counseling to determine whether other family members are at risk for a disease.(d)
Because genetic test results can be used to make important determinations about patient care, thorough and accurate interpretation of test results is critical. Incorrect interpretations can result in misdiagnosis, prompting unnecessary monitoring, procedures, treatments, and stress for individuals and their family members. This is especially problematic for individuals who have predictive testing when few or no signs or symptoms of disease are present; without symptoms, medical decisions may be based largely on the genetic test results.
The clinical laboratories that conduct genetic tests often play a significant role in the interpretation of findings, with laboratory scientists identifying and analyzing hundreds of thousands of genomic variants per year. The laboratory’s interpretation helps the clinician understand what impact, if any, a genomic variant may have on a patient’s health. But interpreting the human genome is not a simple task: genes have different structures and functions, the specific types of variants that cause disease differ from one gene to the next, and the ways variants are inherited or transmitted can vary.(e) Clinical laboratory professionals therefore spend significant amounts of time researching variants by consulting the medical literature and proprietary genomic databases.
While professional guidelines offer a loose framework to help laboratories determine the significance of genomic variants, a good deal of subjectivity remains, and even two highly competent laboratory professionals may interpret the same genetic variant differently.3 As more laboratories test patients’ entire genomes, they will encounter variants that they have little experience interpreting, most of which have not been extensively studied by researchers. When unable to determine if a genetic variant is the cause of a disease, the laboratory may report uncertainty about the result. Due to the many sources of variation among laboratories, patients with the same variant identified at different laboratories may receive conflicting reports on what that genetic anomaly means for their health.
Tools for Sharing Genomic Information
Because accurate interpretation of genomic data is so critical to patient health, laboratories and clinicians need better resources to support it. The International Collaboration for Clinical Genomics, through the Clinical Genome (ClinGen) Resource Program, is working to harness the vast amount of genomic information generated by clinical laboratories and researchers and make it publicly available through ClinVar, a database housed in the National Center for Biotechnology Information (NCBI) at the National Institutes of Health (NIH).
ClinVar collects information about genomic variants and their relationships to human health from clinical laboratories and summarizes it for clinicians, researchers, and laboratory professionals. To protect patient privacy, the data is stripped of all identifying information (such as name and date of birth) and precautions are in place to control access.(f) ClinVar differs from other genomic databases in that it is free and publicly available and it focuses on all types of genetic variation, rather than just a particular set of genes. It collects information not only about the final decisions made by laboratories, but also the process by which they interpreted the evidence.
Genomic databases like ClinVar can provide researchers with data for analyzing a particular disease. Researchers can also identify potential research subjects with rare genomic variants and, through carefully controlled protocols, invite them to participate in a study. By providing access to large amounts of data, genomic databases help laboratories interpret and standardize the results they provide to doctors and patients. The databases can also serve as a quality improvement measure, as a laboratory’s results can be evaluated against its own and other laboratories’ historical data. Clinicians can use genomic databases to compare their patients to others and learn from other patients with the same genomic variants or diseases.(g)
Patients are also getting involved in the movement by sharing their own genomic information and connecting with other patients via genetic registries. While data submitted to ClinVar and similar genomic databases typically comes directly from laboratories, registries collect data directly from patients. Often created by groups representing particular diseases, these registries are designed to gather the kinds of detailed, standardized information researchers need to identify and develop potential therapies and design clinical trials with objectively measurable endpoints. For example, the organization Patient Crossroads has created registries for numerous genetic diseases, including a muscular dystrophy registry with data on over 2,000 patients.
Registries typically ask detailed questions about an individual’s health history and enable storage of genetic test results and other reports, but the individual chooses what information to enter and share with the public, researchers, and other patients. While registries contain genetic data, they are also rich sources of more general medical information. For example, they may provide health histories that can help clinicians and patients understand the symptoms and progressions of particular conditions. Many patient registries have evolved to include a community component facilitating communication and interaction among individuals with similar diseases or genetic conditions.
Paving the Way for the Genomics Revolution
The ability to sequence the human genome is one of the most significant medical breakthroughs of the last fifty years. But in order to realize the full potential of the DNA revolution, scientists need access to as much genomic data as possible, especially when a variant is too rare for most health providers to have in their own records. Genomic information means little without context, and sharing this information – in a private, regulated way – will have both individual and global benefits. When an individual undergoes genetic testing, a centralized repository of all known information on genomic variants will allow the laboratory to more accurately interpret their results and will help their clinician to develop a more personalized management plan. Ultimately, genomic data sharing will help the scientific community to better understand the relationship between genes and human health, paving the way for new medical breakthroughs and better health for all of us.
The authors would like to acknowledge Danielle Metterville, the ClinGen Resource Program, and the ICCG Education, Ethics, and Engagement Working Group.
Endnotes
1. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (2014) Online Mendelian Inheritance in Man.
2. John R. Riordan, Johanna M. Rommens, Bat-sheva Kerem, Noa Alon, Richard Rozmahel, Zbyszko Grzelczak, Julian Zielenski, Si Lok, Natasa Plavsic, Jia-Ling Chou, Mitchell L. Drumm, Michael C. Iannuzzi, Francis S. Collins, and Lap-Chee Tsui (1989) “Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA,” Science, 245(4922): 1066-1073.
3. C. Sue Richards, Sherri Bale, Daniel B. Bellissimo, Soma Das, Wayne W. Grody, Madhuri R. Hegde, Elaine Lyon, Brian E. Ward, and the Molecular Subcommittee of the ACMG Laboratory Quality Assurance Committee (2008) “ACMG recommendations for standards for interpretation and reporting of sequence variations: Revisions 2007,” Genetics in Medicine, 10(4): 294-300; Sarah T. South, Charles Lee, Allen N. Lamb, Anne W. Higgins, and Hutton M. Kearney, for the Working Group for the American College of Medical Genetics and Genomics (ACMG) Laboratory Quality Assurance Committee (2013) “ACMG standards and guidelines for constitutional cytogenomic microarray analysis, including postnatal and prenatal applications: revision 2013,” Genetics in Medicine, 15(11): 901-909.
4. Melissa Gymrek, Amy L. McGuire, David Golan, Eran Halperin, and Yaniv Erlich (2013) “Identifying personal genomes by surname inference,” Science, 339(6117): 321-324.
5. Elizabeth A. Worthey, Alan N. Mayer, Grant D. Syverson, Daniel Helbling, Benedetta B. Bonacci, Brennan Decker, Jaime M. Serpe, Trivikram Dasu, Michael R. Tschannen, Regan L. Veith, Monica J. Basehore, Ulrich Broeckel, Aoy Tomita-Mitchell, Marjorie J. Arca, James T. Casper, David A. Margolis, David P. Bick, Martin J. Hessner, John M. Routes, James W. Verbsky, Howard J. Jacob, and David P. Dimmock (2011) “Making a definitive diagnosis: Successful clinical application of whole exome sequencing in a child with intractable inflammatory bowel disease,” Genetics in Medicine, 13: 255-262.
Sidenotes
- (a) The first human genome was sequenced in 2003 at a cost of $2.7 billion. This year, news articles announced the arrival of the $1,000 genome, though this figure includes sequencing alone and not the cost of interpretation. Sequencing and testing only specific genes can cost even less.
- (b) Genes are the instructions made of DNA inside each cell that tell the body how to grow and develop properly. Though everyone’s DNA code is very similar, there will always be changes, or variants, from person to person. Variants within genes contribute to the qualities that make us all unique, such as eye color and blood type, but can also contribute to health and developmental problems.
- (c) The term genetics typically refers to the study of a single gene and of diseases caused by variants in individual genes, while the term genomics refers to the study of all of a person’s genes and their interactions with one another and the environment.
- (d) Because many genetic changes are inherited, multiple individuals in a family may be impacted by the interpretation of a single variant. Family genetic counseling helps identify which family members are at the highest risk, arrange for testing of these individuals, and explain complicated genetic information to family members.
- (e) Laboratories often have differing levels of expertise in interpreting variants detected in specific genes. Familiarity with a particular gene can vary, and factors such as the research interests and knowledge of the laboratory’s genetic professionals, the amount of previous testing the laboratory has performed for the gene, and the ability of the laboratory to offer complementary testing may influence the interpretation of test results.
- (f) One of the biggest concerns about a publicly available database of genomic information is patient privacy.4 ClinVar and other genomic databases take precautions that make it difficult to identify individuals from their genomic data and that hold accountable those who attempt to do so. These precautions do not eliminate the potential for loss of privacy, and patients must be made aware of this possibility before participating in registries or genomic databases.
- (g) Six-year-old Nicholas Volker’s unexplained intestinal problems persisted after 100 surgeries, until doctors found a previously unseen mutation on his X chromosome.5 The research team in charge of the case was able to establish an effective treatment based on the diagnosis, and one of the leading scientists now advocates DNA sequencing as a routine procedure for undiagnosed medical conditions. Having this data on file could help doctors diagnose and treat future patients who present similar symptoms.