Genetically Complex Cardiovascular Traits
Origins, Problems, and Potential Solutions
Modern molecular genetic analysis tools are making it possible for researchers to investigate, and in many cases actually disclose, mutations and other genetic factors that contribute to disease susceptibility. However, the ease with which these factors can be identified is dictated not only by the number of factors underlying or influencing the trait but also by the manner in which these factors interact. Traits that are influenced by multiple genetic and nongenetic factors are termed “complex” genetic traits and are receiving a great deal of attention in the current medical literature. Hypertension and blood pressure regulation are considered paradigmatic complex traits. In this paper, the origin, nature, and dilemmas associated with the analysis of complex traits are considered. Basic biochemical and physiological determinants of blood pressure are described in an effort to show how genetic complexity could arise within an individual, and fundamental concepts in population genetics and evolutionary theory are discussed to expose the reasons certain forms of genetic complexity can emerge and be sustained in the population at large. Methods for approaching the genetic dissection of complex traits and diseases are also enumerated, with simple descriptions of the scientific motivation offered for each. Problems plaguing these approaches are also discussed. Finally, areas for future research are outlined with the hope of sparking further debate on the subject.
- linkage analysis
- statistical models
- gene mapping
- complex traits
- genetic epidemiology
- Sewall Wright
The search for genes influencing traits and diseases of all sorts has become the focal point of a great deal of contemporary medical research.1 The reason for this is obvious: If a gene contributing to a malleable, preventable, or treatable condition can be identified, then that gene’s structure, function, and ultimate role in influencing the relevant condition can be determined. This knowledge could lead to better ways of predicting the (future) presence of that condition, diagnosing that condition, and preventing (or enhancing) that condition. It should be no surprise that most of the research in this area has focused on diseases that are debilitating or common sources of morbidity and mortality in the population at large. Hypertension and related cardiovascular diseases are major contributors to morbidity and mortality in modern industrial and urbanized societies and have thus received considerable attention from geneticists. Unfortunately, susceptibilities to hypertension and its sequelae are known to be mediated by a number of genetic and nongenetic factors. This fact makes hypertensive cardiovascular disease (HCD) paradigmatic of so-called “complex” genetic traits. Although the label “complex trait” has been used indiscriminately and could be applied, given its vagueness, to any trait, disease, or condition, the true hallmark feature of a complex trait as initially defined1 is an underlying determination that can only be attributed to multiple genes and environmental factors. With this in mind, it is thought that the disclosure and characterization of the factors contributing to the emergence and maintenance of complex traits will require very sophisticated research strategies.1–3 Many of the extant strategies for dissecting the genetic basis of complex traits are inadequate and not very powerful.
This is likely due to the fact that not much time has elapsed between the invention of molecular tools that could be used to probe for genes and the present state of medical research. It is therefore important to consider issues that might be of relevance for the development of better research strategies that make use of these tools.
In this article a discussion of the origins of, problems associated with, and research avenues for investigating complex HCD is offered. A crucial distinction between complexity at the level of an individual and complexity at the level of a population is made. It is hoped that by discussing issues surrounding the very definition, origin, and problems associated with complex genetic disease research, a greater focus on appropriate research strategies for HCD will emerge. A more complete discussion of relevant issues developed in this article is given by N.J. Schork (in preparation).
Complexity of Complex Traits: Individuals
Human physiology and biochemistry are extremely complex. Nowhere is this clearer than in the regulation of human blood pressure. Consider Fig 1, which offers an abstraction of the factors mediating blood pressure level within an individual. It is clear that an individual’s blood pressure level is influenced by a host of systems and subsystems, all interwoven into a complex network that is simultaneously filled with hierarchies and redundancies. On top of this network are additional phenomena, such as development, growth, and aging, which might further complicate blood pressure regulation, since each system or subsystem may have a more or less pronounced effect on an individual’s blood pressure level at different times in the life of that individual. This multitude of systems and age dependencies creates enormous potential for a variety of mutant genes to upset or impact blood pressure level. Identification of such genes through classical genetic strategies that involve studying hypertensive individuals or individuals with HCD will then be plagued by this very redundancy, compensatory control, and factors like them. This is the case simply because the effect of any one gene may be obscured or confounded by the effects of others.
There are at least five problems plaguing the identification of genes underlying complex traits that have received recent attention:
(1) Classical polygenic or “threshold” inheritance, in which a number of genotypes or mutations at different loci (which likely impact different physiological systems) must be transmitted to an individual before his or her system is sufficiently challenged to result in disease. Thus, despite the arbitrary nature with which blood pressure criteria are used to diagnose hypertension, it may be the case that one needs to possess a number of genes before his or her blood pressure will surpass these arbitrary thresholds. It may also be the case that one needs to possess a number of genes before additional pathologies associated with HCD (eg, vascular damage) appear.
(2) Locus heterogeneity, in which defects in any of a number of genes or loci can confer disease susceptibility independently of each other.1 Thus, under heterogeneity, individuals with similar phenotypic features or disease states may possess different genetic variants that lead to the disease.
(3) Epistasis, or gene interaction, in which the possession of a certain mutation or genotype will confer susceptibility to a degree dictated by the presence of other mutations or genotypes. Thus epistasis reflects basic interactive effects of mutations, genotypes, and/or their biological products.
(4) Gene×environment interactions, in which a gene or genes have their deleterious effects only when an individual possessing them is exposed to particular environmental stimuli.
(5) Developmental or time-dependent expression of genes, in which a gene, whether in mutant form or not, has its most pronounced deleterious effect at a certain time or developmental stage (eg, puberty).4
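How these mechanisms blur the effect of any single gene can be made concrete with a small enumeration. The sketch below is an illustration only, not a method taken from the cited literature. It assumes a hypothetical trait governed by three independent loci, each carried in risk form with the same frequency (0.1), and compares disease risk between carriers and noncarriers at one locus under three toy rules: threshold inheritance, locus heterogeneity, and epistasis. All function names and parameter values are assumptions chosen for readability.

```python
from itertools import product


def threshold_model(pattern, threshold=2):
    """Affected only when at least `threshold` loci carry a risk variant."""
    return sum(pattern) >= threshold


def heterogeneity_model(pattern):
    """Affected when any single locus carries a risk variant."""
    return any(pattern)


def epistasis_model(pattern):
    """Locus 0 has an effect only when locus 1 also carries a risk variant."""
    return pattern[0] == 1 and pattern[1] == 1


def marginal_risk(model, n_loci=3, carrier_freq=0.1, locus=0):
    """Return (risk in carriers, risk in noncarriers) at one locus.

    Every carrier pattern across the loci is enumerated and weighted by its
    probability, assuming independent loci with a common carrier frequency
    strictly between 0 and 1 (a hypothetical simplification).
    """
    affected = {0: 0.0, 1: 0.0}
    total = {0: 0.0, 1: 0.0}
    for pattern in product((0, 1), repeat=n_loci):
        prob = 1.0
        for carried in pattern:
            prob *= carrier_freq if carried else 1.0 - carrier_freq
        status = pattern[locus]
        total[status] += prob
        if model(pattern):
            affected[status] += prob
    return affected[1] / total[1], affected[0] / total[0]


if __name__ == "__main__":
    for model in (threshold_model, heterogeneity_model, epistasis_model):
        carriers, noncarriers = marginal_risk(model)
        print(f"{model.__name__}: carriers {carriers:.3f}, "
              f"noncarriers {noncarriers:.3f}")
```

Under the threshold and epistasis rules, most carriers at the tested locus remain unaffected, whereas under heterogeneity some noncarriers are affected through the other loci. In each case a study that tests one locus at a time sees a weaker or noisier signal than the underlying biology would suggest.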
Combating polygenic inheritance, heterogeneity, epistasis, and developmental effects in gene mapping and characterization studies has been a primary motivating factor for a great deal of contemporary statistical/genetic modeling and research. This is the case because traditionally such modeling and research has often focused on the “testing” of specific (read: individual) genes or genomic regions, as opposed to multiple genes or environmental factors, thought to influence particular traits.1 However, although some progress in this area has been made, existing statistical models and methods are still inadequate and need to accommodate an even wider array of complexities if they are to be at all realistic and useful.4
Given that blood pressure regulation is complex enough to admit great potential for a multitude of genetic and nongenetic factors to induce deleterious effects, one is led to ask how and why individuals possessing or susceptible to such factors came to be, and how and why these individuals still exist given that they have an unhealthy and life-compromising predisposition to HCD. Questions of this sort relate to population genetics theory and should be recognized as adding yet another layer of complexity to the genetic dissection of HCD.
Complexity of Complex Traits: Populations
Standard evolutionary theory would suggest that the emergence and maintenance of novel phenotypes in the population at large (including diseases like hypertension and HCD) are entirely driven by mutation and selection. That is, standard evolutionary theory would argue that hypertension and/or HCD arose through a novel mutation or mutations that caused elevations in blood pressure, whereby these mutations may have, at least at some point in the past and on the basis of their influence on other phenotypes or traits, provided those possessing them a survival advantage (see Julius,5 Julius and Jamerson,6 and Weder and Schork,7 for a discussion). Sewall Wright, among others, challenged this very limited view by detailing the importance of stochastic (ie, random) factors, gene interactions, migration, population size and subdivision, and inbreeding in the emergence and maintenance of novel phenotypes and disease. Wright8,9 argued that large populations typically carry a “stockpile” of alternative forms of genes that would never result in or contribute to something like a new disease or trait if not coupled with the right environment or gene combination. Wright then argued that if such a large population were to subdivide (due to, eg, limited resources, social strife, or natural disasters), then the random assortment and assignment of genes to the founders of the resulting subpopulations might result in a greater frequency of a certain gene or gene combination within one or a few of those subpopulations. Since these genes or gene combinations could occur with a greater frequency in a subpopulation, transmission of them to ensuing generations could result in their greater frequency or fixation.
This would be even more the case if the subpopulation had a relatively small size initially, since the founding gene pool would be relatively small and mating members within that subpopulation would likely merely “reshuffle” existing genes and possibly push them toward an even greater frequency or fixation. If individuals with these stochastically determined prevalent genes or gene combinations were more fit or had a survival advantage in a different environment, then migration of those individuals to that different environment would result in a set of individuals whose greater fitness in that environment would create further propagation of the relevant genes or gene combinations.9
Consider Fig 2 and the “1” gene variant. It is rare (relative to all the genes) in the original population, but not as rare in the leftmost subpopulation. Such an aggregation of “1” genes in this subpopulation may have been purely a chance event or may have been guided by some factor (eg, the ability to withstand a different environment). In either case, there is now a higher probability that individuals with that gene will mate and propagate those genes than in the parent population. This subpopulation can undergo further division and thereby lead to an even greater abundance of such genes, again either by chance assortment or selection. An allele or genetic combination may even become fixed or lost (ie, everyone in a population has the gene or everyone in the population does not have the gene) as a result of random or selective forces (eg, the bottom, leftmost subpopulation of Fig 2). These genes could confer advantage in other environments. Thus, the random assortment of genes through population stratification may result in gene combinations that result in the greater fitness of the individuals possessing them in environments different from the one in which they arose. This is the substance and basis of the “Shifting Balance of Evolution” (SBE) theory.
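The chance component of this argument (founder sampling followed by random reshuffling in a small subpopulation) can be sketched with the standard Wright-Fisher sampling scheme. The code below is a minimal illustration under stated assumptions: random mating, a constant subpopulation size, and no selection, mutation, or migration. It therefore covers only the stochastic phase of the process, not the selective and migratory phases of the SBE theory. The function names, population sizes, and starting frequency are hypothetical.

```python
import random


def sample_founders(freq, n_founders, rng):
    """Draw 2 * n_founders gene copies at random from a parent population
    and return the variant's frequency among the founders."""
    copies = 2 * n_founders
    carried = sum(rng.random() < freq for _ in range(copies))
    return carried / copies


def drift(freq, pop_size, generations, rng):
    """Follow a variant's frequency across generations in a population of
    constant size, with each generation's gene copies drawn at random from
    the previous one (no selection, mutation, or migration)."""
    copies = 2 * pop_size
    trajectory = [freq]
    for _ in range(generations):
        carried = sum(rng.random() < freq for _ in range(copies))
        freq = carried / copies
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so a run can be repeated
    parent_freq = 0.05      # the variant is rare in the parent population
    for label in ("A", "B", "C", "D"):
        start = sample_founders(parent_freq, n_founders=10, rng=rng)
        end = drift(start, pop_size=10, generations=200, rng=rng)[-1]
        print(f"subpopulation {label}: founders {start:.2f}, "
              f"after drift {end:.2f}")
```

Running the sketch with different seeds shows subpopulations that start from the same parent frequency ending at very different frequencies, including loss and fixation, purely by chance.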
Wright’s arguments can easily be invoked in discussions about the complexity of HCD: Since there are so many physiological and biochemical pathways that mediate blood pressure regulation and the human species is relatively old, with many population subdivisions and environmental changes, there are quite likely to be, on a worldwide scale, different mutations and gene combinations contributing to HCD. This insight is even more compelling in light of the very great environmental differences between parts of the world. Consider the fact that HCD can be understood as a “disease of civilization,” which has been exacerbated or made prevalent by modern lifestyles, dietary habits, and technological advance.10 Three basic facts contribute to this suggestion:
(1) Much HCD develops late in life, ie, after or during the reproductive years, and thus is able to manifest itself to a greater degree in modern society simply because life expectancy has increased dramatically.
(2) Much HCD is associated with urbanization and “westernized” lifestyles and diets (eg, high-salt diets, inactivity and obesity, pollution, and stress) that were not prevalent in the past and/or do not exist in some parts of the world to the same degree that they exist in others.
(3) Palliative and curative, though not necessarily preventive, medicines for HCD (and especially hypertension) exist, allowing individuals who otherwise might suffer or die from HCD to exist and transmit the responsible deleterious genes to ensuing generations. Although this phenomenon is not likely to contribute greatly to certain forms of essential hypertension since many persons afflicted by HCD and hypertension are older and past the key reproductive years, for other forms of HCD (eg, precocious myocardial infarction), such a phenomenon could play a role.
Such strong environmental determinants of HCD not only suggest the validity of concepts, like Wright’s, that emphasize a role for the environment in directing the emergence and maintenance of trait variation but also suggest a role for gene×environment interaction studies in the dissection of the genetic basis of HCD.11,12
Methods for the Genetic Dissection of Complex Traits
The previous two sections outlined aspects of the physiological and biochemical determinants of blood pressure regulation and evolutionary theory in an effort to put the difficulties surrounding the genetic dissection of HCD into a context. It is thus important to consider the question of just why current strategies for identifying genes are ill equipped to accommodate and overcome these difficulties without modification.
There are two basic strategies for characterizing genes that influence complex traits: candidate gene analysis and whole-genome searches.1–3 Candidate gene analysis is very straightforward: one merely tests the association between a particular genic variant (ie, allele) and a disease or trait with the hope of identifying a variant that is more frequent among individuals with the trait than those without the trait due to a causal relationship between that variant and trait. Candidate gene analyses are therefore dependent on knowledge about a gene or variant, and the appropriateness of the analysis of a particular gene is only as good as the knowledge that makes the gene or variant a “candidate” in the first place. Such knowledge can be obtained from biological insights (eg, the gene is known to be expressed in a certain tissue of relevance to the trait under scrutiny), homology to other genes, guesswork, or other factors. A problem with candidate gene analysis in light of the comments in the previous sections is that there are likely to be numerous (if not innumerable) candidate genes for HCD. Analysis of each and every one of these candidates, in isolation from the others, may amount to testing every gene in the human genome, an endeavor fraught with statistical problems relating to false-positive results.13,14 In addition, since there is likely to be a great deal of heterogeneity, both with respect to the genes that predispose one to HCD and the environments that one may live in that induce susceptibility to HCD, finding appropriately homogeneous case and control (ie, non-HCD) groups might be problematic.15 Although there are strategies that alleviate the control group problem,15 these strategies do not necessarily allow one to test the simultaneous effect of multiple loci or environmental factors and thus are not necessarily appropriate for a comprehensive assessment of HCD genetics and risk factors.
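A single candidate gene test, and the false-positive problem that arises when many candidates are tested, can be sketched as follows. This is a minimal illustration, not the procedure of any cited study: it assumes allele counts in cases and controls are compared with a Pearson chi-square test on a 2 × 2 table (1 degree of freedom, no continuity correction) and that multiple testing is handled with a simple Bonferroni correction. The counts and the number of candidates are hypothetical.

```python
import math


def allele_association(case_variant, case_other, control_variant, control_other):
    """Pearson chi-square test on a 2 x 2 table of allele counts.

    Returns (chi2, p_value) for 1 degree of freedom, without continuity
    correction.
    """
    a, b, c, d = case_variant, case_other, control_variant, control_other
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 df the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate
    at or below alpha across n_tests tests."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical counts: variant on 30 of 100 case alleles and on
    # 10 of 100 control alleles.
    chi2, p = allele_association(30, 70, 10, 90)
    cutoff = bonferroni_threshold(0.05, n_tests=1000)  # hypothetical 1000 candidates
    print(f"chi2 = {chi2:.2f}, p = {p:.2e}, per-test cutoff = {cutoff:.1e}")
    print("significant after correction:", p < cutoff)
```

With these hypothetical counts the association would pass a nominal 0.05 level on its own but not the corrected level for 1000 candidates, which is the practical form of the false-positive concern raised above.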
Whole-genome searches involve gathering a large number of related individuals thought to be segregating for genes that influence a trait and then tracing the putative parent-to-offspring cotransmission of variants (ie, alleles or genotypes) at landmark spots along the genome (known as marker loci) with possible trait-influencing variants or alleles. If one can conclude that alleles at a particular marker locus appear to segregate (or be transmitted along with) genes seemingly influencing the presence of the trait or disease in question, then one could infer that a gene actually influencing the trait or disease is near, or “linked” to, the marker locus in question. Statistical methods used to draw inferences about the putative cotransmission of marker locus alleles and trait-influencing alleles have been termed “linkage analysis” methods and have received a great deal of recent attention.1–3 There are two general approaches to linkage analysis: parametric pedigree analysis, which involves tracing cosegregation and recombination phenomena between observed marker alleles and unobserved putative trait-influencing alleles among members of large pedigrees, and allele-sharing methods, which assess the number of marker alleles shared at a particular locus among pairs of relatives manifesting the same trait. Schork and Xu16 have considered the relative advantages and disadvantages of each approach. It should be emphasized that candidate genes could be assessed within a linkage analysis framework by simply treating the alleles at the candidate gene as though they were associated with a marker locus. One of the biggest problems with pedigree and allele-sharing analysis approaches is that most of their implementations focus on the detection of single loci or genetic variants (much like many candidate gene analyses), which makes them somewhat unsuited for the analysis of multigenic traits like HCD.
In addition, linkage strategies are notoriously underpowered for detecting genes with small to moderate effects.14,16 Also, the collection of families necessary for conducting genetic linkage analyses and genome-wide searches may require finding a large number of families with individuals possessing the trait of interest. The use of a large number of families with different environmental exposures and genetic or ethnic backgrounds could introduce problems associated with heterogeneity, whereby the effect of one gene is washed out by the effect of others (ie, its effect is not constant, detectable, or even present in all the individuals in the sample).1,2,14,16
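The logic of an allele-sharing method can be sketched with the simplest case, a "mean test" on affected sib pairs at one marker. The sketch assumes that the number of marker alleles each pair shares identical by descent (0, 1, or 2) is known without ambiguity, which real data rarely allow. In the absence of linkage, sib pairs share 0, 1, or 2 alleles with probabilities 1/4, 1/2, and 1/4, so the expected proportion shared is 0.5 with a per-pair variance of 1/8; excess sharing among affected pairs suggests a linked trait-influencing gene. The function name and the counts are hypothetical.

```python
import math


def sib_pair_mean_test(n_share0, n_share1, n_share2):
    """Mean allele-sharing test for affected sib pairs at one marker.

    Arguments are the numbers of pairs sharing 0, 1, and 2 alleles
    identical by descent. Returns (mean proportion shared, z, one-sided p),
    using a normal approximation to the null distribution.
    """
    n_pairs = n_share0 + n_share1 + n_share2
    if n_pairs == 0:
        raise ValueError("at least one sib pair is required")
    mean_share = (n_share1 + 2 * n_share2) / (2 * n_pairs)
    # Null hypothesis of no linkage: mean 0.5, variance 1/8 per pair.
    z = (mean_share - 0.5) / math.sqrt(0.125 / n_pairs)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return mean_share, z, p_value


if __name__ == "__main__":
    # Hypothetical sample of 100 affected sib pairs.
    share, z, p = sib_pair_mean_test(20, 50, 30)
    print(f"mean sharing = {share:.2f}, z = {z:.2f}, one-sided p = {p:.3f}")
```

Because z grows only with the square root of the number of pairs, a gene of small effect, which raises sharing only slightly above 0.5, requires a very large sample to detect. Heterogeneity across families lowers the average excess sharing further.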
There are other issues that plague candidate gene and linkage analysis. Many of these have been discussed in the literature, although they do not necessarily bear on the complexity of the traits to be studied but rather on statistical phenomena, such as marker informativity, marker spacing, and type I and type II error rates.1
Directions for Future Research
In an effort to accommodate the kind of complexity underlying HCD in candidate gene and whole-genome searches, one might have to consider a number of issues. Some of these issues are described in isolation below but have been touched on elsewhere as well.1,2,4,14,17–19
Finding More Homogeneous Populations to Sample From
Obviously, one very good way to cut down on possible heterogeneity problems plaguing HCD genetics research would be to sample individuals known to be of common origin (ie, likely to possess the same set of mutant genes and genetic variants predisposing to HCD) and exposed to common environments. Such sampling has been given heavy emphasis in linkage disequilibrium mapping studies, where the relative isolation of a population, its age, its size, and its environmental homogeneity are all considered in the mapping effort.20 One drawback of such studies is that the genes identified may not be “ubiquitous” and may in fact cause HCD only in the population studied. Consider the study of an island population founded by a small religious sect that promoted a strict lifestyle. The genes underlying HCD among this group of people may be “private” alleles that are unique to that population and do not contribute to (because they do not exist among people with) more “garden variety” forms of HCD seen in much larger populations. Such an argument is not compelling if one merely wants to determine a physiological mechanism that influences blood pressure by finding a gene. Finding special populations is not the only way one could preserve homogeneity. One could attempt to identify unique features within persons having a common condition (eg, obese, type II diabetics with HCD) in an effort to cull out a more clinically homogeneous group. The motivation for this would be to find a group of individuals that have a condition caused by a common set of dysfunctional genes.1,21
Assessing Population Structure
In the absence of island populations and the like, one could perform molecular assays within a large population in an effort to identify more genetically homogeneous subgroups. For example, one could try to determine the amount of admixture within a population and attempt to exploit this admixture to map genes.22 In addition, one could attempt to determine the relative genetic distance between populations in an effort to assess their possible common origins and ultimate homogeneity or attempt to reconstruct the genealogical relationships among people within a population so this information could be exploited in gene mapping efforts.23,24
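One widely used summary of how strongly subgroups differ genetically is Wright's fixation index, F_ST, which for a single biallelic marker is the variance of the allele frequency across subgroups divided by p(1 − p), where p is the mean frequency. The sketch below is offered only as an illustration of what "assessing population structure" can mean in practice, not as the method of the studies cited above. It assumes equal-sized subgroups and allele frequencies that are known rather than estimated. The function name and the example frequencies are hypothetical.

```python
def fst(subpop_freqs):
    """Wright's F_ST for one biallelic marker.

    `subpop_freqs` holds the frequency of one allele in each subgroup;
    subgroups are assumed to be of equal size. Values near 0 indicate
    little differentiation; values near 1 indicate near-complete
    differentiation.
    """
    if not subpop_freqs:
        raise ValueError("at least one subgroup frequency is required")
    k = len(subpop_freqs)
    mean_freq = sum(subpop_freqs) / k
    total_var = mean_freq * (1.0 - mean_freq)
    if total_var == 0.0:
        # Allele absent or fixed in every subgroup: nothing to differentiate.
        return 0.0
    between_var = sum((p - mean_freq) ** 2 for p in subpop_freqs) / k
    return between_var / total_var


if __name__ == "__main__":
    print(f"similar subgroups:   {fst([0.48, 0.50, 0.52]):.4f}")
    print(f"divergent subgroups: {fst([0.10, 0.90]):.4f}")
```

A marker-by-marker screen of this kind could flag samples that pool genetically distinct subgroups, which is the situation in which heterogeneity is most likely to wash out a single gene's effect.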
Making Better Use of Animal Models
Mapping genes that influence traits in model organisms can help human geneticists find and characterize genes that influence analogous human traits in two ways. First, model organism studies have the capacity to expose systems and subsystems influencing a trait or condition that are likely to have human counterparts. Such knowledge can steer human geneticists to physiological and biochemical systems whose genetic bases might be known or easily identified. Second, genes are known to be conserved throughout evolution, so that finding a gene that influences blood pressure in rats, for example, may lead one to study the homologous human gene. Of course, the leap from rats to humans is a large one, so that the gene identified in rats may have lost (or changed) its function in humans.
Promoting Better Physiology
Obviously, the greater our understanding of the physiological and biochemical determinants of blood pressure regulation, the easier it will be to put the roles each gene might have into perspective. Thus, for example, it would be worthwhile to map genes that influence traits at lower levels of a physiological hierarchy. Such genes would likely be easier to identify since the phenotypes they influence are not as far removed from the genetic substrate that determines them (or at least not to the same degree as the more remote trait they impact). Thus, there are likely to be fewer genes and other factors that influence these “intermediate” traits. In addition, the knowledge gained from the identification of such genes would shed enormous light on how the determinants of, eg, blood pressure regulation, interact and operate when upset or dysfunctional. Note that such information can be gleaned from pharmacological probes and studies as well.4,25
Better Statistical Methods and Designs
The development of linkage and candidate gene strategies that can accommodate multiple genetic and environmental factors should easily advance HCD genetics research. In addition, more efficient designs for mapping genes can only result in a greater number of research efforts, leading to a possible convergence and corroboration of results. Just how such designs would take their shape is of course in question, but recent work by Risch and others suggests some directions.14,26 Such designs may also be dictated by technological breakthroughs. For instance, if sequencing genes becomes cheap, then study designs and analytical methods for directly relating sequence variation and trait variation will likely become focal points in statistical genetics research.
The emphasis among current medical researchers on the genetic dissection of complex traits such as HCD will not likely diminish any time soon. The difficulties surrounding the full disclosure of the array of genetic and environmental determinants of HCD, many of which have been touched on in this paper, will likely drive relevant research well into the future. The direction such research will take will likely be vastly different from current research paradigms. For example, some have argued that the “future” of complex genetic trait research will be of a largely statistical orientation.14 This is highly unlikely, since the development of better animal models, in vitro assays, pharmacological probes, gene expression analyses, and population genetic investigations will likely overshadow discussions about which statistic or modeling device will be the least error prone. This does not, however, undermine the significance quantitative methods will have in HCD research. Ultimately, what would seem to be the most compelling position to take in this light is the simple promotion of concerted efforts to integrate various strategies and the knowledge obtained with them.19,27
Supported by NIH grants HL94-011 (NHLBI), HL54998-01 (NHLBI), and RR03655-11 (NCRR).
Lander ES, Schork NJ. Genetic dissection of complex traits. Science. 1994;265:2037–2048.
Schork N, Chakravarti A. A nonmathematical overview of modern gene mapping techniques applied to human diseases. In: Mockrin S, ed. Molecular Genetics and Gene Therapy of Cardiovascular Disease. New York, NY: Marcel Dekker Inc; 1996:79–109.
Schork NJ, Nath SP, Lindpaintner K, Jacob HJ. Extensions of quantitative trait locus mapping in experimental organisms. Hypertension. 1996. In press.
Weder AB, Schork NJ. Adaptation, allometry, and hypertension. Hypertension. 1994;24:145–156.
Wright S. Evolution in Mendelian populations. Genetics. 1931;16:97–159.
Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996;273:1516–1517.
Schork NJ, Xu X. Sibpairs versus pedigrees: what are the advantages? Diabetes Rev. 1996. In press.
Ghosh S, Schork NJ. Genetic analysis of NIDDM: the study of quantitative traits. Diabetes. 1996;45:1–14.
Xu X, Schork NJ. Linking genes and environmental exposure: why China presents special opportunities. Cancer Causes Control. 1996. In press.
Freimer NB, Reus VI, Escamilla MA, McInnes LA, Spesny M, Leon P, Service SK, Smith LB, Silva S, Rojas E, Gallegos A, Meza L, Fournier E, Baharloo S, Blankenship K, Tyler DJ, Batki S, Vinogradov S, Weissenbach J, Barondes SH, Sandkuijl LA. Genetic mapping using haplotype, association and linkage methods suggests a locus for severe bipolar disorder (BPI) at 18q22–q23. Nat Genet. 1996;12:436–441.
Risch N, Zhang H. Extreme discordant sib pairs for mapping quantitative trait loci in humans. Science. 1995;268:1584–1589.