# Potentially Spurious Correlations Between Arterial Size, Flow-Mediated Dilation, and Shear Rate

## Abstract

The use of indices formed from the ratio of 2 variables often generates spurious correlations with other variables that are mathematically coupled. In this context, we examined the correlations between percent flow-mediated dilation, baseline diameter, and shear rate. In a sample of 315 participants, with and without substantial vascular risk factors, the observed correlation coefficients between the variables were of a similar magnitude to those reported in the literature. We then applied a Monte Carlo procedure based on random permutations to remove any physical or physiological explanation for these correlations. We found that the median residual correlation coefficients were comparable with those observed in our original sample. When the confounding influence of artery size was adjusted for, the mean difference in percent flow-mediated dilation between high-risk and low-risk samples was halved. These findings indicate that the widely reported correlations between flow-mediated dilation, basal artery diameter, and shear rate have a substantial spurious component. This is because percent flow-mediated dilation and shear rate are mathematically coupled to artery size.

- cardiovascular risk factors
- flow-mediated dilation
- mathematical artifacts
- Monte Carlo methods
- ratio variables
- shear stress

## Introduction

Ratio variables, indices generated by the ratio of 2 measurements, are widely used in biomedical research. Typical examples are body mass index (body weight divided by the square of body height), waist-to-hip ratio, high-density lipoprotein over total cholesterol, etc. Statisticians, however, have often warned about the problems arising from the use of ratio variables, particularly in correlation analysis. If 2 ratio variables share a common component (eg, the denominator), they inevitably show some degree of correlation even in the absence of any physical or biological relationship between the 2 indices or between any of the variables used to compute them. Karl Pearson was the first statistician to highlight this problem and created the expression spurious to define the correlations arising in the absence of a real relationship.^{1} As an illustration, he provided the example of a researcher who collects random triplets of bones (femur, tibia, humerus) from a huge stack of broken skeletons belonging to different animals: although there is no biological relationship between the lengths of the bones collected within the same triplet, the indices femur/humerus and tibia/humerus (sometimes used in paleontology) show a strong correlation, for the simple reason that they share the same denominator.^{1} The same problem can be generalized to the correlations between ratio variables sharing any common component, including the correlation between a ratio variable and its own denominator. 
More recently, other authors have described similar problems related to different indices used in the medical field: the waist-to-hip ratio,^{2} the ratio of forced expiratory volume divided by the square of the height,^{2} the cardiac index,^{2} the body mass index,^{3} and the percent gingival recession.^{4} In the analysis of all these ratio variables, spurious correlations have been recognized as a major problem,^{5} defined by some authors as mathematical coupling.^{2} In the present study, we examined 3 potentially spurious correlations arising between ratio variables involved in the measurement of the flow-mediated dilation (*FMD*).
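Pearson's thought experiment can be reproduced numerically. The following is a minimal sketch, not taken from the source: the bone lengths are hypothetical, drawn independently from normal distributions with arbitrary means and a 10% coefficient of variation, and the helper `pearson_r` is written here only for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Hypothetical bone lengths (arbitrary units), drawn independently of each other.
femur = [rng.gauss(40.0, 4.0) for _ in range(n)]
tibia = [rng.gauss(35.0, 3.5) for _ in range(n)]
humerus = [rng.gauss(30.0, 3.0) for _ in range(n)]

femur_idx = [f / h for f, h in zip(femur, humerus)]
tibia_idx = [t / h for t, h in zip(tibia, humerus)]

r_raw = pearson_r(femur, tibia)            # raw lengths: no built-in relation
r_ratio = pearson_r(femur_idx, tibia_idx)  # two ratios sharing a denominator
r_denom = pearson_r(femur_idx, humerus)    # a ratio versus its own denominator
print(f"raw={r_raw:.2f} ratios={r_ratio:.2f} ratio_vs_denominator={r_denom:.2f}")
```

Under these assumptions the raw lengths are essentially uncorrelated, whereas the two indices correlate positively and the index correlates negatively with its own denominator, although no relationship was built into the data.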

The *FMD* test is a noninvasive technique for measuring endothelial function,^{6} that is, the capacity of brachial artery diameter (*BAD*) to respond to a reactive hyperemia induced by 5 minutes of ischemia.^{7} *FMD* is used as an early marker of arterial damage, has been associated with most vascular risk factors including hypertension,^{8,9} and responds rapidly to antihypertensive or lipid-lowering pharmacological and lifestyle interventions.^{10} Notably, the *FMD* (or better *FMD%*) is commonly assessed as:

$$
\text{FMD\%} = 100 \times \frac{\text{deltaBAD}}{\text{BADat-rest}}
$$

where *BADat-rest* is the *BAD* measured at rest and *deltaBAD* is the absolute change of *BAD* induced by the ischemic stimulus (ie, *BADmax* minus *BADat-rest*). *FMD%* is, therefore, a typical ratio variable. Flow-mediated dilation is commonly explained by a mechanism involving the shear stress caused by the hyperemic blood flow against artery walls. Specifically, shear stress stimulates the endothelium, which triggers vasodilation through the release of vasoactive molecules. The shear stress is commonly measured by a surrogate index, the shear rate (*SR*), computed as 8×*Vmax*/*BADat-rest*^{11}; here again we have the ratio of 2 measures: *Vmax* (mean blood velocity at hyperemia) and *BADat-rest*.
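As a small illustration of the two definitions, the sketch below computes both indices for hypothetical values. The function names, the factor of 100 (implied by "percent"), and the numbers are assumptions made for the example; units are left abstract.

```python
def fmd_percent(bad_rest, bad_max):
    """Percent flow-mediated dilation: 100 * (BADmax - BADat-rest) / BADat-rest."""
    return 100.0 * (bad_max - bad_rest) / bad_rest


def shear_rate(v_max, bad_rest):
    """Shear rate surrogate as defined in the text: 8 * Vmax / BADat-rest."""
    return 8.0 * v_max / bad_rest


# Hypothetical example: the same absolute dilation (0.2) and the same velocity (80)
# in a small and in a large artery.
for bad_rest in (3.0, 5.0):
    fmd = fmd_percent(bad_rest, bad_rest + 0.2)
    sr = shear_rate(80.0, bad_rest)
    print(f"BADat-rest={bad_rest}: FMD%={fmd:.2f}, SR={sr:.1f}")
```

With identical absolute dilation and identical velocity, both indices come out larger in the smaller artery simply because *BADat-rest* sits in both denominators.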

In this context, significant correlations have been consistently reported among the 3 relevant variables mentioned above. Specifically: (1) an inverse correlation between *FMD%* and *BADat-rest*^{7,12–20}; (2) an inverse correlation between *SR* and *BADat-rest*^{14,16}; and (3) a direct correlation between *FMD* and *SR*.^{16,17,21}

All 3 of these correlations are exactly what one expects, based on the hypothesis that the *FMD* is modulated by the shear stress,^{13,14,16–20,22} and that the blood flow in smaller vessels is associated with a higher shear stress.

The main problem is that these 3 correlations are computed between ratio variables and their own denominator (correlations 1 and 2) or between ratio variables sharing a common component (correlation 3). Mathematical coupling is expected to cause a negative correlation in the first 2 cases and a positive correlation in the third case. Thus, the observed correlations are likely to include a spurious component. Nevertheless, to our knowledge, this problem has been so far mostly neglected, with a few exceptions. Mitchell et al^{23} reported a negative association between *FMD%* and *BADat-rest*, but remarked that “because brachial artery diameter was included in the equations for *FMD%* and *DSS* (diastolic shear stress), the possibility existed that the relationship between *DSS* and *FMD%* was predominantly mathematical rather than physiological.” More recently, Atkinson et al^{24,25} questioned the validity of the formula used to compute *FMD%*, showing that the ratio *deltaBAD*/*BADat-rest* is not adequate to accurately quantify endothelial function, after correcting for differences in *BADat-rest*. They highlighted the central role of *BADat-rest* as a crucial confounder in the research of human endothelial function^{26} and proposed the adjustment for *BADat-rest* by covariance analysis as a standard analytic approach in the comparison of *FMD* between different groups.

The objective of the present study was to verify whether the correlations 1, 2, and 3 may be attributed, at least in part, to a mathematical artifact, and to quantify their potential spurious component. To this aim we used a simple Monte Carlo procedure based on random permutations: using data actually measured on real subjects, this procedure removes any potential relation, physical and physiological, previously existing between the variables. In this regard, this procedure may be regarded as a computer simulation of Pearson's thought experiment about the random collection of triplets of bones.

## Methods

### Subjects

In the present study, we reanalyzed the database of the Laboratory of Arterial Morphology and Function of the Centro Cardiologico Monzino in Milan. A total of 240 patients with cardiovascular risk factors but without history of cardiovascular events and 75 healthy subjects without cardiovascular risk factors, except for age and smoking, recruited among patients’ relatives and hospital staff, were included in the study. The study adheres to the Declaration of Helsinki and was in line with institutional guidelines. All participants signed an informed consent form. Table S1 in the online-only Data Supplement reports the clinical and anthropometric characteristics of the subjects. The sample was intentionally heterogeneous, because greater between-subject variability is expected to provide a wider range of arterial measures, thus increasing the probability of detecting a true correlation as significant. Nevertheless, to assess the stability of the results, all the analyses were repeated within each subgroup.

### Evaluation of Brachial Artery Function

*FMD* was examined according to the methods described elsewhere^{27} and presented in detail in the Methods section of the online-only Data Supplement. Briefly, brachial artery images were recorded by B-mode ultrasound for 1 minute at rest (prestimulus), during 5 minutes of ischemia, and for 3 minutes during the reactive hyperemic postdeflation phase. *BAD* was measured with dedicated software.^{28}

The brachial artery mean flow velocity was measured at rest, during the 60 seconds before cuff deflation and during 15 seconds after cuff deflation (*Vmax*), by using a pulsed Doppler signal.

### Data Analysis

The expected amount of spurious correlation between a ratio variable and its denominator, or between 2 ratio variables sharing the same denominator, critically depends on the kind of distribution of both numerator and denominator and on their respective coefficients of variation.^{29} Pearson^{1} provided an approximate formula to evaluate the expected spurious correlation between ratio variables whose components are normally distributed. Dunlap et al^{30} used a Monte Carlo simulation to empirically verify Pearson's formula and to quantify the spurious correlations between ratio variables with a common component. Their approach consisted of generating, by a computer routine, totally independent, normally distributed, random numbers *X*, *Y*, and *C*, to simulate 2 ratio variables *Y/C* and *X/C* sharing the same denominator but lacking any biological relation. In our case, *BADat-rest*, *deltaBAD*, *Vmax*, and *SR* exhibit markedly skewed distributions. Therefore, instead of generating normally distributed variables, we performed a random permutation test (also called randomization test or rerandomization test), as recommended by Jackson and Somers.^{31} This procedure is also called exact test because, by simulating the null hypothesis (ie, no association between the variables), it allows statistical tests to be performed while rigorously maintaining the original distributions of the variables.
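For orientation, the approximation is commonly quoted in the following form for the special case of mutually uncorrelated components with positive means. This is a first-order sketch stated here as an assumption, not reproduced from the source; $V_X$, $V_Y$, and $V_C$ denote the coefficients of variation of the 2 numerators and of the shared denominator, and the second expression follows from the same first-order expansion:

$$
r\!\left(\frac{X}{C}, \frac{Y}{C}\right) \approx \frac{V_C^2}{\sqrt{(V_X^2 + V_C^2)(V_Y^2 + V_C^2)}}, \qquad
r\!\left(\frac{X}{C}, C\right) \approx \frac{-V_C}{\sqrt{V_X^2 + V_C^2}}
$$

For example, when all 3 coefficients of variation are equal, the first expression gives 0.5 and the second gives $-1/\sqrt{2} \approx -0.71$, although the components are unrelated. Both expressions lose accuracy when the coefficients of variation are large or the distributions are skewed, which is the situation that motivates the permutation approach.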

### Random Permutation Procedure

First, values of *BADat-rest*, absolute *deltaBAD*, and mean peak blood velocity (*Vmax*) were stored in 3 columns of a data set. *Real FMD%* and *SR* were then computed according to the usual formulas, using the variables actually measured in each individual:

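$$
\text{real-FMD\%}_i = 100 \times \frac{\text{deltaBAD}_i}{\text{BADat-rest}_i}, \qquad \text{real-SR}_i = 8 \times \frac{\text{Vmax}_i}{\text{BADat-rest}_i}
$$

for every individual *i*.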

Subsequently, for 2000 iterations, the columns containing *deltaBAD* and *Vmax* were randomly shuffled, whereas the column containing *BADat-rest* was kept unchanged. Thus, at each iteration, a simulated *FMD%* (*sim-FMD%*) and a simulated *SR* (*sim-SR*) were recomputed for every individual *i*:

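$$
\text{sim-FMD\%}_i = 100 \times \frac{\text{deltaBAD}_j}{\text{BADat-rest}_i}, \qquad \text{sim-SR}_i = 8 \times \frac{\text{Vmax}_k}{\text{BADat-rest}_i}
$$

where, typically, *i ≠ j ≠ k*.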

The results were 2000 different data sets with values of *BADat-rest*, *deltaBAD*, and *SR* randomly matched, thus lacking any physical or biological relation potentially present, although maintaining their original distributions.

Next, the empirical correlations between *sim-FMD%* and *BADat-rest*, between *sim-SR* and *BADat-rest*, and between *sim-FMD%* and *sim-SR* were computed in each data set. In the absence of artifacts, these correlations are expected to be null, with an empirical distribution of the coefficients centered around zero. Any observed systematic correlation between these 3 variables (ie, any shift of the distributions with respect to the value zero) should thus be regarded as the result of a mathematical artifact. Moreover, by comparing the *R* coefficients obtained in the original data set with those obtained after random permutations, it is possible to quantify the spurious component of the former ones.
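The procedure described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' SAS program: the function names are hypothetical, the input data are synthetic log-normal draws generated independently of one another (an assumption standing in for the real measurements), and only 200 iterations are run to keep the example fast.

```python
import math
import random
import statistics


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_r(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


def permutation_medians(bad_rest, delta_bad, v_max, n_iter=2000, seed=0):
    """Median Spearman R of the 3 correlations after shuffling deltaBAD and Vmax."""
    rng = random.Random(seed)
    d, v = list(delta_bad), list(v_max)
    r1, r2, r3 = [], [], []
    for _ in range(n_iter):
        rng.shuffle(d)  # BADat-rest stays in place; deltaBAD is rematched at random
        rng.shuffle(v)  # Vmax is rematched independently
        sim_fmd = [100.0 * dj / b for dj, b in zip(d, bad_rest)]
        sim_sr = [8.0 * vk / b for vk, b in zip(v, bad_rest)]
        r1.append(spearman_r(sim_fmd, bad_rest))
        r2.append(spearman_r(sim_sr, bad_rest))
        r3.append(spearman_r(sim_fmd, sim_sr))
    return statistics.median(r1), statistics.median(r2), statistics.median(r3)


# Synthetic, mutually independent, skewed inputs (illustrative parameters only).
rng = random.Random(1)
n = 315
bad_rest = [rng.lognormvariate(math.log(4.0), 0.2) for _ in range(n)]
delta_bad = [rng.lognormvariate(math.log(0.2), 0.6) for _ in range(n)]
v_max = [rng.lognormvariate(math.log(80.0), 0.3) for _ in range(n)]

m1, m2, m3 = permutation_medians(bad_rest, delta_bad, v_max, n_iter=200)
print(f"sim-FMD% vs BADat-rest: {m1:.2f}")
print(f"sim-SR vs BADat-rest:   {m2:.2f}")
print(f"sim-FMD% vs sim-SR:     {m3:.2f}")
```

Because the inputs are independent by construction, any systematic departure of the three medians from zero in this sketch can only come from the shared denominator.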

The permutation test was performed using an in-house–developed SAS (SAS Institute Inc, Cary, NC) program.

### Statistics

Subjects’ characteristics were summarized as n (%) for categorical variables and as mean±SD or median and interquartile range for continuous variables with normal or non-normal distributions, respectively. Adjusted means of *FMD%* and absolute diameter change in the 2 groups were computed by covariance analysis, adjusting for baseline diameter. The results of the random permutation procedure were represented by the median values and the 2.5 and 97.5 percentiles. Because the variables of interest (*BADat-rest*, *FMD%*, and *SR*) were not normally distributed, all correlations were computed by the Spearman method. Yet, to compare our results with those most widely reported in the literature, the analysis was repeated using Pearson correlation. The Fisher transformation,^{32} *z*=(1/2)[ln(1+*R*)−ln(1−*R*)], was used to compute 95% confidence limits for the correlation coefficients. All analyses were performed using SAS statistical package version 9.2.
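The confidence limits based on the Fisher transformation can be illustrated as follows. This is a sketch under stated assumptions: the standard error 1/√(n−3) is the usual textbook form and is not given in the source, `fisher_ci` is a hypothetical helper, and the inputs (*R*=−0.32, n=315) are the whole-sample values quoted in the text for *FMD%* versus *BADat-rest*.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence limits for a correlation r computed from n pairs."""
    z = 0.5 * (math.log(1 + r) - math.log(1 - r))  # Fisher z, as in the text
    se = 1.0 / math.sqrt(n - 3)                    # assumed standard error of z
    lower, upper = z - z_crit * se, z + z_crit * se
    return math.tanh(lower), math.tanh(upper)      # back-transform to the R scale


lo, hi = fisher_ci(-0.32, 315)
print(f"95% CI: {lo:.2f} to {hi:.2f}")
```

Under these assumptions the interval runs from about −0.42 to −0.22.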

## Results

### Analysis of Real Data

Table 1 reports the correlation coefficients between *real-FMD%* and *BADat-rest*, between *real-SR* and *BADat-rest*, and between *real-FMD%* and *real-SR*, computed in the original database. In the entire sample, the correlations of *real-FMD%* and *real-SR* with *BADat-rest* were significant and negative, and the correlation between *real-FMD%* and *real-SR* was significant and positive (Table 1; Figure S1), with absolute values of Spearman *R* coefficients ranging from 0.20 to 0.51. The correlation coefficients showed a similar pattern in both cardiovascular risk factors and healthy subgroups. Pearson correlation coefficients were very similar to Spearman, and their values were in the range of those reported in the literature.^{14,16,17,21,33} Note that in the whole sample the correlation of *deltaBAD* with *BADat-rest* was nearly null (*R*=−0.01; *P*=0.86), and the correlation of *Vmax* with *BADat-rest* was rather weak (*R*=−0.09; *P*=0.10) and was inconsistent between the 2 groups (*R*=−0.16; *P*=0.01 and *R*=+0.14; *P*=0.01 in subjects with cardiovascular risk factors and healthy subjects, respectively; Table 1).

### Results With Random Permutations

The Figure shows the frequency distributions of Spearman correlation coefficients between *sim-FMD%* and *BADat-rest* (Figure, A), between *sim-SR* and *BADat-rest* (Figure, B), and between *sim-FMD%* and *sim-SR* (Figure, C). In contrast with what was expected in the absence of an artifact, all 3 distributions were markedly shifted: to the left (*sim-FMD%* versus *BADat-rest*, and *sim-SR* versus *BADat-rest*) or to the right (*sim-FMD%* versus *sim-SR*), with respect to the null value. None of the 2000 permutations yielded positive coefficients for the first 2 correlations, and only 8 of 2000 yielded negative coefficients for the third correlation. The arrows in the Figure indicate the value of the 3 Spearman correlation coefficients observed in the real data (see Table 1). Notably, the proportions of permutations yielding *R* coefficients more extreme than the values observed in the original sample were 56.8% for *sim-FMD%* versus *BADat-rest*, 20.2% for *sim-SR* versus *BADat-rest*, and 19.4% for *sim-FMD%* versus *sim-SR*.

The medians of the correlation coefficient distributions, for the whole sample and for the subgroups, are reported in Table 2. These values are very close to the corresponding *R* coefficients computed with nonpermuted data, except for the correlation between *sim-FMD%* and *sim-SR*, where the median *R* coefficient was reduced by ≈25%. In all cases, the 95% confidence intervals of the *R* coefficients computed with nonpermuted data included the medians of the *R* coefficients computed with random permutations.

## Discussion

Our study has demonstrated that a large proportion of the correlations between *FMD%* and *BADat-rest*, between *SR* and *BADat-rest*, and between *FMD%* and *SR* is accounted for by a mathematical artifact. Indeed, the distributions of *R* coefficients obtained with random permutations, instead of being centered around the value zero, were markedly shifted. In the first 2 cases the medians were very close to the values of *R* coefficients computed from the original data (−0.33 versus −0.32 for the correlation between *FMD%* and *BADat-rest*, and −0.43 versus −0.46 for the correlation between *SR* and *BADat-rest*). In the third case the median of the random permutations was somewhat lower than the value obtained from the original data (0.15 versus 0.20). In terms of *R*^{2}, we can estimate that the proportions of the correlations accounted for by a mathematical artifact were 100% for *FMD%* versus *BADat-rest*, 87% for *SR* versus *BADat-rest*, and 56% for *FMD%* versus *SR*. Notably, the results were nearly the same for both subgroups included in the sample, which indicates that the conclusions are not dependent on the subjects’ characteristics.
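The quoted percentages are consistent with taking the ratio of the squared median permuted coefficient to the squared observed coefficient, capped at 100%. The source does not spell out the computation, so the following is one reading that reproduces the figures:

$$
\frac{(-0.33)^2}{(-0.32)^2} \approx 1.06 \;\Rightarrow\; 100\%, \qquad
\frac{(-0.43)^2}{(-0.46)^2} \approx 0.87, \qquad
\frac{(0.15)^2}{(0.20)^2} \approx 0.56
$$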

Several authors reported that *BADat-rest* is an independent predictor of *FMD%* (eg, Yu et al).^{34} Holubkov et al^{35} reported “a significant inverse correlation between resting brachial diameter and hyperemia-induced maximum brachial artery diameter, suggesting that impaired FMD may be present in patients with large resting brachial artery diameters, independent of the integrity of the endothelium.” In the article of Silber et al,^{17} the authors asked: “Why is flow-mediated dilation dependent on arterial size?” In the introduction, the authors stated: “the reasons for this phenomenon are poorly understood. We have previously shown that FMD is greater in small brachial arteries because the shear stress stimulus is greater in small brachial arteries. However, it is unclear why the shear stimulus is greater in small arteries.” In their conclusion, the authors recognized that, by applying the Poiseuille formula, the calculated shear rate is proportional to 1/radius, thus “the greater FMD in small conduit arteries compared with large arteries does not reflect better inherent endothelial function.” Pyke and Tschakovsky^{14,36} proposed to normalize the response (*FMD%*) to the relevant stimulus (shear stress) by dividing *FMD%* by *SR*, integrated during 1 minute. They then showed that the dependence of *FMD%* on *BADat-rest* was greatly reduced after normalization. However, Thijssen et al^{19} found that the measured shear stress stimulus explained only 10% to 15% of the *FMD* response and suggested that “other shear-independent factors contribute to individual differences in the magnitude of FMD responses.”

Actually, our results indicate that most of these observations (as is indirectly recognized by some authors, but never clearly stated) can be explained by the fact that the formula of *FMD%* (*deltaBAD*/*BADat-rest*) is not appropriate to quantify *FMD*. Atkinson and Batterham^{24} showed that *FMD%* does not accurately scale across the range of *BADat-rest*, leading to an overestimate of endothelial function for low *BADat-rest* and vice versa. Moreover, in the absence of a proper scaling by *BADat-rest*, statistical inferences about the association of *FMD%* and cardiovascular disease may be problematic. In fact, the baseline diameter of brachial artery, as well as the diameter of the common carotid artery, has been shown to be predictive of future cardiovascular events.^{24,37} Therefore, any significant association found between *FMD%* and cardiovascular disease might be attributed to the denominator (*BADat-rest*) and not only to the whole ratio variable.

We think that it is inappropriate, statistically speaking, to define *BADat-rest* as an independent predictor of *FMD%*, even if this association is significant in every decade of age.^{38} Similarly, it is not surprising that the normalization of the response to the shear rate (ie, dividing *FMD%* by shear rate area under the curve) greatly reduces the correlation with *BADat-rest*. Indeed, the variable *BADat-rest* is present in the formulas of both *FMD%* and shear rate (even when the latter is integrated over a time interval), and it is mathematically eliminated when the ratio is computed, thus removing the spurious component of the correlation.

It is important to note that our results do not disprove that shear stress has a physiological role in the genesis of *FMD*. Instead, our results clearly show that, because the correlation between *FMD%* and shear rate is largely spurious, authors should not present this correlation among their results, even if they do not technically use it to demonstrate a causal relation between shear stress and *FMD*; such a relation should be substantiated by presenting other sources of evidence (eg, experimental ex vivo studies like the one published by Paniagua et al).^{39} The use of *FMD%* should be avoided in all analyses, and all associations should be tested between variables obtained by independent measurements. For instance, we suggest testing the dependence of *FMD* on its proposed stimulus by including the variables absolute diameter change, blood flow velocity, and *BADat-rest* separately in an appropriate statistical model.

A more general consideration about the use of ratio variables can be made. Their widespread utilization in biomedical research has 2 main reasons^{31}: first, a ratio incorporates 2 variables in a single measure that can be used in simple univariable analysis; second, there is a need for standardization in clinical and epidemiological studies. For instance, body weight strongly depends on body size: if one needs to compare the weight of different subjects, standardization is required to account for differences due to body size. However, it has been shown that ratio variables are inefficient from a statistical perspective,^{40} because they tend to have non-normal distributions even when both the numerator and the denominator are normally distributed, and because they are sensitive to changes of the denominator variance. Again, to account for the denominator, it is more appropriate to include it as a covariate in multivariable analysis rather than to incorporate the confounding variable in a ratio.

In conclusion, it is highly recommended to consider with caution all the correlations between ratio variables, such as *FMD%* and shear rate, regardless of the level of statistical significance, because they can be mainly explained by mathematical artifacts. Moreover, in line with what Atkinson et al^{41} proposed, we suggest the use of absolute diameter change instead of *FMD%* in clinical and epidemiological studies and of correction for baseline diameter and for shear rate by inclusion in an appropriate statistical model. Finally, to properly cope with allometric scaling, it is also recommended to log-transform the data before analysis.
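One possible form of such a model is sketched below. Everything in it is an assumption made for illustration: the source does not specify the model, so the outcome is taken to be the change in log diameter, with log baseline diameter and a group indicator as covariates; the data are synthetic; and `ols` is a small hypothetical helper that solves the normal equations directly.

```python
import math
import random


def ols(X, y):
    """Least-squares coefficients via the normal equations (small problems only)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


rng = random.Random(2)
rows, y = [], []
for i in range(300):
    high_risk = i % 2  # hypothetical group indicator (0 = low risk, 1 = high risk)
    # Synthetic baseline diameter, slightly larger in the high-risk group.
    bad_rest = rng.lognormvariate(math.log(4.0 + 0.4 * high_risk), 0.15)
    # Synthetic absolute dilation, independent of artery size, with a built-in
    # group difference of -0.05 used purely for illustration.
    delta = max(0.01, rng.gauss(0.25 - 0.05 * high_risk, 0.08))
    rows.append([1.0, math.log(bad_rest), float(high_risk)])
    y.append(math.log(bad_rest + delta) - math.log(bad_rest))

intercept, slope_log_bad, group_effect = ols(rows, y)
print(f"slope on log(BADat-rest): {slope_log_bad:.3f}")
print(f"group effect adjusted for baseline diameter: {group_effect:.3f}")
```

Entering baseline diameter as a covariate, rather than dividing by it, lets the data determine how the response scales with artery size, so the group comparison is made at equal baseline diameter.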

### Perspectives

Our results are in line with those from other authors reporting spurious correlations in the analysis of ratio variables. We have shown that some correlations reported in several studies investigating brachial artery endothelial function are attributable to a mathematical artifact, rather than a biological relationship. These findings have relevant implications for future epidemiological and clinical studies that examine brachial artery endothelial function and highlight the need to avoid the use of ratio variables, such as *FMD%* or *SR*, in correlation analysis.

## Sources of Funding

Financial support was received from Centro Cardiologico Monzino IRCCS and the Italian Ministry of Health (Ricerca Corrente 2007, Progetto Bio 40).

## Disclosures

None.

## Footnotes

The online-only Data Supplement is available with this article at http://hyper.ahajournals.org/lookup/suppl/doi:10.1161/HYPERTENSIONAHA.114.03608/-/DC1.

- Received April 23, 2014.
- Revision received May 15, 2014.
- Accepted September 4, 2014.

- © 2014 American Heart Association, Inc.

## References

1. Pearson K
2.
3. Kronmal RA
4.
5. Curran-Everett D
6. Charakida M, Masi S, Lüscher TF, Kastelein JJ, Deanfield JE
7.
8.
9.
10.
11.
12. Herrington DM, Fan L, Drum M, Riley WA, Pusser BE, Crouse JR, Burke GL, McBurnie MA, Morgan TM, Espeland MA
13. Koller A, Sun D, Kaley G. […] *in vitro*. Circ Res. 1993;72:1276–1284.
14.
15.
16.
17. Silber HA, Ouyang P, Bluemke DA, Gupta SN, Foo TK, Lima JA
18. Thijssen DH, Bullens LM, van Bemmel MM, Dawson EA, Hopkins N, Tinken TM, Black MA, Hopman MT, Cable NT, Green DJ
19. Thijssen DH, Dawson EA, Black MA, Hopman MT, Cable NT, Green DJ
20. Thijssen DH, van Bemmel MM, Bullens LM, Dawson EA, Hopkins ND, Tinken TM, Black MA, Hopman MT, Cable NT, Green DJ
21. Betik AC, Luckham VB, Hughson RL
22.
23. Mitchell GF, Parise H, Vita JA, Larson MG, Warner E, Keaney JF Jr., Keyes MJ, Levy D, Vasan RS, Benjamin EJ
24.
25. Atkinson G, Batterham AM
26.
27.
28.
29.
30.
31. Jackson DA, Somers KM
32. Snedecor GW, Cochran WG
33. Gibbs BB, Dobrosielski DA, Lima M, Bonekamp S, Stewart KJ, Clark JM
34.
35.
36. Pyke KE, Dwyer EM, Tschakovsky ME
37. Baldassarre D, Hamsten A, Veglia F, de Faire U, Humphries SE, Smit AJ, Giral P, Kurl S, Rauramaa R, Mannarino E, Grossi E, Paoletti R, Tremoli E
38. Maruhashi T, Soga J, Fujimura N, et al
39. Paniagua OA, Bryant MB, Panza JA
40.
41.

# Novelty and Significance

### What Is New?

This study addresses, for the first time, a widespread bias in the analysis of the correlations between variables commonly used in the assessment of endothelial dysfunction.

The extent of this bias is quantified and an analytic strategy to overcome the problem is proposed.

### What Is Relevant?

Brachial artery *FMD%* and shear rate are ratio variables widely used in biomedical research to assess endothelial dysfunction, which is considered as an early sign of atherosclerosis and is associated with most cardiovascular risk factors, including hypertension.

In clinical and epidemiological studies, it is crucial to avoid potential sources of bias when interpreting the relationship between ratio variables.

### Summary

It is highly recommended to avoid biological explanations when interpreting correlations between ratio variables, because these correlations can arise mainly from mathematical artifacts, regardless of the level of statistical significance observed.


- Fabrizio Veglia, Mauro Amato, Marta Giovannardi, Alessio Ravani, Calogero C. Tedesco, Beatrice Frigerio, Daniela Sansaro, Elena Tremoli and Damiano Baldassarre. Potentially Spurious Correlations Between Arterial Size, Flow-Mediated Dilation, and Shear Rate. Hypertension. 2014;64:1328-1333. Originally published September 22, 2014. https://doi.org/10.1161/HYPERTENSIONAHA.114.03608