(Hypertension. 1999;34:236-241.)
© 1999 American Heart Association, Inc.
Scientific Contributions
From the University of Texas–Houston Health Science Center, School of Public Health (S.D., R.B.H., D.R.L.), and the Texas Children's Hospital (N.A.A., J.T.B.), Houston, Tex.
Correspondence to Shifan Dai, MD, PhD, School of Public Health, University of Texas–Houston Health Science Center, 1200 Herman Pressler St, Houston, TX 77030. E-mail sdai@sph.uth.tmc.edu
| Abstract |
|---|
Key Words: echocardiography; left ventricular mass; observer variation; reproducibility of results; population study
| Introduction |
|---|
In Project HeartBeat!, a population-based, intensive longitudinal study to evaluate cardiovascular disease (CVD) risk factors as an interrelated set of growth processes in healthy children and adolescents, echocardiographic measurement of cardiac geometry and function was an integral component. This provision allows assessment of the morphological and functional growth of the heart and of the determinants of different aspects of this growth process. A detailed quality assurance protocol was developed and implemented for echocardiographic measurements, including training, certification, and recertification of the echocardiographic technicians and continuous monitoring of data quality and measurement accuracy by a single pediatric echocardiographer or experienced technicians at Texas Children's Hospital. This report presents the results of a quality assessment study designed to evaluate observer variability in echocardiographic measurements. Three aspects of the echocardiographic measurements were assessed: intraobserver variability, interobserver variability within Project HeartBeat! staff, and interinstitutional measurement variability between project echocardiographic technicians and experienced technicians or a pediatric echocardiographer at Texas Children's Hospital. The last of these also served as validation for echocardiographic measurements collected in Project HeartBeat!.
| Methods |
|---|
Echocardiograms were performed with the Interspec XL (Apogee) Annular Phased Array echocardiographic machine with either a 5- or 3.5-MHz transducer and recorded on VHS videocassettes. The participants were required to rest for 5 minutes before data collection. Echocardiographic examinations were done with the participants in supine position with a pillow under the right shoulder. The heart was imaged with 2D echocardiography in the parasternal long-axis view, parasternal short-axis view, apical view, subxiphoid views, and suprasternal notch image. M-mode echocardiography, 2D, and 2D-directed pulsed-wave Doppler recordings were obtained by standard methods,6 7 and measurements were made online with the Interspec Apogee measurement software package. M-mode measurements followed the standards of the American Society of Echocardiography (ASE).8 Eight M-mode echocardiographic measurements and 8 Doppler measurements were specified as the core measurements in the study protocol. They were aortic root diameter, left atrial diameter, end-diastolic interventricular septal thickness, end-diastolic left ventricular (LV) diameter, end-diastolic LV posterior wall thickness, end-systolic interventricular septal thickness, end-systolic LV diameter, end-systolic LV posterior wall thickness, right ventricular (RV) preejection period, RV ejection time, isovolumetric relaxation time (IVRT), aortic peak velocity, aortic time-velocity integral, heart rate, LV preejection period, and LV ejection time. These 16 core original measurements and LV mass (LVM), calculated from the formula reported by Devereux et al,9 10 were included for quality assessment.
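As a rough illustration of the LVM calculation, the sketch below assumes that the formula in question is the necropsy-corrected ASE-cube equation of Devereux et al (reference 9), LVM = 0.8 x 1.04 x [(IVSd + LVIDd + PWTd)^3 - LVIDd^3] + 0.6 g, with all dimensions in centimeters. The function name and the example dimensions are hypothetical and are not taken from Project HeartBeat! data.

```python
MYOCARDIAL_DENSITY = 1.04  # g/cm^3, density used in the cube formula


def lv_mass_devereux(ivsd_cm, lvidd_cm, pwtd_cm):
    """Estimate LV mass (g) from end-diastolic M-mode dimensions in cm.

    Assumed form: 0.8 * (ASE-cube mass) + 0.6, where the ASE-cube mass is
    1.04 * [(IVSd + LVIDd + PWTd)^3 - LVIDd^3].
    """
    outer_volume = (ivsd_cm + lvidd_cm + pwtd_cm) ** 3   # cube of total diameter
    cavity_volume = lvidd_cm ** 3                        # cube of cavity diameter
    ase_cube_mass = MYOCARDIAL_DENSITY * (outer_volume - cavity_volume)
    return 0.8 * ase_cube_mass + 0.6


# Hypothetical dimensions: septum 0.8 cm, cavity 4.5 cm, posterior wall 0.8 cm
example_mass = lv_mass_devereux(0.8, 4.5, 0.8)
print(f"Estimated LVM: {example_mass:.1f} g")
```

Because wall thicknesses enter through a cubed sum, small errors in septal or posterior wall thickness are amplified in the calculated mass, which is one reason LVM is assessed separately from the 16 core measurements.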
Quality assessment was based on samples reviewed from 3600 studies completed by October 1994 and recorded on videotapes. Altogether, 4 persons were trained and certified as project echocardiographic technicians who performed the studies, although only 2 were active in the project at any given time. Three samples of echocardiograms were chosen for quality assessment to evaluate (1) intraobserver variability (sample 1), (2) interobserver variability (sample 2), and (3) comparability between measurements of field echocardiographic technicians and reference readings by experienced technicians or the pediatric echocardiographer at Texas Children's Hospital (sample 3). No single echocardiographic recording was included in >1 sample. For sample 1, 20 echocardiograms from each of the 2 current echocardiographic technicians (40 total) were randomly selected to be reread by the same project technician. For sample 2, a total of 80 echocardiograms, 20 from the files for each of the 4 echo technicians, were selected and remeasured by 1 of the 2 current technicians, assigned to exclude their own originally measured echocardiograms. For sample 3, 5% of the echocardiograms from each of the 4 echo technicians, 182 in all, were randomly selected and reviewed at Texas Children's Hospital by an experienced technician or a pediatric echocardiographer. All remeasurements were made with the technician blinded to the original results.
Completeness and quality of all echocardiograms were determined at the end of each study. Among the total 302 echocardiograms selected for quality assessment, 3 at original measurement and 6 at repeated measurement were rated clinically as suboptimal because of poor acoustic windows in the participants. Pediatric cardiologists at Texas Children's Hospital concluded that even those studies considered to have imperfect image quality clinically permitted various measurements that provided data of acceptable quality. Thus, all 302 echocardiograms were included in the quality assessment.
Analyses were performed with the SPSS statistical package.11 Differences between the repeated measures (observation 1 minus observation 2) were first plotted against the mean of the repeated measures, (observation 1 + observation 2)/2, following Altman and Bland.12 Means and SDs of the differences were then calculated, and the corresponding paired t tests were performed. Means and SDs were also computed for the original and repeated measurements.
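The analysis steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the SPSS procedure used in the study; the function name and the example readings are hypothetical. It returns the paired t statistic and its degrees of freedom; obtaining a P value would additionally require the t distribution, which is not computed here.

```python
import math
import statistics


def paired_difference_summary(first, second):
    """Summarize agreement between paired readings of the same recordings."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    # Observation 1 minus observation 2 (y-axis of the difference plot)
    diffs = [a - b for a, b in zip(first, second)]
    # Mean of the two observations (x-axis of the difference plot)
    means = [(a + b) / 2 for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)   # average systematic difference
    sd_diff = statistics.stdev(diffs)    # sample SD of the differences
    n = len(diffs)
    # Paired t statistic for the hypothesis that the mean difference is 0
    t_stat = mean_diff / (sd_diff / math.sqrt(n)) if sd_diff > 0 else float("nan")
    return {
        "diffs": diffs,
        "means": means,
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "t": t_stat,
        "df": n - 1,
    }


# Hypothetical LV diameters (mm): original and repeated readings
original = [40.0, 42.0, 38.0, 45.0]
repeated = [39.0, 43.0, 38.0, 44.0]
summary = paired_difference_summary(original, repeated)
print(summary["mean_diff"], summary["sd_diff"], summary["t"], summary["df"])
```

Plotting `diffs` against `means` gives the difference plot; a visible trend in that plot would indicate that the differences depend on the size of the measurement.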
| Results |
|---|
Seventeen plots of the differences between repeated measures (observation 1 minus observation 2) versus the mean of the repeated measures (observation 1 plus observation 2) divided by 2 for each of the 3 samples were generated (plots not shown). These plots provided visual information on the magnitude of disagreement, both random error and systematic bias, and on the relationship of the differences and size of the measurements. Most plots revealed uniform distribution patterns with most points on or near the zero-difference reference line. No easily discernible dependence of the differences on measurement size was observed.
Table 1 displays the results of a comparison of original and repeated measurements by the same Project HeartBeat! echocardiographic technicians for the 16 core measurements. All the means of differences in Table 1 were very small compared with the magnitude of the measurements, and paired t tests suggested no statistically significant systematic differences between the original and repeated measurements. SDs of the differences were also small, indicating high reproducibility of within-observer measurements.
Results of interobserver comparisons are shown in Table 2. As expected, most means and SDs of the differences were larger than those of intraobserver differences. Differences between the first and second readings of end-diastolic septal thickness, end-systolic LV posterior wall thickness, and 5 Doppler measurements were statistically significant. However, all these differences were small with very limited clinical significance. Relatively large SDs of the differences were found for end-diastolic LV posterior wall thickness, end-systolic septal thickness, and end-systolic LV posterior wall thickness.
Repeated measurements done at Texas Children's Hospital were compared with the original measurements by the Project HeartBeat! echo technicians. Results are shown in Table 3. The means and SDs of the differences were very close to those of intraobserver differences and were smaller than those of the interobserver comparison. Differences of 0.19 mm in end-diastolic LV diameter, -0.25 mm in end-systolic septal thickness, and -0.003 second in RV ejection time were found to be statistically significant. These differences, however, were minimal with limited clinical significance. The small SDs of the differences also suggested high comparability between the echocardiographic measurements made by project echo technicians and those made by experienced technicians or the pediatric echocardiographer at Texas Children's Hospital.
The intraobserver, interobserver, and intersite comparisons of LVM are presented in Table 4. Means of differences were 1.82, 4.50, and 0.0013 g for the paired measurements by the same project observers, by different project observers, and by project echo technicians and Texas Children's Hospital echocardiographers, respectively. Mean differences were small, and none was statistically significant. The corresponding SDs were 18.79, 24.16, and 12.35 g, respectively, which were smaller than the corresponding SDs of original and repeated LVM measurements.
| Discussion |
|---|
The accuracy of a particular measurement process may be assessed only if the "true" value or "gold standard" value is known. Such true values were not known in the present study. An alternative method was then used to estimate the "relative" accuracy of the measurements. A series of paired determinations were obtained, and mean and SD of the differences were calculated. If the paired observations were the same except for random error, the mean of the differences would be expected to be 0, and the paired t test was used to test this hypothesis. Thus, the mean of differences offered a measure for average systematic differences (relative bias) between original and repeated measurements. The SD of the paired differences, which indicated variability of the difference between the first and second measurements and thus provided estimates of random errors, was the measure of reproducibility of the measurement process. Independence of the differences and size of the measurements is the prerequisite for the analyses described above.12
The correlation coefficient between original and repeated measurements, often used in reproducibility studies of echo measurements, was not adopted in the present study because it is a measure of association, which is dependent on both the variation between study subjects (ie, between the true values) and the variation within study subjects (measurement error).12 A high correlation coefficient does not necessarily indicate good agreement. An example of this is the unmodified ASE-cube LVM formula, which systematically overestimated LVM by 25% even though the correlation coefficient between calculated and necropsy LVM was 0.90.9
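The point that correlation does not measure agreement can be illustrated with made-up numbers. In the sketch below, the reference masses are hypothetical (they are not the necropsy data of reference 9), and the second series is simply the first inflated by 25%; the helper name `pearson_r` is an assumption for illustration.

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical reference masses (g) and a method that reads 25% high
reference = [80.0, 100.0, 120.0, 150.0, 200.0]
overestimate = [1.25 * m for m in reference]

r = pearson_r(reference, overestimate)
mean_bias = statistics.mean([e - t for e, t in zip(overestimate, reference)])
print(f"r = {r:.3f}, mean bias = {mean_bias:.1f} g")
```

The correlation is essentially perfect although every value is overestimated, whereas the mean of the paired differences exposes the bias directly.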
The present quality assessment addressed 3 questions: How consistent were the Project HeartBeat! field echo observers in measuring the cardiac structure and function (intraobserver variation); were there any differences in measurements between field observers (interobserver variation); and to what extent did the measurements by the field observers agree with those by experienced clinical echocardiographic technicians? For all 3 questions, both bias and random error were at issue. Our study demonstrated that the echocardiographic measurements were performed by each project echo technician in a highly consistent manner, the extent of interobserver variation was acceptable, and measurements by Project HeartBeat! technicians agreed with those by experienced technicians and the pediatric echocardiographer at Texas Children's Hospital.
Available results from other studies on intraobserver and interobserver variations of echocardiographic measurements are limited. Most studies conducted earlier varied greatly in study design and analysis method, making direct comparison difficult.8 14 15 16 17 18 Ladipo et al,14 evaluating measurements of 10 blind duplicated tracings by 3 observers, reported intraobserver mean absolute difference of 0.7 to 1.2, 0.2 to 0.8, 0.3 to 0.4, and 0.4 to 0.8 mm for diastolic and systolic LV diameters, diastolic LV posterior wall thickness, and diastolic interventricular septal thickness, respectively. Having 3 investigators measure 7 ventricular parameters of 20 randomly selected echocardiograms twice, Valdez et al16 showed that significant intraobserver difference was found in only 1 person in the measurement of end-diastolic LV posterior wall thickness. Small means and SDs of intraobserver differences in our study (Table 1) showed that each technician read the same echocardiograms consistently the second time, achieving a high degree of agreement. Schieken et al17 obtained intraobserver measurement errors of aortic root diameter, left atrial diameter, end-diastolic interventricular septal thickness, end-diastolic LV diameter, end-diastolic LV posterior wall thickness, end-systolic LV diameter, and LV ejection time in 20 healthy children 6 to 16 years of age. The errors were reported as 0.5, 0.6, 0.6, 1.3, 0.6, and 1.0 mm and 0.01 second, respectively.17 Although analysis methods differed, the SDs reported here should be comparable to about twice the errors reported by Schieken et al.17 Thus, the "errors" for intraobserver variability (SDs in Table 1 divided by 2) for the same measurements in the present study were either smaller or similar compared with their findings.
Previous studies on interobserver variation of echocardiographic measurement have shown different results.8 15 16 17 18 De Leonardis and Cinelli15 compared measurements of aortic root diameter, left atrial diameter, end-diastolic septal and posterior wall thicknesses, and end-diastolic and end-systolic LV diameters by 2 experienced interpreters on 50 routinely performed M-mode echocardiograms and concluded that no significant interobserver variability was found for any of the measured echocardiographic parameters. Valdez et al16 found statistically significant differences in measurements of end-diastolic septal thickness, end-diastolic and end-systolic LV posterior wall thicknesses, and end-diastolic and end-systolic LV diameters by 3 observers on 20 echocardiograms. The maximum mean difference was 2 mm. They concluded that the differences were not clinically significant. In our interobserver comparison, differences of 0.39 mm for end-diastolic septal thickness and 1.28 mm for end-systolic LV posterior wall thickness showed statistical significance. Comparison between Project HeartBeat! observers and Texas Children's Hospital observers revealed statistically significant differences of 0.19 mm for end-diastolic LV diameter and -0.25 mm for end-systolic septal thickness. Magnitudes of all these differences, however, were small compared with available results.
Sahn et al8 evaluated measurements on 5 echocardiograms by 76 observers for aortic root diameter, left atrial diameter, diastolic and systolic LV diameters, diastolic interventricular septal thickness, and diastolic LV posterior wall thickness and showed minimum mean percent uncertainties of 13.5%, 11.2%, 8.2%, 14%, 19.5%, and 23.4%, respectively, when the ASE convention was used for measurement. The percent uncertainty was calculated for each measurement on each recording as the 95th percentile confidence limit, determined as 1.97 SD, divided by the mean for the measurement times 100. Schieken et al17 reported interobserver measurement precision for aortic root diameter, left atrial diameter, end-diastolic interventricular septal thickness, end-diastolic LV diameter, end-diastolic LV posterior wall thickness, end-systolic LV diameter, and LV ejection time of 0.5, 0.6, 0.9, 2.3, 1.6, and 1.1 mm and 0.1 second, respectively. Again, the SDs reported here should be comparable to about twice the precision reported by Schieken et al.17 The estimates of "precision" for interobserver variability (SDs in Table 2 divided by 2) were larger in the present study for aortic root diameter and left atrial diameter but smaller for end-diastolic septal thickness and end-diastolic and end-systolic LV diameters, whereas they were similar for end-diastolic LV posterior wall thickness. By the same comparison, the estimates of precision for intersite measurement variability (SDs in Table 3 divided by 2) for the same echo parameters in the present study were either similar or smaller.
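The percent uncertainty described for Sahn et al (1.97 SD divided by the mean, times 100) can be written as a short helper. This is a sketch under stated assumptions: the function name and readings are hypothetical, and the use of the sample SD (n - 1 denominator) is an assumption, since the text does not say which SD was used.

```python
import statistics


def percent_uncertainty(readings, multiplier=1.97):
    """Percent uncertainty: multiplier * SD / mean * 100.

    Uses the sample SD; whether the original survey used the sample or
    population SD is not stated, so this choice is an assumption.
    """
    mean = statistics.mean(readings)
    sd = statistics.stdev(readings)
    return multiplier * sd / mean * 100.0


# Hypothetical aortic root diameters (mm) read by 3 observers on one recording
readings_mm = [28.0, 30.0, 32.0]
uncertainty = percent_uncertainty(readings_mm)
print(f"Percent uncertainty: {uncertainty:.1f}%")
```

Because this index scales the SD by the mean, it is not directly comparable with the SDs of paired differences reported in the present study.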
LVM has been repeatedly associated with CVD death in adults. Use of echocardiographic measurement of LVM as an outcome measure in epidemiological investigation of hypertension still poses a challenge regarding measurement precision and comparability across studies.19 Reproducibility of measurement of LVM has been studied by use of a variety of methods.4 15 20 A recent report from the Treatment of Mild Hypertension Study showed acceptable measurement accuracy and reproducibility in adults.18 The means and SDs of intraobserver difference in LVM were reported from that study to be 0.0 and 20.4 g for 1 cardiologist and 6.1 and 26.8 g for another. The means and SDs for interobserver difference were 7.9 and 34.7 g between the 2 cardiologists and 5.7 and 46.1 g between the cardiologist and echo technicians. Means and SDs of intraobserver and interobserver measurement differences from our study in healthy children and adolescents were either similar or smaller. Minimal mean differences and small variation of the paired measurements by project echo technicians and experienced technicians or pediatric echocardiographers at Texas Children's Hospital further suggest that echocardiographic measurement of LVM from population studies could be comparable to that from a clinical setting.
Doppler measurements of RV preejection period, RV ejection time, IVRT, aortic peak velocity, aortic time-velocity integral, heart rate, LV preejection period, and LV ejection time were included in the present analysis. Except for LV ejection time, no earlier results were available for between-study comparison. In our results, no significant difference was found for intraobserver comparison. Interobserver comparison showed significant differences only for RV preejection period, RV ejection time, IVRT, LV preejection period, and LV ejection time, and the interinstitutional comparison showed a significant difference only for RV ejection time. The magnitudes of these differences, however, were trivial. Overall, the results showed good agreement between original and repeated measurements.
Variation of echocardiographic measurements arises from a variety of sources.1 4 Several factors can affect image quality and thus influence the definition of anatomic structures: participant's body habitus; respiratory status and cooperation; the technician's experience in recognizing the correct image signal and Doppler position and envelope, along with transducer orientation and placement; and the technician's familiarity with echocardiographic equipment. Although criteria regarding these factors had been defined in the study protocol, their effects on measurement variability were not evaluated in the present study. The proportion of adequate echocardiograms in population studies has been reported variably, from a minimum of 28% during the first 5 months of a population study to a recent report of 93%.2 4 20 Although several individual measurements were not possible and thus not included in the present analysis, all echocardiograms were included in the quality assessment study. This fact, with the intention to include as many measures as possible for each cardiac parameter, may have sacrificed reproducibility of measurements from a few technically imperfect echocardiograms, resulting in increased differences of the paired measurements and SDs of the differences.
We conclude that the echocardiographic measurements taken from healthy children in a longitudinal study can be made accurately with acceptable reproducibility. Echocardiographic measurements from an epidemiological study can compare favorably with those taken in a clinical setting with experienced technical support. Thus, these measurements can be applied meaningfully to clinical observation.
| Acknowledgments |
|---|
Received December 8, 1998; first decision January 5, 1999; accepted March 22, 1999.
| References |
|---|
2. Savage DD, Garrison RJ, Kannel WB, Anderson SJ, Feinleib M, Castelli WP. Considerations in the use of echocardiography in epidemiology: the Framingham study. Hypertension. 1987;9(suppl II):II-40–II-44.
3. Schieken RM. Measurement of left ventricular wall mass in pediatric populations. Hypertension. 1987;9(suppl II):II-47–II-52.
4. Wallerson DC, Devereux RB. Reproducibility of echocardiographic left ventricular measurements. Hypertension. 1987;9(suppl II):II-6–II-18. Review.
5. Labarthe DR, Nichaman MZ, Harrist RB, Grunbaum JA, Dai S. Development of cardiovascular risk factors from age 8 to 18 in Project HeartBeat!: study design and patterns of change in plasma total cholesterol concentration. Circulation. 1997;95:2636–2642.
6. Feigenbaum H. Echocardiography. 4th ed. Philadelphia, Pa: Lea and Febiger; 1986.
7. Snider AR, Serwer GA. Echocardiography in Pediatric Heart Disease. St Louis, Mo: Mosby Yearbook; 1990.
8. Sahn DJ, DeMaria A, Kisslo J, Weyman A. Recommendations regarding quantitation in M-mode echocardiography: results of a survey of echocardiographic measurements. Circulation. 1978;58:1072–1083.
9. Devereux RB, Alonso DR, Lutas EM, Gottlieb GJ, Campo E, Sachs I, Reichek N. Echocardiographic assessment of left ventricular hypertrophy: comparison to necropsy findings. Am J Cardiol. 1986;57:450–458.
10. Devereux RB. Detection of left ventricular hypertrophy by M-mode echocardiography: anatomic validation, standardization, and comparison to other methods. Hypertension. 1987;9(suppl II):II-9–II-26. Review.
11. SPSS Base System: Syntax Reference Guide. Release 6.0. Chicago, Ill: SPSS Inc; 1993.
12. Altman DG, Bland JM. Measurement in medicine: the analysis of method comparison studies. Statistician. 1983;32:307–317.
13. Last JM. A Dictionary of Epidemiology. New York, NY: Oxford University Press; 1993.
14. Ladipo GIA, Dunn FG, Pringle TH, Bastian B, Lawrie TDV. Serial measurements of left ventricular dimensions by echocardiography: assessment of week-to-week, inter- and intraobserver variability in normal subjects and patients with valvular heart disease. Br Heart J. 1980;44:284–289.
15. de Leonardis V, Cinelli P. Evidence of no interobserver variability in M-mode echocardiography. Clin Cardiol. 1986;9:324–326.
16. Valdez RS, Motta JA, London E, Martin RP, Haskell WL, Farquhar JW, Popp RL, Horlick L. Evaluation of the echocardiogram as an epidemiologic tool in an asymptomatic population. Circulation. 1979;60:921–929.
17. Schieken RM, Clarke WR, Mahoney LT, Lauer RM. Measurement criteria for group echocardiographic studies. Am J Epidemiol. 1979;110:504–514.
18. Grandits GA, Liebson PR, Dianzumba S, Prineas RJ. Echocardiography in multicenter clinical trials: experience from the Treatment of Mild Hypertension Study. Control Clin Trials. 1994;15:395–410.
19. Gottdiener JS, Livengood SV, Meyer PS, Chase GA. Should echocardiography be performed to assess effects of antihypertensive therapy? Test-retest reliability of echocardiography for measurement of left ventricular mass and function. J Am Coll Cardiol. 1995;25:424–430.
20. Mahoney LT, Clarke WR, Knoedel D, Lauer RM. Echocardiographic reproducibility and precision among multiple sonographers: implications for population studies. Circulation. 1989;80(suppl II):II-543. Abstract.