EPID600 (Spring 2013) module XII. Error: Selection and information bias

The following questions refer to the article "Cardiovascular Risk Among Men Seeking Help for Erectile Dysfunction", by J. Frantzen, T.G.W. Speel, L.A. Kiemeney and E.J.H. Meuleman. Annals of Epidemiology 2006 (February);16(2):85-90.

1. “Erectile dysfunction (ED) is a multifactorial disease of the aging male affecting millions of men worldwide. In the Netherlands, on average 13% of men aged 40 years and older are affected (1).” (p85,c1) If we interpret the phrase “aged 40 years and older” as 40-79 years in 2000, about how many men in this age group in the Netherlands in 2000 would have been affected? Include the calculation. Hint: remember the Census Bureau's International Data Base.

2. “The prevalence increases with age: 6% of men aged 40-49 years, compared to 38% of men aged 70-79 years.” (p85,c1) Suppose that these age-specific prevalences and the overall prevalence of 13% were true for men aged 40-79 years in the Netherlands in the year 2000. If the prevalence among men aged 50-59 years were 8%, what would the prevalence of ED have been for men aged 60-69 years? Include the calculation.
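The reasoning behind question 2 is that an overall prevalence is a population-weighted average of the stratum-specific prevalences, so one unknown stratum can be solved for. A minimal sketch of that idea follows; the function name and all counts and prevalences in the example are hypothetical and are not taken from the article or from the International Data Base.

```python
def missing_stratum_prevalence(counts, prevalences, overall):
    """Solve for the one unknown stratum prevalence (marked None).

    Uses: overall * total_population = sum of (count * prevalence) over strata.
    """
    total = sum(counts)
    # Cases accounted for by the strata whose prevalence is known
    known_cases = sum(n * p for n, p in zip(counts, prevalences) if p is not None)
    # Population size of the stratum whose prevalence is unknown
    missing_n = next(n for n, p in zip(counts, prevalences) if p is None)
    return (overall * total - known_cases) / missing_n


# Hypothetical age strata (NOT real Netherlands data)
counts = [400, 300, 200, 100]
prevalences = [0.05, 0.10, None, 0.40]
p_missing = missing_stratum_prevalence(counts, prevalences, overall=0.15)
print(f"{p_missing:.3f}")  # 0.300
```

With real data, the counts would be the age-specific male population sizes for the year in question.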

3. Patient A (hypothetical) was born on January 1, 1930, entered the study on January 1, 1996 with 10 years of prior medical information, was first diagnosed with ED January 1, 2000 and with cardiovascular disease on January 1, 2001. He died on March 1, 2001. Give all answers in the number of months, rounded to the nearest month.

*3a. How much follow-up time did patient A contribute for the calculation of ED incidence during the period before introduction of sildenafil?

*3b. How much follow-up time did patient A contribute for the calculation of ED incidence during the period after introduction of sildenafil?

*3c. How much follow-up time did patient A contribute for the calculation of cardiovascular disease incidence after his diagnosis with ED?
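Questions 3a-3c rest on counting person-time: a subject contributes time at risk from entry (or the start of the period, whichever is later) until the event or the end of the period, whichever is earlier. The sketch below illustrates that bookkeeping under stated assumptions; the function names, the average-month approximation, and all dates are hypothetical and do not represent Patient A or the study periods defined in the article.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Elapsed time in months, rounded to the nearest month."""
    # 30.4375 is the average month length in days (365.25 / 12)
    return round((end - start).days / 30.4375)


def follow_up_months(entry: date, event: date,
                     window_start: date, window_end: date) -> int:
    """Months at risk inside [window_start, window_end), censored at the event."""
    start = max(entry, window_start)
    stop = min(event, window_end)
    return months_between(start, stop) if stop > start else 0


# Hypothetical subject and hypothetical calendar periods
entry, event = date(2005, 1, 1), date(2008, 7, 1)
cutoff, study_end = date(2007, 1, 1), date(2010, 1, 1)

print(follow_up_months(entry, event, entry, cutoff))       # 24
print(follow_up_months(entry, event, cutoff, study_end))   # 18
```

The same function can be reused for a second outcome by passing the date the subject became at risk for that outcome as `entry`.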

4. The study design being employed to study the incidence of CVD is a . . . (Choose one best answer [and include a brief statement of support].) (8 hpts)

A. Cross-sectional study
B. Case-control study with incident cases
C. Cohort study
D. Ecologic study
E. Intervention trial

5. “Prevalence of CVD was higher among men with ED compared to controls. The odds ratio was 2.07 [95%-CI 1.67-2.56] for the period before the introduction of sildenafil.” (p86c2) Suppose that this odds ratio had been calculated from a 2 x 2 table in which the prevalence of CVD among the controls was 7%. Estimate the prevalence in the men with ED during the period before the introduction of sildenafil (show the calculation). Why do we know that it must be less than 14.5% even without calculating it?
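Question 5 turns on converting between prevalences and odds: multiply the comparison group's odds by the odds ratio, then convert the resulting odds back to a proportion. A minimal sketch of that conversion follows, using hypothetical inputs (an odds ratio of 3.0 and a comparison prevalence of 10%) rather than the article's figures; the function name is an assumption made for illustration.

```python
def prevalence_from_odds_ratio(odds_ratio: float, p_comparison: float) -> float:
    """Prevalence in the index group implied by an odds ratio."""
    odds_comparison = p_comparison / (1 - p_comparison)  # proportion -> odds
    odds_index = odds_ratio * odds_comparison            # apply the odds ratio
    return odds_index / (1 + odds_index)                 # odds -> proportion


# Hypothetical values, chosen only to make the arithmetic easy to follow
p_index = prevalence_from_odds_ratio(3.0, 0.10)
print(f"{p_index:.3f}")  # 0.250

# For an odds ratio above 1, the implied prevalence is always below
# odds_ratio * p_comparison (here 0.30), because the odds ratio
# overstates the prevalence ratio.
print(p_index < 3.0 * 0.10)  # True
```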

6. Which one of the following numbers gives the best estimate of the incidence proportion (cumulative incidence) of cardiovascular disease from the data in Figure 1? Show the calculation.

A. 0.0212 B. 0.0221 C. 0.0244 D. 0.0250 E. 0.0818

7. “Depending on age category, the number of men was 1.5 to 2.1 as high in the period after compared to the period before.” Use data from the article to show the calculation of the 1.5 in the quoted sentence.

8. Table 2 presents estimates of CVD incidence for men with and without ED in the periods before and after the introduction of oral sildenafil. Without regard to the confidence intervals, which one of the following estimates is the least meaningful or interpretable? Briefly support your answer. (Choose one best answer and provide a brief supporting statement.)

A. 50.8 per 1000 person-years.
B. 29.4 per 1000 person-years.
C. 34.3 per 1000 person-years.
D. 24.9 per 1000 person-years.
E. 23.6 per 1000 person-years.
F. 23.9 per 1000 person-years.

9. “The estimation of the risk of incident CVD was more precise for the period after than the period before the introduction of sildenafil (Table 2).” (p86c2). Which of the following risk estimates in Table 2 was most precisely estimated? Support your answer by stating the most relevant number or statistic from the table. (60 words maximum)

A. ED subjects before introduction of sildenafil (period A)
B. Control subjects before introduction of sildenafil (period A)
C. ED subjects after introduction of sildenafil (period B)
D. Control subjects after introduction of sildenafil (period B)

10. “The relative risk [for the graph in Fig. 2B] was estimated at 1.7 [95%-CI 0.9-3.3] using the proportional hazards model.” (p86c2). Calculate the corresponding relative risk from the data in Table 2. Show the calculation to three significant figures.

11. “We know that about a quarter of men suffering from ED did consult a physician before the introduction of sildenafil (1). . . . Erectile dysfunction might, therefore, be relatively underdiagnosed in the period before the introduction.” (p87c2-p88c1). Suppose that this estimate of one-quarter is accurate. Suppose also that the sensitivity of physician diagnosis (and therefore of being counted as a case by Frantzen et al.) when a man complains about ED is 90%. Assuming 100% specificity of physician diagnosis, what true underlying incidence rate of ED for the 40,388 Netherlands men in the upper half of Table 1 is reflected in the observed rate of 5.3 per 1000 person-years?
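The adjustment in question 11 can be thought of as dividing an observed rate by the overall probability that a true case is counted, which (with perfect specificity) is the fraction of affected men who consult a physician times the sensitivity of diagnosis. The sketch below illustrates this under those assumptions; the function name and the example inputs are hypothetical and are deliberately not the values given in the question.

```python
def corrected_rate(observed_rate: float, fraction_consulting: float,
                   sensitivity: float) -> float:
    """True rate implied by an observed rate, assuming 100% specificity.

    A true case is counted only if the man consults a physician AND
    the physician records the diagnosis.
    """
    ascertainment = fraction_consulting * sensitivity
    return observed_rate / ascertainment


# Hypothetical inputs: 10 per 1000 person-years observed,
# half of affected men consult, and 80% of those are diagnosed
true_rate = corrected_rate(10.0, fraction_consulting=0.5, sensitivity=0.8)
print(f"{true_rate:.1f} per 1000 person-years")  # 25.0 per 1000 person-years
```

This simple correction also assumes that under-ascertainment does not materially change the person-time denominator.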

12. The substantial underreporting of ED by men and underdiagnosis by general practitioners create significant potential for selection bias. Conceivably, selection bias was responsible for the observed association between incident ED and prevalent CVD during the period before sildenafil. Alternatively, such an association might actually exist, but it might not have been observed due to selection bias following the substantial increase in the number of men consulting a general practitioner for ED following the introduction of sildenafil. Conceivably, selection bias could also have influenced the appearance of elevated CVD risk in men whose ED was noted before the introduction of sildenafil or the absence of such an elevation after the introduction of sildenafil. Describe a scenario in which selection bias would have occurred and had one of these influences. Do you think that any of these influences occurred?

13. Optional: Try this with some colleagues or your small group. Create a spreadsheet with two 2 x 2 tables relating incident ED to CVD prevalence, one table for period A (before sildenafil) and one for period B (after sildenafil). Read the following step by step, writing down the formulas. Then try to fill in the 2 x 2 table from the data in the paper.

“The overall prevalence of cardiovascular diseases was about 8% in both, the period before and after the introduction of sildenafil. Prevalence of CVD was higher among men with ED compared to controls. The odds ratio was 2.07 [95%-CI 1.67-2.56] for the period before the introduction of sildenafil. After the introduction, the odds ratio was significantly lower, namely, 1.38 [95%-CI 1.21-1.57].”

These confidence intervals are apparently incorrect. The 95% confidence interval for the odds ratio in a 2x2 table is obtained by first estimating the 95% confidence limits for the natural logarithm of the odds ratio and then exponentiating (taking anti-logarithms). On the log scale, the lower 95% confidence limit is ln(OR) - 1.96 x s.e.[ln(OR)] and the upper 95% limit is ln(OR) + 1.96 x s.e.[ln(OR)], where s.e.[ln(OR)] is the standard error of the ln(OR) estimate. The standard error of ln(OR) is the square root of its variance, which is estimated as 1/a + 1/b + 1/c + 1/d, where a, b, c, d are the cell counts of the familiar 2x2 table.

Try using the information in a) the quoted paragraph, b) the first column of Table 2, c) the number of prevalent cases of CVD from Figure 1, and d) the (approximate) 8% CVD prevalence for both time periods to show that the authors most likely omitted the multiplier of 1.96 in their calculation of the confidence limits. Without intending to, they calculated 68% confidence intervals!
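The confidence-interval procedure described above can be sketched in a few lines for use alongside the spreadsheet. The cell counts below are hypothetical placeholders, not values reconstructed from the article, and the function name is an assumption. Setting the multiplier z to 1.0 instead of 1.96 gives the narrower (roughly 68%) interval.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a confidence interval computed on the log scale.

    a, b, c, d are the four cell counts of a 2x2 table; OR = (a*d)/(b*c).
    z is the normal multiplier: 1.96 for 95%, 1.0 for about 68%.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical 2x2 table (NOT the study data)
or_95, lo_95, hi_95 = odds_ratio_ci(20, 80, 10, 90)          # 95% interval
or_68, lo_68, hi_68 = odds_ratio_ci(20, 80, 10, 90, z=1.0)   # ~68% interval

print(f"OR = {or_95:.2f}")
print(f"95% CI: {lo_95:.2f} to {hi_95:.2f}")
print(f"68% CI: {lo_68:.2f} to {hi_68:.2f}")
```

Comparing the two intervals for a reconstructed table against the published limits is one way to check which multiplier the authors appear to have used.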