Relevant conceptual problems.
Chapter 18: 1, 2, 3, 6 (refer to the study design in 5, but not to any of its computations), 7, 8
Chapter 19: 1, 3, 6, 9
Questions are 1 point each, except where indicated as 2 points.
Computational/Data problems (turn in).
1) The data given below and repeated in perry.csv are from a randomized social experiment started in 1962. A total of 123 three- and four-year-old children were randomly assigned to either 2 years of preschool instruction or a control group that received no preschool education. The response, recorded for each individual, was whether or not that individual had been arrested for any crime before the age of 19 (Yes or No). The numbers of individuals in each combination of treatment and response groups are:
Group: | Arrested: Yes | Arrested: No |
Preschool | 19 | 42 |
Control | 32 | 30 |
2) The data given below and repeated in epi.csv are made up from a study of the association between a potential risk factor and breast cancer. The population of interest is women in a small city. The investigators identified 100 women with breast cancer, then identified 100 demographically similar women without breast cancer. There is no matching at the individual level, so we will ignore the 'demographically similar' part of the study design. They then asked each woman about her exposure to that risk factor. Group A was exposed to the risk factor; Group B was not. The numbers of women in each combination of risk factor and cancer groups are:
Group: | No cancer | Cancer |
exposed | 40 | 50 |
not | 60 | 50 |
The investigators then repeated the study with new individuals and a larger sample size. They could only find 100 cases of cancer, but they identified 2500 non-cancer individuals. Groups A (exposed) and B (not) were defined the same way as in the previous study. The numbers of women in each combination of risk factor and cancer groups in the new study are:
Group: | No cancer | Cancer |
exposed | 1000 | 50 |
not | 1500 | 50 |
d) 2 pts. Repeat question 2b for the new study. That is, estimate the proportion of cancer cases in the Group A women, the Group B women, and the difference in those proportions.
e) Repeat question 2c for the new study. That is, calculate the odds ratio that fills in this sentence: The odds of breast cancer for a woman in Group A are ______ times as large as those for a woman in Group B.
f) 2 pts. The first study was intended to assess the consequences of exposure to the risk factor in populations with different background levels of breast cancer. Is it more appropriate to report the difference in proportions (i.e., your answer from question 2b) or the odds ratio (i.e., your answer from question 2c)? Briefly explain your choice.
g) Which study gives you the more precise estimate of the odds ratio? Briefly explain your choice.
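The quantities named in parts d), e), and g) can be computed directly from the four cell counts of each table. The following is a minimal Python sketch, not the course's prescribed software or method. The function name `summarize_2x2` and its argument names are hypothetical. The standard error shown is the usual large-sample approximation for the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d), which is assumed here as one reasonable way to compare precision.

```python
import math


def summarize_2x2(a_cancer, a_no_cancer, b_cancer, b_no_cancer):
    """Sample summaries for a 2x2 table of exposure group by cancer status.

    Hypothetical helper: Group A = exposed, Group B = not exposed.
    """
    # Sample proportion of cancer cases within each exposure group
    p_a = a_cancer / (a_cancer + a_no_cancer)
    p_b = b_cancer / (b_cancer + b_no_cancer)

    # Odds of cancer in Group A divided by odds of cancer in Group B
    odds_ratio = (a_cancer / a_no_cancer) / (b_cancer / b_no_cancer)

    # Large-sample standard error of log(odds ratio) (assumed approximation)
    se_log_or = math.sqrt(
        1 / a_cancer + 1 / a_no_cancer + 1 / b_cancer + 1 / b_no_cancer
    )

    return {
        "p_a": p_a,
        "p_b": p_b,
        "diff": p_a - p_b,
        "odds_ratio": odds_ratio,
        "se_log_or": se_log_or,
    }


# Cell counts copied from the two tables in question 2
study1 = summarize_2x2(a_cancer=50, a_no_cancer=40, b_cancer=50, b_no_cancer=60)
study2 = summarize_2x2(a_cancer=50, a_no_cancer=1000, b_cancer=50, b_no_cancer=1500)

for name, s in [("first study", study1), ("new study", study2)]:
    print(name)
    for key, value in s.items():
        print(f"  {key}: {value:.4f}")
```

Comparing the printed values across the two studies shows which summaries change when the number of non-cancer individuals changes and which do not. Interpreting that difference in light of the study design is still the point of parts f) and g).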