Stat 587: Data sets that have been discussed in class or used in homework
are:
- Week 1:
- creativity.csv Creativity data, for lab, 23 Aug.
- hospital.csv Hospital stay data, optional for HW 1. Values repeated in HW problem text.
- Week 2:
- 4sets.csv Four data sets used to illustrate data description
in 29 Aug lecture. Columns are the set number and the value.
- creativity.txt Space-delimited (.txt) version of the creativity data, used in lab, 30 Aug.
- tomato.txt Tomato fertilizer data, used in lab, 30 Aug.
Each row is the data for a single plant. Variables are fertilizer
(a: better, b: usual) and yield (lbs).
- radon.txt Data for HW 2. Concentration
of radon in a simple random sample of owner-occupied homes in Ramsey Co., MN.
- mutagen.csv Data for HW 2.
Counts of mutants in control (no) and exposed (dose) groups.
- votes.txt Data for HW 2. Votes in Congress.
PctPro is the percent of Pro votes for that member; it is the response
variable for the HW problem. The file contains many variables, most of
which you won't use. If you use SAS, you need to read them all (at least
up to PctPro). R and JMP will automatically read all the variables and
give them their proper names.
- Week 3:
- sparrow.csv Bumpus's sparrow data for HW 3.
Variables are Humerus (length in inches) and Status (Survived or Perished).
- Week 4:
- case0202.txt Schizophrenic twin data set.
- hamburger.csv Hamburger example (csv format)
- hamburger.xlsx Hamburger example (xlsx format)
- darwin.csv Darwin's cross- and self-fertilized plant data, in wide format (one row per pot)
- darwin.txt Darwin's cross- and self-fertilized plant data, for SAS users
- darwinLong.csv Darwin's cross- and self-fertilized plant data, in long format (one row for each observation)
- PAwide.csv Physical activity data, wide format, for HW 4
- PAlong.csv Physical activity data, long format, for HW 4
- cancer.csv Breast cancer survival time for HW 4
- Week 5:
- fishprice.csv Fish prices in 1970 and 1980, wide format data, for lab 5 self assessment.
- fishlong.csv Fish prices in 1970 and 1980, long format data, for lab 5 self assessment.
- patty.txt Hamburger patty data set with multiple observations.
- burn.csv Prairie burn data for HW 5. First column
is the treatment (burn or not), the second column is the watershed identifier,
and the third is the percent of shrub cover in a 10m x 10m plot.
- dioxin.csv Blood dioxin levels for HW 5. First column is the dioxin concentration in the blood. Second column is the
potential exposure to Agent Orange (Vietnam=exposed, Other=not).
Note: This file is an edited version of the case study 3.2 data, changed to give more reasonable numbers for
low dioxin measurements.
- Week 6:
- alanine.csv Alanine data set for Lab 6 self assessment.
- Week 7:
- Week 8:
- case0601.csv Data set for Handicap study (Case study 6.1)
- teaching.csv Data set for lab 7 self assessment
- lettuce.csv for HW 7. Lettuce yield
response to fertilizer. Data are the fertilizer applied to the plot
and the number of heads of lettuce harvested from the plot.
- dietstudy.csv for HW 7. Weight loss after 24 months
on three diets. Column 1 is the subject number, 2 is the treatment, 3 is the weight loss
in kg.
- Week 9:
- cavity.txt Nest cavity size data set for lab self assessment
- Trex.csv Trex bone oxygen data for HW 8.
- faP.csv Unadjusted p-values from t-tests comparing the two genotypes, for HW 8.
- fa2.csv Raw data for the fatty acids in two genotypes, for HW 8.
- Week 10:
- meat.txt Meat pH data (case study 6.2),
used as lecture example.
- peanut.txt Peanut and aflatoxin concentration data. The first column is the
percent clean peanuts. The second is the aflatoxin concentration
(ppb).
- music.txt Music / brain activity data for HW 9.
First column is the number of years the subject has played a
string instrument. The second is the neuronal activity index, a
measure of brain activity.
- planet.csv Planet data for HW 9.
Three columns: planet name,
order from sun, and distance from sun.
- diversity.txt
Number of butterfly species found in patches
of Brazilian rain forest for HW 9. Two columns: area of the patch, number of
butterfly species found on it.
- Week 11:
- Week 12:
- oring.csv Data for O-ring question on HW 10
- pace2.csv Data for pace of life question on HW 10
- brain.txt Data for brain example in lecture
- Week 13:
- light.txt Meadowfoam data (Case study 9.1) for light and flower
production lecture example and lab 11. Uses E or L to mark early or late groups.
- Week 14:
- Data sets for HW 12, problem 1
- wage.csv Data set for HW 12, problem 2
- sat.csv Data set for SAT lecture example and code
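
Several files above come in both a wide and a long layout (for example darwin.csv and darwinLong.csv, or PAwide.csv and PAlong.csv). As a rough illustration of how the two layouts relate, here is a minimal Python sketch that reshapes a small wide table into long form. The values are made up, and the column names (pot, cross, self) and the helper wide_to_long are hypothetical; the actual course files may use different names.

```python
import csv
import io

# Made-up values in a wide layout: one row per pot, one column per group.
# Column names here are hypothetical and are NOT taken from darwin.csv.
WIDE_TEXT = """pot,cross,self
1,23.5,17.4
2,12.0,20.4
"""


def wide_to_long(wide_text, id_col, value_cols):
    """Return one dict per observation: the id, the group name, and the value."""
    rows = []
    for record in csv.DictReader(io.StringIO(wide_text)):
        for col in value_cols:
            rows.append(
                {id_col: record[id_col], "group": col, "value": float(record[col])}
            )
    return rows


long_rows = wide_to_long(WIDE_TEXT, "pot", ["cross", "self"])
for row in long_rows:
    # One line per observation, which is the long layout.
    print(row["pot"], row["group"], row["value"])
```

In the wide layout each row holds every measurement for one unit; in the long layout each row holds a single measurement plus a column saying which group it belongs to. R, SAS, and JMP each have their own tools for this reshaping, so the sketch only shows the idea.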