Mini-Assignment 9


  1. Familiarize yourself with the codebook for the Saratoga Houses dataset (see the codebook below).
  2. Import/load the Saratoga Houses dataset.
  3. Now construct a regression equation with price as your response variable and number of bathrooms as your explanatory variable. Answer Questions 1-3 below.
  4. Construct a multiple regression equation to test whether number of bathrooms is significantly associated with price after controlling for living area. Answer Question 4 below.
  5. Familiarize yourself with the README file for the Student Health dataset.
  6. Import/load the Student Health dataset.
  7. Determine the mean weight of students based on major. Answer Question 5 below.
  8. Now construct a regression equation with weight as your response variable and major as your explanatory variable. Answer Question 6 below.
  9. Now find the proportion of students in each major who are male and who are female (HINT: Look at the translation syntax for bivariate C->C). Answer Question 7 below.
  10. Construct the appropriate model to determine whether weight and major are significantly associated after controlling for gender. Answer Questions 8-9 below.


Question 1: What is the appropriate regression equation? How can you specifically explain the relationship between number of bathrooms and house price?

Question 2: Why is the number of bathrooms significantly associated with house price?

Question 3: Why should living area be considered as a possible confounding variable?

Question 4: After controlling for living area, is there a significant association between number of bathrooms and house price?

Question 5: State the mean weight for nursing and engineering majors below.

Question 6: What does the regression equation tell us about how the mean weight of Nursing majors and Engineering majors compare?

Question 7: What proportion of Nursing majors are female? What proportion of Engineering majors are female?

Question 8: Why is gender a reasonable or not reasonable choice to consider as a confounding variable?

Question 9: Are major and weight significantly associated after controlling for gender?

Submit your answers in Moodle under Mini-Assignment 9.

CODEBOOK: Saratoga Houses

The dataset contains information on 1,063 houses in Saratoga County, New York, USA in 2006.

The variables in the dataset include:

  • Price: Price of the house in US dollars
  • Living.Area: Living area of the house in square feet
  • Baths: Number of bathrooms
  • Bedrooms: Number of bedrooms
  • Fireplace: Whether the house has a fireplace (“yes” or “no”)
  • Acres: Number of acres on the property
  • Age: Age of house in years
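
The two Saratoga regressions (price on bathrooms alone, then price on bathrooms while controlling for living area) could be sketched as follows. This is only an illustration and assumes Python with pandas and statsmodels, which may not be the software used in your course. The small data frame is made up so the code runs on its own and only mimics the codebook's column names; its numbers say nothing about the real dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Made-up values for illustration only. In practice, load the real file
# instead, e.g. pd.read_csv("saratoga.csv") (hypothetical file name).
houses = pd.DataFrame({
    "Price": [120000, 150000, 160000, 200000, 230000, 250000, 300000, 350000],
    "Living.Area": [1000, 1200, 1500, 1800, 2000, 2400, 2800, 3200],
    "Baths": [1.0, 1.5, 1.0, 2.0, 2.5, 2.0, 3.0, 3.5],
})

# Simple regression: Price is the response, Baths the explanatory variable.
simple = smf.ols("Price ~ Baths", data=houses).fit()

# Multiple regression: same model with Living.Area added as a control.
# Q("...") is needed because the column name contains a dot.
adjusted = smf.ols('Price ~ Baths + Q("Living.Area")', data=houses).fit()

# Compare the Baths coefficient and its p-value before and after adjustment.
print(simple.params, simple.pvalues, sep="\n")
print(adjusted.params, adjusted.pvalues, sep="\n")
```

The intercept and slope in `params` give the regression equation, and the `pvalues` entry for `Baths` is what to check for significance in each model.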


CODEBOOK: Student Health

The data come from a self-reported volunteer sample of 100 students majoring in nursing or engineering at the University of Miami in 2014. The following variables were recorded:

  • major: Student's major, either “Nursing” or “Engineering”
  • gender: Listed as “Male” or “Female”
  • sleep: Estimated number of hours slept on a typical night
  • depression: Diagnosed as depressed (1) or not diagnosed as depressed (0)
  • weight: Student weight in pounds
  • smedia: Estimated amount of time (in hours) spent on social media each day.
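
The Student Health steps (group means, regression on a categorical variable, a C->C proportion table, and a model that controls for gender) could be sketched as follows. As before, this assumes Python with pandas and statsmodels rather than any particular course software. The eight rows are invented so the example is self-contained, so the numbers it prints are not answers to the questions above.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Made-up rows for illustration only; replace with the real dataset.
students = pd.DataFrame({
    "major": ["Nursing"] * 4 + ["Engineering"] * 4,
    "gender": ["Female", "Female", "Female", "Male",
               "Male", "Male", "Male", "Female"],
    "weight": [130, 140, 125, 170, 180, 175, 165, 135],
})

# Mean weight within each major.
mean_by_major = students.groupby("major")["weight"].mean()

# Regression with a categorical explanatory variable. The formula interface
# dummy-codes major, using the alphabetically first level ("Engineering")
# as the reference group, so the intercept is that group's mean and the
# major[T.Nursing] coefficient is the difference in means.
by_major = smf.ols("weight ~ major", data=students).fit()

# C->C table: proportion of each gender within each major (rows sum to 1).
gender_by_major = pd.crosstab(students["major"], students["gender"],
                              normalize="index")

# Same regression with gender added as a control variable.
adjusted = smf.ols("weight ~ major + gender", data=students).fit()

print(mean_by_major)
print(by_major.params)
print(gender_by_major)
print(adjusted.params, adjusted.pvalues, sep="\n")
```

Comparing the `major[T.Nursing]` coefficient and p-value in `by_major` and `adjusted` shows whether the association between major and weight changes once gender is in the model.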