Mini-Assignment 11

Questions

Familiarize yourself with the codebook for the Saratoga Houses dataset (codebook below) and then import/load the Saratoga Houses dataset.

Question 1: Construct a regression equation that predicts price from number of bathrooms. How, specifically, would you interpret the relationship between number of bathrooms and house price?
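
As a starting point, the fitting step can be sketched in Python. Python is an assumption here; your course software may differ. The `fit_line` helper and the five (baths, price) pairs are hypothetical and are not taken from the Saratoga Houses data. With the real data you would pass the Baths and Price columns instead.

```python
from statistics import fmean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up illustration values, NOT from the Saratoga Houses dataset.
baths = [1, 1.5, 2, 2.5, 3]
price = [100_000, 130_000, 150_000, 185_000, 210_000]

b0, b1 = fit_line(baths, price)
print(f"predicted price = {b0:.0f} + {b1:.0f} * baths")
```

The slope `b1` is read as the predicted change in price for each additional bathroom. The number printed here describes only the made-up values above.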

Question 2: Does the regression equation allow us to establish causation between number of bathrooms and house price?

Question 3: Why might living area be considered as a possible confounding variable for the relationship between number of bathrooms and house price?

Question 4: Construct a multiple regression equation to test whether number of bathrooms is significantly associated with price after controlling for living area. After controlling for living area, is there a significant association between number of bathrooms and house price?
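
A minimal sketch of fitting a model with two predictors, assuming Python with NumPy is available (an assumption; the course may use other software). The `ols` helper and all data values are hypothetical and are not from the Saratoga Houses dataset.

```python
import numpy as np

def ols(y, *predictors):
    """Least-squares fit of y on an intercept plus the given predictors.

    Returns (coefficients, standard_errors), intercept first.
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]            # residual degrees of freedom
    sigma2 = resid @ resid / dof         # estimated error variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, se

# Made-up illustration values, NOT from the Saratoga Houses dataset.
baths = [1, 1, 1.5, 2, 2, 2.5, 3, 3]
living_area = [900, 1200, 1300, 1500, 2100, 1900, 2400, 3000]
price = [95_000, 120_000, 135_000, 150_000, 205_000, 190_000, 240_000, 290_000]

coef_simple, _ = ols(price, baths)                  # price ~ baths
coef_multi, se_multi = ols(price, baths, living_area)  # price ~ baths + living area

print("baths slope, baths only:      ", coef_simple[1])
print("baths slope, with living area:", coef_multi[1])
print("t-statistic for baths:        ", coef_multi[1] / se_multi[1])
```

The t-statistic is the coefficient divided by its standard error. The matching p-value comes from a t distribution with n - 3 degrees of freedom, which statistical software reports directly.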

Question 5: Based on your findings, does living area confound the relationship between number of bathrooms and house price?

Familiarize yourself with the codebook for the StudentHealth dataset (codebook below) and then import/load the data.

Question 6: Determine the mean weight of students based on major.
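
One way to compute a mean within each group, sketched in plain Python (an assumption; your course software may offer a built-in procedure). The `mean_by_group` helper and the six rows are hypothetical and are not from the StudentHealth dataset.

```python
from collections import defaultdict
from statistics import fmean

def mean_by_group(rows, group_key, value_key):
    """Average value_key within each level of group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {level: fmean(values) for level, values in groups.items()}

# Made-up illustration rows, NOT from the StudentHealth dataset.
students = [
    {"major": "Nursing", "weight": 130},
    {"major": "Nursing", "weight": 140},
    {"major": "Nursing", "weight": 150},
    {"major": "Engineering", "weight": 160},
    {"major": "Engineering", "weight": 170},
    {"major": "Engineering", "weight": 180},
]

means = mean_by_group(students, "major", "weight")
print(means)
```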

Question 7: Now construct a regression equation with weight as your response variable and major as your explanatory variable. What does the regression equation tell us about how the mean weight of Nursing majors and Engineering majors compare?
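
With a two-level categorical explanatory variable, the fitted intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means. A minimal sketch, assuming Python with NumPy and assuming Engineering is coded 0 (which level your software treats as the reference may differ). The weights are made up and are not from the StudentHealth dataset.

```python
import numpy as np

# Made-up illustration values, NOT from the StudentHealth dataset.
major = ["Nursing", "Nursing", "Nursing", "Engineering", "Engineering", "Engineering"]
weight = np.array([130, 140, 150, 160, 170, 180], dtype=float)

# Indicator (dummy) variable: 1 for Nursing, 0 for Engineering (the reference level).
is_nursing = np.array([m == "Nursing" for m in major], dtype=float)

# np.polyfit with degree 1 returns [slope, intercept].
slope, intercept = np.polyfit(is_nursing, weight, 1)

mean_nursing = weight[is_nursing == 1].mean()
mean_engineering = weight[is_nursing == 0].mean()

print("intercept:", intercept, "| Engineering mean:", mean_engineering)
print("slope:    ", slope, "| Nursing mean minus Engineering mean:",
      mean_nursing - mean_engineering)
```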

Question 8: Now, find the proportion of students in each major who are male and who are female (HINT: Look at the translation syntax for bivariate C->C). What proportion of Nursing majors are female? What proportion of Engineering majors are female?
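
The proportions asked for here are conditional on major, so each major's proportions sum to 1. A sketch in plain Python (an assumption; the translation syntax for your course software may look different). The `row_proportions` helper and the counts are hypothetical and are not from the StudentHealth dataset.

```python
from collections import Counter

def row_proportions(rows, row_key, col_key):
    """Proportion of each col_key level within each row_key level."""
    pair_counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_totals = Counter(r[row_key] for r in rows)
    return {(a, b): n / row_totals[a] for (a, b), n in pair_counts.items()}

# Made-up counts, NOT from the StudentHealth dataset: (major, gender, count).
cells = [
    ("Nursing", "Female", 3),
    ("Nursing", "Male", 1),
    ("Engineering", "Female", 1),
    ("Engineering", "Male", 3),
]
students = [{"major": m, "gender": g} for m, g, n in cells for _ in range(n)]

props = row_proportions(students, "major", "gender")
for (m, g), p in sorted(props.items()):
    print(f"{m:12s} {g:7s} {p:.2f}")
```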

Question 9: Why might gender be a possible confounding variable?

Question 10: Construct the appropriate model to determine whether weight and major are significantly associated after controlling for gender. Are major and weight significantly associated after controlling for gender?
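
A sketch of how controlling for a second categorical variable can change a coefficient, assuming Python with NumPy (an assumption; the course may use other software). The data are constructed so that weight depends only on gender while the two majors have different gender mixes. They are not from the StudentHealth dataset and say nothing about what the real data will show.

```python
import numpy as np

# Made-up cells, NOT from the StudentHealth dataset:
# (major, gender, weight in pounds, number of students).
cells = [
    ("Nursing", "Female", 130, 3),
    ("Nursing", "Male", 170, 1),
    ("Engineering", "Female", 130, 1),
    ("Engineering", "Male", 170, 3),
]
rows = [(m, g, w) for m, g, w, n in cells for _ in range(n)]

weight = np.array([w for _, _, w in rows], dtype=float)
engineering = np.array([m == "Engineering" for m, _, _ in rows], dtype=float)
male = np.array([g == "Male" for _, g, _ in rows], dtype=float)
ones = np.ones(len(rows))

# weight ~ major
unadjusted, *_ = np.linalg.lstsq(
    np.column_stack([ones, engineering]), weight, rcond=None)

# weight ~ major + gender
adjusted, *_ = np.linalg.lstsq(
    np.column_stack([ones, engineering, male]), weight, rcond=None)

print("major coefficient, major only:       ", unadjusted[1])
print("major coefficient, gender controlled:", adjusted[1])
```

In this constructed example the coefficient for major shrinks once gender enters the model, which is the pattern confounding by gender would produce. Judging significance in the real data still requires the p-values that your statistical software reports.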


CODEBOOK: Saratoga Houses

The dataset contains information on 1,063 houses in Saratoga County, New York, USA in 2006.

The variables in the dataset include:

  • Price: Price of house in US dollars
  • Living.Area: Square feet of house (in SAS, the variable is living_area)
  • Baths: Number of baths
  • Bedrooms: Number of bedrooms
  • Fireplace: “yes” or “no”
  • Acres: Number of acres on the property
  • Age: Age of house in years

CODEBOOK: Student Health

The data come from self-reported responses of 100 student volunteers majoring in nursing or engineering at the University of Miami in 2014. The following variables were recorded:

  • major: Either “Nursing” or “Engineering” based on sample collected
  • gender: Listed as “Male” or “Female”
  • sleep: Estimated number of hours slept on a typical night
  • depression: diagnosed as depressed (1) or not diagnosed as depressed (0)
  • weight: Student weight in pounds
  • smedia: Estimated amount of time (in hours) spent on social media each day