- Familiarize yourself with the codebook for the Saratoga Houses dataset (codebook below).
- Import/load the Saratoga Houses dataset.
- Now construct a regression equation with price as your response variable and number of bathrooms as your explanatory variable. Answer Questions 1-3 below.
- Construct a multiple regression equation to test whether number of bathrooms is significantly associated with price after controlling for living area. Answer Question 4 below.
- Familiarize yourself with the codebook for the Student Health dataset (codebook below).
- Import/load the Student Health dataset.
- Determine the mean weight of students based on major. Answer Question 5 below.
- Now construct a regression equation with weight as your response variable and major as your explanatory variable. Answer Question 6 below.
- Now, find the proportion of students in each major who are male and who are female (HINT: Look at the translation syntax for bivariate C->C). Answer Question 7 below.
- Construct the appropriate model to determine whether weight and major are significantly associated after controlling for gender. Answer Questions 8-9 below.
Question 1: What is the appropriate regression equation? How can you specifically explain the relationship between number of bathrooms and house price?
Question 2: Why is number of bathrooms significantly associated with house price?
Question 3: Why should living area be considered as a possible confounding variable?
Question 4: After controlling for living area, is there a significant association between number of bathrooms and house price?
Question 5: State the mean weight for Nursing and Engineering majors below.
Question 6: What does the regression equation tell us about how the mean weight of Nursing majors and Engineering majors compare?
Question 7: What proportion of Nursing majors are female? What proportion of Engineering majors are female?
Question 8: Why is gender a reasonable or not reasonable choice to consider as a confounding variable?
Question 9: Are major and weight significantly associated after controlling for gender?
Submit your answers in Moodle under Mini-Assignment 9.
CODEBOOK: Saratoga Houses
The dataset contains information on 1,063 houses in Saratoga County, New York, USA in 2006.
The variables in the dataset include:
- Price: Price of house in US dollars
- Living.Area: Square feet of house (in SAS, the variable is living_area)
- Baths: Number of baths
- Bedrooms: Number of bedrooms
- Fireplace: “yes” or “no”
- Acres: Number of acres on the property
- Age: Age of house in years
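The two Saratoga regressions described in the steps above can be sketched in Python as follows. This is only an illustration, not the course's required syntax: it assumes pandas and statsmodels are available, and it runs on synthetic stand-in data rather than the real file, so none of the numbers it prints are answers to the questions. The name `houses` and the renamed column `Living_Area` are choices made for this example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (NOT the real Saratoga Houses file). With the real
# file you would load it instead, for example with pd.read_csv(<your path>).
rng = np.random.default_rng(0)
n = 200
living_area = rng.normal(1800, 500, n)
# Bigger houses tend to have more bathrooms, which is what makes living area
# a candidate confounder in this made-up data.
baths = np.round(np.clip(living_area / 900 + rng.normal(0, 0.5, n), 1, 4) * 2) / 2
price = 20000 + 90 * living_area + 5000 * baths + rng.normal(0, 30000, n)
houses = pd.DataFrame({"Price": price, "Living.Area": living_area, "Baths": baths})

# A dot in a column name is awkward inside a model formula, so rename it.
houses = houses.rename(columns={"Living.Area": "Living_Area"})

# Simple regression: Price as response, Baths as the only explanatory variable.
simple = smf.ols("Price ~ Baths", data=houses).fit()

# Multiple regression: Baths after controlling for Living_Area.
adjusted = smf.ols("Price ~ Baths + Living_Area", data=houses).fit()

print(simple.params)               # intercept and slope for Baths
print(simple.pvalues["Baths"])     # p-value for Baths, unadjusted
print(adjusted.params)             # intercept and both slopes
print(adjusted.pvalues["Baths"])   # p-value for Baths, adjusted for living area
```

Comparing the Baths coefficient and its p-value between `simple` and `adjusted` is the comparison Question 4 asks about; `simple.summary()` prints the full regression table if you prefer to read it there.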
CODEBOOK: Student Health
The data were self-reported by 100 student volunteers majoring in nursing or engineering at the University of Miami in 2014. The following variables were recorded:
- major: Either “Nursing” or “Engineering” based on sample collected
- gender: Listed as “Male” or “Female”
- sleep: Estimated number of hours slept on a typical night
- depression: Diagnosed as depressed (1) or not diagnosed as depressed (0)
- weight: Student weight in pounds.
- smedia: Estimated amount of time (in hours) spent on social media each day.
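The Student Health steps (group means, proportions by major, and the two regressions) can be sketched the same way. Again this is a hedged illustration on synthetic stand-in data, assuming pandas and statsmodels; the name `students` and the generated values are invented for the example and say nothing about the real dataset. Only the column names `major`, `gender`, and `weight` come from the codebook.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (NOT the real Student Health file).
rng = np.random.default_rng(1)
n = 100
major = rng.choice(["Nursing", "Engineering"], size=n)
# Made-up assumption: the gender mix differs by major in this fake data.
p_female = np.where(major == "Nursing", 0.8, 0.3)
gender = np.where(rng.random(n) < p_female, "Female", "Male")
weight = np.where(gender == "Male", 175.0, 140.0) + rng.normal(0, 15, n)
students = pd.DataFrame({"major": major, "gender": gender, "weight": weight})

# Mean weight within each major.
mean_weight = students.groupby("major")["weight"].mean()

# Bivariate C->C: row proportions, so each major's row sums to 1.
gender_by_major = pd.crosstab(students["major"], students["gender"], normalize="index")

# Regression with a categorical explanatory variable. C() marks it as
# categorical; the alphabetically first level (Engineering) is the reference.
unadjusted = smf.ols("weight ~ C(major)", data=students).fit()

# Same model after controlling for gender.
adjusted = smf.ols("weight ~ C(major) + C(gender)", data=students).fit()

print(mean_weight)
print(gender_by_major)
print(unadjusted.params)
print(adjusted.params)
print(adjusted.pvalues)
```

With a single two-level categorical predictor, the intercept of `unadjusted` equals the mean weight of the reference group (Engineering here) and the `C(major)[T.Nursing]` coefficient equals the difference between the two group means, which is the link between the group means and the regression equation.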