**Directions**

- Familiarize yourself with the codebook for the Saratoga Houses dataset (codebook below).
- Import/load the Saratoga Houses dataset.
- Now construct a regression equation with price as your response variable and number of bathrooms as your explanatory variable. Answer Questions 1-3 below.
- Construct a multiple regression equation to test whether number of bathrooms is significantly associated with price after controlling for living area. Answer Question 4.
- Familiarize yourself with the codebook for the Student Health dataset (codebook below).
- Import/load the Student Health dataset.
- Determine the mean weight of students based on major. Answer Question 5 below.
- Now construct a regression equation with weight as your response variable and major as your explanatory variable. Answer Question 6 below.
- Now, find the proportion of each major who are male and female (HINT: Look at the translation syntax for bivariate C->C). Answer Question 7 below.
- Construct the appropriate model to determine whether weight and major are significantly associated after controlling for gender. Answer Questions 8-9 below.

**Questions**

*Question 1:* What is the appropriate regression equation? How can you specifically explain the relationship between number of bathrooms and house price?

*Question 2:* Why is number of bathrooms significantly associated with house price?

*Question 3:* Why should living area be considered as a possible confounding variable?

*Question 4:* After controlling for living area, is there a significant association between number of bathrooms and house price?

*Question 5:* State the mean weight for Nursing and Engineering majors below.

*Question 6:* What does the regression equation tell us about how the mean weights of Nursing majors and Engineering majors compare?

*Question 7:* What proportion of Nursing majors are female? What proportion of Engineering majors are female?

*Question 8:* Is gender a reasonable choice to consider as a confounding variable? Why or why not?

*Question 9:* Are major and weight significantly associated after controlling for gender?

Submit your answers in Moodle under Mini-Assignment 9.

CODEBOOK: Saratoga Houses

The dataset contains information on 1,063 houses in Saratoga County, New York, USA in 2006.

The variables in the dataset include:

- Price: Price of house in US dollars
- Living.Area: Square feet of house (in SAS, the variable is living_area)
- Baths: Number of baths
- Bedrooms: Number of bedrooms
- Fireplace: “yes” or “no”
- Acres: Number of acres on the property
- Age: Age of house in years
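
As a rough illustration of what "controlling for living area" does to a coefficient, the sketch below fits a one-predictor and a two-predictor least-squares model in plain Python. The rows are synthetic and are not taken from the Saratoga Houses dataset: price is constructed to depend on living area only, while the number of baths merely rises with living area. The helper names (`cross`, `simple_ols`, `two_predictor_ols`) are hypothetical, and the sketch reports coefficients only; the p-values needed to judge significance would come from your statistical software.

```python
# Synthetic illustration only: these rows are NOT from the Saratoga Houses data.
# Price is built to depend on living area alone; Baths merely tracks living area.
living_area = [1000, 1200, 1500, 1800, 2000, 2400, 2600, 3000]  # Living.Area
baths = [1.0, 1.5, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5]                # Baths
price = [50_000 + 100 * a for a in living_area]                 # Price


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    b1 = cross(x, y) / cross(x, x)
    return mean(y) - b1 * mean(x), b1


def two_predictor_ols(x, z, y):
    """Least-squares fit of y = b0 + b1*x + b2*z; returns (b0, b1, b2)."""
    sxx, szz, sxz = cross(x, x), cross(z, z), cross(x, z)
    sxy, szy = cross(x, y), cross(z, y)
    det = sxx * szz - sxz ** 2  # zero only if x and z are perfectly collinear
    b1 = (szz * sxy - sxz * szy) / det
    b2 = (sxx * szy - sxz * sxy) / det
    return mean(y) - b1 * mean(x) - b2 * mean(z), b1, b2


_, baths_alone = simple_ols(baths, price)
_, baths_adjusted, area_coef = two_predictor_ols(baths, living_area, price)

print(f"Baths slope, simple model:         {baths_alone:,.2f}")
print(f"Baths slope, adjusted for area:    {baths_adjusted:,.2f}")
print(f"Living.Area slope, adjusted model: {area_coef:,.2f}")
```

In this constructed example the baths slope is large when baths is the only predictor and drops to essentially zero once living area enters the model. The real data need not behave this way.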

CODEBOOK: Student Health

These data were self-reported by 100 student volunteers majoring in nursing or engineering at the University of Miami in 2014. The following variables were recorded:

- major: Either “Nursing” or “Engineering” based on sample collected
- gender: Listed as “Male” or “Female”
- sleep: Estimated number of hours slept on a typical night
- depression: diagnosed as depressed (1) or not diagnosed as depressed (0)
- weight: Student weight in pounds.
- smedia: Estimated amount of time (in hours) spent on social media each day.
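
The sketch below illustrates, on a handful of synthetic rows (not the Student Health data), how group means, group proportions, and a regression on a two-level categorical variable relate to one another. It assumes Nursing is coded 0 (the reference group) and Engineering is coded 1; that coding is a choice made here for illustration, and your software may pick a different reference group.

```python
from collections import defaultdict

# Synthetic illustration only: these rows are NOT from the Student Health data.
rows = [  # (major, gender, weight in pounds)
    ("Nursing", "Female", 130),
    ("Nursing", "Female", 140),
    ("Nursing", "Female", 135),
    ("Nursing", "Male", 175),
    ("Engineering", "Male", 180),
    ("Engineering", "Male", 170),
    ("Engineering", "Male", 185),
    ("Engineering", "Female", 145),
]

# Group the weights by major and count the females in each major.
weights = defaultdict(list)
females = defaultdict(int)
for major, gender, weight in rows:
    weights[major].append(weight)
    females[major] += gender == "Female"

mean_weight = {m: sum(w) / len(w) for m, w in weights.items()}
prop_female = {m: females[m] / len(weights[m]) for m in weights}

# Dummy-code major (assumed coding: Nursing = 0, Engineering = 1) and fit
# weight = intercept + slope * engineering by least squares.
x = [1 if major == "Engineering" else 0 for major, _, _ in rows]
y = [weight for _, _, weight in rows]
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
         / sum((a - x_bar) ** 2 for a in x))
intercept = y_bar - slope * x_bar

print("Mean weight by major:", mean_weight)
print("Proportion female by major:", prop_female)
print("Intercept (mean of reference group):", intercept)
print("Slope (difference in group means):", slope)
```

With a single 0/1 explanatory variable, the intercept equals the mean of the reference group and the slope equals the difference between the two group means. When the gender mix differs between majors, as it does in these synthetic rows, part of that difference may reflect gender rather than major.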