Exploring Third Variables

Pre-Class Reading and Video

There are several different types of third variables that we may need to explore when studying a research question. We will consider three in this course: covariates, moderators, and confounders. The focus of this section is on moderating variables, but let’s briefly discuss all three types.

  • A covariate is a variable that is possibly predictive of the outcome under study. Think back to your literature review: are there any variables that the research shows are strong predictors of your response variable? Covariates will be discussed in more detail in the multivariate modeling material to come.
  • A confounder is a variable that influences both the explanatory variable and the response variable. It may make it appear that there is a relationship between your explanatory and response variables when in fact there is not. Confounders will also be discussed in more detail in the multivariate modeling material to come.
  • A moderator is a third variable that appears to affect the direction or strength of the relationship between an explanatory and a response variable. The effect of a moderating variable is often characterized statistically as an interaction; that is, the direction and/or strength of the relationship between your explanatory (X) and response (Y) variable depends on the value of the third variable.
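
One common way to write an interaction down, offered here only as a sketch and not necessarily the form used later in the course, is a linear model with a product term. The symbols Z (the third variable), the coefficients, and the error term are notation assumed for this illustration:

```latex
Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 (X \times Z) + \varepsilon
```

In this form the slope relating X to Y is \beta_1 + \beta_3 Z, so it changes with Z whenever \beta_3 is not zero. If \beta_3 = 0, the relationship between X and Y is the same at every value of Z, and Z is not acting as a moderator.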

 

Visualizing Moderating Variables

Parental Relationships and School Detention in Middle School Students

Suppose we are interested in testing whether the strength of parental relationships (as reported by middle school students on a scale from 1 (low) to 5 (high)) is related to whether a student receives a detention within the first month of school. A bivariate graph reveals the following:

The graph above does suggest that there is an association between relationship strength and detention. Specifically, students with a stronger relationship tend to receive detention less often. But does this relationship hold for students in all grades? Let’s take a look when we break up the graph by grade level.

Notice that the relationship appears strong for 6th grade students in the graph above, very mild for students in 7th grade, and non-existent for students in 8th grade. Since the association between relationship strength and detention differs by grade, this suggests that grade is a potential moderating variable.

Interview Scores and Training Program

Suppose we are trying to study which training program, A or B, is more beneficial for job hunters during their interviews. Suppose we obtain the following graph:

In the graph above, notice that Program A is more effective than Program B across the board. However, Program A results in a much higher average score for the older age groups than for the younger age groups. Since the strength of the relationship between Program and Interview Score seems to vary based on age group, age group is a potential moderator.

Now suppose instead we obtain the following graph:

Once again, the graph above shows that Program A is more effective than Program B across the board. Since Program A is consistently 1 unit higher than Program B within each age group, the relationship does not vary based on age. Therefore, age does not appear to moderate the relationship.
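
The comparison in the two graphs above can be sketched as a small calculation: take the difference between the Program A and Program B means within each age group and check whether that gap changes from group to group. The age group labels, the mean scores, and the function names below are hypothetical, chosen only to mirror the "consistently 1 unit higher" case.

```python
# Hypothetical mean interview scores by age group and program
# (illustrative values only, not read from the course graphs).
mean_scores = {
    "18-29": {"A": 6.0, "B": 5.0},
    "30-44": {"A": 7.0, "B": 6.0},
    "45+": {"A": 8.0, "B": 7.0},
}


def program_gaps(scores):
    """Return the Program A minus Program B difference within each age group."""
    return {group: means["A"] - means["B"] for group, means in scores.items()}


def gap_varies(scores, tolerance=0.0):
    """Return True if the A-minus-B gap differs across groups by more than tolerance."""
    gaps = list(program_gaps(scores).values())
    return max(gaps) - min(gaps) > tolerance


print(program_gaps(mean_scores))
print("Gap varies across age groups:", gap_varies(mean_scores))
```

A constant gap corresponds to the second graph (no apparent moderation). If the gap grew with age, as in the first graph, `gap_varies` would return True. With real data the group means are estimates, so a formal test is still needed before concluding that the gaps truly differ.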

Party Affiliation and Voting

Suppose that we are looking to discover whether there is a relationship between Party Affiliation and whether someone voted. A bivariate graph shows the proportion of people who voted within each party:

There does not appear to be a relationship between Party Affiliation and Voting in the plot above, since 45% of each party’s members voted.

However, if we break it down by gender, notice what might happen:

 

Within each subgroup (males and females), a relationship between Party Affiliation and Voting appears, and it is not the same for the two genders. In particular, female Republicans are expected to vote 30% more often than female Democrats, while male Republicans are expected to vote 30% less often than male Democrats. The direction of the relationship is reversed. This suggests that gender is potentially a moderating variable.
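
To see how identical overall voting rates can hide opposite patterns within subgroups, here is a minimal sketch using made-up counts. The numbers and the function name are hypothetical, picked only so that each party has a 45% overall voting rate while the direction of the party difference reverses between genders.

```python
# Hypothetical counts: (gender, party) -> (number who voted, group size).
counts = {
    ("Female", "Democrat"): (30, 100),
    ("Female", "Republican"): (60, 100),
    ("Male", "Democrat"): (60, 100),
    ("Male", "Republican"): (30, 100),
}


def rate_by_party(counts, gender=None):
    """Proportion voting within each party, optionally restricted to one gender."""
    voted, total = {}, {}
    for (g, party), (v, n) in counts.items():
        if gender is not None and g != gender:
            continue
        voted[party] = voted.get(party, 0) + v
        total[party] = total.get(party, 0) + n
    return {party: voted[party] / total[party] for party in voted}


print("Overall:", rate_by_party(counts))
for g in ("Female", "Male"):
    print(g + ":", rate_by_party(counts, gender=g))
```

The overall rates are equal for the two parties, yet the rates within each gender differ and point in opposite directions. This is why a flat bivariate graph does not rule out a moderator.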

 

Examples of moderators within the context of your research question:

I have hypotheses about the association between smoking quantity and nicotine dependence for individuals with and without depression. For example, for those with depression, any amount of smoking may indicate substantial risk for nicotine dependence (i.e., at both low and high levels of daily smoking), while among those without depression, smoking quantity might be expected to be more clearly associated with likelihood of experiencing nicotine dependence (i.e., the more one smokes, the more likely they are to be nicotine dependent). In other words, I am hypothesizing a non-significant association between smoking and nicotine dependence for individuals with depression and a significant, positive association between smoking and nicotine dependence for individuals without depression.

To test this, I can run two ANOVA tests, one examining the association between nicotine dependence (categorical) and level of smoking (quantitative) for those with depression and one examining the association between nicotine dependence (categorical) and level of smoking (quantitative) for those without depression.

The results show a significant association between smoking and nicotine dependence, such that the greater the smoking, the higher the rate of nicotine dependence, among individuals both with and without depression. In this example, we would say that depression does not moderate the relationship between smoking and nicotine dependence. In other words, the relationship between smoking and nicotine dependence is consistent for those with and without depression.

I have a similar question regarding alcohol dependence. Specifically, I believe that the association between smoking quantity and nicotine dependence is different for individuals with and without alcohol dependence (the potential moderator). For those individuals with alcohol dependence, I believe that smoking and nicotine dependence will not be associated (i.e., there will be high rates of nicotine dependence at low, moderate, and high levels of smoking), while among those without alcohol dependence, smoking quantity will be significantly associated with the likelihood of experiencing nicotine dependence (i.e., the more one smokes, the more likely they are to be nicotine dependent). In other words, I am hypothesizing a non-significant association between smoking and nicotine dependence for individuals with alcohol dependence and a significant, positive association between smoking and nicotine dependence for individuals without alcohol dependence.

To test this, I run two ANOVA tests, one examining the association between smoking and nicotine dependence for those with alcohol dependence and one examining the association between smoking and nicotine dependence for those without alcohol dependence.

The results show that there is a significant association between smoking and nicotine dependence but, as I hypothesized, only for those without alcohol dependence. That is, for those without alcohol dependence, nicotine dependence is positively associated with level of smoking. In contrast, for those with alcohol dependence, the association between smoking and nicotine dependence is non-significant (statistically similar rates of nicotine dependence at every level of smoking).

Because the relationship between the explanatory variable (smoking) and the response variable (nicotine dependence) is different based on the presence or absence of our third variable (alcohol dependence), we would say that alcohol dependence moderates the relationship between nicotine dependence and smoking.
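
The "run the test once per subgroup" approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the records are invented, the function names are hypothetical, and the code computes just the one-way ANOVA F statistic by hand. In practice you would use R or Stata (as in the videos) to get the F statistic together with its p-value.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of samples (one list per group)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def f_by_stratum(records):
    """Split records by the moderator, then compute F within each level."""
    strata = {}
    for moderator, group, value in records:
        strata.setdefault(moderator, {}).setdefault(group, []).append(value)
    return {level: one_way_f(list(groups.values())) for level, groups in strata.items()}


# Hypothetical records:
# (alcohol dependence, nicotine dependence, cigarettes smoked per day)
records = (
    [("no", "no", c) for c in (5, 6, 4, 7, 5)]
    + [("no", "yes", c) for c in (15, 18, 14, 17, 16)]
    + [("yes", "no", c) for c in (10, 12, 9, 14, 11)]
    + [("yes", "yes", c) for c in (11, 10, 13, 12, 11)]
)

for level, f_stat in f_by_stratum(records).items():
    print("Alcohol dependence =", level, "F =", round(f_stat, 2))
```

In this invented data the F statistic is large among those without alcohol dependence and close to zero among those with it, which is the pattern that would point to moderation. Whether each F is statistically significant still has to be judged from its p-value.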

Please watch the video below:

Find R video here.
Find Stata video here.

Pre-Class Quiz

After reviewing the material above, take Quiz 10 in Moodle. Please note that you have 2 attempts for this quiz and the higher grade prevails.

During Class Tasks

Mini-Assignment 7
Project Component I: click here