Graphing in Python with seaborn
The seaborn commands below are illustrated with data from the HELP study (data set name: HELPrct), a clinical trial of adult inpatients recruited from a detoxification unit. The variables used throughout this tutorial are depression (cesd), homelessness status (homeless), primary abuse substance (substance), the patient's age (age), and the patient's gender (sex).
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the HELPrct data; adjust the path to wherever your copy of the file is stored
df = pd.read_csv('HELPrct.csv')
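If the CSV file is not at hand, a stand-in DataFrame with the same column names is enough to try the plotting commands. The sketch below is only an assumption about the data's shape: the values are random, the sample size is arbitrary, and the category labels and the 0 to 60 range for cesd are assumed rather than read from the file. The resulting plots will not resemble the real study.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for HELPrct: random values, NOT the real study data
rng = np.random.default_rng(seed=0)
n = 300  # arbitrary sample size chosen for illustration

df = pd.DataFrame({
    "cesd": rng.integers(0, 61, size=n),   # depression score; 0-60 range is assumed
    "homeless": rng.choice(["homeless", "housed"], size=n),
    "substance": rng.choice(["alcohol", "cocaine", "heroin"], size=n),
    "age": rng.integers(19, 61, size=n),   # assumed adult age range
    "sex": rng.choice(["female", "male"], size=n),
})

# Quick checks before plotting: column types and group sizes
print(df.dtypes)
print(df["substance"].value_counts())
```

Running the same two checks on the real file is a quick way to confirm that substance was read as a categorical (text) column and cesd as a numeric one.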
Univariate Graphing
Bar plot for a single categorical variable (substance)
sns.countplot(data=df, x='substance')
plt.title("Primary abuse substance of subjects")
plt.show()
Histogram of a quantitative variable (cesd)
sns.histplot(data=df, x='cesd', bins=30)
plt.title("Depression Scores of Subjects")
plt.show()
Density plot of a quantitative variable (cesd)
sns.kdeplot(data=df, x='cesd')
plt.title("Depression Scores of Subjects")
plt.show()
Bivariate Graphing
Bar plot of means (grouped by substance)
mean_df = df.groupby('substance')['cesd'].mean().reset_index()
sns.barplot(data=mean_df, x='substance', y='cesd')
plt.ylabel("Depression")
plt.title("Mean Depression Scores by Primary Abuse Substance")
plt.show()
Boxplots
sns.boxplot(data=df, x='substance', y='cesd', hue='substance')
plt.ylabel("Depression")
plt.title("Distribution of Depression Scores by Primary Abuse Substance")
plt.show()
Density plots by group
sns.kdeplot(data=df, x='cesd', hue='substance')
plt.xlabel("Depression")
plt.title("Density of Depression Scores by Primary Abuse Substance")
plt.show()
Mean with Error Bars
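from scipy.stats import sem
# Mean and standard error of cesd within each substance group
summary_df = df.groupby('substance')['cesd'].agg(['mean', sem]).reset_index()
# fmt='o' draws markers only, so the group means are not joined by a line
plt.errorbar(x=summary_df['substance'], y=summary_df['mean'],
             yerr=summary_df['sem'], fmt='o', capsize=5)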
plt.xlabel("Substance")
plt.ylabel("Depression")
plt.title("Mean Depression Scores with Error Bars by Substance")
plt.show()
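The error bars above are standard errors of the mean: the sample standard deviation of each group divided by the square root of the group size. The following sketch checks that by hand on a tiny made-up frame (the demo values are hypothetical, not HELPrct data). It uses pandas' built-in 'sem' aggregation, which follows the same ddof=1 default as scipy.stats.sem, so scipy is not needed here.

```python
import numpy as np
import pandas as pd

# Hypothetical scores for illustration only (not HELPrct values)
demo = pd.DataFrame({
    "substance": ["alcohol"] * 4 + ["cocaine"] * 4,
    "cesd": [30, 34, 38, 42, 20, 25, 30, 45],
})

grouped = demo.groupby("substance")["cesd"]

# Standard error of the mean = sample standard deviation / sqrt(group size)
manual_sem = grouped.std(ddof=1) / np.sqrt(grouped.count())

# pandas' 'sem' aggregation uses the same ddof=1 convention
summary = grouped.agg(["mean", "sem"])

print(summary)
print(np.allclose(manual_sem, summary["sem"]))
```

Because the standard error shrinks as the group grows, wider bars in the plot can reflect a smaller group as well as more variable scores.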