Working with Data
Pre-Class Readings and Videos
Now that you have a research question, it is time to look at the data. Raw data consist of long lists of numbers and/or labels that are not very informative. Exploratory Data Analysis (EDA) is how we make sense of the data by converting them from their raw form to a more informative one. In particular, EDA consists of:
- Organizing and summarizing the raw data,
- Discovering important features and patterns in the data and any striking deviations from those patterns, and then
- Interpreting our findings in the context of the problem.
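As a rough illustration of these three steps, the sketch below summarizes a small quantitative variable and flags values that deviate strikingly from the rest. The data, the `summarize` helper, and the "more than two standard deviations from the mean" rule are all assumptions made for this example; they are not the course's prescribed method or software.

```python
import statistics

# Hypothetical raw data: ages of nine survey respondents.
ages = [19, 21, 20, 22, 19, 20, 47, 21, 20]

def summarize(values):
    """Organize and summarize a list of numbers, flagging unusual values."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)  # sample standard deviation
    return {
        "sorted": sorted(values),
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        # Assumed rule of thumb: more than 2 standard deviations from the mean.
        "unusual": [v for v in values if abs(v - mean) > 2 * spread],
    }

summary = summarize(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Interpreting the result is the part the code cannot do: whether a flagged age is a data-entry error or a genuine older respondent depends on the context of the study.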
We begin EDA by looking at one variable at a time (also known as univariate analysis). In order to convert raw data into useful information we need to summarize and then examine the distribution of any variables of interest. By distribution of a variable, we mean:
- What values the variable takes, and
- How often the variable takes those values.
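For a categorical variable, this definition amounts to a frequency table. The following is a minimal sketch using only Python's standard library; the `responses` data and the `frequency_table` helper are hypothetical, chosen for illustration rather than taken from the course materials.

```python
from collections import Counter

# Hypothetical raw data: one categorical response per observation.
responses = ["yes", "no", "yes", "yes", "unsure", "no", "yes"]

def frequency_table(values):
    """Return (value, count, percent) rows, most frequent value first."""
    counts = Counter(values)  # what values occur, and how often
    total = len(values)
    return [
        (value, count, round(100 * count / total, 1))
        for value, count in counts.most_common()
    ]

for value, count, percent in frequency_table(responses):
    print(f"{value:<8}{count:>3}{percent:>7}%")
```

Each row pairs a value the variable takes with how often it takes that value, both as a count and as a percentage of all observations.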
Working with data that contain more than just a few observations and/or variables requires specialized software. The use of syntax (or formal code) in the context of statistical software is a central skill that we will be teaching you in this course. We believe that it will greatly expand your capacity not only for statistical application but also for engaging in deeper levels of quantitative reasoning about data.
Writing Your First Program
Empirical research is all about making decisions (the best ones possible with the information at hand). Please watch the video below.
After reviewing the material above, take Quiz 3 in Moodle. Please note that you have 2 attempts for this quiz and that the higher of the two grades counts.
During Class Tasks
Project Component D