Winter 2023
PLSC 30600 - Causal Inference
Questions of cause and effect are central to the study of political science and to the social sciences more broadly. But making inferences about causation from empirical data is a significant challenge. Critically, there is no simple, assumption-free process for learning about a causal relationship from the data alone. Causal inference requires researchers to make assumptions about the underlying data generating process in order to identify and estimate causal effects. The goal of this course is to provide students with a structured statistical framework for articulating the assumptions behind causal research designs and estimating effects using quantitative data.
The course begins by introducing the counterfactual framework of causal inference as a way of defining causal quantities of interest such as the “average treatment effect.” It then proceeds to illustrate a variety of different designs for identifying and estimating these quantities. We will start with the most basic experimental designs and progress to more complex experimental and observational methods. For each approach, we will discuss the necessary assumptions that a researcher needs to make about the process that generated the data, how to assess whether these assumptions are reasonable, how to interpret the quantity being estimated, and ultimately how to conduct the analysis.
This course will involve a combination of lectures, sections and problem sets. Lectures will focus on introducing the core theoretical concepts being taught in this course. Sections will emphasize application and demonstrate how to implement various causal inference techniques with real data sets. Problem sets will contain a mixture of both theoretical and applied questions and serve to reinforce key concepts and allow students to assess their progress and understanding throughout the course.
Assignments will involve analysis of data using the R programming language. This is a free and open source language for statistical computing that is used extensively for data analysis in many fields. Prior experience with the fundamentals of R programming is required.
PLSC 40502 - Data Analysis with Statistical Models
Statistical models provide a structure for the analysis of data. Many scientific questions revolve around drawing statistical inferences about some parameter, such as a regression coefficient. Models also allow researchers to generate predictions on new or out-of-sample data. Understanding the fundamentals of how to define, estimate, and validate a statistical model is essential to the process of quantitative empirical research.
This course is part of the second year of the Quantitative Methodology sequence in the Department of Political Science and builds on the first-year sequence (PLSC 30500, 30600, 30700). It will introduce students to likelihood and Bayesian inference with a focus on multilevel/hierarchical regression models. The overarching framework of this class is model-based inference for description and prediction, a complement to the design-based framework of PLSC 30600 Causal Inference. Students will learn both the theory behind Bayesian modeling and how to implement common estimators (e.g., Expectation-Maximization, Markov Chain Monte Carlo (MCMC)) in the R statistical programming language. Applied examples will be drawn from across the political science literature, with a particular emphasis on the analysis of large survey data (e.g., the American National Election Studies (ANES), the Cooperative Election Study (CES), the European Social Survey (ESS)).
This course will involve a combination of lectures and problem sets as well as a final research project. Lectures will focus on introducing the core theoretical concepts being taught in this course as well as providing illustrations through worked applied examples. Problem sets will contain a mixture of both theoretical and applied questions and serve to reinforce key concepts and allow students to assess their progress and understanding throughout the course. The final project consists of an 8-12 page research note applying the methods taught in the course to an actual data analysis task.
Assignments will involve analysis of data using the R programming language. This is a free and open source language for statistical computing that is used extensively for data analysis in many fields. Prior experience with the fundamentals of R programming is required.