Sep 22: Annie Dong: Variations of the Additive Factors Models of educational data

Join us Tuesday, Sep 22 at 11:00am in Baker Hall 232M:

Variations of the Additive Factors Models of educational data

Annie Dong, Human-Computer Interaction Institute, Carnegie Mellon, and University of California, Santa Barbara

The Additive Factors Model is a logistic regression model that attempts to find the best-fit curve for predicting students’ problem step performance. We use DataShop datasets spanning different domains, in conjunction with pre- and post-test data, to explore two variations of the Additive Factors Model. The first variation uses a log opportunity count in place of the opportunity count parameter in the Additive Factors Model. This effectively allows for bigger performance estimate increases at earlier opportunity counts compared to later opportunity counts. We demonstrate that taking the log of the opportunity count improves the model fit based on AIC/BIC but does not alleviate the correlation between skill intercept and skill slope estimates. The second variation uses linear, rather than logistic, regression to predict graded success measures rather than 0-1 first-attempt correctness for each problem step. We developed four distinct ways of quantifying graded success and explore how they improve the reliability of AFM’s parameter estimates as compared to pre/post data.
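For readers new to AFM, here is a minimal sketch of the prediction the abstract describes. It assumes the standard AFM form, in which the log-odds of a correct first attempt is the student's proficiency plus, for each skill the step requires, a skill intercept and a skill slope times the prior opportunity count. The function name, the argument names, and the use of log(1 + count) for the log variant are illustrative assumptions, not details taken from the talk.

```python
import math


def afm_probability(theta, skills, opportunities, beta, gamma, use_log=False):
    """Predicted probability of a correct first attempt under an AFM-style model.

    theta:         student proficiency, on the logit scale
    skills:        names of the skills the problem step requires
    opportunities: dict skill -> number of prior practice opportunities
    beta:          dict skill -> skill intercept
    gamma:         dict skill -> skill slope (learning rate)
    use_log:       if True, use log(1 + count) in place of the raw count
                   (assumed form of the log-opportunity variation)
    """
    logit = theta
    for skill in skills:
        count = opportunities[skill]
        # log1p keeps the first opportunity (count 0) well defined.
        x = math.log1p(count) if use_log else count
        logit += beta[skill] + gamma[skill] * x
    # Inverse logit turns the linear predictor into a probability.
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    beta = {"fractions": 0.0}
    gamma = {"fractions": 0.2}
    for count in (0, 1, 9, 10):
        p = afm_probability(0.0, ["fractions"], {"fractions": count},
                            beta, gamma, use_log=True)
        print(count, round(p, 4))
```

With a positive slope, the log variant gives a larger predicted gain between opportunities 0 and 1 than between opportunities 9 and 10, which is the behaviour the abstract describes.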

Aug 25: Matt Baird, RAND Corporation

Please join us Tues Aug 25 at 2:00pm in room 232M Baker Hall.


Dealing with Variation in Test Duration When Estimating Program Effects

Matthew Baird and John Pane


Educational treatment effects for student achievement may be a function not only of increased student ability, but of differing testing conditions. We use a data set that includes pretest and posttest outcomes as well as test durations (which are correlated with testing conditions) to investigate the impact of an educational intervention. We demonstrate that part of the large positive treatment effect is the result of a larger-than-normal change in test duration from the pretest to the posttest for the treated group. Part of the change in duration may be the result of increased student ability, but there is evidence that part of the anomalous duration change is the result of a change in testing conditions for the treatment group from the pretest to the posttest, which introduces bias in the treatment effect estimate. We explore strategies for controlling for this bias and estimating the treatment effect, and discuss the implications for testing condition elements that are not observable to the researcher.

CMART Closing Colloquia June 22-23

The CMART postdoctoral training program is ending its first five-year stint, with the departure of Adam Sales to UT-Austin later this summer.

On Monday June 22 and Tuesday June 23 we are having a set of final workshops/colloquia, bringing back all of the CMART and CMART-affiliated postdocs. This includes research colloquia from each of the current and returning postdocs (these are less formal 30-40 minute presentations with ample time for discussion and speculation):

Monday afternoon in Baker Hall 232M:

  • 1:30-2:30. Julia Kaufman (RAND Corporation): Anchoring Survey Measures of Teaching Practice
  • 2:30-3:30. Sarah Ryan (Education Development Center, Boston MA): Results from a Field Test of an Instrument Designed to Measure Youth Social Capital during Postsecondary Transitions
  • 3:30-4:30. Tracy Sweet (University of Maryland): Modeling networks as mediators

Tuesday afternoon in Baker Hall 232M:

  • 1:00-2:00. Jodi Casabianca (UT Austin): Extensions and Applications of the Hierarchical Rater Model
  • 2:00-3:00. Adam Sales (Carnegie Mellon University): Counting successes as a way to aggregate results across experiments

All are welcome to attend these colloquia.

The SERG colloquium series will continue. A call for presenters, and later a schedule for the summer and beyond, will be sent out in the coming days.

Contact if you have any questions or need additional information.


Ben Hansen: CMART Speaker Series June 9

Please join us for the final installment of the 2015 CMART Speaker Series

Ben Hansen will be speaking at 10:00 on Tuesday, June 9
Location: DeGroot Library, 229A 232M Baker Hall (The usual SERG space)

Ben is an Associate Professor of statistics at the University of Michigan (and was my PhD advisor). He specializes in causal inference, and in particular randomization-based inference, attributable effects, and propensity-score matching. He (with collaborators) developed the optmatch R package and the Gates Foundation Evaluation Engine.

Title: Propensity score calipers and the overlap condition

Abstract: Propensity scores (Rosenbaum and Rubin, 1983) are used widely to address measured confounding in quasiexperiments. They also arise in connection with the antecedent question of whether nonequivalent treatment and control groups are suitable for comparison at all, with or without covariate adjustments.

“Common support”, the assumption that propensity scores are bounded away from 1, is so named because for large samples it entails that the propensity support of the treatment group be contained within that of the control group. This entailment may appear to be simple to check, but it is not: it refers to true propensity scores, not estimates of them; and even if the true propensity score supports coincide, the supports on the estimated propensity score often will not. The few methodologists who have addressed the issue have tended to do so by changing the subject, specifying sample trimming rules suited to technical objectives other than the straightforward one of ensuring that like be compared to like (Crump et al., 2009; Rosenbaum, 2012).

I suggest an alternate approach based on caliper matching, while using a novel procedure to determine the value of the caliper. I’ll discuss two examples, one from education and the other from sociology and public health.
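As background for readers unfamiliar with calipers, the sketch below shows generic caliper matching on estimated propensity scores: a treated unit may only be paired with a control unit whose score lies within a fixed distance. This is a simple greedy illustration, not the procedure proposed in the talk and not the optimal matching done by optmatch. The function name, the data layout, and the caliper value of 0.05 are all assumptions made for the example.

```python
def caliper_match(treated, control, caliper):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    treated, control: dict unit id -> estimated propensity score
    caliper:          largest allowed absolute score difference in a pair
    Returns a list of (treated_id, control_id) pairs. Treated units with
    no remaining control inside the caliper are left unmatched.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    # Process treated units in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        best_id, best_dist = None, None
        for c_id, c_score in available.items():
            dist = abs(t_score - c_score)
            if dist <= caliper and (best_dist is None or dist < best_dist):
                best_id, best_dist = c_id, dist
        if best_id is not None:
            pairs.append((t_id, best_id))
            del available[best_id]  # match without replacement
    return pairs


if __name__ == "__main__":
    treated = {"t1": 0.30, "t2": 0.90}
    control = {"c1": 0.32, "c2": 0.50}
    print(caliper_match(treated, control, caliper=0.05))
```

In this toy data the treated unit with score 0.90 has no control within the caliper and is dropped, which is how a caliper enforces that like is compared to like.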

Upcoming Talk: Jared Murray (April 28)

Please join us Tuesday, April 28 at 10:00 a.m. in 232M Baker Hall for this talk given by Jared Murray of CMU Statistics.

Title: Flexible Regression Models for Partially Identified Causal Effects with Binary Instrumental Variables

Abstract: I outline a model-based approach to causal inference using instrumental variables, focusing on the case of a binary instrument, treatment and response. After reviewing model-based inference in instrumental variable designs I will focus on relaxing two classes of assumptions: parametric assumptions about the form of the regression functions, and structural assumptions that are invoked to point identify causal effects. Weaker structural assumptions are often more tenable, but no longer point identify causal effects of interest. Strictly speaking, this is not a problem when performing Bayesian inference for causal effects. However, it does mean that inferences are sensitive to modeling assumptions and prior distributions – even asymptotically, and even if the model for observables is correct. As a result, specifying appropriate prior distributions and conducting sensitivity analysis is paramount.

With this in mind I describe a class of parameterizations of prior distributions for partially identified regression models with several desirable properties: They allow for flexible nonparametric priors for point identified regression functions, selectively informative conditional priors for partially identified parameters, and computationally efficient sensitivity analysis. The methods are illustrated on a well-known dataset collected during a randomized encouragement study.

Michael Hudgens: CMART Speaker Series April 27

Please join us for the second of the 2015 CMART Speaker Series

Michael Hudgens will be speaking at 4:00pm this Monday April 27.
125 Scaife Hall

Michael is an Associate Professor of biostatistics at the University of North Carolina Chapel Hill, and the director of the Biostatistics Core at the UNC Center for AIDS Research.
His methodological work is in causal inference, and he has written about principal stratification, randomization inference, and interference between units, along with a wide variety of biostatistical applications.

Title: Causal Inference in the Presence of Interference

Abstract: A fundamental assumption usually made in causal inference is that of no interference between individuals (or units), i.e., the potential outcomes of one individual are assumed to be unaffected by the treatment assignment of other individuals. However, in many settings, this assumption obviously does not hold. For example, in infectious diseases, whether one person becomes infected depends on who else in the population is vaccinated. In this talk we will discuss recent approaches to assessing treatment effects in the presence of interference. Inference about different direct and indirect (or spillover) effects will be considered in a population where individuals form groups such that interference is possible between individuals within the same group but not between individuals in different groups. An analysis of an individually-randomized, placebo controlled trial of cholera vaccination in 122,000 individuals in Matlab, Bangladesh will be presented which indicates a significant indirect effect of vaccination.

Upcoming Talk: Ran Liu

Please join us Tuesday, April 14 at 10:00 a.m. in 232M Baker Hall for this talk given by Ran Liu of CMU Psychology and Human-Computer Interaction.

Title: Variations in learning rate within cognitive tutor use

Abstract: A growing body of research suggests that accounting for student-specific variability in educational data can enhance modeling accuracy and may have important implications for individualizing instruction. The traditional Additive Factors Model (AFM), a logistic regression-based model commonly used to fit educational data and discover/refine skill models of learning, contains a parameter that individualizes for overall student ability but not for student learning rate. We find that adding a per-student learning rate parameter to AFM overall does not improve predictive accuracy, nor does it relate to pretest-posttest gains. However, two alternative methods of differentiating learning rates at the student level yielded more interesting and externally valid results. In the first method, we created three classes of students based on each student’s residual patterns (across practice opportunities) when fitted with the standard AFM model. Adding a per-class learning rate to the traditional AFM model substantially improved its predictive accuracy, and class membership was systematically related to pretest-posttest gains. In the second method, we eliminate skill-specific learning rate parameters from the model and individualize learning rates only at the student level. Preliminary evidence suggests that, although these parameter estimates are significant predictors of post-test outcomes, usage (number of practice opportunities) is a relatively better predictor than are differences in these learning rate estimates.

Upcoming Talk: Dan McCaffrey, March 31

Please join us Tuesday, March 31 at 10:00 a.m. in 232M Baker Hall for this talk given by Dan McCaffrey of ETS.

Title: “The Impact of Measurement Error on the Accuracy of Individual and Aggregate SGP”

Abstract: Student growth percentiles (SGPs) express students’ current observed scores as percentile ranks in the distribution of scores among students with the same prior-year scores. A common concern about SGPs at the student level, and mean or median SGPs (MGPs) at the aggregate level, is potential bias due to test measurement error (ME). Shang, VanIwaarden, and Betebenner (SVB; this issue) develop a simulation-extrapolation (SIMEX) approach to adjust SGPs for test ME. In this paper, we use a tractable example in which different SGP estimators, including SVB’s SIMEX estimator, can be computed analytically to explain why ME is detrimental to both student-level and aggregate-level SGP estimation. A comparison of the alternative SGP estimators to the standard approach demonstrates the common bias-variance tradeoff problem: estimators that decrease the bias relative to the standard SGP estimator increase variance, and vice versa. Even the most accurate estimator for individual student SGP has large errors of roughly 19 percentile points on average for realistic settings. Those estimators that reduce bias may suffice at the aggregate level, but no single estimator is optimal for meeting the dual goals of student- and aggregate-level inferences.
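To make the SGP definition in the first sentence concrete, here is a toy sketch that ranks each student's current score among students with exactly the same prior-year score. It is only an illustration of the definition, not the estimators compared in the talk. The function name, the record layout, and the mid-rank convention for ties are assumptions.

```python
from collections import defaultdict


def empirical_sgp(records):
    """Toy student growth percentiles from observed scores.

    records: list of (student_id, prior_score, current_score)
    Returns dict student_id -> percentile rank (0 to 100) of the current
    score among students who share the same prior score. Ties use the
    mid-rank convention: percent below plus half the percent tied.
    """
    # Collect current scores for each distinct prior score.
    groups = defaultdict(list)
    for _, prior, current in records:
        groups[prior].append(current)

    sgp = {}
    for student_id, prior, current in records:
        peers = groups[prior]  # includes the student's own score
        below = sum(1 for s in peers if s < current)
        tied = sum(1 for s in peers if s == current)
        sgp[student_id] = 100.0 * (below + 0.5 * tied) / len(peers)
    return sgp


if __name__ == "__main__":
    records = [
        ("a", 50, 60),
        ("b", 50, 70),
        ("c", 50, 80),
        ("d", 40, 90),
    ]
    for student_id, value in empirical_sgp(records).items():
        print(student_id, round(value, 1))
```

Because both the prior and the current scores here are observed scores, measurement error in either one carries straight into the percentile rank, which is the concern the abstract raises.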

Upcoming Talk: Adam Sales (March 24th)

Please join us Tuesday, March 24 at 10:00 a.m. in 232M Baker Hall for this talk given by Adam Sales, PhD, from CMU Stats.

Title: Exploring Causal Mechanisms in a Randomized Effectiveness Trial of the Cognitive Tutor

Abstract: Cognitive Tutor Algebra I (CTAI), published by Carnegie Learning, Inc., is an Algebra I curriculum, including both textbook components and an automated computer application that is designed to deliver individualized instruction to students. A recent randomized controlled effectiveness trial found that CTAI increased students’ test scores by about 0.2 standard deviations. However, the study raised a number of questions, in the form of evidence for treatment-effect heterogeneity. More basically, it is unknown which aspects of the CTAI program drove the observed effect. The experiment generated student log data from the computer application. This study attempts to use that data to shed light on CTAI’s causal mechanisms, via principal stratification. Principal strata are categories of both treatment and control students according to their CTAI usage; they allow researchers to estimate differences in treatment effect between usage subgroups. Importantly, randomization satisfies the principal stratification identification assumptions. We present the results of our first analyses here, following prior observational results. We find that students who encounter more than the median number of sections experience 0.45 (0.2–0.6) standard deviations higher effect than their peers who encounter fewer, and students who need more assistance experience 0.36 (0.25–0.48) standard deviations lower effect than their peers who require less.