The recent Canning et al. publication on growth mindset and racial achievement gaps is such a paper.
The authors examine the relationship between instructor beliefs about the malleability of intelligence and the achievement gap between underrepresented minority (URM) and non-URM students. After controlling for many factors, they find that this gap is smaller when the instructor believes intelligence to be changeable, i.e., has a growth mindset. This is an intuitively reasonable result. Unfortunately, the paper gives us no reason beyond intuitive reasonableness to believe it.
Here are the main problems with the paper.
- Is there a there there? According to Canning et al., instructors who scored one standard deviation above the mean on a measure of growth mindset had a URM-nonURM achievement gap of 0.1 grade points. Those who scored one SD below the mean had an achievement gap of 0.19 grade points. So increasing instructor growth mindset by two standard deviations decreases the achievement gap by all of 0.09 grade points. The paper summarizes this result by saying, "the racial achievement gap was nearly twice as large in courses taught by college professors who endorsed fixed (versus growth) mindset beliefs about students’ ability." That sounds more impressive than a change of 0.09 grade points.
For anyone teaching introductory statistics, this is an excellent illustration of the difference between statistical and practical significance, as well as relative and absolute change.
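The arithmetic behind the framing gap is worth making explicit. Here is a minimal sketch, using the two gap sizes reported by Canning et al., of how the same result looks under relative versus absolute framing:

```python
# Gap sizes as reported by Canning et al. (grade points)
gap_fixed = 0.19   # instructors 1 SD below the mean on growth mindset
gap_growth = 0.10  # instructors 1 SD above the mean on growth mindset

absolute_change = gap_fixed - gap_growth  # the change in grade points
relative_change = gap_fixed / gap_growth  # the "nearly twice as large" ratio

print(f"Absolute change: {absolute_change:.2f} grade points")
print(f"Relative change: {relative_change:.1f}x")
```

The relative framing ("nearly twice as large") and the absolute framing (0.09 grade points) describe the identical result; only the latter lets a reader judge practical significance.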
- No data is shown. At first glance, Fig. 1 in Canning et al. looks like a typical dynamite plunger plot showing the mean and some sort of error bar. This is a poor way to present data because it completely hides the distribution and can make totally different datasets look the same. A histogram or at least a box plot would be much more informative.
But it gets worse. A closer look at the caption of Fig. 1 reveals that it doesn't show any data at all. Rather, it displays predicted values from a complex statistical model incorporating numerous student-, instructor- and course-level variables. In fact, no figure in the paper displays any actual data. Only the outputs of statistical models are shown. Even the results discussed previously are modeled, not actual results.
Now, statistically adjusting for potential confounders is often an appropriate and useful thing to do. I am not against the practice. However, a publication should start with the data itself and then discuss any necessary adjustments. Otherwise, readers are essentially asked to take authors' analyses on faith.
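To see how a mean-plus-error-bar display can make very different datasets look identical, consider this sketch with synthetic "grade" data (the numbers are invented for illustration, not taken from the paper): a unimodal and a strongly bimodal sample are constructed to have nearly the same mean and standard deviation, so a dynamite plot would render them indistinguishably.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two samples with nearly identical mean and SD
# but completely different shapes.
unimodal = rng.normal(loc=3.0, scale=0.5, size=1000)
bimodal = np.concatenate([
    rng.normal(loc=2.5, scale=0.1, size=500),  # one cluster of low grades
    rng.normal(loc=3.5, scale=0.1, size=500),  # one cluster of high grades
])

for name, sample in [("unimodal", unimodal), ("bimodal", bimodal)]:
    print(f"{name}: mean = {sample.mean():.2f}, sd = {sample.std():.2f}")
```

Both samples print a mean near 3.0 and a standard deviation near 0.5; a histogram of each would reveal the difference immediately, which is exactly why summary-only figures are a poor substitute for showing the data.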
- More predictors are not better. Canning et al. predicted student grades using a statistical model that included the following variables:
- faculty mindset (the actual variable being studied)
- student gender
- student race/ethnicity
- student first-generation status
- student SAT scores
- course enrollment
- course level
- faculty gender
- faculty race/ethnicity
- faculty age
- faculty years of teaching experience
- faculty tenure status
All this matters because the use of correlated predictors in a regression analysis, termed multicollinearity, can result in parameter estimates that are very sensitive to small changes in the data or in the choice of predictors. Essentially, the individual regression coefficients become uninterpretable. And since mindset is merely one of the twelve predictor variables in the model, its regression coefficient is just as affected as the others. Whether the predicted grade difference attributed to growth mindset is affected is less clear: predicted grades obtained by simply plugging values into the full model are largely robust to multicollinearity, but any result derived from individual model coefficients is not.
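The distinction between unstable coefficients and stable predictions can be demonstrated directly. The following sketch uses entirely synthetic data (the predictors and seed are invented for illustration): two nearly collinear predictors are fit by ordinary least squares on the full sample and on half of it, and while the individual coefficients can swing between fits, their sum, and hence the fitted values, stay stable.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Two nearly collinear predictors (stand-ins for any pair of
# highly correlated covariates in a grade model).
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)  # almost a copy of x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + x1 + x2 + rng.normal(scale=0.5, size=n)

# Fit once on all the data, once on only the first half.
coef_full, *_ = np.linalg.lstsq(X, y, rcond=None)
coef_half, *_ = np.linalg.lstsq(X[: n // 2], y[: n // 2], rcond=None)

print("coefficients (full data):", np.round(coef_full, 2))
print("coefficients (half data):", np.round(coef_half, 2))
print("sum of x1, x2 coefficients:",
      round(coef_full[1] + coef_full[2], 2),
      round(coef_half[1] + coef_half[2], 2))
```

The individual coefficients on `x1` and `x2` are poorly determined and can differ substantially between the two fits, yet their sum stays near the true value of 2 and the fitted values `X @ coef` barely change. This is the sense in which model predictions survive multicollinearity while coefficient-level interpretations do not.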
None of the problems outlined here are unique to this study. There is a desperate need for better statistical analysis and data presentation in the education literature. We need to focus not on ever more sophisticated statistical techniques but on a solid understanding and use of the basics. Show the data. Use absolute change. Don't use techniques without understanding the assumptions behind them. These measures alone could prevent many promising but unreliable publications.