I recently finished reading The Bell Curve by Herrnstein and Murray, and I also read the relevant chapters in the 1996 expanded edition of Stephen J. Gould's The Mismeasure of Man, which claims on its cover to be "The definitive refutation of the argument of The Bell Curve." It contains an added chapter directly rebutting The Bell Curve, which is the same as an essay published in the New Yorker ("Curveball") and in at least one other anthology of responses to The Bell Curve.
Gould writes the following passage on pages 375-376 (it requires some mathematical knowledge to understand):
My charge of disingenuousness receives its strongest affirmation in a sentence tucked away on the first page of Appendix 4, page 593, where the authors state: "In the text, we do not refer to the usual measure of goodness of fit for multiple regressions, R^2, but they are presented here for the cross-sectional analysis." Now why would they exclude from the text, and relegate to an appendix that very few people will read or even consult, a number that, by their own admission, is "the usual measure of goodness of fit." I can only conclude that they did not choose to admit in the main text the extreme weakness of their vaunted relationships.
Herrnstein and Murray's correlation coefficients are generally low enough by themselves to inspire lack of confidence. (Correlation coefficients measure the strength of linear relationships between variables; positive values run from 0.0 for no relationship to 1.0 for perfect linear relationship.) Although low figures are not atypical in the social sciences for large surveys involving many variables, most of Herrnstein and Murray's correlations are very weak--often in the 0.2 to 0.4 range. Now, 0.4 may sound respectably strong, but--and now we come to the key point--R^2 is the square of the correlation coefficient, and the square of a number between 0 and 1 is less than the number itself, so a 0.4 correlation yields an r-squared of only 0.16. In Appendix 4, then, we discover that the vast majority of measures for R^2 , excluded from the main body of the text, have values less than 0.1. These very low values of R^2 expose the true weakness, in any meaningful vernacular sense, of nearly all the relationships that form the heart of The Bell Curve.
Even with the required mathematical knowledge, I expect that this passage is still confusing. If perchance it is NOT confusing to you, then please read it again and critically analyze this argument:
"Now, 0.4 may sound respectably strong, but--and now we come to the key point--R^2 is the square of the correlation coefficient, and the square of a number between 0 and 1 is less than the number itself, so a 0.4 correlation yields an r-squared of only 0.16 [...] In Appendix 4, then, we discover that the vast majority of measures for R^2, excluded from the main body of the text, have values less than 0.1. These very low values of R^2 expose the true weakness, in any meaningful vernacular sense, of nearly all the relationships that form the heart of The Bell Curve."
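The arithmetic premise in that passage, that R^2 is the square of the correlation coefficient, can be checked numerically, at least for the simplest case of a single predictor. Here is a minimal Python sketch on made-up data. The slope of 0.4, the sample size, and the helper names pearson_r and r_squared_of_fit are my own assumptions for illustration and come from neither book.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def r_squared_of_fit(xs, ys):
    """R^2 (1 - residual SS / total SS) of a least-squares line y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data only: a weak linear signal buried in noise (hypothetical values).
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [0.4 * x + rng.gauss(0, 1) for x in xs]

r = pearson_r(xs, ys)
r2 = r_squared_of_fit(xs, ys)
print(f"r = {r:.4f}, r squared = {r * r:.4f}, R^2 of the fitted line = {r2:.4f}")
```

The last two printed numbers should agree up to floating-point rounding, which is all that this premise of Gould's claims.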
The premises are correct, but the critical problem with Gould's argument, as I see it, is that he writes as though he has his math exactly backward, an elementary blunder (much like dropping a negative sign). To interpret values of R^2 in practical terms, you need to take the square root of them (not the square of them), which makes them larger (not smaller). Appendix 4 of The Bell Curve does list low values of r-squared (R^2), but low values of R^2 do not directly reflect an extremely weak correlation. The main text of The Bell Curve properly communicates the strength of correlations in terms of the "correlation coefficient," or R. If you have a correlation coefficient (R) between obesity and IQ equal to 0.31, for example, then you can put that in English as "obesity is correlated with IQ by a 31% goodness of fit," which is a relatively strong correlation given that many forces influence obesity, not just intelligence, as in almost all sociological relationships (a point that Gould grants). If you were to express that value instead as R^2, however, you would have only R^2 = (0.31)^2 ≈ 0.096, which is less than 0.1 and which Gould would seem to dismiss as much too weak to indicate any relationship, as though R^2 (and not R) were the relevant value for drawing such a conclusion.
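To make the two directions of that conversion concrete, here is a small Python sketch of the arithmetic. The function names are hypothetical and mine; the sample values are the ones that appear in Gould's passage and in my obesity example.

```python
import math


def r_to_r_squared(r):
    """Square a correlation coefficient to get the corresponding R^2."""
    return r * r


def r_squared_to_r(r2):
    """Take the square root of an R^2 to recover the magnitude of R."""
    return math.sqrt(r2)


# Going from R to R^2 shrinks any value strictly between 0 and 1.
for r in (0.2, 0.31, 0.4):
    print(f"R = {r:.2f}  ->  R^2 = {r_to_r_squared(r):.3f}")

# Going from R^2 back to R enlarges it again.
for r2 in (0.05, 0.1, 0.16):
    print(f"R^2 = {r2:.2f}  ->  R = {r_squared_to_r(r2):.3f}")
```

Read in the second direction, an R^2 just under 0.1 corresponds to an R of a little under 0.32, which is the same 0.2 to 0.4 range that Gould quotes for the correlations in the main text.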
It would make no sense for Herrnstein and Murray to list the R^2 values instead of the R values for rhetorical advantage, because the R^2 values are necessarily smaller than the R values. They seemingly listed them to supply more relevant technical information for those who want to repeat the statistical modeling. They also listed the chi-squared values, which no lay reader would seriously care about.
Gould was a renowned evolutionary biologist and was eminently authoritative in regression analysis of biological relationships. Therefore, it is far more likely that I made a silly mathematical or rhetorical mistake than that Gould did. So, I would love it if anyone can spot my mistake. The only alternative worth considering is that Gould consciously misled his readers, which is the most uncomfortable explanation.