Comments on Psych Your Mind: Have Your Cake and Eat It Too! Practical Reform in Social Psychology

Great post! I am trying to follow some of your adv...

2013-03-21T14:13:01.249-07:00

Great post! I am trying to follow some of your advices by riding by bike to work every day and eating healthy! Thanks!

Thanks for the comment David! I guess I never thou...

2013-02-28T07:41:27.896-08:00

Thanks for the comment David! I guess I never thought of the r-squared conversion as being misleading, but I can see your point in that a 4% r-squared makes people think the effect is small when it's actually a medium effect. r = .21 does a much better job at making the point of the effect size, in that we deal in correlations much more often.

This is an excellent and important comment. I cann...

2013-02-27T19:23:50.666-08:00

This is an excellent and important comment.
I cannot resist one quibble. Squaring an effect size r of .21 to conclude that it explains 4% of the variance is technically correct, but misleading if the intent is to help to interpret its size. The calculation merely changes the terms of reference into squared units, exactly like squaring a standard deviation to get the variance. (Alternatively, we might take the square root of the variance to get the sd, in order to return to the original units of measurement.) An r of .21 means that 21% of the (unsquared) variation in the DV is accounted for by the IV. As long as you know this, it makes no difference whether you use r or r2 because the conversion neither adds nor subtracts information – both numbers mean exactly the same thing. But it’s misleading when the conversion to r2 leads one to interpret effect sizes as “small” just because 4% doesn’t sound like much.
For more on r vs. r2 see:
http://mres.gmu.edu/readings/Julius/Ozer_correlation_and_coefficient_of_determination.pdf
http://dionysus.psych.wisc.edu/lit/ToFile/4curtin/dandrade_EffectSize_stats_jqa.pdf

David Funder

RCF, thanks for reading!

2013-02-24T15:30:04.476-08:00

RCF, thanks for reading!

Excellent post, Mike! I think you did a nice job a...

2013-02-23T16:29:17.732-08:00

Excellent post, Mike! I think you did a nice job at summarizing a lot of the salient issues. And I think your suggestions about what to do (and what not to do) are both practical and reasonable.

Thanks for the comment! I've seen this analysi...

2013-02-22T13:11:45.447-08:00

Thanks for the comment! I've seen this analysis and it largely corroborates my own experience with these journals (reading them, not publishing in them: that takes a near miracle in social psychology).

The good news for social psychology is that we rarely publish in these journals, and so, for our journals where we are the gatekeepers, we can put a stop to this sort of lack of consideration of sample/effect size.

A little optimism from B-Rob!!! It's hard to...

2013-02-22T13:07:44.018-08:00

A little optimism from B-Rob!!!

It's hard to know how much of an overestimate the r = .21 is. I've started thinking about it as a useful decision point for designing my own studies these days. Can I detect an r = .21 effect with this design? Do I expect the effect to be smaller, and if so, how much? The Simmons talk from SPSP (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2205186) has some nice information about how to make sample size decisions as well.

Thanks for the comment!

Good post. Unfortunately, even the top scientific ...

2013-02-22T11:32:34.169-08:00

Good post. Unfortunately, even the top scientific journals rarely make authors publish effect sizes, or power calculations: See here

Dave writes "The flip side of this, however, ...

2013-02-22T11:24:38.712-08:00

Dave writes "The flip side of this, however, is that people who are not p-hacking are failing to identify real effects that they're looking for because they don't have enough power." Before I was introduced the p-hacking phenomena and the raft of replication issues, I always thought the tragedy of experimental psychology was the fact that it was failing to detect so many effects (Type II errors). If you want a positive spin on using larger samples, look no further than the fact that you can detect all of those small effects with more regularity.

That said, the r = .21 that Richard et al (2003) report has to be an overestimate. Our research as always been underpowered and therefore has only been able to detect medium effect sizes under the old NHST regime.

Thanks for the comment Dave and I agree on all cou...

2013-02-22T07:42:37.849-08:00

Thanks for the comment Dave and I agree on all counts! The insidious thing about p-hacking is exactly as you describe--many people don't realize they're doing it, and it's easy to rationalize analysis choices (especially for smart people, as I like to think researchers tend to be) that actually bias hypothesis testing.

I think you've hit on something under-apprecia...

2013-02-22T07:28:17.028-08:00

I think you've hit on something under-appreciated with sample sizes. As a field, it seems we don't have enough fluency with effect sizes and as a result we don't recognize how large our samples need to be in order to have enough power to give us a fighting chance to find a true result.

One consequence is that it leads to p-hacking, which I think happens with a lot less pre-meditation and deviousness than one might think. There are certainly instances of people playing with data until it yields significance, but I'm certain that it often happens as a function of simple confirmation bias and self-serving motivations leading people to find what they "know" is true through means they don't recognize are undermining the validity of their statistics.

The flip side of this, however, is that people who are not p-hacking are failing to identify real effects that they're looking for because they don't have enough power. So it's not just leading people to do more bad things, it's hurting the good guys too. Also, it's worth noting that p-hacking can get you significant results whether your effect is real or not, but a larger sample size (done right) will do better at distinguishing real effects from spurious ones.