In last week's journal club we read a recent paper in Psychological Science with a very clear message: It should be the norm for researchers to post their data upon publication. In the article, the author (Uri Simonsohn) lays out the major reason why he thinks posting data is a good idea: It helps our field catch scientific fraud in action (e.g., fabricated data). In the paper and on his new blog over at datacolada.org (I'll have mine blended!), Simonsohn details some methods he has used in the past to catch fraud.
I agree that posting data will make it harder for people to fabricate data. However, my favorite reason to increase norms for posting data has nothing to do with data fabrication.
I admit I was naive when I started out my research career. It was 2002 and I was a lost undergraduate at UC Berkeley. I wandered into Professor Serena Chen's (she was Professor Chen to me at the time) office and told her that "I was maybe thinking about trying my hand at an honors thesis." She looked at me, the way that a faculty member typically looks at a lost undergraduate, and gave me a stack of papers to read. A week later, we were working together on a project that ended up being Study 2 in this paper.
At the time I wanted to be involved in research because it seemed so cool--to actually add a (small) piece of public knowledge to what we already know about social psychology. That professors actually do this on a daily basis (and for money!) made it seem like "researcher" was a super cool job. If I think back, this is really what fired me up about conducting research!
Flash forward to 2010 when I was applying for my first tenure-track position. Now I number the publications in my vita (count my awesomeness!), I get "royally cheesed off"* by another researcher's failure to cite my work, and I tout my ability to publish in "top-tier journals" in my application cover letter. Long gone are thoughts of contributing to public knowledge--how very sad.
The point of conducting research is not to die with the most publications but to create public knowledge that will better society and our understanding of the human experience. That to me is the real reason we should post data (without restrictions or authorship obligations)--because data is a source of knowledge, and that knowledge is for the public good. Too often we get caught up in ownership of the ideas we write about in our papers or the data we collected using our own efforts. Many of the graduate students at STAG expressed ownership objections, and I appreciate these concerns (particularly coming from graduate students looking to land a job!). We forget, though, that these ideas and our efforts are not only ours, but society's as well.
Researchers should post their data in most cases (not all cases for various reasons brought up in the paper). In this spirit, my laboratory is starting a data-posting initiative. We hope to have data available for half of our lab publications by the end of Spring 2014. I hope you will join me in contributing to public knowledge!
* Yes, I did just use that in a sentence!
Simonsohn, U. (2013). Just Post It: The Lesson From Two Cases of Fabricated Data Detected by Statistics Alone. Psychological Science. DOI: 10.2139/ssrn.2114571
I think that social psychologists ought to blind themselves on their data. The data massaging and inclination toward storytelling to fit a pet hypothesis is otherwise too tempting. That is one big reason I find it hard to give any weight to social psychology experiments. That is besides my grave misgivings due to the fact that they're measuring quantities with entirely made-up scales that do not have any clear meaning, like .4 on a study of implicit self esteem. errorstatistics.com
Are you lost? Your comment has absolutely 0% relevance to this post. Maybe the internet is a new thing for you? Best of luck!
If you are referring to my comment, then the relevance should be pretty obvious. The post remarks that "posting data will make it harder for people to fabricate data", and there is a reference to Simonsohn (with the topic of his interesting new blog). I realize my remark was blunt (late night posting) but it is pretty clearly relevant to this issue. After several years of studying statistical research, and a particular group of studies just this week, I've good reason to think that, while posting should be endorsed and encouraged, it does not address some of the deeper problems. It would have maximum effect if posted prior to the data analysis. Without blinding and a great deal more self-criticism, some of the "new psychology reforms" appear to be papering over the deeper issues.
I realize you had another "favorite reason" for being pro-posting; that scarcely makes my remark irrelevant. Sorry if your dismissal wasn't directed at me. errorstatistics.com
This blog post is about posting data online. Your comment contains an observation that data fraud exists, an explanation for why you hate social psychology, and finally, a shameless plug of your own blog (which I see you've done AGAIN in your second comment).
So yes, it is barely and tangentially related to my post...
I claimed neither that data fraud exists (although it obviously does) nor that I hate social psychology. Yours is a rather unbalanced and strangely immature response to a sensible recommendation for accomplishing the goal of posting. I once found a very interesting post on this group blog; that is why I accidentally found myself back here. But that one was an excellent and honest attempt (by a woman) to come to grips with the problems of verification bias, data-dependent selections, cherry-picking, p-value hacking, and so on in psychology. I haven't gone back to check who the author was; it's too bad, though, when uneven scholarship hurts a group blog.
I imagine that my style of responding to comments actually drives readership up and not down. Thanks for your concern though!
It is true that posting of data must be imperative for researchers not just to evaluate fabricated data or for transparency but also because it could benefit the majority. Posting of research data or information can help a lot of researchers and students gain great research and thesis ideas that will still go back with the progress of society.
Nice post Michael. I like the idea of finding ways to disseminate social psychology data to the public--after all, we are purportedly studying the social behavior of the public, and that very same public is in large part funding our research!
I wonder, though, if posting data online is the best way to disseminate knowledge. If data is posted online, a member of the public would have to possess the skills to analyze and interpret the data in order to gain any insight into human behavior. Seems like, if our goal is to convey knowledge to the public, we would need to post summaries of studies in easily accessible online forums, kind of like a researcher-written version of an NPR or NYT science blog.
p.s.: of course, this would take a lot of work, and I'm being idealistic!
Thanks for your comment Aaron! Let me direct you to the people's science, a website for just that sort of outreach: http://thepeoplesscience.org/
But in terms of data posting, I think it serves the public by giving other researchers access to our data--so that they can contribute to knowledge in new ways. It is also possible that some undergrad statistics courses could use posted data to practice reproducing analyses.
Good points, and thanks for the website link!
Posting data online can help a lot of young minds around the world to learn from the experienced lot. It also helps in clearing out a lot of questions that anyone could have corresponding to the same topic as listed by researchers. Psychology is a vast field with each having a different approach to it, an amalgamation of these different ideas can maybe bring about a new and interesting idea.