What is replication?

[Updated Jan. 2020 to cover unified definition, and clarify]

Concept

Replication is when you repeat a previous study to see if you get the same results. Ideally this happens a lot in science. In practice, not as much. But what counts? 

According to Brian Nosek (“What is replication?”, 2019), people commonly say replication is repeating a study’s procedure and observing whether the previous finding recurs. But this definition fails, because no replication repeats the procedure exactly, and the changes to procedure are what define the replication.

Consider: when replicating an Israeli study in the US, he didn’t use the original materials – they were in Hebrew! Replications change participants, campus, country, language, etc. But if any of these negated something like the Stroop effect, that would be a surprising and important discovery. 

So, Nosek argues replication is really a conceptual notion.

Replication is attempting to reproduce a previously observed finding with no reason to expect a different outcome.

Brian Nosek (2019)

(He says “no a priori reason”: reasons made up afterwards don’t count. We also assume competent, good-faith replication.)

Claiming something is a replication “is a theoretical commitment.”

Practical Definition for DARPA SCORE

Originally, we distinguished between direct replications and data-analytic replications, and privileged direct replications. Direct replications test the original claim by gathering new data, such as by running a new psychology experiment.

Data-analytic replications test the original claim using newly found data: existing data that is appropriate for the replication but was not collected by the replication research team. For example, using the same economic indicator as the original-study researchers, but from a different time period.

But direct replications exclude most economics, sociology, and political science. So, as of February 2020 (Round 6), we will adopt DARPA’s new unified definition:

A replication is testing the same claim using data that was not used in the original study.

DARPA SCORE (2020)

Unifying increases the number of evaluated claims in SCORE from 100 to 250. (However, R1-R5 markets will only pay prizes for direct replications, because that’s what we said then.)

Sample Size & Statistical Power

A decent replication has to have a sample large enough that failure to detect a result is almost certainly due to the claim being wrong, rather than not looking hard enough. If we redo the study but with only 3 participants, the result is almost certainly noise. 

We cannot just use the original sample size: most published studies are actually too small. Therefore we need to ensure sufficient power.
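To make "sufficient power" concrete, here is a minimal sketch using the standard normal approximation for a two-sided comparison of two group means. It is only an illustration, not SCORE's actual procedure: the function names, the effect size of 0.5 (Cohen's d), the 0.05 significance level, and the 0.80 power target are all assumptions chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size is a standardized mean difference (Cohen's d).
    alpha and power are illustrative defaults, not SCORE's thresholds.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate chance of detecting a true effect of the given size."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return _STD_NORMAL.cdf(noncentrality - z_alpha)


if __name__ == "__main__":
    d = 0.5  # hypothetical effect size, for illustration only
    print("n per group for 80% power:", required_n_per_group(d))
    print("power with 3 per group:", round(approx_power(d, 3), 3))
```

Under these assumptions, a handful of participants per group gives very little chance of detecting a real medium-sized effect, which is why a null result from a tiny replication says almost nothing about the claim. An exact calculation based on the t distribution would ask for slightly more participants than this approximation does.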
