P-Values and Initial Prices

The use of p-values and hypothesis testing is ubiquitous across science. Everyone remembers the rule they learnt in their first statistics course: 0.05 is the magic cut-off for statistical significance. Results with p-values below 0.05 are accepted; results with p-values above this threshold are discarded.

However, this statistical practice has come under increasing criticism. In 2016 the American Statistical Association (Wasserstein & Lazar, 2016) released a statement with six principles on the interpretation of the p-value, with a view to encouraging data analysis to go beyond the p-value. Benjamin et al. (2018) proposed changing the default cut-off for statistical significance to 0.005.

This raises the question – what is the relationship between p-values and replication success? Using the data from past systematic replication studies we can explore this relationship.

We looked at the replication outcomes and associated original p-values of 104 studies. These 104 studies were replicated as part of the Replication Project Psychology (RPP), Many Labs 2 Project (ML2), Experimental Economics Replication Project (EERP), or the Social Science Replication Project (SSRP). 

Of these 104 studies’ hypothesis tests, 5 (4.8%) had p-values above 0.05 (significant at the 10% level), 40 (38.5%) had p-values between 0.01 and 0.05 (significant at the 5% level), 25 (24.0%) had p-values between 0.001 and 0.01, and 34 (32.7%) had p-values of 0.001 or below.

Analysing the replication rates in each of these categories, we can see that the p-value of the original study alone is a good predictor of replication success.

  • For p-values of 0.001 or below, the replication success rate is 82%
  • For p-values between 0.001 and 0.01, the replication success rate is 44%
  • For p-values above 0.01, the replication success rate is 27%

This information gives us a great starting point for forecasting replication rates. When a prediction market is set up, an initial price is set. In previous replication markets (RPP, EERP, SSRP and ML2), initial prices were set at 0.5. This initial pricing was not far off the overall replication rate (around 49%), and was a simple, straightforward starting price. However, the rates above show that the p-value of the original study allows a much more informed initial estimate of a study’s probability of replicating.
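As a rough consistency check, the overall rate of around 49% can be approximately recovered from the three category rates. The sketch below assumes that the "above 0.01" category pools the 40 studies with p-values between 0.01 and 0.05 and the 5 studies above 0.05. Because it uses the rounded percentages quoted above rather than the underlying replication counts, the result is only approximate.

```python
# Study counts and replication rates per p-value category, as quoted in the text.
# Assumption: the "p > 0.01" category pools the 40 studies in (0.01, 0.05]
# with the 5 studies above 0.05.
categories = {
    "p <= 0.001": (34, 0.82),
    "0.001 < p <= 0.01": (25, 0.44),
    "p > 0.01": (40 + 5, 0.27),
}

total_studies = sum(count for count, _ in categories.values())

# Weighted average of the category rates, weighted by number of studies.
overall_rate = sum(count * rate for count, rate in categories.values()) / total_studies

print(f"{total_studies} studies, implied overall replication rate: {overall_rate:.1%}")
```

Under these assumptions the weighted average comes out close to the 49% overall rate mentioned above.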

For Replication Markets, we will use the information gained from previous studies to set our ‘best initial guess’ for the initial prices of the market, based on our p-value analysis. This means that when a prediction market first opens, its price will be based on the p-value of the original study.

To set the initial prices, we will assign each study to one of three categories based on its p-value.

  • For studies with p-values of 0.001 or below the initial price will be 0.8
  • For studies with p-values between 0.001 and 0.01 the initial price will be 0.4
  • For studies with p-values higher than 0.01 the initial price will be 0.3

These initial prices are simply the past replication success rates for each category, rounded to one decimal place.
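The pricing rule can be sketched in a few lines of Python. This is only an illustration of the three categories described above, not the code used by the markets themselves; the names `PAST_RATES`, `PRICES` and `initial_price` are hypothetical.

```python
# Past replication success rates per p-value category, as quoted in the text.
PAST_RATES = {"low": 0.82, "mid": 0.44, "high": 0.27}

# Initial prices are the past rates rounded to one decimal place.
PRICES = {name: round(rate, 1) for name, rate in PAST_RATES.items()}


def initial_price(p_value: float) -> float:
    """Return the initial market price for a study's original p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value must lie in [0, 1], got {p_value}")
    if p_value <= 0.001:
        return PRICES["low"]   # 0.001 or below
    if p_value <= 0.01:
        return PRICES["mid"]   # between 0.001 and 0.01
    return PRICES["high"]      # higher than 0.01


for p in (0.0004, 0.005, 0.03):
    print(p, initial_price(p))
```

The boundary handling follows the wording of the list: a p-value of exactly 0.001 falls in the first category, and a p-value of exactly 0.01 is treated here as belonging to the second.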


  • Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E.-J., Berk, R., … Johnson, V. E. (2018). Redefine statistical significance. Nature Human Behaviour, 2(1), 6. https://doi.org/10.1038/s41562-017-0189-z
