The small impact of p-hacking marginally significant results on the meta-analytic estimation of effect size


  • Juan Botella Ausina, Universidad Autónoma de Madrid
  • Manuel Suero, Universidad Autónoma de Madrid
  • Juan I. Durán
  • Desirée Blazquez
DOI: https://doi.org/10.6018/analesps.433051
Keywords: p-hacking, effect size, meta-analysis


The label p-hacking (pH) refers to a set of opportunistic practices aimed at making significant some p values that should be non-significant. Some have argued that we should prevent and fight pH for several reasons, especially because of its potentially harmful effects on the assessment of primary research results and on their meta-analytic synthesis. We focus here on the effect that one specific type of pH, the one applied to marginally significant studies, has on the combined estimate of the effect size in a meta-analysis. We want to know how much its biasing effect should concern us when assessing the results of a meta-analysis. We calculated the bias in a range of situations that seem realistic in terms of the prevalence and the operational definition of pH. The results show that in most of the situations analyzed the bias is smaller than one hundredth (± 0.01), in terms of d or r. Reaching a bias of five hundredths (± 0.05) would require a massive presence of this type of pH, which seems unrealistic. There are many good reasons to fight pH, but our main conclusion is that a large impact on the meta-analytic estimation of effect size is not one of them.
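The kind of calculation described above can be illustrated with a small Monte Carlo sketch. It is not the authors' procedure, only a rough approximation under assumptions that are ours: the observed d is treated as normal with standard error sqrt(2/n) for two equal groups, "marginally significant" is taken to mean a two-sided p between .05 and .10, and a hacked study is assumed to report the d that lands exactly on p = .05. The function name `simulate_bias` and all of its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal, used for p values and critical z


def simulate_bias(delta=0.3, n_per_group=30, n_studies=100_000,
                  prevalence=0.5, p_low=0.05, p_high=0.10, seed=1):
    """Bias in the combined d caused by hacking marginally significant studies.

    Assumptions (illustrative only): d is approximately normal with
    SE = sqrt(2 / n_per_group), and a hacked study is nudged to the
    critical value that gives exactly p = p_low.
    """
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n_per_group)            # large-sample SE of d
    z_crit = STD_NORMAL.inv_cdf(1 - p_low / 2)   # about 1.96 for p_low = .05
    honest, reported = [], []
    for _ in range(n_studies):
        d = rng.gauss(delta, se)                 # effect actually observed
        p = 2 * (1 - STD_NORMAL.cdf(abs(d / se)))  # two-sided p value
        d_reported = d
        # Only a fraction `prevalence` of marginal studies is hacked.
        if p_low < p < p_high and rng.random() < prevalence:
            d_reported = math.copysign(z_crit * se, d)
        honest.append(d)
        reported.append(d_reported)
    # All studies share the same SE, so the fixed-effect (inverse-variance)
    # estimate reduces to the simple mean of the study effects.
    return fmean(reported) - fmean(honest)


if __name__ == "__main__":
    for prev in (0.1, 0.5, 1.0):
        print(f"prevalence={prev:.1f}  bias={simulate_bias(prevalence=prev):.4f}")
```

Because the hacked and honest estimates are built from the same simulated studies, the difference isolates the contribution of the nudged studies. Varying `delta`, `n_per_group`, and `prevalence` gives a feel for how the bias scales under these assumed conditions.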


Author biography

Juan Botella Ausina, Universidad Autónoma de Madrid

Faculty of Psychology

Universidad Autónoma de Madrid



How to cite
Botella, J., Suero, M., Durán, J. I., & Blazquez, D. (2021). The small impact of p-hacking marginally significant results on the meta-analytic estimation of effect size. Anales De Psicología / Annals of Psychology, 37(1), 178-187. https://doi.org/10.6018/analesps.433051
Section: Methodology of the behavioral sciences