How does noise generated by researcher decisions undermine the credibility of science? We test this by observing all decisions made among 73 research teams as they independently conduct studies on the same hypothesis with identical starting data. We find excessive variation of outcomes. When combined, the 107 observed research decisions taken across teams explained at most 2.6% of the total variance in effect sizes and 10% of the deviance in subjective conclusions. Expertise, prior beliefs and attitudes of the researchers explain even less. Each model deployed to test the hypothesis was unique, which highlights a vast universe of research design variability that is normally hidden from view and suggests humility when presenting and interpreting scientific findings.