Version 1: Received: 8 August 2024 / Approved: 9 August 2024 / Online: 12 August 2024 (16:04:32 CEST)
How to cite:
Gut, C.; Goldman, A. Revisiting Aristotle vs. Ringelmann: The influence of biases on measuring productivity in Open Source software development. Preprints 2024, 2024080713. https://doi.org/10.20944/preprints202408.0713.v1
APA Style
Gut, C., & Goldman, A. (2024). Revisiting Aristotle vs. Ringelmann: The influence of biases on measuring productivity in Open Source software development. Preprints. https://doi.org/10.20944/preprints202408.0713.v1
Chicago/Turabian Style
Gut, C. and Alfredo Goldman. 2024. "Revisiting Aristotle vs. Ringelmann: The influence of biases on measuring productivity in Open Source software development." Preprints. https://doi.org/10.20944/preprints202408.0713.v1
Abstract
Aristotle vs. Ringelmann was a discussion between two research teams from ETH Zürich about whether the productivity of Open Source software projects scales sublinearly or superlinearly with team size. The discussion revolved around two publications that used apparently similar techniques: both sampled projects on GitHub and ran regression analyses to answer the question of superlinearity. Despite the similarity in their research methods, the team around Ingo Scholtes concluded that projects scale sublinearly, while the team around Didier Sornette found a superlinear relationship between team size and productivity. In subsequent publications, the two authors argue that the opposite conclusions may be attributed to differences in project populations, since 81.7% of Sornette’s projects have fewer than 50 contributors, whereas Scholtes specifically sampled projects with more than 50 contributors. This publication compares the research of both authors by replicating their findings, which allows an evaluation of how much project sampling actually accounted for the differences between Scholtes’ and Sornette’s results. The replication shows that sampling bias only partially explains the discrepancies between the two authors. Further analysis detected instrumentation biases that drove the regression coefficients in opposite directions. These findings were then consolidated into a quantitative analysis, which indicates that instrumentation biases contributed more to the differences between Scholtes’ and Sornette’s work than the selection bias suggested by both authors.
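To make the sublinear versus superlinear question concrete, the following is a minimal sketch of one common way to estimate a scaling exponent: an ordinary least squares fit of log(output) against log(team size), where a slope above 1 suggests superlinear scaling and a slope below 1 suggests sublinear scaling. It is an illustration under stated assumptions, not the regression model used by Scholtes, Sornette, or this replication. The function names, the productivity measure, and the reading of the 50-contributor threshold are hypothetical, and the demo data are synthetic.

```python
import math


def fit_scaling_exponent(team_sizes, outputs):
    """Return the OLS slope b of log(output) = a + b * log(team_size).

    Assumption: productivity follows a power law in team size; the
    studies discussed in the abstract may use different models.
    """
    xs = [math.log(n) for n in team_sizes]
    ys = [math.log(y) for y in outputs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


def classify_scaling(exponent, tolerance=0.0):
    """Label an exponent relative to linear scaling (exponent == 1)."""
    if exponent > 1 + tolerance:
        return "superlinear"
    if exponent < 1 - tolerance:
        return "sublinear"
    return "linear"


def split_by_team_size(projects, threshold=50):
    """Split (team_size, output) pairs into small and large populations.

    Assumption: mirrors the abstract's wording ("fewer than 50" and
    "more than 50"); projects with exactly `threshold` contributors
    fall into neither group.
    """
    small = [(n, y) for n, y in projects if n < threshold]
    large = [(n, y) for n, y in projects if n > threshold]
    return small, large


if __name__ == "__main__":
    # Synthetic, noise-free projects (not data from either study).
    projects = [(n, 3.0 * n ** 0.9) for n in (2, 5, 10, 20, 40, 80, 160, 320)]
    small, large = split_by_team_size(projects)
    for label, group in (("small", small), ("large", large)):
        sizes, outputs = zip(*group)
        exponent = fit_scaling_exponent(sizes, outputs)
        print(label, round(exponent, 3), classify_scaling(exponent))
```

Fitting the two populations separately, as in the demo, is one simple way to check whether a difference in sampled team sizes alone changes the estimated exponent.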
Keywords
Mining Software Repositories; Open Source; Empirical Software Engineering; software development productivity; GitHub; git; economies of scale; diseconomies of scale; replication study; sampling bias; instrumentation bias
Subject
Computer Science and Mathematics, Software
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.