Preprint Article, Version 1. This version is not peer-reviewed.

Revisiting Aristotle vs. Ringelmann: The influence of biases on measuring productivity in Open Source software development

Version 1 : Received: 8 August 2024 / Approved: 9 August 2024 / Online: 12 August 2024 (16:04:32 CEST)

How to cite: Gut, C.; Goldman, A. Revisiting Aristotle vs. Ringelmann: The influence of biases on measuring productivity in Open Source software development. Preprints 2024, 2024080713. https://doi.org/10.20944/preprints202408.0713.v1

Abstract

Aristotle vs. Ringelmann was a discussion between two research teams from ETH Zürich about whether the productivity of Open Source software projects scales sublinearly or superlinearly with team size. The discussion revolved around two publications that apparently used similar techniques: both sampled projects on GitHub and ran regression analyses to answer the question of superlinearity. Despite the similarity in their research methods, the team around Ingo Scholtes concluded that projects scale sublinearly, while the team around Didier Sornette found a superlinear relationship between team size and productivity. In subsequent publications, both authors argue that the opposite conclusions may be attributed to differences in project populations: 81.7% of Sornette’s projects have fewer than 50 contributors, whereas Scholtes specifically sampled projects with more than 50 contributors. This publication compares the research of both authors by replicating their findings, which makes it possible to evaluate how much project sampling actually accounted for the differences between Scholtes’ and Sornette’s results. The replication shows that sampling bias only partially explains the discrepancies between the two authors. Further analysis revealed instrumentation biases that drove the regression coefficients in opposite directions. These findings were then consolidated into a quantitative analysis, which indicates that instrumentation biases contributed more to the differences between Scholtes’ and Sornette’s work than the selection bias suggested by both authors.
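
The distinction between sublinear and superlinear scaling can be illustrated with a small regression sketch. It assumes a power-law relationship, output = c * (team size)^beta, and estimates beta as the slope of an ordinary least-squares fit on log-transformed values; beta above 1 would indicate superlinear scaling, beta below 1 sublinear scaling. The power-law form, the function names, and the data are illustrative assumptions. The data are synthetic, and the sketch is not the model or the measurement used by either research team.

```python
import math


def fit_scaling_exponent(team_sizes, outputs):
    """Estimate beta in output = c * size**beta via OLS on log-log values.

    Hypothetical helper for illustration; not taken from either study.
    """
    xs = [math.log(n) for n in team_sizes]
    ys = [math.log(p) for p in outputs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the least-squares line: covariance(x, y) / variance(x).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_scaling(beta, tolerance=1e-6):
    """Label the exponent relative to linear scaling (beta == 1)."""
    if beta > 1 + tolerance:
        return "superlinear"
    if beta < 1 - tolerance:
        return "sublinear"
    return "linear"


# Synthetic, noise-free data (assumption): exponents chosen by hand.
sizes = [2, 5, 10, 50, 100, 300]
superlinear_output = [3.0 * n ** 1.3 for n in sizes]
sublinear_output = [3.0 * n ** 0.8 for n in sizes]

beta_super = fit_scaling_exponent(sizes, superlinear_output)
beta_sub = fit_scaling_exponent(sizes, sublinear_output)
print(classify_scaling(beta_super), classify_scaling(beta_sub))
```

With real repository data, the estimate depends both on which projects enter the sample and on how team size and output are measured, which is where the sampling and instrumentation biases discussed in the abstract would come into play.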

Keywords

Mining Software Repositories; Open Source; Empirical Software Engineering; software development productivity; GitHub; git; economies of scale; diseconomies of scale; replication study; sampling bias; instrumentation bias

Subject

Computer Science and Mathematics, Software
