Abstract: The increasing complexity of software ecosystems requires reliable, continuously maintained Software Composition Analysis (SCA) data. Although automated scanners provide essential scalability, they lack the
contextual judgment required to reliably resolve ambiguous licensing and vulnerability scenarios. This raises the question of whether crowdsourcing can support data curation within SCA environments, particularly in the context of the SCA Tool. To address this question, this thesis conducts a Systematic Literature Review (SLR) following the
evidence-based software engineering guidelines proposed by Kitchenham (2004) and Kitchenham and Charters (2007). The review synthesizes existing research on crowdsourcing approaches, application domains, incentive structures, quality assurance mechanisms, and licensing frameworks. The analysis shows that crowdsourcing
encompasses structurally distinct approaches that differ in task complexity, aggregation mechanisms, and
contribution diversity. Based on this synthesis, the study evaluates the suitability of these approaches for supporting data curation tasks within the SCA Tool. The findings suggest that selectively aligning crowdsourcing approaches with task characteristics and associated risk levels can improve scalability while preserving data integrity. However, the effectiveness of crowdsourcing depends on strong governance, appropriate incentives, and proportional quality controls. Overall, the study provides an evidence-based foundation for assessing whether crowdsourcing can complement automated workflows in SCA-based data curation.
Keywords: Crowdsourcing, Systematic Literature Review (SLR), Curation, SCA Tool
PDF: Master Thesis
Reference: Mohammad Naim. A Systematic Review of Crowdsourcing Approaches and Their Applicability to Curation. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2026.