Final Thesis: In-Depth Analysis of Software Composition Analysis Tools: Current Curation Practices

Abstract: Software Composition Analysis (SCA) tools play a vital role in identifying vulnerabilities and ensuring license compliance in Open Source Software (OSS). However, their effectiveness is strongly influenced by the quality of metadata used during analysis. Incomplete, inconsistent, or outdated metadata can lead to false positives, undetected vulnerabilities, or misclassified licenses. This thesis investigates how SCA tools address these challenges through metadata curation practices focused on validation, correction, and updating. Drawing on a Systematic Literature Review (SLR) of 25 academic papers and selected grey literature, this study analyzes 31 SCA tools, including both full-featured solutions and supporting components (e.g., license classifiers such as Ninka). These tools are classified by their primary focus (vulnerability or licensing), curation methodology (manual, automatic, or hybrid), data sources, and update frequency. The findings reveal that while a few tools employ structured, community-driven curation workflows, most provide limited transparency regarding how metadata is curated or maintained. The thesis proposes a typology of curation strategies, identifies critical gaps in transparency and evaluation, and draws comparative lessons from domains where metadata quality is a primary concern. These insights lead to practical recommendations for improving the reliability, auditability, and interpretability of SCA tools through more modular and transparent curation workflows.

Keywords: Software Composition Analysis (SCA), Curation, Systematic Literature Review (SLR)

PDF: Master Thesis

Reference: Aleem Ud Din. In-Depth Analysis of Software Composition Analysis Tools: Current Curation Practices. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2025.

