Abstract: Software productivity metrics are indicators of software quality, development efficiency and developer satisfaction, and they remain crucial to project success. The MECOIS project provides a robust solution for analyzing development data. Central to this solution is a data foundation that uses automated pipelines to extract software engineering metrics from development platforms such as GitHub. However, the existing extraction layer relies exclusively on the REST API. Following a design science approach, this thesis introduces a complementary extraction approach based on the GraphQL API, drawing on existing literature and established architectures. We analyze the accessible data, develop a solution that integrates seamlessly with the existing pipeline and demonstrate its applicability using real-world data from the AMOS program stored on GitHub. The subsequent evaluation provides a comparative analysis of the REST and GraphQL approaches, focusing on data completeness, request efficiency and rate-limit consumption. We found that the GraphQL-based approach can drastically reduce rate-limit cost, in our demonstration by up to 97.8% for fork extractions, and that it enables comprehensive access to metadata, such as custom project fields and timeline events, that is unavailable via the REST API.
Keywords: Data Extraction, GraphQL, Data Pipelines
PDF: Bachelor Thesis
Reference: Benjamin Georg Koller. Integrating and Evaluating a GraphQL Extraction Layer. Bachelor Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2026.