Final Thesis: Design and Implementation of a Version Control System for Open Data Modelling Projects

Abstract: Many modern software applications and research projects depend on access to high-quality data sources. Although a large number of data sets are already openly available, they are often hard to (re)use due to barriers such as incomplete documentation and wrong or missing values. To address these barriers, the JValue Project has been established by the Professorship of Open Source Software at Friedrich-Alexander-Universität Erlangen-Nürnberg. The goal of the JValue Project is to “make open data easy, safe, and reliable”. In the context of the JValue Project, numerous software applications are being developed which, among other things, allow users to explicitly define the structure and further meta information of openly available data sets. However, it is currently possible neither to collaborate with other individuals on such data source configurations nor to retrace the historic development that led to the current state of a particular configuration. To build a basis for addressing these issues, a Version Control System shall be developed that makes it possible to store, retrieve, and compare revisions of files containing data source configurations and related information. This thesis presents a concept for such a system and evaluates this concept by implementing a prototype that shows its feasibility. As a result of this thesis, other applications developed in the context of the JValue Project can now access, create, and compare revisions in order to provide advanced collaboration and versioning features to end users.

Keywords: Version control systems, open data, collaboration.

PDF: Master Thesis

Reference: Martin Buchalik. Design and Implementation of a Version Control System for Open Data Modelling Projects. Master Thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg: 2022.