Final Thesis: Black-box Investigation of the Ohloh Data Source for OSS Research

Abstract: Thanks to data collected by providers like Ohloh, research on open source software has become both easier and more popular. Ohloh offers access to its database, which contains most of the statistically relevant information it collects from publicly available version control systems. Many studies of open source software have used the data provided by Ohloh. In this study we set out to investigate the reliability and validity of the Ohloh data through a black-box investigation. We tested the reliability of the data and found that it did not reflect the number of all open source projects because of changes in Ohloh's methods. We therefore changed the research question to “How to reduce redundancy in R scripting by creating an interactive toolkit?”. To answer it, we developed a generic, independent, and extensible toolkit that spares users repetitive tasks so they can concentrate on their research.

Keywords: R, split-apply-combine, plyr, data processing, graphics toolkit

PDFs: Master Thesis, Work Description

Reference: Ahmet Sitti. Black-box Investigation of the Ohloh Data Source for OSS Research. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2014.