Reproducibility in Machine Learning: Reproducibility Challenge

Reproducibility, loosely defined, refers to the ease with which the experiments and results of a paper can be replicated. Currently, there is a "crisis of reproducibility in machine learning": many innovative works are published every day, yet when other scientists attempt to reproduce them, complexity and a lack of documentation often prevent them from doing so, and when they do succeed, the results sometimes differ from those described in the original works \cite{repro_crisis}. In this paper we attempted to reproduce two papers: "From Group to Individual Labels using Deep Features" and Stanford's "GloVe: Global Vectors for Word Representation". Because of the complexity and lack of documentation of the original work, as well as limited access to external data and resources, we were unable to reproduce, or even fully understand, the underlying code of the first paper. In contrast, we were able to partially reproduce the results of the GloVe paper, and to experiment with some minor changes and observe their outcomes. In particular, we reproduced and explored different results on the word analogy tasks reported in the original paper. To do so, we had to modify some evaluation scripts to make them run correctly and write a few small scripts to generate graphics and transform the data needed for evaluation. Through this paper, we also offer some criticism of the neglect of "good practices", such as providing documentation and demo examples, in work published to the research community.

Our paper:

Our repo:

