Landfill: An Open Dataset of Code Smells with Public Evaluation

Fabio Palomba, Dario Di Nucci, Michele Tufano, Gabriele Bavota, Rocco Oliveto, Denys Poshyvanyk, Andrea De Lucia

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Code smells are symptoms of poor design and implementation choices that may hinder code comprehension and possibly increase the change- and fault-proneness of source code. Several techniques have been proposed in the literature for detecting code smells. These techniques are generally evaluated by comparing the set of candidate code smells they detect against a manually produced oracle. Unfortunately, with few exceptions, such comprehensive sets of annotated code smells are not available in the literature. In this paper we contribute (i) a dataset of 243 instances of five types of code smells identified from 20 open source software projects, (ii) a systematic procedure for validating code smell datasets, (iii) Landfill, a Web-based platform for sharing code smell datasets, and (iv) a set of APIs for programmatically accessing Landfill's contents. Anyone can contribute to Landfill by (i) improving existing datasets (e.g., adding missing instances of code smells or flagging possibly incorrectly classified instances) and (ii) sharing and posting new datasets. Landfill is available at www.sesa.unisa.it/landfill/, and a video demonstrating its features in action is available at http://www.sesa.unisa.it/tools/landfill.jsp.
Original language: English
Title of host publication: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories
Publication status: Published - May 2015
