Parameterizing and assembling IR-based solutions for SE tasks using genetic algorithms

Annibale Panichella, Bogdan Dit, Rocco Oliveto, Massimiliano Di Penta, Denys Poshyvanyk, Andrea de Lucia

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

34 Scopus citations

Abstract

Information Retrieval (IR) approaches are nowadays used to support various software engineering tasks, such as feature location, traceability link recovery, clone detection, or refactoring. However, previous studies showed that inadequate instantiation of an IR technique and its underlying process can significantly affect the performance of such approaches in terms of precision and recall. This paper proposes the use of Genetic Algorithms (GAs) to automatically configure and assemble an IR process for software engineering tasks. The approach (named GA-IR) determines the (near) optimal solution to be used for each stage of the IR process, i.e., term extraction, stop word removal, stemming, indexing, and the calibration of an IR algebraic method. We applied GA-IR to two different software engineering tasks, namely traceability link recovery and identification of duplicate bug reports. The results of the study indicate that GA-IR outperforms approaches previously published in the literature, and that it does not significantly differ from an ideal upper bound that could be achieved by a supervised and combinatorial approach.
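The idea of searching over one choice per IR stage can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the option values in `SEARCH_SPACE`, the function names, the GA operators (uniform crossover, per-gene mutation, elitist truncation selection), and above all `toy_fitness`, which is only a stand-in score. The abstract does not state which fitness function or operators GA-IR uses, so this should not be read as the authors' implementation.

```python
import random

# Hypothetical search space: one gene per stage named in the abstract.
# The option values are illustrative assumptions, not the paper's settings.
SEARCH_SPACE = {
    "term_extraction": ["keep_all", "split_identifiers"],
    "stop_words": ["none", "standard_list"],
    "stemming": ["none", "porter", "snowball"],
    "indexing": ["boolean", "tf", "tf_idf"],
    "method_dimensions": [10, 50, 100, 200],
}
STAGES = list(SEARCH_SPACE)


def random_config(rng):
    """Draw one option per stage uniformly at random."""
    return {s: rng.choice(SEARCH_SPACE[s]) for s in STAGES}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {s: (a[s] if rng.random() < 0.5 else b[s]) for s in STAGES}


def mutate(cfg, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        s: (rng.choice(SEARCH_SPACE[s]) if rng.random() < rate else v)
        for s, v in cfg.items()
    }


def toy_fitness(cfg):
    """Placeholder score: number of genes matching an arbitrary target.

    A real fitness would run the configured IR process and score its output.
    """
    target = {
        "term_extraction": "split_identifiers",
        "stop_words": "standard_list",
        "stemming": "porter",
        "indexing": "tf_idf",
        "method_dimensions": 100,
    }
    return sum(cfg[s] == target[s] for s in STAGES)


def evolve(fitness, generations=30, pop_size=20, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve(toy_fitness)
    print(best, toy_fitness(best))
```

Because the better half of the population is carried over unchanged, the best score never decreases between generations. Replacing `toy_fitness` with a function that actually runs the configured pipeline is where the real cost and the real design decisions would lie.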

Original language: English
Title of host publication: 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering, SANER 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 314-325
Number of pages: 12
ISBN (Electronic): 9781509018550
State: Published - 20 May 2016
Event: 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2016 - Suita, Osaka, Japan
Duration: 14 Mar 2016 – 18 Mar 2016

Publication series

Name: 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering, SANER 2016
Volume: 1

Conference

Conference: 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2016
Country/Territory: Japan
City: Suita, Osaka
Period: 14/03/16 – 18/03/16

Keywords

  • Information retrieval
  • Parametrization
  • Search-based software engineering
  • Text-based software engineering
