TY - GEN
T1 - Configuring and assembling information retrieval based solutions for software engineering tasks
AU - Dit, Bogdan
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2017/1/12
Y1 - 2017/1/12
N2 - Information Retrieval (IR) approaches are used to leverage textual or unstructured data generated during the software development process to support various software engineering (SE) tasks (e.g., concept location, traceability link recovery, change impact analysis, etc.). Two of the most important steps for applying IR techniques to support SE tasks are preprocessing the corpus and configuring the IR technique, and these steps can significantly influence the outcome and the amount of effort developers have to spend for these maintenance tasks. We present the use of Genetic Algorithms (GAs) to automatically configure and assemble an IR process to support SE tasks. The approach named IR-GA determines the (near) optimal solution to be used for each step of the IR process without requiring any training. We applied IR-GA on three different SE tasks and the results of the study indicate that IR-GA outperforms approaches previously used in the literature, and that it does not significantly differ from an ideal upper bound that could be achieved by a supervised approach and a combinatorial approach.
AB - Information Retrieval (IR) approaches are used to leverage textual or unstructured data generated during the software development process to support various software engineering (SE) tasks (e.g., concept location, traceability link recovery, change impact analysis, etc.). Two of the most important steps for applying IR techniques to support SE tasks are preprocessing the corpus and configuring the IR technique, and these steps can significantly influence the outcome and the amount of effort developers have to spend for these maintenance tasks. We present the use of Genetic Algorithms (GAs) to automatically configure and assemble an IR process to support SE tasks. The approach named IR-GA determines the (near) optimal solution to be used for each step of the IR process without requiring any training. We applied IR-GA on three different SE tasks and the results of the study indicate that IR-GA outperforms approaches previously used in the literature, and that it does not significantly differ from an ideal upper bound that could be achieved by a supervised approach and a combinatorial approach.
KW - Information retrieval
KW - Parametrization
KW - Reproducibility of experiments
KW - Search-based software engineering
KW - Text-based software engineering
UR - http://www.scopus.com/inward/record.url?scp=85013078278&partnerID=8YFLogxK
U2 - 10.1109/ICSME.2016.85
DO - 10.1109/ICSME.2016.85
M3 - Conference contribution
AN - SCOPUS:85013078278
T3 - Proceedings - 2016 IEEE International Conference on Software Maintenance and Evolution, ICSME 2016
SP - 641
EP - 646
BT - Proceedings - 2016 IEEE International Conference on Software Maintenance and Evolution, ICSME 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 32nd IEEE International Conference on Software Maintenance and Evolution, ICSME 2016
Y2 - 2 October 2016 through 10 October 2016
ER -