TY - JOUR
T1 - Testing research software
T2 - an in-depth survey of practices, methods, and tools
AU - Eisty, Nasir U.
AU - Kanewala, Upulee
AU - Carver, Jeffrey C.
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025/6
Y1 - 2025/6
N2 - Context: Research software is essential for developing advanced tools and models to solve complex research problems and drive innovation across domains. Therefore, it is essential to ensure its correctness. Software testing plays a vital role in this task. However, testing research software is challenging due to the software’s complexity and to the unique culture of the research software community. Objective: Building on previous research, this study provides an in-depth investigation of testing practices in research software, focusing on test case design, challenges with expected outputs, use of quality metrics, execution methods, tools, and desired tool features. Additionally, we explore whether demographic factors influence testing processes. Method: We survey research software developers to understand how they design test cases, handle output challenges, use metrics, execute tests, and select tools. Results: Research software testing varies widely. The primary challenges are test case design, evaluating test quality, and evaluating the correctness of test outputs. Overall, research software developers are not familiar with existing testing tools and have a need for new tools to support their specific needs. Conclusion: Allocating human resources to testing and providing developers with knowledge about effective testing techniques are important steps toward improving the testing process of research software. While many industrial testing tools exist, they are inadequate for testing research software due to its complexity, specialized algorithms, continuous updates, and need for flexible, custom testing approaches. Access to a standard set of testing tools that address these special characteristics will increase the level of testing in research software development and reduce the overhead of distributing knowledge about software testing.
AB - Context: Research software is essential for developing advanced tools and models to solve complex research problems and drive innovation across domains. Therefore, it is essential to ensure its correctness. Software testing plays a vital role in this task. However, testing research software is challenging due to the software’s complexity and to the unique culture of the research software community. Objective: Building on previous research, this study provides an in-depth investigation of testing practices in research software, focusing on test case design, challenges with expected outputs, use of quality metrics, execution methods, tools, and desired tool features. Additionally, we explore whether demographic factors influence testing processes. Method: We survey research software developers to understand how they design test cases, handle output challenges, use metrics, execute tests, and select tools. Results: Research software testing varies widely. The primary challenges are test case design, evaluating test quality, and evaluating the correctness of test outputs. Overall, research software developers are not familiar with existing testing tools and have a need for new tools to support their specific needs. Conclusion: Allocating human resources to testing and providing developers with knowledge about effective testing techniques are important steps toward improving the testing process of research software. While many industrial testing tools exist, they are inadequate for testing research software due to its complexity, specialized algorithms, continuous updates, and need for flexible, custom testing approaches. Access to a standard set of testing tools that address these special characteristics will increase the level of testing in research software development and reduce the overhead of distributing knowledge about software testing.
KW - Interview
KW - Research software engineering
KW - Software engineering
KW - Software quality
KW - Software testing
KW - Survey
UR - http://www.scopus.com/inward/record.url?scp=86000736204&partnerID=8YFLogxK
U2 - 10.1007/s10664-025-10620-6
DO - 10.1007/s10664-025-10620-6
M3 - Article
AN - SCOPUS:86000736204
SN - 1382-3256
VL - 30
JO - Empirical Software Engineering
JF - Empirical Software Engineering
IS - 3
M1 - 81
ER -