Are Mutation Scores Correlated with Real Fault Detection? A Large Scale Empirical Study on the Relationship Between Mutants and Real Faults

Papadakis, Mike; Shin, Donghwan; Yoo, Shin; Bae, Doo-Hwan

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2018 • In 40th International Conference on Software Engineering, May 27 - 3 June 2018, Gothenburg, Sweden

Peer reviewed

Permalink
https://hdl.handle.net/10993/34950

Details

Disciplines :

Computer science

Author, co-author :

Papadakis, Mike ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)

Shin, Donghwan

Yoo, Shin

Bae, Doo-Hwan

External co-authors :

yes

Language :

English

Title :

Are Mutation Scores Correlated with Real Fault Detection? A Large Scale Empirical Study on the Relationship Between Mutants and Real Faults

Publication date :

2018

Event name :

40th International Conference on Software Engineering (ICSE'18)

Event date :

from 27-5-2018 to 3-6-2018

Audience :

International

Main work title :

40th International Conference on Software Engineering, May 27 - 3 June 2018, Gothenburg, Sweden

Peer reviewed :

Peer reviewed

Focus Area :

Security, Reliability and Trust

Available on ORBilu :

since 19 February 2018
