Abstract:
Rather than tediously writing unit tests manually,
tools can be used to generate them automatically — sometimes
even resulting in higher code coverage than manual testing. But
how good are these tests at actually finding faults? To answer
this question, we applied three state-of-the-art unit test generation
tools for Java (Randoop, EvoSuite, and Agitar) to the 357 faults
in the Defects4J dataset and investigated how well the generated
test suites perform at detecting faults. Although 55.7% of the
faults were found by automatically generated tests overall, only
19.9% of the test suites generated in our experiments actually
detected a fault. By studying the performance and the problems of
the individual tools and their tests, we derive insights to support
the development of automated unit test generators, in order to
increase the fault detection rate in the future. These include
1) improving the coverage obtained so that defective statements
are actually executed in the first place, 2) techniques for
propagating faults to the output, coupled with the generation
of more sensitive assertions for detecting them, and 3) better
simulation of the execution environment to detect faults that
depend on external factors, such as the date and time.