This is an article about a technical issue, but basically it's about a heroic attempt to deal with the impossible.
At the risk of appearing presumptuous, I would like to share my experiences with you: it's about establishing automated user interface tests.
I owe you an explanation of why I called this "impossible": the general theory of testing is to stimulate the system under test with some known input.
Provided the system behaves in a reproducible way, we can then compare the system's output with a nominal output and apply some suitable
pass/fail criterion to mark the test as passed or failed. That, at least, is the theory.
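The general principle can be sketched in a few lines. This is a minimal illustration of the stimulate/compare/criterion idea, not code from KLayout; all names are made up:

```python
# Minimal sketch of the general test principle: stimulate the system
# under test with a known input, then compare the observed output
# against a nominal output using a pass/fail criterion.

def run_test(system_under_test, stimulus, nominal_output, criterion):
    """Apply a stimulus and judge the response with a pass/fail criterion."""
    actual_output = system_under_test(stimulus)
    return criterion(actual_output, nominal_output)

# A trivial "system" and an exact-match criterion for illustration:
double = lambda x: 2 * x
exact_match = lambda actual, nominal: actual == nominal

print(run_test(double, 21, 42, exact_match))  # True -> test passed
```

The interesting part, as argued below, is not this skeleton but the choice of `criterion`: deciding what counts as a relevant difference.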
For a classical software test, this principle holds well. But the theory tends to neglect an important issue: the choice of the
test cases, and the nominal output against which the system's response is compared, are the result of an engineering process.
What is tested, and how the response is checked, should be guided by a specification or by wider system knowledge. Thus
the test reflects the "important" aspects of the system while putting less emphasis on the less important ones.
Why is this important? Because that choice reflects the freedom of implementation, and that freedom is basically what engineering
is about. A system should meet the requirements laid down in a specification. What is not part of the specification
should be left to the implementor, and ideally this freedom will be used to find a solution that is simple and effective.
A test should focus on the specification and not the implementation details.
That is not a new concept: it is the basis of the test-driven software engineering approach. Programmers, being particularly lazy, tend
to adopt such methods quickly and use the implementation freedom to minimize their coding effort. In addition, focusing the testing on the
important aspects enables refactoring - a technique where code is frequently reorganized or even rewritten. Being the opposite of the
"never change a running system" principle, refactoring can be a very effective way of keeping code maintainable, provided there
is good test coverage and a high tolerance of the tests for irrelevant implementation side effects.
Coming back to UI tests: most frameworks available for UI test automation follow the theory. Somehow, the user interface is stimulated,
e.g. by recording and replaying mouse movements, button events and keyboard interactions. That is the input side. On the response side,
however, there are numerous channels, mostly graphical in some way. How should a test system deal with that output? That is the crucial
point here, and taking into account my previous statements about the importance of careful and selective response checking, it should not be hard
to see that we have an issue. These are the facts: I have not seen many people performing automated user interface tests, and from those who do, I have heard a lot of complaints about
high maintenance effort, unreliable execution and much else. There seem to be many people who regard automated user interface tests as "impossible".
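The record-and-replay idea on the input side, combined with a response log on the output side, can be sketched as follows. This is an illustration under my own assumptions, not the actual KLayout framework; the event names and the stub handler are invented:

```python
# Illustrative sketch: UI stimulation by replaying a recorded event
# stream, with the observable responses captured in a test log that
# is compared against a golden reference log.

recorded_events = [
    ("click", "zoom_fit_button"),
    ("key", "Escape"),
]

def replay(events, handler, log):
    """Feed recorded events into the UI and log each observable response."""
    for kind, target in events:
        response = handler(kind, target)
        log.append(f"{kind} {target} -> {response}")

def fake_ui_handler(kind, target):
    # Stand-in for a real user interface; returns a describable response.
    return "ok"

golden_log = ["click zoom_fit_button -> ok", "key Escape -> ok"]

log = []
replay(recorded_events, fake_ui_handler, log)
print(log == golden_log)  # True if the responses match the reference
```

The hard part in practice is not the replay loop but deciding what the `response` strings should contain - which is exactly the selectivity problem discussed above.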
With KLayout I tried to deal with that issue. Here are my considerations:
And this is the implementation:
So far, this approach holds well. In fact, some simple tests have been automated very successfully using this concept.
However, it gets nasty when it comes to the details. For example, a simple comparison of test logs is not very useful. It frequently happens
that the drawing of a layout changes slightly, e.g. because of different rounding or similar effects. Then just a few pixels
of the drawn image change. To evaluate the differences between two test logs with embedded images, a more elaborate solution is
required than a simple "diff". For that purpose, KLayout comes with a small utility ("gtfui") which is basically a graphical "diff" tool with the
capability of comparing images and showing image differences. This utility has proven extremely useful, but of course it took
some effort to build.
Some issues have not been solved yet:
However, the tests have proven extremely useful. With user interface tests, it is possible to cover a broad variety of functionality,
and tests are easily created (recorded). The UI test suite has shown me a couple of bugs after refactoring sessions which were not
found by the unit tests. Another conclusion is that checking selected widgets for correct content is a very powerful way of keeping
tests maintainable. However, such check points must not be placed too sparsely; otherwise it is hard to track down the root cause of a test failure.
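The check-point idea can be illustrated like this: rather than comparing raw graphics, assert the content of selected widgets at well-chosen points of the test. The widget names and values below are made up for illustration:

```python
# Sketch of a "check point": compare the contents of selected widgets
# against expected values and record the result in the test log.

def check_point(widgets, expected, log):
    """Compare selected widget contents against expected values."""
    for name, value in expected.items():
        actual = widgets.get(name)
        status = "ok" if actual == value else f"FAIL (got {actual!r})"
        log.append(f"checkpoint {name}: {status}")

# Hypothetical widget state at some point during a replayed test:
widgets = {"cell_name_label": "TOP", "layer_count_field": "12"}

log = []
check_point(widgets, {"cell_name_label": "TOP"}, log)
print(log)  # ['checkpoint cell_name_label: ok']
```

Placing such check points densely enough means that when a test fails, the first failing check point points close to the root cause instead of leaving you with a diverged log far downstream.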
Coming back to the title, I would conclude that I did not manage to completely tame the beast. Right now, I have about 70 user interface tests;
most of them run stably and provide a high degree of coverage for database and user interface functionality, but some frequently require
updating. I would like
to see Troll Tech integrate something like my solution into Qt, which would save me the effort of implementing a test framework and enhance
the stability of the tests.
On the other hand, I feel
that a real solution would involve a user interface architecture which is consistently optimized for testability, e.g. by providing a view model
layer below the user interface that could be used as a test interface.
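A hedged sketch of that idea: a view model layer holds the state the user interface displays, so tests can drive and inspect the application through the view model instead of through pixels and mouse events. The class and attribute names are invented for illustration:

```python
# Sketch of a testable view model layer: tests exercise application
# state directly, without any graphics or event simulation involved.

class ZoomViewModel:
    """State layer sitting between the UI widgets and the core."""

    def __init__(self):
        self.zoom_level = 1.0

    def zoom_in(self):
        # The widget's zoom button would call this same method.
        self.zoom_level *= 2.0

vm = ZoomViewModel()
vm.zoom_in()
print(vm.zoom_level)  # 2.0
```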
That, however, is currently beyond my scope, and what I really would like to see right now is a cold beer sitting on my table ...