UI test smells: if() and for() and files

I read with interest Matt Archer's blog post entitled "How test automation with Selenium or Watir can fail".

He shows a couple of examples that, while perfectly valid, are poor sorts of tests to execute at the UI level.  Conveniently, Mr. Archer's tests are so well documented that it is possible to point to exactly where the smells come in.

The test in the first example describes a UI element that has only two possible states: either "two-man lift" or "one-man lift", depending on the weight of an item.  In a well-designed test for a well-designed UI, it should be possible for the test to validate that only those two states are available to the user of the UI, and then stop.  

But Mr. Archer's test actually reaches out to the system under test in order to open a file whose contents may be variable or even arbitrary, iterates over those contents, and attempts to check that the proper value is displayed in the UI for each entry.  Mr. Archer himself notes a number of problems with this approach, but he fails to note the biggest problem: this test is expensive.  The file could be large, its contents could be corrupt, and since each entry generates only one of two states in the UI, all but two of the checks made by this test are unnecessary.

Mr. Archer goes on to note a number of ways that this test is fragile.  I suggest that the cases involving bad data in the source file are excellent charters for exploratory testing, but a poor idea for an automated test.  An automated UI test that simply checks that the two states are available without reference to any outside test data is perfectly adequate for this situation. 
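
As a sketch of what such a test might look like (Python and Selenium are used purely for illustration; the URL, the element id, and the item codes are hypothetical stand-ins for whatever the real application exposes), the UI check needs nothing more than two fixed, known inputs, one on each side of the weight threshold, and it contains no file access, no loop, and no conditional:

import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

# Two fixed, known inputs: one on each side of the weight threshold.
LIGHT_ITEM = "ITEM-001"   # known to be below the two-man-lift weight
HEAVY_ITEM = "ITEM-002"   # known to be above the two-man-lift weight

@pytest.fixture
def browser():
    driver = webdriver.Firefox()
    yield driver
    driver.quit()

def lift_label(driver, item_code):
    # Open the (hypothetical) item page and read the lift-advice label.
    driver.get(f"https://example.test/items/{item_code}")
    return driver.find_element(By.ID, "lift-advice").text

def test_light_item_shows_one_man_lift(browser):
    assert lift_label(browser, LIGHT_ITEM) == "one-man lift"

def test_heavy_item_shows_two_man_lift(browser):
    assert lift_label(browser, HEAVY_ITEM) == "two-man lift"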

In his second example, Mr. Archer once again iterates over a large number of source records in order to achieve very little.  Again, exercising the same UI elements 100 times using different source data is wasteful, since all the UI test should be doing is checking that the UI functions correctly.  However, there is an interesting twist in Mr. Archer's second example that he fails to notice.  If Mr. Archer were to randomly pull a single source record from his list of 100 records for each run of the test, he would get the coverage he seems to desire over the course of many hundreds of individual runs.  I took a similar approach in a test harness I once built for an API test framework, and I described that work in the June 2006 issue of Better Software magazine, in a piece called "Old School Meets New Wave".
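
A minimal sketch of that idea (again in Python; records.csv and its columns are hypothetical stand-ins for Mr. Archer's 100 source records): each run selects exactly one record at random and logs it so a failing run can be reproduced.

import csv
import random

def pick_random_record(path="records.csv"):
    # Load the (hypothetical) source records and choose exactly one.
    with open(path, newline="") as f:
        records = list(csv.DictReader(f))
    record = random.choice(records)
    # Log the chosen record so a failing run can be reproduced exactly.
    print(f"record under test: {record}")
    return record

The test itself then exercises the UI a single time with that record; over many hundreds of runs the whole data set is eventually covered, and the test never needs a loop.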

Both of Mr. Archer's examples violate two well-known UI test design principles.  The first principle is called the "Testing Pyramid for Web Apps".  As far as I can tell, this pyramid was invented simultaneously and independently by Jason Huggins (inventor of Selenium) and by Mike Cohn.  (Jason's pyramid image is around the web, but I nabbed it from here.)

[Image: the Testing Pyramid for Web Apps]
Any test that reaches out to the file system or to a database belongs in the middle tier of business-logic functional tests, and even then most practitioners would probably use a mock rather than an actual file, depending on context.  While it is not always possible for UI tests to avoid the business-logic tier completely, UI tests should be focused on testing *only* the UI.  Loops and conditionals in UI tests are an indication that something is being tested that is not just part of the UI.  Business-logic tests should, to the greatest extent possible, be executed "behind the GUI".  From years of experience I can say with authority that UI tests that exercise business logic become both expensive to run and expensive to maintain, if they are maintainable at all.
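
To make the distinction concrete, here is a sketch of the same lift-advice rule exercised behind the GUI.  The threshold value and the lift_advice() function are hypothetical stand-ins for the real business logic, which a real suite would import from the application rather than define inline:

# Illustrative threshold; the real value lives in the application code.
TWO_MAN_THRESHOLD_KG = 25.0

def lift_advice(weight_kg):
    # Stand-in for the production rule; a real test would import it.
    return "two-man lift" if weight_kg > TWO_MAN_THRESHOLD_KG else "one-man lift"

def test_weight_at_threshold_is_one_man_lift():
    assert lift_advice(TWO_MAN_THRESHOLD_KG) == "one-man lift"

def test_weight_just_above_threshold_is_two_man_lift():
    assert lift_advice(TWO_MAN_THRESHOLD_KG + 0.1) == "two-man lift"

Data-heavy checks like Mr. Archer's file iteration belong at this level or lower, where they run in milliseconds and break only when the rule itself changes; the UI test then only has to confirm that the two labels actually appear.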

The other principle violated by these examples is that the highest-level tests should never contain loops or conditionals.  The well-known test harness Fitnesse does not allow loops or conditionals in its own UI: whatever loops or conditionals a test may require must be coded as part of the fixture for that test.  For a detailed discussion of this design pattern, see this post by Gojko Adzic: "How to implement loops in Fitnesse test fixtures".
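
Fitnesse fixtures are normally written in Java or another Slim-supported language; the Python-shaped sketch below, with hypothetical names, is only meant to show the shape of the pattern: whatever iteration the test needs lives inside the fixture, and the test table sees a single question with a single answer.

class LiftAdviceFixture:
    # Hypothetical fixture: it loads and iterates over the records itself.
    def __init__(self, records):
        self._records = records

    def all_records_get_valid_advice(self):
        # The loop is an implementation detail of the fixture; the test
        # table only ever sees one true/false answer.
        return all(
            record.get("advice") in ("one-man lift", "two-man lift")
            for record in self._records
        )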

Comments

Gojko Adzic said…
I have a different take on the whole thing of UI test design now that, for me, makes such errors obvious. I call it the principle of symmetric change.
Jared said…
I don't think we can say the pyramid was 'invented', perhaps observed and described. Craig Larman spoke to us regarding a similar test automation strategy around 2003, and it's been a fairly standard idea (if not practice) in the teams I've worked in since 2004.

I'll try and dig up some other references (if only because someone sent me a link to http://blogs.agilefaqs.com/2011/02/01/inverting-the-testing-pyramid/# and I was annoyed by their lack of attribution).