Good Practices For Automating Functional Tests

Why

I spend a lot of time talking about the benefits of automating tests – automated checks, as Michael Bolton explains them. I call them automated tests out of habit (and for clarity to the uninitiated). Part of the responsibility of teaching this subject is teaching people to follow good practices. Almost all of these practices I lifted from somebody else.

Why do I like automated tests? Because I want to reduce the time between when a problem is introduced and when it is reported to the person who introduced it. Somewhere back in history, someone measured the cost of fixing defects and found that it grows the longer a defect goes unnoticed. That is a generalization, but I take it seriously because my job is to help people make good products. My value lies in the costs I reduce and the benefit I create for the pay I receive. When I do that more effectively, my value increases. Since I work in the computer software business, I ought to be using technology to reach those ends.

Balance in what you test (Unit, Service, GUI)

I have heard about the automated testing pyramid from several people and read about it on many blogs, such as Mike Cohn's – I don't know who originated the idea, but I first heard about it from Janet Gregory when she trained my team at HP. If all of the tests for a product go through the graphical user interface, then a lot is not getting tested, the tests will cost more to maintain because interfaces tend to change more than classes and services do, and problems will be found later in the product cycle because GUI tests require that all of the product layers be built (regardless of the order in which they are created).

Not Everything Should be Automated

James Bach recently wrote a blog post on the skill required to create scripted tests. The lesson I took away was that the cost of scripting tests to the point that they become checks is high even when people are the ones running them. The investment goes even higher when the interpreter is a computer program, because the instructions need to be so much more precise. The return is lower because the interpreter only sees what it is told to see.

Then where is the value? Automating the activities that cost less to automate than to perform manually. Consider partial automation as a great alternative to the all-or-nothing automate-or-manual question.

  • Injecting data to set up the test scenario – possibly SQL queries or web service calls
  • Verifying unseen changes – SQL queries (new or changed records), parsing logs, or web service calls
  • Navigating to the location of the test – this could be opening the application, logging in, and going to a certain page
  • Capturing screen shots for human eyes to review
  • Notifying testers of environment changes
I recently wrote a blog post on using Interactive Ruby to support manual testing with automation, which may help you see how the automated and manual thought processes come together.
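
To make that concrete, here is a minimal sketch of partial automation in Ruby. It assumes Watir as the browser-driving tool and a hypothetical create_test_order helper; the script does the setup and navigation, then leaves the judging to a human:

    require 'watir'   # assumption: Watir is the browser-driving tool here

    order_id = create_test_order('widget')               # hypothetical data-injection helper
    browser  = Watir::Browser.new
    browser.goto 'https://test-env.example.com/login'    # hypothetical test environment
    browser.text_field(:id => 'user').set('tester')
    browser.text_field(:id => 'password').set('secret')
    browser.button(:id => 'log_in').click
    browser.goto "https://test-env.example.com/orders/#{order_id}"
    browser.screenshot.save "order_#{order_id}.png"      # capture for human eyes to review
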
One more thing to consider: do not automate tests that will not be run repeatedly. Do not even script them. Keep notes on what was done just in case you need to do some forensic analysis. Just don’t automate them!

Pass/Fail Criteria

The first mistake I ever made in automated tests was thinking I had automated a test when all I had automated was the navigation. Without some criteria to know whether the test passed, the “result” is useless.

I like to know which tests failed separately from which did not complete. If the problem is not what you are testing for, it is an exception, not a failure. When a test fails, somebody has to figure out whether there is a failure in the product, whether the product changed without a corresponding change in the tests, or whether there is a failure in the tests themselves (presumably, a good practice was not followed). If there is a failure in the product, it becomes a defect that will be fixed or not fixed – either way, it is a known problem. With an incomplete test, the problem is unknown because we have not actually seen what happens at the end – you probably do not know whether the test would have passed.
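
Most test frameworks already make this distinction for you. As a rough MiniTest sketch (OrderPage and its methods are hypothetical), a wrong answer shows up as a failure while an unexpected exception shows up as an error – an incomplete test:

    require 'minitest/autorun'

    class OrderTotalTest < Minitest::Test
      def test_order_total
        page = OrderPage.new          # hypothetical class from our framework
        page.add_item('widget', 2)

        # A wrong total is reported as a Failure: the product misbehaved.
        assert_equal 19.98, page.total

        # If page.total raised instead (missing element, timeout), MiniTest would
        # report an Error: the test never reached its verdict.
      end
    end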

Layered Framework

The first thing I got right when I started automating tests was to create layers. In fact, I was so sure it was the only way to be successful that I was shocked to hear Dorothy Graham talk about the idea in Lightning Strikes the Keynote at StarWest 2010 (as if it were a new idea). Maybe some automation tools make this separation so difficult that some people don't do it.

The best way to describe this is to separate the what from the how. The what should be your test cases (like general instructions for a manual tester). The how should be your test framework (classes and methods). The what should be the business logic that needs to be tested (or the workflow, or whatever). The how should be the specific instructions for dealing with the interface (click this, fill in that). The what is your customer's actions. The how is the implementation of your interface that supports what the customer wants to do.

The purpose is to simplify maintenance when the product creators (the dev team, in my case) change the how. We often do not see that separation in step-by-step manual test scripts (also called checks by the experts). If we change the submit-form action from clicking a button to a gesture (such as a Bewitched nose wiggle), there is one place to change the code so that all tests incorporating that form submission will still work.

I have seen the separation in many different ways and levels. In some cases, the framework supported domain specific actions (log in, set field, submit form). In other cases, the framework supported transactions (create account, update profile, purchase subscription).
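
Here is a minimal sketch of that separation, assuming a Watir-style browser object and hypothetical page and field names:

    # The "how": one class that knows the interface details.
    class LoginPage
      def initialize(browser)
        @browser = browser
      end

      def log_in(user, password)
        @browser.text_field(:id => 'user').set(user)
        @browser.text_field(:id => 'password').set(password)
        @browser.button(:id => 'submit').click   # the one place to change if submit becomes a gesture
      end
    end

    # The "what": a test that reads like the business action it checks.
    def test_subscriber_can_log_in
      LoginPage.new(@browser).log_in('subscriber', 'secret')
      assert @browser.text.include?('Welcome')
    end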

Run 1 Check per Test

I like the idea of separating each test from the others. Not because I care how many test cases exist, but because I want failing and passing results kept separate. I do not find value in doing and checking 10 things in one test if they all pass together or all fail because one check failed. This also means going straight to the functionality under test. Did you want to test the navigation? Do that in another test.

What about an end-to-end test? For example, suppose you want to test creating an order, fulfilling the order, charging a credit card, and sending the completion notification. If it is a straight-through “complete” use case, then you are still testing one thing.
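
Sketched with the same hypothetical framework pieces, the navigation and the functionality each get their own small test:

    def test_navigation_to_orders_page
      @browser.goto 'https://test-env.example.com/orders'        # hypothetical URL
      assert @browser.title.include?('Orders')
    end

    def test_order_can_be_created
      order = OrdersPage.new(@browser).create_order('widget')    # hypothetical class
      assert order.confirmed?
    end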

Timing Dependencies

There are two timing-related considerations here. First, sometimes tests must wait. They wait for a web page to load, or they wait for JavaScript to complete. You should not wait by guessing how long is necessary. Many tools come with wait_until-type functions, such as browser.div(:name => 'javascript complete').wait_until_exists, which delay the script exactly as long as needed. If you don't get that in your tool, create loops that check for existence with a timeout.
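
A home-made wait might look like this sketch (the element lookup is a Watir-style assumption):

    # Poll a condition until it is true or the timeout expires.
    def wait_for(timeout = 30, interval = 0.5)
      deadline = Time.now + timeout
      until yield
        raise "Timed out after #{timeout} seconds" if Time.now > deadline
        sleep interval
      end
    end

    wait_for { browser.div(:id => 'javascript_complete').exists? }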

The second timing consideration is dependence on something happening when we don't know when it is going to happen. Suppose I create a situation that will trigger a notification when the notification scheduler runs, but I don't know exactly when it will run. I can either help it along by triggering the notification scheduler manually or … consider not doing that test. Nobody wants the automated tests hung up for 35 minutes.

Do Not Assume the Data Exists

At a previous job, the product under test came with a sample database, and I found it was often used to support manual tests. The problem with assuming the sample data will be there to support the test is that other tests could change or remove the data. Manual testers will adjust by creating the data at that point. An automated test will… stop.

When we converted the manual tests into automated tests, one of the first features we added to the framework was the ability to create the data needed to support each test. In that case, we used web service calls.
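
A setup call of that kind might look like this sketch, with a hypothetical endpoint and payload:

    require 'net/http'
    require 'json'
    require 'uri'

    # Create the account a test needs before driving the GUI.
    def create_test_account(name)
      uri = URI('https://test-env.example.com/api/accounts')     # hypothetical endpoint
      response = Net::HTTP.post(uri, { name: name }.to_json,
                                'Content-Type' => 'application/json')
      raise "Test setup failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
      JSON.parse(response.body)
    end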

The Most Reliable Way to Use Data

A long time ago I worked on a system that had almost no data to support my tests. I would spend an hour creating data through the web interface. I hope that nobody does what I did, not even with a web automation tool. I solved that problem by learning how to import XML files with the data needed to support my tests. Since then, I have used APIs (including web service APIs) and SQL scripts to inject data. Use the most reliable way possible.
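
Injecting data with SQL can be as short as this sketch (the table, columns, and the sqlite3 gem are stand-ins for whatever your product uses):

    require 'sqlite3'

    db = SQLite3::Database.new('test_env.db')
    db.execute('INSERT INTO customers (name, status) VALUES (?, ?)',
               ['Test Customer', 'active'])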

Clean Up After Your Tests

“A job isn’t finished until you have cleaned up after yourself,” said my father. I say the same thing about testing. For the sake of other tests that will run after yours, you may consider cleaning up.
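
With MiniTest, the teardown method is a natural place for that clean-up (create_test_account is the hypothetical helper from earlier; delete_test_account is its hypothetical counterpart):

    require 'minitest/autorun'

    class AccountTest < Minitest::Test
      def setup
        @account = create_test_account('cleanup-demo')
      end

      def teardown
        # Runs after every test, pass or fail, so later tests start clean.
        delete_test_account(@account['id'])
      end
    end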

Summary

I spent years learning these practices. Sometimes I learned the hard way; other times I was fortunate enough to learn them from a seasoned professional. I did not want to call these best practices because that would assume I know your context, and there are so many situations that I could not know the best practice for each of them. Consider each of these with the help of your team before making a decision. If the others who depend on the test results understand these practices, they can often make implementing them much easier.

Additional Acknowledgments

In addition to the ones already mentioned in the post, I would like to recognize my co-workers Jean-Philippe Boucharlat and David C. Cooper, with whom I worked at Hewlett-Packard, for sharing their insights into automation best practices.

Update Notes – February 10 2012

I have updated the original post based on feedback from kind readers.

9 thoughts on “Good Practices For Automating Functional Tests”

  1. Kobi Halperin (@halperinko)

    Very good article, and so true.
    One point to stress is enabling fast investigation of results – normally done using a drill-down log, which enables quick focusing on failed items.
    In some cases, this log has to be in sync with other logs, either by time stamps or other dedicated messages, enabling you to dive even further into the right spot in the product logs.

    Kobi @halperinko

  2. dmcnulla Post author

    Good point. Maybe a good follow-on is the whole plan. Some managers tend to think we create automation and then it just runs. Somebody has to be responsible for reviewing results and acting on them. Somebody has to have a build system, a deployment system, and a notification system.

  3. Adam Knight (@adampknight)

    Hi,

    Nice post. Totally agree on the data layers – wherever possible the test data and the test harness should be separate so that the automation harness logic can change and adapt without the need for extensive refactoring of the tests themselves.

    “Not Everything Should be Automated” – absolutely – I wrote a post on just this subject a while ago, not every test needs automating, particularly ones with low likelihood of regression and high setup/maintenance costs.

    Regarding the “Pass/Fail Criteria” I would question the absolute nature of your phrase “When the test fails, there is a defect that will be fixed or not fixed” – to me automation is a set of checkpoints around the system. When an unexpected result throws up a “Failure” this could mean a number of things, including:
    – the test automation has a problem
    – there is a valid change in product behaviour
    – there is new undesirable behaviour in the product
    I see a test “failure” as an alert that re-examination is required – initially of the automation around a feature, and then possibly retesting of the feature itself based on new behaviour, which may or may not be classified as a defect. The automation can tell us that something has changed, but I think it takes a human to investigate that change and its impact.

    Thanks again for an interesting post.

    Adam.
    —————————————-
    http://a-sisyphean-task.com/
    twitter: adampknight
    —————————————-

  4. dmcnulla Post author

    That is a great point about the failures. My friend and former colleague David Cooper used to call those the good bucket (tests that found regressions and required a defect) and the bad bucket (tests that failed because of changes in the software and required maintenance).

    I appreciate the comments!

  5. Jean-Philippe Boucharlat

    Nice article, Dave.

    “Run 1 Check per Test”: Definitely! You may also mention “go straight to the tested functionality”, which is a way to avoid testing several parts of the system at the same time. It’s yet another reason why data injection is a key factor of success.

    Regarding assertions, I like the bad bucket image. Assertions not only certify the “result” but are also very convenient for the guy who will analyze the test execution results. Well-written test scripts ideally report a failing assertion before being unable to proceed, which is most of the time pretty easy to interpret. For the ones who use libraries like FIT, try to make sure you will get some “red” before the “yellow”.

    Dave, this article would be worth developing further!

  6. Kobi Halperin (@halperinko)

    Another point:
    Don’t forget to supply a simple interface for manual testers to make use of the automation functions.
    Automation functions are very useful for saving time during manual testing, in scripted as well as in exploratory testing.
    When using a keyword-driven (KDT) abstraction level, it is quite easy to implement the same via a GUI.

  7. dmcnulla Post author

    I use two tools to support that concept. First, I create test-supporting classes that manual testers can instantiate and use in their testing. That allows them to test the features in more ways through variations in method parameters and in the order of method calls. Second, I create scripts that can be called from the command line with parameters that support some of the same variations. Let them explore, not just follow a script.

    Good point!
