Recently I have had numerous conversations with people at various tiers of companies all over the place who are toying with the idea of building their own test automation and continuous integration infrastructure. Since I have spent a considerable amount of time dealing with such undertakings I decided that it might be worth the time to brain dump some of the issues you may want to consider before you dive in.
Boxes, VM’s or Cloud?
A common first reaction is to take a couple of those old boxes sitting around to run the first “couple” tests you have. In some cases, this is the perfect solution. That is if you have a small application that rarely changes and only needs a daily test run (on one operating system and it’s available browsers). In my experience you can reasonably run one Windows VM without lagging the host machine unusable, which gives you two concurrent browser test jobs without worrying about process conflicts. Do remember that to do this correctly, you really want a machine dedicated to your CI system (which I will talk more about below).
“The Cloud”, is the 2009-2010 buzz word that makes all technology sound better, and in a lot of ways COULD be the ideal solution for test automation. You get the benefits of paying for only the cycles you use, having someone else manage the infrastructure and avoid those pesky licensing costs. However in my experience the setup is painful, the solutions (EC2, etc.) are slow and lacking some of the features to really do test automation well. For example, if you want to run your own CI instance and run your tests on demand in the cloud you will run into some pretty painful engineering problems. It’s not easy to instruct the “cloud” service to fire up a Windows VM and then have that VM connect to your CI instance and become an available slave. It’s doable with Linux, but last I checked – the features to do something similar with Windows simply weren’t there. Also do you really want to wait sometimes up to 15 minutes (also my experience with EC2) for the machines to come up before you can even start running your tests?
What CI System?
The point of this article is not to recommend solutions, it is to encourage questions. However, outlining all of the possible CI systems would take forever so I will simply say that I wound up using Hudson. The reasoning includes a very open and functional Open Source community, with smart contributors willing to take a few minutes to respond and help me out. I also found it possible (not simplistic) to build plugins to customize the things that I needed changed. Many people out there swear by Cruise Control, or Build Bot and I would highly encourage you to do some research and pick the solution that you feel will allow you to be the most productive. For example if you plan to use Windmill and EC2, you may want to do some reading about the Amazon EC2 and Windmill plugins available for Hudson and see what comparable tools are available.
Which test framework?
When it comes to browser based web testing tools I really think you need to pick the one that fits your needs the best. A great example of this was in my needs to automate functionality contained in iframes being served over HTTPS from a different domain, the only solution (after weeks of trial and error) turned out to be WatiN. Of course, writing and building tests in C#.net wasn’t going to be an easy sell so IronWatin was invented as a means to write WatiN tests in Python.
Watir has captured a lot of the Ruby community and has recently been moving towards consolidating the separated browser projects into one, which will significantly improve the ease of use.
Selenium has a thriving community, lots of available consulting support, integrates well into a Java environment and offers the Grid project. You can also avoid all of the work involved in running your own system by writing your tests in Selenium and then offloading them to a company like Sauce Labs if you are willing to pay for it.
What do we do about Flash/AS3 automation?
How much work goes into maintenance and software upgrades?
Depending on the machine setup you are going with, this will vary. Obviously if you have a box and a VM on it you can manually go through the process of upgrading the browsers on each, installing patches and security fixes for the OS etc. But if you went with the VM solution, you need to come up with a way to deploy updates to all the machines in your pool. An Open Source solution that came up near the top of my search is WPKG, but like the rest of the tools on this page – there are many solutions and you will want to do your research. Some of the maintenance you will be dealing with can be done as a system job, or run by your CI system. A good example of this is to remove data, test files, source repositories etc. that accumulate on your test running machines.After a while, these files in combination with temporary internet files from the browsers and system tmp files start to slow things down.
This piece is very important to take into consideration from the beginning, because once you have 15 VM’s running tests — doing anything manually becomes a major chore. You also need to be cognizant of that fact that if you chose the cloud testing solution you will be managing your own test running images used to boot the VM’s. Every time you want to make updates or changes to that image, you get to go through the whole process of baking it and uploading it to the cloud hosting service. In my experience, this process is NOT enjoyable or quick.. so be prepared to invest some serious time.
What is the strategy for scaling and expanding?
Clearly this will be dictated by the rest of decisions you made, but I think when you get to this point, the idea of buying and manually setting up more and more physical boxes starts to break down. Buying more and more machines to sit there running tests simply seems like a bad use of resources (and desk space). VM’s allow you to quickly replicate images and expand your arsenal as long as the host machine has the hardware resources to power it without negatively effecting test run times on all the others. There are also solutions out there that allow you to boot and shutdown VM’s on the fly, which provides some interesting possibilities in juggling system resources.
I have found that since Windows is the common platform that can run all the browsers I care about, having a large pool of identical VM’s running all the time is an easy way to queue up 100′s of tests and get results in a reasonable amount of time. I do think that this is the aspect that cloud services start to become more appealing. The idea of spinning up more and more virtual machines in the cloud (with essentially endless capacity) makes the idea of scaling those tests considerably less terrifying. If you can get over the generally slow spin up times, and have come up with a strategy of dynamically harnessing and adding those machines to the pool – you may have it made!
What format/language is best for our tests?
What operating systems and browsers do we need?
The answer to this question should come from the metrics of your user base, and will also help you narrow down a testing framework and your test machines. Some of the available frameworks simply won’t run on Linux, or support IE6. Are all your users using Google Chrome? Then you should probably make sure the test framework you use has a Chrome launcher. Historically I have concentrated about 90% of the available VM resources on the most popular platform (usually Windows), with the most popular browsers (Firefox latest release, IE 7 or 8 depending on the users, Safari Windows latest release, and Chrome). This gives me a pool of work horse machines that can crank out tests representing the majority. The other 10% would be divided into the higher percentage minorities which probably includes a MacOSX slave running FF and Safari and a Linux machine running FF and Konqueror.
Can we support both functional and unit tests?
I have found that having your own test infrastructure really makes this one easier. Since your resources will probably be on the same network as your codebase, you can easily access and run unit tests.. however as soon as you start sending your tests off to the cloud you are dealing with some security/privacy issues and engineering challenges. If inorder to unit test your code you need the entirety of your code base available, sending a copy of it off to your image on the cloud for every test (job, run or even change set) over and over could become a major strain on your system and simply sounds like a bad idea. If you are counting on unit tests you will at least want a machine on your network available as a slave from the CI system for those jobs.
How do we report results, and stay tuned into failures?
Most CI solutions have many ways they can be configured to alert you of a failure, from email, irc, jabber to a phone call you can usually find some solution that will get your attention. At this point in time the norm appears to be jUnit compatible results in XML. I’m not a huge fan, but the available tools for parsing and aggregating jUnit into something useful is very appealing. If you are okay with a true/false, that will always work out of the box, but you will need to be prepared to put in a little extra effort in both your test development and test job setup to generate the jUnit report files.
What role does automation need to play in our process?
For maximal results you need to make the leap where QA and Development both understand that a failing job in the CI system is a show stopper and must be investigated immediately. This way people honor the continual nature of continuous integration. It’s job is to catch problems shortly after they are broken, point you directly at them, and continue failing until you have solved the issue or fixed the test. If this isn’t the process you want, having continuous integration may not be the solution you are looking for.
Another effective strategy I have seen is to base release viability on the status of your CI system. Don’t let the product go out the door until the ‘thoroughly defined’ suite of functional and unit tests running in continuous integration are all running and passing. It is easy to push off the process of updating failing tests until “later”, but everyone is busy, and later is usually never.
What part of our application should we automate?
This really should be computed by time and resources, as much as I would like to say “everything” I am well aware of it’s low probability. Pick your application flows that make you money, or really mean a lot to your users likely experience. Ensure that your application loads, they can take the happy path, give you some money and then leave. At least this way you can sleep at night knowing that, they may not be able to change their profile information but they can still pay your salary.
I hope that I sufficiently communicated my point and passed on some useful and informative tidbits. Setting up automation infrastructure that has any chance of actually doing it’s job is a major investment of resources and should be well planned based on individualized needs. I hope this saves someone some time and energy.
Best of luck, and happy testing.