Not long ago, we set out on a mission to perform a full scalability test on one of our products (Trend Micro Deep Security). After some quick, back-of-the-napkin calculations we discovered that we needed somewhere on the order of 35 Dell 710s with virtualization to complete our test. Finding that many available servers is a tall order for any company, and buying that many servers for a month-long test was completely out of the question (try asking your managers for 35 servers and see how pale they go!).
Naturally, we turned to the cloud to help us out. Amazon Web Services (AWS) was a good fit for providing the large number of smaller-scale resources we needed. In our case, micro and small instances were perfect for simulating a large manager/agent architecture, with each instance simulating dozens of agents.
One thing to be aware of: you can’t simply open an account and request 1000 micro instances. The AWS capacity team works with you, via good old e-mail, to plan the right mixture of instance types, platforms, availability zones and regions that works for both your project and AWS. Once the configuration was settled, we designed the tools we needed to rapidly scale our test environment up and down. These included custom AMIs (machine image templates) and tools that used the AWS APIs for discovery and resource monitoring.
We ran into our share of quirks on the AWS platform, including time skew issues under high CPU load, invalid CPU information for micro instances in CloudWatch and, of course, the inevitable price wars over spot instances! Due to the nature of the tests, not everything went according to schedule. On occasion our plan to scale up was met with “Insufficient Capacity” error messages from the AWS API. It helps to have backup plans for when a particular instance type or region is in high demand.
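To make the idea of a backup plan concrete, here is a minimal sketch of an ordered fallback across regions and instance types. It is an illustration only, not the tooling from our test: `launch_with_fallback`, the `InsufficientCapacity` exception, the `fake_launch` stand-in and the region and instance type names are all hypothetical, and a real launcher would call the AWS API and translate its capacity error into this exception.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class InsufficientCapacity(Exception):
    """Hypothetical stand-in for a capacity error returned by the provisioning API."""


Candidate = Tuple[str, str]  # (region, instance_type)


def launch_with_fallback(
    candidates: Iterable[Candidate],
    count: int,
    launch: Callable[[str, str, int], List[str]],
) -> Tuple[Optional[Candidate], List[str]]:
    """Try each (region, instance_type) in order until one launch succeeds.

    Returns the candidate that worked and the launched instance IDs,
    or (None, []) if every candidate was out of capacity.
    """
    for region, instance_type in candidates:
        try:
            instance_ids = launch(region, instance_type, count)
        except InsufficientCapacity:
            continue  # This option is in high demand; move on to the next one.
        return (region, instance_type), instance_ids
    return None, []


def fake_launch(region: str, instance_type: str, count: int) -> List[str]:
    """Assumed test double: pretends micro instances are unavailable everywhere."""
    if instance_type == "t1.micro":
        raise InsufficientCapacity(f"no {instance_type} capacity in {region}")
    return [f"{region}-{instance_type}-{i}" for i in range(count)]


# Illustrative preference order: first choice, then two backups.
plan = [
    ("us-east-1", "t1.micro"),
    ("us-east-1", "m1.small"),
    ("us-west-1", "m1.small"),
]
chosen, ids = launch_with_fallback(plan, 3, fake_launch)
print(chosen, len(ids))
```

Passing the launcher in as a function keeps the fallback order separate from the API calls, so the plan can be rehearsed without starting a single instance.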
Once we had addressed our various issues, AWS proved to be an incredible resource for finding scalability issues and quickly testing improvements. The ability to rapidly provision hundreds of VMs from a single AMI allowed us to scale up and down depending on the requirements of the test.
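The sizing arithmetic behind scaling up and down can be sketched in a few lines. This is a rough illustration under stated assumptions: the function names are hypothetical, and the figures in the usage lines (1000 simulated agents, 24 agents per instance, 50 running instances) are made-up values, not numbers from our test.

```python
import math


def instances_needed(target_agents: int, agents_per_instance: int) -> int:
    """Number of instances required to simulate the target number of agents."""
    if agents_per_instance <= 0:
        raise ValueError("agents_per_instance must be positive")
    return math.ceil(target_agents / agents_per_instance)


def scaling_delta(target_agents: int, agents_per_instance: int, running: int) -> int:
    """Positive result: instances to launch. Negative result: instances to terminate."""
    return instances_needed(target_agents, agents_per_instance) - running


# Illustrative values only.
needed = instances_needed(1000, 24)
delta = scaling_delta(1000, 24, running=50)
print(needed, delta)
```

A positive delta means launching more instances from the AMI, and a negative one means terminating the surplus once a test run no longer needs it.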
In the end, we met our scalability goals and spent a fraction of what the test would have cost in-house, and our manager kept his rosy complexion.