Commercial Off-the-Shelf (COTS) software is becoming an ever-larger part of organizations' IT strategy for building and delivering systems. A common perception is that because a vendor developed the software, the vendor carries much of the testing responsibility. However, organizations buying and deploying COTS-based systems are learning that the test effort is not necessarily reduced; it shifts to types of testing not often seen on in-house developed systems.

In this article, we will explore the challenges of testing COTS-based applications and outline solution strategies for addressing each of them.

Case Study

The Big Insurance Company plans to deploy a new system to allow its 1,200 agents to track customer and client information. Instead of writing its own application, the company has chosen to buy a site license for a popular contact management application. The solution appears cost-effective: the total cost of the software deployed to all agents will be about $100,000, compared to an in-house development estimate of $750,000. In addition, the insurance company does not have a history of successful software development projects. There are, however, some considerations the company realized only after making the purchase decision:

  • New versions of the application will be released each year.
  • An annual maintenance fee will be required for vendor support.
  • The interfaces between the contact management software and the office suite currently being used are not as seamless as originally thought.
  • There are some computers being used by agents that are too old or too new for the software.
  • Each agent has an existing client contact database of about 1,000 people, but the data is stored in a variety of products and database formats.

In planning the purchase and deployment of the application, the project manager allocated ample time to perform a pilot deployment with 10 field agents using a variety of computers. Initial feedback from the 10 agents indicated that the new application worked correctly, but some tasks were hard to understand and perform. Management felt that, with experience, people would learn the system and find it easier to use.

The deployment plan was to have all agents download and install the new application over a weekend. Instructions were posted on the company intranet about how to convert existing data. A help line was established to provide support to the agents. On deployment weekend, 98% of the agents downloaded the new software and installed it on their notebook computers. About 20% of the agents had problems installing the software due to incompatibilities with hardware and operating systems. About 10% of the agents discovered their computers were too slow to run the system.

The real problems, however, started on Monday when the agents began using the system. Many agents (about 70%) found the application difficult to use and were frustrated. In addition, all of the agents found that the new application could not perform some of the functions the old contact databases could. Fortunately, many of the agents had kept their old contact databases.

After four weeks, the company decided to implement another product, but this time more field testing was performed, other customers of the product were consulted as references, and testing was extended to cover interoperability, compatibility, correctness, and usability. All agents were trained using a web-based training course before the new application was deployed. The second deployment was a huge success.

In the first project, the results were:

  • A loss of time and productivity for the agents
  • A loss of credibility for the project team and the IT department
  • A loss of sales as agents could not use the system to follow up with prospects quickly

In the second deployment:

  • The replacement product was more usable and had more useful features
  • Agents were trained to avoid confusion in how to use the product
  • Testing was more complete, which gave a higher level of confidence in deploying the application

In comparing the deployments, the company learned that:

  • Application features are just one aspect of the product's quality.
  • End-users must understand how to use the product.
  • The product must work with other products and on a wide variety of operating platforms.
  • Although the vendor tested the product, the customer has responsibility to test items the vendor can't test.
  • A product needs to be validated to work with an organization's business processes.

This case study is not a true story, but it is based on representative projects I have seen in acquiring and deploying COTS products. From this example, we can see the need for testing, but what are the issues in COTS testing, and how do we solve them?

Unique Challenges of Testing COTS-based Applications

Challenge #1 - COTS is a Black Box

The customer has no access to source code in COTS products. This forces testers to adopt an external, black-box test approach. Although black-box testing is certainly not foreign to testers, it limits the tester's view of the product while expanding the scope of testing. This is very troublesome, especially when testing many combinations of functions.

Functional testing is redundant by its very nature. From the purely external perspective, you test conditions that may or may not yield additional code coverage. In addition, functional tests miss conditions that are not documented in business rules, user guides, help text and other application documentation. The bottom line is that in functional testing, you can test against a defined set of criteria, but there will likely be features and behavior that the criteria will not include. That's why structural testing is also important. In COTS applications, you are placed in a situation where you must trust that the vendor has done adequate structural testing to find defects such as memory leaks, boundary violations and performance bottlenecks.

Solution Strategies: Avoid complex combinations of tests and the idea of "testing everything." Instead, base tests on functional or business processes used in the real-world environment. The initial tendency of people testing COTS applications is to start defining tests based on user interfaces and all of the combinations of features. This is a slippery slope that can lead to many test scenarios, some meaningful and others with little value.
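To make this concrete, here is a minimal sketch of a business-process test for the contact management example from the case study. The contact_app module, its functions, and the workflow steps are hypothetical placeholders for whatever access the real product provides (UI automation, file import/export, or an API).

    # A sketch of a business-process test for a COTS contact manager.
    # contact_app and its functions are hypothetical placeholders for whatever
    # access the product actually provides (UI automation, file import, API).
    import contact_app  # hypothetical wrapper around the COTS product

    def test_agent_follows_up_with_prospect():
        """Exercise one real-world agent workflow end to end, rather than
        enumerating combinations of individual interface features."""
        # Step 1: import an existing client list in the format agents actually use
        contacts = contact_app.import_contacts("sample_agent_contacts.csv")
        assert len(contacts) == 1000, "imported count should match the source file"

        # Step 2: record a follow-up task against a prospect
        prospect = contacts[0]
        task = contact_app.schedule_follow_up(prospect, days_from_now=3)

        # Step 3: confirm the follow-up appears where the agent will look for it
        assert task in contact_app.tasks_due_within(days=7)

A handful of tests like this, drawn from how agents actually work, usually reveals more than hundreds of feature-combination scenarios.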

Challenge #2 - Lack of Functional and Technical Requirements

The message that testing should be based on testable requirements has been communicated well. Requirements-based testing has been taught so heavily, however, that people are forgetting how to test when there are no requirements, or how to approach testing from other angles. Testing from the real-world perspective is validation, and validation is the primary kind of testing a customer or user performs on a COTS product.

The reality is that requirements-based testing is a reliable technique, but you need testable requirements first. With COTS, you may have defined user needs, but you do not have the benefit of documents that specify those needs to the developer for building the software. In fact, the developer of the software may not have had documented requirements to test against either. For the customer, this means you have to look elsewhere for test cases, such as:

  • Exploring the application
  • Business processes
  • User guides

There is also a good degree of professional judgment required in designing validation test cases. Finding test cases is one thing. Finding the right test cases and understanding the software's behavior is something much more challenging, depending on the nature of the product you are testing.

Solution Strategy:

  • Design tests that are important to how you will use the product. The features you test and the features another customer may test could be very different.
  • Consider the 80/20 rule as you define tests by identifying the 20% of the application's features that will meet 80% of your needs (a small sketch of this selection follows below).
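One concrete way to apply the 80/20 idea is to rank features by how often your users actually exercise them and focus validation on the smallest set that covers roughly 80% of that usage. The feature names and usage counts below are invented purely for illustration.

    # A sketch of 80/20 test selection: rank features by observed or estimated
    # usage and keep the smallest set covering ~80% of total usage.
    # All feature names and counts here are invented for illustration.
    feature_usage = {
        "add_contact": 4200,
        "schedule_follow_up": 3100,
        "search_contacts": 2900,
        "import_contacts": 800,
        "print_labels": 350,
        "merge_duplicates": 150,
        "custom_report_builder": 60,
    }

    total = sum(feature_usage.values())
    selected, covered = [], 0
    for feature, count in sorted(feature_usage.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(feature)
        covered += count
        if covered / total >= 0.8:
            break

    print(f"Focus validation on: {selected} ({covered / total:.0%} of observed usage)")

With these invented numbers, three of the seven features account for nearly 90% of usage, which is where the bulk of the validation effort belongs.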

Challenge #3 - The Level of Quality is Unknown

The COTS product will have defects; you just don't know where they are or how many there will be. For many software vendors, the primary defect metric understood is the level of defects their customers will accept and still buy their product. I know that sounds rather cynical, but once again, let's face facts. Software vendors are in business to make a profit. Although perfection is a noble goal and (largely) bug-free software is a joy to use, a vendor will not go to needless extremes to find and fix some defects. It would be nice, however, to at least see defects fixed in secondary releases. Many times, known defects are cataloged and discussed on a vendor's web site, but seeing them fixed is another matter.

This aspect of COTS is where management may have the most unrealistic expectations. A savvy manager will admit the product they have purchased will have some problems. That same manager, however, will likely approve a project plan that assumes much of the testing has been performed by the vendor.

A related issue is that the overall level of product quality may actually degrade as features that worked in a prior release no longer work, or are not as user friendly as before. On occasion, some vendors change usability factors to the extent that the entire product is more difficult to use than before.

Solution Strategy:

  • Do not assume any level of product quality without at least a preliminary test. A common strategy is not to be an early customer of a new release. It's often wise to wait and see what other users are saying about the product. With today's trade press, there are plenty of forums in which to find out what informed people are saying about new releases.
  • Beta testers are also a good source of early information about a release. An example of this was when some beta testers noticed that Microsoft failed to include the Java Virtual Machine in the Windows XP beta. Prior to the revelation, Microsoft had not indicated their intention. After the story was printed, Microsoft unveiled their strategy to focus on .Net.

Challenge #4 - Unknown Development Processes and Methods

Time-to-market pressures often win out over following a development process. It's difficult, if not impossible, for a customer to see what methods a vendor's development team uses in building software. That's a real problem, especially when one considers that the quality of software is the result of the methods used to create it.

Here are some things you might like to know, but probably will not be able to find out:

  • Were peer reviews used throughout the project?
  • How experienced are the developers?
  • Which phases of testing were performed?
  • Which types of testing were performed?
  • Are test tools used?
  • Are defects tracked?
  • How do developers collaborate on projects?
  • How are product features conceived and conveyed to developers?
  • What type of development methodology was used?
  • Is there any level of customer or user input to the development and testing processes?

Solution Strategies:

  • This is a tough issue to deal with because the vendors and their staffs do not want to reveal trade secrets. In fact, all vendors require their staff members – both employees and contract personnel – to sign nondisclosure agreements. Occasionally, you will see books or articles about certain vendors, but these are often subjective works and hardly ever address specific product methods.
  • Independent assessments may help, but like any kind of audit or review, people know what to show and what to hide. Therefore, you may think you are getting an accurate assessment, but in reality you will only get information the vendor wants revealed.

Challenge #5 - Compatibility Issues

Software vendors, especially those in the PC-based arena, have a huge challenge in trying to create software that will work correctly and reliably in a variety of hardware and operating system environments. When you also consider peripherals, drivers, and many other variables, the task of achieving universal compatibility becomes practically impossible. Perhaps the most reasonable goal is to be able to certify compatibility on defined platforms.

The job of validating software compatibility falls to the customer and must be performed in the customer's own environments. With the widely diverse environments in use today, it's a safe bet that each environment is unique in some respect.
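In practice, a customer-side compatibility check often amounts to running the same small smoke test across every configuration actually in the field. The platform matrix and the checks below are hypothetical; in a real lab, the test body would drive the product's installer and exercise a few core functions on each machine.

    # A sketch of a customer-side compatibility matrix: one small smoke test per
    # environment actually used in the field. Entries and checks are hypothetical.
    import pytest

    # Configurations drawn from what your users actually run, not the vendor's list.
    FIELD_ENVIRONMENTS = [
        {"os": "Windows 10", "ram_gb": 4, "office_suite": "2016"},
        {"os": "Windows 11", "ram_gb": 8, "office_suite": "365"},
        {"os": "Windows 11", "ram_gb": 16, "office_suite": "365"},
    ]

    @pytest.mark.parametrize("env", FIELD_ENVIRONMENTS)
    def test_product_smoke_test(env):
        """Install and exercise a few core functions on each field configuration."""
        # Placeholder check; a real test would install the product, launch it,
        # import a sample contact file, and open a record on this configuration.
        assert env["ram_gb"] >= 4, "below the product's published minimum memory"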

Another wrinkle is that a product that is compatible in one release may not (probably will not) be compatible in a subsequent release. Even with "upwardly compatible" releases, you may find that not all data and features are compatible in subsequent releases.

Finally, be careful to consider compatibility between users in your organization who are using varying release levels of the same product. When you upgrade a product version, you need a plan that addresses:

  • When users will have their products upgraded
  • Which users will have their products upgraded
  • Hardware and other upgrades that may be needed
  • Data conversions that may be needed
  • Contingency plans in case the upgrade is not successful

Solution Strategies:

  • Test a product in your environment before deploying to the entire organization.
  • Have an upgrade plan in place to avoid incompatibility between users of the same product.

Challenge #6 - Uncertain Upgrade Schedules and Quality

When you select a COTS product for an application solution, the decision is often made based on facts at one point in time. Although the current facts about a product are the only ones that are known and relevant during the acquisition process, the product's future direction will have a major impact on the overall return on investment for the customer. The problem is that upgrade schedules fluctuate greatly, are impacted by other events such as new versions of operating systems and hardware platforms, and are largely unknown quantities in terms of quality.

When it comes to future product quality, vendor reputation carries a lot of weight, and past performance of the product is often an indicator of future performance. This should motivate vendors to maintain high levels of product quality, but we find ourselves back at the same point: as long as people keep buying the product at its current level of quality, the vendor has little reason to improve it other than to compete with vendors of similar products.

Solution Strategies:

Keep open lines of communication with the vendor. This may include attending user group meetings, participating in online forums and focus groups, and becoming a beta tester. Find out as much as you can about planned releases and:

  • don't assume the vendor will meet the stated release date, and
  • don't assume a level of quality until you see the product in action in your environment(s).

Challenge #7 - Varying Levels of Vendor Support

Vendor support is often high on the list of acquisition criteria. However, how can you know for sure your assessment is correct? The perception of vendor support can be a subjective one. Most people judge the quality of support based on one or a few incidents.

With COTS applications, you are dealing with a different support framework than with other types of applications. When you call technical support, the technician may not differentiate between a Fortune 100 customer and an individual user at home.

Furthermore, when you find defects and report them to the vendor, there is no guarantee they will be fixed, even in future releases of the product.

Solution Strategies:

  • Talk to other users about their support experiences, keeping in mind that people will have a wide variety of experiences, both good and bad.
  • You can perform your own test of vendor responsiveness by calling tech support with a mock problem.

Challenge #8 - Difficulty in Regression Testing and Test Automation

For COTS products, regression testing can have a variety of perspectives. One perspective is to view a new release as a new version of the same basic product. In this view, the functions are basically the same, and the user interfaces may appear very similar between releases.

Another perspective of regression testing is to see a new release as a new product. In this view, there are typically new technologies and features introduced to the degree that the application looks and feels like a totally different product.

The goal of regression testing is to validate that functions work correctly as they did before an application was changed. For COTS, this means that the product still meets your needs in your environment as it did with the previous version used. Although the functions may appear different at points, the main concerns (sketched in code after the list below) are that:

  • Features you use often have not been dropped
  • Performance has not degraded
  • Usability factors have not degraded
  • New features do not detract from core application processes
  • New technology does not require major infrastructure changes
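These concerns can be expressed as a short regression suite that compares the new release against a baseline captured from the version currently deployed. The baseline values and the contact_app wrapper below are hypothetical.

    # A sketch of customer-side regression checks against a baseline recorded
    # from the currently deployed version. Numbers and hooks are hypothetical.
    import time
    import contact_app  # hypothetical wrapper around the COTS product

    BASELINE = {
        "features": {"import_contacts", "schedule_follow_up", "search_contacts"},
        "search_seconds": 2.0,  # measured on the version currently in production
    }

    def test_features_we_rely_on_still_exist():
        assert BASELINE["features"] <= set(contact_app.available_features())

    def test_search_performance_has_not_degraded():
        start = time.perf_counter()
        contact_app.search_contacts("Smith")
        elapsed = time.perf_counter() - start
        # Allow some headroom, but flag a significant regression from the baseline.
        assert elapsed <= BASELINE["search_seconds"] * 1.5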

It's hard to discuss regression testing without discussing test automation. Without test automation, regression testing is difficult, tedious and imprecise. However, test automation of COTS products is challenging due to:

  • Changing technologies between releases and versions
  • Low return on investment
  • The large scope of testing
  • Test tool incompatibility with the product

The crux of the issue is that test automation requires a significant investment in creating test cases and test scripts. The only ways to recoup the investment are:

  • Finding defects whose potential impact outweighs the cost of creating the tests
  • Repeating the tests enough times to outweigh the manual testing effort

While it is possible that regression testing of a COTS product will find a defect with a high potential loss value, the more likely defects will be found in other forms of testing and will relate more to integration, interoperability, performance, compatibility, security and usability than to functional correctness.

This leaves us with an ROI based on the repeatability of the automated tests. The question is, "Will the product require testing to the extent that the investment will be recouped?"

If you are planning to test only one or two times per release, probably not. However, if you plan to use automated tools to test product performance on a variety of platforms, or to just test the correctness of installation, then you may well get a good return on your automation investment.
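A rough way to frame that question is a break-even calculation: divide the automation investment by the effort saved each time the suite runs. The figures below are invented for illustration; substitute your own estimates.

    # A back-of-the-envelope break-even calculation for automating a COTS
    # regression suite. All figures are invented; substitute your own estimates.
    automation_build_hours = 120   # scripting, tool setup, maintenance allowance
    manual_hours_per_cycle = 16    # one full manual pass through the regression suite
    automated_hours_per_cycle = 2  # reviewing results and maintaining scripts

    hours_saved_per_cycle = manual_hours_per_cycle - automated_hours_per_cycle
    break_even_cycles = automation_build_hours / hours_saved_per_cycle

    print(f"Automation pays for itself after about {break_even_cycles:.0f} test cycles")

With these numbers, the investment is recovered after roughly nine cycles, which is plausible if each release is retested across several platforms and unlikely if you test only once or twice per release.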

For the scope concern, much of the problem arises from the inability to identify effective test cases. Testing business and operational processes, rather than combinations of interface functions, will often help reduce the scope and make the tests more meaningful.

Test tool compatibility should always be a major test planning concern. Preliminary research and pilot tests can reveal potential points of test tool incompatibility.

Solution Strategies:

  • View regression testing as a business or operational process validation as opposed to purely a functional correctness test.
  • Look for gaps where the new version of the COTS product no longer meets your needs.
  • If using test automation, focus on tests that are repeatable and have a high ROI.
  • Perform pilot tests to determine test tool compatibility.

Challenge #9 - Interoperability and Integration Issues

When dealing with the spider web of application interfaces and the processing on all sides of those interfaces, the complexity of testing interoperability becomes quite high.

Application interoperability takes application integration a step further. While integration addresses the ability to pass data and control between applications and components, interoperability addresses the ability of the sending and receiving applications to use the passed data and control to create correct processing results. It's one thing to pass the data; it's another thing for the receiving application to use it correctly.
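The distinction shows up clearly in test code: an integration check stops at confirming the data arrived, while an interoperability check confirms the receiving application produced correct results from it. The contact_app and office_suite wrappers and their functions below are hypothetical.

    # A sketch contrasting an integration check with an interoperability check.
    # contact_app and office_suite are hypothetical wrappers around two products.
    import contact_app
    import office_suite

    def test_integration_contact_export_arrives():
        """Integration: data can be passed from one application to the other."""
        exported = contact_app.export_contacts("agents.csv")
        imported = office_suite.import_mail_merge_source("agents.csv")
        assert len(imported) == len(exported)

    def test_interoperability_mail_merge_uses_the_data_correctly():
        """Interoperability: the receiving application produces correct results
        from the passed data, not just a successful transfer."""
        contact_app.export_contacts("agents.csv")
        office_suite.import_mail_merge_source("agents.csv")
        letter = office_suite.generate_mail_merge_letter(recipient_index=0)
        assert "<<FirstName>>" not in letter  # merge fields were actually resolved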

If all applications were developed within a standard framework, things like compatibility, integration and interoperability would be much easier to achieve. However, there is a tradeoff between standards and innovation. As long as rapid innovation and time-to-market are primary business motivators, standards are not going to be a major influence on application development.

Some entities, such as the Department of Defense, have developed environments to certify applications as interoperable with an approved baseline before they can be integrated into the production baseline. This approach achieves a level of integration, but limits the availability of solutions in the baseline. Other organizations have made large investments in interoperability and compatibility test labs to measure levels of interoperability and compatibility. However, the effort and expense to build and maintain test labs can be large. In addition, you can only go so far in simulating environments where combinations of components are concerned.

Solution Strategies:

  • Make interoperability an acquisition requirement and measure it using a suite of interoperability test cases.
  • Base any test lab investment on reasonable levels of platform and application coverage, realizing you will not be able to cover all possible production environments.
  • Prioritize interoperability tests to model your most critical and most frequently used applications.
  • Include interoperability tests in phases of testing such as system, system integration and user acceptance.

Summary

Testing COTS-based applications will be a growing area of concern as organizations rely more on vendor-developed products to meet business needs. The fact that a vendor developed the product does not relieve the customer of the responsibility to test that the product will meet user and business needs.

In addition, the product may work in some environments but not others. Testing COTS products relies heavily on validation, which seeks to determine correctness and fitness for use based on real-world cases and environments as opposed to documented specifications. Although aspects of the COTS product may be described in business needs and acquisition criteria, many tests of the product will likely be based on a customer's daily work processes.

The bottom line is that successfully testing COTS products is possible, but requires a different view of risk, processes, people and tools.