Monday, July 23, 2007

To Access / Run Command

Accessibility Controls
access.cpl

Add Hardware Wizard
hdwwiz.cpl

Add/Remove Programs
appwiz.cpl

Administrative Tools
control.exe admintools

Automatic Updates
wuaucpl.cpl

Bluetooth Transfer Wizard
fsquirt

Calculator
calc

Certificate Manager
certmgr.msc

Character Map
charmap

Check Disk Utility
chkdsk

Clipboard Viewer
clipbrd

Command Prompt
cmd

Component Services
dcomcnfg

Computer Management
compmgmt.msc

Date and Time Properties
timedate.cpl

DDE Shares
ddeshare

Device Manager
devmgmt.msc

DirectX Control Panel (if installed)
directx.cpl

DirectX Troubleshooter
dxdiag

Disk Cleanup Utility
cleanmgr

Disk Defragmenter
dfrg.msc

Disk Management
diskmgmt.msc

Disk Partition Manager
diskpart

Display Properties
control.exe desktop

Display Properties
desk.cpl

Display Properties (w/Appearance Tab Preselected)
control.exe color

Dr. Watson System Troubleshooting Utility
drwtsn32

Driver Verifier Utility
verifier

Event Viewer
eventvwr.msc

File Signature Verification Tool
sigverif

Findfast
findfast.cpl

Folders Properties
control.exe folders

Fonts
control.exe fonts

Fonts Folder
fonts

Free Cell Card Game
freecell

Game Controllers
joy.cpl

Group Policy Editor (XP Professional)
gpedit.msc

Hearts Card Game
mshearts

Iexpress Wizard
iexpress

Indexing Service
ciadv.msc

Internet Properties
inetcpl.cpl

Java Control Panel (if installed)
jpicpl32.cpl

Java Control Panel (if installed)
javaws

Keyboard Properties
control.exe keyboard

Local Security Settings
secpol.msc

Local Users and Groups
lusrmgr.msc

Logs You Out of Windows
logoff

Microsoft Chat
winchat

Minesweeper Game
winmine

Mouse Properties
control.exe mouse

Mouse Properties
main.cpl

Network Connections
control.exe netconnections

Network Connections
ncpa.cpl

Network Setup Wizard
netsetup.cpl

Nview Desktop Manager (if installed)
nvtuicpl.cpl

Object Packager
packager

ODBC Data Source Administrator
odbccp32.cpl

On Screen Keyboard
osk

Opens AC3 Filter (if installed)
ac3filter.cpl

Password Properties
password.cpl

Performance Monitor
perfmon.msc

Performance Monitor
perfmon

Phone and Modem Options
telephon.cpl

Power Configuration
powercfg.cpl

Printers and Faxes
control.exe printers

Printers Folder
printers

Private Character Editor
eudcedit

Quicktime (If Installed)
QuickTime.cpl

Regional Settings
intl.cpl

Registry Editor
regedit

Registry Editor
regedt32

Removable Storage
ntmsmgr.msc

Removable Storage Operator Requests
ntmsoprq.msc

Resultant Set of Policy (XP Professional)
rsop.msc

Scanners and Cameras
sticpl.cpl

Scheduled Tasks
control.exe schedtasks

Security Center
wscui.cpl

Services
services.msc

Shared Folders
fsmgmt.msc

Shuts Down Windows
shutdown

Sounds and Audio
mmsys.cpl

Spider Solitaire Card Game
spider

SQL Client Configuration
cliconfg

System Configuration Editor
sysedit

System Configuration Utility
msconfig

System File Checker Utility
sfc

System Properties
sysdm.cpl

Task Manager
taskmgr

Telnet Client
telnet

User Account Management
nusrmgr.cpl

Utility Manager
utilman

Windows Firewall
firewall.cpl

Windows Magnifier
magnify

Windows Management Instrumentation (WMI)
wmimgmt.msc

Windows System Security Tool
syskey

Launches Windows Update
wupdmgr

Windows XP Tour Wizard
tourstart

Wordpad
write

Friday, July 20, 2007

"Testing Relief" 1.0.0 - Test Your Program Code Selectively

Iron Ant today announces the availability of "Testing Relief" 1.0.0, a unique pre-testing tool for applications created on the Microsoft .NET Framework. Designed for developers following the RUP discipline, "Testing Relief" applies a brand-new approach to software testing, allowing you to increase the quality of software testing while reducing the time needed for it.

One of the challenges in software development is testing an application whose programming code has been modified. It often happens that some parts of the code indirectly affected by the changes remain hidden from the developer, which results in errors. Finding such a bug can take quite some time, all the more so if the application has a complex architecture or a large amount of code. In such cases, testing and troubleshooting become tiresome and very time-consuming.

"Testing Relief" has been created to address this challenge. When set to work, it analyzes all the changes made to the application, evaluates how the changes influence all parts of the software, and singles out those parts of the code that depend on the changes but have not been modified themselves. This helps the developer focus on the "suspicious" parts of the code and find the bug much more quickly: instead of a full test of the code, the developer can test selectively, covering only those parts that have been influenced by the changes.

This selective approach helps the developer reduce the time needed for testing the application and increases the quality of testing. It reports the modifications made in the code and the influence those modifications have on the rest of the code.
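
The idea can be illustrated with a small sketch. This is not Testing Relief's actual algorithm, just a minimal illustration of change-impact test selection; the module and test names below are invented for the example.

# A minimal sketch of change-impact test selection (illustrative only):
# given a map of which modules each test depends on, pick only the tests
# whose dependency set intersects the set of changed modules.

def select_tests(test_dependencies, changed_modules):
    """Return the tests whose dependencies include a changed module."""
    changed = set(changed_modules)
    return [test for test, deps in test_dependencies.items()
            if changed & set(deps)]

if __name__ == "__main__":
    # Hypothetical test-to-module dependency map.
    test_dependencies = {
        "test_invoice_totals": ["billing", "tax"],
        "test_login": ["auth"],
        "test_report_export": ["billing", "reporting"],
    }
    # Only 'billing' changed, so only the billing-dependent tests are selected.
    print(select_tests(test_dependencies, ["billing"]))
    # -> ['test_invoice_totals', 'test_report_export']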

Article sent by Balaji V

Thursday, July 12, 2007

Open Source Test Management Tools

In this post, I will share my views and findings on open source test management tools (http://www.opensourcetesting.org/testmgt.php).
Tool support for test management is becoming a basic need, and it is tough to manage these activities without one.

The commercial tools in this segment are costly and may not fit into a growing organization's budget, so I have been looking at open source alternatives. Over the last week, I evaluated several open source test management tools.

The following are my requirements for a test management tool:
Capture the Requirements
Design Test Cases
Map Test Cases with Requirements
Link Bug reports with Test Case ID after the test execution
Test Execution Reports
Version Management for the Test Cases
Search Feature

The following tools were evaluated after an initial screening:
TestLink (http://testlink.sourceforge.net/docs/testLink.php)
RTH (http://www.rth-is-quality.com/home.php)

Both tools run on Apache, MySQL and PHP. Both look promising, and TestLink has some additional benefits in terms of better reports and linking with popular issue trackers.

Gulf banks mega-merger

Emirates Bank International and National Bank of Dubai are merging to create the Gulf's largest bank by assets and market value.
The combined bank would have a market capitalisation of $11.25bn (£5.6bn), topping National Bank of Abu Dhabi, currently the largest bank by market value in the United Arab Emirates.
For more info: http://news.bbc.co.uk/2/hi/business/6260326.stm

Wednesday, July 11, 2007

'I'm Testing ...'

Below are notes on what some of the descriptive words that I use to describe systems do for me. They may do something very different for you, and I guess that is to be expected: words are ambiguous.

When I say that I'm Testing [Insert Name of Software Here], that may not mean very much. It may not mean much to other people if they don't know the name, and it certainly doesn't expose the level of knowledge that I have about the system. What it can do for me is get me thinking in terms of the superficialities, of the first things that come to mind:
the adverts,
the basic feature set,
what general kind of things need to be tested
basic requirements

If I extensionalise the name and give it some extra meta-tag information (in this case the version number), 'I'm testing [Name of Software] version A.B', then I start to think about:
changes in this version,
what needs to be tested specifically in this version?
what the requirements were for this version?
have all the necessary changes been made?
Delivery dates and timescales.

When I call the Application a 'System' I start to think about:
an integration of parts,
interfaces,
data syntax,
flow through the system

When I call the Application 'Software' I start to think of it as a thing:
which runs on an operating system (what versions),
interfaces with hardware (disk),
data semantics
common elements to software (instruction manual, box, help file, install routine).

When I call the Application a 'Solution', I think:
for whom?
for what problem?

And when I say 'Application' I'm thinking:
apply it to what?
Use it for what?
How can I apply it?
What functionality does it have?

Tuesday, July 10, 2007

IBM agrees to buy software testing firm Telelogic

IBM has agreed to buy Telelogic in a $745m deal that gives it access to application lifecycle management tools spanning the testing, building and governing of software development projects.

Danny Sabbah, IBM Rational general manager, said the move would boost its presence in developing embedded systems for automotive, aerospace and other sectors as those markets demand more complex product designs. Telelogic will operate as part of the Rational developer family, he added.

In a statement, Telelogic CEO Anders Lidbeck said the deal would “provide clients with enhanced capabilities to develop and deploy complex systems on a global basis [through the ability to] leverage a broader set of capabilities".

IBM’s capture of Telelogic will have a huge effect on the software testing world, experts said.

“For both Telelogic and IBM customers [the combination] should ultimately be viewed as offering great potential,” wrote Ovum analyst Bola Rotibi in a research note.

“IBM Rational's testing facilities … have for sometime taken a back seat to the advance facilities offered by Compuware, Mercury and even Borland. Telelogic further strengthens the embedded capabilities [and] provides the boost that IBM Rational guys need to rework their wider testing portfolio. Watch out HP (Mercury), Compuware and Borland.”

Rotibi added that IBM would gain in requirements management, where Telelogic's Doors product is strong, contrasting with IBM's Rational RequisitePro, which has "languished untouched and with no clear strategy or vision of direction".

However, she cautioned that IBM took a long time to create a roadmap after its acquisition of Rational.

IBM’s purchase of Telelogic also provides it with a greater say in IT governance, the booming sector that aims to help align business and IT processes, for example by visualising integrations and dependencies. Telelogic gained a foothold in IT governance with the acquisition of Popkin Software and its System Architect line in 2005. IBM’s Sabbah said the tools would be “complementary” to its plans.

James Rogers, chief marketing officer of another IT governance firm, Troux Technologies, said, "Popkin was a very strong leader but we have seen Telelogic moving it towards application lifecycle management. But Doors is the gorilla in requirements management and Telelogic gives IBM access to some vertical markets."

More broadly, Sabbah noted that IBM “is increasingly relying on software as its engine of growth”, adding that it expects software to generate half of its total profits in three years. The giant has spent about $16bn on over 50 acquisitions over the last five years and Sabbah said that aggressive strategy will continue, including deals that extend “beyond the traditional IT space”.

Source: www.vnunet.com

Different Types of Testing

Functionality Testing: Testing the functionality of the application (i.e. its GUI components) against the specifications (e.g., if we click the "Submit" button, the application should display a confirmation dialog box).

Acceptance testing: Formal testing conducted to determine whether a system satisfies its acceptance criteria and thus whether the customer should accept the system.

Regression Testing: Testing the application, after new features have been added, to check whether the previously existing features still work properly.

Stress Testing: The idea of stress testing is to push the system to its breaking point, in order to find bugs that could make that break harmful.

Load Testing: Testing at the highest expected transaction arrival rate (as part of performance testing) to observe resource contention, database locks, etc.

Black-box Testing: Focuses on the program's functionality against the specification.

White-box Testing: Focuses on the paths of logic.

Unit Testing: The most 'micro' scale of testing; to test particular functions or code modules. Typically done by the programmer and not by testers, as it requires detailed knowledge of the internal program design and code. Not always easily done unless the application has a well-designed architecture with tight code; may require developing test driver modules or test harnesses.
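
As a concrete illustration, here is a minimal unit test written with Python's built-in unittest module; the function under test is a made-up stand-in for any small unit.

import unittest


def word_count(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    def test_simple_sentence(self):
        self.assertEqual(word_count("testing is a skill"), 4)

    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)


if __name__ == "__main__":
    unittest.main()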

Integration Testing: Testing of combined parts of an application to determine if they function together correctly. The 'parts' can be code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.

Incremental Integration Testing: Continuous testing of an application as new functionality is added; requires that various aspects of an application's functionality be independent enough to work separately before all parts of the program are completed, or that test drivers be developed as needed; done by programmers or by testers.

Functional Testing: Black-box type testing geared to functional requirements of an application; this type of testing should be done by testers.

System Testing: Black-box type testing that is based on overall requirements specifications; covers all combined parts of a system.

Sanity Testing: Typically an initial testing effort to determine if a new software version is performing well enough to accept it for a major testing effort. For example, if the new software is crashing systems every 5 minutes, bogging down systems to a crawl, or destroying databases, the software may not be in a 'sane' enough condition to warrant further testing in its current state.

Performance Testing: This term is often used interchangeably with 'stress' and 'load' testing. Ideally 'performance' testing (and any other 'type' of testing) is defined in requirements documentation or QA or Test Plans.

Usability Testing: Testing for 'user-friendliness'. Clearly this is subjective, and will depend on the targeted end-user or customer. User interviews, surveys, video recording of user sessions, and other techniques can be used. Programmers and testers are usually not appropriate as usability testers.

Installation/Uninstallation Testing: Testing of full, partial, or upgrade install/uninstall processes.

Security Testing: Testing how well the system protects against unauthorized internal or external access, willful damage, etc; may require sophisticated testing techniques.

Compatibility Testing: Testing how well software performs in a particular hardware/software/operating system/network/etc. environment.

Ad-hoc Testing: Testing the application in a random manner.

Alpha Testing: Testing of an application when development is nearing completion; minor design changes may still be made as a result of such testing. Typically done by end-users or others, not by programmers or testers.

Beta Testing: Testing when development and testing are essentially completed and final bugs and problems need to be found before final release. Typically done by end-users or others, not by programmers or testers.

Definition of Software Error - What is a software error?

One common definition of a software error is a mismatch between the program and its specification.

Definition #1:
“A mismatch between the program and its specification is an error in the program if and only if the specification exists and is correct.”

Definition #2:
“A software error is present when the program does not do what its end user reasonably expects it to do.” (Myers, 1976)

Categories of Software Errors

User interface errors, such as output errors, incorrect user messages.
Function errors
Defective hardware
Incorrect program version
Testing errors
Requirements errors
Design errors
Documentation errors
Architecture errors
Module interface errors
Performance errors
Error handling
Boundary-related errors
Logic errors such as
calculation errors
State-based behavior errors
Communication errors
Program structure errors, such as control-flow errors

Most programmers are rather cavalier about controlling the quality of the software they write. They bang out some code, run it through some fairly obvious ad hoc tests, and if it seems okay, they’re done. While this approach may work all right for small, personal programs, it doesn’t cut the mustard for professional software development. Modern software engineering practices include considerable effort directed toward software quality assurance and testing. The idea, of course, is to produce completed software systems that have a high probability of satisfying the customer’s needs.

There are two ways to deliver software free of errors. The first is to prevent the introduction of errors in the first place. And the second is to identify the bugs lurking in your code, seek them out, and destroy them. Obviously, the first method is superior. A big part of software quality comes from doing a good job of defining the requirements for the system you’re building and designing a software solution that will satisfy those requirements. Testing concentrates on detecting those errors that creep in despite your best efforts to keep them out.

Defect Severity and Defect Priority

This section describes a defect severity scale for determining defect criticality and the associated defect priority levels to be assigned to errors found in software. It is a scale that can easily be adapted to automated test management tools.

ANSI/IEEE Std 729-1983 Glossary of Software Engineering Terminology defines Criticality as,
"A classification of a software error or fault based on an evaluation of the degree of impact of that error or fault on the development or operation of a system (often used to determine whether or when a fault will be corrected)."

The severity framework for assigning defect criticality that has proven most useful in actual testing practice is a five-level scale. The criticality associated with each level is based on the answers to several questions.

First, it must be determined if the defect resulted in a system failure. ANSI/IEEE Std 729-1983 defines a failure as,
"The termination of the ability of a functional unit to perform its required function."

Second, the probability of failure recovery must be determined. ANSI/IEEE 729-1983 defines failure recovery as,
"The return of a system to a reliable operating state after failure."

Third, it must be determined if the system can do this on its own or if remedial measures must be implemented in order to return the system to reliable operation.

Fourth, it must be determined if the system can operate reliably with the defect present if it is not manifested as a failure.

Fifth, it must be determined if the defect should or should not be repaired.
The following five-level scale of defect criticality addresses these questions.

The five Levels are:
1. Critical
2. Major
3. Average
4. Minor
5. Exception

1. Critical - The defect results in the failure of the complete software system, of a subsystem, or of a software unit (program or module) within the system.

2. Major - The defect results in the failure of the complete software system, of a subsystem, or of a software unit (program or module) within the system. There is no way to make the failed component(s) work; however, there are acceptable processing alternatives which will yield the desired result.

3. Average - The defect does not result in a failure, but causes the system to produce incorrect, incomplete, or inconsistent results, or the defect impairs the system's usability.

4. Minor - The defect does not cause a failure, does not impair usability, and the desired processing results are easily obtained by working around the defect.

5. Exception - The defect is the result of non-conformance to a standard, is related to the aesthetics of the system, or is a request for an enhancement. Defects at this level may be deferred or even ignored.

In addition to the defect severity levels defined above, defect priority levels can be used with severity categories to determine the immediacy of repair.

A five-level repair priority scale is also used in common testing practice. The levels are:
1. Resolve Immediately
2. Give High Attention
3. Normal Queue
4. Low Priority
5. Defer

1. Resolve Immediately - Further development and/or testing cannot occur until the defect has been repaired. The system cannot be used until the repair has been effected.

2. Give High Attention - The defect must be resolved as soon as possible because it is impairing development and/or testing activities. System use will be severely affected until the defect is fixed.

3. Normal Queue - The defect should be resolved in the normal course of development activities. It can wait until a new build or version is created.

4. Low Priority - The defect is an irritant which should be repaired, but which can be repaired after more serious defects have been fixed.

5. Defer - The defect repair can be put off indefinitely. It can be resolved in a future major system revision or not resolved at all.
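
As an illustration only, the two scales could be encoded as follows so that defect records can be filtered in a script; the field names and sample records are invented and not tied to any particular tracking tool.

from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    CRITICAL = 1
    MAJOR = 2
    AVERAGE = 3
    MINOR = 4
    EXCEPTION = 5


class Priority(IntEnum):
    RESOLVE_IMMEDIATELY = 1
    GIVE_HIGH_ATTENTION = 2
    NORMAL_QUEUE = 3
    LOW_PRIORITY = 4
    DEFER = 5


@dataclass
class Defect:
    defect_id: str
    summary: str
    severity: Severity
    priority: Priority


defects = [
    Defect("D-101", "Crash on save", Severity.CRITICAL, Priority.RESOLVE_IMMEDIATELY),
    Defect("D-102", "Typo in label", Severity.EXCEPTION, Priority.DEFER),
]

# List everything that blocks further development or testing.
blockers = [d.defect_id for d in defects if d.priority == Priority.RESOLVE_IMMEDIATELY]
print(blockers)  # -> ['D-101']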

Checklist for Test Preparation

Listed below are questions/suggestions for systematically planning and preparing software testing.

Have you planned for an overall testing schedule and the personnel required, and associated training requirements?

Have the test team members been given assignments?

Have you established test plans and test procedures for module testing, integration testing, system testing, and acceptance testing?

Have you designed at least one black-box test case for each system function?

Have you designed test cases for verifying quality objectives/factors (e.g. reliability, maintainability, etc.)?

Have you designed test cases for verifying resource objectives?

Have you defined test cases for performance tests, boundary tests, and usability tests?

Have you designed test cases for stress tests (intentional attempts to break system)?

Have you designed test cases with special input values (e.g. empty files)?

Have you designed test cases with default input values?

Have you described how traceability of testing to requirements is to be demonstrated (e.g. references to the specified functions and requirements)?

Do all test cases agree with the specification of the function or requirement to be tested?

Have you sufficiently considered error cases? Have you designed test cases for invalid and unexpected input conditions as well as valid conditions?

Have you defined test cases for white-box-testing (structural tests)?

Have you stated the level of coverage to be achieved by structural tests?

Have you unambiguously provided test input data and expected test results or expected messages for each test case?

Have you documented the purpose of and the capability demonstrated by each test case?

Is it possible to meet and to measure all test objectives defined (e.g. test coverage)?

Have you defined the test environment and tools needed for executing the software test?

Have you described the hardware configuration and resources needed to implement the designed test cases?

Have you described the software configuration needed to implement the designed test cases?

Have you described the way in which tests are to be recorded?

Have you defined criteria for evaluating the test results?

Have you determined the criteria on which the completion of the test will be judged?

Have you considered requirements for regression testing?

Developing a Test Specification

Definitions
I’ve seen the terms “Test Plan” and “Test Specification” mean slightly different things over the years. In a formal sense (at this given point in time for me), we can define the terms as follows:

Test Specification – a detailed summary of what scenarios will be tested, how they will be tested, how often they will be tested, and so on and so forth, for a given feature. Examples of a given feature include, “Intellisense, Code Snippets, Tool Window Docking, IDE Navigator.” Trying to include all Editor Features or all Window Management Features into one Test Specification would make it too large to effectively read.
Test Plan – a collection of all test specifications for a given area. The Test Plan contains a high-level overview of what is tested (and what is tested by others) for the given feature area. For example, I might want to see how Tool Window Docking is being tested. I can glance at the Window Management Test Plan for an overview of how Tool Window Docking is tested, and if I want more info, I can view that particular test specification.
If you ask a tester on another team what’s the difference between the two, you might receive different answers. In addition, I use the terms interchangeably all the time at work, so if you see me using the term “Test Plan”, think “Test Specification.”

Parts of a Test Specification
A Test Specification should consist of the following parts:
History / Revision - Who created the test spec? Who were the developers and Program Managers (Usability Engineers, Documentation Writers, etc) at the time when the test spec was created? When was it created? When was the last time it was updated? What were the major changes at the time of the last update?

Feature Description – a brief description of what area is being tested.
What is tested? – a quick overview of what scenarios are tested, so people looking through this specification know that they are at the correct place.

What is not tested? - are there any areas being covered by different people or different test specs? If so, include a pointer to these test specs.

Nightly Test Cases – a list of the test cases and high-level description of what is tested each night (or whenever a new build becomes available). This bullet merits its own blog entry. I’ll link to it here once it is written.

Breakout of Major Test Areas - This section is the most interesting part of the test spec where testers arrange test cases according to what they are testing. Note: in no way do I claim this to be a complete list of all possible Major Test Areas. These areas are examples to get you going.

Specific Functionality Tests – Tests to verify the feature is working according to the design specification. This area also includes verifying error conditions.

Security tests – any tests that are related to security. An excellent source for populating this area comes from the Writing Secure Code book.

Accessibility Tests – This section shouldn't be a surprise to any of my blog readers. See The Fundamentals of Accessibility for more info.

Stress Tests – This section talks about what tests you would apply to stress the feature.

Performance Tests - this section includes verifying any perf requirements for your feature.

Edge cases – This is something I do specifically for my feature areas. I like walking through books like How to Break Software, looking for ideas to better test my features. I jot those ideas down under this section.

Localization / Globalization - tests to ensure you’re meeting your product’s International requirements.

Setting Test Case Priority

A Test Specification may have a couple of hundred test cases, depending on how the test cases were defined, how large the feature area is, and so forth. It is important to be able to query for the most important test cases (nightly), the next most important test cases (weekly), the next most important test cases (full test pass), and so forth. A sample prioritization for test cases may look like:

Highest priority (Nightly) – Must run whenever a new build is available

Second highest priority (Weekly) – Other major functionality tests run once every three or four builds

Lower priority – Run once every major coding milestone
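
One way to make such a prioritization queryable is to tag automated test cases with markers. The sketch below assumes pytest; the marker names 'nightly' and 'weekly' are a convention invented for this example, not anything built into the tool.

# Register the custom markers in pytest.ini to avoid "unknown marker" warnings:
#
#   [pytest]
#   markers =
#       nightly: run whenever a new build is available
#       weekly: run every three or four builds
import pytest


@pytest.mark.nightly
def test_open_and_save_document():
    assert True  # placeholder for a must-pass build verification check


@pytest.mark.weekly
def test_export_to_legacy_format():
    assert True  # placeholder for less critical functionality


# Select by priority from the command line:
#   pytest -m nightly
#   pytest -m "nightly or weekly"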

What is User Acceptance Testing?

Concept
UAT goes under many names. As well as user acceptance testing, it is also known as Beta Testing (usually in the PC world), QA Testing, Application Testing, and End User Testing.

Developing software is an expensive business. It is expensive in:
time, as the software must be analysed, specified, designed and written
people, as very few development projects are one-man jobs
money, as the people responsible for the analysis, specification and development of software do not come cheap (look at the current rates for contractors!)
If, having expended all this time and the company's money, the resulting software is not completely suited to the purpose required, then that time and money has not been fully utilised.

If the software is suitable to the purpose, but:
does not dovetail precisely with the business processes
makes processes more difficult to do than before
causes business processes to take longer than previously
makes additional processes necessary, or makes other processes obsolete
then you will not see a return on your investment in the software until much later, or may not see a return at all.

Question: how do we ensure that we do not end up in this situation?
Answer: we test the software against objective criteria to ensure that we don't.

Previously, most testing was left in the hands of the development teams, with the end users trusting those teams to deliver applications that were not only fully functional and stable, but that would also dovetail into business processes and support those processes (maybe even make things a little bit easier).

However, the testing executed by developers is to ensure that the code they have created is stable and functional. They will test that:

they cover all the lines and logic paths through their code
all the screens flow backwards and forwards in the correct order
the software meets the functions specified (e.g. calculations are correct, reports have the correct columns, screens have the correct validation, etc.)
This testing might not be done through the application itself (often because it has not been completely built while they are testing), so they will only add a few records, maybe by editing the file/table and adding the records, rather than using the 'Record Entry Screen'.

THIS IS NOT A PROBLEM.
I AM NOT DENIGRATING THE SYSTEM AND UNIT TESTING DONE BY THE DEVELOPERS
As we will see later, this does not pose us a problem, because the UAT department will cover this testing. This system testing and unit testing by the developers is still very valid and useful. I would rather take delivery of an application where the development team say "We have done this, done that, hacked this file and added a few records, ran a few test cases through here - and everything seems OK", than take an application which has not gone through any system testing.

The application that has been tested by the developers will have had most of the obvious flaws identified and ironed out, and only the types of issues the testing was designed for should remain to be identified. The second application will be completely unknown, and some of the time allocated for UAT will be spent identifying and fixing problems that could have been easily identified and rectified by the developers.

Also, because the developers are testing their own work, there is a tendency for them to skip areas because they 'know that there won't be a problem there'.

I have spoken to developers who have come to our company from places that do not do UAT; they are impressed with how we do things, and they also like the idea of an independent third party testing their software.

These people are professional software developers, and they do not want to supply something that isn't exactly what's wanted, and they feel that the UA testing gives them a large comfort zone that any problems with their work will be identified and escalated back to them for correction.

As I said, these issues do not pose a problem to the user acceptance tester.

The four issues of the software delivered not matching the business process, making things more difficult etc are circumvented by the user acceptance tester.

While the developer tests against the system specification and technical documentation, the user acceptance tester tests against the business requirements. The former tests the code, the latter the application. We will come to the test planning in a bit.

The issue of the developer testing their own work ceases to be an issue, as the UAT team will design a testing strategy that covers all areas of the business requirements, whether or not the developer feels there may be problems in a specific area.

The issue of additional processes being necessary should also not be a problem. As I said before, the UAT team tests the application against the business requirements, so all testing is done through the use of the proper system transactions.

The UAT team does not hack tables/files to create data; if a client record is needed for a test, then the UAT team will create this client by using the formal client maintenance transaction, not by adding a record to the 'Client_Details' file.

This use of the formal application transactions serves two purposes:

it tests all the transactions that the business users shall run, giving complete 'business' coverage (as opposed to code coverage, or logic path coverage)
it will highlight any potential areas of adverse impact on the business processes. If the contents of a form (e.g. an application form for a new life assurance policy) are used as the basis for creating a new life assurance policy record, then the use of the formal 'New Life Assurance Policy' transaction will determine whether the transaction works, and also whether the form holds the requisite information to create the policy records.
The 'New Life Assurance Policy' system may require the client to declare whether they smoke or not; however, if this question is not on the application form, then the business users will have to re-contact every new client to determine whether or not they smoke!

We can see then, that it is the role of user acceptance testing to not only prove whether or not an application works, but also to prove how it will fit with business processes.

User Acceptance Testing Processes
OK, now that we have determined what UAT is, we need to look at HOW we achieve these objectives.

The user acceptance test life cycle follows the path shown below (obviously at a very high level):

Analysis of business requirements. We can't do anything concerning testing until we understand what the developments are supposed to achieve. This is quite an intangible step in the process, and consists mostly of thought processes, meetings etc. The end result is a clear vision, in the tester's mind, of what they are going to be expected to prove, and why it is necessary.
Analysis of testing requirements. This is more tangible than the first stage, and consists of documenting the areas of the development that require testing, the methodologies you will need to use to test them, and the results you expect to be returned when you test them.
Execution of testing. Doing the business. This is what it all boils down to. Every development project will be different, and you will have had enough experience in this part of the cycle to not need any pointers from me!
Getting the testing signed off. There is no use going through all of these processes, raising problems to development teams, having more work done by the development teams in fixing those problems, re-testing the changes and re-doing all your regression scripts, unless at the end of the day you can get the users to sign off the changes.

What is a Test Case?

Definition of Test Case

- In software engineering, a test case is a set of conditions or variables under which a tester will determine if a requirement upon an application is partially or fully satisfied. It may take many test cases to determine that a requirement is fully satisfied. In order to fully test that all the requirements of an application are met, there must be at least one test case for each requirement unless a requirement has sub requirements. In that situation, each sub requirement must have at least one test case.
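
A minimal sketch of that coverage rule, with invented requirement and test case IDs, might look like this:

# Every requirement should map to at least one test case; report any that do not.
requirement_to_tests = {
    "REQ-001": ["TC-001", "TC-002"],
    "REQ-002": ["TC-003"],
    "REQ-003": [],  # no coverage yet
}

uncovered = [req for req, tests in requirement_to_tests.items() if not tests]
print("Requirements without a test case:", uncovered)  # -> ['REQ-003']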

More Definitions of a Test Case.

- A test case is also defined as a sequence of steps to test the correct behavior of a functionality/feature of an application.

- A set of inputs, execution preconditions, and expected outcomes developed for a particular objective, such as to exercise a particular program path or to verify compliance with a specific requirement.

- A test case is a list of the conditions or issues that the tester wants to test in a piece of software. A test case helps in coming up with test data. A test case has an input description, a test sequence and an expected behavior.

The characteristic of a test case is that there is a known input and an expected output, which is worked out before the test. The known input should test a pre-condition and the expected output should test a post-condition.

Under special circumstances, there could be a need to run the test, produce results, and have a team of experts evaluate whether the results can be considered a pass. The first test is then taken as the baseline for subsequent test/product release cycles.

Test cases include a description of the functionality to be tested taken from either the requirements or use cases, and the preparation required to ensure that the test can be conducted.
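
Expressed in code, a test case of this kind boils down to a known input paired with an expected output, checked against the result worked out beforehand. The sketch below uses pytest's parametrize feature; the function under test and the sample data are illustrative only.

import pytest


def normalize_phone(raw):
    """Strip spaces and dashes from a phone number string."""
    return raw.replace(" ", "").replace("-", "")


@pytest.mark.parametrize(
    "raw, expected",
    [
        ("044-2345 6789", "04423456789"),  # dashes and spaces removed
        ("12345", "12345"),                # already normalized
        ("", ""),                          # boundary case: empty input
    ],
)
def test_normalize_phone(raw, expected):
    # Known input, expected output worked out before the test is run.
    assert normalize_phone(raw) == expected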

Saturday, July 7, 2007

Roles in software testing

Software testing can be done by software testers. Until the 1950s the term software tester was used generally, but later it was also seen as a separate profession. Regarding the periods and the different goals in software testing (see D. Gelperin and W.C. Hetzel), different roles have been established: test lead/manager, tester, test designer, test automator/automation developer, and test administrator.

Participants of a testing team:
1. Tester
2. Developer
3. Customer
4. Information Service Management
5. Senior Organization Management
6. Quality team

Certification

Several certification programs exist to support the professional aspirations of software testers and quality assurance specialists. No certification currently offered actually requires the applicant to demonstrate the ability to test software. No certification is based on a widely accepted body of knowledge. No certification board decertifies individuals. This has led some to declare that the testing field is not ready for certification. Certification itself cannot measure an individual's productivity, their skill, or practical knowledge, and cannot guarantee their competence or professionalism as a tester.

Open Certification of Software Testers

Certifications can be grouped into exam-based and education-based. Exam-based certifications require passing an exam, for which one can also prepare by self-study, e.g. ISTQB or QAI. Education-based certifications are instructor-led sessions, where each course has to be passed, e.g. IIST (International Institute for Software Testing).

Manual vs. Automated

Some writers believe that test automation is so expensive relative to its value that it should be used sparingly. Others recommend automating 100% of all tests. A challenge with automation is that automated testing requires automated test oracles (an oracle is a mechanism or principle by which a problem in the software can be recognized). Such tools have value in load testing software (by signing on to an application with hundreds or thousands of instances simultaneously), or in checking for intermittent errors in software. The success of automated software testing depends on complete and comprehensive test planning. Software development strategies such as test-driven development are highly compatible with the idea of devoting a large part of an organization's testing resources to automated testing. Many large software organizations perform automated testing. Some have developed their own automated testing environments specifically for internal development, and not for resale.
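
As a small illustration of an automated test oracle, the sketch below judges the code under test against a slow but obviously correct reference implementation over many generated inputs; both functions are invented for the example.

import random


def sum_of_squares_fast(values):
    # Code under test (stands in for a clever, hand-optimized version).
    return sum(v * v for v in values)


def sum_of_squares_reference(values):
    # Oracle: simple, readable, trusted.
    total = 0
    for v in values:
        total += v * v
    return total


random.seed(0)
for _ in range(1000):
    data = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
    assert sum_of_squares_fast(data) == sum_of_squares_reference(data)
print("oracle agreed on 1000 generated inputs")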

Exploratory vs. Scripted

Exploratory testing means simultaneous learning, test design, and test execution, with an emphasis on learning. Scripted testing means that learning and test design happen prior to test execution, and quite often the learning has to be done again during test execution. Exploratory testing is very common, but in most writing and training about testing it is barely mentioned and generally misunderstood. Some writers consider it a primary and essential practice. Structured exploratory testing is a compromise when the testers are familiar with the software. A vague test plan, known as a test charter, is written up, describing what functionalities need to be tested but not how, allowing the individual testers to choose the method and steps of testing.

There are two main disadvantages associated with a primarily exploratory testing approach. The first is that there is no opportunity to prevent defects, which can happen when the designing of tests in advance serves as a form of structured static testing that often reveals problems in system requirements and design. The second is that, even with test charters, demonstrating test coverage and achieving repeatability of tests using a purely exploratory testing approach is difficult. For this reason, a blended approach of scripted and exploratory testing is often used to reap the benefits of both while mitigating each approach's disadvantages.

Agile vs. Traditional

Starting around 1990, a new style of writing about testing began to challenge what had come before. The seminal work in this regard is widely considered to be Testing Computer Software, by Cem Kaner (1988, ISBN 0-8306-9563-X; as of 1999 in a 3rd edition, ISBN 1-85032-908-7). Instead of assuming that testers have full access to source code and complete specifications, these writers, including Kaner and James Bach, argued that testers must learn to work under conditions of uncertainty and constant change. Meanwhile, an opposing trend toward process "maturity" also gained ground, in the form of the Capability Maturity Model. The agile testing movement (which includes but is not limited to forms of testing practiced on agile development projects) has popularity mainly in commercial circles, whereas the CMM was embraced by government and military software providers.

However, saying that "maturity models" like CMM gained ground against or in opposition to Agile testing may not be right. The Agile movement is a 'way of working', while CMM is a process improvement idea.

There is, however, another point of view that must be considered: the operational culture of an organization. While it may be true that testers must have an ability to work in a world of uncertainty, it is also true that their flexibility must have direction. In many cases test cultures are self-directed, and as a result fruitless, unproductive results can ensue. Furthermore, positive defect evidence may only reflect either the tip of a much larger problem or that you have exhausted all possibilities. A framework is a test of Testing. It provides a boundary that can measure (validate) the capacity of our work. Both sides have argued, and will continue to argue, the virtues of their work. The proof, however, is in each and every assessment of delivery quality. It does little good to test systematically if you are too narrowly focused. On the other hand, finding a bunch of errors is not an indicator that Agile methods were the driving force; you may simply have stumbled upon an obviously poor piece of work.

Controversy

There is considerable controversy among testing writers and consultants about what constitutes responsible software testing. Members of the "context-driven" school of testing believe that there are no "best practices" of testing, but rather that testing is a set of skills that allow the tester to select or invent testing practices to suit each unique situation. This belief directly contradicts standards such as the IEEE 829 test documentation standard, and organizations such as the Food and Drug Administration who promote them.

Code Coverage

Code coverage is inherently a white box testing activity. The target software is built with special options or libraries and/or run under a special environment such that every function that is exercised (executed) in the program(s) is mapped back to the function points in the source code. This process allows developers and quality assurance personnel to look for parts of a system that are rarely or never accessed under normal conditions (error handling and the like) and helps reassure test engineers that the most important conditions (function points) have been tested.

Test engineers can look at code coverage test results to help them devise test cases and input or configuration sets that will increase the code coverage over vital functions. Two common forms of code coverage used by testers are statement (or line) coverage, and path (or edge) coverage. Line coverage reports on the execution footprint of testing in terms of which lines of code were executed to complete the test. Edge coverage reports which branches, or code decision points were executed to complete the test. They both report a coverage metric, measured as a percentage.
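
The difference between the two metrics shows up even in a tiny example. The single test below executes every statement of the function but never takes the is_member=False branch, so statement coverage is complete while branch coverage is not. The commands in the comment assume the coverage.py tool is installed and that the file is saved as test_discount.py; both are assumptions for this sketch.

# Measure with and without branch coverage (coverage.py assumed installed):
#
#   coverage run --branch -m pytest test_discount.py
#   coverage report --show-missing
def apply_discount(price, is_member):
    if is_member:
        price = price - 10
    return price


def test_member_discount():
    # Executes every statement, but only the True branch of the if.
    assert apply_discount(100, True) == 90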

Generally code coverage tools and libraries exact a performance and/or memory or other resource cost which is unacceptable to normal operations of the software. Thus they are only used in the lab. As one might expect there are classes of software that cannot be feasibly subjected to these coverage tests, though a degree of coverage mapping can be approximated through analysis rather than direct testing.

There are also some sorts of defects which are affected by such tools. In particular some race conditions or similar real time sensitive operations can be masked when run under code coverage environments; and conversely some of these defects may become easier to find as a result of the additional overhead of the testing code.

A sample testing cycle

Although testing varies between organizations, there is a cycle to testing:
  1. Requirements Analysis: Testing should begin in the requirements phase of the software development life cycle.
    During the design phase, testers work with developers in determining what aspects of a design are testable and under what parameters those tests work.
  2. Test Planning: Test Strategy, Test Plan(s), Test Bed creation.
  3. Test Development: Test Procedures, Test Scenarios, Test Cases, Test Scripts to use in testing software.
  4. Test Execution: Testers execute the software based on the plans and tests and report any errors found to the development team.
  5. Test Reporting: Once testing is completed, testers generate metrics and make final reports on their test effort and whether or not the software tested is ready for release.
  6. Retesting the Defects

Not all errors or defects reported must be fixed by a software development team. Some may be caused by errors in configuring the test software to match the development or production environment. Some defects can be handled by a workaround in the production environment. Others might be deferred to future releases of the software, or the deficiency might be accepted by the business user. There are yet other defects that may be rejected by the development team (of course, with due reason) if they deem it inappropriate to be called a defect.

Test cases, suites, scripts and scenarios

A test case is a software testing document, which consists of event, action, input, output, expected result and actual result. Clinically defined (IEEE 829-1998), a test case is an input and an expected result. This can be as pragmatic as 'for condition x your derived result is y', whereas other test cases describe in more detail the input scenario and what results might be expected. It can occasionally be a series of steps (but often steps are contained in a separate test procedure that can be exercised against multiple test cases, as a matter of economy) but with one expected result or expected outcome. The optional fields are a test case ID, test step or order of execution number, related requirement(s), depth, test category, author, and check boxes for whether the test is automatable and has been automated. Larger test cases may also contain prerequisite states or steps, and descriptions. A test case should also contain a place for the actual result. These steps can be stored in a word processor document, spreadsheet, database or other common repository. In a database system, you may also be able to see past test results, who generated the results, and the system configuration used to generate those results. These past results would usually be stored in a separate table.

The term test script is the combination of a test case, test procedure and test data. Initially the term was derived from the byproduct of work created by automated regression test tools. Today, test scripts can be manual, automated or a combination of both.

The most common term for a collection of test cases is a test suite. The test suite often also contains more detailed instructions or goals for each collection of test cases. It definitely contains a section where the tester identifies the system configuration used during testing. A group of test cases may also contain prerequisite states or steps, and descriptions of the following tests.

Collections of test cases are sometimes incorrectly termed a test plan. They might correctly be called a test specification. If sequence is specified, it can be called a test script, scenario or procedure.

Test levels

  • Unit testing tests the minimal software component, or module.
  • Integration testing exposes defects in the interfaces and interaction between integrated components (modules).
  • Functional testing tests the product against its functional requirements.
  • System testing tests an integrated system to verify/validate that it meets its requirements.
  • Acceptance testing can be conducted by the client. It allows the end-user, customer or client to decide whether or not to accept the product. Acceptance testing may be performed after the testing and before the implementation phase.
    • Alpha testing is simulated or actual operational testing by potential users/customers or an independent test team at the developers' site. Alpha testing is often employed for off-the-shelf software as a form of internal acceptance testing, before the software goes to beta testing.
    • Beta testing comes after alpha testing. Versions of the software, known as beta versions, are released to a limited audience outside of the company. The software is released to groups of people so that further testing can ensure the product has few faults or bugs. Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users.

It should be noted that although both Alpha and Beta are referred to as testing, they are in fact forms of use immersion. The rigors that are applied are often unsystematic and many of the basic tenets of the testing process are not used. The Alpha and Beta periods provide insight into environmental and utilization conditions that can impact the software.

After modifying software, either for a change in functionality or to fix defects, a regression test re-runs previously passing tests on the modified software to ensure that the modifications haven't unintentionally caused a regression of previous functionality. Regression testing can be performed at any or all of the above test levels. These regression tests are often automated.

White-box and black-box testing

White box and black box testing are terms used to describe the point of view a test engineer takes when designing test cases: black box being an external view of the test object and white box being an internal view. Software testing is partly intuitive, but largely systematic. Good testing involves much more than just running the program a few times to see whether it works. Thorough analysis of the program under test, backed by a broad knowledge of testing techniques and tools, is a prerequisite to systematic testing. Software testing is the process of executing software in a controlled manner, in order to answer the question “Does this software behave as specified?” Software testing is used in association with verification and validation. Verification is the checking of or testing of items, including software, for conformance and consistency with an associated specification. Software testing is just one kind of verification, which also uses techniques such as reviews, inspections, and walk-throughs. Validation is the process of checking that what has been specified is what the user actually wanted.
  • Validation: Are we doing the right job?
  • Verification: Are we doing the job right?

In order to achieve consistency in testing, it is imperative to have and follow a set of testing principles. This enhances the efficiency of testing among SQA team members and thus contributes to increased productivity.

There are generally 3 main levels of software testing carried out:

  • Unit Testing: in which each unit (basic component) of the software is tested to verify that the detailed design for the unit has been correctly implemented
  • Integration testing: in which progressively larger groups of tested software components corresponding to elements of the architectural design are integrated and tested until the software works as a whole.
  • System testing: in which the software is integrated into the overall product and tested to show that all requirements are met

A further level of testing is also done, in accordance with requirements:

  • Acceptance testing: upon which the acceptance of the complete software is based. The clients often do this.

Changes made in order to fix bugs or develop further functionality may cause previously successful tests to fail, so regression testing is used to repeat the earlier successful tests to ensure that changes made to the software have not introduced new bugs/side effects.

In recent years the term grey box testing has come into common usage. The typical grey box tester is permitted to set up or manipulate the testing environment, like seeding a database, and can view the state of the product after his actions, like performing a SQL query on the database to be certain of the values of columns. It is used almost exclusively by client-server testers or others who use a database as a repository of information, but can also apply to a tester who has to manipulate XML files (DTD or an actual XML file) or configuration files directly, or perform testing like SQL injection. It can also be used by testers who know the internal workings or algorithm of the software under test and can write tests specifically for the anticipated results. For example, testing a data warehouse implementation involves loading the target database with information, and verifying the correctness of data population and loading of data into the correct tables.
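
A minimal grey-box check in that spirit might look like the sketch below: it seeds an in-memory SQLite database, exercises a hypothetical application function, and then verifies the stored state with a direct SQL query rather than through the user interface.

import sqlite3


def add_client(conn, name, smoker):
    # Hypothetical application function under test.
    conn.execute("INSERT INTO clients (name, smoker) VALUES (?, ?)", (name, int(smoker)))
    conn.commit()


def test_add_client_persists_smoker_flag():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clients (name TEXT, smoker INTEGER)")

    add_client(conn, "A. Tester", smoker=True)

    # Grey-box step: inspect the table directly with SQL.
    row = conn.execute(
        "SELECT name, smoker FROM clients WHERE name = ?", ("A. Tester",)
    ).fetchone()
    assert row == ("A. Tester", 1)


if __name__ == "__main__":
    test_add_client_persists_smoker_flag()
    print("grey-box check passed")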

History of Testing

The separation of debugging from testing was initially introduced by Glenford J. Myers in his 1979 book The Art of Software Testing. Although his attention was on breakage testing, it illustrated the desire of the software engineering community to separate fundamental development activities, such as debugging, from that of verification. Drs. Dave Gelperin and William C. Hetzel classified in 1988 the phases and goals in software testing as follows: until 1956 was the debugging-oriented period, when testing was often associated with debugging and there was no clear difference between the two. From 1957 to 1978 came the demonstration-oriented period, in which debugging and testing were now distinguished; in this period it was shown that software satisfies its requirements. The time between 1979 and 1982 is known as the destruction-oriented period, where the goal was to find errors. 1983-1987 is classified as the evaluation-oriented period: the intention here was that during the software lifecycle a product evaluation is provided and quality is measured. From 1988 on it was seen as the prevention-oriented period, where tests were to demonstrate that software satisfies its specification, to detect faults and to prevent faults. Dr. Gelperin chaired the IEEE 829-1989 (Test Documentation Standard), with Dr. Hetzel writing the book "The Complete Guide to Software Testing". Both works were pivotal to today's testing culture and remain a consistent source of reference. Dr. Gelperin and Jerry E. Durant also went on to develop High Impact Inspection Technology that builds upon traditional inspections but utilizes a test-driven additive.

How can World Wide Web sites be tested?

Web sites are essentially client/server applications - with web servers and 'browser' clients. Consideration should be given to the interactions between HTML pages, TCP/IP communications, Internet connections, firewalls, applications that run in web pages (such as applets, JavaScript, and plug-in applications), and applications that run on the server side (such as CGI scripts, database interfaces, logging applications, dynamic page generators, ASP, etc.). Additionally, there are a wide variety of servers and browsers, various versions of each, small but sometimes significant differences between them, variations in connection speeds, rapidly changing technologies, and multiple standards and protocols. The end result is that testing for web sites can become a major ongoing effort. Other considerations might include:
  • What are the expected loads on the server (e.g., number of hits per unit time), and what kind of performance is required under such loads (such as web server response time and database query response times)? What kinds of tools will be needed for performance testing (such as web load testing tools, other tools already in-house that can be adapted, web robot downloading tools, etc.)?
  • Who is the target audience? What kind of browsers will they be using? What kind of connection speeds will they be using? Are they intra-organization (thus with likely high connection speeds and similar browsers) or Internet-wide (thus with a wide variety of connection speeds and browser types)?
  • What kind of performance is expected on the client side (e.g., how fast should pages appear, how fast should animations, applets, etc. load and run)?
  • Will downtime for server and content maintenance/upgrades be allowed? How much?
  • What kinds of security (firewalls, encryption, passwords, etc.) will be required, and what is it expected to do? How can it be tested?
  • How reliable are the site's Internet connections required to be? And how does that affect backup system or redundant connection requirements and testing?
  • What processes will be required to manage updates to the web site's content, and what are the requirements for maintaining, tracking, and controlling page content, graphics, links, etc.?
  • Which HTML specification will be adhered to? How strictly? What variations will be allowed for targeted browsers?
  • Will there be any standards or requirements for page appearance and/or graphics throughout a site or parts of a site?
  • How will internal and external links be validated and updated? How often? (A minimal link-checking sketch appears after this list.)
  • Can testing be done on the production system, or will a separate test system be required? How are browser caching, variations in browser option settings, dial-up connection variabilities, and real-world internet 'traffic congestion' problems to be accounted for in testing?
  • How extensive or customized are the server logging and reporting requirements; are they considered an integral part of the system and do they require testing?
  • How are CGI programs, applets, JavaScript, ActiveX components, etc. to be maintained, tracked, controlled, and tested?
Some sources of site security information include the Usenet newsgroup 'comp.security.announce' and other web site security resources.
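
Tying back to the link-validation question above, the following is a hypothetical, minimal link-checking sketch that uses only the Python standard library; the starting URL is a placeholder, and a real checker would also need rate limiting, robots.txt handling, and broader error handling.

    from html.parser import HTMLParser
    from urllib.error import HTTPError, URLError
    from urllib.request import Request, urlopen

    class LinkCollector(HTMLParser):
        # Gathers absolute href targets from anchor tags in an HTML page.
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value and value.startswith("http"):
                        self.links.append(value)

    def check_links(page_url):
        # Fetch the page, extract absolute links, and report any that fail to load.
        html = urlopen(Request(page_url, headers={"User-Agent": "link-check"})).read().decode("utf-8", "replace")
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            try:
                status = urlopen(Request(link, headers={"User-Agent": "link-check"})).status
                print("OK  ", status, link)
            except (HTTPError, URLError) as err:
                print("FAIL", link, err)

    if __name__ == "__main__":
        check_links("https://example.com/")  # placeholder starting page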

Some usability guidelines to consider (these are subjective and may or may not apply to a given situation):

  • Pages should be 3-5 screens max unless content is tightly focused on a single topic. If larger, provide internal links within the page.
  • The page layouts and design elements should be consistent throughout a site, so that it's clear to the user that they're still within a site.
  • Pages should be as browser-independent as possible, or pages should be provided or generated based on the browser-type.
  • All pages should have links external to the page; there should be no dead-end pages.
  • The page owner, revision date, and a link to a contact person or organization should be included on each page.

What if there isn't enough time for thorough testing?

Use risk analysis to determine where testing should be focused.
Since it's rarely possible to test every possible aspect of an application, every possible combination of events, every dependency, or everything that could go wrong, risk analysis is appropriate to most software development projects. This requires judgement skills, common sense, and experience. (If warranted, formal methods are also available.) Considerations can include:
  • Which functionality is most important to the project's intended purpose?
  • Which functionality is most visible to the user?
  • Which functionality has the largest safety impact?
  • Which functionality has the largest financial impact on users?
  • Which aspects of the application are most important to the customer?
  • Which aspects of the application can be tested early in the development cycle?
  • Which parts of the code are most complex, and thus most subject to errors?
  • Which parts of the application were developed in rush or panic mode?
  • Which aspects of similar/related previous projects caused problems?
  • Which aspects of similar/related previous projects had large maintenance expenses?
  • Which parts of the requirements and design are unclear or poorly thought out?
  • What do the developers think are the highest-risk aspects of the application?
  • What kinds of problems would cause the worst publicity?
  • What kinds of problems would cause the most customer service complaints?
  • What kinds of tests could easily cover multiple functionalities?
  • Which tests will have the best high-risk-coverage to time-required ratio?
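
One rough way to weigh considerations like these is to score each candidate test area for likelihood of failure and impact of failure, and then test in descending order of the product of the two. The sketch below is hypothetical; the areas and scores are invented.

    # Each candidate test area gets a 1-5 likelihood-of-failure score and a
    # 1-5 impact score; areas are tested in descending order of their product.
    areas = [
        {"name": "payment processing", "likelihood": 4, "impact": 5},
        {"name": "report generation",  "likelihood": 3, "impact": 2},
        {"name": "user login",         "likelihood": 2, "impact": 5},
        {"name": "help pages",         "likelihood": 1, "impact": 1},
    ]

    for area in areas:
        area["risk"] = area["likelihood"] * area["impact"]

    for area in sorted(areas, key=lambda a: a["risk"], reverse=True):
        print(f'{area["risk"]:>3}  {area["name"]}')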

How can it be known when to stop testing?

This can be difficult to determine. Many modern software applications are so complex, and run in such an interdependent environment, that complete testing can never be done. Common factors in deciding when to stop are:
  • Deadlines (release deadlines, testing deadlines, etc.)
  • Test cases completed with a certain percentage passed
  • Test budget depleted
  • Coverage of code/functionality/requirements reaches a specified point
  • Bug rate falls below a certain level
  • Beta or alpha testing period ends
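
Such criteria can also be made explicit and checked automatically. The sketch below is a hypothetical example of combining a few of them into a simple exit check; the metric names and thresholds are invented.

    def ready_to_stop(metrics, thresholds):
        # Returns (decision, reasons) based on simple, explicit exit criteria.
        reasons = []
        if metrics["tests_passed"] / metrics["tests_run"] >= thresholds["pass_rate"]:
            reasons.append("pass rate met")
        if metrics["requirements_covered"] >= thresholds["coverage"]:
            reasons.append("requirements coverage met")
        if metrics["new_bugs_per_week"] <= thresholds["bug_rate"]:
            reasons.append("bug rate low enough")
        return len(reasons) == 3, reasons

    metrics = {"tests_run": 400, "tests_passed": 388,
               "requirements_covered": 0.97, "new_bugs_per_week": 2}
    thresholds = {"pass_rate": 0.95, "coverage": 0.95, "bug_rate": 3}

    done, reasons = ready_to_stop(metrics, thresholds)
    print("stop testing?", done, reasons)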

What can be done if requirements are changing continuously?

This is a common problem for organizations where there are expectations that requirements can be pre-determined and remain stable. If these expectations are reasonable, here are some approaches:
  • Work with the project's stakeholders early on to understand how requirements might change so that alternate test plans and strategies can be worked out in advance, if possible.
  • It's helpful if the application's initial design allows for some adaptability so that later changes do not require redoing the application from scratch.
  • If the code is well-commented and well-documented this makes changes easier for the developers.
  • Use some type of rapid prototyping whenever possible to help customers feel sure of their requirements and minimize changes.
  • The project's initial schedule should allow for some extra time commensurate with the possibility of changes.
  • Try to move new requirements to a 'Phase 2' version of an application, while using the original requirements for the 'Phase 1' version.
  • Negotiate to allow only easily-implemented new requirements into the project, while moving more difficult new requirements into future versions of the application.
  • Be sure that customers and management understand the scheduling impacts, inherent risks, and costs of significant requirements changes. Then let management or the customers (not the developers or testers) decide if the changes are warranted - after all, that's their job.
  • Balance the effort put into setting up automated tests against the expected effort required to refactor them to deal with changes.
  • Try to design some flexibility into automated test scripts (a minimal sketch of one way to do this follows this list).
  • Focus initial automated testing on application aspects that are most likely to remain unchanged.
  • Devote appropriate effort to risk analysis of changes to minimize regression testing needs.
  • Design some flexibility into test cases (this is not easily done; the best bet might be to minimize the detail in the test cases, or set up only higher-level generic-type test plans).
  • Focus less on detailed test plans and test cases and more on ad hoc testing (with an understanding of the added risk that this entails).
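
As referenced in the list above, one common way to build flexibility into automated test scripts is to keep screen or page details behind a thin abstraction, so that a UI change touches one class rather than every test. The sketch below is hypothetical; FakeDriver simply stands in for whatever UI-driving tool is actually in use.

    class FakeDriver:
        # Stand-in for a real UI driver, included only so the sketch runs end to end.
        def __init__(self):
            self.fields = {}
        def fill(self, name, value):
            self.fields[name] = value
        def click(self, name):
            self.clicked = name
        def current_page(self):
            ok = self.fields.get("username") == "demo" and self.fields.get("password") == "secret"
            return "dashboard" if ok else "login"

    class LoginPage:
        # Thin abstraction over the UI: if field names or page flow change,
        # only this class needs updating, not every test that logs in.
        def __init__(self, driver):
            self.driver = driver
        def login(self, user, password):
            self.driver.fill("username", user)
            self.driver.fill("password", password)
            self.driver.click("submit")
            return self.driver.current_page()

    # The test expresses intent only; UI details live in LoginPage.
    assert LoginPage(FakeDriver()).login("demo", "secret") == "dashboard"
    print("login test passed")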

If this is a continuing problem, and the expectation that requirements can be pre-determined and remain stable is NOT reasonable, it may be a good idea to figure out why the expectations are not aligned with reality, and to refactor an organization's or project's software development process to take this into account. It may be appropriate to consider agile development approaches.

What if the application has functionality that wasn't in the requirements?

It may take serious effort to determine if an application has significant unexpected or hidden functionality, and it could indicate deeper problems in the software development process. If the functionality isn't necessary to the purpose of the application, it should be removed, as it may have unknown impacts or dependencies that were not taken into account by the designer or the customer. (If the functionality is minor and low risk then no action may be necessary.) If not removed, information will be needed to determine risks and to determine any added testing needs or regression testing needs. Management should be made aware of any significant added risks as a result of the unexpected functionality.

This problem is a standard aspect of projects that include COTS (Commercial Off-The-Shelf) software or modified COTS software. The COTS part of the project will typically have a large amount of functionality that is not included in project requirements, or may be simply undetermined. Depending on the situation, it may be appropriate to perform in-depth analysis of the COTS software and work closely with the end user to determine which pre-existing COTS functionality is important and which functionality may interact with or be affected by the non-COTS aspects of the project. A significant regression testing effort may be needed (again, depending on the situation), and automated regression testing may be useful.

What's the best approach to software test estimation?

There is no simple answer for this. The 'best approach' is highly dependent on the particular organization and project and the experience of the personnel involved.

For example, given two software projects of similar complexity and size, the appropriate test effort for one project might be very large if it was for life-critical medical equipment software, but might be much smaller for the other project if it was for a low-cost computer game. A test estimation approach that only considered size and complexity might be appropriate for one project but not for the other.

Following are some approaches to consider.

Implicit Risk Context Approach:
A typical approach to test estimation is for a project manager or QA manager to implicitly use risk context, in combination with past personal experiences in the organization, to choose a level of resources to allocate to testing. In many organizations, the 'risk context' is assumed to be similar from one project to the next, so there is no explicit consideration of risk context. (Risk context might include factors such as the organization's typical software quality levels, the software's intended use, the experience level of developers and testers, etc.) This is essentially an intuitive guess based on experience.

Metrics-Based Approach:
A useful approach is to track past experience of an organization's various projects and the associated test effort that worked well for projects. Once there is a set of data covering characteristics for a reasonable number of projects, then this 'past experience' information can be used for future test project planning. (Determining and collecting useful project metrics over time can be an extremely difficult task.) For each particular new project, the 'expected' required test time can be adjusted based on whatever metrics or other information is available, such as function point count, number of external system interfaces, unit testing done by developers, risk levels of the project, etc. In the end, this is essentially 'judgement based on documented experience', and is not easy to do successfully.
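
A minimal sketch of this 'judgement based on documented experience' idea: derive an average test-to-development ratio from past projects and apply it to a new project's estimated development hours, adjusted by a risk factor. The numbers and the risk_factor parameter are invented for illustration.

    # Past projects: (development hours, test hours that proved adequate).
    history = [(1200, 420), (800, 300), (2000, 760)]

    # Average observed test-to-development ratio.
    ratio = sum(test / dev for dev, test in history) / len(history)

    def estimate_test_hours(dev_hours, risk_factor=1.0):
        # risk_factor > 1.0 for higher-risk projects, < 1.0 for lower-risk ones.
        return round(dev_hours * ratio * risk_factor)

    print(estimate_test_hours(1500))                    # typical project
    print(estimate_test_hours(1500, risk_factor=1.3))   # higher-risk project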

Test Work Breakdown Approach:
Another common approach is to decompose the expected testing tasks into a collection of small tasks for which estimates can, at least in theory, be made with reasonable accuracy. This of course assumes that an accurate and predictable breakdown of testing tasks and their estimated effort is feasible. In many large projects, this is not the case. For example, if a large number of bugs are being found in a project, this will add to the time required for testing, retesting, bug analysis and reporting. It will also add to the time required for development, and if development schedules and efforts do not go as planned, this will further impact testing.

Iterative Approach:
In this approach for large test efforts, an initial rough testing estimate is made. Once testing begins, a more refined estimate is made after a small percentage (e.g., 1%) of the first estimate's work is done. At this point testers have obtained additional test project knowledge and a better understanding of issues, general software quality, and risk. Test plans and schedules can be refactored if necessary and a new estimate provided. Then a yet-more-refined estimate is made after a somewhat larger percentage (e.g., 2%) of the new work estimate is done. Repeat the cycle as necessary/appropriate.

Percentage-of-Development Approach:
Some organizations utilize a quick estimation method for testing based on the estimated programming effort. For example, if a project is estimated to require 1000 hours of programming effort, and the organization normally finds that a 40% ratio for testing is appropriate, then an estimate of 400 hours for testing would be used. This approach may or may not be useful depending on the project-to-project variations in risk, personnel, types of applications, levels of complexity, etc.

Successful test estimation is a challenge for most organizations, since few can accurately estimate software project development efforts, much less the testing effort of a project. It is also difficult to attempt testing estimates without first having detailed information about a project, including detailed requirements, the organization's experience with similar projects in the past, and an understanding of what should be included in a 'testing' estimation for a project (functional testing? unit testing? reviews? inspections? load testing? security testing?).

With agile software development approaches, test effort estimations may be unnecessary if pure test-driven development is utilized. However, it is not uncommon to have a mix of some automated positive-type unit tests, along with some type of separate manual or automated functional testing. In general, agile-based projects by their nature will not be heavily dependent on large testing efforts, since they emphasize the construction of releasable software in very short iteration cycles. These smaller test effort estimates may not be as difficult and the impact of inaccurate estimates will be less severe.

For an interesting view of the problem of test estimation, see the comments on Martin Fowler's web site indicating that, for many large systems, "testing and debugging is impossible to schedule".

How can it be determined if a test environment is appropriate?
This is a difficult question in that it typically involves tradeoffs between 'better' test environments and cost. The ultimate situation would be a collection of test environments that mimic exactly all possible hardware, software, network, data, and usage characteristics of the expected live environments in which the software will be used. For many software applications, this would involve a nearly infinite number of variations, and would clearly be impossible. And for new software applications, it may also be impossible to predict all the variations in environments in which the application will run. For very large, complex systems, duplication of a 'live' type of environment may be prohibitively expensive.

In reality, judgements must be made as to which characteristics of a software application environment are important, and test environments can be selected on that basis after taking into account time, budget, and logistical constraints. Such judgements are preferably made by those who have the most appropriate technical knowledge and experience, along with an understanding of risks and constraints.

For smaller or low risk projects, an informal approach is common, but for larger or higher risk projects (in terms of money, property, or lives) a more formalized process involving multiple personnel and significant effort and expense may be appropriate.

In some situations it may be possible to mitigate the need for maintenance of large numbers of varied test environments. One approach might be to coordinate internal testing with beta testing efforts. Another possible mitigation approach is to provide built-in automated tests that run automatically upon installation of the application by end-users. These tests might then automatically report back information, via the internet, about the application environment and problems encountered.
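
A minimal sketch of the built-in reporting idea mentioned above: collect basic environment facts at installation time and post them to a collection endpoint. The endpoint URL is a hypothetical placeholder, and a real implementation would need user consent and error handling.

    import json
    import platform
    from urllib import request

    def environment_report():
        # Collect basic facts about the environment the application was installed into.
        return {
            "os": platform.system(),
            "os_version": platform.version(),
            "machine": platform.machine(),
            "python": platform.python_version(),
        }

    def send_report(report, endpoint="https://example.com/install-report"):  # placeholder endpoint
        data = json.dumps(report).encode("utf-8")
        req = request.Request(endpoint, data=data, headers={"Content-Type": "application/json"})
        with request.urlopen(req) as resp:
            return resp.status

    if __name__ == "__main__":
        print(environment_report())  # what would be sent after installation-time checks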

Will automated testing tools make testing easier?


  • Possibly. For small projects, the time needed to learn and implement them may not be worth it unless personnel are already familiar with the tools. For larger projects, or ongoing long-term projects, they can be valuable.
  • A common type of automated tool is the 'record/playback' type. For example, a tester could click through all combinations of menu choices, dialog box choices, buttons, etc. in an application GUI and have them 'recorded', with the results logged by a tool. The 'recording' is typically in the form of text based on a scripting language that is interpretable by the testing tool. Often the recorded script is manually modified and enhanced. If new buttons are added, or some underlying code in the application is changed, the application might then be retested by just 'playing back' the 'recorded' actions and comparing the logged results to check the effects of the changes. The problem with such tools is that if there are continual changes to the system being tested, the 'recordings' may have to be changed so much that it becomes very time-consuming to continuously update the scripts. Additionally, interpretation and analysis of results (screens, data, logs, etc.) can be a difficult task. Note that there are also record/playback tools for text-based interfaces, and for all types of platforms.
  • Another common approach for automation of functional testing is 'data-driven' or 'keyword-driven' automated testing, in which the test drivers are separated from the data and/or actions utilized in testing (an 'action' would be something like 'enter a value in a text box'). Test drivers can be in the form of automated test tools or custom-written testing software. The data and actions can be more easily maintained - such as via a spreadsheet - since they are separate from the test drivers. The test drivers 'read' the data/action information to perform the specified tests. This approach can enable more efficient control, development, documentation, and maintenance of automated tests/test cases. (A minimal sketch of this style appears at the end of this section.)
  • Other automated tools can include:
     code analyzers - monitor code complexity, adherence to standards, etc.
     coverage analyzers - check which parts of the code have been exercised by a test; may be oriented to code statement coverage, condition coverage, path coverage, etc.
     memory analyzers - such as bounds-checkers and leak detectors.
     load/performance test tools - for testing client/server and web applications under various load levels.
     web test tools - to check that links are valid, HTML code usage is correct, client-side and server-side programs work, and a web site's interactions are secure.
     other tools - for test case management, documentation management, bug reporting, configuration management, file and database comparisons, screen captures, security testing, macro recorders, etc.

Test automation is, of course, possible without COTS tools. Many successful automation efforts utilize custom automation software that is targeted for specific projects, specific software applications, or a specific organization's software development environment. In test-driven agile software development environments, automated tests are often built into the software during (or preceding) coding of the application.
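
As one example of a small custom-written, data-driven test driver (the style referenced earlier in this section), here is a hypothetical sketch: the test data is a CSV-format table kept separate from the driver that reads and executes it. The shipping_cost function and the data rows are invented; in practice the table would typically live in an external spreadsheet or CSV file maintained by testers.

    import csv
    import io

    def shipping_cost(weight_kg, express):
        # Hypothetical function under test.
        base = 5.0 + 2.0 * weight_kg
        return round(base * (1.5 if express else 1.0), 2)

    # Test data kept separate from the driver.
    TEST_DATA = """weight_kg,express,expected
    1,0,7.0
    2,0,9.0
    2,1,13.5
    10,1,37.5
    """

    def run_data_driven_tests():
        # The driver reads each data row and checks the function against it.
        failures = 0
        for row in csv.DictReader(io.StringIO(TEST_DATA)):
            actual = shipping_cost(float(row["weight_kg"]), int(row["express"]))
            expected = float(row["expected"])
            if actual != expected:
                failures += 1
                print("FAIL", dict(row), "got", actual)
        print(failures, "failure(s)")

    if __name__ == "__main__":
        run_data_driven_tests()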

What if an organization is growing so fast that fixed QA processes are impossible?
This is a common problem in the software industry, especially in new technology areas. There is generally no easy solution in this situation. One approach is:
  • Hire good people
  • Management should 'ruthlessly prioritize' quality issues and maintain focus on the customer
  • Everyone in the organization should be clear on what 'quality' means to the customer

Depending on the growth rate, it is possible that incremental self-managed team approaches may be applicable, such as 'Kaizen' methods of continuous process improvement, the Deming-Shewhart Plan-Do-Check-Act cycle, and others.

How can Software QA processes be implemented without reducing productivity?
By implementing QA processes slowly over time, using consensus to reach agreement on processes, focusing on processes that align tightly with organizational goals, and adjusting/experimenting/refactoring as an organization matures, productivity can be improved instead of stifled. Problem prevention will lessen the need for problem detection, panics and burn-out will decrease, and there will be improved focus and less wasted effort. At the same time, attempts should be made to keep processes simple and efficient, avoid a 'Process Police' mentality, minimize paperwork, promote computer-based processes and automated tracking and reporting, minimize time required in meetings, and promote training as part of the QA process. However, no one - especially talented technical types - likes rules or bureaucracy, and in the short run things may slow down a bit. A typical scenario would be that more days of planning, reviews, and inspections will be needed, but less time will be required for late-night bug-fixing and handling of irate customers.

Other possibilities include incremental self-managed team approaches such as 'Kaizen' methods of continuous process improvement, the Deming-Shewhart Plan-Do-Check-Act cycle, and others.

Does every software project need testers?

While all projects will benefit from testing, some projects may not require independent test staff to succeed.

Which projects may not need independent test staff? The answer depends on the size and context of the project, the risks, the development methodology, the skill and experience of the developers, and other factors. For instance, if the project is a short-term, small, low risk project, with highly experienced programmers utilizing thorough unit testing or test-first development, then test engineers may not be required for the project to succeed.

In some cases an IT organization may be too small or new to have a testing staff even if the situation calls for it. In these circumstances it may be appropriate to instead use contractors or outsourcing, or to adjust the project management and development approach (by switching to more senior developers and agile test-first development, for example). Inexperienced managers sometimes gamble on the success of a project by skipping thorough testing or by having programmers do post-development functional testing of their own work - a decidedly high-risk approach.

For non-trivial-size projects or projects with non-trivial risks, a testing staff is usually necessary. As in any business, the use of personnel with specialized skills enhances an organization's ability to be successful in large, complex, or difficult tasks. It allows for both a) deeper and stronger skills and b) the contribution of differing perspectives. For example, programmers typically have the perspective of 'what are the technical issues in making this functionality work?'. A test engineer typically has the perspective of 'what might go wrong with this functionality, and how can we ensure it meets expectations?'. Technical people who can be highly effective in approaching tasks from both of those perspectives are rare, which is why, sooner or later, organizations bring in test specialists.

What are some recent major computer system failures caused by software bugs?
  • A September 2006 news report indicated problems with software utilized in a state government's primary election, resulting in periodic unexpected rebooting of voter check-in machines (which were separate from the electronic voting machines) and causing confusion and delays at voting sites. The problem was reportedly due to insufficient testing.
  • In August of 2006 a U.S. government student loan service erroneously made public the personal data of as many as 21,000 borrowers on its web site, due to a software error. The bug was fixed and the government department subsequently offered to arrange for free credit monitoring services for those affected.
  • A software error reportedly resulted in overbilling of up to several thousand dollars to each of 11,000 customers of a major telecommunications company in June of 2006. It was reported that the software bug was fixed within days, but that correcting the billing errors would take much longer.
  • News reports in May of 2006 described a multi-million dollar lawsuit settlement paid by a healthcare software vendor to one of its customers. It was reported that the customer claimed there were problems with the software they had contracted for, including poor integration of software modules, and problems that resulted in missing or incorrect data used by medical personnel.
  • In early 2006 problems in a government's financial monitoring software resulted in incorrect election candidate financial reports being made available to the public. The government's election finance reporting web site had to be shut down until the software was repaired.
  • Trading on a major Asian stock exchange was brought to a halt in November of 2005, reportedly due to an error in a system software upgrade. The problem was rectified and trading resumed later the same day.
  • A May 2005 newspaper article reported that a major hybrid car manufacturer had to install a software fix on 20,000 vehicles due to problems with invalid engine warning lights and occasional stalling. In the article, an automotive software specialist indicated that the automobile industry spends $2 billion to $3 billion per year fixing software problems.
  • Media reports in January of 2005 detailed severe problems with a $170 million high-profile U.S. government IT systems project. Software testing was one of the five major problem areas according to a report of the commission reviewing the project. In March of 2005 it was decided to scrap the entire project.
  • In July 2004 newspapers reported that a new government welfare management system in Canada costing several hundred million dollars was unable to handle a simple benefits rate increase after being put into live operation. Reportedly the original contract allowed for only 6 weeks of acceptance testing and the system was never tested for its ability to handle a rate increase.
  • Millions of bank accounts were impacted by errors due to installation of inadequately tested software code in the transaction processing system of a major North American bank, according to mid-2004 news reports. Articles about the incident stated that it took two weeks to fix all the resulting errors, that additional problems resulted when the incident drew a large number of e-mail phishing attacks against the bank's customers, and that the total cost of the incident could exceed $100 million.
  • A bug in site management software utilized by companies with a significant percentage of worldwide web traffic was reported in May of 2004. The bug resulted in performance problems for many of the sites simultaneously and required disabling of the software until the bug was fixed.
  • According to news reports in April of 2004, a software bug was determined to be a major contributor to the 2003 Northeast blackout, the worst power system failure in North American history. The failure involved loss of electrical power to 50 million customers, forced shutdown of 100 power plants, and economic losses estimated at $6 billion. The bug was reportedly in one utility company's vendor-supplied power monitoring and management system, which was unable to correctly handle and report on an unusual confluence of initially localized events. The error was found and corrected after examining millions of lines of code.
  • In early 2004, news reports revealed the intentional use of a software bug as a counter-espionage tool. According to the report, in the early 1980's one nation surreptitiously allowed a hostile nation's espionage service to steal a version of sophisticated industrial software that had intentionally-added flaws. This eventually resulted in major industrial disruption in the country that used the stolen flawed software.
  • A major U.S. retailer was reportedly hit with a large government fine in October of 2003 due to web site errors that enabled customers to view one another's online orders.
  • News stories in the fall of 2003 stated that a manufacturing company recalled all their transportation products in order to fix a software problem causing instability in certain circumstances. The company found and reported the bug itself and initiated the recall procedure in which a software upgrade fixed the problems.
  • In August of 2003 a U.S. court ruled that a lawsuit against a large online brokerage company could proceed; the lawsuit reportedly involved claims that the company was not fixing system problems that sometimes resulted in failed stock trades, based on the experiences of 4 plaintiffs during an 8-month period. A previous lower court's ruling that "...six miscues out of more than 400 trades does not indicate negligence." was invalidated.
  • In April of 2003 it was announced that a large student loan company in the U.S. made a software error in calculating the monthly payments on 800,000 loans. Although borrowers were to be notified of an increase in their required payments, the company will still reportedly lose $8 million in interest. The error was uncovered when borrowers began reporting inconsistencies in their bills.
  • News reports in February of 2003 revealed that the U.S. Treasury Department mailed 50,000 Social Security checks without any beneficiary names. A spokesperson indicated that the missing names were due to an error in a software change. Replacement checks were subsequently mailed out with the problem corrected, and recipients were then able to cash their Social Security checks.
  • In March of 2002 it was reported that software bugs in Britain's national tax system resulted in more than 100,000 erroneous tax overcharges. The problem was partly attributed to the difficulty of testing the integration of multiple systems.
  • A newspaper columnist reported in July 2001 that a serious flaw was found in off-the-shelf software that had long been used in systems for tracking certain U.S. nuclear materials. The same software had been recently donated to another country to be used in tracking their own nuclear materials, and it was not until scientists in that country discovered the problem, and shared the information, that U.S. officials became aware of the problems.
  • According to newspaper stories in mid-2001, a major systems development contractor was fired and sued over problems with a large retirement plan management system. According to the reports, the client claimed that system deliveries were late, the software had excessive defects, and it caused other systems to crash.
  • In January of 2001 newspapers reported that a major European railroad was hit by the aftereffects of the Y2K bug. The company found that many of their newer trains would not run due to their inability to recognize the date '31/12/2000'; the trains were started by altering the control system's date settings.
  • News reports in September of 2000 told of a software vendor settling a lawsuit with a large mortgage lender; the vendor had reportedly delivered an online mortgage processing system that did not meet specifications, was delivered late, and didn't work.
  • In early 2000, major problems were reported with a new computer system in a large suburban U.S. public school district with 100,000+ students; problems included 10,000 erroneous report cards and students left stranded by failed class registration systems; the district's CIO was fired. The school district decided to reinstate its original 25-year-old system for at least a year until the bugs were worked out of the new system by the software vendors.
  • A review board concluded that the NASA Mars Polar Lander failed in December 1999 due to software problems that caused improper functioning of retro rockets utilized by the Lander as it entered the Martian atmosphere.
  • In October of 1999 the $125 million NASA Mars Climate Orbiter spacecraft was believed to be lost in space due to a simple data conversion error. It was determined that spacecraft software used certain data in English units that should have been in metric units. Among other tasks, the orbiter was to serve as a communications relay for the Mars Polar Lander mission, which failed for unknown reasons in December 1999. Several investigating panels were convened to determine the process failures that allowed the error to go undetected.
  • Bugs in software supporting a large commercial high-speed data network affected 70,000 business customers over a period of 8 days in August of 1999. Among those affected was the electronic trading system of the largest U.S. futures exchange, which was shut down for most of a week as a result of the outages.
  • In April of 1999 a software bug caused the failure of a $1.2 billion U.S. military satellite launch, the costliest unmanned accident in the history of Cape Canaveral launches. The failure was the latest in a string of launch failures, triggering a complete military and industry review of U.S. space launch programs, including software integration and testing processes. Congressional oversight hearings were requested.
  • A small town in Illinois in the U.S. received an unusually large monthly electric bill of $7 million in March of 1999. This was about 700 times larger than its normal bill. It turned out to be due to bugs in new software that had been purchased by the local power company to deal with Y2K software issues.
  • In early 1999 a major computer game company recalled all copies of a popular new product due to software problems. The company made a public apology for releasing a product before it was ready.
  • The computer system of a major online U.S. stock trading service failed during trading hours several times over a period of days in February of 1999 according to nationwide news reports. The problem was reportedly due to bugs in a software upgrade intended to speed online trade confirmations.
  • In April of 1998 a major U.S. data communications network failed for 24 hours, crippling a large part of some U.S. credit card transaction authorization systems as well as other large U.S. bank, retail, and government data systems. The cause was eventually traced to a software bug.
  • January 1998 news reports told of software problems at a major U.S. telecommunications company that resulted in no charges for long distance calls for a month for 400,000 customers. The problem went undetected until customers called up with questions about their bills.
  • In November of 1997 the stock of a major health industry company dropped 60% due to reports of failures in computer billing systems, problems with a large database conversion, and inadequate software testing. It was reported that more than $100,000,000 in receivables had to be written off and that multi-million dollar fines were levied on the company by government agencies.
  • A retail store chain filed suit in August of 1997 against a transaction processing system vendor (not a credit card company) due to the software's inability to handle credit cards with year 2000 expiration dates.
  • In August of 1997 one of the leading consumer credit reporting companies reportedly shut down their new public web site after less than two days of operation due to software problems. The new site allowed web site visitors instant access, for a small fee, to their personal credit reports. However, a number of initial users ended up viewing each others' reports instead of their own, resulting in irate customers and nationwide publicity. The problem was attributed to "...unexpectedly high demand from consumers and faulty software that routed the files to the wrong computers."
  • In November of 1996, newspapers reported that software bugs caused the 411 telephone information system of one of the U.S. RBOCs to fail for most of a day. Most of the 2,000 operators had to search through phone books instead of using their 13,000,000-listing database. The bugs were introduced by new software modifications, and the problem software had been installed on both the production and backup systems. A spokesman for the software vendor reportedly stated that 'It had nothing to do with the integrity of the software. It was human error.'
  • On June 4, 1996 the first flight of the European Space Agency's new Ariane 5 rocket failed shortly after launching, resulting in an estimated uninsured loss of a half billion dollars. It was reportedly due to the lack of exception handling for an overflow in a conversion from a 64-bit floating-point value to a 16-bit signed integer.
  • Software bugs caused the bank accounts of 823 customers of a major U.S. bank to be credited with $924,844,208.32 each in May of 1996, according to newspaper reports. The American Bankers Association claimed it was the largest such error in banking history. A bank spokesman said the programming errors were corrected and all funds were recovered.
  • Software bugs in a Soviet early-warning monitoring system nearly brought on nuclear war in 1983, according to news reports in early 1999. The software was supposed to filter out false missile detections caused by Soviet satellites picking up sunlight reflections off cloud-tops, but failed to do so. Disaster was averted when a Soviet commander, based on what he said was a '...funny feeling in my gut', decided the apparent missile attack was a false alarm. The filtering software code was rewritten.