Continuous Quality: 2011

Tuesday, July 5, 2011

The Army's $2.7 Billion Computer Designed To Help Troops In Iraq And Afghanistan Doesn't Work

http://www.businessinsider.com/armys-billion-dollar-cloud-computer-in-iraq-afghanistan-hurts-troops-2011-7

Cannot perform simple analytical tasks
has trouble finding reports
the mapping software is incompatible with the search software.

This post was provided to me by Rocket, one of our rising star graduate testers.

Friday, June 17, 2011

Questions Regarding the Ross Report

I thought I would post a recent conversation I had regarding the analysis of metrics. The Ross Report collates software testing benchmarks for Australian organisations. The discussion with Michael De Robertis, Test Manager at ISoft, highlighted a mutual interest in testing metrics, and the deeper questions raised as you start to peel back the layers.

Hi Kelvin.

Thanks a lot for your detailed response.

Yes please feel free to post on your blog, and you’re welcome to include my name/title/company.

From my personal perspective, it is a breath of fresh air to establish good, constructive dialog with any like minded people who have a deep understanding of Testing and who wish to challenge the boundaries of what it all means.

One of the things I love about testing is that despite having 15 uninterrupted years in the profession I still have so much to learn, and good philosophical discussions about Testing still inspire me.

Whilst I can’t speak for regions outside of Canberra (and Sydney to a lesser degree where I used to live), I am deeply concerned with the general quality of Testing professionals out there, and perhaps this is my biggest driver for having a good, mature Benchmarking process which I refine and improve religiously.

I am seeing all types of Testers out there – those who are only politically driven and insist on a requirement having a dot on the “ i “; those who don’t understand business priorities and how to manage risk or to plan; those who only want to break systems and drive wedges with development; those who lose motivation quickly; and those who don’t understand why solid process needs to accompany what we do.

I am seeing these qualities in “certified” testers (both new and experienced), and in graduate IT professionals, many of whom do not even study one single subject on testing. In my final year of my Electronic Engineering degree, testing was a big deal. I even majored in subjects on testing (through what was then called the Australian Centre for Test and Evaluation, ACTE). However unfortunately I don’t see a passion for testing driving the Test industry as it did when I began my career. Instead it seems to be only market driven (ie we need more testers, even if they are no good at it).

Apologies for the drawn out background. In my view, this is where I see publications such as the Ross Report adding real value because the Testing industry needs proper guidance through skills validation programmes complemented with independent evaluations (benchmarks) about how well the industry performs.

Refer below, where I have responded to your follow up questions.

Thanks again.

Regards,

Hi Michael,

Thanks for your comments and feedback on the Ross Report. I really appreciate your email, as it gives us valuable feedback on the way the industry benchmark is being used and what improvements and additional information users are seeking.

Apologies first of all for not responding sooner, it has been on my long list of to do items for a little while, and I have been waiting for a little time to put some real attention to it.

I will add some responses inline with your feedback below.

Also would it be possible that I could post your comments and my responses to my blog page? I found this discussion quite interesting, and I would like to share it with others.

Cheers

Kelvin

Hi there.

Can you please forward this email to someone who is familiar with the content of the Ross Report?

I found the report very comprehensive and particularly useful to compare to our benchmark…..love the metrics! :)

I must say I am a numbers person myself. I love looking at what data shows, and using that to gain a deeper understanding of the testing process. Also I have the view that "without evidence, you are just another person with an opinion".

What I did find though, was the more that you look at the results, the more questions that open up to you. You tend to question: how did the number emerge, what does it mean, what influences it, how do I improve it... And I think that is the point you make below. Certainly after scratching the surface, I want to know more.

Getting the balance right in the survey is hard. To better understand I want to ask more questions, but the tradeoff then is participation will fall off if respondents find the survey too onerous. What we probably need to do is to periodically run other more detailed surveys that drill down, so long as participants don't then suffer "survey fatigue"! Anyhow, it is a challenge to build the just-right "goldilocks" survey.

Compiling the Ross Report turned into a much larger job than I anticipated. We spent well over 500 hours, with multiple people involved. It was a big job, but we want to go deeper and wider (more respondents) next time.

We are just about to kick off the 2012 Ross Report, so your feedback is very timely.

In my personal view, there were three areas that I was hoping someone could comment further about (purely for the benefit of comparing philosophies of benchmarking).

1) More proof about Automation Tools effectiveness – The report rightly separates automation and performance testing from manual testing, however I couldn’t find measurements that specifically reported on whether automation testing (for example) improved regression fail rates or resulted in bugs being found. All that I could see was that organisations were happy with their choice (but not the backup metrics as to why). I also couldn’t find metrics on downtime needed for training, setup, configuration, execution, and maintenance of those tools.

Very good point. I think the industry is still uncertain on the real benefits they hope to obtain from automation. Few organisations measure their effort, tests automated/executed, defects found, etc. Automation ROI expectations seem to be poorly understood in terms of their economics, and the metrics to show that.

There are questions around metrics for automation, for instance, whether we should measure manual test reduction. Many times we count execution reduction from automated tests that would perhaps never have been chosen for execution if they were manual. We probably still expend a lot of effort running tests that shouldn't be continued, as they are not providing much insight.

There were some metrics on coverage and regression re-execution specific to automation, and these indicated some surprising insights: many organisations are poor at re-executing automated tests. Obviously this requires more in-depth follow up! There were also some metrics in relation to

Your comment about defect discovery rates from automation is also pertinent. I think this is a very important metric. Also perhaps to be combined with effort, to deduce defects per hour (from automation). On page 49 you should see we report that only 6% of regression tests fail per test cycle. Regression tests have lower defect yield than new feature or defect correction tests. I would expect for many this would also apply to automation, as in most organisations automation is focussed on regression testing.

I think your suggestion to extend the defect detection and testing effort metrics from general testing to look specifically at automation is a good one, and one we will look at.
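As a rough illustration of the two metrics discussed above (fail rate per cycle, and defects found per hour of automation effort), here is a minimal sketch. The `CycleResult` record, its field names, and the sample numbers are all hypothetical; they are not figures or definitions taken from the Ross Report.

```python
from dataclasses import dataclass


@dataclass
class CycleResult:
    """One test cycle for a suite (hypothetical record, not a Ross Report structure)."""
    tests_executed: int
    tests_failed: int
    defects_found: int
    effort_hours: float  # assumed to cover maintenance, execution, and analysis time


def fail_rate(cycle: CycleResult) -> float:
    """Fraction of executed tests that failed in this cycle."""
    if cycle.tests_executed == 0:
        return 0.0
    return cycle.tests_failed / cycle.tests_executed


def defects_per_hour(cycle: CycleResult) -> float:
    """Defect yield per hour of effort spent on the suite."""
    if cycle.effort_hours == 0:
        return 0.0
    return cycle.defects_found / cycle.effort_hours


# Made-up sample data for an automated regression suite.
automated = CycleResult(tests_executed=400, tests_failed=24, defects_found=9, effort_hours=60.0)
print(f"fail rate: {fail_rate(automated):.1%}")
print(f"defects per hour: {defects_per_hour(automated):.2f}")
```

Tracking the same two numbers for manual cycles would give a like-for-like comparison, provided effort is counted the same way for both.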

2) Skills validation as opposed to Skills Certification – There is still a view amongst many testing professionals that existing certification programs are not value-adding other than to say that you have the certification. Although I have certification, I lean towards this view because I personally feel that certification could do more to prove that testing professionals have worked the hours, gained the experience, delivered the services, and have been endorsed by former and current peers. Hence I think the report would add value if it considered skills qualification and validation rather than through certification alone.

I agree, certification falls short in a number of areas in terms of skills validation. See http://kelvinross.blogspot.com/2010/10/certification-is-failing-do-we-need.html and http://dorothygraham.blogspot.com/2011/02/part-3-certification-schemes-do-not.html.

I think to go beyond the certifications that an organisation's staff are attaining, we would need to get some measurement of the competencies of the staff. I am just not sure how we can measure this from an organisational perspective. It is pretty hard to assess the competency of individuals, let alone do it in a meaningful way for organisations. Hmmm, I need to think about that a bit more.

We tried to assess where staff are sourced, in terms of career entry, e.g. from the business, development, etc. and impact of staff attributes, e.g. tools, techniques, theory, etc.

In other work we have undertaken skills assessments, including with our own consulting staff, scored skills for different skill categories, and tried to roll that up to an organisational profile score to aim for improvements, etc. I say tried, because getting a precise skill profile was a challenge. While this profile is important to the individual organisation, I am not sure how this as an average trend would benefit, nor how we would measure individual organisations in a consistent way.

I think also if we take an organisational norm, it would perhaps look similar if we surveyed individuals regarding their skills and competencies across a large survey set. It is also possibly easier to do it at an individual level. What it might not show us is what skill areas are in most demand, and where the most significant shortfalls are.

That being said, what you say about skills validation is definitely correct. Could you perhaps give me some more suggestions on what you would like to see here from a benchmarking point of view?

You are right, I’m not sure how you would present this in a benchmark / survey form and I’m not sure how respondents would react. However, if I look at this from a real day to day issue, my problem is that I cannot be secure in knowing that a Tester is fit for the job on the basis of having certification. I recall the early 90’s when Cisco Certification was a big deal. Now it isn’t, and everyone has it. In my view Testing is heading down the same path, and therefore I rely on my own benchmarking to substitute “certification” for “validated skills”.

I think your report almost has the structure in place to do this when you look at the Perceptions section. For example instead of a Test Manager answering a question about what their perception is about something, why not have a section whereby a number of disciplines are required to answer (ie Developers, Dev Managers, Project Managers, BA, etc)? At least once per year I obtain this feedback from my peers. Perhaps respondents might not do so if their team culture is not solid, I don’t know.

In terms of actual certification, I think something like CBAP (for business analysis / systems engineering) would be a good model.

3) Respondent Profile & the influence of the report due to their Methodology – I couldn’t relate to the average team size being up to 25. In my 15 year testing career spanning small and large organisations, I can recall only two occasions where test teams were this big (let alone the whole project team), however they were also subdivided and very independent of each other. Therefore I was slightly concerned that the report leaned towards the perspective of more strictly controlled / less-progressive teams running old methodologies. In fact, your Methodology section confirms that given that Waterfall appeared to be one of the most prominent methodologies used.

Team sizes vary dramatically: you can see some organisations at 1-5 members, the norm around 11-25, and also several at 100+. The team size though refers to the overall testing team size for the organisation, and hasn't been reduced to project testing team size. Some of the 100+ organisations I know have over 500 testers, some at 700 - 750 testers.

When we looked at the Project Landscape you can see it is common to have 10 - 20 projects per annum. Also common was a 6 month project duration. This paints a rough picture that on average you are probably going to have a project with 2 or 3 testers. Larger projects with longer durations may have 5 - 10 perhaps. We didn't specifically ask, though, for the average number of testers on a project. As you point out, that could change a lot based on methodology, architecture, team composition, etc.

I think the survey does have a bias towards "testing aware" companies, as we are less likely to have contact with "testing ignorant" companies. Also as you point out, this is likely biased to larger organisations. I think in benchmarking yourself, you need to think in terms of benchmarking with the testing industry, which wouldn't include "testing ignorant/weak" organisations.

Waterfall is still most common, however Agile has been increasing dramatically over recent times. Still waterfall is prevalent in large teams/organisations, however we are finding the understanding is improving, and the barriers to adoption of Agile are coming down. What I still see in organisations is misconception of what constitutes "Agile", and hence we tried to drill down into some of the Agile practices. We need to do this further again in the next survey.

Some other areas that didn’t appear to be covered which might add value in my opinion are:

1) Exploring more deeply into the skill sets of testing professionals. Examples:

a. Are they more closely aligned with the business side (in scope & analysis) or with the development / technical side of things (design, tools, infrastructure, etc)

b. What classifies an Automation or a Performance Test Specialist? Is it a Tester who can capture/playback and make basic adjustments to automation scripts, or are they at the other end of the spectrum and are fully competent in reading/writing in scripting and programming languages?

I think this relates to your point 2 above.

2) More detail about the value adding of test professionals (in addition to perceptions of testing) . Examples:

a. How quickly can they adapt to the job at hand?

b. What is their contribution to process improvements, finding important bugs, and identifying new features?

I think we should be able to look at some of these in future perceptions. We just need to be able to phrase it as a perception question, one that has qualified results, and one where the benchmark is likely to provide some comparable/actionable results that organisations can compare their own position against.

As discussed earlier, perhaps we need to split out a survey for individuals, and get their bottom up input into the testing approach, rather than the top down perception from the organisation view as well.

3) More information about how Test Managers estimate. Example:

a. If a Test Manager has a wealth of metrics at their disposal, what are the best ones to use when estimating for future projects, and what are their reasons?

Michael, is this something that you wanted surveyed, or something that needed some further discussion?

In my experience, people tend to use rules of thumb at the early stage, supported by % of project budget, and developer/tester ratios (2 metrics we did survey). That is when they are varying their budget, for many it is what you had last time. As more detail comes in they tend to move towards work breakdown, and that's when overheads must be factored in such as test management, test environments, etc. (again distribution of test effort metrics provide some guidance here). I guess we could survey whether people use other things such as function points (or test points, etc.), and other estimation approaches.

If you think we should survey this further, some examples of what you would like to see would help us.

Good question. I probably have to think about this one in more detail, but basically, I am constantly challenging myself to get the perfect estimate. That is despite the scope creep, changing priorities, higher than expected bug rates, etc etc. My view is that these ‘unplanned’ events should always be expected and therefore planned. However this should be done with a proven technique rather than to simply double the time required (for example). I have detailed methods which I use and which consider these events, but I’m not sure if others do likewise particularly in organizations where they have to stick to the fixed number of deliverables or timeframes (even if quality is not up to scratch). In my group, we react according to the need and therefore the estimation process needs to have lots of options. Stressful? Yes. Good end result? Also yes!

Hence I’m after guidance which tells me that my estimation methods are good or not good (etc). From a mathematical perspective, I would think this is achievable.
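One simple way to put a number on whether an estimation method is "good or not good" is to compare past estimates against actuals. The sketch below uses mean magnitude of relative error (MMRE) plus a signed bias figure. This is a commonly used accuracy measure chosen here for illustration, not the detailed method Michael refers to, and the effort figures are made up. It assumes every actual value is greater than zero.

```python
from statistics import mean


def relative_errors(estimates, actuals):
    """Signed relative error per project: positive means the project overran the estimate."""
    if len(estimates) != len(actuals):
        raise ValueError("estimates and actuals must have the same length")
    return [(actual - estimate) / actual for estimate, actual in zip(estimates, actuals)]


def mmre(estimates, actuals):
    """Mean magnitude of relative error: average size of the miss, ignoring direction."""
    return mean(abs(error) for error in relative_errors(estimates, actuals))


def bias(estimates, actuals):
    """Mean signed relative error: above zero suggests a habit of underestimating."""
    return mean(relative_errors(estimates, actuals))


# Made-up test effort figures in person-days for four past projects.
estimated = [100, 80, 120, 50]
actual = [125, 80, 150, 40]
print(f"MMRE: {mmre(estimated, actual):.2f}")
print(f"bias: {bias(estimated, actual):.2f}")
```

A low MMRE with a bias near zero would suggest the method is holding up. A consistently positive bias would suggest that the "unplanned" events are not yet fully priced into the estimates.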

4) Post Deployment and measurement of success - How accurate was testing implemented to a plan, what criterion was used (and by whom) to determine if the project was a success, and does this criterion ever change depending on the priorities? Example:

a. If the test plan could not be implemented accurately (due to other project issues out of our control) AND the project ran late (due to a larger number of bugs needing to be fixed) AND the client was very happy with the end result, is this considered a failure or a success? In my industry (safety critical), I would be happy with this result.

Yes, your point here is quite right. Completing a project on schedule is not the only measurement of success. Finding defects late in a project is always a huge frustration, and I have experienced it first hand where despite the testing evidence the project has steamrolled into production, in some cases costing 100's of millions of dollars. So success could be a cancelled project due to test evidence.

The Standish study doesn't reflect that point. I think IT portfolio management should be like share portfolio management. Some investments will not achieve the return, and it is better to abandon them than to stick with them.

I guess we (as an industry) need to think about measuring success more effectively. Quality ultimately needs to include the end user / customer perception. While testing we tend to focus on the pre-deployment metrics, it is the post deployment metrics which really count! It is like we are measuring inputs to quality, and we need to focus more on outputs of quality.

Here we could look at specific issues, like you suggest above. It might be worth understanding:

- where project delays were caused by testing, did post-deployment quality improve? I.e. are testing results beneficial in influencing project schedules?

- do project plans effectively use testing results to measure plan milestone completion?

- and so on

Something also for our benchmark team to focus on for 2012, and one where your suggestions are very welcome.

Tuesday, April 19, 2011

Is it time consumers get tough on software warranties?

A recent Australian Federal Court ruling has imposed a $200k penalty on a hardware and software vendor for false and misleading warranty claims made to its customers.

Most software vendors have limited understanding of their obligations under the Trade Practices Act 1974 (sections 52 and 53(g)). Section 53(g) states: "[A corporation shall not...] make a false or misleading representation concerning the existence, exclusion or effect of any condition, warranty, guarantee, right or remedy."

Indeed, there are many clauses under the Trade Practices Act that software vendors should be aware of in relation to software quality, such as:

Section 68 - Application of provisions not to be excluded or modified
"... purports to exclude, restrict or modify or has the effect of excluding, restricting or modifying: ... (c) any liability of the corporation for breach of a condition or warranty implied by such a provision; or ... is void."
Section 70 - Supply by description
"... there is an implied condition that the goods will correspond with the description..."
Section 72 - Implied undertakings as to quality or fitness
"... there is an implied condition that the goods supplied under the contract for the supply of the goods are of merchantable quality..."
Section 74D - Actions in respect of goods of unmerchantable quality
"... the consumer suffers loss or damage by reason that the goods are not of merchantable quality; the corporation is liable to compensate the consumer or that other person for the loss or damage and the consumer or that other person may recover the amount of the compensation by action against the corporation in a court of competent jurisdiction."

"Goods of any kind are of merchantable quality within the meaning of this section if they are as fit for the purpose or purposes for which goods of that kind are commonly bought as it is reasonable to expect having regard to:
(a) any description applied to the goods by the corporation;
(b) the price received by the corporation for the goods (if relevant); and
(c) all the other relevant circumstances."

So what constitutes "merchantable quality" and "reasonable to expect" in the software industry? What testing prior to deployment is reasonable to expect? How do we ensure that the "goods will correspond with the description"? What will be the outcome when some of these clauses are tested in court?

I am not a lawyer, but I believe that many vendors are likely falling short of their responsibilities in these areas. However, many consumers and acquirers are not prepared to legally challenge for their entitlement to quality.

Sunday, April 3, 2011

Bridging the Software Testing Skills Gap

As the Australian economy improves, and pressure to obtain skilled resources grows, the pressure to find skilled software testing resources will be even greater. This is the most critical challenge currently facing Test Managers in large organisations throughout Australia. Ultimately the ripple effects will be felt by IT and Project Managers, and up to the CIO’s software delivery pipelines, as inadequate access to testing resources will place successful releases at risk (see Testing is IT's Elephant in the Room).

In the Ross Report the evidence collected led me to conclude that Australia will need to find 1500 – 2000 new testers to fill the gap. The reasoning for this is that not only will IT projects continue to ramp up, but the IT project budget apportioned to testing is continuing to grow, and this translates into a significant increase in testing expenditure by 2012. Evidence for this can also be seen in the ramp up in job vacancies, as shown in our Software Testing and QA Job Index.

Sourcing of testing resources is changing, with a shift towards increased outsourcing. Furthermore, testing is becoming increasingly analytical, where testers are now expected to provide technology advice at a similar skill level to that of others in the development team, such as programmers, business analysts and solution architects.

The implications of the Skills Gap are twofold:

Testing resources will become a supplier's market. Salaries will increase, there will be greater competition for available good resources, bidding wars will take place on experienced resources, and organisations will face greater churn as resources leave when tempted by other opportunities.

Shortage of resources will limit what project demands can be met. Projects may be delayed, or released with less than desired governance and quality control. Organisations will face greater risk.

On April 14th and 15th, Australia’s leading Test Managers will meet for the 9th Australian Test Managers Forum on the Gold Coast. At last year’s Forum, attendee feedback confirmed resourcing as their most critical challenge, further reinforcing the evidence above. Hence the key focus for this year’s Forum is the Skills Gap, and what organisations are doing to address the challenge.

This article explores the considerations, and seeks feedback as to what organisations are doing to bridge the Skills Gap. A number of strategy questions that organisations may be considering are:

Sourcing Strategy

Where will your resources come from?
Whether to use permanent, contract or outsourced resources?
Whether to undertake testing onshore, offshore or a blend of both?
Will you rely on a constant permanent team, or do you need to be flexible with your resource demands?
Are resources full-time or part-time?
What supplier relationships need to be established to secure access to resources when needed?

Attracting Resources

Where can we find new testing resources?
Do we look at immigration to access a greater resource pool?
Is testing seen as an attractive career path to potential recruits?
Are Australian universities producing graduates to address the testing skills gap?
Are resources from other parts of the business being reskilled successfully to address the skills gap?
Are resources from other parts of development being reskilled successfully to address the skills gap?
Are graduate, intern or traineeship programmes being used as a mechanism to develop new testers?

Retaining Resources

Are organisations able to offer the right environment to retain testers?
Are salaries competitive?
Are career paths in place, and do testers see advancement and personal development opportunities?
Do testers have opportunities for promotion without having to leave the discipline?
Is testing a good pathway for developing leadership in other areas of the business?
Are appropriate skills development programmes in place to continue to advance skills and provide greater value?
What is the best way to develop skills: classroom or online training, mentoring, project placement, etc.?
Are organisations providing a pleasant workplace environment?
Are testers recognised and respected by their peers?
Are team structures supportive of the testing role?

Reducing the Workload Demand

What efficiency improvement can be undertaken to reduce the requirement for skilled testing resources, yet still produce a quality outcome?
Are team structures well balanced?
What level of testing needs to be undertaken by independent testing?
Are processes efficient, in terms of most cost-effectively finding and fixing defects, using the right skills and resources, at the right time?
Can less testing be done and still achieve the successful outcome, e.g. through prioritisation and risk-based testing?
Can Test Tools be better utilised to reduce the workload?
Can scheduling the development lifecycle manage the workload, in terms of spreading testing more evenly?
Can we achieve sufficient reuse of test assets to improve efficiency?

A lot of tough questions here! Please post comments regarding your thoughts on what organisations are currently or are planning to do to address the skills gap. I will take your comments to the Forum for discussion, and also post a follow up article here after the Forum on what alternatives were suggested.

Tuesday, March 1, 2011

Bank error reveals Student's Ebay Fraud

As reported by Computerworld, there are times when IT failures do some good. In Brisbane a schoolboy's Ebay scams were uncovered only after a glitch incorrectly deposited $2 million into one of the fake scam bank accounts.

Tuesday, February 15, 2011

Virgin Glitches Again, But No Queues

Virgin has again had glitches according to Computerworld, but this time no long queues as the system was down only for a few minutes.

Last time failures resulted in massive delays across the country.
