Monday, November 17, 2014
Testing Contracts In Integration
Tuesday, September 23, 2014
Independent Testers Are Like Parents Of Drug Addicts
Ever heard of "codependence"?
Codependence is a type of dysfunctional helping relationship where one person supports or enables another person’s addiction, poor mental health, immaturity, irresponsibility, or under-achievement [Wikipedia]. It is like being the parent of a drug addict: you are trying to help them beat the addiction, but in helping you can actually feed the addiction. It can be destructive for both parties.
Having independent software testing often leads to the same trap of codependence.
Back in the "good old days" software engineers tested their own code. There weren't dedicated software testers; sometimes, as a software engineer, you got assigned to testing. In my first job in avionics systems you would spend most of your time testing, but you were still a software engineer.
Then in the late 90s, spurred on by Y2K, independent testing really gained traction. Independent testers look at things differently to developers, and they provide a quality gate. All of this is true and beneficial.
But then software engineers stopped testing!
That is the independent tester's job. They own quality. They will find the bugs because they are good at it. The nit-picking detail checking is better done by anybody other than me!
The result is that development no longer owns quality. Testers find lots of defects that should have been found more efficiently earlier. Buckets of time and schedule are lost as testing is passed over the fence and down the line. Developers are not being told how to improve quality and avoid introducing bugs in the first place. Instead we just beat them over the head with lots of public failures, which frustrates everyone and puts developers and testers in conflict.
There are seeds of change though. Many agile and continuous quality approaches are putting ownership of testing back with development and delivery teams, rather than leaving it to sit only with testing. Developers are taking on greater testing responsibilities. Testing is becoming more of a coach and a safety net, providing input on where development processes can be improved internally to deliver software faster and more reliably.
Last week I reviewed two great pieces of information which made me reflect more on correcting the codependence. I highly recommend you check them out:
- How Google Tests Software, Whittaker et al - at least read the intro, which talks about the way testing roles are set up within teams
- How to Deliver Quality Assurance At Speed, Wyatt - register to get access to the webinar recording. See how the Jira team at Atlassian eventually did away with reliance on independent testing, and evolved to developers doing the testing with the support of "Quality Assistants".
And finally a plug for our upcoming conference iqnite Australia, where we are leading the thinking on reshaping how organisations approach their testing. In 2014 we have a big focus on DevOps, Agile and Continuous Quality, including the ideas discussed here.
Saturday, May 31, 2014
Feature Toggles
While most QA and test managers want to delay release until certain that there are no major defects in the system, this can significantly stymie delivery and inhibit innovation. Feature Toggles allow new features to be deployed, but only exposed to limited users. In some cases this may be named users, internal user groups, or a limited slice or percent of the customer base. In this way, defects in new features have limited exposure and impact on the user base. This enables the new features to be tested on production environments, and also allows feedback from alpha and beta users on the effectiveness of new features.
Simplistically, a Feature Toggle is an if statement surrounding a new feature that determines whether the feature is exposed to the user or not (a sketch follows the list below). However, more powerful Feature Toggle libraries can use various aspects to determine whether the feature is exposed, such as:
- user credentials and groups
- employees
- geo-location
- browser user-agent
- percentage of users
- server load
- etc.
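As a rough sketch of how such a toggle check might look (the FeatureToggles class, is_enabled method and rule names here are hypothetical illustrations, not any particular library's API):

```python
import hashlib

# Hypothetical sketch only - real toggle libraries (Togglz, LaunchDarkly,
# Unleash, etc.) have their own APIs. This just illustrates deciding exposure
# from user attributes and a percentage rollout.
class FeatureToggles:
    def __init__(self, config):
        self.config = config  # feature name -> rollout rules

    def is_enabled(self, feature, user):
        rules = self.config.get(feature, {})
        if user.get("id") in rules.get("named_users", set()):
            return True  # explicitly listed alpha/beta users
        if user.get("group") in rules.get("groups", set()):
            return True  # e.g. internal employees
        # Hash feature + user id so the same user always gets the same answer.
        bucket = int(hashlib.md5(
            f"{feature}:{user['id']}".encode()).hexdigest(), 16) % 100
        return bucket < rules.get("percent", 0)


toggles = FeatureToggles({
    "new-checkout": {"groups": {"employees"}, "percent": 10},
})

user = {"id": "u-1234", "group": "customer"}
if toggles.is_enabled("new-checkout", user):
    pass  # render the new checkout feature
else:
    pass  # fall back to the existing behaviour
```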
Enabling features earlier in a controlled way speeds up feedback from customers. For startups, and for applications requiring innovation, the greater risk is often that we are building the wrong product, which can matter far more than whether we are building the product right. In these situations we want to prioritise testing with real customers, getting feedback on the effectiveness of the features; this may take higher priority than whether there are defects in the feature construction itself. Getting user feedback early allows the design of the feature to pivot, and we can return to other system tests once the feature's user value has been established.
Feature Toggles can be key to the A/B Testing process. Toggles can expose features depending on the A/B or multivariate segment, and performance measurements can then be compared between segments with the feature exposed and hidden.
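As a sketch of that idea (a hypothetical helper, not a specific A/B framework), a toggle can assign each user deterministically to a segment so measurements stay comparable across visits:

```python
import hashlib

def ab_segment(experiment, user_id, variants=("A", "B")):
    """Deterministically assign a user to a variant for an experiment.

    Hashing experiment + user id keeps the assignment stable across visits,
    so metrics can be compared between the exposed and hidden segments.
    """
    digest = hashlib.sha1(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Expose the feature only to users who land in segment "B".
show_new_checkout = ab_segment("new-checkout", "u-1234") == "B"
```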
Adopting Feature Toggles has its gotchas that must be carefully managed. At some stage a successful feature should be mainlined and the toggle taken out of the code; where the feature is unsuccessful and its toggle stays off, the feature should be removed from the code base.
Testing feature toggles also requires care. Integration tests will need to flick toggles between on and off to expose or hide the feature within tests.
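For example, a parameterised integration test might force the toggle on and off and assert the feature is exposed or hidden accordingly (a sketch assuming pytest; `toggles` and `checkout_page` are hypothetical application helpers, not real modules):

```python
import pytest

from myapp.toggles import toggles        # hypothetical toggle registry
from myapp.pages import checkout_page    # hypothetical page/driver helper


@pytest.mark.parametrize("toggle_on", [True, False])
def test_new_checkout_toggle(toggle_on):
    # Force the toggle for this test, then restore it so other tests
    # see the default configuration.
    toggles.override("new-checkout", toggle_on)
    try:
        page = checkout_page(user="test-user")
        assert page.has_element("new-checkout-banner") == toggle_on
    finally:
        toggles.clear_override("new-checkout")
```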
Tuesday, August 13, 2013
Evidence of Testing
The replacement of the QH payroll system must take a place in the front rank of failures in public administration in this country. It may be the worst. 2.15 pg 12
It is estimated that it will cost about $1.2B over the next eight years. 2.14 pg 12
Standards which had been preset to ensure that the system when delivered would function adequately were lessened or avoided so as to permit implementation. 1.24 pg 215
The risk of doing so was clear and was made explicit by KJ Ross and Associates Pty Ltd, engaged to conduct User Acceptance Testing. 1.25 pg 215
Mr Cowan conducted the exercise he was retained to carry out competently and professionally. Moreover, his approach to that task stands as a rare example in this Project of a cogent warning provided to the parties of problems which clearly existed, but which both parties had to some extent been reluctant to accept and to resolve. Had the warnings he gave in his Final UAT Report been heeded, then the system ought not to have gone live when it did and many of the problems experienced by the system would have been avoided. 5.61 pg 123
Tuesday, July 5, 2011
The Army's $2.7 Billion Computer Designed To Help Troops In Iraq And Afghanistan Doesn't Work
- cannot perform simple analytical tasks
- has trouble finding reports
- the mapping software is incompatible with the search software
This post provided to me by Rocket, one of our rising star graduate testers.
Friday, June 17, 2011
Questions Regarding the Ross Report
I thought I would post a recent conversation I had regarding the analysis of metrics. The Ross Report collates software testing benchmarks for Australian organisations. The discussion with Michael De Robertis, Test Manager at ISoft, highlighted a mutual interest in testing metrics, and the deeper questions raised as you start to peel back the layers.
Hi Kelvin.
Thanks a lot for your detailed response.
Yes please feel free to post on your blog, and you’re welcome to include my name/title/company.
From my personal perspective, it is a breath of fresh air to establish good, constructive dialogue with like-minded people who have a deep understanding of Testing and who wish to challenge the boundaries of what it all means.
One of the things I love about testing is that despite having 15 uninterrupted years in the profession I still have so much to learn, and good philosophical discussions about Testing still inspire me.
Whilst I can’t speak for regions outside of Canberra (and Sydney to a lesser degree where I used to live), I am deeply concerned with the general quality of Testing professionals out there, and perhaps this is my biggest driver for having a good, mature Benchmarking process which I refine and improve religiously.
I am seeing all types of Testers out there – those who are only politically driven and insist on a requirement having a dot on the “i”; those who don’t understand business priorities and how to manage risk or to plan; those who only want to break systems and drive wedges with development; those who lose motivation quickly; and those who don’t understand why solid process needs to accompany what we do.
I am seeing these qualities in “certified” testers (both new and experienced), and in graduate IT professionals, many of whom do not even study one single subject on testing. In the final year of my Electronic Engineering degree, testing was a big deal. I even majored in subjects on testing (through what was then called the Australian Centre for Test and Evaluation, ACTE). However, unfortunately, I don’t see a passion for testing driving the Test industry as it did when I began my career. Instead it seems to be only market-driven (i.e. we need more testers, even if they are no good at it).
Apologies for the drawn out background. In my view, this is where I see publications such as the Ross Report adding real value, because the Testing industry needs proper guidance through skills validation programmes complemented with independent evaluations (benchmarks) of how well the industry performs.
Refer below, where I have responded to your follow up questions.
Thanks again.
Regards,
Hi Michael,
Thanks for your comments and feedback on the Ross Report. I really appreciate your email, as it gives us valuable feedback on the way the industry benchmark is being used and what improvements and additional information users are seeking.
Apologies first of all for not responding sooner, it has been on my long list of to do items for a little while, and I have been waiting for a little time to put some real attention to it.
I will add some responses inline with your feedback below.
Also would it be possible that I could post your comments and my responses to my blog page? I found this discussion quite interesting, and I would like to share it with others.
Cheers
Kelvin
Hi there.
Can you please forward this email to someone who is familiar with the content of the Ross Report?
I found the report very comprehensive and particularly useful to compare to our benchmark... love the metrics! :)
I must say I am a numbers person myself. I love looking at what data shows, and using that to gain a deeper understanding of the testing process. Also I have the view that "without evidence, you are just another person with an opinion".
What I did find, though, was that the more you look at the results, the more questions open up to you. You tend to question: how did the number emerge, what does it mean, what influences it, how do I improve it... And I think that is the point you make below. Certainly after scratching the surface, I want to know more.
Getting the balance right in the survey is hard. To better understand, I want to ask more questions, but the tradeoff then is that participation will fall off if respondents find the survey too onerous. What we probably need to do is periodically run other, more detailed surveys that drill down, so long as participants don't then suffer "survey fatigue"! Anyhow, it is a challenge to build the just-right "goldilocks" survey.
Compiling the Ross Report turned into a much larger job than I anticipated. We spent well over 500 hours, with multiple people involved. It was a big job, but we want to go deeper and wider (more respondents) next time.
We are just about to kick off the 2012 Ross Report, so your feedback is very timely.
In my personal view, there were three areas that I was hoping someone could comment further about (purely for the benefit of comparing philosophies of benchmarking).
1) More proof about Automation Tools effectiveness – The report rightly separates automation and performance testing from manual testing, however I couldn’t find measurements that specifically reported on whether automation testing (for example) improved regression fail rates or resulted in bugs being found. All that I could see was that organisations were happy with their choice (but not the backup metrics as to why). I also couldn’t find metrics on downtime needed for training, setup, configuration, execution, and maintenance of those tools.
Very good point. I think the industry is still uncertain about the real benefits they hope to obtain from automation. Few organisations measure their effort, tests automated/executed, defects found, etc. Automation ROI expectations seem to be poorly understood in terms of their economics, and the metrics to show that.
There are questions around metrics for automation, for instance whether we should measure manual test reduction. Many times we count execution reduction from automated tests that would perhaps never have been chosen for execution if they were manual. We probably still expend a lot of effort running tests that shouldn't be continued, as they are not providing much insight.
There were some metrics on coverage and regression re-execution specific to automation, and these indicated some surprising insights - many organisations are poor at re-executing automated tests. Obviously this requires more in-depth follow up! There were also some metrics in relation to ...
Your comment about defect discovery rates from automation is also pertinent. I think this is a very important metric, perhaps to be combined with effort to deduce defects per hour (from automation). On page 49 you will see we report that only 6% of regression tests fail per test cycle. Regression tests have a lower defect yield than new feature or defect correction tests. I would expect for many this would also apply to automation, as in most organisations automation is focussed on regression testing.
I think your suggestion to extend the defect detection and testing effort metrics from general testing to look specifically at automation is a good one, and one we will look at.
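A tiny illustration of the kind of derived metric being discussed here, combining defect counts with effort (the figures below are made-up placeholders, not numbers from the report):

```python
# Illustrative only - placeholder figures, not Ross Report data.
def defects_per_hour(defects_found, effort_hours):
    return defects_found / effort_hours if effort_hours else 0.0

automation_yield = defects_per_hour(defects_found=12, effort_hours=80)    # 0.15
new_feature_yield = defects_per_hour(defects_found=45, effort_hours=120)  # 0.375
```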
2) Skills validation as opposed to Skills Certification – There is still a view amongst many testing professionals that existing certification programs are not value-adding other than to say that you have the certification. Although I have certification, I lean towards this view because I personally feel that certification could do more to prove that testing professionals have worked the hours, gained the experience, delivered the services, and have been endorsed by former and current peers. Hence I think the report would add value if it considered skills qualification and validation rather than through certification alone.
I agree, certification falls short in a number of areas in terms of skills validation. See http://kelvinross.blogspot.com/2010/10/certification-is-failing-do-we-need.html and http://dorothygraham.blogspot.com/2011/02/part-3-certification-schemes-do-not.html.
I think to go beyond the certifications that organisations' staff are attaining, we would need to get some measurement of the competencies of the staff. I am just not sure how we can measure this from an organisational perspective. It is pretty hard to assess the competency of individuals; doing it in a meaningful way for organisations, hmmm, I need to think about that a bit more.
We tried to assess where staff are sourced in terms of career entry, e.g. from the business, development, etc., and the impact of staff attributes, e.g. tools, techniques, theory, etc.
In other work we have undertaken skills assessments, including with our own consulting staff, scored skills for different skill categories, and tried to roll that up to an organisational profile score to aim for improvements, etc. I say tried, because getting a precise skill profile was a challenge. While this profile is important to the individual organisation, I am not sure how it would benefit as an average trend, nor how we would measure individual organisations in a consistent way.
I think also if we take an organisational norm, it would perhaps look similar if we surveyed individuals regarding their skills and competencies across a large survey set. It is also possibly easier to do it at an individual level. What it might not show us is which skill areas are in most demand, and where the most significant shortfalls are.
That being said, what you say about skills validation is definitely correct. Could you perhaps give me some more suggestions on what you would like to see here from a benchmarking point of view?
You are right, I’m not sure how you would present this in a benchmark / survey form and I’m not sure how respondents would react. However, if I look at this from a real day to day issue, my problem is that I cannot be secure in knowing that a Tester is fit for the job on the basis of having certification. I recall the early 90’s when Cisco Certification was a big deal. Now it isn’t, and everyone has it. In my view Testing is heading down the same path, and therefore I rely on my own benchmarking to substitute “certification” for “validated skills”.
I think your report almost has the structure in place to do this when you look at the Perceptions section. For example instead of a Test Manager answering a question about what their perception is about something, why not have a section whereby a number of disciplines are required to answer (ie Developers, Dev Managers, Project Managers, BA, etc)? At least once per year I obtain this feedback from my peers. Perhaps respondents might not do so if their team culture is not solid, I don’t know.
In terms of actual certification, I think something like CBAP (for business analysis / systems engineering) would be a good model.
3) Respondent Profile & the influence of the report due to their Methodology – I couldn’t relate to the average team size being up to 25. In my 15-year testing career spanning small and large organisations, I can recall only two occasions where test teams were this big (let alone the whole project team), and even then they were subdivided and very independent of each other. Therefore I was slightly concerned that the report leaned towards the perspective of more strictly controlled / less-progressive teams running old methodologies. In fact, your Methodology section confirms this, given that Waterfall appeared to be one of the most prominent methodologies used.
Team sizes vary dramatically; you can see some organisations at 1-5 members, the norm around 11-25, and also several in the 100+. The team size, though, refers to the overall testing team size for the organisation, and hasn't been reduced to project testing team size. Some of the 100+ organisations I know have over 500 testers, some at 700-750 testers.
When we looked at the Project Landscape you can see it is common to have 10-20 projects per annum. Also common was a 6-month project duration. This paints a rough picture that on average you are probably going to have a project with 2 or 3 testers. Larger projects with longer durations may have had 5-10 perhaps. We didn't specifically ask, though, for the average number of testers on a project. As you point out, that could change a lot based on methodology, architecture, team composition, etc.
I think the survey does have a bias towards "testing aware" companies, as we are less likely to have contact with "testing ignorant" companies. Also, as you point out, it is likely biased towards larger organisations. I think in benchmarking yourself, you need to think in terms of benchmarking against the testing industry, which wouldn't include "testing ignorant/weak" organisations.
Waterfall is still most common, however Agile has been increasing dramatically in recent times. Waterfall remains prevalent in large teams/organisations, however we are finding the understanding is improving, and the barriers to adoption of Agile are coming down. What I still see in organisations is a misconception of what constitutes "Agile", which is why we tried to drill down into some of the Agile practices. We need to do this further again in the next survey.
Some other areas that didn’t appear to be covered which might add value in my opinion are:
1) Exploring more deeply into the skill sets of testing professionals. Examples:
a. Are they more closely aligned with the business side (in scope & analysis) or with the development / technical side of things (design, tools, infrastructure, etc)
b. What classifies an Automation or a Performance Test Specialist? Is it a Tester who can capture/playback and make basic adjustments to automation scripts, or are they at the other end of the spectrum and are fully competent in reading/writing in scripting and programming languages?
I think this relates to your point 2 above.
2) More detail about the value adding of test professionals (in addition to perceptions of testing). Examples:
a. How quickly can they adapt to the job at hand?
b. What is their contribution to process improvements, finding important bugs, and identifying new features?
I think we should be able to look at some of these in future perceptions. We just need to be able to phrase it as a perception question, one that has qualified results, and one where the benchmark is likely to provide some comparable/actionable results that organisations can compare their own position against.
As discussed earlier, perhaps we need to split out a survey for individuals and get their bottom-up input into the testing approach, rather than only the top-down perception from the organisation view.
3) More information about how Test Managers estimate. Example:
a. If a Test Manager has a wealth of metrics at their disposal, what are the best ones to use when estimating for future projects, and what are their reasons?
Michael, is this something that you wanted surveyed, or something that needed some further discussion?
In my experience, people tend to use rules of thumb at the early stage, supported by % of project budget and developer/tester ratios (two metrics we did survey). That is when they are varying their budget; for many, it is simply what you had last time. As more detail comes in they tend to move towards work breakdown, and that's when overheads such as test management, test environments, etc. must be factored in (again, the distribution of test effort metrics provides some guidance here). I guess we could survey whether people use other things such as function points (or test points, etc.), and other estimation approaches.
If you think we should survey this further, some example of what you would like to see would help us.
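As a purely illustrative sketch of the early-stage rules of thumb mentioned above (the default percentage and ratio are placeholders, not benchmark figures):

```python
# Illustrative only - placeholder values, not Ross Report benchmarks.
def test_budget_from_project(project_budget, test_percent=0.25):
    """Early estimate: testing as a percentage of total project budget."""
    return project_budget * test_percent

def testers_from_developers(developer_count, dev_to_tester_ratio=3.0):
    """Early estimate: tester headcount from a developer:tester ratio."""
    return developer_count / dev_to_tester_ratio

budget = test_budget_from_project(1_000_000)   # 250000.0
testers = testers_from_developers(9)           # 3.0
```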
Good question. I probably have to think about this one in more detail, but basically, I am constantly challenging myself to get the perfect estimate. That is despite the scope creep, changing priorities, higher than expected bug rates, etc. My view is that these ‘unplanned’ events should always be expected and therefore planned for. However, this should be done with a proven technique rather than simply doubling the time required (for example). I have detailed methods which I use and which consider these events, but I’m not sure if others do likewise, particularly in organizations where they have to stick to a fixed number of deliverables or timeframes (even if quality is not up to scratch). In my group, we react according to the need and therefore the estimation process needs to have lots of options. Stressful? Yes. Good end result? Also yes!
Hence I’m after guidance which tells me that my estimation methods are good or not good (etc). From a mathematical perspective, I would think this is achievable.
4) Post Deployment and measurement of success - How accurately was testing implemented to plan, what criteria were used (and by whom) to determine if the project was a success, and do these criteria ever change depending on the priorities? Example:
a. If the test plan could not be implemented accurately (due to other project issues out of our control) AND the project ran late (due to a larger number of bugs needing to be fixed) AND the client was very happy with the end result, is this considered a failure or a success? In my industry (safety critical), I would be happy with this result.
Yes, your point is quite right. Completing a project on schedule is not the only measure of success. Finding defects late in a project is always a huge frustration, and I have experienced first hand where, despite the testing evidence, the project has steamrolled into production, in some cases costing hundreds of millions of dollars. So success could be a cancelled project due to test evidence.
The Standish study doesn't reflect that point. I think IT portfolio management should be like share portfolio management. Some investments will not achieve the return, and it is better to abandon them than to stick with them.
I guess we (as an industry) need to think about measuring success more effectively. Quality ultimately needs to include the end user / customer perception. While testing, we tend to focus on pre-deployment metrics, but it is the post-deployment metrics which really count! It is like we are measuring inputs to quality, when we need to focus more on outputs of quality.
Here we could look at specific issues, like you suggest above. It might be worth understanding:
- for project delays caused by testing, did post-deployment quality improve? I.e. are testing results beneficial in influencing project schedules?
- do project plans effectively use testing results to measure plan milestone completion?
- and so on
Something also for our benchmark team to focus on for 2012, and one where your suggestions are very welcome.
Tuesday, April 19, 2011
Is it time consumers get tough on software warranties?
- Section 68 - Application of provisions not to be excluded or modified
"... purports to exclude, restrict or modify or has the effect of excluding, restricting or modifying: ... (c) any liability of the corporation for breach of a condition or warranty implied by such a provision; or ... is void." - Section 70 - Supply by description
"... there is an implied condition that the goods will correspond with the description..." - Section 72 - Implied undertakings as to quality or fitness
"... there is an implied condition that the goods supplied under the contract for the supply of the goods are of merchantable quality..." - Section 74D - Actions in respect of goods of unmerchantable quality
"... the consumer suffers loss or damage by reason that the goods are not of merchantable quality; the corporation is liable to compensate the consumer or that other person for the loss or damage and the consumer or that other person may recover the amount of the compensation by action against the corporation in a court of competent jurisdiction."
"Goods of any kind are of merchantable quality within the meaning of this section if they are as fit for the purpose or purposes for which goods of that kind are commonly bought as it is reasonable to expect having regard to:
(a) any description applied to the goods by the corporation;
(b) the price received by the corporation for the goods (if relevant); and
(c) all the other relevant circumstances."