[SATLUG] when will people learn?

Alan Lesmerises alesmerises at satx.rr.com
Sat Feb 26 12:52:23 CST 2011

On 2/26/2011 9:25 AM, Daniel Givens wrote:
> On Feb 24, 2011, at 10:27 PM, Geoff Edmonson wrote:
>> On 02/24/2011 09:36 PM, Alan Lesmerises wrote:
>>> On 2/24/2011 6:36 AM, Geoff Edmonson wrote:
>>>> http://www.bbc.co.uk/news/technology-12554499
>>> "It is going to portray Microsoft in a bad light ..." -- uhhh, YEAH!!!
>>> Talk about stating the obvious!
>> <snip>  as long as Microsoft has been releasing software, there's been an immediate rush, as soon as it was released, to update it because of -some- issue, be it compatibility, or security.</snip>
> <snip>  Quality control in the software industry isn't what it should be. Period. It's not just a Microsoft thing.</snip>

My initial comment wasn't specifically MS bashing -- it was more of a 
commentary on the author of the article.  That being said, there does 
seem to be a systemic problem with the processes in-place at MS for 
software development and reliability testing.

Being a Reliability Engineer (albeit not one specializing in software), I do 
occasionally have to deal with software reliability issues, including 
reliability principles, standards, and testing.  A colleague who does 
specialize in software reliability explained to me that software can 
actually exhibit a sort of wear-out phenomenon.  I know, that sounds a 
little crazy, but it has more to do with how the software morphs over 
time as more and more patches are applied to it.  Each patch introduces 
more opportunities for fatal flaws to emerge, until you reach 
the point where it all collapses & you have to chuck the whole thing & 
start over.

I don't have direct knowledge of what code comes from where, how long 
it's been in use, how many patches have been applied, etc., but I 
suspect that MS just keeps trying to tweak the existing code base rather 
than go back and do a clean-sheet rebuild of the older software, drivers, 
etc. (unless they absolutely have no choice in the matter).  Most other 
organizations probably operate the same way -- it costs money to have people 
write and test code, and if there's something that already exists that 
performs the function needed, they reason, "why re-write it?"

From the reliability testing perspective, that same colleague told me 
that it really is impossible to test for all possible conditions that 
software may be subjected to in the real world, so it's quite routine 
not to find many problems until it has been released to the world.  In 
fact, one of the primary approaches to software reliability testing is to 
actually take a user, place them at the keyboard, and tell them to "have 
at it" -- try to do something to break the software.  Unfortunately, 
with modern software being as complex as it is, with so many features 
& capabilities and the multitude of ways that a given task may be 
performed, there is no way to possibly test every function, process, 
control, etc., in every possible combination or sequence to ensure 
everything works properly every time.

What this means is that it's extremely important to implement good 
coding practices from the start, at the very lowest levels, such as 
capturing and handling unexpected inputs so they can't produce 
undesirable results (buffer overruns, anyone?).  The prevalence and 
types of bugs that seemed to be so common in MS products in the 
past suggest that good coding practices were not sufficiently 
emphasized there.  And I definitely agree with Daniel that it's not 
exclusive to MS either.

Al Lesmerises
