Pirates and Angels

Jon maddog Hall maddog at dtype.org
Mon Feb 3 13:30:23 EST 2003


Several years ago I was attending a trade show in Dallas when I got an
emergency telephone call from one of our product managers for Digital
Unix.  She was in Houston, and wanted me to leave the trade show and
fly to Houston to visit a customer site.  She was in a panic, and told
me that my absence from the show had been approved all the way up our
management chain.  I drove to the airport and grabbed the next airplane.

The next day we visited the Johnson Space Center, and met
with a rather intense man who told us about the computer systems at the center.
He told us how the old systems were built from scratch, with operating
system software designed from scratch, and how all of these systems were
now 20-30 years old.  He told us that the costs of maintaining these
systems were now well over 200 million US dollars per year, and that NASA
was afraid that if they did not trim money where they could, the whole
space mission might be scrapped.  More importantly, the people who built these
systems were not only retiring, but they were dying.  It is one thing to get
a retiree to come back to make a small change to the system...

In any case this man and a small group of people under him, whom he
code-named "The Pirates", had put together a plan to use modern-day computers
and modern-day standard operating systems (called Unix), and to rewrite all the
application code developed over 30 years.  They argued that after this
was done once at a cost of 200 million USD, the annual cost of maintenance
would be about 23 million USD, about 1/10th the current yearly maintenance
cost.

This project had already been done, and over 200,000 hours of mission
telemetry data had been fed into both systems to see where they differed.
If one system got a different answer from the other, the simulations were
stopped, and the differences were understood.  The project was now finished.

But before the project was put on line, there was still one last detail.  The
manager of the project wanted to meet with representatives of the company
supplying the computer systems.  He wanted to tell us how much he was
depending on our systems to work flawlessly...absolutely flawlessly.
He wanted to tell us that if he even THOUGHT for one brief moment of time
that our systems were not up to the job, that he would scrub the mission,
even if it meant losing hundreds of thousands of dollars in pre-launch
staging.  He wanted to know at this meeting if we knew of ANY REASON why
our operating system and hardware architectures were not up to the job.

The DEC product manager's hand had been lying on my arm.  As the NASA project
manager kept telling us how much he depended on us, how much he was willing
to throw accolades our way if things went well, and how much he was willing
to drag our company down if things did not go well, her hand tightened around
my arm.  Finally I had to hiss, "Pat, I can't feel my fingers any more."  Her
hand loosened a bit.

In the end, the NASA project manager turned to us and smiled.  He said,
"I know that you feel I have been very intense during this meeting, but I want
to let you know why.  I was the flight manager the day the Challenger blew
up, and even though that was not my fault, I lost a billion-dollar payload
and seven good friends that day."

I looked him in the eye and said, "Even though I know that no complex system
is free of bugs, given the level of testing and the level of redundancy
that you have here, I believe that the system will perform to your needs."
He thanked me, and the meeting was over.

I never heard directly from him again, and for the next few shuttle launches
I held my breath to see if there were any delays that could be attributed to
our gear, but apparently there were not.  Eventually I saw a Rolling Stone
article about "The Pirates" and their work at saving NASA money and creating
greater flexibility by using "standard" hardware and software, and it
included an interview with that same NASA project manager.  In it he briefly
praised our operating system and hardware for its dependability.  He was
true to his word.

Now as I see articles criticizing NASA for not doing enough to save
money, or for trying so hard to save money that it jeopardizes
safety, I think back to that project manager and remember that the only
break in his steely reserve that day was when a tear rolled down his face
at the mention of the Challenger crew.  Somehow I doubt that he, or anyone
at NASA, would have given the "O.K." to launch if they thought the shuttle did not
have an acceptable chance of coming back again in one piece.

Sincerely,

Jon "maddog" Hall

-- 
=============================================================================
Jon "maddog" Hall
Executive Director           Linux International(SM)
email: maddog at li.org         80 Amherst St. 
Voice: +1.603.672.4557       Amherst, N.H. 03031-3032 U.S.A.
WWW: http://www.li.org

Board Member: Uniforum Association, USENIX Association

(R)Linux is a registered trademark of Linus Torvalds in several countries.
(SM)Linux International is a service mark of Linux International, Inc.



