IEEE Y2K Chairman takes questions


I am Dale W. Way, Chairman of the Year 2000 Technical Information Focus Group of the Technical Activities Board of the Institute of Electrical and Electronics Engineers (IEEE). My committee wrote the pivotal letter to Congress that broke the logjam on Y2K liability legislation this past summer, and recently I wrote a serious critique of Ed Yourdon's Y2K End Game essay. I have been asked by David B. Collum, Professor of Chemistry and Chemical Biology, Cornell University, to introduce myself and respond to questions from the community that communes on this forum. I am doing that. I cannot promise I will respond to all questions, as I am very busy, but I will try. (I will be out of the country from 12NOV to 19NOV.)

I have already dialoged with others on both pieces/subjects and have learned from those experiences. The following paragraph, which I wrote during one of those dialogs, is useful in this forum as well.

Email exchange is not the best medium for conveying complex ideas with depth and subtlety; there are many associated concepts that follow along with certain words and expressions that are not always held in common among the participants. Another major disrupter to communication is something a close colleague pointed out to me recently: "When knowledgeable, well-meaning people disagree, it is usually because they are talking about the same thing from different levels of abstraction." I tend to talk at the categorical, meta and meta-meta level of the situation, making generalities about categories derived from a coherent taxonomic framework that attempts to encompass the crisis from as close to its true width and depth as I can get. Others, rightfully, are more grounded in the concrete instances of their personal or extended experience. Both perspectives are helpful, neither sufficient to deal with any concrete situation/solution that also has to perform on a larger stage. The more abstract perspective is absolutely necessary if a holistic understanding is sought. (Not that everybody seeks that; most don't.) The level of concrete instances is far too extensive and multi-dimensionally complex for anyone to hold in their head at one time. Dimensions must be collapsed, abstractions must be made, if the whole is to be seen.

That said, I will summarize the controversial positions I have taken, which chop away at the base of many of the orthodoxies of the Y2K community (or Y2Klatura as I call it when in a fighting mood).

1. Rollover, the real-time transition of the century boundary, is way overblown in its significance. It is neither the end nor the beginning of anything distinctive; forward-looking applications have had to deal with Y2K for years already, and backward-looking applications will have to deal with it for years to come. While there are some distinctive parts of the infrastructure that are especially sensitive to rollover, the overall Y2K crisis is not.

2. With a few rare, minor, short-term exceptions, every logical element in a two-digit year computer system (a device or program module or anything built of combinations of them) will cure itself, without remediation, given enough time. This is true because every logical element has its own range of year representations it must COMPUTE UPON (not just record/store) in a consistent manner. This it must do from the point of view of doing its job within the larger context of the rest of the system. This range is not determined by the creators or maintainers of that module, but by the real-world data stream it will actually encounter. (If the creators/maintainers had been really smart, they would have put a filter at the front end and explicitly rejected any year/date outside of a predetermined range that the element could actually handle -- but that's another story.) In practical terms, this data stream will have a maximum forward-looking aspect, concerning future year/date representations, and a maximum backward-looking aspect. When that entire range is on ONE SIDE of the century boundary, AND IT DOES NOT MATTER WHICH ONE, there is no ambiguity with the missing century digits; the computational math works. However long (in intrinsic calendar time) year/date representations in that range that straddle the century boundary can or are allowed to flow into the element is how long the element is intrinsically vulnerable to Y2K, no less, no more.
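To make that arithmetic concrete, here is a minimal C sketch (hypothetical code, not drawn from any real system) of a module that computes upon two-digit years:

    #include <stdio.h>

    /* Elapsed years between two 2-digit years, as a naive module
     * computes it. */
    static int elapsed_years(int yy_from, int yy_to)
    {
        return yy_to - yy_from;
    }

    int main(void)
    {
        /* Both years on the same side of the boundary: the math works. */
        printf("%d\n", elapsed_years(95, 99)); /* 1995..1999 -> 4, right */
        printf("%d\n", elapsed_years(1, 5));   /* 2001..2005 -> 4, right */

        /* The range straddles the boundary: the missing century digits
         * are ambiguous and the math fails. */
        printf("%d\n", elapsed_years(99, 1));  /* 1999..2001 -> -98      */
        return 0;
    }

Once every year this function is fed is back on one side of the boundary, it is correct again without a line being changed -- the "cures itself" point above.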

3. Not all Y2K errors lead to Y2K system failures. Nor do all Y2K system failures lead to business function failures. Nor do all business function failures lead to business or supply chain failures. Nor do all supply chain failures lead to disaster or catastrophe. The duration of a failure, the mean time to repair (MTTR) it, is a more significant factor in its breaking out of these containment layers and infecting the wider world than the number of failures; many failures can be tolerated as long as they are short. Beyond the sin of confusing what is POSSIBLE with what is PROBABLE, many in the Y2K community treat all Y2K errors with equal fear and apprehension, as if all will lead to disaster. If any commentator talks about a Y2K error, a Y2K problem or a Y2K failure that COULD happen and does not indicate a method for evaluating the PROBABILITY of its happening in any one place or across a wider area, AND does not illuminate a chain of events that would bring such an event out to the wider world in some reasonable, coherent way, what that commentator says should be commensurately discounted or ignored. The time for waving one's arms around and sounding the alarm about everything that could happen in order to build Y2K awareness is long past. Now we must concentrate on trying to understand what will happen.

4. Many elements have very narrow ranges of vulnerability and will not be a factor in any Y2K disruption. Clocks have an infinitesimally small range right at rollover, therefore have only a one-time risk and are usually easily resettable. Most physical/process control systems, if they have any year sensitivity at all (because they concatenate year/date with time representations and compute on the whole as a unit), have very narrow windows (seconds, minutes, maybe hours) that will pass quickly. While theoretically exposed during that period, there are other mitigating factors that keep that vulnerability from turning into failures that can break out into the wider world. Consequently the vast majority of such physical/process control systems that underlie the production facilities of our utilities and much of our factory infrastructure are at no or little risk of direct Y2K disruption that will be visible to the wider world. THE LIGHTS WILL NOT GO OUT AT MIDNIGHT. Those elements that have wider ranges of date representation they must handle live mostly in business/accounting/administrative computing. On-line transaction processing (OLTP), like ATM or internal check processing systems, usually does not look forward or backward beyond the day in question and is so at limited vulnerability. But to the extent such systems interact with backroom systems and databases that do have wider ranges, they are at more risk. We do not have to worry about having electric power or phones around rollover, but we do have to worry longer-term about having economically viable power and telephone companies.
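To illustrate how narrow that window is, here is a sketch of hypothetical controller logic (not taken from any real product) that concatenates a two-digit year with time-of-day and computes on the whole as a unit:

    #include <stdio.h>

    /* Crude seconds-since-1900 from a 2-digit year; leap years ignored
     * for brevity. */
    static long to_seconds(int yy, int day_of_year, int sec_of_day)
    {
        return ((long)yy * 365 + day_of_year) * 86400L + sec_of_day;
    }

    int main(void)
    {
        /* An interval measured across midnight, 31 Dec 99 -> 1 Jan 00. */
        long before = to_seconds(99, 364, 86390); /* 10 s to midnight   */
        long after  = to_seconds(0, 0, 10);       /* 10 s past midnight */

        /* Expected 20 s; the 2-digit year makes it hugely negative --
         * but only for sample pairs taken inside this tiny window.     */
        printf("interval = %ld s\n", after - before);
        return 0;
    }

Any two samples taken on the same side of midnight subtract cleanly, which is why the exposure of such an element is measured in seconds or minutes, not months.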

5. Traditional remediation done on short-range elements may have been wasted effort, especially considering the certainty that any invasive fix to otherwise working code will induce additional errors at a fairly predictable rate (~20%). A wiser approach may be to suspend operations while the window passes, or, if there is a little longer window, generate a stop-gap workaround during that period to keep the business function that uses that system working to a necessary extent. Alternatively, it may be better to just suspend that business function if it does not unduly hurt the organization/entity. But remediating everywhere, just because two-digit years were found in a system, will likely be wasteful or counter-productive.

6. Compliance, or more precisely component-level compliance, is a fundamentally flawed concept upon which to base efforts at technically preventing Y2K problems. It does not recognize in a realistic way any component's place in a dependent web of other components. The effect of incoming external data on a given component and the effect of its outgoing data on other components is not really part of the equation. A system (however its borders are defined) can be completely made of compliant parts and fail in the face of Y2K, while another can be completely made of non-compliant parts and perform reliably. Multiple components made compliant can be compliant in different ways and be incompatible when interconnected and sharing date data. The presumed independence inherent in the concept of component compliance is only an illusion. Entities that bought all new computer hardware because it was not compliant (i.e., its clocks were not compliant) may be surprised to find out the viability of their applications will not necessarily be helped by that purchase. (Although the hardware vendor that sold to them under those pretenses will have been helped, at least in the short term.)
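A minimal sketch of that illusion of independence: two hypothetical components, each self-consistently "compliant," that window two-digit years against different pivots and so disagree as soon as they share date data.

    #include <stdio.h>

    /* Component A: two-digit years below 30 are 20xx. */
    static int expand_a(int yy) { return yy < 30 ? 2000 + yy : 1900 + yy; }

    /* Component B: two-digit years below 50 are 20xx. */
    static int expand_b(int yy) { return yy < 50 ? 2000 + yy : 1900 + yy; }

    int main(void)
    {
        int yy = 35;  /* a year of '35 passed between the components */

        /* Each is internally consistent, hence "compliant" in
         * isolation, yet they disagree about the same record. */
        printf("A reads '35 as %d, B reads '35 as %d\n",
               expand_a(yy), expand_b(yy));  /* 1935 vs. 2035 */
        return 0;
    }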

7. Traditional software remediation aimed at making modules, programs or even applications compliant will often fail when introduced back into their web of other systems, applications and databases. Interdependence problems will emerge without a clear indication of where the source of the problems really is. (It may not be in any one place, but in the RELATIONSHIP between elements, like some marriage problems.) This will tend to lengthen the duration of the problem and increase the MTTR, raising the risk that the problem will break out of its containment layers, cascading out into other systems and functions. Without complete knowledge of the interdependencies between all the elements that share data (something virtually never available), all traditional invasive remediation does is push integration dependency errors and problems into the beleaguered testing phase. The testing/validation phase has neither the infrastructure nor the time to run the necessary number of iterative fix/test/refix/retest loops within the context of the full regression testing needed to assure limited errors when the pieces and/or the whole are put back into production. This means traditional invasive remediation, no matter how apparently successful at the component/element/subsystem level, will leave many latent problems to be found in use.

END

-- Dale W. Way (d.way@ieee.org), November 09, 1999

Answers

Dale,

Many thanks for your extraordinary efforts to bring forth the potential problems posed by y2k.

Ray

-- Ray (ray@tottacc.com), November 09, 1999.


Translation please? Does this mean that the IEEE is optimistic or pessimistic?

-- Lars (lars@indy.net), November 09, 1999.

Dear Sir, I'm not sure what your role has been in the Y2K process, but I congratulate you in advance if you have made any efforts to publicize this event and make others aware of the potential negative ramifications.

However, you state that as long as a computer gets a consistent date field to rely upon, it will sort itself out. Are we dealing with artificial intelligence? Can computers sort this mess out themselves? I say this in light of networking, i.e. one computer can make sense out of a consistent set of 'wrong' dates, but what happens when it networks with another computer?

Second, are you being optimistic about the 'lights staying on' at rollover? Or do you have some kind of inside information which contradicts the many posts here? In other words, do embedded chips have a smaller role to play in power generation than we might hope or suspect? Thanks again,

Buzzzzzz

-- buzz (thanks@I.think), November 09, 1999.


Sounds good, but I still don't think Medicare providers will get paid, and the IRS is still toast.

-- Dog Gone (layinglow@rollover.now), November 09, 1999.

I think you've done a wonderful job at disinformation and some outright lying. Who am I to stop you?

-- Paula (chowbabe@pacbell.net), November 09, 1999.


Mr.Way

If I am reading you correctly, you are saying that in many, if not most cases, FOF may be the best approach (as long as you can do without the system for a while), in that you then can identify where the problem is and not spend a lot of time (money) fixing something that may not really be a problem.

Also, the act of trying to become compliant infects the system with a 20% additional bug problem (on average), thus obscuring where the problems originated, if there was an original problem.

If that's what you imply, then hasn't the world wasted one trillion dollars or more?

Isn't then the last-in, the first out (of the mess)? Have we opened Pandora's box?

-- bob brock (bb@myhouse.com), November 09, 1999.


Mr. Way

Thank you for your efforts to educate the masses (us).

From your previous post it seems like you continue to worry about the interconnections more than the code. It is MHO that the powers that be should have an extended "vacation" from Xmas extending into the New Year to slowly maneuver through the critical date periods and put the puzzle back together again. Any comments?

It just seems that we are a train heading for a rock wall. One would think that if the train is going slower the damage would be less.

-- Brian (imager@home.com), November 09, 1999.


Dale, you stated "the vast majority of such physical/process control systems that underlie the production facilities of our utilities and much of our factory infrastructure are at no or little risk of direct Y2K disruption that will be visible to the wider world." Is there any way power transmission and distribution could be indirectly affected by an internal/external Y2K problem/failure which could in turn be visible to the wider world in terms of loss of power? Also, you explicitly state THE LIGHTS WILL NOT GO OUT AT MIDNIGHT. I agree, but would like to know your opinion as to the reliability of the power industry after the rollover. Thanks in advance!

-- Val Jones (vjones@cableestevan.com), November 09, 1999.

Thanks for some new ideas. Yes, he is saying that some problems will sort themselves out. Example: your system processes loans; for the sake of simplification, all loans are one year or less. Your system will have a problem from ~February 1999 to ~November 2000, at which point the system will start operating correctly again...
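A worked sketch of that loan example in C (hypothetical logic, for illustration only):

    #include <stdio.h>

    int main(void)
    {
        int origination_yy = 99;                      /* written 1999 */
        int maturity_yy = (origination_yy + 1) % 100; /* due "00"     */
        int today_yy = 99;                            /* still 1999   */

        /* The naive comparison decides a brand-new loan matured 99
         * years ago. */
        if (maturity_yy < today_yy)
            printf("BUG: new loan flagged as long overdue\n");
        return 0;
    }

By roughly November 2000, every live one-year loan is both written and due on the 20xx side of the boundary, and the same comparison starts working again without remediation.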

Yes, he is saying that a LOT of remediation may have been misdirected, and this is quite true. If you have 10 programmers but enough work for a hundred, you had better identify the MOST important problems first and not spend your precious resources fixing the kid's Nintendo. A LOT of folks have spent their time and money fixing the kid's Nintendo.

He is also saying you had better know where ALL your problems are and how each will affect your system(s) before you start fixing them (very true), if you want to fix them at the most cost/effort-effective level.

Yes, he is saying the lights are not going to go out (and for the most part they aren't). He speaks of more subtle issues than something like the lights going out. He says some systems will fail BECAUSE they were remediated, which is true. He says that a 'compliant' system can fail if it sits in a larger system which is also compliant but has not been remediated in such a way as to ensure that the changes are all functional WHEN PLAYED TOGETHER.

Systems complexity is very difficult to comprehend from the outside looking in. Systems in general are a difficult concept. Basically this fellow did not 'critique' Yourdon. His statements seem consistent in both the original and this later statement.

Bottom line: no, Virginia, the lights are not going to go out, but watch out for the price of oil!

-- (...@.......), November 09, 1999.


Dale,

In early 1999, there was a report published by, I believe, Jim Lord, which cited as acknowledged fact that 3 of Venezuela's 5 oil refineries were noncompliant and non-fixable and would be shut down prior to rollover. This was confirmed by sources at the House Subcommittee on Management, Information and Technology.

1. Are these non-fixable Venezuelan refineries representative of all refineries?

2. If Venezuela's refineries are a special case of difficult remediation work, how and why are they special?

3. What methodology AND DATA have been published which would allow expert and authoritative electrical engineers to conclude that the petroleum industry has successfully completed a top-to-bottom remediation?

I look forward to something other than the tried-and-true political response, i.e. "I haven't seen those reports, so I can't comment on them."

-- The Creosote Exxplosion! (A@Arealstickymess.com), November 09, 1999.



Dale, Another specific example.

On March 21 the Chicago Tribune reported that the computer at Brazil's Xingo hydroelectric facility was forwarded to 1/1/2000 at which time "twelve thousand warning lights flashed all across the control board, with all kinds of alarm information."

Apparently, technicians switched the clock back and the situation was remedied, but it was reported that the plant would have had to shut down to search for the source of failures if the situation had been real.

I do not know the date of that "experiment". The layman's conclusion is that that facility is a candidate for a good toasting at rollover. What data contradict that conclusion? What reasonable assumptions contradict that conclusion?

-- The Creosote Exxplosion! (A@Arealstickymess.com), November 09, 1999.


  1. Due to their short time horizons as mentioned, I would expect nearly all embedded systems failures to occur around rollover. Since these are the most immediate threats to life, I can't see how you would minimize the significance of 1/1/00.
  2. Here's an exception. The UI of a program takes dates in the format mm/dd/yy, and adds 1900 to yy to get the four-digit year used internally. So 1/1/00 becomes 1/1/1900. Unfortunately, Unix and Windows systems use the seconds-since-1970 format for time. In this format, there is no valid representation for 1900! Negative timestamps are not allowed. A 1/1/00 input date will crash the program, be rejected, or be converted to some other date (a negative number taken as positive, for example). So the actual useful range of dates supported by the program is not 1900 to 1999 (or 2000 to 2099 as you imply), but 1970 to 1999, or 2070 to 2099. So the application will become unusable for quite a long period of time. (See the sketch after this list.)
  3. True enough, but not helpful. To say that not all computers will crash, not all organisations will fail, does not help us decide which ones will. And it is certainly also true that there is a critical percentage of software failures that will kill a company, and some percentage of company failures that will kill the economy. We don't know that percentage, or on which side of it we are likely to be. An 80% success rate sounds OK to many policy makers, but 20% failure may well be enough to cause extremely severe consequences. A 20% shortfall of oil or electricity certainly would.
  4. We have heard sporadic reports of a class of exceptions to this view, namely maintenance dates held in memory. In these devices, when the maintenance interval is exceeded (when it has been 90 days since the last maintenance, for example), the software shuts off the device. If the date comparison is not Y2K compliant, the device may well shut down at rollover, thinking that maintenance is overdue. A restart will not help, since the old date is retained in memory. The operator would have to set the clock back or purge the maintenance date from memory. Or perhaps just perform maintenance, to get the correct time placed in memory. All of these things could take longer than a simple press of the reset button. No idea how widespread this problem is, but I have heard it mentioned for items like fire trucks, medical equipment and airport equipment.

    As for lights out, this could happen due to embedded systems errors in the power system, unexpected shutdowns by large customers, or due to regulatory requirements (if phones go out, I believe nuclear power plants are required to shut down.) Even minor failures of control systems have been known to lead to operator errors.

  5. Many PCs have been unnecessarily and pointlessly replaced, since it was the applications that needed to be fixed. On the other hand, it is reported to be very expensive to shut down some items like refineries and chemical plants. Once shut down, some of these plants take quite a while to restart. So any general shutdown by industry will result in financial losses and shortages.
  6. End-to-end testing is required. Normally, new applications go through an extended shakedown period before (and usually after) they are put in service. Y2K is requiring fixes to running systems already in place. And since 2000 is not here yet, these modified applications are not actually running in the environment they must handle. This is a situation ripe for missed errors and incorrect fixes. Of course, some large systems, like banking or telecommunications, simply can't be end-to-end tested beforehand. They are just too large to replicate all the components and subject them to typical load. Normally, they are modified incrementally, and with support standing by to catch any errors in the new code. Y2K will put large numbers of modified systems into use all at once, leading to many unexpected failures.
  7. Agreed. But this is not an argument against remediation. There's no reason to believe that fixes will go faster after 2000 than they would have before. If a large organization will take 2 years to get their systems fixed, we had better hope that they are nearly 2 years into their project. Otherwise, they will simply have to do without that critical system. Large companies and governments are held together by their software. There is no way I can see them doing without it for any extended period of time.
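The negative-timestamp point in item 2 above can be seen in a few lines of C (a sketch; exact behavior varies by platform and C library):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct tm tm = {0};
        tm.tm_year = 1900 - 1900;  /* "00" naively expanded to 1900 */
        tm.tm_mon  = 0;            /* January */
        tm.tm_mday = 1;
        tm.tm_isdst = -1;

        /* On many implementations a pre-1970 date cannot be
         * represented and mktime() returns (time_t)-1. */
        time_t t = mktime(&tm);
        if (t == (time_t)-1)
            printf("1/1/1900 unrepresentable -- the input path fails\n");
        else
            printf("timestamp: %ld\n", (long)t);
        return 0;
    }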

The problem I have with government and business Y2K efforts so far is the secrecy. If any electrical utility would just publish a list of all systems that they've had to fix so far, and what would have happened if they had not fixed them, we would have a much better idea of what shape we're in. We would then be able to guess the effects on plants (and countries) that had not done this work. We would also actually be in better shape, since utility workers worldwide could check their own efforts against this list. The same thing is true across industries -- power, water, transport, chemical, etc. Our inability to share information (presumably due to our expensive and unpredictable legal system) has made Y2K a potential disaster.

The large Y2K budgets at various organizations tell me that the problem was indeed large and that many systems had to be fixed. The size of the effort required virtually guarantees that there are organizations that have failed in their efforts. Competence is simply not that widespread. So despite the secrecy and uncertainty, I feel relatively certain that we are in for a huge mess, at minimum. I remain hopeful (but not at all confident) that the embedded systems problems are not as bad as feared, and that utilities will survive rollover relatively unscathed.

If the various governments involved had been interested in getting any bad news, they could have solved this problem. The regulators in the U.S. could have forced the utilities to disclose the details of their remediation. They could have done a few sample remediations and published the details themselves. They could have handled the legal difficulties of disclosing Y2K defects in supplier equipment.

But they didn't. In two months, we'll know how badly this turns out.

-- Michael Goodfellow (mgoodfel@best.com), November 09, 1999.


Mr. Way,

I'll pose essentially the same question to you that I did to this forum: Are you saying that two or more remediated systems can be happily interfacing and swapping date data before they encounter 2000 dates, and still run into trouble when those dates arrive? If so, why?

-- Thinman (thinman38@hotmail.com), November 09, 1999.


Michael: wrong on Unix and DOS. Unix counts seconds since 1970; DOS since 1980 (as I recall). DOS rolls over; Unix has some time to go before it rolls over, as I recall.
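For what it's worth, that "some time to go" can be dated precisely for a signed 32-bit time_t (a C sketch; note that ctime() prints local time):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        /* Largest value a signed 32-bit seconds-since-1970 counter
         * can hold. In UTC this instant is Tue Jan 19 03:14:07 2038. */
        time_t limit = 0x7fffffff;
        printf("32-bit time_t tops out at: %s", ctime(&limit));
        return 0;
    }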

-- (...@.......), November 09, 1999.

Mr. Way--

In point no. 4 you stated:

"...While theoretically exposed during that period, there are other mitigating factors that keep that vulnerability from turning into failures that can break out into the wider world. Consequently the vast majority of such physical/process control systems that underlie the production facilities of our utilities and much of our factory infrastructure are at no or little risk of direct Y2K disruption that will be visible to the wider world. THE LIGHTS WILL NOT GO OUT AT MIDNIGHT..."

Could you explain in detail what those "mitigating factors" are that will prevent those process control systems from being disrupted? This is not clear to me at all.

I can understand how Y2K disruption problems in a factory system might not be "visible" to the outside world; however, if those questionable "mitigating factors" fail to mitigate, I'm almost certain someone will notice that THE LIGHTS HAVE GONE OUT. Factory systems, after all, do not buy and sell their products between each other with the immediacy that occurs between electric utilities.

-- curiously concerned (concerned@notconvinced.com), November 09, 1999.



Mr. Way,

You stated, "We do not have to worry about having electric power or phones around rollover, but we do have to worry longer-term about having economically viable power and telephone companies".

Can you please be more specific about the contrast in this statement; I'm unable to follow your reasoning. It could be those meta-blockers I ate.

Incidentally, why were you unable to gain a consensus from the IEEE members for the release of your Y2K End Game critique on the IEEE website rather than its release through Roleigh Martin?

Be well, and thank you.

-- Rob Carroll (flyingred@montana.com), November 09, 1999.


In response to: -- Ray (ray@tottacc.com), November 09, 1999.

Thanks for the kind words.

In response to: -- Lars (lars@indy.net), November 09, 1999.

Q: Translation please? Does this mean that the IEEE is optimistic or pessimistic?

A: I do not speak for the entire IEEE. To speak for my whole committee I have to get approval on text from them, a time-consuming effort. I am speaking here for myself based on what I have learned in my capacity with the IEEE Y2K committee and my previous 25 years in the field.

I know it is very human to ask that the most complex situation in history be boiled down to one word: optimistic or pessimistic. Talk about collapsing dimensions! I am optimistic about the physical/process control system infrastructure in and of itself. I am less optimistic about the support and maintenance systems that support those control systems, but still generally optimistic. I am very much less optimistic about the business/accounting/administrative systems that support the economic functions of almost all organizations on the planet. There will be many system failures there. But few, if any, of these will individually have a direct and catastrophic effect on the outside world. But there will be many, in many different organizations. Many will be handled or compensated for within those organizations and will not be visible to the outside world. Some might get out and impact, though not fatally, other organizations or their functions. If there are enough of them and they last long enough, we will see visible impacts, mostly in the area of delays and slowdowns, not general collapses. This will bring economic stress to many organizations. Y2K will ultimately be like being pecked by ducks -- each peck annoying, troublesome and even damaging to some degree. The question is, will there be so many that they bleed us into serious damage? I do not know, but I am an optimistic person.

In response to: -- buzz (thanks@I.think), November 09, 1999

Q1: However, you state that as long as a computer gets a consistent date field to rely upon, it will sort itself out. Are we dealing with artificial intelligence? Can computers sort this mess out themselves?

A1: No, it is in the intrinsic nature of date processing in computer systems. The vulnerability is what comes and goes, not our response or the computer system's.

Q2: Second, are you being optimistic about the 'lights staying on' at rollover? Or do you have some kind of inside information which contradicts the many posts here? In other words, do embedded chips have a smaller role to play in power generation than we might hope or suspect? Thanks again

A2: No, just an understanding of power generation gained by research and by talking to real, on-the-ground experts (some of whom are on my committee), academics who spend their lives studying the field and its technology, and power plant designers, as well as an understanding of the nature of technology and systems development methodologies in physical/process control systems. Nothing special that a lot of work would not bring to almost anyone.

In response to: -- Dog Gone (layinglow@rollover.now), November 09, 1999

Comment: Sounds good, but I still don't think Medicare providers will get paid, and the IRS is still toast.

A: Do you have any hard evidence for that belief, or are you just guessing? Perhaps out of a genuine concern for the possible effects of the crisis and your perception of society's late or insufficient response, you have taken a position that you now feel compelled to defend, irrespective of any new information that might be developed.

In response to: -- Paula (chowbabe@pacbell.net), November 09, 1999

Comment: I think you've done a wonderful job at disinformation and some outright lying. Who am I to stop you?

A: Nobody.

In response to: -- bob brock (bb@myhouse.com), November 09, 1999

Q1: If I am reading you correctly, you are saying that in many, if not most cases, FOF may be the best approach (as long as you can do without the system for a while), in that you then can identify where the problem is and not spend a lot of time (money) fixing something that may not really be a problem.

A1: Not exactly. Yes, if you can do without the system for as long as it is vulnerable to Y2K you can ignore it (you will not have to FOF because there will not be any Fs). You must research your system and its users to reliably determine what that window actually is.

Q2: Also, the act of trying to become compliant infects the system with a 20% additional bug problem (on average), thus obscuring where the problems originated, if there was an original problem.

A2: Research shows error repair for any reason, Y2K or otherwise, has that probability of introducing new errors. This probably goes up when a lot of interconnected systems are modified more or less at one time, due to interaction/interdependency errors induced AMONG modified systems.

Q3: If that's what you imply, then hasn't the world wasted one trillion dollars or more?

A3: Not necessarily. For some systems where the window is too wide or the system carries too big a workload to live without, you must attempt to prevent problems or failures. There are several approaches that may be employed (I do not want to go into that here). But even if you rely on traditional invasive remediation (add/change code, or change data format and add code everywhere date processing occurs), you may be able to isolate it from less critical systems (whose function therefore may have to be partially or completely sacrificed) and still do invasive remediation on it that does not negatively impact other systems, because you have disallowed data sharing that may infect other systems. In some cases even without such isolation the remediation efforts that pushed these interdependency problems into the testing phase may still be worthwhile if testing catches enough of them or there just aren't enough to matter anyway. But for many cases, given the highly interdependent nature of some systems, it is probably true that much money was wasted.

Q4: Isn't then the last-in, the first out (of the mess)? Have we opened Pandora's box?

A4: I don't understand this sentence.

In response to: -- Brian (imager@home.com), November 09, 1999.

Q: From your previous post it seems like you continue to worry about the interconnections more than the code. It is MHO that the powers that be should have an extended "vacation" from Xmas extending into the New Year to slowly maneuver through the critical date periods and put the puzzle back together again. Any comments?

A: I am worried about the interconnections because they are outside the scope of traditional remediation thought, unconsidered, like the warping of time and space is outside Newtonian mechanics. It is pretty much a blind spot. The extended vacation idea is not bad for narrow-window systems like those in physical/process control, but too many systems have rather wide windows, in the range of many months or years, as in many business systems, which probably renders this idea untenable.

In response to: -- Val Jones (vjones@cableestevan.com), November 09, 1999.

Q: Dale, you stated "the vast majority of such physical/process control systems that underlie the production facilities of our utilities and much of our factory infrastructure are at no or little risk of direct Y2K disruption that will be visible to the wider world." Is there any way power transmission and distribution could be indirectly affected by an internal/external Y2K problem/failure which could in turn be visible to the wider world in terms of loss of power? Also, you explicitly state THE LIGHTS WILL NOT GO OUT AT MIDNIGHT. I agree, but would like to know your opinion as to the reliability of the power industry after the rollover. Thanks in advance!

A: Longer-term there is some threat from two sources. The first lies in the support systems of the power industry: both the more front-line monitoring, maintenance-scheduling, spare-parts-management, etc. systems, and the more backroom accounting/administrative systems that could disrupt the business end of their operations. They have more year sensitivity, have wider windows of vulnerability (i.e., they have to handle wider ranges of date representations) and so are more prone to Y2K problems. I am not sure that problems there, even if they emerged, would really impact power generation and distribution to any significant extent. I tend to think problems in either system type will be compensated for in some way. The second source is in energy source supplies that have to run through sometimes multi-tiered transportation systems and the companies that own them. Those transportation facilities, and the administrative computing that must support them, have somewhat more vulnerability to Y2K problems. But I continue to believe that power is so important, so strategic, that administrative and economic problems will not be allowed by the government or the Fed to impact the generation and distribution of power, no matter what.

In response to: -- (...@.......), November 09, 1999.

Comment: Basically this fellow did not 'critique' Yourdon. His statements seem consistent in both the original and this later statement.

A: I have to disagree. Although I am making many novel points in Y2K-land, I was still critiquing Yourdon in the original piece. His equating rollover with The End, his use of the flawed, unexamined concept of compliance to constitute winning, and the assumption of system independence implied in it needed to be criticized, and they were.

In response to: -- The Creosote Exxplosion! (A@Arealstickymess.com), November 09, 1999.

Comment/Qs: In early 1999, there was a report published by, I believe, Jim Lord, which cited as acknowledged fact that 3 of Venezuela's 5 oil refineries were noncompliant and non-fixable and would be shut down prior to rollover. This was confirmed by sources at the House Subcommittee on Management, Information and Technology.

1. Are these non-fixable Venezuelan refineries representative of all refineries?

2. If Venezuela's refineries are a special case of difficult remediation work, how and why are they special?

3. What methodology AND DATA have been published which would allow expert and authoritative electrical engineers to conclude that the petroleum industry has successfully completed a top-to-bottom remediation?

I look forward to something other than the tried-and-true political response, i.e. "I haven't seen those reports, so I can't comment on them."

A: What methodology AND DATA have been published which would allow you to conclude that the petroleum industry NEEDS top-to-bottom remediation? Where are the facts, not the reported conclusions? What part of the plant is impacted? How is it impacted? Why is it impacted? How do those impacts render it "non-fixable"? For how long will it be impacted? In the absence of facts, what is the logical chain of critical reasoning that COULD go from any POSSIBLE computer system behavior to a non-fixable refinery? Show me the mechanism! Show me the money! Presumably this is a multi-million dollar facility. Show me a scenario where a computer system could render it or any refinery non-fixable. Are people going to shut it down over date processing? Walk away from it (your implication)? How did the House Subcommittee on Management, Information and Technology confirm it? Did they send down power system engineers to inspect and verify that report, or did they just acknowledge they had seen it? Did you verify that before you invoked the name of a Congressional subcommittee to bolster your contention that this is true or that you have properly interpreted it?

My guess is that the Venezuelans, being a practical people, are going to shut this refinery down for a day or two around rollover, wait for whatever windows of vulnerability its devices, embedded systems or other monitoring/support systems have to pass, and then bring it back on-line in a careful, systematic fashion. The only other reason it might be abandoned (although I am not sure either you or the report you are quoting is saying this) is that it is truly obsolete, too dangerous to people or the environment and/or no longer economically viable, and Y2K is coming up, so what the hell, just let it go and blame it on Y2K. Of course, I could be wrong.

-- Dale W. Way (d.way@ieee.org), November 09, 1999.


Mr. Way - First of all, thank you very much for offering to answer our questions.

I have a few questions of my own for you.

1. This seems to be a popular issue. When you say that the lights will not go out, do you mean worldwide or in the United States alone?

2. Are there any places that you feel are especially high-risk areas? For example, are embedded chips issues likely to significantly affect oil production?

3. Could you speculate upon the number of "big" failures we are likely to see from Y2K? Chemical plants blowing up or businesses unable to ship product for extended periods, that sort of thing.

4. In your post, you talk about some areas that will cause problems, and some other areas that are likely to be okay. Could you speculate on what effects the problem areas will have on society?

5. What do you think of the polly point of view (no big deal, systems fail all the time, y2k is just a bump in the road)?

-- John Ainsworth (ainsje00@wfu.edu), November 09, 1999.


Dale,

Thanks for posting. I'll formulate questions and ask them in the a.m... after coffee.

;-D

Diane

Posters, for background... see threads...

IEEE Y2K Chairman's Personal, Pessimistic Take on Y2K and Yourdon's End Game Paper

http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=001fqh

And...

Interpretation of Dale Way's commentary on Yourdon's End Game article. Can this help me be better prepared!

http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=001gqd



-- Diane J. Squire (sacredspaces@yahoo.com), November 10, 1999.


And also this earlier Senate testimony thread...

Institute of Electrical and Electronics Engineers letter to the Senate Committee on Liability issue

http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=000zDx

TAB YEAR 2000 TECHNICAL INFORMATION FOCUS GROUP
Piscataway, NJ, June 9, 1999.


-- Diane J. Squire (sacredspaces@yahoo.com), November 10, 1999.


Thank you for your time. Are you preparing? How?

-- Hatti (klavine@tco.com), November 10, 1999.

Hi Dale...I too am thankful for your willingness to participate in this forum. I have a question about your response to Val Jones. You stated, "The second source is in energy source supplies that have to run through sometimes multi-tiered transportation systems and the companies that own them."

Q. Are you talking about the small electric Co-ops and such that don't produce electricity, but simply pass it along to customers?

You also state, "But I continue to believe that power is so important, so strategic, that administrative and economic problems will NOT BE ALLOWED (my caps) by the government or the Fed to impact the generation and distribution of power, NO MATTER WHAT (my caps)."

Q. I'm not sure I understand what IT is. Are you speaking of disruption of power? Don't mean to be dense here...grin. How will the government not allow it? IOW, would you explain a little more what you see the government doing, please, and at what point they might choose to take action.

Thanks ahead for your reply.

beej

-- beej (beej@ppbbs.com), November 10, 1999.


Sheesh...wouldn't you know it. You try to come off as a halfway intelligent being and blow it...sigh.

I asked about IT in my above questions. I suppose what I was wanting to ask is what type of administrative or economic problems would affect the generation or distribution of power?

Again, thanks.

beej

-- beej (beej@ppbbs.com), November 10, 1999.


Good evening, Mr. Way. I have enjoyed and appreciated greatly both your comments on this thread and the essay that was posted on Roleigh Martin's website some nights ago. As a nontechnical outsider who has been struggling to understand this problem for some 18 months, I found some of your discussion tough going, but did discover much of it to be in agreement with analyses I've read elsewhere, especially regarding complex business software systems (enterprisewide mainframe systems, etc.). Your worst-case scenario sounds like what has often been described as "death by a thousand cuts"--or, to use your own analogy (straight from Hitchcock's "The Birds," it seems!), being pecked into considerable discomfort by myriads of birds/Y2K problems. It is difficult, if not impossible, to predict what would constitute a "critical mass" of problems for any business, let alone any industry, economic sector, or country. (N.B. Do you know Cory Hamasaki? You and he seem to be on the same page in many respects.)

Your comments about possible physical control/infrastructure problems (embedded systems) seem much more reassuring, and are very much appreciated. This area has been a lingering concern of mine, what with all the conflicting "information" on the internet and elsewhere. There are several points I'd be grateful if you would address:

1. Large Scale Embedded Systems (LSES)--for instance, Distributed Control Systems (DCS), Supervisory Control and Data Acquisition (SCADA), Energy Management Systems (EMS), etc. It's my understanding that some of these systems are purely "embedded," while others are also computer (usually PC) based. From various reports that I have seen--TAVA/Beck, GartnerGroup, the British Office of Health & Safety, etc.--LSES failure rates can run as high as 35% or more. LSES are often found in the process, control, and safety functions of the chemical and petrochemical industries, the oil and natural gas industries, the power industry, and large manufacturing facilities. In August 1998 I saw published (online) data from TAVA/Beck, based on assessments of embedded systems problems in four giant U.S. corporations: an auto mfgr. (GM), a pharmaceutical (probably Johnson & Johnson), an oil company (probably Chevron), and a beverage company. TAVA was reporting embedded system failure rates of 15-20%, with plant shutdown odds in each case cited at 60-90% without proper remediation. I recently also checked a couple of UK websites (as you no doubt know, British engineers were ahead of their American counterparts in discovering and addressing possible Y2K problems in embedded systems), those of the IEE and Embedded Science, Ltd. (developers of the Delta-T Probe; you are probably familiar with it). The Brits seem to be still reporting 1-6% failure rates in tested embedded systems; Embedded Science (www.embedded-science.com) was basing its numbers on a database of some 150,000 embedded systems supposedly tested in the field. The IEE embedded system fault casebook includes some 100 "sample" cases of embedded system problems reportedly found in the field; I've reviewed most of them, and find they range from "cosmetic" to "catastrophic," according to the reporting facility's own assessment. Most striking was a reported DCS failure at a petrochemical facility: when a presumably identical DCS had been bench tested, no problems were found; but when this plant tested its own DCS in situ, massive failure resulted, with three out of four operator stations having to be shut down. Live, the actual results would have been "near catastrophic," according to this report: total plant shutdown would have been avoided, but efficiency and reliability would have been greatly reduced. (The moral seems to be, avoid type testing whenever possible!) It's my understanding that DCS of various sorts are found in many petrochemical and chemical plants, power plants (except simpler plants like hydro), oil refineries, manufacturing plants, etc.--and that, in many cases, manual workarounds of DCS failures would not be feasible.

Now, I understand (I think) what you are saying about the very narrow range ("window") of RTCs and hence their extremely limited (thankfully) vulnerability to rollover problems; but it seems to me that when you start looking at LSES, you have to consider a much wider (and potentially more disruptive) "window" of vulnerability (i.e., a much wider period during which incongruous dates may be compared). Or at least that is what I surmise from the field results reported from all these outfits. Or do you consider that their approaches, methodologies, and reported results are simply wrong or (more charitably) misguided?

2. In a "Bloomberg News" article of June 30, 1999, Cameron Daley, chief operating officer at TAVA, was quoted as saying that TAVA was finding overlooked Y2K problems in many of the 100-plus U.S. power companies that it had audited to date--problems that could lead to outages lasting up to several weeks in some areas. (Drew Parkhill of CBN News later telephoned Mr. Daley; Mr. Daley stood by his remarks.) "There are a number of instances where utilities didn't go deep enough into their systems--they accepted vendors' words that parts of a system were compliant." (Rick Cowles and I also assumed at the time that too much reliance upon type testing might have also been a problem.) I believe that Mr. Daley used to be an exec with Boston Edison, though I don't know what his actual technical background is. TAVA has many engineers (including utility engineers), technicians, and systems analysts on staff, and I presume that Mr. Daley's remarks were founded upon some actual test results--otherwise, he was guilty of almost criminal irresponsibility in making such statements in a news report, in my opinion. I don't whether TAVA was running Delta-T Probe analysis or what, and I certainly don't know whether or not this reflects a misunderstanding by Mr. Daley and/or TAVA's engineers about how power generation and T&D systems actually work; but I sure in blazes would like to have this issue cleared up. Do you know some of the folks at TAVA? If so, can you get to the bottom of this? (I have the unpleasant suspicion that this story has just assumed a new incarnation on another, well-known website, in disguise.) TAVA has also done embedded systems work for quite a few companies outside the power industry, including some household names (see above).

On the subject of the power industry, are you familiar with the January "ABB Review" report authored by Dr. Klaus Ragaller, one of the lead scientists at Swiss-based engineering giant ABB? If not, you can find an online version (minus the graphs, alas) at this address: http://www2.abb.ch/global/chibm/chibm007.nsf/Y2K/F1 ABB is the world's largest provider of electronic and digital equipment to power plants and has over 200,000 employees in 100 countries; it has also been providing massive tech support on Y2K issues in the power industry. The ABB report seems to indicate that Y2K problems have occasionally been discovered that could indeed cause power plants to trip offline; threats seem minimal at the T&D level, however. T&D-based outages would occur only if an unfortunate (and unlikely) combination of events transpired. EMS failures can be serious, of course; see the remarks by Hawaiian Electric Co. (HECO) systems analyst Wendell Ito in "Newsweek," June 2, 1997, p. 53 (or thereabouts). I corresponded one evening with Phil Hystad, the principal designer of three major SCADA and EMS systems used by U.S. power companies, and he surmised that the HECO 1996 test failure was in an old Rockwell International system. (It turns out he was right; HECO has since repaired or replaced the system. My own power company has had to go to encapsulation because ABB couldn't deliver a new EMS on time.) Regardless, it's my understanding that SCADA and EMS failures can be worked around manually, though you might then have a greater likelihood of operator error, particularly in the unusual underload conditions that are likely to occur next January as many plants (and even the main SSA database center in Baltimore) go offline for a while, just in case. (Under normal conditions, January electric demand in the U.S. is only 55% of the peak demand in July.) That's why I worry that there might still be some threats to power system reliability come January, though long-term outages seem unlikely. Anyway, any further thoughts you have on such issues would be most appreciated.

3. In your estimation, how serious are the threats to embedded systems in oil and natural gas wellheads, deep sea oil rigs, and oil and natural gas pipeline systems? Would not even relatively "minor" problems prove troublesome if they occurred in places that were difficult to access?

Thanks for all your time.



-- Don Florence (dflorence@zianet.com), November 10, 1999.


Mr. Way mentioned

A: I am worried about the interconnections because they are
       outside the scope of traditional remediation thought,
       unconsidered, like the warping of time and space is outside
       Newtonian mechanics. It is pretty much a blind spot.

>>>>>>>>>>>>>>>>>>>>>>

This is an interesting observation. Just as physics has trouble with the measurement of quantum particles, so too would we have trouble with our observations of what is around us.

Is it a point in time or is it a wave function? (little picture / big picture)

Do our observations affect the final result? (contributing errors to the code)

How does the minute reconcile with the macro? (Where does the code stop and real life start)

What will it take to move the masses of humanity? (Which hair will it take to break the camel's back)

What is the momentum of the errors at any given time? Could this be measured and how? Where is this momentum going?

And my favorite question

What could we do to increase man's understanding of the movement of time and change? Learn to anticipate.

Some thoughts

-- Brian (imager@home.com), November 10, 1999.


This dissertation is very difficult for this mentally impaired old lady to understand. But I hung in there and read every word...and sometimes twice! I was feeling some "hope" until I got to the below quote. Is this not the standard quote of all the pollys and all the JQP in denial? This quote took me right back to square one.

But I continue to believe that power is so important, so strategic, that administrative and economic problems will not be allowed by the government or the Fed to impact the generation and distribution of power, no matter what. Taz

-- Taz (Taz@aol.com), November 10, 1999.


Do you have any hard evidence for that belief, or are you just guessing? Perhaps out of a genuine concern for the possible effects of the crisis and your perception of society's late or insufficient response, you have taken a position that you now feel compelled to defend, irrespective of any new information that might be developed.

The latest information that I've seen is from Medicare, which said that only 20% of the providers have tested remediation, and many of them failed the test. The IRS has said that it is still doing an inventory of its systems, meaning remediation, if needed, has not yet begun. Do you have some newer information than that? (This was from last week.)

In essence, I take your comments to mean that Y2K was never a serious problem, with or without remediation, and that we should be able to work around and FOF. Like you say, you are an optimist.

-- Dog Gone (layinglow@rollover.now), November 10, 1999.


Mr. Way: First, I wish to thank you for your efforts to educate and bring awareness and understanding to this enormously complex problem. You have authored two of the most important documents on the subject. Unfortunately, these have not made the problem easier to understand, but have highlighted the complexity of the situation. For many of us, this has heightened our level of concern, and strengthened our resolve to be prepared to face the consequences. One question I have for you: Jim Lord outlines a plan for mitigating the consequences of failures in the industrial sector in his 10/22/99 essay "A Graceful Scenario." It can be found at his website at www.jimlord.to under new items. Could you please comment on this strategy? Do you think it will be implemented on a wide scale? I guess that is two questions. Sorry. Godspeed,

-- Pinkrock (aphotonboy@aol.com), November 10, 1999.

Wow!!

It looks more and more like we will have power and utilities.

Unfortunately, and I do mean unfortunately, it clearly looks as if Mr. Way is pointing very strongly towards the BACK ROOMS, OFFICES, economics, etc.

Banks toast? Stock market? Insurance? Reserve?

Forget about power and water. How is this country going to operate if none of the bookkeeping can be done or reconstructed when the systems in the back rooms fail?

You mean this may all come down to people being honest about what they owe or are owed?? HONESTY!!!

Oh this should be fun!!!!

I smell government privatization, how about you!!

-- d.b. (dciinc@aol.com), November 10, 1999.


Mr. Way, let me add my appreciation for your contributions.

I am not sure I agree with one of your central points, however.

In point #2 of your post, you state:

"2. With a few rare, minor, short-term exceptions, every logical element in a two-digit year computer system (a device or program module or anything built of combinations of them) will cure itself, without remediation, given enough time..."

I believe this is accurate up to a point; however, the epidemic problem I've found in the Access/VB world I work in is that 2-digit inputs are interpreted by the software *before* any date comparisons are made. In most applications I've seen, users are forced to enter dates in a mm/dd/yy format, and the software makes the assumption about which century the date belongs in. So a birth date of 01/15/23 is interpreted and stored as Jan 15 1923 in some systems, and 2023 in others. Either way, the software only allows a 2-digit year input, and if the software is placing it in the wrong century, the user can't override it.

In many cases, the input is also tested to see if it is even a valid date, and depending on how the application was designed, 01/15/00 may not be.

So a system can be thoroughly disrupted before ANY date comparisons are made. These vulnerabilities are not limited to Access/VB applications.

This would not resolve itself within a reasonable time period.
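A sketch of that input-time century fixation (a hypothetical parser in C, not Access/VB itself):

    #include <stdio.h>

    /* An input routine that hard-codes the 1900 assumption. */
    static int expand_year(int yy) { return 1900 + yy; }

    int main(void)
    {
        int dob_yy = 23;  /* typed as 01/15/23, meaning 2023 */
        printf("stored as %d\n", expand_year(dob_yy));  /* 1923 */

        /* The wrong century is now baked into the stored record;
         * unlike a pure comparison bug, no amount of waiting for the
         * data stream to move past the boundary will correct it. */
        return 0;
    }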

Thanks again for your comments.

Regards-

-- Lewis (aslanshow@yahoo.com), November 10, 1999.


Thank you Mr. Way for the awesome thread. In your opinion, is there any way that the y2k technical problem could cause food shortages that would require more than the usual 3-day emergency stash? I am trying to convince my friends to prepare but my own convictions vacillate wildly from day to day, so I'm not very convincing.

-- Wishy Washy Doomer (polly@ismy.friend), November 10, 1999.

Mr. Way:

You write:

"2. With a few rare, minor, short-term exceptions, every logical element in a two-digit year computer system (a device or program module or anything built of combinations of them) will cure itself, without remediation, given enough time. This is true because every logical element has its own range of year representations it must COMPUTER UPON (not just record/store) in a consistent manner."

I'm sorry, but I just can't buy into this argument of yours at all. I find it to be so theoretically weak, or just plain inaccurate, as to be of no practical use in the real world.

It should be obvious to many of us by now, that in the real world, the majority of broken computer bios, broken operating systems and broken user software simply could not be allowed to "cure" themselves. To do so would of course deny us the use of many mission critical systems - while we wait for some hypothetical "curing time".

Case in point: I have written a very time/date dependent software program of approximately 45,000 lines of code. A great deal of data is acquired by the software and used to control a variety of external functions like heating, air conditioning, security, etc. All coding subroutines pass four byte year data and hence are termed compliant by definition. The software is designed to be run 24 hours a day, seven days a week and the users depend upon that functionality. Dates, forward and backward, are continually compared.

If this program is run on an unremediated computer with a faulty date bios, it simply fails, irrespective of the error trapping routines to avoid divide by zero, etc. Fix on failure in this case means replacing the bios, or entire computer when zero down-time is the objective. If tech support told a user to just shut it off and wait for it to cure itself, you could hear the laughter all the way to your home.

Further, you write:

"When that entire range is on ONE SIDE of the century boundary, AND IT DOES NOT MATTER WHICH ONE, there is no ambiguity with the missing century digits."

Oh, really? Well then what happens in the real world, and most likely occurrence, when we must use dates on BOTH SIDES of the century boundary? Have we now created "ambiguity" with the missing century digits? After all, in two-digit math 00 - 99 = -99. I'm afraid I don't see a cure for this one, short of adding more error trapping. You deal with this century date straddle (using years on both sides) with the statement:

"However long (in intrinsic calendar time) year/date representations in that range that straddle the century boundary can or are allowed to flow into the element is how long the element is intrinsically vulnerable to Y2K, no less, no more."

I have read this at least twenty times and have no idea what you are trying to say.

-- TruthSeeker (truthseeker @ seektruth.always), November 10, 1999.


Mr. Way; I understand your statement that yours is a critique. Indeed it is, and in that regard I personally feel you make a very valid point in that critique. I believe I understand your points. I don't really have much to add so I will be off line.

Beware the slippery slope, for gravity exists.

-- (...@.......), November 10, 1999.


Responses 991111

In response to: -- John Ainsworth (ainsje00@wfu.edu), November 09, 1999:

Mr. Way - First of all, thank you very much for offering to answer our questions. I have a few questions of my own for you.

Q1. This seems to be a popular issue. When you say that the lights will not go out, do you mean worldwide or in the United States alone?

A1: I mean worldwide. The fundamental reason I believe this is that there are NO or VERY FEW year-sensitive elements necessarily processed by computers in the generation and distribution of electricity (there are some in the inter-grid transmission facilities, but those are more involved in accounting than production functions). Outside the USA there are even fewer computers per anything to worry about. Why would they be any more vulnerable? Is it because we tend to think of ourselves as more advanced (and maybe superior) and therefore safer? Even when being advanced, in this case, carries more risk?

Q2. Are there any places that you feel are especially high-risk areas. For example, are embedded chips issues likely to significantly affect oil production?

A2: I don't know the oil business intimately, though I've had some exposure. But I'm almost sure they have long/wide ranges of year representations in the BUSINESS MANAGEMENT of their product but only very narrow ones in the drilling/pumping production systems. Wells are generally managed in terms of their beginning, current and end life-points (not only current production, but how much is presumed left to pump), which could span a hundred years. As much of this time period would be in the future on average, I would presume the leading edge of many of these applications' window of Y2K vulnerability crossed the century boundary, exposing them to Y2K errors, years ago, and so they would have had to be dealt with long ago.

Q3. Could you speculate upon the number of "big" failures we are likely to see from Y2K? Chemical plants blowing up or businesses unable to ship product for extended periods, that sort of thing.

A3: I predict no, or very very few, big/blow-up failures, especially when considering how many big things there are. Businesses unable to ship product for extended periods? That's more likely to some extent. But what matters is how often and how long those delays are encountered. That there are/may be delays is not really the issue. I say we can tolerate a fair amount of short-ish delays and a few long ones from almost every non-life-critical function of society and the economy, if necessary.

Q4. In your post, you talk about some areas that will cause problems, and some other areas that are likely to be okay. Could you speculate on what effects the problem areas will have on society?

A4: A general transaction rate and information flow slowdown of some extent (maybe 5%, maybe 20% for some period) leading to economic effects I am not competent to judge, though I have my estimates.

Q5. What do you think of the polly point of view (no big deal, systems fail all the time, y2k is just a bump in the road)?

A5: They do not understand the extent of the complexity and exposure on the software end, so it is not likely to be just a speed bump. But I believe we will ultimately handle the problems, although at some economic cost and risk.

In response to: -- Hatti (klavine@tco.com), November 10, 1999:

Q: Thank you for your time. Are you preparing? how?

A: I have stopped saying. I do not want the responsibility. We each have our own situation and what is good for me and mine may not be the same as for you and yours. And you cannot know my situation, nor I yours.

In response to: -- beej (beej@ppbbs.com), November 10, 1999:

Hi Dale...I too am thankful for your willingness to participate in this forum. I have a question about your response to Val Jones. You stated, "The second source is in energy source supplies that have to run through sometimes multi-tiered transportation systems and the companies that own them."

Q1: Are you talking about the small electric Co-ops and such that don't produce electricity, but simply pass it along to customers?

A1: No, I am talking about the coal and oil delivery systems that bring raw fuel to the power plants. Hydro-electric and nuclear power plants have even less risk in this regard because fuel does not have to be continually brought on-site. (Ironic, isn't it, that nuclear plants are, for once, at less risk for anything -- although there are heavily politically-invested people who would rather die than admit that.)

Q2: You also state, "But I continue to believe that power is so important, so strategic, that administrative and economic problems will NOT BE ALLOWED {my caps) by the government or the Fed to impact the generation and distribution of power, NO MATTER WHAT (my caps)." I'm not sure I understand what IT is. Are you speaking of disruption of power? Don't mean to be dense here...grin. How will the government not allow it? IOW, would you explain a little more what you see the government doing please and at what point they might choose to take action.

A2: I mean administrative computing-impaired financial/money-flow or even spare-parts problems. I believe the government would intercede and add its power and resources to those problems before it lets the electricity supply be seriously undermined.

Thanks ahead for your reply.

beej

In response to: -- beej (beej@ppbbs.com), November 10, 1999:

Sheesh...wouldn't you know it. You try to come off as a halfway intelligent being and blow it...sigh.

Q: I asked about IT in my above questions. I suppose what I was wanting to ask is what type of administrative or economic problems would affect the generation or distribution of power?

A: See answer to you above.

Again, thanks.

beej

In response to: -- Don Florence (dflorence@zianet.com), November 10, 1999:

Comment/Q1: Good evening, Mr. Way. I have enjoyed and appreciated greatly both your comments on this thread and the essay that was posted on Roleigh Martin's website some nights ago. As a nontechnical outsider who has been struggling to understand this problem for some 18 months, I found some of your discussion tough going, but did discover much of it to be in agreement with analyses I've read elsewhere, especially regarding complex business software systems (enterprise-wide mainframe systems, etc.). Your worst-case scenario sounds like what has often been described as "death by a thousand cuts"--or, to use your own analogy (straight from Hitchcock's "The Birds," it seems!), being pecked into considerable discomfort by myriads of birds/Y2K problems. It is difficult, if not impossible, to predict what would constitute a "critical mass" of problems for any business, let alone any industry, economic sector, or country. (N.B. Do you know Cory Hamasaki? You and he seem to be on the same page in many respects.)

Your comments about possible physical control/infrastructure problems (embedded systems) seem much more reassuring, and are very much appreciated. This area has been a lingering concern of mine, what with all the conflicting "information" on the internet and elsewhere. There are several points I'd be grateful if you would address:

1. Large Scale Embedded Systems (LSES)--for instance, Distributed Control Systems (DCS), Supervisory Control and Data Acquisition (SCADA), Energy Management Systems (EMS), etc. It's my understanding that some of these systems are purely "embedded," while others are also computer (usually PC) based. From various reports that I have seen--TAVA/Beck, Gartner Group, the British Office of Health & Safety, etc.--LSES failure rates can run as high as 35% or more. LSES are often found in the process, control, and safety functions of the chemical and petrochemical industries, the oil and natural gas industries, the power industry, and large manufacturing facilities. In August 1998 I saw published (online) data from TAVA/Beck, based on assessments of embedded systems problems in four giant U.S. corporations: an auto mfgr. (GM), a pharmaceutical (probably Johnson & Johnson), an oil company (probably Chevron), and a beverage company. TAVA was reporting embedded system failure rates of 15-20%, with plant shutdown odds in each case cited at 60-90% without proper remediation. I recently also checked a couple of UK websites (as you no doubt know, British engineers were ahead of their American counterparts in discovering and addressing possible Y2K problems in embedded systems), those of the IEE and Embedded Science, Ltd. (developers of the Delta-T Probe; you are probably familiar with it). The Brits seem to be still reporting 1-6% failure rates in tested embedded systems; Embedded Science (www.embedded-science.com) was basing its numbers on a database of some 150,000 embedded systems supposedly tested in the field. The IEE embedded system fault casebook includes some 100 "sample" cases of embedded system problems reportedly found in the field; I've reviewed most of them, and find they range from "cosmetic" to "catastrophic," according to the reporting facility's own assessment. Most striking was a reported DCS failure at a petrochemical facility: when a presumably identical DCS had been bench tested, no problems were found; but when this plant tested its own DCS in situ, massive failure resulted, with three out of four operator stations having to be shut down. Live, the actual results would have been "near catastrophic," according to this report: total plant shutdown would have been avoided, but efficiency and reliability would have been greatly reduced. (The moral seems to be, avoid type testing whenever possible!) It's my understanding that DCS of various sorts are found in many petrochemical and chemical plants, power plants (except simpler plants like hydro), oil refineries, manufacturing plants, etc.--and that, in many cases, manual workarounds of DCS failures would not be feasible.

Now, I understand (I think) what you are saying about the very narrow range ("window") of RTCs and hence their extremely limited (thankfully) vulnerability to rollover problems; but it seems to me that when you start looking at LSES, you have to consider a much wider (and potentially more disruptive) "window" of vulnerability (i.e., a much wider period during which incongruous dates may be compared). Or at least that is what I surmise from the field results reported from all these outfits. Or do you consider that their approaches, methodologies, and reported results are simply wrong or (more charitably) misguided?

2. In a "Bloomberg News" article of June 30, 1999, Cameron Daley, chief operating officer at TAVA, was quoted as saying that TAVA was finding overlooked Y2K problems in many of the 100-plus U.S. power companies that it had audited to date--problems that could lead to outages lasting up to several weeks in some areas. (Drew Parkhill of CBN News later telephoned Mr. Daley; Mr. Daley stood by his remarks.) "There are a number of instances where utilities didn't go deep enough into their systems--they accepted vendors' words that parts of a system were compliant." (Rick Cowles and I also assumed at the time that too much reliance upon type testing might have also been a problem.) I believe that Mr. Daley used to be an exec with Boston Edison, though I don't know what his actual technical background is. TAVA has many engineers (including utility engineers), technicians, and systems analysts on staff, and I presume that Mr. Daley's remarks were founded upon some actual test results--otherwise, he was guilty of almost criminal irresponsibility in making such statements in a news report, in my opinion. I don't whether TAVA was running Delta-T Probe analysis or what, and I certainly don't know whether or not this reflects a misunderstanding by Mr. Daley and/or TAVA's engineers about how power generation and T&D systems actually work; but I sure in blazes would like to have this issue cleared up. Do you know some of the folks at TAVA? If so, can you get to the bottom of this? (I have the unpleasant suspicion that this story has just assumed a new incarnation on another, well-known website, in disguise.) TAVA has also done embedded systems work for quite a few companies outside the power industry, including some household names (see above).

Reply/A1: Working backward through your extensive comments, I consider their approaches and methodologies undefined, unexplained, unknown. If they were reported in detail then everyone could evaluate them. You know, how scientists do it. Right now we are being asked to take the word of someone we do not know, based on criteria we do not know and data we do not know. You quote terms like "near catastrophic," failure rates of 15-20%, failure rates that can run as high as 35% or more, massive failure resulted, plant shutdown odds in each case cited at 60-90% without proper remediation. What do these words mean? What constitutes a failure? What do they affect? How long do they last? What is the normal failure rate, and is this a lot more, the same, less? What do they do with non-Y2K failures, shut down the plant? If so, why is this any different? What is the mechanism in going from failure to shutdown? How were the odds computed, based on what? All undefined and meaningless as a basis for understanding either the general case or these particular instances.

Comment/Q2: On the subject of the power industry, are you familiar with the January "ABB Review" report authored by Dr. Klaus Ragaller, one of the lead scientists at Swiss-based engineering giant ABB? If not, you can find an online version (minus the graphs, alas) at this address: http://www2.abb.ch/global/chibm/chibm007.nsf/Y2K/F1 ABB is the world's largest provider of electronic and digital equipment to power plants and has over 200,000 employees in 100 countries; it has also been providing massive tech support on Y2K issues in the power industry. The ABB report seems to indicate that Y2K problems have occasionally been discovered that could indeed cause power plants to trip offline; threats seem minimal at the T&D level, however. T&D-based outages would occur only if an unfortunate (and unlikely) combination of events transpired. EMS failures can be serious, of course; see the remarks by Hawaiian Electric Co. (HECO) systems analyst Wendell Ito in "Newsweek," June 2, 1997, p. 53 (or thereabouts). I corresponded one evening with Phil Hystad, the principal designer of three major SCADA and EMS systems used by U.S. power companies, and he surmised that the HECO 1996 test failure was in an old Rockwell International system. (It turns out he was right; HECO has since repaired or replaced the system. My own power company has had to go to encapsulation because ABB couldn't deliver a new EMS on time.) Regardless, it's my understanding that SCADA and EMS failures can be worked around manually, though you might then have a greater likelihood of operator error, particularly in the unusual underload conditions that are likely to occur next January as many plants (and even the main SSA database center in Baltimore) go offline for a while, just in case. (Under normal conditions, January electric demand in the U.S. is only 55% of the peak demand in July.) That's why I worry that there might still be some threats to power system reliability come January, though long-term outages seem unlikely. Anyway, any further thoughts you have on such issues would be most appreciated.

Reply/A2: I'll check it out. But I did not ever say there would be no problems of any kind. I said there would be few, and most would be caught and handled or be of short duration. If a power plant shuts down even for hours or a day, the rest of the grid is still available. (The only single-point-failure potential is from the substation to the home/office, not in any one power plant or the transmission grid.) How would this be much different from normal circumstances? Where I live in California my power goes out about once a quarter, sometimes for many hours. If that goes to five times a year for one year, I can live with that.

Q3: In your estimation, how serious are the threats to embedded systems in oil and natural gas wellheads, deep sea oil rigs, and oil and natural gas pipeline systems? Would not even relatively "minor" problems prove troublesome if they occurred in places that were difficult to access?

A3: Every situation is unique, but the general case for physical/ process control systems has been laid out by me over this whole thread.

Thanks for all your time.

In response to: -- Brian (imager@home.com), November 10, 1999:

Mr. Way mentioned: "I am worried about the interconnections because they are outside the scope of traditional remediation thought, unconsidered, like the warping of time and space is outside Newtonian mechanics. It is pretty much a blind spot."

This is an interesting observation. Just as physics has trouble with the measurement of quantum particles, so too would we have trouble with our observations of what is around us.

Q1: Is it a point in time or is it a wave function? (little picture / big picture)

A1: Too philosophical for here, but my money is on a wave function, although superstring theory removes some of the assumed quantum foam artifacts from the issue.

Q2: Do our observations affect the final result? (contributing errors to the code)

A2: Interesting you should say that. Turn your observation about our observations around. Our putting remediated code through the testing phase is really an observational activity, more realistically part of the ASSESSMENT phase, not a validation function, except after many iterative loops of assess/fix/test/re-assess/re-fix/re-test. Another delusion we have allowed to become entrenched in Y2K-land is that these phases are linear and sequential: when you finish assessment, you go into renovation; when you finish renovation, you go into validation; when you finish validation, you go into implementation. What hogwash. What a fundamental misunderstanding of the real task inherent in Y2K.

Q3: How does the minute reconcile with the macro? (Where does the code stop and real life start)

A3: All politics may be local in terms of motivations, but decisions get made, resources get allocated, at a more central, macro level. Hopefully those two line up congruently.

Q4: What will it take to move the masses of humanity? (Which hair will it take to break the camels back)

A4: There are many such hairs that could do it, but compared to how many hairs there are, not that many. We are right to watch out for the non-linear response of highly coupled complex systems -- a little stimulus can force, through various unpredictable gyrations of the parts of the system, a very big response. But we should also keep in mind this can work for us too; a big stimulus can be damped out by similar gyrations into a small response. Sometimes they work for us, not always against us.

Q5: What is the momentum of the errors at any given time? Could this be measured and how? Where is this momentum going?

A5: That is a very good question. I would like to think this is something Gen. Kind's information center would do. But I have no idea yet what they are doing.

Q6: And my favorite question: What could we do to increase man's understanding of the movement of time and change? Learn to anticipate.

A6: We incorrectly, or incompletely, associate change with the new, as in adding new things/technologies, when we should be focused on understanding how to change the existing. We have a creationist bias: believing the main thing is creating new systems based on new technology. We should have an evolutionary bias based on understanding all that we already have at the level of completeness and detail necessary, and knowing how to simply, safely and regularly transform it, intelligently moving it onto an ever-newer architectural and technological foundation. Then we would not have to anticipate correctly all the time, but could adapt quickly to new realities as they emerge.

In response to: -- Taz (Taz@aol.com), November 10, 1999:

Q: This dissertation is very difficult for this mentally impaired old lady to understand. But I hung in there and read every word...and sometimes twice! I was feeling some "hope" until I got to the quote below. Is this not the standard quote of all the pollys and all the JQP in denial? This quote took me right back to square one. "But I continue to believe that power is so important, so strategic, that administrative and economic problems will not be allowed by the government or the Fed to impact the generation and distribution of power, no matter what." Taz

A: Mentally impaired, indeed! You, Madame, have uncovered a stark truth that few in the field, let alone the wider world, have noted. It is said more explicitly in the Technical Information Statement of the Year 2000 Technical Information Focus Group of the Technical Activities Board of the Institute of Electrical and Electronics Engineers (IEEE). <>. (The IEEE is the oldest and largest association of engineers and computer scientists in the world.) But you have sniffed it out, seen between the lines, squeezing it out like virgin olive oil from a first pressing. Let me state it here as succinctly as I can: BOTH the Chicken Littles and the Pollyannas ARE WRONG.

The Chicken Littles, who most often exhibit the hand-wringing/ embedded chip/ physical control systems/ we could lose power and everything!/ utilities/ hazardous material plants/ Bhopal & Chernobyl trajectory are wrong because they do not understand the basic mechanism necessary to force a Y2K error and how relatively minimal the opportunities are in those systems for those mechanisms to play out. They are wrong because they have no concept of the nature of those systems and how, and under what value and incentive systems, the overall, often life-critical systems (the ones containing all the embedded components and subsystems) were ENGINEERED to withstand regular failures of almost all of their parts at one time or another without ceasing to function. They do not understand, on top of these other advantages, how well these systems are understood by their caregivers, or how much simpler they are in comparison to those systems we must be concerned about -- those that the Pollyannas cannot see.

The Pollyannas are in denial not because they do not see threats that are not really there, but because they do not understand and appreciate the massive size and complexity of software-intensive, intensely interconnected/interdependent/data-sharing enterprise management systems normally associated with accounting and administrative functions. They do not understand how prevalent and long-lasting are the opportunities for Y2K errors to emerge in systems here and how much more difficult and time-consuming it will be to track them down and neutralize them. They do not understand how resistant these systems are to remediation, especially fundamentally flawed, compliance-based traditional invasive software remediation that pushes its most difficult problems out into the beleaguered testing phase. Pollyannas do not understand how little these systems are really understood by their caregivers and how ill-disciplined, how CRAFT- and occasionally ART-based, are the doings here, having been carried on under a decades-long succession of trendy, fashion-based technologies and methodologies, however competently (or not) executed by an equally long succession of different maintenance teams.

But before the Chicken Littles (and everybody else) run over to the other side of the boat, the accounting/administrative computing side, threatening to tip it over into despair, take great comfort from the fact that most of these errors will not be very destructive, at least not to things that really matter. We can, to a large extent, isolate and contain, or compensate for in other ways, most of the errors, including just slowing things down to the rate we can manage. Some transactions will get kicked out, some systems will stop, but only more frequently than they do now, not as if they had never stopped before; most non-trivial systems fail regularly already. Plus, as I have indicated, the problems will tend to correct themselves when the vulnerability windows of systems close as all their data representations clear the century boundary and inhabit only the 2000 side. Accounting systems do not DIRECTLY threaten life and limb. We have more flexibility in dealing with their shortcomings.

Do not think of me as a Pollyanna (more precisely a mealy-mouthed apologist) because I know the electrical system is going to function very close to, if not totally, normally through and beyond the rollover. And don't think of me as a Chicken Little because I see the weakness in the administrative computing infrastructure. I am a bell-curve centrist. The extremists on both ends are wrong. As Will Rogers said, "It's not what we don't know that hurts us, it's what we know that ain't so."

In response to: -- Dog Gone (layinglow@rollover.now), November 10, 1999:

Do you have any hard evidence for that belief, or are you just guessing? Perhaps out of a genuine concern for the possible effects of the crisis and your perception of society's late or insufficient response, you have taken a position that you now feel compelled to defend, irrespective of any new information that might be developed.

Q: The latest information that I've seen is from Medicare, which said that only 20% of the providers have tested remediation, and many of them failed the test. The IRS has said that it is still doing an inventory of its systems, meaning remediation, if needed, has not yet begun. Do you have some newer information than that (this was from last week)? In essence, I take your comments to mean that Y2K was never a serious problem, with or without remediation, and that we should be able to work around and FOF. Like you say, you are an optimist.

A: What test? What was tested? How was it tested? How do you know those IRS systems need remediation? You have not been given enough information to know what this really means. (Although you have been given enough to engage your imagination and hear what you want to hear.) I never said nor implied Y2K was never a serious problem nor said we should be able to work around and FOF. I said to a certain extent we will HAVE to work around it and FOF while and where we have to because in many situations, there never was a choice, our illusions to the contrary notwithstanding.

In response to: -- Pinkrock (aphotonboy@aol.com), November 10, 1999:

Comment: Mr. Way: First, I wish to thank you for your efforts to educate and bring awareness and understanding to this enormously complex problem. You have authored two of the most important documents on the subject. Unfortunately, these have not made the problem easier to understand, but have highlighted the complexity of the situation.

Reply: "There is no subject, however complex, which -- if studied with patience and intelligence -- will not become more complex." -- New Speaker's Handbook. And, "For every problem, there is a solution which is simple, neat, and wrong." -- H. L. Mencken

Q: For many of us, this has heightened our level of concern, and strengthened our resolve to be prepared to face the consequences. One question I have for you: Jim Lord outlines a plan for mitigating the consequences of failures in the industrial sector in his 10/22/99 essay "A Graceful Scenario." It can be found at his website at www.jimlord.to under new items. Could you please comment on this strategy? Do you think it will be implemented on a wide scale? I guess that is two questions. Sorry. Godspeed,

A: Sorry, I do not have time to read, analyze and comment on Mr. Lord's piece at this time.

In response to: -- d.b. (dciinc@aol.com), November 10, 1999:

Wow!!

Q1: It looks more and more like we will have power and utilities. Unfortunately, and I do mean unfortunately, it clearly looks as if Mr. Way is pointing very strongly towards the BACK ROOMS, OFFICES, economics, etc. Banks toast? Stock market? Insurance? Reserve? Forget about power and water. How is this country going to operate if none of the bookkeeping can be done or reconstructed when the systems in the back rooms fail?

A1: We have longer to deal with it before it really bites. Systems are unlikely to collapse; they will just run intermittently and at a slower rate overall. There are temporary arrangements we can make, if necessary. We do not eat, drink or breathe money and bills. No question that is important and there will be impacts. But the banks, markets et al. will not become toast.

Q2: You mean this may all come down to people being honest about what they owe or are owed?? HONESTY!!!

A2: It may. Then we can figure on the usual breakdown: 20% selfish and venal, 20% altruistic and noble, and the 60% middle going whichever way the wind blows strongest. Forget all this technology crap. See what the real challenge is??

Q3: I smell government privatization, how about you!!

A3: I smell government doing everything it can to keep things going until we get through this, including keeping everybody calm through the rollover.

In response to: -- Lewis (aslanshow@yahoo.com), November 10, 1999:

Mr. Way, let me add my appreciation for your contributions.

I am not sure I agree with one of your central points, however.

In point #2 of your post, you state:

"2. With a few rare, minor, short-term exceptions, every logical element in a two-digit year computer system (a device or program module or anything built of combinations of them) will cure itself, without remediation, given enough time..."

Q: I believe this is accurate up to a point. However, the epidemic problem I've found in the Access/VB world I work in is that 2-digit inputs are interpreted by the software *before* any date comparisons are made. In most applications I've seen, users are forced to enter dates in a mm/dd/yy format, and the software makes the assumption about which century the date belongs in. So a birth date of 01/15/23 is interpreted and stored as Jan 15 1923 in some systems, and 2023 in others. Either way, the software only allows a 2-digit year input and if the software is placing it in the wrong century, the user can't override it. In many cases, the input is also tested to see if it is even a valid date and, depending on how the application was designed, 01/15/00 may not be. So a system can be thoroughly disrupted before ANY date comparisons are made. These vulnerabilities are not limited to Access/VB applications. This would not resolve itself within a reasonable time period.

A: I did say there were exceptions. But before we conclude this is one, let me ask you: how many date entries will be misinterpreted? Out of how many total? How many Access/VB transactions are there compared to all the transactions in the world? Could you define (not for me, but for yourself) what "thoroughly disrupted" really means? At a detail level? Is there no other way to handle the dates, or the transactions they come in with, that are misinterpreted? Maybe those are few enough to be handled manually. Maybe what that system does could be lived without for a while while it is adjusted to more properly interpret the dates?

In response to: -- Wishy Washy Doomer (polly@ismy.friend), November 10, 1999:

Q: Thank you Mr. Way for the awesome thread. In your opinion, is there any way that the y2k technical problem could cause food shortages that would require more than the usual 3-day emergency stash? I am trying to convince my friends to prepare but my own convictions vacillate wildly from day to day, so I'm not very convincing.

A: My sense is that overall (I cannot know your particular situation) food distribution will at worst be slowed down, not all choices will necessarily always be available all the time as we are used to, and everyone will continue to accept checks even if the banks take longer to process them. (What choice do they have? They cannot afford to lose that many customers. It is not like whole bunches of folks will be paying with cash or gold coins for very long.) In a really catastrophic situation, where only hoarded food is available after many months, I doubt you could store enough food to make much of a difference. But I reiterate, I think that is highly unlikely, given our adaptability when the chips are down (no pun intended).

In response to: -- TruthSeeker (truthseeker @ seektruth.always), November 10, 1999:

Mr. Way, you write:

"2. With a few rare, minor, short-term exceptions, every logical element in a two-digit year computer system (a device or program module or anything built of combinations of them) will cure itself, without remediation, given enough time. This is true because every logical element has its own range of year representations it must COMPUTER UPON (not just record/store) in a consistent manner."

Comment 1: I'm sorry, but I just can't buy into this argument of yours at all. I find it to be so theoretically weak, or just plain inaccurate, as to be of no practical use in the real world.

Reply 1: Could you describe some facts or conceptual structures and reasoning you used to come to that conclusion, so that our readers would be able to evaluate your opinion relative to mine?

Comment 2: It should be obvious to many of us by now, that in the real world, the majority of broken computer bios, broken operating systems and broken user software simply could not be allowed to "cure" themselves. To do so would of course deny us the use of many mission critical systems - while we wait for some hypothetical "curing time".

Reply 2: They do not really cure themselves. That is a figure of speech. The vulnerability goes away, that is all. I'm not saying do nothing until the window closes, only that the number of problem dates or transactions will decrease regardless of what you do. Do whatever you want, but recognize the limitations of what you are doing so you will not be blindsided by unexpected errors. You may be denied the use of many mission critical systems regardless of what I say and what you do. I can neither give nor take away that reality.

Comment 3: Case in point: I have written a very time/date dependent software program of approximately 45,000 lines of code.

Reply 3: Let's see. The average number of finished, tested and working lines of software a programmer generates a day is, and has been for the entire Computer Age, FIVE. So unless you are quite exceptional, that took you 9,000 days to write, or a little over 40 business-hour years. Wow! That's amazing. Am I missing something here, or are you just throwing numbers or achievements around like elephant dung on art?
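For the record, here is the arithmetic behind that figure as a quick sketch (Python used only as a calculator; the eight-hour day and the roughly 1,800 business hours per working year are my assumptions):

    lines = 45_000
    lines_per_day = 5              # long-run industry average cited above
    days = lines / lines_per_day   # 9,000 working days
    hours = days * 8               # 72,000 business hours, assuming 8-hour days
    years = hours / 1_800          # ~1,800 business hours in a working year
    print(days, hours, years)      # 9000.0 72000.0 40.0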

Comment: A great deal of data is acquired by the software and used to control a variety of external functions like heating, air conditioning, security, etc. All coding subroutines pass four byte year data and hence are termed compliant by definition. The software is designed to be run 24 hours a day, seven days a week and the users depend upon that functionality. Dates, forward and backward, are continually compared.

Reply 4: How far forward? How far backward? Are those date representations used in computations that make direct modifications to the operations of those facilities? Tell me something I can use to evaluate what you are saying. But if they are 4-digit years, why are you telling me this at all?

Comment 5: If this program is run on an unremediated computer with a faulty date bios, it simply fails, irrespective of the error trapping routines to avoid divide by zero, etc. Fix on failure in this case means replacing the bios, or entire computer when zero down-time is the objective. If tech support told a user to just shut it off and wait for it to cure itself, you could hear the laughter all the way to your home.

Reply 5: The few BIOSs that may not smoothly handle the rollover can have their OS and RTC clocks reset ONCE and be fine again. I have to manually change the dates on my computer twice a year because of daylight savings time coming and going. It seems to me it would not be the end of the world if you had to change yours once every one hundred years.

Q: Further, you write: "When that entire range is on ONE SIDE of the century boundary, AND IT DOES NOT MATTER WHICH ONE, there is no ambiguity with the missing century digits." Oh, really? Well then what happens in the real world, and most likely occurrence, when we must use dates on BOTH SIDES of the century boundary? Have we now created "ambiguity" with the missing century digits? After all, in two-digit math 00 - 99 = -99. I'm afraid I don't see a cure for this one, short of adding more error trapping. You deal with this century date straddle (using years on both sides) with the statement: "However long (in intrinsic calendar time) year/date representations in that range that straddle the century boundary can or are allowed to flow into the element is how long the element is intrinsically vulnerable to Y2K, no less, no more." I have read this at least twenty times and have no idea what you are trying to say.

A: That forest must really get in the way of those trees. YES, when we must use dates on BOTH SIDES of the century boundary WE HAVE A PROBLEM. Not as in "Houston, we have a problem" and we have blown up our spacecraft, but as in we have a transaction that might get bounced or be mis-processed. Boy! That never happens! What you are missing is that ALL THE OTHER dates/transactions will NOT have problems. Isn't that a big improvement from when you thought ALL two-digit years would automatically be problems? But I don't think you can see anything good. You have already made up your mind that it will be bad and that is that.
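If it helps, here is a toy sketch (hypothetical Python, standing in for whatever legacy language an element happens to be written in) of exactly what opens and closes the vulnerability window:

    def years_between(yy_later, yy_earlier):
        # The two-digit subtraction a legacy element effectively performs.
        return yy_later - yy_earlier

    # Entire range on ONE side of the boundary: the math works.
    print(years_between(99, 97))  # 1999 - 1997 -> 2, correct

    # Range straddles the boundary: the missing century digits bite.
    print(years_between(0, 99))   # 2000 - 1999 -> -99, should be 1

    # Once ALL the data the element computes upon has cleared the boundary,
    # the same unremediated code is correct again.
    print(years_between(3, 1))    # 2003 - 2001 -> 2, correct

The element is vulnerable only for as long as straddling pairs like (00, 99) can flow into it; before and after that interval it computes correctly without anyone touching it.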

In response to: -- (...@.......), November 10, 1999:

Comment: Mr. Way; I understand your statement that yours is a critique. Indeed it is, and in that regard I personally feel you make a very valid point in that critique. I believe I understand your points. I don't really have much to add so I will be off line.

Beware the slippery slope, for gravity exists.

Reply: Thank you.

-- Dale W. Way (d.way@ieee.org), November 11, 1999.


Thanks, Mr. Dale, for your responses to my questions. They are appreciated.

Evidently what's needed is more open airing of research (field) findings by Embedded Science, TAVA/Beck, and other engineering firms actually involved in embedded system remediation (in TAVA's case, for a number of Fortune 100 companies). Also needed are more technical details on the cases submitted for the IEE (UK) embedded system fault casebook; I do not know how well the IEE screens these examples, though I presume some background checking was conducted. I was hoping that you or some members of your committee had contacts at some of these firms and could provide more details on what was being discovered out in the field. Otherwise, it is very difficult for most of us to decide just who is right and who is wrong.

-- Don Florence (dflorence@zianet.com), November 11, 1999.


I'm sorry. I meant to write, "Thanks, Mr. Way."

-- Don Florence (dflorence@zianet.com), November 11, 1999.

Mr. Way,

Judging from your position with the IEEE organization, you must be a very experienced engineering type person. I am a very experienced (33 years) systems type person. That said, my question to you is this.

Does all of our experience give us enough insight to predict with any accuracy the possible results of the Y2K problem? This particular problem has never occurred before in the history of computing. An event that requires(?) mass changes to large numbers of computer programs running on various hardware/software platforms with an unmovable deadline is certainly new to computing. My experience tells me that the road through this problem is going to be very rocky and the ultimate outcome is going to depend upon people more than computers.

Your thoughts would be appreciated.

Respectfully,

wally wallman

-- wally wallman (wally_yllaw@hotmail.com), November 12, 1999.


Mr. Way, here you go again. Your retort to my critique above:

(Way) Reply 5: The few BIOSs that may not smoothly handle the rollover can have their OS and RTC clocks reset ONCE and be fine again. I have to manually change the dates on my computer twice a year because of daylight savings time coming and going. It seems to me it would not be the end of the world if you had to change yours once every one hundred years.

TruthSeeker Response:

1. If the RTC only requires a single reset, what do we do with a power-down situation where the CPU must be cold started? Implicit in your argument is the perfect-case scenario where no system will have to be rebooted. You see, the poorest date-capable bios and OS (and there are still plenty of them out there) will continue to see the two digit date string and continue to replicate the error on each and every bootup - assuming no remediation. To make matters worse, many systems are built to run unattended, as is our data acquisition and control software. This of course means that there will seldom be a human operator to "manually" fix the date upon power failure, etc. A watchdog timer serves no useful purpose in these situations. That is what is precisely wrong with your statement, "It seems to me it would not be the end of the world if you had to change yours once every one hundred years." It well could be the end of the world for a plant requiring a hydraulic valve to be turned on or off at an exact moment in time.

2. A more minor point, but still important. Practically all new computer bios automatically adjust for daylight savings time twice a year. If yours do not, it implies they are outdated and should be checked for date rollover as well as leap year reliability. And that includes reliability after power off!!!

-- TruthSeeker (truthseeker@ seektruth.always), November 12, 1999.


Mr. Way, please compare and contrast this info with your findings -- thank you.

California State Water Resources Control Board http://www.swrcb.ca.gov/html/y2ksumry.html

Summary of YEAR 2000 Embedded Systems Issues

There is a lot of material contained in these links. Quite a bit of it is redundant. Realizing that your time is limited, and you may not be able to visit each of the sites listed, this document is an attempt to summarize the key issues.

* Based on experience, it takes 21 months to inventory, analyze, fix and test the embedded systems in a small to medium sized plant.

* Testing of embedded chips must be done very carefully; there are instances where tests have damaged chips beyond repair.

* Testing of embedded microchips can be very complex, requiring trained specialists with specialized equipment.

* Approximately 10% of embedded chips have date functions. Approximately 4% of those may not be year 2000 compliant. That equates to an estimated 160 million non-compliant chips in the world (the implied arithmetic is sketched after this list).

* Just because a chip does not perform a date function does not guarantee it will work properly after 1/1/2000 (date sensitive chips have been used in devices that do not perform date functions).

* Just because an embedded chip works properly on 1/1/2000 does not guarantee that it will continue to operate properly after that.

* There are over 30 tests that must be performed on each embedded system to ensure that it is year 2000 compliant.

* Vendors of electronic devices cannot be certain that they are year 2000 compliant (therefore, all devices must be tested).

* Just because one device is Y2k compliant, there is no guarantee that all other devices that have the same manufacturer, model, version, lot number, etc. will also be compliant (therefore, all devices must be tested).

* A system containing numerous Y2k compliant embedded microchips (from different manufacturers) may not be compliant, because there is no date standard followed by all manufacturers.

* It will not be possible to test all embedded systems and repair the non-compliant ones (therefore, efforts should be focused on the most critical systems).

* It is possible that major utilities (electric, telecommunications, water, etc.) will be unable to operate for some period of time after 1/1/2000 (therefore, it is essential to have a contingency plan).

* Businesses will be held legally responsible for damages caused by non-compliant embedded systems.
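The 160 million figure appears to rest on a world population of roughly 40 billion embedded chips; a quick check of the implied arithmetic (the 40 billion is our inference, not stated in the summary):

    world_chips = 40e9                   # inferred world chip count, not stated above
    date_capable = world_chips * 0.10    # ~10% have date functions
    non_compliant = date_capable * 0.04  # ~4% of those may be non-compliant
    print(f"{non_compliant:,.0f}")       # 160,000,000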

-- ed portillo (magnacarta00@aol.com), November 24, 1999.


I am wondering about the British experimental results. They have done some advancement work and published sneak previews at: http://www.iee.org.uk/2000risk/Casebook/eg_index.htm

An example which I just randomly picked out (below) shows the shutdown of an HVAC system...I think these are important in electric power generating systems. I guess experimental results are all we can really rely on at this point....the techno-bantering will not do me or my chemical company any good at this point.

Equipment Type: HVAC
Industry Sector: ALL
PC or Computer based: No
System Age: 6
Application Package: Boiler Control System, local and remote
Description of the Problem: Hardware and software
How was it Identified: Z180 microprocessors found during physical examination, and 2-digit date found when examining code (assembler?)
What was the Solution: Solution not yet known, as manufacturer not now in business
Consequences for the SYSTEM: System stops
Consequences of failure to the BUSINESS: Failure would result in no bulk oil supplies to a major works, as steam is used to preheat heavy oil for distribution (5 x 10,000 tonne tanks pumped at 50 t/hour)

There are 106 other examples on the site where advancement of the date in chips has caused shutdowns. Am I looking at this wrong?

-- Marcus Brackeen (marcus@chromatographers.com), December 06, 1999.

