Stuck Dominoes???

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

OK, folks, we're almost to mid-November. And I've yet to see any "pessimists" step up to address the current situation.

Daily, Homer Beanfag and others grace these pages with multiple posts of computer problems that are considered newsworthy. In virtually every instance, these problems are related to system replacements. It isn't worth the time to debate whether the replacements were for Y2k reasons or not; the point is, they happened.

These posts also point out that fallbacks to old systems were not available, or utilized. Again, it isn't worth the effort to debate whether these were correct decisions; the point is, it wasn't done. The problems and failures have been dealt with, for better or worse, in the replacement systems.

These are problems that have been occurring for some time; some stretching back to the beginning of the year. Problems that have not been fixed in "oh, 2 or 3 hours". Some ongoing for months.

All of which provides daily validation of what I've been trying to say: that the massive number of system replacements and modifications has generated enormous numbers of errors and failures. The vast majority are caught and fixed before any external effect is witnessed; some are not.

Dale Way's essay and comments have provided more backup. He describes in more detail the problems that occur in interfacing systems when individual pieces are modified or replaced. And again, these are not being somehow "stored up"; they have been occurring on an ongoing basis, as systems are replaced or re-implemented into production. These types of errors are more acute during system replacements; remediated systems for the most part can be done "in place", with little or no changes to external interfaces and systems.
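To make the interfacing point concrete, here is a minimal sketch of how replacing one system can break an unchanged neighbor. The record layout, function names, and field widths are all hypothetical, invented for illustration; they are not taken from Dale Way's essay or from any real system.

```python
from datetime import date
from typing import Tuple

def write_record_old(order_id: int, d: date) -> str:
    # Hypothetical legacy layout: 6-digit order id followed by YYMMDD.
    return f"{order_id:06d}{d:%y%m%d}"

def write_record_new(order_id: int, d: date) -> str:
    # Hypothetical replacement layout: 6-digit order id followed by YYYYMMDD.
    return f"{order_id:06d}{d:%Y%m%d}"

def read_record_downstream(rec: str) -> Tuple[int, date]:
    # The downstream system was not touched and still assumes the old layout.
    order_id = int(rec[0:6])
    yy, mm, dd = int(rec[6:8]), int(rec[8:10]), int(rec[10:12])
    return order_id, date(1900 + yy, mm, dd)

if __name__ == "__main__":
    today = date(1999, 11, 11)
    # Old sender and old reader agree on the layout.
    print(read_record_downstream(write_record_old(42, today)))
    try:
        # New sender, old reader: "1999" is read as year 19, month 99.
        read_record_downstream(write_record_new(42, today))
    except ValueError as exc:
        print("interface error:", exc)
```

Note that the failure here has nothing to do with the rollover itself: the replacement is fully four-digit-year capable, and the error shows up in November 1999 because the two sides no longer agree on the record format.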

Although I have little direct experience with "embedded systems", my opinion is the above holds true, if not more so. Problems in interfacing systems are more acute when individual systems are "replaced". My impression is little actual "remediation" is available for most embedded systems, and that most that do fail require replacement.

So all of the above is occurring, in a very compressed timeframe. But where are the "Falling Dominoes"? Where are the JIT problems with supplies? Maybe I'm not looking hard enough, but it sure seems to me that I can walk into virtually any store and find the same products available as a year ago.

Now, I'm sure the usual suspects will jump in with the "it's not 2000 yet" chant. But that's not the point, folks. If anything, system errors and failures due to replacements and modifications are much more severe and difficult to deal with than date processing errors. According at least to GartnerGroup, we are well up the slope of date-related error rates. Yet virtually every post of "Y2k-related" errors is due to replacements, and not due to failures in date-processing.

Yes, there have been problems. Yes, there will be problems. But every indication is that we, globally, have more than enough capacity to deal with the problems that do occur. That we have not even approached the "error threshold" of our ability to deal with system problems. Nor will we.

So, are the Dominoes just Stuck?

-- Hoffmeister (hoff_meister@my-deja.com), November 11, 1999

Answers

Hoff,

Maybe we should wait till the rollover before we start talking about dominoes. :)

-- (JohnQ.@public.com), November 11, 1999.


Let's hope so Hoffmeister. You've been a solid source of accurate information for as long as I can remember, so I'm thinking you're probably pretty close to the mark with this as well.

I hope the others who read this won't arbitrarily write you off as a "polly", because they would be losing out on a rational perspective from someone who knows.

-- (anon@anon.com), November 11, 1999.


Everyone has their fingers crossed~ for once maybe it's working.

-- kevin (innxxs@yahoo.com), November 11, 1999.

Who knows? (Hoff doesn't) Who cares about you or your family? (Hoff doesn't)

Finish your preps. The previews are almost over. Know where the exits are. And, now, sit back and enjoy the show.

-- No Smoking Please (SitBack@Enjoy.The.Show), November 11, 1999.


Hoff:

So you don't think there will be a possible significant heightening of the risk with respect to a potential worldwide embedded systems crunch on or shortly after 1/1/2000?

If you don't think so, what is the basis of your position?

Thank you,

-- eve (123@4567.com), November 11, 1999.



Dear Hoffmeyer,

Thanks for the lovely post. Your insight into this issue is refreshing. It might be a good time to put aside a little reserve of cash and some gas cans. Also, while you're out, if it's convenient, you might want to consider getting five or six cases of pork 'n beans. Just to be on the safe side, of course. And don't forget the cornbread mix.

Go back to sleep now.

Have a Nice Day!

(hitting the)

-- snooze button (alarmclock_2000@yahoo.com), November 11, 1999.


As an authority on SAP, could you explain to me why problems with Hershey and Whirlpool have not been corrected in 3 hours? Eagerly awaiting your reply.

-- Brooklyn (MSIS@cyberdude.com), November 11, 1999.

Wait'll international communications and trade go down.

Then go into your store and give us a then vs. now availability comparison.

And why aren't you up helping those poor people at Hershey?

[Do you drink coffee, Hoff?]

-- Usual Suspect (lisa@work.now), November 11, 1999.


It seems that the Domino theory doesn't apply any better to Y2K than it did in Southeast Asia. I would suggest a more appropriate theory would be the Avalanche Theory. That is, that it can snow and snow and snow, yet all we see is the pristine beauty of the glistening peaks until that one little flake of snow sets off the crushing avalanche. Most everyone is blinded by the beauty of the scene.

"Look at how pretty it is and how the new snowfall piles higher and higher"

"Nothing bad could happen here, I don't need to pay attention to the ranger's warning"

"How could my skiing across that pretty landscape have any effect on this great big mountain?"

-- VegasBoy (Heck@whadoiknow.com), November 11, 1999.


Sorry, Hoffy. But I just HAVE to point out that we are not in the year 2000 yet, hence we have yet to experience any of the upcoming problems that will be caused by systems that cannot handle the year 2000. (Otherwise known as Y2K problems.)

As you correctly observe, many places -- often motivated by wanting to REPLACE their non-Y2K compliant systems with ones that are Y2K compliant -- are experiencing problems as they try to implement the replacement systems. These problems may indeed be independent of Y2K per se, since:

1) They are Y2K compliant.

2) We are not yet at the year 2000. (I know, sorry, but I really had to state it again.)

Now, WHAT HAPPENS when systems that were NOT replaced outright by Y2K compliant ones enter into the year 2000 (which, at this time, we are not yet quite ... uhhh, sorry) and it turns out that they are not Y2K compliant? Perhaps a lot of problems? Which will ADD to the woes already being experienced by Y2K compliant but incorrectly implemented replacement systems?? Which just might provide the DOMINOES that you seek???

Yep, like it or not, you really have to wait until 2000 to see what happens. Just like on the Titanic, where until the ship actually touched the iceberg, everything seemed peachy keen. Nobody knows what will happen, but some people use common sense when they start guessing.

-- King of Spain (madrid@aol.cum), November 11, 1999.


JohnQ

Thanks! I'm sure more are to come.

eve

Again, I have no real experience with embeddeds. I have opinions based on research, but unlike some, I don't consider that enough to make me in any way an "expert".

But, since you ask, my opinion is that:

a) Very small percentages of embedded systems seem to even care about dates.

b) Of those, most seem to be cosmetic problems, that do not affect the actual processing.

c) And as I stated above, since my impression is that most embeddeds with problems require replacement, my guess is that error rates due to replacement of embedded systems again have already been quite high.

Yes, I still expect a spike on rollover. But Dale Way brought up another interesting aspect, in basically the "failure span" of a system. Virtually every "Y2k" problem in a system will eventually not be a problem, when the system stops dealing with dates spanning the century rollover. IT systems deal with data that can span years, and thus have a very long "failure span". My impression is embeddeds tend to have "failure spans" measured in seconds, minutes and hours. So while I expect a spike, I expect the resolution to be fairly quick.
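A rough sketch of the "failure span" idea, as I read it: a system is exposed only while the dates it handles sit on both sides of the rollover. The function names and the lookback/lookahead figures below are assumptions made up for illustration, not measurements of any real system.

```python
from datetime import datetime, timedelta

ROLLOVER = datetime(2000, 1, 1)

def failure_span(lookback: timedelta, lookahead: timedelta):
    """Return (start, end) of the period in which a system whose data reaches
    `lookback` into the past and `lookahead` into the future is handling
    dates on both sides of the rollover."""
    return ROLLOVER - lookahead, ROLLOVER + lookback

def spans_rollover(now: datetime, lookback: timedelta, lookahead: timedelta) -> bool:
    # True while the window [now - lookback, now + lookahead] straddles the rollover.
    return now - lookback < ROLLOVER <= now + lookahead

if __name__ == "__main__":
    # Assumed figures: a controller that only compares against the last hour,
    # and a ledger holding 7 years of history plus 1 year of forecasts.
    systems = {
        "embedded controller": (timedelta(hours=1), timedelta(0)),
        "general ledger": (timedelta(days=7 * 365), timedelta(days=365)),
    }
    for name, (back, ahead) in systems.items():
        start, end = failure_span(back, ahead)
        print(f"{name}: exposed from {start} to {end} ({end - start})")
```

Under these assumed figures the controller is exposed for one hour after midnight, while the ledger has already been exposed for most of 1999 and stays exposed for years.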

Brooklyn

Don't have details on Hershey, or Whirlpool. My impression from the articles is that at least Whirlpool had known capacity problems, yet went live anyway.

SAP typically involves changing business processes. Users are not usually receptive, and can take quite a long time to "embrace" the new process. Usually, this leads to more errors in data entry and processing flow, which in turn means problems in order fulfillment, etc.

lisa

Sorry, my plate's been full. "Helped" one company go live 2 weeks ago; "helping" another right now.

-- Hoffmeister (hoff_meister@my-deja.com), November 11, 1999.


Is this the same Hoffmeister admitting that things are wrong?

Is this the same Hoffmeister who laughed at efforts to prepare?

He now backpedals and tries to exit quickly --

"I'm sure the usual suspects will jump in with the "it's not 2000 yet" chant. But that's not the point, folks."

IT IS THE POINT, GUY! Just wait until the new year when reasons begin to mount to seriously start replacing systems. There will be no time, no staff, and no cash-flow -- no chance!

Pete

-- Peter Starr (startrak@northcoast.com), November 11, 1999.


I look at it this way. The companies that are having problems right now with new installations, are some of the companies that are/were the farthest ahead in the compliance game. What is going to happen with all the companies that will work on implementation right through the end of December, then force the system into use only because they are out of time? Between that and the remediated systems that are rushed into place in late December sounds like an awfully severe bump to me...

-- Bob (bob@bob.bob), November 11, 1999.

Pete

Ummm, reading comprehension problems?

Is this the same Hoffmeister admitting that things are wrong?

Well, yes and no. This IS the same Hoffmeister that said things go wrong daily with systems. And the same that said that in total, the number of errors and failures due to replacements and remediation will dwarf the number of actual "rollover" errors.

And the same that said they would be handled. Just as they are being handled.

Is this the same Hoffmeister who laughed at efforts to prepare?

Hmmm. Don't remember that. I have said that I see no reason to "prepare", but don't recall laughing at anyone. Care to enlighten me?

He now backpedals and tries to exit quickly --

"I'm sure the usual suspects will jump in with the "it's not 2000 yet" chant. But that's not the point, folks."

IT IS THE POINT, GUY! Just wait until the new year when reasons begin to mount to seriously start replacing systems. There will be no time, no staff, and no cash-flow -- no chance!

I realize it's useless to attempt to explain this to you. Keep up the chant, though. I'd suggest you start on the new party-line of the "long-slow grind". It has so much more potential for "coverups" and "lies", and has a virtually limitless time-frame.

-- Hoffmeister (hoff_meister@my-deja.com), November 11, 1999.


Hoff,

You are correct. It's just getting started.

Repeat after me, "It's just getting started."

Hoff, if this is as bad as it gets then we will be okay. This assumes we are CURRENTLY AT THE PEAK FAILURE RATE. In fact, we are NOW operating inside the "POLLY Scenario" where (mostly) single failures are occurring but being (for the most part) dealt with successfully.

Will computer/embedded chip problems remain as drawn-out and rare as they currently are, for the next two months? If failures continue to increase, at what point does the failure rate reach "critical mass" (when multiple failures begin) and the dominos (connected systems) truly begin to fall?

In two months we will know the answer. Until then, prepare.

-- GoldReal (GoldReal@aol.com), November 11, 1999.



Users are not usually receptive, and can take quite a long time to "embrace" the new process.

ROTF!

[he's transparently drawing a parallel to how quickly hostages 'embrace' their terrorist captors]

Hey, Hoff, glad to see ya back. Go to bok's place later, if you can make it.

-- lisa (lisa@work.now), November 11, 1999.


Hoff:

Do you have any guesstimate/hunch on how much of the world economy might be effectively shut down by the "spike" that you speak of?

-- eve (123@4567.com), November 11, 1999.


You're quite right, which is why I'm not TOO worried about the technical aspect of Y2K.

A caveat though: embedded stuff like PLCs really are a special case. A lot of them date back to the 1970s and 1980s when Y2K was just a bunch of letters, so their compliance is often "unknown" or more specifically "undeterminable".

I'll talk about the two situations that I've seen, or heard about directly from a drinking buddy. In the nuke power industry, "unknown" is read as "non-compliant" and they've been ripping them ALL out - regardless of whether they have any date related function whatsoever - and replacing them with certified systems for the past few years (which strengthens your argument).

But in the supermarket distribution sector, they've been replacing only those that they have determined are date related. Rather, they hire in consultants to do it, and if the one my friend dealt with is representative, we might as well assume that they were tested and replaced on a random basis. There's a big difference between a group of software engineers analysing, checking and re-checking code in a cosy, climate controlled, coffee-machine equipped office, and a wet, chilled hardware engineer working on his own in a warehouse, sticking probes into an endless row of PLC's, tapping the same key on his laptop over and over, and ripping out and replacing the spurious modules. There's no REASON why we can't have replaced them all, I'm just dubious about the reliability of the replacement methods.

-- Colin MacDonald (roborogerborg@yahoo.com), November 11, 1999.


Hoff,

You state that "The problems and failures have been dealt with, for better or worse, in the replacement systems." I was wondering if you had evidence as to what percentage of companies have gone live as of today. Couldn't it be that most are procrastinating until the last minute, before they slam things into place? Wouldn't a "delay implementation, and let's see what happens to our competitors" be a likely strategy?

-- --Brad (Brad@Brad.com), November 11, 1999.


Hoff,

Good question, but a little premature I think.

I believe what we are seeing with Hershey / Whirlpool is just an indication of what will be happening in the future. The companies produce finished goods, not parts further up the supply chain. They have implemented ERP or enterprise-wide systems that were definitely not ready. Why roll them out when they did?

Well, I have been working with many companies attempting to get into a steady state as of 7/1, or 10/1. The mandate has been a freeze on system changes after that point. Could be what's happening here.

Another possibility is that Hershey wanted to get everything in and working with time to spare for the "busy season" (Halloween / Thanksgiving / Christmas). So after beginning the project three or four years ago...they wanted to get it out now.

Now what about the dominoes? I haven't read about problems further up the supply chain. Why is that? Maybe there have been cut overs to legacy systems? Maybe it's smaller businesses not taking this as seriously and not looking to roll out new systems, but rather fix on failure (FOF).

Maybe, just maybe, they are still racing to implement, and we will start to see the problems in December, or after planned shutdowns over the holidays. That's traditionally a time when many companies do system maintenance/upgrades.

An area that I've got first hand knowledge in is Payroll, and let me tell you, a company will not go live with it until they are absolutely sure (school districts are evidently another story).

Tell you what though, I know of Fortune 50 companies that are running out of time on payroll, and will be stuck putting out buggy systems in January.

So all in all, you may be hitting on something, but there are a lot of variables. I'd like to think that maybe there are some dominoes strong enough to stop the onslaught!

-- Duke 1983 (Duke1983@AOL.com), November 11, 1999.


Hoffmeister,

First of all, on a non-technical level, it's worth noting that the government announced on Monday that 24-hour operation of its Y2K Information Coordination Center will begin on December 28th:

http://biz.yahoo.com/rf/991108/68.html

They aren't convinced yet that your line of reasoning is the way it will be. It's a good idea for governments, businesses and individuals to be prepared for the unexpected.

But, I'll make an attempt here to answer the technical points in your post.

The problems and failures have been dealt with, for better or worse, in the replacement systems.

These are problems that have been occurring for some time; some stretching back to the beginning of the year. Problems that have not been fixed in "oh, 2 or 3 hours". Some ongoing for months.

All of which provides daily validation of what I've been trying to say: that the massive number of system replacements and modifications has generated enormous numbers of errors and failures. The vast majority are caught and fixed before any external effect is witnessed; some are not.

Have these system replacements generated "enormous" numbers of errors and failures so far? That may or may not be true. The only thing we can say for sure is that there have been errors and failures related to system replacements that have lasted long enough to be reported by the media.

Although I have little direct experience with "embedded systems", my opinion is the above holds true, if not more so. Problems in interfacing systems are more acute when individual systems are "replaced". My impression is little actual "remediation" is available for most embedded systems, and that most that do fail require replacement.

Hoffmeister, you know perfectly well that the vast majority of whatever embedded systems failures that occur will happen near January 1st. Your comment that most embedded systems that do fail require replacement is not a good sign for January.

But where are the "Falling Dominoes"? Where are the JIT problems with supplies? Maybe I'm not looking hard enough, but it sure seems to me that I can walk into virtually any store and find the same products available as a year ago.

Why would there be enough just-in-time delivery problems already that it would be noticeable to the average consumer? The number of companies experiencing problems with system replacements seems to be growing, but again, it's not a given that these replacements so far "have generated enormous numbers of errors and failures." And, most embedded systems problems are still ahead of us, too.

Are you trying to say that system replacements are a potential source of "corrupted data" that could be passed on to healthy organizations, and that is why "falling domino" problems would be possible now? I can't think of any other likely reasons why, at this point, noticeable JIT problems would be happening.

Now, I'm sure the usual suspects will jump in with the "it's not 2000 yet" chant. But that's not the point, folks. If anything, system errors and failures due to replacements and modifications are much more severe and difficult to deal with than date processing errors. According at least to GartnerGroup, we are well up the slope of date-related error rates. Yet virtually every post of "Y2k-related" errors is due to replacements, and not due to failures in date-processing.

The likely reason why system errors and failures so far have been more difficult to deal with than date processing errors is that many date processing errors so far have been in accounting software that deals with fiscal years, and in financial forecasting software. I would suspect that accounting software would have been the first part of an organization's Y2K project, and if an organization did not address accounting software at the beginning, it's possible in some cases to put a "bandage" on lookaheads and temporarily avoid dates in 2000.

Date processing errors that affect distribution or manufacturing have so far, in my opinion, been small relative to the ones that are still to come. A non-compliant general ledger or problems with financial forecasting software isn't as damaging to an organization as problems that involve distribution or manufacturing processes would be. There was another thread recently on which you and Sysman discussed lookaheads, and what I got from it was that most billing systems are 30-day lookaheads, so problems of that type wouldn't be happening all that much until December.

Manufacturing that's controlled by either PCs or embedded systems has little chance of being affected until January. Scattered problems with utilities here in the U.S. and greater utility problems overseas that could affect manufacturing are still ahead of us. Gasoline supplies are still ample at this time.

Yes, there have been problems. Yes, there will be problems. But every indication is that we, globally, have more than enough capacity to deal with the problems that do occur. That we have not even approached the "error threshold" of our ability to deal with system problems. Nor will we.

Yes, we will deal with these problems, and we will overcome them sooner or later, but at what financial cost and at what personal cost to individuals? Hopefully, major infrastructure problems will only happen in scattered areas, and hopefully any nationwide problems that happen will only be minor. I doubt, though, that we will have been able to get back to 1999's level of productivity and 1999's standard of living by March of 2000.

A quote from this article is notable:

http://www.capitolalert.com/news/old/capalert01_19990927.html

[snip]

A report last week by the Senate's special Year 2000 committee warned that unpreparedness in such countries as China, Russia and Italy, along with a handful of oil-supplying nations, could plunge the United States, and even the world, into recession.

"Severe long- and short-term disruptions to supply chains are likely to occur," the report said.

[snip]

So, are the Dominoes just Stuck?

There's a reason for Y2K also being called the Year 2000 computer problem or the Century Date Change.



-- Linkmeister (link@librarian.edu), November 11, 1999.


hoff you're right,

us usual 'suspects' will correctly point out that we have NOT hit the big date yet.

i guess u don't have (young) family to care for. because if you did, you'd certainly want to take the *absolutely* PRUDENT precautions to prepare for something that:

1. has never occurred before, and 2. the biggest experts in the world will admit they don't know what's going to happen.

i've been in the information industry for 12 yrs now, and i see LOTS of failures not being dealt with as they happen. just read about the recent events at hershey and whirlpool. these are just a smidgeon of what's going on.

good luck to you sir.

-- lou (lanny1@ix.netcom.com), November 11, 1999.


Brad

What point is there in "procrastinating"? I can only speak from my experience, which is that SAP implementations peaked sometime this summer. The market for new contracts began drying up in January, and was virtually non-existent by July.

Combine that with the overall "freeze" that is in effect, and I think you have some pretty strong evidence that companies are NOT "waiting till the last minute". Besides the common-sense argument that it makes no sense whatsoever to wait.

Duke

Have been working with "smaller" companies lately. They tend to be somewhat easier and faster implementations. My opinion is they are much more used to using "canned" software, and don't expect/require as many modifications to generic SAP as larger companies, that are used to having home-grown, specialized systems.

Kev

Yes, most companies are putting contingency plans in place. This is SOP; when we rollout SAP, we have people on call 24/7 for the first few weeks.

As for your points:

Have these system replacements generated "enormous" numbers of errors and failures so far? That may or may not be true. The only thing we can say for sure is that there have been errors and failures related to system replacements that have lasted long enough to be reported by the media.

Have been through enough implementations to answer "YES". Haven't been through one that DIDN'T generate errors and failures. The qualitative evidence is overwhelming; why do you think Homer is so busy? I addressed the quantitative evidence earlier in discussing function points and error rates.

Hoffmeister, you know perfectly well that the vast majority of whatever embedded systems failures that occur will happen near January 1st. Your comment that most embedded systems that do fail require replacement is not a good sign for January.

You miss the point, Kev. Like IT replacements, my guess is when you replace components in Embedded Systems, the overall error rate skyrockets. Again, these aren't errors due to "date-processing"; they are errors introduced when a component is replaced, that may not be an exact 1-to-1.

Why would there be enough just-in-time delivery problems already that it would be noticeable to the average consumer? The number of companies experiencing problems with system replacements seems to be growing, but again, it's not a given that these replacements so far "have generated enormous numbers of errors and failures." And, most embedded systems problems are still ahead of us, too.

The problems have grown. You hear reports 2, 3, 6 months down the road; but virtually all of these reports are from implementations done earlier this year. If these "domino" theories are accurate, why wouldn't it be noticeable to the average consumer?

Are you trying to say that system replacements are a potential source of "corrupted data" that could be passed on to healthy organizations, and that is why "falling domino" problems would be possible now? I can't think of any other likely reasons why, at this point, noticeable JIT problems would be happening.

System replacements are not just a "potential" source. Read Dale Way's essay, considering what he is referring to are errors due to modifications and replacements.

The likely reason why system errors and failures so far have been more difficult to deal with than date processing errors is that many date processing errors so far have been in accounting software that deals with fiscal years, and in financial forecasting software. I would suspect that accounting software would have been the first part of an organization's Y2K project, and if an organization did not address accounting software at the beginning, it's possible in some cases to put a "bandage" on lookaheads and temporarily avoid dates in 2000.

Date processing errors that affect distribution or manufacturing have so far, in my opinion, been small relative to the ones that are still to come. A non-compliant general ledger or problems with financial forecasting software isn't as damaging to an organization as problems that involve distribution or manufacturing processes would be. There was another thread recently on which you and Sysman discussed lookaheads, and what I got from it was that most billing systems are 30-day lookaheads, so problems of that type wouldn't be happening all that much until December.

Manufacturing that's controlled by either PCs or embedded systems has little chance of being affected until January. Scattered problems with utilities here in the U.S. and greater utility problems overseas that could affect manufacturing are still ahead of us. Gasoline supplies are still ample at this time.

Billing and Payment processing deal with terms. While my experience is most are NET 30 or less, many are not. The point is, the software has to deal with those that extend to 2000.
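As a small illustration of why payment terms pull 2000 dates into the software before the rollover, here is a sketch with hypothetical helper names. The two-digit comparison is a deliberately naive stand-in for what a non-remediated system might do; it is an assumption, not a description of any particular billing package.

```python
from datetime import date, timedelta

def due_date(invoice_date: date, net_days: int) -> date:
    # Due date under simple "NET n" terms.
    return invoice_date + timedelta(days=net_days)

def first_invoice_date_reaching_2000(net_days: int) -> date:
    # Earliest invoice date whose due date falls on or after 2000-01-01.
    return date(2000, 1, 1) - timedelta(days=net_days)

def is_overdue_two_digit(due: date, today: date) -> bool:
    # Naive comparison on (YY, MM, DD), so year "00" sorts before year "99".
    def key(d: date):
        return (d.year % 100, d.month, d.day)
    return key(due) < key(today)

if __name__ == "__main__":
    for terms in (30, 60, 90):
        print(f"NET {terms}: 2000 due dates start with invoices dated",
              first_invoice_date_reaching_2000(terms))
    inv = date(1999, 12, 2)
    due = due_date(inv, 30)
    print("correct comparison says overdue:", due < inv)
    print("two-digit comparison says overdue:", is_overdue_two_digit(due, inv))
```

With these assumptions, NET 30 invoices first carry a 2000 due date on December 2, 1999, NET 60 on November 2, and NET 90 on October 3, and the naive comparison flags a brand-new invoice as overdue on the day it is issued.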

As for manufacturing and embeddeds, again, my general impression is their "failure span" is quite small.

Yes, we will deal with these problems, and we will overcome them sooner or later, but at what financial cost and at what personal cost to individuals? Hopefully, major infrastructure problems will only happen in scattered areas, and hopefully any nationwide problems that happen will only be minor. I doubt, though, that we will have been able to get back to 1999's level of productivity and 1999's standard of living by March of 2000.

As for financial costs, I seriously doubt that the impact of Y2k problems themselves will come anywhere close to the impact already felt by the spending for replacements and remediation.

Personal costs? Again, you deal with what's dealt. I know a lot of SAP consultants that had been riding high on the hog, only to get shot down in the past few months due to the overall freeze. It's called Life.



-- Hoffmeister (hoff_meister@my-deja.com), November 11, 1999.


Hoff:

I'm going to break rank here and say that your points are well taken. I thought that the DWay Q&A was very interesting and had the effect of moving my bell curve of probabilities TOWARD the lower end of the scale. Do I feel bad about preparing for the worst? Not a bit. When I go to the doc and he says I'm healthy and, in turn, shifts my bell curve toward the optimistic side, it in no way changes my health and life insurance plans.

As an aside, I also felt that DWay strengthened my conviction that the big brains on Wall St. bidding the market up are actually pinheads in disguise. If you're not on the Fed payroll, I think you are wasting some serious talents fighting with the doomers.

With respect...

-- Dave (aaa@aaa.com), November 11, 1999.


As for financial costs, I seriously doubt that the impact of Y2k problems themselves will come anywhere close to the impact already felt by the spending for replacements and remediation.

Hoffmeister,

That depends. It depends on whether or not an organization's vendors are compliant in January, whether needed parts from overseas are still arriving, and how available gasoline is early next year and at what price.

http://www.capitolalert.com/news/old/capalert01_19990927.html

[snip]

A report last week by the Senate's special Year 2000 committee warned that unpreparedness in such countries as China, Russia and Italy, along with a handful of oil-supplying nations, could plunge the United States, and even the world, into recession.

"Severe long- and short-term disruptions to supply chains are likely to occur," the report said.

[snip]

-- Linkmeister (link@librarian.edu), November 11, 1999.


If these "domino" theories are accurate, why wouldn't it be noticeable to the average consumer?

Hoffmeister,

I think failures as a result of system replacements would only be noticeable to the average consumer if failures have affected distribution and especially manufacturing and production.

Have any system replacements caused manufacturing and production problems yet?

-- Linkmeister (link@librarian.edu), November 11, 1999.


Hoff,

Thanks for your input. If we are talking about remediation in the Big Iron mainframe shops, then it might be logical for a company (once it has done all the work that it will do this year) to hold off on implementation until the very last minute. Once the decision has been made to go with the code, why implement any sooner than necessary? After all, things are moving along just fine with the old code, so why throw a potential monkey wrench into the works prematurely and risk losses? Obviously Hershey's competitors are not shedding any tears about their predicament. Perhaps it would have been to Hershey's advantage to wait until late December when other companies were also forced to play their hand. In a competitive environment, why would it be common-sense to be the first out of the gate?

-- Brad (Brad@Brad.com), November 11, 1999.


Brad, it makes no sense for the same reason that system freezes have gone into effect.

You don't go to the effort of fixing the system, only to hold off to the last minute to implement. Companies are very aware of the potential for Y2k problems. The freezes exist to avoid the potential for introducing new errors, Y2k or not, into the systems. The freezes exist because the systems have been re-implemented.

Hershey, Whirlpool, etc., probably made the best decision, given the circumstances at the time. That is, implement as soon as possible, and deal with the errors, so most bugs are worked through by the rollover.

-- Hoffmeister (hoff_meister@my-deja.com), November 11, 1999.


Hoff:

Apparently you overlooked my question. I'll put it to you again.

Do you have any guesstimate/hunch as to how much of the world economy might be effectively shut down by the "spike" that you speak of?

-- eve (123@4567.com), November 11, 1999.


eve:

Any answer to your question would be pure speculation, which is mostly what we've been doing for the last couple of years anyway. My speculation is: Shut down -- very few, and very temporarily. Reduced efficiency -- pretty common, very temporary. Economic impact -- 50-50 chance of it being noticeable at all against the normal background noise.

Try to bear in mind quite a bit of code has already encountered dates in 2000, yet all of the computer problems we read about that are even indirectly y2k related have to do with implementation problems going to new systems. The argument is that implementation problems are MUCH harder to find and debug (sometimes requiring a change in the whole way you do business) than actual date bugs, where you know what you're looking for and don't need to redesign anything.

For big problems to happen at (and just after) rollover, we need to posit that the remaining, unremediated date bugs will be *at least* an order of magnitude worse than all the implementation issues put together, *combined* with all the problems resulting from 2000 dates encountered so far. This requires a great leap of faith.

-- Flint (flintc@mindspring.com), November 11, 1999.


I'm just saying that it is plausible that companies may go through the effort of fixing the system, only to hold off until the last minute to implement, if it makes purely economic sense to do so. Remember that we are in this mess because companies procrastinated until the last minute due to the economic ramifications of starting early (or earlier than their competitors). Why would the end game be any different? The logic of freezing and then implementing directly is not clear in the economic arena. Since profitability and the bottom line rule most corporate decision-making, this may be true right up to the very precipice! Having the bugs worked through by the rollover may not be a big priority if your competitor is in the same boat as you at the rollover. It's still an even playing field in economic terms! Just not such a pretty picture for the little people. Hershey and Whirlpool may be the forerunners in their market group, and losers because of it! This could be a case of "Y2K Mexican standoff." At least until the clock strikes 12:00.

-- Brad (Brad@Brad.com), November 11, 1999.

Brad:

The logic of your argument runs aground against the observation that nobody is in fact holding off (or more accurately, if anyone is, nobody has ever mentioned this on any forum or newsgroup, or in any known article in news or trade journals).

My suspicion (based on limited reading, I admit) is that some companies are effectively "testing" their remediation by returning it to production and seeing what happens. While this means they know the code doesn't (any longer) simply break, and that interfaces are being tested, the code coverage can be lousy.

-- Flint (flintc@mindspring.com), November 11, 1999.


Flint:

Thanks for your response, but I'm really referring to the embedded systems as opposed to the IT issue. So we don't need to talk about changing code because we're essentially dealing with firmware. The reason I pose the question is that we then don't have to bog ourselves down with minutiae addressing how many IT errors we've already encountered and how many are being and have been absorbed, etc. etc. ad nauseam.

In other words, in this context, the IT really becomes irrelevant. That is why Hoff probably doesn't know what to do with questions like this. They take him out of his comfort zone.

So, apparently Hoff is having a hard time with this one. I'm still waiting for a response after posting the question twice. Of course, maybe he's busy...

-- eve (123@4567.com), November 11, 1999.


eve:

Hoff has already said he's not knowledgeable about embeddeds, so doesn't feel qualified to address them. I think our current information indicates that those with embedded exposure were neither more nor less conscientious about dealing with such problems than the IT people have been with theirs. Which is to say, spotty. But if any general trend has emerged from the widely disparate and scattered embedded area, it's that knowledge and use of dates by embeddeds has been substantially lower than feared, both in incidence and in impact. Even when dates are used by embeddeds for some reason, "wall clock" time has been largely irrelevant. As a general (but not universal) rule, if you can't set the date in a device, then the date is not relevant to that device. After all, our clock crystals aren't very accurate, gaining or losing 10 seconds to 2 minutes per day. So if real time is important, they'd need to be reset pretty often.

Another advantage embeddeds have enjoyed is that they are much more seldom "home-grown" than IT systems. As a result, manufacturers and distributors have been in much better positions as "clearing houses" for notifying customers of both problems and solutions. I can assure you from personal experience that a lot of this has been going on.

Most remaining, serious embedded problems are likely to show up the first day after rollover. After that, most such problems will be logging issues, like "gee, the maintenance record is scrambled. When *did* we check those bearings, can you remember?"

In general, it doesn't appear embeddeds are likely to be a huge problem either. But there will surely be some problems. Not with economic impact perhaps, but locally catastrophic (is my guess). A few explosions, some escaped poisonous gases, things like that. It won't curtail the economy, but it won't be pretty either.

-- Flint (flintc@mindspring.com), November 11, 1999.


Flint,

You seem to be saying that since no one has ever mentioned the possibility of end-game corporate procrastination on this forum (or any forum, trade journal, or newsgroup), we must then assume that said procrastination is not the fact. Is this a serious argument? I've stated here that this procrastination certainly makes sense in economic terms for any company (corporation) that is worried about losing immediate market share to its competitors. Isn't that the core or root cause of this Y2K mess to begin with? Hasn't it been corporate profitability concerns that have got us here? Hasn't it been the need to keep the Board of Directors and shareholders happy that has driven remediation up until now? Why would that be any different in the last 50 days?

-- Brad (Brad@Brad.com), November 11, 1999.


Hoff,

From a Y2K remediator's perspective, systems should be fixed and implemented as soon as they are able. But what if it's your CEO who deduces that your competitor is in the same predicament that you are? What if implementing your Y2K remediation immediately would be like shooting yourself in the foot? Sure you might get an early start in a volatile situation next year, but you'd expose yourself to potential losses right now! "Whaddya mean their team's car just lapped us?! We just pulled into the pit for a tire change!" In a competitive world, no racing company will change their tires before they know their competitor must pull into the pit stop also. And even then, they'll wait 'til the last minute.

-- Brad (Brad@Brad.com), November 12, 1999.


eve

Geez, sorry. Yes, I have been busy.

Again, my opinions on embedded systems are just that, and are based only on research and applying some logic. But again, I expect no economic "shutdowns" due to embedded system failures.

The same argument I originally made seems to hold for embedded systems, if not more so. That is, the replacement of these systems has probably caused more errors than the rollover will.

And again, it appears that the vast majority of embedded systems that do fail, will fail for only a very short period of time, after which they will again function. The workarounds for embedded systems, such as resetting to an earlier date, appear far more numerous than for IT systems. Witness the number of electric plants already running in the year 2000; obviously, they don't really care much that their date actually matches the actual date.

Brad

You can postulate many things. The realities are that the large corporations have been implementing replacements for some time. The sequence is not "freeze, then implement"; it is "implement, then freeze".

And smaller companies have been under enormous pressure to comply. My current client had planned on implementing SAP Dec 1, and was told by their largest customer to also purchase the upgrade to their existing system and have it installed by Nov 1, or their largest customer would no longer be their largest customer.

Guess what happened?

-- Hoffmeister (hoff_meister@my-deja.com), November 12, 1999.


Flint:

There were so many weaknesses in your post I hardly know where to begin. I guess I'll just pick a few for now.

First, you state, "...knowledge and use of dates by embeddeds has been substantially lower than feared..."

My response to this is that if in theory just one bad chip can bring down an important piece of equipment, in turn possibly bringing down a plant, then your "substantially lower" comment is really meaningless.

You state "wall clock time has been largely irrelevant." See my response above. Same principle. And I read that chips with dates can be pulled off shelves even if dates aren't needed, because it's a quick fix. I can't vouch for the truth of this, but from a human expediency perspective it makes sense.

Re your second paragraph, what about the manufacturers who no longer exist? And you say "a lot" of notifying has been going on. What does this mean? A hundred thousand notifications when ten million notifications should have been made? So what?

Third paragraph: What makes you so sure the maintenance guy will even be there to make such a statement if he can't get gas for his car or feels his family might be at risk? Watch your assumptions and premises.

Fourth paragraph: When you look at interrelationships, how can you possibly come to conclusions that effects will be "local" and "few"; even as a guess? Do you really not understand the extent of the interrelationships? Find the essay, "I, Pencil," by Leonard Read. After you're done with it, just sit and think about it for a while; let it really sink in. Then read the IEEE website. This should help for starters.

My observation re your third paragraph tells me that you like to talk about things in isolation, i.e., out of context. But you thereby artificially reduce the seriousness of the issue and turn it into a casual academic exercise that has little reference to reality. With friends it's ok to do this, but in the forum you end up fooling some people who don't understand the possible ramifications and are trying to learn something. I'm in no way saying you intend to fool them, but this is the result.

Please don't take this post as condescending in any way. I really don't know you, and I'm sure I've read only a tiny fraction of what you've written. All I'm trying to do is help you to see things in a way that you really don't seem to be aware of, based on what I've seen of your writings so far.

-- eve (123@4567.com), November 12, 1999.


Hoff:

First, please read my response to Flint, above. Since I think you and Flint are pretty close on these positions (of course I could be wrong on this), I think most of that post would apply to you as well. If you disagree with any of my responses to Flint in the above post, please let me know. I'd like your angle.

Re your post: As with Flint's post, I see many problems, but just to focus on a few:

Your second paragraph: How can you possibly so conclude when, for example, the situation of the oil and gas industry is all over the map, depending on what you read?

Third paragraph: How do you know that they will all function again after they fail?

Third paragraph: How does one "reset" firmware? I thought most bad embedded chips have to be replaced, unless that's what you meant by "reset."

Third paragraph: Re your "numerous" comment: So what? What about the number of workarounds that are not done? What about the number of workarounds that are not possible?

And I'll try to be more understanding about you not getting back to me right away. I'm really busy too; especially these days, if you know what I mean.

-- eve (123@4567.com), November 12, 1999.


Hoff,

Your first-hand experience that companies are practicing due diligence is very encouraging. Aside from SAP, I sincerely hope you're right about completed remediation and the timely implementation in the mainframe arena. I am skeptical only because corporations have demonstrated that when the immediate bottom line is adversely affected, they will delay until the last possible minute, to eke out every last drop of profit. Is it feasible for many of them to wait until their backs are right up against the wall of 2000 before the code meets the road? Let's hope this leopard has indeed changed its spots. I've enjoyed reading your posts. Best of luck to you in the New Year.

-- Brad (Brad@Brad.com), November 12, 1999.


eve:

I'll respond as best I can...

[There were so many weaknesses in your post I hardly know where to begin. I guess I'll just pick a few for now.]

Reading your comments, I come to the conclusion that the major "weakness" of what I wrote is that I have violated assumptions you may not even realize you're making. But we'll get there.

[First, you state, "...knowledge and use of dates by embeddeds has been substantially lower than feared..."

My response to this is that if in theory just one bad chip can bring down an important piece of equipment, in turn possibly bringing down a plant, then your "substantially lower" comment is really meaningless.]

You would be correct if your theory were correct. In practice, it's not. Plants are not brought down by theoretical bad chips, they are brought down by actual, functional failure. The point of my observation (which could have been clearer, I admit) was that actual, physical examination and testing revealed that actual incidence of embedded systems having the potential to cause such failures was surprisingly low. As a result (and *very* fortunately), embedded remediation turned out to be easily feasible.

Three years ago, your concern was valid. Indeed, it got me started preparing in a big way. There were speculations about "billions" of bad chips, and *nobody knew* if this were true or not. I found that scary. Subsequently, the process of remediation has shown such speculations to have been wildly inaccurate *in practice*. The situation could have been hopeless, but real information from real investigation showed that it was not hopeless at all.

In order for the plant to be brought down, the embedded system must contain a date somehow (the vast majority do not); it must *use* that date (it turns out that only a minority of embedded systems that contain date functionality actually use it); it must use the date incorrectly (only a minority of embeddeds that use the date screw it up); the incorrect usage must cause a *functional* (rather than cosmetic) problem (and only a tiny minority of embeddeds that screw up the date would experience functional problems); the functional problem must be severe enough to disable the whole operation (extremely rare even for functional problems); such a problem must escape the notice of all the engineers assigned to find and fix exactly such problems, and finally, the problem cannot be addressed expeditiously when encountered.
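The chain of conditions above compounds multiplicatively, which a short sketch can make concrete. The fractions below are deliberately generous placeholders chosen only for illustration (one half at each of five steps); they are assumptions, not figures from this thread or from any survey, and `fraction_surviving` is a hypothetical helper name.

```python
import math

def fraction_surviving(step_fractions):
    """Fraction of systems that pass every filter in a serial chain.

    Each entry is the share of systems passing one condition, given that
    they passed all the earlier ones.
    """
    return math.prod(step_fractions)

# Placeholder values only: assume half of the systems pass each of five
# conditions (has a date, uses it, uses it wrongly, functional fault,
# fault severe enough to stop the operation).
illustrative_steps = [0.5, 0.5, 0.5, 0.5, 0.5]
share = fraction_surviving(illustrative_steps)
print(f"{share:.5f}")  # 0.03125
```

Even with every step set to one half, only about 3% of systems get through all five filters; the post argues that the real fractions are far smaller than that at most steps.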

As I wrote, I expect that a very few will meet ALL these requirements. But I expect this to be extremely abnormal, and very far from pervasive or common. Without question, there are some damfool outfits that didn't bother making the effort. If they encounter problems both severe and intractable, they deserve to die.

[You state "wall clock time has been largely irrelevant." See my response above. Same principle. And I read that chips with dates can be pulled off shelves even if dates aren't needed, because it's a quick fix. I can't vouch for the truth of this, but from a human expediency perspective it makes sense.]

Review my list of requirements. The wall clock issue falls in the category of systems that use the date incorrectly. Indeed, most systems *do* use it 'incorrectly', with respect to real, outside world time and date. That's because they use the date as part of a big "current time" value, to be compared with a "prior time" to get an elapsed time. The actual wall clock time is not relevant, only the difference between two times.
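A minimal sketch of the elapsed-time point, assuming a device that keeps a simple running count of seconds and only ever subtracts one reading from another. The function names and numbers here are hypothetical, not taken from any real controller.

```python
def elapsed(prior_reading: int, current_reading: int) -> int:
    """Seconds between two readings of the same device clock."""
    return current_reading - prior_reading

def readings(start: int, run_seconds: int, clock_offset: int) -> tuple[int, int]:
    """Two readings from a clock that is wrong by clock_offset seconds."""
    return start + clock_offset, start + run_seconds + clock_offset

# The same 90-second interval, seen by a correct clock and by one that is
# a full day (86,400 seconds) off: the error cancels in the subtraction.
correct = elapsed(*readings(start=1_000_000, run_seconds=90, clock_offset=0))
day_off = elapsed(*readings(start=1_000_000, run_seconds=90, clock_offset=86_400))
print(correct, day_off)  # 90 90
```

Because the offset appears in both readings, it drops out of the difference, which is why such a device can run with a wrong calendar date and still time its intervals correctly.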

I take some exception to your notion that an engineer grabs something inappropriate off the shelf and slaps it in, untested, to a critical process that could shut down a plant. This may make sense to you, but in real life any engineer who did that would have to find himself a new profession Real Damn Quick!

[Re your second paragraph, what about the manufacturers who no longer exist? And you say "a lot" of notifying has been going on. What does this mean? A hundred thousand notifications when ten million notifications should have been made? So what?]

Yes, quantification is important, and I've been trying to address it. First, while the situation where a substitute device is unavailable isn't a simple problem, this doesn't mean the problem is not addressed. The engineers don't simply shrug their shoulders and say "well, can't fix this one, I guess we'll let the plant blow up!" Such cases are more expensive, but NOT as expensive as blowing up or closing down. Believe me, something is done about the problem!

As for the notifications, they have been useful and fairly comprehensive. I'd estimate they have achieved 70-80% penetration to customers, NOT the 1% you fantasize about. Also, bear in mind that most facility remediators (and managers) are not just sitting around idle, passively waiting for notifications. They are (and have been) investigating, testing, fixing, and reporting problems BACK to the manufacturers as well. It's a 2-way street.

[Third paragraph: What makes you so sure the maintenance guy will even be there to make such a statement if he can't get gas for his car or feels his family might be at risk? Watch your assumptions and premises.]

Now your sequence of false assumptions has got you carried away. I have tried to point out that *in practice* the embedded problem has been both manageable and managed. You can claim everyone ignored it to your heart's content, but your claim remains false. I have pointed out that *in practice*, a very tiny minority of date issues presented functional problems. You can claim that *all* problems are pervasive and serious problems as well, but this claim also remains false. I have pointed out that *in practice*, the functional problems have been found and fixed, with some exceptions but not all that many. You can claim that's not true either, but it is. Finally, you can extrapolate from your false claims a world where problems are so profound that maintenance cannot be performed. I can't regulate your imagination. I can only point to the reality. You can deny it all you want.

[Fourth paragraph: When you look at interrelationships, how can you possibly come to conclusions that effects will be "local" and "few"; even as a guess? Do you really not understand the extent of the interrelationships? Find the essay, "I, Pencil," by Leonard Read. After you're done with it, just sit and think about it for a while; let it really sink in. Then read the IEEE website. This should help for starters.]

Yes, I read that essay. I understand it. And I do indeed recognize the extent of interrelationships, although it appears that you do not. Consider the thousands of companies that go bankrupt each week in the US. That's normal. Consider the number of poor decisions businessmen make (Robert Townsend estimated that 2/3 of a *good* executive's decisions are mistakes). Consider the large number of problems experienced all the time, of all kinds. Yet this is normal, and life wobbles along without undue incident.

Your position seems fairly common here -- that if it's remotely possible that the kingdom could be lost if just the right nail were lost at just the right time (a farfetched serial sequence), that *therefore* every lost nail must necessarily cost every kingdom. What I'm trying to get through to you is that we lose nails in large quantities on a regular basis. Your chain of interdependencies, while real, is nowhere even close to being as deterministic and nonredundant as you seem to think.

[My observation re your third paragraph tells me that you like to talk about things in isolation, i.e., out of context. But you thereby artificially reduce the seriousness of the issue and turn it into a casual academic exercise that has little reference to reality. With friends it's ok to do this, but in the forum you end up fooling some people who don't understand the possible ramifications and are trying to learn something. I'm in no way saying you intend to fool them, but this is the result.]

I'm not trying to fool anyone. I'm trying to explain what actual, empirical results have been. *You* are the one claiming that the empirical results aren't the reality! You are claiming that your dire, contrary-to-fact speculations are the reality. Who are *you* trying to fool? You seem to have succeeded in fooling yourself, and that's an unpromising start.



-- Flint (flintc@mindspring.com), November 12, 1999.


eve

You misunderstand. I'm not trying to make an argument regarding embedded systems. I don't have experience with them. At best, I can extrapolate IT experience, but that is probably a very dangerous method. It's dangerous for anyone to attempt to make technical arguments outside their area of expertise.

You asked my opinions. I gave them.

You seem to take a very linear approach to this interconnectedness argument. Do faults with single chips take out entire plants today? My guess is they don't.

Your second paragraph: How can you possibly so conclude when, for example, the situation of the oil and gas industry is all over the map, depending on what you read?

My point in the second paragraph is again an extrapolation. Replacing an IT system typically causes far more errors, especially in interfaces, than modifying an existing system. Since my impression is that most embedded system components with Y2k problems must be replaced, rather than remediated, my guess is these have caused relatively more problems than the IT system replacements.

Third paragraph: How do you know that they will all function again after they fail?

I don't know they will all function. But short of physical damage, why wouldn't they? They presumably fail when encountering dates spanning the rollover. Again, extrapolating from IT systems: most function perfectly well as long as all dates are on one side or the other of the rollover.
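As an illustration of why only intervals spanning the rollover misbehave, here is a sketch of a naive two-digit-year calculation. The routine `years_between` is a hypothetical stand-in for legacy date arithmetic, not code from any system discussed here.

```python
def years_between(start_yy: int, end_yy: int) -> int:
    """Naive interval in years, using only the last two digits of each year."""
    return end_yy - start_yy

before = years_between(97, 99)    # 1997 to 1999, both before the rollover
after = years_between(0, 2)       # 2000 to 2002, both after the rollover
spanning = years_between(99, 0)   # 1999 to 2000, across the rollover

print(before, after, spanning)  # 2 2 -99
```

Both same-side intervals come out right; only the interval that crosses from 99 to 00 is wrong. That is also why setting a clock back so that all dates fall on one side can serve as a workaround where the true calendar date does not matter.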

Third paragraph: How does one "reset" firmware? I thought most bad embedded chips have to be replaced, unless that's what you meant by "reset."

No, I meant resetting the date to the past. Or future, for that matter. This seems to be pretty prevalent. Again, my impression is that most of these systems don't really care whether their date matches the actual calendar date.

Third paragraph: Re your "numerous" comment: So what? What about the number of workarounds that are not done? What about the number of workarounds that are not possible?

Again, an extrapolation from IT systems. Y2k problems in IT are much more prevalent than in embedded systems. The embedded systems appear to have many more workarounds available than IT systems.

Again, if you feel the need to make technical arguments about embedded systems, then I'd suggest somebody else. You asked my opinion; I gave it to you.

-- Hoffmeister (hoff_meister@my-deja.com), November 12, 1999.


Flint:

Thanks for your detailed response. And quick! I still disagree on many points, but I'll let you have the last word for now. I do think I've come to understand you a bit better, though.

We'll be talking again soon.

-- eve (123@4567.com), November 12, 1999.


Hoff:

I'll try to remember not to ask you questions on embeddeds again. But hopefully your continued very strong emphasis on the IT side, while (comparatively speaking) ignoring embeddeds, yet then making conclusions on overall estimated impact, won't tend to lead new initiates astray into assuming that: the main y2k issue is IT; IT is ok, therefore y2k must be ok. That would be tragic, indeed.

Also, I find your repeated "you asked for my opinion; I gave it" pretty dismissive. It's obvious I'm asking for your opinion and you're giving it. What is the point in reminding me of this in practically every other paragraph?

Sometimes you may find me asking what seems to be the same question a second time. Rest assured, there's a purpose. Perhaps you had addressed a straw man, ignored the main point the first time, or what seems to be the second time is rephrased to change an important detail.

Anyway, I'm done for now. If you want to respond to this post fine, if not, that's fine, too. I still disagree on important points, but I'm getting tired. In any case, we'll be talking soon again.

Thanks for your time.

-- eve (123@4567.com), November 12, 1999.

