My FINAL reply to Beach and his clocks

This is my final reply to Mr. Beach about his SECONDARY clocks. I have put it as a new question as the old thread has now rolled off the end of the forum. My apologies for the technical language; it is there to make sure that my statements are PRECISE.

Mr. Beach

Well, congratulations on the great-grandchild. I am a grandfather myself, and got my first introduction to computers on a PDP/8 in 1972, so I suppose we could be regarded as contemporaries.

Since my use of caps seems to bother you, I will try to keep it to a minimum in this reply.

IMHO, you and several of the respondents have ignored or forgotten some of the basic principles of logic hardware.

FIRST: The state of a machine on power up must be known. When a device of any sort is first put under power, it must end in a predictable state. That known state will involve setting registers to a known state. The normal case will set them all to zeros. So, on initial power up, an RTC chip will be set internally to all zeros, which will translate to the first second of the first day of the first year and so forth. (It can be noted here that because binary counters do not divide evenly into the units that make up our time scheme, a complete RTC chip contains quite a number of logic gates that force the actual rollover from 59 seconds to 1 minute, etc. - but the layout of these gates is a bit beyond the scope of this elementary discussion.)
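As a rough illustration of that first principle, here is a minimal Python sketch of counter registers that clear to zero on power up and carry into the next register at 60 seconds, 60 minutes and 24 hours. The class name, the register names and the day counter are all made up for the example; it models the idea only and is not the layout of any real chip.

```python
class SketchRTC:
    """Toy model of RTC counter registers (hypothetical, not a real part)."""

    # (register name, count at which it wraps, value it wraps to)
    LIMITS = [("seconds", 60, 0), ("minutes", 60, 0), ("hours", 24, 0)]

    def __init__(self):
        # Power-up state: every counter register is cleared to zero.
        self.regs = {name: 0 for name, _, _ in self.LIMITS}
        self.days = 0  # assumed simple day counter, starts at zero too

    def tick(self):
        """Advance one second, carrying into the next register on rollover."""
        for name, limit, reset in self.LIMITS:
            self.regs[name] += 1
            if self.regs[name] < limit:
                return  # no carry needed, stop here
            self.regs[name] = reset  # stands in for the gates that force 59 -> 0
        self.days += 1  # hours wrapped, so a day has passed

    @staticmethod
    def to_bcd(value):
        """Pack a two-digit decimal value into one BCD byte."""
        return ((value // 10) << 4) | (value % 10)
```

Creating a `SketchRTC()` and calling `tick()` sixty times leaves seconds at 0 and minutes at 1, which is the forced rollover described above.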

SECOND: For another device to read the state of a register, the contents of that register must be transferred over a bus, selected by address lines. The address lines are set to the state required to access a certain register, and the contents of that register will be placed on the bus. Whatever device requested the contents of this register will now transfer the bus state into its own circuitry, and do whatever its internal logic dictates with the data.
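A small Python sketch of that second principle, for those who find code easier to follow than prose. The address map, the class names and the chip-select behavior are invented for illustration and do not describe any particular part.

```python
class ToyRTCBus:
    """Hypothetical chip with a few addressable 8-bit registers."""

    def __init__(self):
        # Address map is invented for illustration only.
        self.registers = {0x0: 0x00, 0x1: 0x00, 0x2: 0x00}

    def read(self, address, chip_select=True):
        """Return the byte the chip drives onto the data bus."""
        if not chip_select:
            return None  # chip not selected: it drives nothing onto the bus
        return self.registers[address] & 0xFF


class Reader:
    """The requesting device: latches the bus state into its own storage."""

    def __init__(self, chip):
        self.chip = chip
        self.latched = {}

    def fetch(self, address):
        # Set the address, then copy whatever appears on the bus.
        self.latched[address] = self.chip.read(address)
        return self.latched[address]
```

The point of the sketch is only that the reader never sees inside the chip; it sees whatever the addressed register puts on the bus.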

THIRD: All devices require power. I know of no logic device that will run without power.

FOURTH: All electronic circuits, and logic gates are no exception, have characteristic constants that may be measured. Every pin on a chip will have associated with it such things as resistance, capacitance, inductance, reactance and so forth that are unique to that pin.

NOW, making use of these principles, what can we deduce that will apply to constructing a basic RTC chip? An RTC chip, like any logic chip, will start in a known state. It will have both bus and address lines. It will have at least two power lines, one for power, another for ground. It will produce its output in registers that will be capable of being read by another device via the bus and address lines.

If you have any disagreement with the above principles, please provide me with a reference. I am always eager to increase my knowledge of basic chip technology, and would really like to find an exception to these rules that is in use somewhere outside of the more advanced labs that deal with such things as optical computing technologies - which can, of course, transfer information via light pulses and fiber rather than electronically.

As a check on my knowledge, let's find a real RTC chip to compare with, OK?

Picking one at random from National Semiconductor - MM58167B - we find address lines, power inputs, a bus, registers/counters and even some of the electronic specs (output impedance) from the above discussion.

http://www.national.com/ds/MM/MM58167B.pdf

Still don't think I know what I am talking about? There is a link to the PDF document from National - the 58167 is a fairly sophisticated RTC chip, and has a number of other functions such as two interrupt lines, a chip select line (of course), and even the ability to send alerts when unusual conditions affect the chip, such as total power loss. Careful reading of the document will also show that the clock does not require a GO command to be started, it will start at zero on first power up.

BUT, back to the elementary RTC chip we are discussing.

Your specs for the items under discussion were that the device could not be started with a date input. Thus the device must start at zero (or perhaps some factory default set internally). In such a case, I would like for you to tell me WHY the device should not be powered down totally and reset to zero (or factory default).

There is NO reason to want to know the date present inside the chip in such a case. If the device the chip is used in has run for 3 years using the factory default date, powering down the device totally (including pulling the clock battery) will reset the device to the default. You then KNOW that you have another 3 years of run time without any problem relating to the date on the RTC. I do not understand what problem you have with this means of resetting the clock to the factory default/zero date state.

If you feel for some reason that you MUST test the device for rollover problems, even though your engineers assure you that it is not needed (for the reasons above), you cannot perform a valid test by interfacing another RTC chip into the existing circuit board. RTC chips are soldered to the board. They are not meant to be removed except in cases of failure. I have never seen an RTC in a socket. Powering down the RTC chip totally (whoops, you just reset it) and connecting the pins from another chip over the pins on the existing chip is not a good idea. In the first place, to perform such a test correctly, the RTC with the advanced date should not have any other devices connected to it for the test to be valid. In other words, it must be hooked up in the IDENTICAL way the original chip was connected. This means you should not have other devices in the circuit. This would include any device meant to set the date on the chip. And that means you have to interface the advanced RTC WHILE THE POWER IS ON, OTHERWISE IT WILL BE RESET TO ZERO. That is not a valid test. Moreover, to test the device correctly, you would have to remove the old RTC, not just connect over the pins. Each pin has electrical characteristics that would be affected by the presence of a second chip. All the bus, address and power lines must be cut and reconnected to the new RTC. This is just begging for trouble. I don't understand why you have trouble understanding that this method of testing would require both soldering and chip removal - you are, after all, the guy who brought it up!

The proper method for testing such a device would be to hook the running RTC into a logic analyzer. (See, I left you an out. You did not take it.) Then read the date from the current state of the bus. All you have to do is find any Computer Engineer, hand him the equipment and the specs for that RTC chip, and tell him you want the BCD numbers off the bus - he will have them for you in a few minutes. Then you know the date on the chip. If it were over 50 years to rollover, I rather doubt even the most pessimistic would demand a test. After all, the machine will be dead and gone long before then. BUT, you insist on a test - a meaningless test if the rollover date will occur after the machine is scrap. Again, tell your CE that you want him to reset the date on the chip to 12/31/99:23:55:00:00:00. After he gives you the same explanation I have just run through, order him to do it over his bitter protests. He will hook a logic pulser to the unused pins on the chip (while the rest of the device is powered down), force the bus to the state corresponding to 12/31/99:23:55:00:00:00, and pulse the pin that sets the clock. That will reset the chip to the time you want. Then power up the device and wait 5 minutes for your useless test.
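For anyone wondering what "the BCD numbers off the bus" look like once captured, here is a minimal Python sketch of the decoding step. It assumes packed BCD (two decimal digits per byte); the register names and the sample capture are hypothetical, chosen to match the 12/31/99 23:55:00 test date above, and are not taken from any data sheet.

```python
def bcd_to_int(byte):
    """Decode one packed-BCD byte (two decimal digits) to an integer."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a valid BCD byte: {byte:#04x}")
    return high * 10 + low


def decode_capture(capture):
    """Map register names to decoded values from a dict of captured bytes."""
    return {name: bcd_to_int(byte) for name, byte in capture.items()}


# Hypothetical capture: one byte per register, as read off the bus.
capture = {"month": 0x12, "day": 0x31, "year": 0x99,
           "hours": 0x23, "minutes": 0x55, "seconds": 0x00}
decoded = decode_capture(capture)
```

The convenient thing about BCD is that the hex dump already reads like the date: 0x99 is year 99, 0x55 is minute 55.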

The only circumstance I can conceive of in which a test would be required is in the case of an RTC chip that CAN be set to match real time. In that case, testing is required if for some reason the device cannot be set to one of the earlier calendar dates that match the months and days of the calendar in 2000 and after. Then, you do need to test, but only then, because you have a device that can actually make the rollover and PERHAPS even have a problem dealing with it!
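The "earlier calendar dates that match" idea can be checked with a few lines of Python. This sketch uses only the standard library; the helper name `same_calendar` and the 1970 starting year are my own choices for the example. Two years have identical calendars when they start on the same weekday and agree on being leap years.

```python
import calendar
from datetime import date


def same_calendar(year_a, year_b):
    """True if both years start on the same weekday and agree on leap status."""
    return (date(year_a, 1, 1).weekday() == date(year_b, 1, 1).weekday()
            and calendar.isleap(year_a) == calendar.isleap(year_b))


# Years from 1970 through 1999 whose calendars line up with the year 2000.
matches = [y for y in range(1970, 2000) if same_calendar(y, 2000)]
```

In that range the only match is 1972, which is why setting a clock back 28 years keeps the days of the week and the leap day in step with 2000.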

Now remember principle one before you reply - the RTC will start in a known state. Resetting will put it back to that known state.

Also remember principles two and three - You can't have an RTC (or any other chip) without power, bus and address lines. You can't just interface circuits; you must connect them properly for valid testing or manufacturing of any type.

And finally, remember principle four - each chip and circuit board has various electronic characteristics that must remain the same during testing, otherwise you have introduced random variations in your test.

Paul Davis

A short PS here - firmware clocks get their startup data after a complete power down from the hardware clock, at least in the type of device under discussion. It will either go to some default number, throw an exception/alert, or ask to be set. Since he specifies devices with clocks that can't be set externally, the options are rather limited. If you have ever powered up a PC with a dead battery, you know what they do. And BTW - if anyone is offended by the use of "he" in referring to a CE, my apologies. I personally know several very good female CEs.

-- Paul Davis (davisp1953@yahoo.com), April 13, 1999

Answers

Paul,

Interesting post, will have to read it again carefully to digest all of it.

Was amused to see how you got started in the field. FWIW, I wrote the assembler for the PDP-5 and PDP-8 as a young whippersnapper at DEC in early 1965. Also wrote the FORTRAN math library -- sines, cosines, exp, log, etc. -- for those machines, as well as the PDP-6 and PDP-10. Glad to see you survived the experience and that it left you sufficiently vigorous to not only have children, but grandchildren as well.

Cheers, Ed

-- Ed Yourdon (ed@yourdon.com), April 13, 1999.


I'm glad someone understands all this stuff....

-- Apple (villarta@itsnet.com), April 13, 1999.

Paul and Ed,

I'm glad to see you both responding to this, especially because Gary North thought it to be of great import. If he did, so will others--rightly or wrongly.

I don't have the credentials to affirm/discredit this theory; I hope others will digest it and translate it for the rest of us.

The idea of "Black Holes" in these systems is frightening, and thus deserving of more attention on this forum.

Thanks!

-- FM (vidprof@aol.com), April 13, 1999.


If experts disagree, then we have a problem regardless.

-- Mike Lang (webflier@erols.com), April 13, 1999.

Be it known up front that I too am without credentials in this area. But one question occurs to me on reading:
"The only circumstance I can conceive of in which a test would be required is in the case of an RTC chip that CAN be set to match real time."

In an operating installation of normal size, is it possible to identify the chips in this category, so they can be tested?

-- Tom Carey (tomcarey@mindspring.com), April 13, 1999.



Paul

Trying to keep up! Thanks from a Non Tech person. Just wondering though, if someone didn't test or power down the system for a restart, what would be the effect on a chip that does roll over? It would seem that business entities are not always 'aware' of all the issues involved with such a situation as embedded systems. I am thinking of SMEs that aren't paying that much attention. And what about a system that is 7/24 and can't be powered down?

>>>>>>>>>>>>>>> There is NO reason to want to know the date present inside the chip in such a case. If the device the chip is used in has run for 3 years using the factory default date, powering down the device totally (including pulling the clock battery) will reset the device to the default. You then KNOW that you have another 3 years of run time without any problem relating to the date on the RTC. I do not understand what problem you have with this means of resetting the clock to the factory default/zero date state. <<<<<<<<<<<<<<<<<<<<<<<<<<<

-- Brian (imager@ampsc.com), April 13, 1999.


Mike,

You took the words right out of my mouth. I've printed the heart of both Beach's and Davis' comments and I come away scratching my head. I'm a software guy, not a hardware guy, and don't know beans about this stuff, but I do know that when you've got educated opinions flying about and no fixed and "true" declaration of facts, it leads to the result I've found true with hardware and software glitches:

There is a lot of standing around and finger-pointing and there isn't much anyone can do about it.

Yes, someone finally takes the bull by the horns and begins leading, rightly or wrongly. Yes, you get a direction you go toward remediating the problem, but how many of these can you endure within a tense and chaotic period of time before you encounter system collapse?

-- Brett (savvydad@aol.com), April 13, 1999.


Paul:

Excellent response!!!

Mike & Brett:

Before you assume that there are two "experts" arguing here see Mr. Beach's credentials below:

Most significant thing GN ever posted

RMS

-- RMS (rms_200@hotmail.com), April 13, 1999.


Tom - as a general rule it should not be that much of a problem. After all, the device has some facility for setting the date, even if it is just an LCD panel on the cover, and that will make it stand out. The only hard ones would be one that got its date from another device that was paired with it, but even that should be OK as if you set the first device's time, the second should follow. So again, setting the date back to 72 (or whatever) on one should set it back on the other, and that would show up in the reports you ran to test the system.

-- Paul Davis (davisp1953@yahoo.com), April 13, 1999.

Any comments on how this may affect my automobile? I've heard very little from the auto manufacturers. I keep wondering if it will operate after 1/1/2000 and, if not, will unplugging the battery reset the defaults?

-- L. Miotti (louie.miotti@sait.ab.ca), April 13, 1999.


L. Miotti,

Your car will run. There will be no reason to reset any chip.

If anybody wants to argue with that they're full of it.

-- Doomslayer (1@2.3), April 13, 1999.


The biggest problem I see with all of this from my industrial maintenance point of view is this. It will be easy for the most part to configure stand alone systems to run with whatever date you want. But if you have situations where you are required to download data to another system which has not or cannot be faked out you will suffer data loss or corrupted data. And I guess in a nutshell that will shoot all the information services people in the foot.

I've seen this happen with daylight savings time situations on several occasions where the host or remote node got changed and the other one didn't. The file shells on the next download were there but the files had no contents. This is ok if you have a full backup of whatever files were lost but in my experience that doesn't happen often.

-- nine (nine_fingers@hotmail.com), April 13, 1999.


Paul:

Since the secondary clock is either connected or not connected to the 'system' clock are the following two statements true?

Secondary clocks that are isolated from a 'system' clock will not pose a Y2k problem because they will "default to zero" each time they are cold started which occurs often enough so that rollover is not really a practical problem.

Secondary clocks that ARE CONNECTED to a 'system' clock can be adjusted to be 28 years from rollover, and, therefore, not a practical Y2k problem by setting the 'system' clock to 1972 (thus, days and leap year will match those of the year 2000).

If these two statements are true, is THAT all there is to fixing the embedded chip problem?

Roger Altman

-- Dr. Roger Altman (rogaltman@aol.com), April 13, 1999.


Paul,

Do you have any experience with complex embedded controllers? Design, build, applications???

How long will a 1 Farad capacitor power a CMOS timer circuit in a MCU in sleep mode? How long would it take to recharge that cap when power reappears?

What are the design criteria for designing a box in which a battery is not used to maintain system state and elapsed time (Calendar/clock)?

Do you have any experience with RTOS's?

How about designing with ASICs or custom Si fab?

What happens when a data variable is overflowed in a real time embedded system?

Are you a rocket scientist? Or at least do you know any rocket scientists?

Your arguments sound fully informed but are a little shy of a full quiver, at least IMHO. You have described only SOME types of extant systems in the world. Unless you have extensive experience in the field of embedded systems I would suggest that you may be missing the points in some of Beach's writing.

I have a background in electronics to include work with component level issues, CPU's, high speed bus and comm. Also an interest in higher level programming and embedded systems. Your assertion that a BSEE with a logic analyzer can, in a matter of minutes, interrogate an unknown black box system to determine date functions, etc., is patently ridiculous.

Unless you have a full knowledge set about the internals of the box you are flying blind (and why would you need to do it if you already knew what was inside and how it worked?). One does not spend a few minutes reverse engineering a black box. It is a tremendous effort requiring a lot of thinking, time and material resources.

I am not saying that everything that Beach expressed was technically correct. There are some things with which one could quibble. IMHO he did a pretty good job explaining, in layman's terms, what the issues are. It is a muddle out there .. a lot of room for 'creative' designing. Not a lot of rigor in terms of design and implementation. Much of it is hand crafted.

Beach's article has brought to light an important aspect which is hidden deep in some of these systems.

It's important that we look for every aspect instead of trying to shoot everyone who comes up with an aspect we didn't already think of for ourselves.

Happy hunting.

-- David (C.D@I.N), April 13, 1999.


Your turn, Paul. It's a draw so far.

-- BigDog (BigDog@duffer.com), April 13, 1999.


David,

Do you have the answers to your questions? Or are you just good at asking questions?

-- Doomslayer (1@2.3), April 13, 1999.


This sounds like the parable about the blind men and the elephant. Except some of them have spent more time with the elephant than the others...

-- Chatty Cathy Alumina (chaky@chickpea.beans.foo), April 13, 1999.

I have the answers. But I want to hear it from him. It's easy to spout off and project limited knowledge into all cases. It is also not valid to argue that way. He has some knowledge, but I doubt it is as comprehensive as it sounds.

-- David (C.D@I.N), April 13, 1999.

"He has some knowledge, but I doubt it is as comprehensive as it sounds"

Precisely.

-- a (a@a.a), April 13, 1999.


After reading this thread, I ALMOST sympathize with the media.

Does this mean planes MIGHT fall out of the sky?? :)

R.

-- Roland (nottelling@nowhere.com), April 13, 1999.


Hi Paul,

You are correct in your descriptions, but it doesn't negate what Bruce wrote -- which is intended for the general public.

I'm sure you're aware that most embedded systems, especially the smaller, harder-to-find embeddeds, can have on-chip date processing capabilities. I'm sure you're also aware that these systems can have a capacitor or tiny rechargeable battery to maintain time/date processing (trivial example: some VCR's will continue to keep their time/date clocks running even when unplugged for months).

The real problem, in terms of being able to find and assess them, is with embedded systems that outwardly don't use a date, but have ongoing date processing built in. Due to the layered design approach available to, and used by, most embedded systems developers (which was the thrust of Beach's article), these clocks represent a real time bomb when they're in systems where the 'secondary' clock isn't actually used. As a convenience, many of these chips did have the time/date clock set at the factory (those that have on-chip capacitors), and could malfunction some time around (or just after) 1/1/00.

I don't expect many of this type of system to malfunction, or if they do, they probably will have only a nuisance impact. But, a few could have a major impact.

BTW, I'm wearing a solar watch that uses a capacitor for energy storage. It has a quartz-controlled, mechanical movement (no LCD). The watch doesn't have a battery, and the capacitor is charged from 6 solar cells, each of which is 4mm x 2mm in size (I didn't open the case, but that's the size of the holes in the watch bezel).

As a test, I placed the watch in a dark drawer to see how long it would run without light -- and it was still running at 30 days. The circuit obviously needs incredibly small amounts of power.

An aside. It's also the most accurate watch I've ever owned. I last set it over 4 months ago, and every time I checked it against the atomic clock via the Internet, it's been within 3 seconds. The last check was about 10 seconds ago. :)

To Ed, You said:

"Was amused to see how you got started in the field. FWIW, I wrote the assembler for the PDP-5 and PDP-8 as a young whippersnapper at DEC in early 1965. Also wrote the FORTRAN math library -- sines, cosines, exp, log, etc. -- for those machines, as well as the PDP-6 and PDP-10. Cheers, Ed"

NOW I know who to blame!! On many an occasion I cursed those assemblers after I had to switch from the Computer Control computers (and assemblers) to DEC.

-- Dean -- from (almost) Duh Moines (dtmiller@nevia.net), April 13, 1999.


Can somebody link to the first part of this thread or tell me the thread category and title?

P.S. How many k is the Uncategorized up to anyway? Or is the proper question How many Meg is the Uncategorized up to?

-- Ken Seger (kenseger@earthlink.net), April 13, 1999.


This has been a very interesting exchange of opinions. I will note that another extensive look at embedded chips/systems was done a year ago by Dr. Mark Frautschi and is worth the time of anyone trying to learn a bit more about this matter.

www.tmn.com/~frautsch/y2k2.html

-- Gordon (gpconnolly@aol.com), April 13, 1999.


The Great Disconnect just won't go away. See http://www.greenspun.com/bboard/q-and-a-fetch-msg.tcl?msg_id=000iJd

-- Tom Carey (tomcarey@mindspring.com), April 13, 1999.

Might as well have some of the text here.

Embedded Systems and the Year 2000 Problem (The OTHER Year 2000 Problem) http://www.tmn.com/~frautsch/y2k2.html

Draft of 6 April 1999

Copyright 1998, 1999 Mark A. Frautschi, Ph.D. Shakespeare and Tao Consulting http://www.tmn.com/y2k/ (410) 453-9256 frautsch@tmn.com

Abstract:

There is another Year-2000 risk. It is distinct from the more widely reported risks concerning impending failures of computers and software that represent dates using two digits for the year. This risk involves Real Time Clocks and their interactions with associated embedded processors and logic arrays, dedicated electronic control and monitoring logic incorporated into larger systems. These are essential to the operation of a vast portfolio of infrastructures, from medical equipment, to buildings (phone, security, heating, plumbing and lighting), to transportation, to financial networks, to just-in-time delivery systems, and so on. According to a recent study, the firmware (permanently loaded instructions) that enables these systems to run is date sensitive and not Year-2000-compliant in less than 1 percent of the fifty billion microprocessors and microcontrollers used in embedded systems installed worldwide by the end of the twentieth century. This small fraction will fail, causing the systems they control to begin failing around 1 January 2000 and for the first few years of the next century. These failures are coupled with significant factors mitigating their diagnosis and repair. These include concerns over legal liability, the absence of standards and of reliable documentation of Year-2000-compliance of date sensitive systems produced over the past few decades. This poses formidable assessment issues.

*snip*

It is time to shift emphasis from repair to triage and contingency planning and to make appropriate preparations for risk management against massive loss of infrastructure.

*snip*

Introduction:

Embedded microprocessors and other time sensitive logic are silicon integrated circuits, generally with permanently coded instructions (firmware - where these serve as an operating system they may be called a microkernel) that are not designed to be easily changed. These monitor, regulate or control the operation of devices, systems, networks or plants. These are generally in the form of silicon microelectronic chips, such as microprocessors, microcontrollers, timers, sequencers and controllers built-in to machinery from small devices such as wrist watches and consumer electronics, to dedicated processors controlling large industrial plants. The term "embedded" refers to the instructions that are permanently loaded in one of the (ROM) chips comprising the system. The IEE give a broader definition that includes dedicated, code-driven systems (IEE, 1997). "Embedded" can also denote that the microprocessor and other hardware are installed within the device at hand at a depth that they may not be obvious to the user (and possibly experts) without disassembly.

Typically, an embedded system will be comprised of a microprocessor, Read Only Memory (ROM), input/output circuitry (for monitoring and control, e.g. Digital to Analog Converters), Random Access Memory (RAM), communications circuitry (e.g. a link with a central computer) a system clock and possibly a Real Time Clock (RTC). Several of these elements may be integrated onto a single chip (or multi-chip-module) which may be called a microcontroller. A typical embedded system contains approximately ten individual chips. This number varies greatly depending on the age of the design, the technologies used, the desired functionality and finally with cost. Generally, chip counts tend to decrease with design date for the same level of functionality. A treatment of the basic technical elements of digital electronics may be found in (Horowitz, 1989). See note [1].

A treatment of the distinguishing characteristics between Year-2000 failures [2] in Information Technology (IT - computers and software) and embedded systems (ES - dedicated processors, logic and firmware) may be found in (Smith, 1998). GartnerGroup [3] estimates that there will be fifty billion microprocessors and microcontrollers used in embedded systems worldwide and that under 1 percent of these devices will have Year-2000 (Y2K) related failures leading to shutdowns, erroneous results or chaotic behavior. Of this, a fraction are in mission critical systems, leaving on the order of 25 million microprocessors and microcontrollers (deployed in systems containing these and other chips) which must be repaired world wide. This, in turn, causes the devices in which they are incorporated to fail or behave unpredictably. The implications for society are widespread. A pessimistic scenario (Schwartz, 1996), (Williamson, 1997) and note [4] will be presented for risk management purposes; thus proactive and reactive responses will be described in a section on recommended actions rather than as part of the scenario itself. This scenario is not intended as a prediction.

------------------------------------------------------

Read the Rest.

It's another excellent paper.

I think it supports Beach's Report and is certainly credible enough to appease (sp?) any pollyanna.

IMO it's just plain scary.

Father

-- Thomas G. Hale (hale.t@att.net), April 13, 1999.


Ken,

We have several threads going on this. Start here and note the links posted by Kevin... <:)=

Gary North: most significant document...

-- Sysman (y2kboard@yahoo.com), April 13, 1999.


I think part of the problem is that what Beach and Frautschi describe is indeed possible in theory. But y2k is a practical matter, so you need to ask how common such an occurrence really is.

Beach's argument is like saying, if you walk into a door without opening it, you could get hurt. And there are many millions of doors out there, and billions of people going through them all the time. We must therefore logically project an epidemic of broken noses and black eyes, and that means the hospitals must be overwhelmed, and that means the breakdown of our medical system, etc. etc.

In practice, nothing like this happens. But the *potential* is there.

-- Flint (flintc@mindspring.com), April 13, 1999.


There has been so much about this on so many threads that I can scarcely remember who said what. I thought Beach claimed that his sources were experiencing this potential problem "actually" and with surprising frequency?

-- BigDog (BigDog@duffer.com), April 13, 1999.

Flint,

I think the real problem is that we know some of these embedded chips/systems will fail *unless* they are found and fixed or replaced. And we don't know if the level of expertise in correcting this is as high as it needs to be in the time remaining. Apparently Paul Davis knows exactly what should be done to minimize the problem, but how many are out there like him? And what about the places where they are not looking very deeply or have elected to fix on failure? There is no way that I can feel comfortable about oil refineries or any other complex business using 1,000s of these systems now that I do know more about their dangers and not enough about the repairs being done or intentionally being put off. So it's the human factor that is the larger part of this minor problem. We can debate the complexity of the chip/system and whether it can be readily fixed, or not, and whether it really matters, or not. But after all is said and done and we agree that these items need serious attention right now, the question remains: are they getting that attention? Here in the USA? What about the refineries in Venezuela and the Middle East? And if that interrupts the oil supply to us, how much will it matter? I have seen a report that stated that during the oil shortage problem of the early 70s the US was importing 35% of its oil (now 55%) and the shortfall due to the Middle East was only 5%. Will this embedded chip situation cause at least that much oil shortage for us?

-- Gordon (gpconnolly@aol.com), April 13, 1999.


DAVID:

Can YOU compare apples and apples? AT NO TIME was the CE asked to check a "black box system"!!! (S)He was asked to acquire the appropriate data from a specific chip, with supporting documentation. (And a small bow to SWE, several members of which I know VERY well ;-} )

Chuck

For the engineering challenged, SWE is the Society of Women Engineers, which had a surprisingly strong chapter at Clarkson as I left ('course my grad class only had 23+/- women)

-- chuck, a Night Driver (rienzoo@en.com), April 14, 1999.


Chuck - Let's not piss all over each other here. A one off situation cannot be extrapolated to the entire world. There aren't enough logic analyzers in the world, let alone talented engineers, to cover the BILLIONS of systems in place. There are plenty of systems which are completely 'black box' to the owners. They bought them and that's about all they know. Secondly, finding date related registers/circuitry on a chip is not the only problem. It is part of the problem. The embedded software itself is another morass.

I have not been abusive to people here. Paul came on with a scathing rebuttal to Beach which was not comprehensive. He came in with both six guns blazing. THAT does not help us here does it??? I was letting a little air out of his balloon, probing for some understanding as to where his experience was coming from. He hasn't responded to the questions and I don't know why. Maybe he has to do some research in a couple of libraries and the internet before getting back to me on this? Maybe.

-- David (C.D@I.N), April 14, 1999.


How many times does this have to be said before people will understand?

If the date matters to a device, then there will be a way to set the date. It's ludicrous to suggest that anyone would design a device to be dependent on dates without providing a way to change the date.

-- Doomslayer (1@2.3), April 14, 1999.


Interesting that this thread has started so much discussion - that's a good thing IMHO as what this subject needs is more discussion focussed on the real world technical issues and less on 'might', 'maybe' and 'possibly'.

Ed Yourdon - As a matter of fact I did know you had worked on the PDP series software. Back in the 70's it was possible to keep track of people in the computing community to a much greater extent than is possible today. Everyone knew what Grace Hopper, Gary Kildall and Bill Godbout were up to - we all knew what was happening at TI and Intel. Now we have much better hardware, and lots more freedom in the way we can connect and communicate, but the old small town feeling is a thing of the past.

Brian - the effect on a system that does rollover is not predictable without the code for the program the system is running. Avoiding rollover in low-level systems is much easier, and should be done whenever possible instead of trying to patch the program. I can't think of any system that would use a non-resettable RTC that would run 7/24 - this is the realm of factory line controllers and such.

Nine - yes, and that is exactly what I said. Blind RTC's that can't be set can't cause that sort of problem - else you will be seeing that problem right now in your monitoring software and reports.

Dr. Altman - except for the case in which the device is reporting a date to a monitor program or such, yes. If the date does not show up on a report or monitor in the system, generally you won't have a problem with rolling dates back. AND, in my hunt for PLC and related problems, monitor panels reporting incorrect dates and related problems were the only responses I got.

David - In order, Yes, with normal power draw, 6 weeks to 3 months - in such a case you would short the leads on the cap to discharge it, # 3 is unclear, no, not my job - I don't have to design the gates to know what they do, as I said in my original post that is easily avoided so is not an issue for the vast majority of real equipment, no I am not a rocket engineer though I do know a couple and what does that have to do with anything? And I am not talking about Black Boxes here, I am talking about a specific type of chip. Any competent CE should be able to pick out the bus and address lines on a chip and read their contents with a decent logic analyzer - I know hobbyists that do it for that matter. And conversion of Binary to BCD is not difficult - it is taught in freshman classes for Heaven's sake.
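To make the binary-to-BCD point concrete, here is a minimal Python sketch of the conversion for a single two-digit field, the way a seconds or minutes value is typically packed in an RTC register. The function names are made up for illustration, and no particular chip's register layout is implied.

```python
def to_bcd(value):
    """Pack a decimal value 0-99 into one BCD byte (tens digit in the high nibble)."""
    if not 0 <= value <= 99:
        raise ValueError("one BCD byte holds 0-99 only")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones


def from_bcd(byte):
    """Unpack one BCD byte back into an ordinary integer."""
    return (byte >> 4) * 10 + (byte & 0x0F)


# 59 seconds is stored as the bit pattern 0101 1001, i.e. hex 59
print(hex(to_bcd(59)))
print(from_bcd(0x59))
```

Read in hex, the packed byte shows the decimal digits directly, which is why a value lifted off the bus with a logic analyzer is easy to interpret once the format is known.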

Dean - as I said before - discharging the cap or pulling the battery will force the chip to a known state. If you open up your watch and short all the power leads for a few seconds you are going to discharge the cap and stop that watch. And a mechanical movement is analog, not digital, so holds the last setting - unlike dynamic ram or registers the motor does not roll back to zero just because I pulled the plug - it just stops.

My problem with Beach extends to Frautschi as well - I don't think they approach the problem correctly.

Generally speaking, I think a lot of the confusion around this issue involves the differences between software and hardware. Most of the folks who expect real trouble from hardware seem to be programmers. There are some very large differences between hardware and software failure modes, and it is not correct to extrapolate software failure modes into hardware. Just as an example of the differences - When programming in C the initial state of a declared variable is not known when the program starts. A program can fail if the variable is not initialized on startup. When dealing with logic gates, the state of the device on startup is always known. BIG difference, and enormously important.

-- Paul Davis (davisp1953@yahoo.com), April 14, 1999.


Sir Doom - You're forgetting that we are trying to find, fix, and prevent the results of previous "ludicrous" decisions that are hidden away in old processors and control systems that were not designed "right" in the first place.

If these d**n things were perfectly designed, perfectly installed, perfectly maintained, perfectly configured and correctly updated everywhere - there would be no problem.

It is foolish to try to use logic to "eliminate" the possibility of previous bad designs (or errors) or poor design decisions made earlier that have been used for many years without changes - the only way to find errors - which are inherently "illogical" anyway, right? - is to test for the results of known input conditions, and "hope" through thorough testing and rigorous test designs that you have induced all the failures that will actually happen.

You're trying to pretend that you can use good logic to find earlier assumptions, design errors and illogical planning - and that is ludicrous behavior.

-- Robert A Cook, PE (Kennesaw, GA) (Cook.R@csaatl.com), April 14, 1999.


So, Mr. Cook, let me see if I can put it in layman's terms so the non-techies here can understand. In your roundabout way you are saying that it is possible that there are devices out there that need to know the date but for which there is no way to set that date. And that we cannot explain away that possibility using logic. OK, maybe not, but the possibility of such a device existing must be extremely low. I maintain it is zero.

-- Doomslayer (1@2.3), April 14, 1999.

At the risk of being chewed apart, I'm still trying to sort wheat from chaff, fully admitting Paul's comment about the tendencies of software guys (me) to thoroughly misunderstand hardware.

My problem, tho, is Paul seems to be arguing fundamentally that the ENTIRE embedded systems thing is a crock, as he has many times before. Then again, he argues that Y2K in its entirety is a crock. Hmmmm.

Others (I'm talking about guys with hardware experience) are saying, "yes, Paul, you're right so far as you go, but you're extrapolating from known, well-designed, easily testable systems to a larger world of kludge that is a bewildering mess of embedded hardware/embedded software. The effects of that stuff remain unknown, at best, to negative." Right? Wrong? (Not me but my interpretation of what's being said)?

That's the subject generally. With respect to Beach:

Davis, Flint and some others think he's COMPLETELY wrong. David and a few others think he's basically correct but didn't articulate it as well as he might. Right? Wrong?

This is a kind of "time-out" set of questions. Otherwise (hint to the Pollys) instead of clarity, which we all seek, we're going to end up further alarming us doomers by the sheer craziness of the discussion.

Sheesh, and I thought software guys were bad ......

-- BigDog (BigDog@duffer.com), April 14, 1999.


I highly recommend this article on embedded chips...

http://www.jsonline.com/bym/tech/0214chips.asp

"Problems lurk in more than just computers"

-- Kevin (mixesmusic@worldnet.att.net), April 14, 1999.


BigDog - you are both right and wrong. You are correct that I don't think embedded problems exist - at the hardware level under discussion. There are several levels at which hardware and software interact to make up a complete system, and the hardware/logic level is the least likely to have any sort of dating problem. Hmm - perhaps I should post something describing the various levels of code/hardware that make up a complete system. Whoever thought I would take up teaching basic logic hardware at my age?

-- Paul Davis (davisp1953@yahoo.com), April 14, 1999.

Paul,

Do you have any experience with complex embedded controllers? Design, build, applications??? [Yes] OK, what?

How long will a 1 Farad capacitor power a CMOS timer circuit in a MCU in sleep mode? How long would it take to recharge that cap when power reappears? [6 to 12 weeks] Sounds good to me. It takes less than one second of power to recharge that puppy. Yes, you can discharge it, but why would you want to do that? You are trying to test this not reset it.

What are the design criteria for designing a box in which a battery is not used to maintain system state and elapsed time (Calendar/clock)? [unclear] I'm thinking remote, hard to access, environmental extremes here .. battery not happy, don't want to go there to replace it kind of places (ocean depths, hot environments, etc).

Do you have any experience with RTOS's? [No] Real Time Operating Systems. They are the typical OS for moderate to complex embedded systems. They run multiple processes simultaneously on one processor, do multiprocessor systems, run from code libraries, etc. Segue straight into object oriented operating systems. This is not straight line coding. It is building virtual machines which have nothing to do with the underlying hardware configurations. Real Twilight-Zone stuff.

How about designing with ASICs or custom Si fab? [No?] Me neither, but the gist is that it's like custom designing with standard parts in silicon via interconnect overlays or blowing fuses. Each run is custom and so there is no way you can know what was included with the core processor or how it was wired or what it really does. Not like going down to the local store to buy a 68X00 chip at all. Often the part numbers are coded or obscured.

What happens when a data variable is overflowed in a real time embedded system? [?] Overwrites other dynamic data variable areas or ??? When the stuff is crammed into a small space then overflows in RAM can cause some very weird and wonderful failures completely unrelated to the date failure itself. This is what has been found on some RTC/CMOS configurations in older PC's when the date flips over and changes the clocking speed, disables serial ports or 'forgets' about floppy drives.

Clocks and counters can be implemented as software processes as well as in hardware. Like the epoch counter in the GPS system. It's a data structure. What happens if it overflows, goes to zero, overwrites neighboring data structure? See above. Also see Beach's "UNDEFINED".
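As a hedged illustration of the GPS case: the broadcast week number is a 10-bit field counted from 6 January 1980, so it wraps to zero after 1024 weeks. The Python sketch below models only that fixed-width wrap. It says nothing about what a given receiver's software does with the wrapped value, and the function names are made up for this example.

```python
from datetime import date, timedelta

GPS_EPOCH = date(1980, 1, 6)   # start of GPS week 0
WEEK_BITS = 10                 # width of the broadcast week-number field


def broadcast_week(day):
    """Week number as a 10-bit counter would report it for the given date."""
    full_weeks = (day - GPS_EPOCH).days // 7
    return full_weeks % (1 << WEEK_BITS)   # only the low 10 bits survive


def first_rollover():
    """Date on which the 10-bit week counter first returns to zero."""
    return GPS_EPOCH + timedelta(weeks=1 << WEEK_BITS)


print(first_rollover())
print(broadcast_week(date(1999, 8, 21)), broadcast_week(date(1999, 8, 22)))
```

A fixed-width field like this simply wraps. Whether anything else gets overwritten depends on how the surrounding code stores and uses the value, which is the part that cannot be read off the hardware.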

Are you a rocket scientist? Or at least do you know any rocket scientists? [No] Just teasing you a bit here. The people who are intensively involved with building these custom embedded rigs are a chosen few. Often there is a lifespan estimate they design to and there is never the expectation that it will be replaced or upgraded. If it fails then replace it from stock or get a new piece of equipment. The embedded guys are driven by bean counters - get product out, sell, do new product, revenue, etc. Repair or recoding is not a part of the immediate spec and does not happen.

Your example of someone throwing an analyzer onto a bus is a bit simplistic IMHO. Out in the real world there are A LOT of Black Boxes. (Your known box is my black box if I don't know what you know, eh?) There are A LOT of unmarked chips. I wish you luck and God speed if you EVER have to troubleshoot a complex system without total documentation, code libraries, proper tools, and A LOT of time.

Happy hunting.

-- David (C.D@I.N), April 14, 1999.


David - you scrambled my answers a bit, for instance I do know a couple of rocket engineers, but most of your questions were needles, so it doesn't matter.

Now I was examining a single mode of failure here - whether or not it matters what date is present in a device that only uses the timing circuits of the clock internally. That was what Beach was discussing (after you trim out all the excess verbiage). My answer is NO, the only worry would be a single program cycle at the rollover, and since the hardware starts in a known state, testing is not needed for such a device as rollover will not occur until the RTC times out internally - which will be long after the original device is in the scrap heap or in a museum. Can you provide a counter example to this?

As for black boxes - take a close look at them. Most seem to be analog devices, around here anyway. And I haven't run into all that many unmarked LSI chips - SSI is not a problem as they are pretty easy to figure out. Now if you MUST have something to worry about, worry about PLA's where you don't have documentation as to just what was burnt into them. ROM and EPROM are a lot easier, just download and disassemble, but a PLA would be a right pain to figure out, for me anyhow.

-- Paul Davis (davisp1953@yahoo.com), April 14, 1999.


I keep seeing 1-1-2000 mentioned as a date when RTC's will rollover. I thought these clocks were storing time as a count of elapsed seconds or some other time interval since some base date. If so, the date that these clocks rollover is not a y2k issue. The rollover date would be a function of the base date, the time interval, and the size of the clock variable. So if I used n bits to store my clock and incremented my clock by 1 every ith of a second, my clock would rollover at base date + (2 ** n / i) seconds.
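The arithmetic above can be sketched in a few lines of Python. The base date, counter width and tick rate used here are illustrative assumptions only, not values taken from any particular RTC chip.

```python
from datetime import datetime, timedelta


def rollover_date(base, n_bits, ticks_per_second):
    """Moment at which an n-bit tick counter started at `base` wraps to zero."""
    seconds_until_wrap = 2 ** n_bits / ticks_per_second
    return base + timedelta(seconds=seconds_until_wrap)


# Assumed example: 32-bit counter, one tick per second, counting from 1 Jan 1970
print(rollover_date(datetime(1970, 1, 1), 32, 1))

# Assumed example: 16-bit counter, one tick per second, wraps in about 18 hours
print(rollover_date(datetime(1999, 1, 1), 16, 1))
```

With these inputs neither wrap lands on 1-1-2000: the date depends only on the base date, the counter width and the tick rate.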

Also I find it hard to believe that an embedded system would calculate a time interval by converting the times being compared from the internal clock value to calendar format and then comparing. It would be a more difficult algorithm to code, it would take up more space, and it would be much slower. All of these seem contrary to the requirements of an embedded system.

-- Ed Colletta (ecollett@lehman.com), April 14, 1999.



So: if I understand the contention, there could be embedded systems out there with hidden RTCs that have undefined behavior if they time out in Y2K. Before I'll believe Beach's argument, someone needs to name an actual example (ideally, give me the chip number(s)). I haven't seen anyone do that yet; all I've seen has been pure speculation (maybe I just missed it).

But there's another point which is being overlooked here (and re: Y2K in general). Thousands of computerized systems, from complete WANs down to tiny embedded controllers, fail now, and on a daily basis. We work around this. We EXPECT it to happen. Even if Beach were correct, he'd have to demonstrate how this could be more damaging than, say, a controller in a critical process developing a bad power supply at the worst possible time (this DOES happen, believe me).

The average lifespan of even the best capacitor is quite finite. It goes bad and the power supply develops excessive ripple -- which can cause precisely the "wildcard"-type behavior that Beach attributes to his "secondary clock" problem. (In fact, if you've got ripple on the Vdd line to a processor, you've got an Electronic Wildcard: that thing is liable to branch to an undefined location and truly go insane.)

And yet, we anticipate this and work around it. We have ways to shut them down, take them off line, and/or work around them when (not "if," WHEN) they fail. The idea that there are millions of these "time bomb" embedded systems lurking in the shadows, waiting to bite our hineys in Y2K, is something worse than silly.

-- Stephen
http://www.wwjd.net/smpoole

-- Stephen M. Poole, CET (smpoole7@bellsouth.net), April 14, 1999.

Mr. Beach's article certainly has stimulated a great deal of spirited discussion. Unfortunately, the discussion hasn't done a whole lot to clarify the situation. For those of us who are trying to convince plant operators and utility managers that the embedded systems problem is real, there continues to be very little hard evidence. Mr. Beach's position seems to be that no amount of testing will reveal potential system failures when clocks roll over to 1/1/00. Although Mr. Davis refutes Mr. Beach's conclusions, he agrees that testing is problematic. Rather than trying to reach consensus about how pervasive and "embedded" the problems are, we should all be stressing the need for contingency planning. Since all systems with embedded microprocessors are subject to failure (apparently even those that have been remediated), every business, plant, or facility that depends on devices with embedded microprocessors must have a clearly defined course of action to follow in the event any of their critical systems fail.

If Mr. Beach's conclusions are correct, then there is very little point in testing embedded systems. The effort currently being spent checking with manufacturers and testing embedded systems would better be spent practicing how to run operations in manual mode. For this reason, it would be helpful to know if Mr. Beach's points are valid.

-- Michael Gentry (mgentry@oit.swrcb.ca.gov), April 15, 1999.


This astounds me. There are "billions and billions" (just like McDonald's) of these little embedded chips running our equipment for us all over the world and yet this collection of highly qualified and barely comprehensible experts does not agree on how they work or how to test or fix them, or even whether there is any fixing needed.

We sell equipment operated by PLCs and here at ground level all we did was call our suppliers who called the chip makers and asked them if their chips were going to work next January or not, and they said "Yes" to all the ones we buy. And when our customers ask us if our equipment will work right next January, we tell them "We think so, because the chip makers said so". What else can we possibly do?

This whole discussion does not give me any kind of secure feeling. Stored food in a remote location is looking better and better to me, and that's what I'm betting on.

Stephen

-- Stephen Kovaka (kovaka@usa.net), April 15, 1999.


Doomslayer said, "If the date matters to a device, then there will be a way to set the date. It's ludicrous to suggest that anyone would design a device to be dependent on dates without providing a way to change that date." I disagree. If the y2k problem were not in the forefront of the microprocessor-designer's mind, and if the device needed to know only time intervals and not the actual date, there would appear to be no need to ever change the date in the "secondary clock." There would appear to be no need to make that clock's date accessible for resetting. I am not a techie and do not know the answer to the Bruce Beach embedded systems issue. But I am as amazed as some of you that it is April of 1999 and we are still arguing about how embedded systems work. There are two things I hope Bruce Beach is wrong about. One is that there have often been bad results in the rare cases when testers went to the great difficulty of setting the secondary clocks forward. The other is his statement that these clocks were updated to the current time when they were manufactured. But I have no reason to claim he is wrong. Hopefully someone does and will explain.

-- Bill Byars (billbyars@softwaresmith.com), April 15, 1999.

Sounds like D.C., guys and gals. How about the scientific method. As one bod observed, there are parts failing all the time, therefore there are spares which can be mocked-up and have their wee ladders fiddled with, using the ever-so-blue light. Set them to do the 1/1/2K rollover next week. Do the same with several cheeps which are widely used. One should be able to make fairly complex bricks...or find whole boards with EPROMs. No? Ciao and all the best. Geoff

-- Geoff Gubb (cnggubb@shore.intercom.net), April 16, 1999.

I'd be interested in how Beach's analysis compares with Frautschi's (sp?). It seems to me that both are talking about the same thing.

Frautschi basically thinks that if some of the RTCs are powered on long enough that they can rollover and give unexpected results. I don't see how Beach has said anything substantially different.

Beach does suggest that there are often batteries built into these RTCs. He also suggests that the epoch date will be somewhat close to the manufacture date.

-- JMWildenthal (jmw_now@hotmail.com), April 16, 1999.


David, Interesting questions - may I take a turn at them?

"How long will a 1 Farad capacitor power a CMOS timer circuit in a MCU in sleep mode? How long would it take to recharge that cap when power reappears?" ANSWER: You will never find a 1 FARAD capacitor in a CMOS timer circuit, unless it's the size of a bread box...puleez...how about tens/hundreds of MICROFarads? Charge time will depend on the RC time constant, give me the circuit and resistor and capacitor values...

"What are the design criteria for designing a box in which a battery is not used to maintain system state and elapsed time (Calendar/clock)?"

ANSWER: Well, for starters I would require an external power source....;)

"Do you have any experience with RTOS's? How about designing with ASICs or custom Si fab?" ANSWER: Yep. Also have y2k tested some RTOS's. RTOS versus any other operating systems ASICS - no design experience, but these aren't a very big Y2K problem from what I have seen.

"What happens when a data variable is overflowed in a real time embedded system?"

ANSWER: Depends on the specific system.

"Are you a rocket scientist? Or at least do you know any rocket scientists?"

ANSWER: I know OF Dr. Werner Von Braun.

In closing, your questions were very interesting, smoke and mirrors is what comes to mind. Not one of the questions has any bearing on the real issues involving y2k problems in embedded systems.

Regards, FactFinder

-- FactFinder (FactFinder@home.com), April 16, 1999.


Motorola has posted "Motorola Year 2000 Alert on Real Time Clock Semiconductor Devices" on its web site (http://sps.motorola.com). It includes a list of RTC's that have a two digit year register that Mot says are "Not Year 2000 Ready." If you are looking for a list of the offending "secondary clocks", this is a good place to start.

As a defense to designing RTC's that are held up by 1 Farad caps that are smaller than a brick, I have had good results with the Gold Cap by Panasonic (PN EEC-F5R5U105). It is a wet type electric double layer capacitor (8mm x 21.5mm) with a working voltage of 5.5V and can be purchased from DigiKey for a single quantity price of $6.13. They also sell a 3.3 Farad version.
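For a rough feel of how long such a cap could hold up a clock, here is a back-of-the-envelope Python sketch using the constant-current approximation t = C * (V_start - V_min) / I. Only the 1 Farad value comes from the part above. The 5.0 V starting voltage, the 2.0 V minimum and the 1 microamp draw are assumptions for illustration, and real CMOS current draw varies with voltage and temperature.

```python
def holdup_days(capacitance_f, v_start, v_min, current_a):
    """Days a capacitor can supply a constant current before sagging to v_min."""
    seconds = capacitance_f * (v_start - v_min) / current_a
    return seconds / 86400   # 86400 seconds per day


# Assumed figures: 1 F cap charged to 5.0 V, clock keeps running down to 2.0 V,
# sleep-mode draw of 1 microamp
print(round(holdup_days(1.0, 5.0, 2.0, 1e-6), 1))   # about 34.7 days
```

Halving the assumed draw doubles the result, so estimates in the range of weeks to a few months are plausible for a part of this size.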

-- Paul Mallon (pmallon@ionisys.com), April 16, 1999.


Correction - make that smaller than a bread box, not brick.

-- Paul Mallon (pmallon@ionisys.com), April 16, 1999.

It seems to me that the engineers at the water plant in Australia would be the ones to ask about this issue. How did they determine which chips needed remediation? What did they do to fix them? If Mr. Beach is following this thread, perhaps he could ask them to contribute. As a non hardware tech, I would rather hear the practical experiences of someone who has dealt specifically with this problem.

-- Richard Korp (korp@principia.edu), April 19, 1999.

Paul,

Thanks for the answer. Like any scenario, an exact answer is impossible, but one thing is true: if you remove the power, all goes bye bye except for non-volatile memory.

PS. Based on what I've seen of "rocket scientists", it's not a very good credential to throw around.

-- Rick Heffelfinger (Aheffs4@aol.com), April 21, 1999.

