
Tolerate less redundancy

Dec. 12, 2006
Today, with Foundation fieldbus, the old redundancy paradigm no longer applies. Chances are, though, it isn’t free. So where should you apply it to achieve the fault tolerance you need?
By John Rezabek, Contributing Columnist

While designing our first Foundation fieldbus (FF) segments in 1999, we had a one-day training session for our engineers and designers. Someone cracked open the case of the proposed FF power conditioners and we were aghast to find multiple integrated circuits (ICs) and (gasp!) fuses. A single point of failure for our multi-loop segments! There were red faces and bulging veins, and during the ensuing weeks, I was perhaps a bit more unpleasant toward our system integrator than normal.

Our attitude was compounded by a new supplier that had relatively modest redundancy in its system. Bulk DC power, controllers, and controller power supplies were redundant, but nearly everything else was simplex. We became less confident this supplier fully appreciated the demands the plant placed on us: basically to never shut down.

One way we found comfort was the fieldbus backup link active scheduler (BLAS). In theory, if the system had a bad day, control on the segment would continue uninterrupted. However, for this to function, one needs reliable segment power. The theoretical segment power conditioner, made up of basic inductors, capacitors, resistors, etc., could be considered a simple device, akin to the 250-ohm dropping resistor in a legacy system. But to make them more compact and efficient, manufacturers used ICs. These were not simple devices.

After much agonizing, our supplier saved the day with a redundant solution that was effectively a really simple device.

We put the bulky redundant conditioners only on that 20% of the segments we considered critical. We used the non-redundant devices, those with the ICs and fuses, on the remaining 50 segments, which had between three and 15 devices. Most of the valves in the plant were Level 3, which means they could go to their fail positions without causing a shutdown. We applied this engineering judgment because then, as now, redundancy cost more, took up more space, supported fewer instruments per segment, generated more heat, and added complexity.

The irony is—after six years under continuous power and 90% of it running as a continuous process—none of the non-redundant power conditioners ever failed in a way that caused a valve to go to its fail position. Nearly half of them did fail, but not in a way that caused more than nuisance alarms or controller-mode shedding. Many, maybe most, failure modes don’t result in a process upset. Simply put, all components, especially those with improved diagnostics, can have sufficient fault tolerance without being redundant.

Today, we have a good selection of redundant power conditioners, redundant H1 cards, and even solutions that accommodate redundant H1 trunks. But they aren’t free.

Redundancy became commonplace in the late 1980s when second-generation DCSs, in response to demands for improved fault tolerance from the large process industries, began to offer redundancy at the power supply, controller, I/O, network, and HMI levels. We justified redundancy’s increased cost, complexity, and system footprint in light of the dire consequences of a process shutdown. By achieving fault tolerance for the DCS, we could deliver a solution that was equally, if not more, fault tolerant than pre-DCS, single-loop solutions.

Sometimes it seems we have a whole generation of systems specialists who only remember that TDC-3000 was vastly more fault-tolerant than TDC-2000, largely due to available redundancy at all levels. I was among those who dismissed any PLC or DCS that didn’t offer redundant controllers, I/O, power, and networks for any application more demanding than wastewater treatment or filter cleaning.

Today, with Foundation fieldbus, the old redundancy paradigm no longer applies. Redundancy is available where we want it, but chances are it isn’t free. So where should we apply it to achieve the fault tolerance we need?

Have you noticed the “spurious trip rate” statistic that falls out of SIL analyses? Even the most obsessively redundant, bulletproof automation can potentially shut down the plant. Maybe it’s every 30 or 18,000 years, but it’s not never.

Why not use something similar for our basic controls? Hey, suppliers, we users need tools with built-in reliability statistics, such as MTTF, so we can judiciously apply redundancy to the components and services where we need it. On my next project, if I mess with all the old Level 1, 2, 3 stuff, I want to be able to tell my project manager I know precisely where to apply redundancy to achieve the fault tolerance demanded by operations.
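As a rough illustration of the kind of arithmetic such a tool might do, here is a minimal Python sketch that compares a simplex component with a redundant pair. It uses the textbook steady-state availability formula, MTTF / (MTTF + MTTR), and assumes the two redundant units fail independently, which ignores common-cause failures. The function names, the 50-year MTTF, the 8-hour repair time, and the 10% "upset fraction" are all hypothetical placeholders, not supplier data or figures from our plant.

```python
HOURS_PER_YEAR = 8760.0


def availability(mttf_hours, mttr_hours):
    """Steady-state availability of one repairable component."""
    return mttf_hours / (mttf_hours + mttr_hours)


def redundant_availability(mttf_hours, mttr_hours):
    """Availability of two identical units when either one alone is enough.

    Assumes independent failures (no common-cause failures), so the pair
    is down only when both units are down at the same time.
    """
    unavailability = 1.0 - availability(mttf_hours, mttr_hours)
    return 1.0 - unavailability ** 2


def downtime_hours_per_year(avail):
    """Expected hours per year the function is unavailable."""
    return (1.0 - avail) * HOURS_PER_YEAR


def upsets_per_year(mttf_hours, upset_fraction):
    """Expected process upsets per year for a simplex component.

    upset_fraction is the (assumed) share of failures that actually
    disturb the process rather than just raising a nuisance alarm.
    """
    return upset_fraction * HOURS_PER_YEAR / mttf_hours


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    mttf = 50 * HOURS_PER_YEAR   # assumed mean time to failure: 50 years
    mttr = 8.0                   # assumed mean time to repair: 8 hours

    simplex = availability(mttf, mttr)
    redundant = redundant_availability(mttf, mttr)

    print(f"simplex downtime   (h/yr): {downtime_hours_per_year(simplex):.6f}")
    print(f"redundant downtime (h/yr): {downtime_hours_per_year(redundant):.9f}")
    print(f"simplex upsets per year  : {upsets_per_year(mttf, 0.10):.4f}")
```

Run per segment with real supplier figures, numbers like these would let a project team rank segments by expected upsets and spend the redundancy budget only where the simplex figure is unacceptable to operations.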

About the Author
John Rezabek is a process control specialist for ISP Corp. in Lima, Ohio. You can reach John at [email protected].