
Redundant Redundancy Revisited

Oct. 31, 2013
American Axle's Jeff Smith Discusses New Features He'd Like to See From Ethernet/IP Device Suppliers: Capturing the Diagnostics That Precede an Outage
About the Author

John Rezabek is a process control specialist for ISP Corp. in Lima, Ohio. Email him at [email protected] or check out his Google+ profile.

At the ARC forum in Orlando earlier this year, American Axle's Jeff Smith was on a panel with a number of engineers, myself among them, talking about fieldbus. He described his sophisticated, painstaking and very successful method for ensuring all his Ethernet/IP device suppliers had the necessary capabilities to function in any one of 35 assembly lines across the globe, some of which are a few football fields long. Jeff's efforts ensured that his uptime and interoperability were impeccable.

But when asked what new features he'd like to see, his answer was for suppliers' devices to capture and preserve the diagnostics that precede an outage. Despite best-in-class networking practices, all devices are still susceptible to the vagaries of local power quality and availability, and other disruptions of the physical layer. If the next step is to fortify the power and network infrastructure with redundancy, can we choose judiciously so we get the maximum reliability gain for the effort?

SEE ALSO: Many Faces of Ethernet Redundancy

The fundamental goal of any redundancy scheme is "fault tolerance." That means no single fault shall cause a loss of data or control. Hot standby or redundant CPUs can fend off an array of detectable logic solver crashes, but their utility is limited if they share the same power supply and/or network infrastructure.
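Some back-of-the-envelope arithmetic makes the point. The probabilities below are purely illustrative assumptions, not field data, but they show how a shared supply erases most of what a redundant CPU pair buys you:

    # Illustrative (not measured) probabilities of a unit being down at any instant.
    P_CPU = 0.01   # assumed chance a single CPU/logic solver is failed
    P_PSU = 0.005  # assumed chance a single power supply is failed

    # Each CPU fed by its own independent supply: the pair is lost only
    # if both CPU-plus-supply chains are down at the same time.
    p_chain = 1 - (1 - P_CPU) * (1 - P_PSU)
    p_independent = p_chain ** 2

    # Both CPUs hanging off one shared supply: that supply becomes a
    # common-mode single point of failure for the whole pair.
    p_shared = 1 - (1 - P_CPU ** 2) * (1 - P_PSU)

    print(f"independent supplies: {p_independent:.6f}")   # roughly 0.0002
    print(f"shared supply:        {p_shared:.6f}")        # roughly 0.005

With a shared supply, the unavailability of the pair is dominated by the supply itself, roughly twenty times worse in this toy example than with truly separate power.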

One plant's design had diode-auctioneered redundant power wired to each redundant pair of CPUs, only to find that the scheme didn't protect them from an over-voltage fault. That is, if any of the redundant power supplies ever had a fault that resulted in a higher voltage, the simple redundancy scheme would route the too-high voltage to all the CPUs and possibly cause a system-wide, common-mode failure. So we take pains to power each CPU of a redundant pair with physically and electrically separate redundant power systems.
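A minimal sketch of why the auction backfires, assuming idealized diodes and round numbers: the diode-OR simply hands the bus to whichever supply is highest, faulted or not.

    def diode_or_bus(supply_volts, diode_drop=0.6):
        # Idealized diode-auctioneered bus: the highest supply wins,
        # less one forward diode drop (0.6 V assumed here for illustration).
        return max(supply_volts) - diode_drop

    # Normal operation: two healthy 24 VDC supplies.
    print(diode_or_bus([24.0, 24.0]))   # about 23.4 V on the bus

    # Over-voltage fault on one supply: the fault wins the auction and is
    # routed straight to every CPU hanging on the shared bus.
    print(diode_or_bus([24.0, 36.0]))   # about 35.4 V reaches all the loads

The diodes happily block a dead or sagging supply, but they have no way to reject a supply that fails high.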

This sort of vexing common-mode bugaboo comes into play both upstream and downstream of where we might power an individual panel, machine or rack. Our dc power supplies don't power themselves, so where is the ac or other upstream power coming from?

Clearly, we don't want redundant power supplies on the same circuit, but what if they're on the same power bus or even the same UPS? A brown-out might bring production to a halt, and the boss will be asking why. Having no data or incomplete data due to the same power failure won't answer many questions or help to avoid the same issue in the future.

Keeping devices and nodes alive despite faults in the infrastructure is helpful, but the network poses the same challenges and some unique ones. Ethernet requires switches; switches require power; and the way we source and distribute that power will directly impact the effectiveness of any network redundancy strategy. If you have clean, redundant, battery-backed-up UPS power in a control center or rack room, it's not always trivial to distribute it far and wide through the production facility, especially if you measure your site in miles.

Power-over-Ethernet (PoE) would seem to have some promise. Power your "power sourcing equipment" (PSE) switch in the rack room and distribute the same clean power to the field using PoE. But PoE has the same distance limitation (100 m) as plain Ethernet over Cat-5 cable — you can only push the 48 VDC so far over 24 AWG twisted-pair. And because PoE typically delivers only 15 W of power, the remote node can't become a PSE for another PoE-powered switch 100 meters away — no daisy chaining is currently possible (see the rough numbers below). So if you're constrained to locally powered switches and media converters, you can improve your odds by using a ring topology, so one switch can't bring down the entire network. And by designing in some degree of power diversity, that is, at least sourcing the field power from different buses, switch racks or power panels, you can do your best to minimize common-mode failures.
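Here are those rough numbers, using nominal 802.3af-class figures rather than any particular vendor's datasheet, for a full 100 m run:

    # Rough, illustrative arithmetic for Type 1 (802.3af) PoE over ~100 m of Cat-5.
    V_PSE  = 48.0   # nominal voltage sourced by the switch (PSE), volts
    P_PSE  = 15.4   # maximum power a Type 1 PSE may source, watts
    R_LOOP = 20.0   # assumed worst-case cable loop resistance for 100 m, ohms

    i       = P_PSE / V_PSE        # roughly 0.32 A drawn at full load
    v_drop  = i * R_LOOP           # roughly 6.4 V lost in the copper
    p_cable = i ** 2 * R_LOOP      # roughly 2 W dissipated in the cable
    p_pd    = P_PSE - p_cable      # roughly 13 W left at the far end

    print(f"{i:.2f} A, {v_drop:.1f} V drop, {p_pd:.1f} W delivered")

The roughly 13 W that arrives at the far end is already less than the 15.4 W a Type 1 PSE must be able to source, which is why that remote node can't turn around and power the next switch another 100 meters down the line.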

The other gotcha of network redundancy is geographic diversity. If the network is threatened by heedless back-hoe or crane operators, running all the cable or fibers in the same raceway negates the fault tolerance you were aiming to provide. If such human and environmental factors are in play at your site, you'll want your redundant media to take geographically separate paths through the facility.

The dictionary defines redundant as "needless." To ensure your redundancy strategy isn't needless, minimize common-mode vulnerabilities to ensure you're getting the maximum value from the additional hardware and complexity.