Digital Network Reliability

Feb. 8, 2012
How Can You Increase Reliability and Network Availability?

As we commit our control system design to include digital networks rather than hardwired I/O, we want to be sure about reliability and know where redundant network devices are most needed, even if we specify hardened devices for many of the components. We worry that adding unnecessary network complexity and cost will give us after-sales support headaches. We'd like some seasoned advice.

—From December '11 Control Design


Reliability Important

With the digital network now transporting the information of many I/O points, the reliability of that network and its infrastructure (cable, switches, etc.) matters far more than that of the individual I/O cable it replaces.

One way to increase reliability and network availability is to add redundancy to create a system that can tolerate one network failure. Depending on the network architecture, this can be achieved in different ways.

A line architecture can be upgraded to a ring by closing the loop between the last and first device in a line. If the data transmission is interrupted in one direction, it can then be rerouted automatically in the other direction. With ring redundancy (Figure), a single fault can be tolerated without loss of communication and without adding any network components other than the cable that closes the loop.
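The difference between a line and a ring can be checked with a small reachability test. The following is only a sketch: the node names, the four-device layout and the helper functions are illustrative assumptions, and it models cable faults only, not any particular ring protocol.

```python
from collections import deque

def reachable(nodes, links, start):
    """Return the set of nodes reachable from start over undirected links."""
    adj = {n: set() for n in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def survives_any_single_link_fault(nodes, links, controller):
    """True if every node stays reachable after any one link is cut."""
    for cut in links:
        remaining = [link for link in links if link != cut]
        if reachable(nodes, remaining, controller) != set(nodes):
            return False
    return True

# Hypothetical layout: one controller and three I/O stations.
nodes = ["PLC", "IO1", "IO2", "IO3"]
line = [("PLC", "IO1"), ("IO1", "IO2"), ("IO2", "IO3")]
ring = line + [("IO3", "PLC")]  # one extra cable closes the loop

line_ok = survives_any_single_link_fault(nodes, line, "PLC")
ring_ok = survives_any_single_link_fault(nodes, ring, "PLC")
print("line tolerates a single cable fault:", line_ok)
print("ring tolerates a single cable fault:", ring_ok)
```

In this model the line fails the check and the ring passes it, with the only difference being the one added link.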

But before going through the expense and sometimes increased complexity of adding network redundancy, some other aspects should be considered when choosing a digital network. One is network topology. Some networks require a line architecture, meaning that all devices on the network have to be in the same segment. A failure in such a network means loss of communication to all devices behind the point of failure. Other networks allow much more flexibility in network architecture such as line, star, tree, ring or any combination of those types. Using a mix of line and star architecture for network segmentation can allow for the remaining network segments to function during a failure in one segment without investing in redundant network components.
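To illustrate the segmentation argument, the sketch below counts which devices lose communication when one cable fails, first in a single line and then in a star of short lines. The six-device layout and all names are assumptions made for the example.

```python
def lost_devices(nodes, links, controller, failed_link):
    """List the nodes cut off from the controller when failed_link is broken."""
    adj = {n: [] for n in nodes}
    for a, b in links:
        if (a, b) != failed_link:
            adj[a].append(b)
            adj[b].append(a)
    seen, stack = {controller}, [controller]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(set(nodes) - seen)

nodes = ["PLC", "D1", "D2", "D3", "D4", "D5", "D6"]

# All six devices daisy-chained in one segment.
single_line = [("PLC", "D1"), ("D1", "D2"), ("D2", "D3"),
               ("D3", "D4"), ("D4", "D5"), ("D5", "D6")]

# Same devices split into three two-device segments fanned out in a star.
segmented = [("PLC", "D1"), ("D1", "D2"),
             ("PLC", "D3"), ("D3", "D4"),
             ("PLC", "D5"), ("D5", "D6")]

fault = ("D1", "D2")
lost_line = lost_devices(nodes, single_line, "PLC", fault)
lost_segmented = lost_devices(nodes, segmented, "PLC", fault)
print("single line loses:", lost_line)
print("segmented layout loses:", lost_segmented)
```

The same cable fault takes out five devices in the single line but only one in the segmented layout, without any redundant components.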

A final consideration is the current separation (and thus quasi-redundancy) of regular I/O and safety I/O via different hardwired signals or networks, which once again increases cost and system complexity. Technology like openSafety allows data of safety devices to be transported over the same digital network as regular I/O data. Integrating safety into digital networks will therefore reduce the necessary network and hardwired infrastructure, and allow the same network flexibility in terms of architecture in general and redundancy in particular to be used for all I/O, both regular and safe.

Robert Muehlfellner,
Director of automation technology,
B&R Industrial Automation

Points to Consider

Because networked I/O solutions can provide a plethora of additional diagnostics data, one might conclude that they have to be more complex. Unfortunately, in many instances, that is the case. If you intend to benefit from the diagnostic advantages and the added flexibility these solutions offer, and possibly even put your safety devices on the same network, you have to be willing to tackle a certain learning curve. Here are a few guiding questions (and answers) that you might want to use when picking a digital I/O technology:

  1. Is it possible to work with the network without using a specialized software tool? Ideally, starting up and maintaining the network can be completed without additional PC configuration software.
  2. How much downtime is necessary when more I/O must be added to the network? Adding field I/O to the network should be accomplished in minutes.
  3. Can the same networking solution be used with PLCs from different suppliers? You never know what will happen in the PLC market. Any networking solution that can only work with a small number of PLCs is problematic.
  4. Is the technology deterministic and fast enough for the application? A nondeterministic solution is a nonstarter. I/O update times should be twice as fast as PLC scan times. Less than 10 ms is probably a good number.
  5. Does the technology support machine safety operations? Even if this is not what you have in mind right now, using a technology that does not support safety will be a problem in the future. Networking safety can reduce wiring complexity by as much as 90%. Any selected technology should at least be ready for it.
  6. Can a field module be exchanged without additional tools? You hope that the network you select is reliable. This means that I/O module exchanges are infrequent and knowledge about the solution is not well retained by maintenance. Module exchange must, therefore, be as simple as "removing the old, inserting the new."
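The timing guidance in question 4 can be turned into a quick check. This is a minimal sketch: the function name is hypothetical, and treating "less than 10 ms" as a hard ceiling is an assumption, since the text only calls it a good number.

```python
def io_update_ok(io_update_ms, plc_scan_ms, ceiling_ms=10.0):
    """Check an I/O update time against the two rules of thumb.

    Rule 1: the I/O update should be at least twice as fast as the PLC scan.
    Rule 2 (assumed to be a hard limit here): the update should take
    less than ceiling_ms.
    """
    twice_as_fast = io_update_ms <= plc_scan_ms / 2
    under_ceiling = io_update_ms < ceiling_ms
    return twice_as_fast and under_ceiling

# Hypothetical candidates: (I/O update time in ms, PLC scan time in ms)
candidates = [(4, 10), (8, 10), (12, 30)]
results = [io_update_ok(io_ms, scan_ms) for io_ms, scan_ms in candidates]
for (io_ms, scan_ms), ok in zip(candidates, results):
    print(f"I/O {io_ms} ms vs scan {scan_ms} ms: {'ok' if ok else 'too slow'}")
```

The third candidate is twice as fast as its scan time but still fails, because it exceeds the assumed 10 ms ceiling.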

I also urge you to take advantage of the expertise your chosen supplier brings to the table. Get them involved, and describe the application as best as possible. Your supplier can then review your layouts and designs, thus reducing cost and making the digital network run at peak performance.

Helge Hornis
Manager, Intelligent Systems Group


Congratulations Are Due

Congratulations on embracing a fieldbus network instead of hardwired I/O. I believe everyone in the fieldbus communities will state unanimously that with distributed I/O you will capitalize on superior diagnostics, simpler wiring, improved uptime and overall cost reductions compared with your current hardwired implementation.

Since your concern seems to be with the communication network itself, what you need is not controller or I/O device redundancy but what the industry calls "added availability" or, more informally, "network redundancy." Basically, this is a scheme that enables the system to withstand faults in the communication network, such as a sheared cable, an unplugged cable or a dead module.

Network redundancy even permits the purposeful removal and re-installation of a section of devices in the network ("hot connect/disconnect") while the remainder of the network continues running. EtherCat, for example, needs only an additional cable running from the last device on the "redundant" section of the network to a second Ethernet port of the controller, and the master will send out identical frames in both directions (transmit and receive) of the Ethernet cables. The slaves do not need any additional configuration and do not "know" they are part of a redundant system. A switchover time of 15 µs is inherent in EtherCat devices, meaning a maximum of just one frame would be lost in such an event. The result is that both sections of the network continue to function, avoiding the network errors and system downtime that might be experienced in alternative systems.

Joey Stubbs, PE, PMP
North American Representative,
EtherCat Technology Group

Plan Your Network

First, invest in the proper network components because your machine is only as strong as your weakest link. Case in point, most industrial networking issues are a result of inadequate cabling. By using more robust equipment like industrial-rated cables you can help drive reliable network communications. Also, consider incorporating a managed switch into your system. Managed switches help improve network reliability (performance and uptime) by providing important diagnostic capabilities and intelligent features such as quality of service (QoS), port mirroring, loop prevention, and network security. You don't want the network to be the limiting factor of your machines' production.

Network reliability calls for more than robust equipment, so be sure to thoroughly plan your network structure and do your homework. For example, if you're building a machine that end users will integrate into their infrastructure, we suggest having dialogues with their IT department to make sure your machine's networking won't violate any IT policies, especially as it relates to security.

Another concern could be centered on the machine's connected devices. In our experience, embedded switch technology generally lays the foundation for the most robust network at the machine's device level. By embedding the switch into the device itself, you can use a device-level ring (DLR) topology that produces a single, fault-tolerant network—thus one that's more resilient. It simplifies design and configuration while increasing the reliability of the machine network. So you might have a switch-level ring above the machine and a DLR at the machine.

Built-in diagnostics and resiliency allow the system to run as expected if a failure occurs, instead of stopping production. In fact, when up to 50 devices are connected, a DLR recovers in less than 3 ms. This enables you to schedule maintenance at a later time, providing increased flexibility while reducing production waste.

Mike Hannah
Manager, Networks Business,
Rockwell Automation

Machine System Reliability

Regarding the system's network layer, a machine designer's thought process must consider "The Theory of Control Reliability" per ANSI B11.19, Section 2.12 and B11.19, Section 5.5.1.

Three principles of machine reliability are redundant components, monitoring and diversity. Today's Ethernet switches incorporate many features to help machine designers meet those three principles. One example of a system vulnerability is the network storm. A network storm typically results from connecting unmanaged Ethernet switches in a loop or ring configuration, and it causes Ethernet communication failures. Spanning Tree Protocol (STP) and Rapid Spanning Tree Protocol (RSTP) were developed to detect network loops and eliminate broadcast storms. Some manufacturers have Ethernet switches designed with these features and many others to protect the network. Networks built with such switches are simple to construct and do not require unnecessary hardware, extra wiring or network complexity.
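The idea behind loop prevention can be sketched as follows: keep a loop-free tree of active links and block the rest, then recompute when a link fails. This is a deliberately simplified illustration, not the real STP or RSTP algorithm (it ignores bridge IDs, path costs and BPDU exchange), and the switch names are assumptions.

```python
from collections import deque

def spanning_tree(switches, links, root):
    """Return (active, blocked): links kept in a loop-free tree and links left out."""
    adj = {s: [] for s in switches}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    visited = {root}
    queue = deque([root])
    active = set()
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in visited:
                visited.add(nxt)
                active.add(frozenset((cur, nxt)))
                queue.append(nxt)
    blocked = [link for link in links if frozenset(link) not in active]
    return active, blocked

# Hypothetical ring of four managed switches.
switches = ["SW1", "SW2", "SW3", "SW4"]
ring = [("SW1", "SW2"), ("SW2", "SW3"), ("SW3", "SW4"), ("SW4", "SW1")]

active, blocked = spanning_tree(switches, ring, "SW1")
print("blocked while healthy:", blocked)

# A cable fails: recompute over the surviving links.
surviving = [link for link in ring if link != ("SW1", "SW2")]
active_after, blocked_after = spanning_tree(switches, surviving, "SW1")
print("blocked after failure:", blocked_after)
```

While the ring is healthy, one link is held in reserve so frames cannot circulate forever. After the failure, that link is put back into service and all four switches stay connected.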

Figure: Pre-terminated cable assemblies with modular connectors can help eliminate hardwiring errors in the system. (Harting)

On the machine's physical layer, reliability (or the lack of) can be traced to point-to-point cabling or what is commonly known as hardwiring. This style of wiring can complicate the setup, testing, troubleshooting and debugging of the system. When an independent contractor is not experienced with the system they are working with, hardwiring errors are very likely.

A solution to this problem is connectorization, or a plug-and-play system. Pre-terminated cable assemblies with modular connectors to handle everything from power to signal to fiber take the guesswork out of assembling a system. Even within a control panel, the use of connectors helps to remove one layer of possible wiring errors, which is the terminal block. Removing the terminal block and wiring directly to the device eliminates the chance of human error in one more area. Also, depending on the environment the machine is subject to, this removes one more point of failure if there is the chance of loose wiring. Taking every opportunity to remove wiring errors can improve the overall reliability of your system.

Craig Zagorski
Market and Applications Manager, NA,
Harting North America

Redundancy Options

Stability is a critical requirement for industrial networks, and can be enhanced by adding device or network redundancy. When it comes to redundancy, the key tradeoff is determining how to expand and service the industrialized network while maintaining performance and system uptime. Redundancy is an essential requirement for most industrial Ethernet networks. There are a wide variety of redundant path mechanisms, with the most common of these being STP, RSTP, mesh networking and ring redundancy.

STP is commonly used in enterprise applications. Although it solves the issue of looping in the path, it has drawbacks such as recovery time, which can be several seconds. RSTP was created to improve on the slow recovery time of STP, with the goal of network recovery in less than one second. Although RSTP is an improvement, high-demand applications such as large networks require even faster response times. Because it is an open standard, manufacturers can adapt and improve the recovery times of redundant networks while still adhering to IEEE standards. Out of these standards and improvements, many proprietary ring/chain redundancy systems were born.

Ring redundancy is ideal for systems that have inherent cabling difficulties. It allows for multiple connections and multiple rings, thereby allowing multiple subnetworks to be connected within one overall redundant system. Setup is as simple as configuring one master in a ring that auto-negotiates the path through all its connected slaves.

Redundant chain technology is based on advanced software that gives network administrators any type of redundant network topology required. When using the chain concept, the first step is to connect Ethernet switches in a chain and then simply link the two ends of the chain to an Ethernet network. In chain systems, you basically have a "head" switch, a "tail" switch and multiple members.

Proprietary ring redundancy is commonly used in industrial applications because of its ease of setup and response times typically in the millisecond range. Although industrial manufacturers will support regular STP/RSTP, it is often cumbersome to set up, and an ever-evolving industrial network requires extensive pre-planning. Ring redundancy ensures the non-stop operation of networks with extremely fast recovery times.

One of the ideal methods to maintain speed and ensure uptime in an industrial network is to set up networks using redundant ring/chain topology, which allows recovery time of less than 20 ms. This system architecture was developed specifically for industrial networks, which require both uptime and rapid installation.
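To put the recovery times in perspective, the sketch below converts each one into the number of I/O update cycles that pass while the network heals. The 10 ms update time and the 3,000 ms figure standing in for "several seconds" are assumptions. The RSTP and ring/chain values are the upper bounds quoted above.

```python
import math

def missed_updates(recovery_ms, io_update_ms):
    """Number of I/O update cycles that elapse while the network recovers."""
    return math.ceil(recovery_ms / io_update_ms)

IO_UPDATE_MS = 10  # assumed I/O update time, not taken from this section

recovery_ms = {
    "STP": 3000,       # assumption standing in for "several seconds"
    "RSTP": 1000,      # upper bound of the "less than one second" goal
    "ring/chain": 20,  # upper bound of the "less than 20 ms" figure
}

missed = {name: missed_updates(ms, IO_UPDATE_MS) for name, ms in recovery_ms.items()}
for name, count in missed.items():
    print(f"{name}: up to {count} missed I/O updates")
```

Under these assumptions, a ring or chain costs at most two update cycles per fault, while the spanning-tree variants cost hundreds.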

Many manufacturers have an easy redundancy setup feature via embedded software that is activated by a simple check box selection. The switches themselves can determine the fastest route from source to destination. Some even can be configured via external DIP switches so technicians don't have to get involved with the software, making it a plug-and-play scenario.

Traditionally, ring/chain protocols did not work well with existing networks. With newer generations of managed switches, the two networks can be integrated easily, and some devices can run both RSTP/STP and redundant ring/chain architectures at the same time. Even though many are set up using proprietary protocols and are specific to manufacturers, they are transparent to, or interoperate with, existing RSTP/STP networks.

Andrew Barco
Product Manager, Network Connectivity,
Weidmüller North America

IP Options

The move from hardwired I/O has many advantages. The first and most obvious is the saving on copper by replacing parallel wiring with an I/O station closer to the end I/O points. These I/O stations are then connected together through a serial or Ethernet network back to the PLC. IP20-rated I/O can be placed around the machine in small cabinets. You can even go as far as placing IP67-rated I/O directly on the machine, completely removing the cabinets. Depending on the application, the most cost-effective option could be either or a combination of both. The IP67 option could carry a higher initial cost, but saves money by reducing the labor that would be used to wire control cabinets.

With all these new network options, it is important to determine what works best with the PLC/PC you plan to run your application on. No matter what you choose, the complexity of configuring this new network depends on the application software and what tools are available from the PLC manufacturer. The individual I/O usually is configured through a DIP switch setting or software tools provided by the I/O manufacturer. The PLC and I/O don't need to be supplied by the same manufacturer. Sometimes you can save by going with an I/O supplier independent of a PLC. These suppliers provide added flexibility by supporting multiple networks, making it simpler to transition between bus networks if requested by the end user.

With the newer Ethernet-based I/O options, a lot of the configuration can be done using Internet Explorer and the I/O's built-in web pages. This simplifies the configuration process and removes the need to create complicated diagnostic screens in the HMI. Diagnostics can be polled from the I/O simply by opening up the built-in web pages.

The Ethernet switching infrastructure is one of the best options for I/O networking. As with the I/O devices, Ethernet switches can also be IP20-rated for placement in the machine cabinet, or IP67-rated and located directly on the machine. The need for network reliability is addressed by the use of an easy redundancy mechanism such as Rapid Spanning Tree Protocol (RSTP). An Ethernet switch using RSTP allows you to deploy redundant device connections, thereby creating multiple communication paths for the I/O module. In the event that one path becomes disabled, the RSTP algorithm will automatically enable another, thus promoting the concept of the self-healing network.

Jason Haldeman, I/O Product Marketing Lead Specialist;
Ken Austin, Ethernet Product Marketing Lead Specialist;
Phoenix Contact USA

Saving Hours

The whole point of using digital industrial networks is to improve productivity over hardwired systems. This is accomplished primarily by reducing system downtime; however, all systems eventually break down. The major advantage of networked I/O is the ability to get far more diagnostic data to identify the problem and solution than you can with hardwired I/O. With better diagnostics, the controller can be programmed to tell you what you need to do to get back into production.

Implement diagnostics that make sense for the user's application (they will need to be willing to pay for that functionality). It is possible for the controller to not only identify the problem, but also to provide the location on the machine, pictures, work instructions, tool list—even the part number of the failed part and its location in the store room. The upfront engineering of the machine for diagnostics eliminates many hours of lost production that would otherwise be used to troubleshoot why the machine stopped.

If a machine is implemented in this way, then where to use "redundant" I/O becomes a much simpler decision: What part of your machine or process can you absolutely not afford to stop working? Use redundant I/O where I/O failure causes unacceptably high cost due to lost production or damage to your equipment.
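One way to make that decision concrete is to compare the expected yearly cost of an I/O failure with the yearly cost of making that I/O redundant. The sketch below is only an illustration: the function, the cost model and every number in it are hypothetical.

```python
def redundancy_pays_off(failures_per_year, hours_down_per_failure,
                        cost_per_hour, redundancy_cost_per_year):
    """True if the expected yearly downtime loss exceeds the cost of redundancy."""
    expected_loss = failures_per_year * hours_down_per_failure * cost_per_hour
    return expected_loss > redundancy_cost_per_year

# Hypothetical machine sections (all figures invented for illustration).
critical_press = redundancy_pays_off(
    failures_per_year=0.5, hours_down_per_failure=4,
    cost_per_hour=20000, redundancy_cost_per_year=5000)
label_printer = redundancy_pays_off(
    failures_per_year=0.5, hours_down_per_failure=1,
    cost_per_hour=500, redundancy_cost_per_year=5000)

print("redundant I/O for critical press:", critical_press)
print("redundant I/O for label printer:", label_printer)
```

With these made-up figures, redundancy is justified only for the section whose downtime is expensive, which matches the rule of thumb above.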

Phil Marshall, COO,
Hilscher North America