Many operational-technology (OT) networks that are not running on distributed control systems lack server backups or programmable-logic-controller (PLC) backups, and that gap puts the business at risk.
Operations teams need to evaluate their data, substantiate the risk a data loss poses to the business, understand the recovery options and timeframe, and then decide on mitigation. Once those steps are done, the system should be tested periodically to make sure the backups are actually usable.
First, it’s imperative to understand that OT data centers on real-time processing that monitors a process or controls physical devices. Picture electric-power transfers scheduled around grid demand: transfers may be scheduled during peak hours, while during off-peak hours more power is routed to cities based on their loads. In discrete manufacturing, picture pharmaceutical traceability or work-in-progress (WIP) tracking, which is used in every modern operation. Batch recipes or CNC program storage for bulk production can take stored data and load it to machines. Or it could be as simple as tracking shipping-container locations; the 2017 cyberattack against Maersk crippled the company in short order and disrupted logistics schedules for Maersk customers worldwide.
What happens if a PLC fails or a hard drive fails during the day and all WIP data is lost? Is there a backup? What is the recovery plan?
Many automotive facilities run on a just-in-time schema, which means there may be only six to 10 hours of inventory at the customer before the customer’s line shuts down. Typical fines run in the thousands of dollars per hour of downtime. Small businesses cannot afford that exposure. The risks are high.
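As a rough illustration, the exposure can be framed as the gap between the customer’s inventory buffer and your recovery time, multiplied by the fine. The sketch below uses entirely hypothetical figures, but the arithmetic shows why a few hours of recovery time matter:

```python
# Back-of-the-envelope exposure sketch; all figures are hypothetical examples.
inventory_buffer_hours = 8   # customer holds 6-10 hours of inventory (example)
recovery_time_hours = 14     # assumed time to restore the failed system
fine_per_hour = 10_000       # assumed fine in dollars per hour of customer downtime

customer_downtime = max(0, recovery_time_hours - inventory_buffer_hours)
exposure = customer_downtime * fine_per_hour
print(f"Customer downtime: {customer_downtime} h, fine exposure: ${exposure:,}")
# -> Customer downtime: 6 h, fine exposure: $60,000
```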
Thus, asset management and recovery become critical, as does a data-processing schema that is recoverable. For instance, if a company provides seats to Nissan, it needs to track the orders on the truck and know what it is sending to Nissan every 45 minutes. The assembly line may stay up and running, but what if the assembly and seat data are dropped, or none of the torque values on the assembly line make it into the database that provides the traceability automotive manufacturing requires?
This requires that OT data have a clear path, storage and recovery actions. Asset managers for PLCs should be implemented on the OT network. The HMI screens should be able to dump the data read at the time of the action into a database, and that data should be stored raw in an historian and then sorted into backups for database management. “Historian” has been the traditional word for this data repository, but today it could be a data lake, a data platform or a distribution dashboard. Short-term data can even be stored at the PLC level if an average is needed, and data changes might be captured at millisecond intervals, on every machine cycle or at different operations along the line.
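A minimal sketch of that path, with the caveat that `read_plc_tag()` is a hypothetical stand-in for whatever driver or OPC UA client actually talks to the controller: each value is timestamped the moment it is read and written raw to a local historian table, which can then be swept into backups.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical stand-in for the real tag read (e.g., a PLC driver or OPC UA call).
def read_plc_tag(tag: str) -> float:
    raise NotImplementedError("replace with your PLC driver or OPC UA client")

def log_raw_value(db: sqlite3.Connection, line: str, station: int, tag: str) -> None:
    """Timestamp the value as it leaves the PLC and store the raw record."""
    value = read_plc_tag(tag)
    db.execute(
        "INSERT INTO historian (ts_utc, line, station, tag, value) VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), line, station, tag, value),
    )
    db.commit()

db = sqlite3.connect("historian.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS historian "
    "(ts_utc TEXT, line TEXT, station INTEGER, tag TEXT, value REAL)"
)
```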
The moment data moves from the PLC into a register for uptake by a database is the moment there needs to be a reference for saving that data to a backup device. If you have a machine with 12 stations and your customer requires traceability of, say, torque values, one unit has a minimum of 12 data points per build.
Realistically, though, it is more like 48 data points per build recorded under that unit’s register.
Now look at production time. If a line makes 50 units per hour, that is 50 units multiplied by 48 data points multiplied by however many hours of production: 2,400 data points per hour, or 28,800 data points for that line over a 12-hour shift.
With five lines, that is 144,000 data points in 12 hours.
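The same arithmetic, written as a short script so the assumptions are easy to change; the station count, rate and shift length mirror the example above:

```python
points_per_build = 48   # realistic data points recorded per unit (example above)
units_per_hour = 50
shift_hours = 12
lines = 5

per_line = points_per_build * units_per_hour * shift_hours
print(per_line)          # 28800 data points per line per 12-hour shift
print(per_line * lines)  # 144000 data points across five lines
```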
If the customer calls with a complaint, the ability to look up that data in a timely manner is imperative. This means understanding storage requirements.
It’s easy to see how IT and OT are related, and that IT infrastructure and the cost of terabytes are things to understand if machine builders interface with plant processes that require traceability of material or assembled products. It is therefore recommended to keep PLC software backups and an historian backup for the intermediate data going to the database, as well as a database backup.
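To put a rough number on those storage requirements, assume some average stored size per data point; the 200 bytes used below is a hypothetical figure, since real row sizes depend on the database and schema, as does the production calendar:

```python
# Rough storage sizing under assumed figures; adjust to your schema and calendar.
bytes_per_point = 200             # hypothetical: timestamp, IDs, value, index overhead
points_per_12h = 144_000          # five lines, from the example above
shifts_per_day = 2
days_per_year = 300               # hypothetical production calendar

points_per_year = points_per_12h * shifts_per_day * days_per_year
gb_per_year = points_per_year * bytes_per_point / 1e9
print(f"{points_per_year:,} points/year ≈ {gb_per_year:.1f} GB/year before backups")
# -> 86,400,000 points/year ≈ 17.3 GB/year before backups
```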
Then there is backup timing. This is where it pays to take a page from distributed-control-system operations and spend the money on a backup device, because physical duplicates allow hot swaps: production keeps running while data is stored, and recovery is fast if one of the machines fails.
Risk analysis of the failure types a business may face is based on the time to recover and the threshold for remediation if a failure or cybersecurity attack occurs. Remediating a failure depends on the structures and processes operations has put in place, first for prevention and then for remediation when the failure happens. And it will happen in manufacturing, whether the cause is an attack, an environmental event or a forklift.
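One simple way to frame that analysis, sketched below with hypothetical probabilities and costs: weight each failure type’s downtime cost by how likely it is in a given year, then compare the total against the cost of the mitigation (backup devices, spares, storage).

```python
# Annualized-loss sketch; every probability and cost here is an assumption.
failure_modes = {
    # name: (annual probability, recovery hours, cost per hour in dollars)
    "cyberattack":   (0.05, 48, 10_000),
    "environmental": (0.10, 12, 10_000),
    "forklift":      (0.30,  4, 10_000),
}

expected_annual_loss = sum(p * hours * cost for p, hours, cost in failure_modes.values())
mitigation_cost = 25_000   # hypothetical: backup device, spares and storage per year

print(f"Expected annual loss: ${expected_annual_loss:,.0f}")   # -> $48,000
print(f"Mitigation cost:      ${mitigation_cost:,}")           # -> $25,000
# If the expected loss exceeds the mitigation cost, the backup spend pays for itself.
```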
With that in mind, the modern operations team cannot go without borrowing from information-technology playbooks and understanding its data: when to collect it, where to store it, how long to keep it and how to recover it. Data storage, cybersecurity and network planning should be part of any operational buildup, greenfield or brownfield.