Why Machine Learning Needs to Happen at the Edge

by Brad Johnson, on Dec 7, 2017 2:08:29 PM

Machine Learning has become a dominant strategy for applying artificial intelligence capabilities to real-world information. Self-training Machine Learning enables pattern discovery in unstructured data, which makes it ideal for edge environments. Industrial edge sensor data is usually unstructured: it is generated at a high rate and in high volume, but with low information density. Sensor data often lacks the context necessary to identify which information is valuable, so traditional application architectures have forwarded the entirety of the raw data to the cloud for processing.


But high data volumes can overwhelm cloud architectures, creating mountains of valuable operations data that may never be analyzed. Deloitte’s 2016 report, Smart Buildings: How IoT Technology Aims to Add Value for Real Estate Companies, explores this very problem. Deloitte explains that most “legacy systems can handle structured data, but increasingly, IoT data are unstructured—indeed, unstructured data are growing at twice the rate of structured data and already account for 90 percent of all enterprise data.” In other words, as much as 90 percent of enterprise data is at risk of going unused because it is incompatible with existing systems. Think about all that lost value, simply because enterprises lack the right tools!

How Edge Computing Solves the Unstructured Data Problem

Machine Learning alone is not enough to solve this unstructured data problem. Performing analysis on a massive industrial dataset in the cloud can be time-consuming and introduces bottlenecks at both the network and the database level. Frequently, the insights generated from industrial data are time-sensitive. For example, post hoc analysis may derive valuable insights about machine health and required maintenance, but if those discoveries arrive too late, no action can be taken that might have prevented downtime or equipment failure.

By distributing Machine Learning capabilities to the edge, learning instances can operate on much smaller datasets, each scoped to an individual device or cluster of devices. This is a far more efficient means of processing data from an industrial system. Raw data is reduced at the source into data that is still high rate but is structured, lower in volume, and more information-dense, and which can in turn be consumed immediately by legacy systems. Furthermore, because sensor data is processed in parallel, application latency improves dramatically. Edge-based Machine Learning instances can arrive in milliseconds at insights that might otherwise have taken 30 minutes or more to compute in a cloud-based silo.
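To make the idea of reducing raw data at the source more concrete, here is a minimal Python sketch. It is not SWIM's implementation, and it swaps in a simple statistical summary for a real learning model: each device's raw readings are collapsed into small, structured window records that a legacy system could consume. The names WindowSummary and summarize_windows, the window size of 100, and the 3-standard-deviation anomaly rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Iterable, Iterator, List


@dataclass
class WindowSummary:
    """Structured, low-volume record produced from one window of raw readings."""
    device_id: str
    count: int
    mean: float
    stdev: float
    minimum: float
    maximum: float
    anomalies: int  # readings farther than `threshold` stdevs from the window mean


def _summarize(device_id: str, window: List[float], threshold: float) -> WindowSummary:
    mu = mean(window)
    sigma = pstdev(window)  # population standard deviation of this window
    anomalies = sum(
        1 for v in window if sigma > 0 and abs(v - mu) > threshold * sigma
    )
    return WindowSummary(
        device_id, len(window), mu, sigma, min(window), max(window), anomalies
    )


def summarize_windows(
    device_id: str,
    readings: Iterable[float],
    window_size: int = 100,   # assumed value, tune per sensor
    threshold: float = 3.0,   # assumed anomaly rule, not from the article
) -> Iterator[WindowSummary]:
    """Yield one summary per full window, plus one for any trailing partial window."""
    window: List[float] = []
    for value in readings:
        window.append(value)
        if len(window) == window_size:
            yield _summarize(device_id, window, threshold)
            window = []
    if window:
        yield _summarize(device_id, window, threshold)
```

Because each device (or cluster of devices) runs its own instance over its own readings, the summaries can be computed in parallel, and only the compact records need to travel upstream.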

Lastly, by extending Machine Learning to edge devices, enterprises can more accurately model their real-world hierarchies within OT applications. This makes it easier for developers to siphon off information to user interfaces and operations dashboards, where an operator may only be interested in a subset of machines.
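One way to picture this hierarchy is as a set of paths such as (site, line, machine), with per-machine results rolled up to their parents and a dashboard selecting only the subtree it cares about. The sketch below is illustrative only: the path layout, the roll_up and subtree helpers, and the choice of "worst score wins" as the aggregation rule are assumptions, not something described in the article.

```python
from typing import Dict, Tuple

# Hypothetical hierarchy path, e.g. ("plant-a", "line-1", "press-2")
Path = Tuple[str, ...]


def roll_up(machine_scores: Dict[Path, float]) -> Dict[Path, float]:
    """Propagate each machine's score to every ancestor, keeping the maximum.

    A higher score is assumed to mean "needs more attention", so each parent
    reports the worst score found anywhere beneath it.
    """
    rolled = dict(machine_scores)
    for path, score in machine_scores.items():
        for depth in range(1, len(path)):
            parent = path[:depth]
            rolled[parent] = max(rolled.get(parent, score), score)
    return rolled


def subtree(scores: Dict[Path, float], prefix: Path) -> Dict[Path, float]:
    """Return only the entries at or below `prefix`, e.g. one production line."""
    return {p: s for p, s in scores.items() if p[: len(prefix)] == prefix}


scores = roll_up({
    ("plant-a", "line-1", "press-1"): 0.2,
    ("plant-a", "line-1", "press-2"): 0.9,
    ("plant-a", "line-2", "oven-1"): 0.4,
})
line_1_view = subtree(scores, ("plant-a", "line-1"))
```

Here line_1_view holds only the line-1 entry and its two presses, which is the kind of subset an operator dashboard for that line would display.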

Learn More
Learn how SWIM uses Edge Computing to deliver real-time edge data insights with millisecond latency for industrial and other real-time applications.

Topics: Machine Learning, SWIM Software, Edge Analytics, Edge Computing