Redefining the Edge: Process Data at the Edge
by Chris Sachs, on Sep 29, 2017 12:06:30 PM
What’s cloud computing, and what’s the edge? We could draw boundaries at the points of demarcation between providers and customers, but that quickly gets messy: AT&T serves my phone, but it also serves my company, and AT&T is more familiar as a communications provider than as a cloud provider. Let’s try again: in an enterprise context, Edge means “on prem” and Cloud means “not on prem”. Tricky again: Silver Spring Networks is one of the largest providers of smart metering infrastructure, with over 25M endpoints under management (like your house). It operates low-bandwidth, city-wide networks and gives its utility customers accurate information about energy consumption. What’s “on prem” in this story? And what’s “on prem” to a vendor of self-driving cars, or to Uber, which eschews on-prem IT infrastructure but receives masses of data worldwide from drivers and riders to its cloud-hosted apps?
In my view, the Edge is defined by the data flows that need to be processed, not by any physical or organizational boundary. The Edge is the right place to apply the appropriate analysis or learning, as “close” as possible to where the data enters the organization.
When I use the word “close,” I don’t mean geographically or even organizationally. “Closeness” is really about convenience and cost for a given analysis or learning goal. Swim can analyze traffic signal data on a micro-controller on a pole at the intersection itself, or in the datacenter of the city that operates the lights. And since learning on a micro-controller is tricky, Swim learns from the local stream of data from a single intersection on the pole, and from the stream of insights from all intersections at an appropriate aggregation point: either in the city data center or in an Azure instance.
Does this mean analysis at the edge and learning in the cloud? No, because both functions are required to produce meaningful (valuable, consumable) output: predictions for the future state of each intersection, and predictions for all intersections jointly, computed as a convolved learning solution. The entire process amounts to “Edge Learning,” independent of where the actual processing occurs: turning voltage swings into phase changes on lights, deduplicating events, identifying intersections, associating cars with loop crossings and pedestrians with crossing buttons, and finally learning on a “virtual twin” of the intersection given its recent past behavior and the traffic at other intersections in the city.
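To make the “virtual twin” idea concrete, here is a minimal sketch in Python. Everything in it is illustrative, not Swim’s actual API: the class name, the event methods, and the moving-average “prediction” stand in for the real stateful-twin-per-intersection pattern the post describes.

```python
from collections import deque

class IntersectionTwin:
    """Hypothetical stateful twin of one intersection: it ingests
    deduplicated phase-change and loop-crossing events, and keeps
    enough recent history to make a naive short-term prediction."""

    def __init__(self, intersection_id, history=32):
        self.id = intersection_id
        self.phase = None                    # current light phase, e.g. "NS_GREEN"
        self.counts = deque(maxlen=history)  # car counts for recent intervals

    def on_phase_change(self, phase):
        # Upstream, raw voltage swings have already been turned into
        # phase-change events and deduplicated.
        self.phase = phase

    def on_loop_crossing(self, count):
        # Number of cars the inductive loop detected during one interval.
        self.counts.append(count)

    def predict_next_count(self):
        # Stand-in for real learning: a moving average over recent intervals.
        if not self.counts:
            return 0.0
        return sum(self.counts) / len(self.counts)

# One twin per intersection; an aggregation tier would consume their insights.
twin = IntersectionTwin("5th-and-Main")
twin.on_phase_change("NS_GREEN")
for n in (4, 6, 5):
    twin.on_loop_crossing(n)
print(twin.predict_next_count())  # 5.0
```

A twin like this is cheap enough to run on the pole, while the stream of its predictions (rather than raw detector data) flows up to a city-wide aggregation point.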
The rest is all about convenience and cost. For example, one could analyze the stream of data from a small town (200 intersections yield about 50 MB/s, or roughly 4.5 TB/day) in AWS Lambda. (We’ve done it.) Splitting the initial analysis between the city data center and a public-cloud-hosted instance costs about $40/month; doing it all in Lambda would cost north of $5K per month.
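A quick back-of-the-envelope check of those figures (the inputs are the ones quoted in this post, not independently measured):

```python
# 200 intersections producing ~50 MB/s in aggregate, per the text.
rate_mb_s = 50
tb_per_day = rate_mb_s * 86_400 / 1_000_000  # MB/s -> TB/day (decimal units)
print(f"{tb_per_day:.2f} TB/day")            # 4.32 TB/day, i.e. roughly the 4.5 TB/day cited

hybrid_monthly = 40     # city data center + one cloud-hosted instance
lambda_monthly = 5_000  # "north of $5K" for doing it all in Lambda
print(f"~{lambda_monthly / hybrid_monthly:.0f}x more expensive")  # ~125x
```

Even at these round numbers, the all-Lambda approach is two orders of magnitude more expensive than splitting the work between the edge and a single cloud instance.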
Learn more about SWIM and how to redefine data processing at the edge.