Are Mobile Networks Ready for the Streaming Tsunami?
by Simon Crosby, on Jun 7, 2021 8:45:00 AM
I bet you think this is about the rise of Netflix, YouTube, Disney+ etc. It isn’t.
Sure, streaming video has eclipsed TV, but that's old news. I'm referring to the rise of streaming data - from every consumer and industrial product, and from every bit of corporate infrastructure - that needs to be analyzed in real time to drive smarter decisions. It's streaming into the data center and toward the cloud, over swamped provider networks. Streaming data is the consequence of everything being connected and blindly reporting its status.
This isn't a temporary shift, and "streaming" is an understatement. Anticipate a future in which every product, digital or physical, and every component of an enterprise's infrastructure and applications is instrumented to enable real-time intelligence. This isn't a stream, it's a tsunami that won't end. Witness the IPO announcement from Apache Kafka creator Confluent, with an annualized revenue run rate of $300M.
For mobile operators there are two opportunities:
- The first is to use mobile device status data to gain real-time insight into network performance, user experience and outages to deliver a more robust network and to ensure customer satisfaction.
- The second is to use their proximity to every edge device and enterprise infrastructure component to deliver “Edge” services that tame and make sense of the flood of data before it hits cloud service providers and enterprise applications. Operators can host "edge cloud" services on compute close to data sources.
The opportunity has never been greater, thanks to the introduction of 5G networking: operators can offer enterprise customers secure, private slices of network capacity with access to real-time edge computing capabilities, to deliver smart cities, smart grids, and tailored enterprise-focused offerings.
What's needed to succeed in both areas is fluency in the language of cloud-native event streaming - systems such as Apache Kafka, Apache Pulsar, and CNCF NATS, plus their cloud-hosted counterparts - and in the open-source platforms for real-time stream analysis.
One can argue that pub/sub messaging is a new "dial tone" for service providers: a platform that helps companies securely scale messaging from edge devices is an important service offering. Just as importantly, mastering cloud-native software architectures is crucial for operators themselves, both to deliver real-time customer service and to understand the state of their own networks. One might question whether operators should enter the fray. To be clear, pub/sub is a commodity; but a platform that can analyze, learn, and predict from data on the fly would be a powerful benefit to providers - to host customer apps and to offer their own low-latency services. This is the reason for the huge focus on Confluent: applications.
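To make the "dial tone" analogy concrete, here is a minimal in-memory sketch of the pub/sub pattern that brokers like Kafka and Pulsar implement at scale. The `Broker` class and topic name are purely illustrative, not any real broker's API:

```python
from collections import defaultdict

class Broker:
    """Toy pub/sub broker: each topic fans events out to its subscribers."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in self._subscribers[topic]:
            callback(event)

# Hypothetical example: a device publishes a status event, one app receives it.
broker = Broker()
received = []
broker.subscribe("device/status", received.append)
broker.publish("device/status", {"device": "tower-17", "rssi": -71})
```

A production broker adds durable logs, partitioning, and delivery guarantees on top of exactly this publish/subscribe contract.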
For use cases such as traffic prediction and routing, and for any interactive service, response time is critical. Using real-time messaging to a smart carrier edge could save hundreds of milliseconds of event processing time. For a real-time stream processing framework such as Apache Samza or Swim, getting hold of events fast is key to the real-time analysis, learning, and prediction that drive visualizations and automated responses.
For the second opportunity, one can consider subscribing to events at a broker to be the streaming equivalent of the database-era "SELECT". App dev teams can independently subscribe to, and write apps for, different event topics. All apps, from customer care to predicting outages in network equipment, become feasible when all events are reported in real time.
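The "SELECT" analogy can be sketched in a few lines: each team registers a standing interest in a topic, and only events published to that topic reach its app. The topic names and handlers below are hypothetical:

```python
from collections import defaultdict

subscriptions = defaultdict(list)

def subscribe(topic, handler):
    """Like a standing SELECT: register interest in one topic."""
    subscriptions[topic].append(handler)

def publish(topic, event):
    """Deliver an event only to the handlers subscribed to its topic."""
    for handler in subscriptions[topic]:
        handler(event)

care_tickets, outage_alerts = [], []
subscribe("customer/care", care_tickets.append)    # customer-care team's app
subscribe("network/faults", outage_alerts.append)  # outage-prediction team's app

publish("network/faults", {"cell": "A12", "alarm": "power"})
publish("customer/care", {"user": 42, "issue": "no signal"})
```

Each team sees only its own topic, so the two apps can be built, deployed, and scaled independently.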
Streaming data contains events that are updates to the state of applications or infrastructure. When choosing an application architecture to process it, the role of a data distribution system, like Kafka or Pulsar, is limited. This section provides some insight.
- Data is often noisy – Many real-world systems are noisy and repetitive, and large numbers of data sources add up to a huge amount of data. If raw data is delivered as a stream of publications from the edge, the transport cost can be huge.
- State matters, not data – Streaming environments never stop producing data – typically real-world events – but analysis depends on the meaning of those events, or the state changes that the data represents. Even de-duplicating repetitive updates requires a stateful processing model, which means the stream processor must manage the state of every object in the data stream.
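The stateful processing point above can be sketched with a toy de-duplicator: it remembers the last reported state per source and forwards an event only when something actually changed. The event shape and field names are assumptions for illustration:

```python
class Deduplicator:
    """Stateful filter: keeps the last known state of every source
    and forwards an event only when it represents a state change."""
    def __init__(self):
        self._last = {}

    def process(self, event):
        key, state = event["source"], event["state"]
        if self._last.get(key) == state:
            return None            # repetitive update: drop it
        self._last[key] = state    # state change: remember and forward
        return event

dedup = Deduplicator()
events = [
    {"source": "sensor-1", "state": "OK"},
    {"source": "sensor-1", "state": "OK"},     # duplicate, dropped
    {"source": "sensor-1", "state": "FAULT"},  # change, forwarded
]
forwarded = [e for e in events if dedup.process(e)]
```

A stateless filter cannot do this: deciding that an update is repetitive requires remembering what each object last said, which is exactly the per-object state the stream processor must manage.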
Stream processors are a special kind of subscriber: in addition to subscribing to events from the broker, they publish their output back to it. Stream processors often support a variant of SQL syntax for "stream queries" that are continually executed against the stream. Kafka, Pulsar, and Samza each provide variants of SQL query syntax, but beyond this they aren't opinionated about application architecture.
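That subscribe-then-publish-back pattern can be sketched as follows, with a running per-cell fault count standing in for a continually executed "count faults grouped by cell" stream query. The topics and handlers are illustrative, not any real broker's API:

```python
from collections import defaultdict

subscribers = defaultdict(list)

def publish(topic, event):
    for handler in subscribers[topic]:
        handler(event)

# A stream processor is just a subscriber that publishes its results
# back to the broker as a new stream, keeping state (counts) as it goes.
counts = defaultdict(int)

def count_faults(event):
    counts[event["cell"]] += 1
    publish("faults/per-cell", {"cell": event["cell"],
                                "count": counts[event["cell"]]})

subscribers["network/faults"].append(count_faults)

# Downstream apps subscribe to the processor's output like any other topic.
rollups = []
subscribers["faults/per-cell"].append(rollups.append)

publish("network/faults", {"cell": "A12"})
publish("network/faults", {"cell": "A12"})
```

Because the processor's output is itself a stream, other processors can subscribe to it in turn, which is how streaming pipelines compose.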