The Journey to Continuous Intelligence (Part 3)
by Simon Crosby, on Aug 12, 2020 7:45:00 AM
A Technical Blog Series for Data Architects
This multi-part blog series is intended for data architects and anyone else interested in learning how to design modern real-time data analytics solutions. It explores key principles and implications of event streaming and streaming analytics, and concludes that the biggest opportunity to derive meaningful value from data, and to gain continuous intelligence about the state of things, lies in the ability to analyze, learn and predict from real-time events in concert with contextual data, both static and dynamic. The series places continuous intelligence in an architectural context, with reference to established technologies and use cases in place today.
Smarter Databases Can’t Help
The rise of event streaming reflects the importance of how system behavior changes over time. The challenge at the application layer is to deliver intelligence gleaned from streamed events, continuously, over time. But a major drawback of event streaming architectures is that they store events in topic queues ordered by arrival time, so applications can only consume from the head of the queue. What we'd really like is a stream of intelligence, delivered like events, that results from continuously and concurrently interpreting the effect of every event on a stateful model of the system. Instead, today's application developers must use some sort of database to represent the state of the system. And that just isn't enough: modifying a representation of system state in response to changes communicated in events is one thing; delivering a continuous stream of intelligence that results from those changes is quite another. Databases can help with the first, at a cost in performance. They do nothing for the second (see figure 4).
Figure 4: Data-driven Computation (Source: Swim)
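To make the contrast concrete, here is a minimal Python sketch of the stream-driven alternative. The event shape and the "overheating" predicate are hypothetical, for illustration only: each event is folded into an in-memory stateful model as it arrives, and insights are emitted in the same pass rather than recovered later by querying a database.

```python
from collections import defaultdict

def consume(events):
    """Fold each arriving event into a stateful in-memory model and emit
    derived insights as they occur, instead of parking raw events in a
    topic queue to be queried later."""
    state = defaultdict(dict)   # entity_id -> latest known attributes
    insights = []
    for event in events:        # events are consumed in arrival order
        entity = state[event["entity_id"]]
        entity.update(event["payload"])      # update the stateful model
        if entity.get("temp_c", 0) > 90:     # evaluate a predicate on the new state
            insights.append((event["entity_id"], "overheating"))
    return state, insights
```

In a real deployment the loop would run forever over a live stream; the point is that analysis happens as a side effect of ingestion, not as a separate query step.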
There is a vast number of databases and cloud database services available. Most can store streaming data, and many have evolved powerful features that cement their role as the masters of application state. Sophisticated data management capabilities are migrating into the database engine to address latency challenges. Leading the feature development race are the hosted database services from the major cloud providers.
But there are hundreds of others. Broadly, the trend is toward large in-memory stores, grids and caches that attempt to reduce latency. All of today's database engines can ingest events at huge rates. But that's not the problem. No database, in-memory or otherwise, can understand the meaning of data, or deliver real-time, situationally relevant responses. Applications interpret events from the real world to change a model of the state of the system, and a single event may cause state changes to multiple related entities. By way of example: a truck needing maintenance enters a geo-fence, meaning the truck is now near an inspector, so the inspector is alerted. A single event carrying the GPS coordinates of the truck might change the states of both the geo-fence and the inspector. Every time the states of entities, or the relationships between them, change, the application may need to evaluate sophisticated logical or mathematical predicates, joins, maps or other aggregations, and execute business logic. Each of these might require scores of round trips to the database. For every truck, and every inspector, in real time.
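The truck example can be sketched in a few lines of Python. All entity names, fields and the containment test here are hypothetical stand-ins; the point is that one GPS event touches three related entities, and each dictionary read or write below would be a database round trip in a database-centric design.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# In-memory stand-ins for the entities in the example.
trucks = {"T1": {"lat": 0.0, "lon": 0.0, "needs_maintenance": True}}
geofences = {"G1": {"lat": 37.77, "lon": -122.42, "radius_km": 5.0,
                    "inspector": "I9", "occupants": set()}}
alerts = []

def on_gps_event(truck_id, lat, lon):
    """One GPS event may change several related entities: the truck,
    any geo-fence it enters or leaves, and the inspector to alert."""
    truck = trucks[truck_id]
    truck["lat"], truck["lon"] = lat, lon                 # update the truck
    for fid, fence in geofences.items():
        inside = haversine_km(lat, lon, fence["lat"], fence["lon"]) <= fence["radius_km"]
        was_inside = truck_id in fence["occupants"]
        if inside and not was_inside:
            fence["occupants"].add(truck_id)              # update the geo-fence
            if truck["needs_maintenance"]:
                alerts.append((fence["inspector"], truck_id))  # alert the inspector
        elif not inside and was_inside:
            fence["occupants"].discard(truck_id)
```

Run against a stream of positions, every arrival and departure is evaluated as it happens; done with per-event database queries instead, each predicate adds another round trip of latency.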
For an application at scale, this leads rapidly to a situation where the database is the bottleneck. For distributed applications, the round-trip latency for database access can quickly dominate performance. For an application processing hundreds of thousands of events per second, the only way to reduce latency is to execute application logic in the memory context of each impacted entity, avoiding database latency entirely. That is exactly what Swim does.
There’s another reason that smarter databases can’t help with continuous intelligence: they don’t drive the computation of insights or “push” results to users. The inversion of the control loop is fundamental: in most applications that claim to be real-time, a query to the database drives computation, and the results are delivered to the user. But that’s not enough for today’s continuous intelligence use cases. Users want real-time responses driven by analysis, learning and prediction as it occurs, concurrently for all entities in the system. They want the application to always have an answer. Databases don’t do that.
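The inverted control loop amounts to an observer pattern over stateful entities. Here is a minimal, hypothetical sketch (the class, `analyze` logic and threshold are illustrative, not any particular product's API): each state change drives analysis, and the result is pushed to subscribers, so an answer is always current rather than computed on demand.

```python
class StreamingEntity:
    """A stateful entity that pushes derived results to subscribers
    whenever its state changes, instead of waiting to be queried."""

    def __init__(self):
        self.state = {}
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callback to receive every new insight."""
        self._subscribers.append(callback)

    def on_event(self, update):
        self.state.update(update)        # data drives the computation...
        insight = self.analyze()
        for cb in self._subscribers:     # ...and results are pushed out
            cb(insight)

    def analyze(self):
        # Placeholder analysis: the entity always has an answer ready.
        return {"status": "hot" if self.state.get("temp", 0) > 90 else "ok"}
```

Compare this with the query-driven loop: here no one ever asks "what is the status?"; the latest status arrives at every subscriber the moment it changes.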
Swim offers the first open core, enterprise-grade platform for operating continuous intelligence at scale. Built upon the open source SwimOS core, Swim Continuum powers contextual analytics, visualization and real-time responses on streaming and historical data in concert, providing businesses with complete situational awareness and operational decision support at every moment. For more information, visit us at www.swim.ai and follow us @swim.