The Journey to Continuous Intelligence (Part 2)
by Simon Crosby, on Jul 29, 2020 6:15:00 AM
A Technical Blog Series for Data Architects
This multi-part blog series is intended for data architects and anyone else interested in learning how to design modern real-time data analytics solutions. It explores key principles and implications of event streaming and streaming analytics, and concludes that the biggest opportunity to derive meaningful value from data, and to gain continuous intelligence about the state of things, lies in the ability to analyze, learn and predict from real-time events in concert with contextual, static and dynamic data. The series places continuous intelligence in an architectural context, with reference to established technologies and use cases in place today.
Event Streaming Applications
Brokers don’t execute applications - they simply act as a buffer between the real world and applications. The broker keeps an index per application and topic, so a stateful application that restarts can resume processing from the next event in each queue. A stateless application must instead replay each queue from the beginning. But replaying events that span a long period of (real-world) time is wasteful and leaves the application even further behind the real world, which never stops!
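The difference between resuming from a saved offset and replaying a whole queue can be sketched with a toy broker. This is an illustrative model only, not any real broker's API; the class and method names are invented for the example:

```python
from collections import defaultdict

class Topic:
    """A toy append-only topic queue that keeps a per-application
    read offset, mimicking how a broker lets a stateful consumer
    resume where it left off."""
    def __init__(self):
        self.events = []                  # append-only log of events
        self.offsets = defaultdict(int)   # app name -> next index to read

    def publish(self, event):
        self.events.append(event)

    def poll(self, app):
        """Return the next unread event for `app`, or None if caught up."""
        i = self.offsets[app]
        if i >= len(self.events):
            return None
        self.offsets[app] = i + 1
        return self.events[i]

    def reset(self, app):
        """A consumer that lost its state must replay from the beginning."""
        self.offsets[app] = 0

topic = Topic()
for e in ["e1", "e2", "e3"]:
    topic.publish(e)

assert topic.poll("app") == "e1"
assert topic.poll("app") == "e2"
# Restart with a saved offset: processing resumes at e3.
assert topic.poll("app") == "e3"
# Restart without state: every event must be replayed.
topic.reset("app")
assert topic.poll("app") == "e1"
```

The longer the app has been running, the more events a stateless restart must re-read before it catches up with the present.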
Every application that delivers useful insights needs a model of the system generating events. The stream may be used to feed a machine learning pipeline or to update a relational or graph database, but ultimately, reasoning about what each event means - the state changes it implies for an entity, for related entities, and for the system as a whole - requires a stateful model, which is typically saved in a database. The unfortunate consequence is that applications now must deal with two sources of delay: topic queues at the broker and a database at the application layer. Neither is predictable in its delay characteristics, making it impossible to bound response times.
If the application state is organized in a database, event streaming fits nicely into a microservices architecture. Applications can scale by adding stateless microservice instances to consume events. Each instance independently asks the broker for the next event - avoiding the need for a load balancer, and if an instance fails it can simply restart and carry on.
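The "no load balancer needed" property follows from each instance pulling work for itself. A minimal sketch, simulating two stateless instances that take turns asking a shared queue for the next event (the instance names and event payloads are invented for illustration):

```python
import itertools
import queue

# A shared topic queue standing in for the broker.
events = queue.Queue()
for i in range(6):
    events.put(f"event-{i}")

handled = {"instance-a": [], "instance-b": []}

# Each stateless instance independently asks for the next event when it
# is free; the queue itself spreads the load. If one instance failed,
# the other would simply keep polling and no events would be orphaned.
for name in itertools.cycle(handled):
    try:
        handled[name].append(events.get_nowait())
    except queue.Empty:
        break

assert handled["instance-a"] == ["event-0", "event-2", "event-4"]
assert handled["instance-b"] == ["event-1", "event-3", "event-5"]
```

Scaling out is then just starting another instance that begins polling the same queue.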
But applications are always stateful, and pushing the hard problems of distribution, consistency, availability, load balancing and resilience onto the database layer hurts performance and ultimately cannot deliver continuous intelligence: a database is, after all, simply a repository that reflects an application’s current model of the system - it does not compute on that model. Context is problematic too: processing a single event can trigger scores of database accesses as its knock-on effects change the states of many related entities, and each access adds to the end-to-end processing latency for every event.
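How knock-on updates inflate per-event latency can be made concrete with a small back-of-the-envelope model. The entity names, relationship graph and 2 ms round-trip cost below are all assumed numbers, chosen only to illustrate the arithmetic:

```python
# Assumed fixed cost for one database round trip, in milliseconds.
DB_ROUND_TRIP_MS = 2.0

# Assumed context: a sensor event also affects three related entities.
graph = {"sensor-1": ["intersection-7", "corridor-3", "city-grid"]}

def process_latency_ms(entity):
    """Database time spent handling one event for `entity`:
    one access for the entity itself, plus one per related entity."""
    accesses = 1 + len(graph.get(entity, []))
    return accesses * DB_ROUND_TRIP_MS

# Four accesses at 2 ms each - 8 ms per event before any analysis runs.
assert process_latency_ms("sensor-1") == 8.0
```

Even modest fan-out multiplies the floor on per-event latency, and real database delays are variable rather than fixed, so the bound only gets worse.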
Brokers Buffer Data
Historically, databases attempted to represent the current state of the system on disk. But the demand for data-driven computation of insights and responses means that the idea of a database as an organized repository for state is no longer sufficient. The application logic - analysis, learning and prediction - needs to be carried out in real time, continuously, as data flows into the application. Time is of the essence: if the app is too slow it will lose track of the real world and deliver useless insights, and “store-then-analyze” architectures fall into exactly this trap.
Unfortunately, topic queues managed by the broker introduce a serious problem:
- Applications can only process the events in a topic in arrival order, but they cannot control how many events are admitted to a topic’s queue in a given time window. If they process events too slowly they fall behind, or must drop events and lose fidelity. Yet the broker has no way to communicate the arrival rate or queue depth to the application so that it can scale its resources appropriately.
- How many topics does your app need? If you get it wrong, it’s difficult to fix later. Any choice that puts more than one data source on a topic means the app may have to read many unimportant or out-of-date events from the queue to find the information that matters, so there is no way to bound application response times. But if the answer is effectively “one source per topic”, the challenge becomes the broker’s limit on the number of topic queues: even a million topics is too few for large applications.
- An application might need to consume and discard many events that are not of interest before finding one that is critical. Since there is no way to know what is in the queue, there is no way to reason about the application’s response time: will it always find the critical event in time?
- Brokers are unopinionated about the meaning of events, with no notion of urgency, importance or irrelevance - to the broker they are just events in a queue. So an app may find itself processing useless or out-of-date events just to get to a critical insight, wasting time and making a real-time response impossible.
- Finally, brokers don’t reason across events: a series of events that, taken together, could trigger an application-layer response is never correlated or interpreted by the broker.
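The third and fourth points above can be sketched in a few lines. The event types and counts are invented for illustration; the point is only that a consumer must scan the queue in arrival order, because the broker cannot surface the one event that matters:

```python
# A queue of 100 routine readings followed by one critical event.
# The broker treats all 101 identically; only the consumer can tell
# them apart, and only by reading them one at a time, in order.
queue_contents = (
    [{"type": "heartbeat", "seq": i} for i in range(100)]
    + [{"type": "overload-alarm", "seq": 100}]
)

scanned = 0
critical = None
for event in queue_contents:
    scanned += 1
    if event["type"] == "overload-alarm":
        critical = event
        break

assert critical["seq"] == 100
assert scanned == 101   # 100 irrelevant events consumed first
```

Since the mix of events ahead of the critical one is unknown at run time, the scan length - and hence the response time - is unbounded.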
Swim offers the first open core, enterprise-grade platform for operating continuous intelligence at scale. Built upon the open source SwimOS core, Swim Continuum powers contextual analytics, visualization and real-time responses on streaming and historical data in concert, providing businesses with complete situational awareness and operational decision support at every moment. For more information, visit us at www.swim.ai and follow us @swim.