Analyze Kafka Events On-The-Fly
by Simon Crosby, on Sep 22, 2021 8:30:00 AM
Swim enhances the actor model to support continuous analysis of streaming data from millions of sources in a distributed runtime environment - using Java. Swim is the easiest way to build applications that continuously analyze streaming data from Apache Kafka.
Developers write simple, object-oriented Java applications and deploy the same code to each runtime instance, typically in containers, on Kubernetes. At runtime, an application is a distributed dataflow pipeline of streaming actors – a DAG whose vertices are concurrent actors that continuously analyze new data and stream their state changes over their links to other (possibly remote) actors.
Streaming actors are like “digital twins” of data sources that continuously analyze new data and stream their state updates over their links - relationships that they discover through continuous evaluation of parametric functions including geospatial (“near”) and analytical (“correlated”).
Since actors can create new actors, the runtime can build the application directly from streaming data: Each data source is represented by a streaming actor that analyzes its events from the real-world in the context of its current state and the states of actors to which it is linked. An application spans a distributed p2p mesh of execution instances. Each runtime instance hosts part of the DAG and caches a replica of the state of every remote actor to which its local actors are linked.
The Swim runtime streams actor state changes to remote replicas as op-based CRDTs. This frees actors to compute whenever they want to, using locally cached replicas of remote actor state. Remote replicas are updated using a cache coherency protocol - WARP - that delivers updates to all instances in at most ½ RTT.
Each actor is also a fully-fledged web service, so a browser-based UI object can subscribe to an actor’s streaming API to display a live, real-time view of its state. Swim is OSS licensed under Apache 2.0. Here's code for an app that consumes events from Kafka and renders a continuous live view in 3D for all known satellites.