How is Swim a Distributed Operating System?
by Chris Sachs, on Mar 19, 2019 1:16:46 PM
This is the last post in a three-part series about the future of the Web. You can read parts 1 and 2 here.
Like MS-DOS, Swim is "integrated software that consists of a persistent transactional memory system, a process scheduler, an application programming interface, and a unified I/O model," says Chris Sachs.
An operating system is integrated software that typically consists of a persistent file system, a stateful process execution scheduler, a system call interface, an inter-process communication mechanism, and a device I/O model. When designed properly, an operating system can provide the foundation for a vibrant ecosystem of applications and significantly alter the technology landscape.
Like an operating system, Swim is integrated software that consists of a persistent transactional memory system, a stateful Web Agent process scheduler, an application programming interface, and a unified I/O model. A traditional OS kernel may run on bare-metal hardware, or it may run as a guest OS managed by a hypervisor. A distributed OS kernel, like Swim, runs as processes managed by traditional OSes. Swim kernel processes cluster together to create a coherent, distributed compute fabric. Within this distributed fabric, Swim provides everything applications need to execute stateful, long-running, persistent, globally addressable processes that continuously communicate with each other and also with external systems.
A Higher Order Operating System
Swim was designed from first principles and built from scratch to be a fully functional, self-contained, distributed operating system. We designed Swim as a distributed operating system in order to create a simpler, more efficient, decentralized, general-purpose model for building distributed applications than the middleware-mess model that dominates today. Because Swim applications typically span many devices and VMs running different operating systems, Swim functions as a "higher order" operating system that unifies these heterogeneous environments.
To understand how we're able to do this, let’s start by discussing the three core concepts in Swim—Web Agents, Lanes, and Links—and how they relate to the components of a traditional operating system:
Processes ⇒ Web Agents
Processes become Web Agents. Think of a Web Agent as a distributed OS process that’s addressed by a URI instead of a machine-specific process ID. Web Agents are stateful and long-running, just like ordinary OS processes. Web Agents can set timers, participate in streaming I/O, and execute arbitrary code. Being self-contained allows Swim to move Web Agents between machines to optimize compute, memory, disk, and network load. And universal addressability makes Web Agents first-class citizens of a World Wide Web of stateful, streaming processes.
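To make the process analogy concrete, here is a minimal, single-process Python sketch. It is not Swim's API: `ToyAgent`, `ToyFabric`, and the `/building/1` style URIs are hypothetical names invented for illustration. It only shows the addressing idea, that a stateful agent is looked up by URI rather than by a machine-specific process ID and is created on first use.

```python
class ToyAgent:
    """A stateful, long-lived object standing in for a Web Agent (hypothetical)."""

    def __init__(self, uri):
        self.uri = uri
        self.state = {}  # state survives across commands

    def on_command(self, key, value):
        # Arbitrary code could run here; this toy just records the value.
        self.state[key] = value


class ToyFabric:
    """Maps URIs to agents, loosely as an OS maps process IDs to processes."""

    def __init__(self):
        self._agents = {}

    def agent(self, uri):
        # Instantiate on first reference; later lookups return the same agent.
        if uri not in self._agents:
            self._agents[uri] = ToyAgent(uri)
        return self._agents[uri]


fabric = ToyFabric()
fabric.agent("/building/1").on_command("temperature", 21.5)
fabric.agent("/building/2").on_command("temperature", 19.0)

# The same URI always resolves to the same stateful agent.
same_agent = fabric.agent("/building/1")
```

Because callers hold only the URI, a real fabric is free to place the agent on any machine; the toy registry above keeps everything in one process.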
Files ⇒ Lanes
Files become Lanes. You can think of every Web Agent as having its own dedicated file system, with each Web Agent’s “files” stored in its lanes. Similar to how files are identified by file paths, lanes are identified by lane URIs. Just as different machines can store different files with the same file name, different Web Agents can have different lanes with the same lane URI.
Traditional operating systems deal with many kinds of files: data files, device files, IPC pipes, and even network sockets. Likewise, Web Agents support many kinds of lanes: value lanes, map lanes, list lanes, join lanes, command lanes, and demand lanes, to name a few. Just as some kinds of files represent persistent data and other kinds handle I/O, some kinds of lanes store persistent data, while other kinds provide I/O mechanisms. Unlike ordinary files, all changes to lanes, whether local or remote, are observable to and governed by the Web Agent that owns the lane. This fact is essential to maintaining continuous, real-time consistency in a distributed environment.
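The lane idea can be sketched in a few lines of Python. This is an illustrative model, not Swim's implementation: `ToyValueLane`, `ToyThermostat`, and the `will_set` and `did_set` callbacks are hypothetical names chosen for this example. It shows two points from the text: two agents can each own a lane with the same lane URI, and every write passes through the owning agent, which can both observe and veto it.

```python
class ToyValueLane:
    """Holds one value; every write is routed through the owning agent."""

    def __init__(self, owner, lane_uri, initial=None):
        self.owner = owner
        self.lane_uri = lane_uri
        self.value = initial

    def set(self, new_value):
        # The owner governs the change: it may reject the write outright.
        if not self.owner.will_set(self.lane_uri, new_value):
            return False
        old_value, self.value = self.value, new_value
        # The owner observes the change after it is applied.
        self.owner.did_set(self.lane_uri, new_value, old_value)
        return True


class ToyThermostat:
    """A toy agent that owns a 'setpoint' lane and logs accepted changes."""

    def __init__(self, agent_uri):
        self.agent_uri = agent_uri
        self.log = []
        self.lanes = {"setpoint": ToyValueLane(self, "setpoint", 20)}

    def will_set(self, lane_uri, new_value):
        return 10 <= new_value <= 30  # reject out-of-range setpoints

    def did_set(self, lane_uri, new_value, old_value):
        self.log.append((lane_uri, old_value, new_value))


# Two agents, each with its own lane named "setpoint".
kitchen = ToyThermostat("/thermostat/kitchen")
attic = ToyThermostat("/thermostat/attic")

accepted = kitchen.lanes["setpoint"].set(22)  # in range, so it is applied
rejected = attic.lanes["setpoint"].set(99)    # out of range, so it is vetoed
```

A plain file cannot refuse a write or notify its owner on its own; routing every change through the owner is what the toy lane adds.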
Sockets ⇒ Links
Sockets become Links. To access a file, a process running on a traditional OS must first obtain a file handle, or file descriptor, to the target file. Analogously, to access a non-local lane, a Web Agent, or Web Agent client, must first open a WARP link to the intended lane. A file descriptor is used to provide a consistent view into the contents of a file. A link is used to provide a consistent view into the state of a remote lane.
File descriptors have coarse-grained, file-system-defined consistency semantics, with extremely limited backpressure and multiplexing capabilities. That is unsurprising, given file descriptors’ origin as an interface to local file systems. WARP links, by contrast, have fine-grained, end-to-end defined consistency semantics, with built-in backpressure regulation and multiplexed modulation. Unlike with file descriptors, any change to a remote lane is eventually observed by every link to that lane. Links update in real time, within the latency of the network, enabling continuous, bi-directional consistency of all views of a given lane. And stateful tracking of lane deltas lets links precisely synchronize their state, without the use of bufferbloat-inducing message queues.
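The contrast with message queues can be illustrated with a small model. This Python sketch is a toy under stated assumptions, not the WARP protocol: `ToyMapLane`, `ToyLink`, and the per-key version counter are hypothetical devices chosen for illustration, the link pulls its updates instead of having them pushed, and key removal is ignored. The lane remembers when each key last changed, and each link remembers the last version it saw. A slow link that syncs late receives only the latest value of each changed key, so intermediate updates never pile up in a queue.

```python
class ToyMapLane:
    """A keyed lane that tracks, per key, the version at which it last changed."""

    def __init__(self):
        self.version = 0
        self.entries = {}      # key -> current value
        self.changed_at = {}   # key -> version of the most recent change

    def put(self, key, value):
        self.version += 1
        self.entries[key] = value
        self.changed_at[key] = self.version

    def deltas_since(self, version):
        # Only the latest value of each key changed after `version`.
        return {key: self.entries[key]
                for key, changed in self.changed_at.items()
                if changed > version}


class ToyLink:
    """A remote view of a lane that catches up by applying deltas."""

    def __init__(self, lane):
        self.lane = lane
        self.synced_version = 0
        self.view = {}

    def sync(self):
        deltas = self.lane.deltas_since(self.synced_version)
        self.view.update(deltas)
        self.synced_version = self.lane.version
        return deltas


lane = ToyMapLane()
fast, slow = ToyLink(lane), ToyLink(lane)

lane.put("a", 1)
fast.sync()          # the fast link sees a=1 right away
lane.put("a", 2)
lane.put("b", 7)
lane.put("a", 3)

fast_deltas = fast.sync()  # catches up from version 1
slow_deltas = slow.sync()  # catches up from version 0 in a single step
```

Four writes happened, but the slow link transfers only two entries and still ends with the same view as the fast link. The work a link must do is bounded by the size of the lane's state, not by the number of updates it missed.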
The Building Blocks for Distributed Apps
"No developer would try to build a mobile app by first integrating a file system, IPC mechanism, process scheduler, and application framework. Instead they just choose Android or iOS," says Chris Sachs.
These three simple concepts (Web Agents, Lanes, and Links) efficiently and elegantly provide all of the capabilities of a distributed database, message broker, job manager, processing engine, application server, container manager, and serverless framework, which is everything a distributed application needs, without the significant cost, complexity, capability, and centralization restrictions inherited when integrating those components together.
No developer in her right mind would try to build a mobile app by first integrating a file system, IPC mechanism, process scheduler, and application framework. Instead she would just choose Android or iOS. We depend on operating systems to solve those problems for us. Contrast this with the fact that, in order to build a distributed app, developers must first integrate a database, message broker, job manager, and application server. Shouldn’t we have distributed operating systems solve these problems for us?
Of course, there’s much more to a traditional operating system than just processes, files, and sockets. OSes also have security models, monitoring and management interfaces, utility programs, and user interfaces. Swim has equivalents of these things too; we’ll discuss them in a future blog series. There’s also a significant amount of brass tacks engineering that goes into the implementation of Web Agents, Lanes, and Links; a future series will cover that too. But before diving deeper into Swim internals, it’s worth taking stock of what having a distributed operating system means for everyday developers, and the applications they can create. Part 4 of this series will cover that very topic. Stay tuned!
Other Posts in This Series
- Part 1 examines why a Distributed Operating System is necessary to kickstart the next generation of connected applications.
- Part 2 examines how the principles of the Web can be fused together with the fundamentals of a traditional OS to create a truly distributed operating system.
Learn More
Swim is an Open Source platform for building stateful data-driven applications that continually compute and stream their insights in real time. You can get Swim here.