Serverless Anti-Patterns

by Simon Crosby, on Mar 4, 2019 2:00:32 PM

Serverless computing is a seductive and powerful addition to a cloud developer’s arsenal. But despite its elegance, serverless is not the right architecture for most edge applications.

Combined with the rich portfolio of other services offered by major cloud platforms, serverless allows developers to focus on their app and leave most concerns about the infrastructure that runs it to the cloud vendor. Instead of worrying about how to instantiate, manage, secure, scale and orchestrate your application, you let the serverless service do it for you. Developers write apps, and the service hides the infrastructure: when an event arrives at the application’s URI, the service assigns it to a worker that dynamically loads the application code to process the event. When processing is complete, the execution context is deleted. Any state the application needs to persist between events must be saved in a database. You pay for the application execution time.
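
To make that request flow concrete, here is a minimal sketch of a stateless handler in Python. It is not any vendor's actual API: `FAKE_DB`, `handle_event`, and the event fields `device_id` and `reading` are hypothetical names, and a plain dictionary stands in for what would really be a database reached over the network.

```python
import json

# Hypothetical stand-in for an external database. In a real deployment this
# would be a network service, not a dictionary in the worker's memory.
FAKE_DB = {}


def handle_event(event):
    """Process one event without relying on anything kept in the worker."""
    key = event["device_id"]

    # Load prior state from the "database"; a missing key means first contact.
    raw = FAKE_DB.get(key, '{"count": 0, "total": 0.0}')
    state = json.loads(raw)

    # Do the actual (tiny) computation: maintain a running mean.
    state["count"] += 1
    state["total"] += event["reading"]

    # Persist the updated state before the execution context goes away.
    FAKE_DB[key] = json.dumps(state)

    return {"device_id": key, "mean": state["total"] / state["count"]}
```

Every call pays for one read and one write to the store, however small the computation in between.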

Serverless scales beautifully: As event rates increase, more workers concurrently execute your code.  Provided that the database can keep up, the application will automatically scale. And today’s cloud database services scale superbly.

So when is serverless not ideal?

  • Potential for lock-in: It’s important to realize that your application will be tied to a particular serverless offering – say AWS Lambda, Azure Functions or Google Cloud Functions. If app portability between providers is important to you, then you may be in trouble. I think this is a long-term risk, but I see no indication that vendors will gouge their customers.
  • Cold-start overhead: It’s useful to think of serverless as “forgetful computing” since each new event requires a new cold start: the worker must load and run your code before processing the request.  Applications can scale wide easily – lots of workers can concurrently process events - but the end-to-end latency for a single event will be higher than with a dedicated hot server that does not need to load the app code first.
  • Stateless means databases: Serverless processing is stateless, so some sort of database is needed to store any state required for computation across events. Typical access time for a database is 5 orders of magnitude slower than RAM, and in the cloud the database runs somewhere else, so there is additional delay due to networking, probably another order of magnitude. So processing is probably a factor of 10^6 slower than “hot” stateful processing in memory in a stateful VM or container-based service in the same cloud.
  • You pay for the database delay: Unfortunately, while your serverless function is waiting to load state from or write to a database, your code is “running” and billable by your cloud service provider. Sadly, that means you’re paying for about 10^6 idle cycles for every cycle of useful work. So the elegance and simplicity has a huge cost!
  • Getting to the cloud needs a network: For most use cases, a cloud-hosted serverless stack that can be easily located using the service URI is the simplest and most elegant way to get data from a device to a service. But if you process data locally, where it is generated, you avoid the additional network latency. Let’s conservatively say that packetization and transport delays add 20 ms of overhead each way.
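
The arithmetic behind the last three points can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constants are the rough estimates quoted in the list above, not measurements, and the function names are made up for illustration.

```python
# Rough estimates taken from the list above; none of these are measurements.
DB_VS_RAM_ORDERS = 5            # database access vs. RAM, orders of magnitude
NETWORK_ORDERS = 1              # extra delay for reaching the database in the cloud
EDGE_TO_CLOUD_MS_EACH_WAY = 20  # packetization and transport, each way


def slowdown_factor(db_orders=DB_VS_RAM_ORDERS, network_orders=NETWORK_ORDERS):
    """Combined slowdown relative to in-memory processing."""
    return 10 ** (db_orders + network_orders)


def useful_fraction(slowdown):
    """Share of billed cycles doing useful work if 1 cycle in `slowdown` is useful."""
    return 1 / slowdown


def round_trip_ms(each_way_ms=EDGE_TO_CLOUD_MS_EACH_WAY):
    """Network overhead for a request and its response."""
    return 2 * each_way_ms


factor = slowdown_factor()
print(f"slowdown factor: {factor:,}")
print(f"useful share of billed cycles: {useful_fraction(factor):.6%}")
print(f"edge-to-cloud round trip: {round_trip_ms()} ms")
```

Changing any of the three constants shows how sensitive the conclusion is to the estimates.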

Stateless is typically suboptimal for edge data: When devices at the edge change state quickly, the optimal data processing paradigm is a stateful in-memory model, at the edge. Swim uses stateful digital twins, automatically created from the data, to achieve this. For fixed devices, the location of the device is known and unchanging, so routing data to a CPU nearby is trivial. For mobile use cases, the round-trip time to a CPU at a nearby base station is substantial but still only a fraction of the cost of getting to the cloud.

Using a stateful processing model at the edge is simple and blindingly fast. Processing at memory speed at the edge is typically a factor of 10^8 faster than the cloud stack. Each event processed this way saves billions of CPU cycles compared to a serverless stack in the cloud. These cycles can be used for reinforcement learning, analysis, query processing and other edge intelligence functions.
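
As a rough illustration of the stateful in-memory idea, the Python sketch below keeps one object per device in local memory and creates it the first time data from that device shows up. This is not Swim's API: `DeviceTwin`, `twins`, `on_event`, and the event fields are hypothetical names chosen for the example.

```python
class DeviceTwin:
    """Hypothetical in-memory twin holding the running state of one device."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.count = 0
        self.total = 0.0

    def update(self, reading):
        """Fold a new reading into the state and return the running mean."""
        self.count += 1
        self.total += reading
        return self.total / self.count


# All twins live in local memory; nothing is fetched from a database per event.
twins = {}


def on_event(event):
    """Route an event to its twin, creating the twin on first sight."""
    device_id = event["device_id"]
    twin = twins.get(device_id)
    if twin is None:
        twin = twins[device_id] = DeviceTwin(device_id)
    return twin.update(event["reading"])
```

Each event costs a dictionary lookup and a few arithmetic operations, with no database read or write on the path.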

Let’s put that in human terms: If in-memory edge processing is the equivalent of walking one meter in one second (1 m/s), then using a serverless cloud function to do the same work would take over 3 years. You’d pay for all 3 years of time, but the CPU would only be busy for a second.
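
The "over 3 years" figure follows from stretching one second by a slowdown factor of 10^8 (one hundred million). A quick check in Python, where the function name is only illustrative:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.16e7 seconds


def scaled_duration_years(base_seconds, slowdown):
    """How long `base_seconds` of work takes when slowed down by `slowdown`."""
    return base_seconds * slowdown / SECONDS_PER_YEAR


years = scaled_duration_years(1.0, 10**8)
print(f"{years:.2f} years")  # prints "3.17 years"
```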

Learn More

Swim is an Open Source platform for building stateful data-driven applications that continually compute and stream their insights in real-time. You can get Swim here.

Topics: Machine Learning, Stateful Applications, SWIM Software, Industrial IOT, Smart Cities, Edge Analytics, Digital Twin, SWIM AI, Edge Computing, Swim Enterprise, distributed computing, serverless
