Do You Need Big Data, Fast Data, or Both?
by Brad Johnson, on Sep 30, 2018 12:57:19 AM
All Internet of Things (IoT) data has a time component. To derive maximum value from applications that use IoT data, it’s therefore critical to understand that different types of data are needed at different times. Evaluating the business needs behind an application project can help you identify whether you need Big Data, Fast Data, or both. Let’s use traffic as an example to identify the types of problems Big Data and Fast Data each seek to solve.
More servers and bigger databases aren't always the answer for a better application. The key is finding the right balance between Big and Fast Data for your needs.
City planners trying to identify the traffic impact of an upcoming construction project would likely take a Big Data approach. They may look at recorded traffic data from the surrounding area for the past few years, review traffic impact studies from similar construction projects, and create a model for the upcoming project. Using this model, they can project the likely impact of the construction under various scenarios and create a report of their findings. This is a Big Data approach because the planners are evaluating historical data in order to understand how a future event may unfold. The current traffic at the future construction site isn’t terribly relevant to their findings: traffic conditions change continuously, so a single snapshot is unlikely to help them predict the project's impact. A large set of historical data, however, lets the city planners evaluate a wide range of scenarios and determine the likely impact of their project. This is one of the key strengths of Big Data analysis.
Now, consider a city manager who is tasked with monitoring a major event in their city. The city manager must coordinate with police, event organizers, volunteers, and others in real-time in order to ensure a successful event with minimal impact to traffic in the city. In this case, the city manager cares most about Fast Data. Whereas the city planners in the previous scenario benefited from a large dataset of historical information, the city manager only cares about what is happening right now, so that they can react to new information in real-time and share insights with others on the ground. Big Data does not solve this problem, as it’s often too slow to deliver perishable insights in a timely manner. Streaming analytics help, but still require centralized databases to run an application. The ideal solution would process all data sources in parallel, without introducing any unnecessary latency, in order to provide the city manager with the most current information possible. Fast Data solutions help the city manager answer the question “what is going on right now?” and act quickly to ensure a successful event.
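To make the contrast concrete, here is a minimal Python sketch of the two kinds of question asked of the same traffic readings. Everything in it is an assumption made for illustration: the event layout, the intersection names, the sample values, and the `historical_average` and `current_snapshot` helpers are hypothetical and not part of any particular product.

```python
from statistics import mean
from typing import Dict, List, Tuple

# (timestamp_seconds, intersection, vehicles_per_min): made-up sample values
Event = Tuple[float, str, float]


def historical_average(events: List[Event], intersection: str) -> float:
    """Big Data style question: what is typical over the whole record?"""
    return mean(v for _, name, v in events if name == intersection)


def current_snapshot(events: List[Event], now: float, max_age: float) -> Dict[str, float]:
    """Fast Data style question: what is each intersection doing right now?

    Keeps only the newest reading per intersection and ignores any
    reading older than `max_age` seconds.
    """
    latest: Dict[str, Tuple[float, float]] = {}
    for ts, name, value in events:
        is_fresh = now - ts <= max_age
        if is_fresh and (name not in latest or ts > latest[name][0]):
            latest[name] = (ts, value)
    return {name: value for name, (_, value) in latest.items()}


events = [
    (0.0, "5th-and-main", 30.0),
    (60.0, "5th-and-main", 34.0),
    (120.0, "5th-and-main", 80.0),
    (30.0, "oak-and-2nd", 12.0),
]

avg = historical_average(events, "5th-and-main")
snapshot = current_snapshot(events, now=130.0, max_age=60.0)
```

The planner's question uses every recorded value, while the manager's question throws away anything that is no longer fresh. In this sample the long-run average hides the spike that the snapshot surfaces immediately.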
The reality is that many applications have both real-time and historical components, and therefore benefit from both Big Data and Fast Data approaches. Often, each approach requires different technological solutions in order to deliver the desired insights. However, finding the right balance can lead to much more efficient applications. For example, edge computing can improve real-time application performance by reducing and transforming data closer to the source, without incurring latency from unnecessary network hops. Meanwhile, edge computing benefits centralized Big Data databases, which are no longer overtaxed by processing raw, noisy, or redundant data. By knowing the difference between Big and Fast Data, and by clearly identifying what you need to know now vs. what you can evaluate later, you can achieve significant cost, efficiency, and productivity gains in your applications.
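As a rough illustration of reducing data closer to the source, the Python sketch below forwards a sensor reading only when it differs meaningfully from the last value that was forwarded for that sensor. It is one simple reduction technique (sometimes called a deadband filter), not a description of how any specific edge platform works. The `Reading` type, the `reduce_at_edge` function, the `min_change` threshold, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Reading:
    sensor_id: str           # hypothetical identifier for a roadside sensor
    timestamp: float         # seconds since an arbitrary start time
    vehicles_per_min: float  # made-up traffic measurement


def reduce_at_edge(readings: Iterable[Reading], min_change: float = 5.0) -> Iterator[Reading]:
    """Yield a reading only if it differs from the last forwarded value
    for the same sensor by at least `min_change`."""
    last_sent: Dict[str, float] = {}
    for r in readings:
        previous = last_sent.get(r.sensor_id)
        if previous is None or abs(r.vehicles_per_min - previous) >= min_change:
            last_sent[r.sensor_id] = r.vehicles_per_min
            yield r


raw = [
    Reading("main-st", 0.0, 40.0),  # first reading for this sensor: forwarded
    Reading("main-st", 1.0, 41.0),  # small change: dropped
    Reading("main-st", 2.0, 42.0),  # still within 5 of the last forwarded value: dropped
    Reading("main-st", 3.0, 55.0),  # large change: forwarded
    Reading("oak-ave", 3.0, 10.0),  # first reading for this sensor: forwarded
]

forwarded = list(reduce_at_edge(raw))
```

In this sample, only three of the five raw readings leave the edge, so the central database stores less redundant data while real-time consumers still see every significant change. The right threshold depends on the application, and a real deployment would also need to decide how to handle sensors that stay quiet for a long time.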
Learn more about how SWIM can integrate Fast Data into your Big Data applications by visiting www.swim.ai.