Original article is published at Forbes: link
Have you heard about the magic pill? Not sure how it works, but it helps you lose 20 pounds in a week while consuming the same calories as before. And you’ve probably also heard about the scary side effects of that pill. The need for magic pills is appearing in the IoT market as well. Thanks to the explosion of sensors to measure everything imaginable within the Internet of Things, enterprises are confronted with a never-ending buffet of tempting data.
Typically, data has been consumed like food: first it is grown, harvested, and prepared. Then this enjoyable meal is ingested into a data warehouse and digested through analytics. Finally, we extract the nutritional value and put it to work to improve some part of our operations. Enterprises have evolved to consume data from CRM, ERP, and even the Web in this genteel, managed manner. Such data is high in signal nutrition, and from it they can project trends or derive useful BI.
The IoT and its superabundance of sensors completely change that paradigm, and we need to give serious consideration to our data dietary habits if we want to succeed in this new data food chain. Rather than being served nicely prepared data meals, consuming sensor data is the equivalent of opening your mouth in front of some kind of cartoon food fire hose. Data comes in real time, completely raw, and in such sustained volume that all you can do is keep stuffing it down.
And, as you would expect, your digestion will be compromised. You won’t benefit from that overload of raw IoT data. In fact, we’ll need to change our internal plumbing, our data pipelines, to get the full nutritional benefit of IoT sensor data.
That will require work, but if you can process the data and extract the value, that’s where the real power comes in. In fact, you can attain something like superpowers. You can have the eyesight of eagles (self-driving cars), the sonar wave perception of dolphins (for detecting objects in the water), and the night vision of owls (for surveillance cameras).
If we can digest all this sensor data and use it in creative ways, the potential is enormous. But how can we adapt to handle this sort of data? Doing so demands a new infrastructure with massive storage, real-time ingestion, and multi-genre analytics.
Massive storage. More than five years ago, Stephen Brobst predicted that the volume of sensor data would soon crush the amount of unstructured data generated by social media (remember when that seemed like a lot?). Sensor data demands extreme scalability.
Real-time ingestion. The infrastructure needs to be able to ingest raw data and determine moment by moment where to land it. Some data demands immediate reaction and should move into memory. Other data is needed in the data warehouse for operational reporting and analytics. Still other data will add benefit as part of a greater aggregation using Hadoop. Instant decisions will help parse where cloud resources are appropriate versus other assets.
Multi-genre analytics. When you have data that you’ve never seen before, you need to transform data and apply different types of algorithms. Some may require advanced analytics and some may just require a standard deviation. Multi-genre analytics allows you to apply multiple analytics models in various forms so that you can quickly discern the value of the data.
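To make the multi-genre idea a little more concrete, here is a minimal Python sketch that runs more than one kind of analysis over the same batch of readings. It is an illustration only: the names `std_dev`, `trend_slope`, `ANALYSES`, and `run_all` are hypothetical, and the two analyses (a standard deviation and a least-squares trend line) simply stand in for whatever mix of simple and advanced models a real platform would offer.

```python
from statistics import mean, pstdev


def std_dev(values):
    """Population standard deviation: the 'simple' end of the spectrum."""
    return pstdev(values)


def trend_slope(values):
    """Least-squares slope of the readings against their position in the batch."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to fit a trend")
    x_bar = (n - 1) / 2          # mean of the indices 0..n-1
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


# Hypothetical registry: each entry is one "genre" of analysis.
ANALYSES = {"std_dev": std_dev, "trend_slope": trend_slope}


def run_all(values):
    """Apply every registered analysis to the same readings."""
    return {name: analysis(values) for name, analysis in ANALYSES.items()}
```

Calling `run_all` on a list of readings returns one result per analysis, so unfamiliar data can be looked at from several angles at once before deciding which view is worth keeping.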
The self-driving car is a helpful metaphor. I’ve heard estimates that each vehicle has 60,000 sensors generating terabytes of data per hour. Consider the variety of that data. Data for obstacle detection requires millisecond response and must be recognized as such if it is to be useful. A sensor on the battery to predict replacement requires aggregation to predict a trend over time and does not require real-time responsiveness. Nevertheless, both types of data are being created constantly and must be directed appropriately based on the use case.
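The routing decision in that example can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a description of any real vehicle platform: the `Reading` type, the `max_latency_ms` field, the two thresholds, and the destination names are all hypothetical, chosen to mirror the three landing zones described earlier (memory, the data warehouse, and large-scale aggregation).

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    value: float
    max_latency_ms: int  # hypothetical: how quickly this reading must be acted on


# Assumed thresholds, for illustration only.
IMMEDIATE_MS = 10        # react-now data, such as obstacle detection
OPERATIONAL_MS = 60_000  # data needed for near-term operational reporting


def route(reading):
    """Pick a landing zone for a reading based on how urgent it is."""
    if reading.max_latency_ms <= IMMEDIATE_MS:
        return "memory"
    if reading.max_latency_ms <= OPERATIONAL_MS:
        return "warehouse"
    return "batch_aggregation"


obstacle = Reading("front_obstacle", 0.8, max_latency_ms=5)
battery = Reading("battery_voltage", 12.1, max_latency_ms=86_400_000)
destinations = {r.sensor: route(r) for r in (obstacle, battery)}
```

Here the obstacle reading lands in memory and the battery reading goes to batch aggregation. In practice the urgency would come from the use case rather than a single number on the reading, but the shape of the decision is the same.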
How does this work at scale? Consider video games. Real-time data is critical to everything from in-game advertising, which depends on near-instant delivery of the right ad at a contextually appropriate moment, to recommendations and game features that are critical to the user experience and highly specific to moments within the game. At the same time, analyzing patterns at scale is critical to understanding and controlling churn and appeal. This is a lot of data to parse on the fly in order to operate effectively.
From a data perspective, we’re going to need a new digestive system if we are to make the most of the data coming in from the IoT. We’ll need vision and creativity as well. It’s an exciting time to be in analytics.