Data quality is one of the few fundamentals to consider when architecting an IoT solution. Adopting a "pull" model for data sharing makes it easier to keep dirty data from infecting other systems and jeopardizing data quality across your enterprise environment.
Highly regulated industries have a head start in this area - companies that operate in the food, healthcare, and processing industries are already required to collect data and report it to government agencies. In some industries, the difference between a temperature of 30 degrees and 35 degrees amounts to significant lost revenue. A company operating a fleet of refrigerated tractor-trailers may need to track cargo temperature on a continual basis along its journey. A unified system of record ensures a definitive source of truth for answering these regulatory questions.
In the "pull" model, systems that make use of incoming IoT data ask for what they want and pull it from a single source: the system of record. You send a request, and you get the data.
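As a rough illustration of that request-and-response shape, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `SystemOfRecord` class, its `record` and `pull` methods, the `Reading` fields, and the device names and timestamps are hypothetical, not part of any particular IoT platform.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Reading:
    device_id: str
    taken_at: datetime
    temperature: float


class SystemOfRecord:
    """Hypothetical single store that every consumer pulls from."""

    def __init__(self) -> None:
        self._readings: List[Reading] = []

    def record(self, reading: Reading) -> None:
        # Devices write here and nowhere else.
        self._readings.append(reading)

    def pull(self, device_id: str, start: datetime, end: datetime) -> List[Reading]:
        # A consumer receives only the slice it asked for, in time order.
        return sorted(
            (r for r in self._readings
             if r.device_id == device_id and start <= r.taken_at <= end),
            key=lambda r: r.taken_at,
        )


# Made-up sample data for two trailers.
store = SystemOfRecord()
store.record(Reading("trailer-7", datetime(2024, 1, 1, 8, 0), 31.0))
store.record(Reading("trailer-7", datetime(2024, 1, 1, 9, 0), 36.5))
store.record(Reading("trailer-9", datetime(2024, 1, 1, 8, 30), 30.2))

# A downstream system asks for one trailer and one time window.
morning = store.pull(
    "trailer-7", datetime(2024, 1, 1, 7, 0), datetime(2024, 1, 1, 8, 30)
)
```

Nothing reaches a consumer unless it asks, and no consumer needs to keep its own copy of the readings.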
Many IoT systems adopt a "push" model, which streams data to all users, all the time. A common problem with this model in a highly regulated industry is keeping protected data clean. Once you push dirty data to all users, you've lost operational control of it. You can't retract it and you can't clean it. If you have a chain of custody you may have virtual breadcrumbs that allow you to find and clean data once you realize it's dirty, but you'd have to find and clean it everywhere it went. That's difficult because a push model also encourages you to save copies of the data across your system.
How do you clean data that has been distributed "shotgun-style" across multiple data stores? It's not easy. Fixing a problem in one data store doesn't fix the same problem in another, so finding and cleaning every copy of dirty data across your IoT system becomes virtually impossible. In the pull-based model, by contrast, users only receive the data they specifically request. A system of record provides a single source for everything that has happened in the system, including which events have resulted in clean or dirty data.
When, for example, the doors of a refrigerated tractor-trailer are opened, the company's fleet manager can report where the truck stopped, what time the doors were opened, and at what time the temperature in the cargo hold went out of the range set by the regulatory standard. When reports show unexpected results, or a device malfunctions, the manager can also track down when a sensor failed, see what data from it was used by downstream systems (throwing off calculations), and then clean this data retroactively.
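One way to picture that retroactive cleanup is a system of record that logs every pull. The Python sketch below is only an illustration under stated assumptions: the `AuditedStore` class, its method names, the simplified hour-based timestamps, and the consumer names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Reading:
    reading_id: int
    sensor_id: str
    hour: int            # simplified timestamp: hours since departure
    temperature: float
    dirty: bool = False


class AuditedStore:
    """Hypothetical system of record that logs every pull."""

    def __init__(self) -> None:
        self._readings: Dict[int, Reading] = {}
        self._pulled_by: Dict[int, Set[str]] = {}

    def record(self, reading: Reading) -> None:
        self._readings[reading.reading_id] = reading
        self._pulled_by[reading.reading_id] = set()

    def pull(self, consumer: str, sensor_id: str) -> List[Reading]:
        # Return clean readings only, and remember who received each one.
        result = [r for r in self._readings.values()
                  if r.sensor_id == sensor_id and not r.dirty]
        for r in result:
            self._pulled_by[r.reading_id].add(consumer)
        return result

    def mark_dirty_since(self, sensor_id: str, failed_at_hour: int) -> Set[str]:
        # Flag readings taken at or after the failure, in one place,
        # and report which consumers had already pulled them.
        affected: Set[str] = set()
        for r in self._readings.values():
            if r.sensor_id == sensor_id and r.hour >= failed_at_hour:
                r.dirty = True
                affected |= self._pulled_by[r.reading_id]
        return affected


# Made-up sample data: the third reading comes from a failed sensor.
store = AuditedStore()
store.record(Reading(1, "cargo-temp", 1, 31.0))
store.record(Reading(2, "cargo-temp", 2, 31.4))
store.record(Reading(3, "cargo-temp", 3, 85.0))

store.pull("billing", "cargo-temp")                     # pulled before the failure was known
affected = store.mark_dirty_since("cargo-temp", 3)      # consumers that used bad data
clean = store.pull("compliance-report", "cargo-temp")   # later pulls skip dirty readings
```

Because the bad reading is flagged once, in the system of record, every later pull sees the corrected view, and the pull log names the downstream systems whose calculations need to be rerun.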
As IoT devices become more pervasive, big data multiplies, and more users and systems request access to data, adoption of a system of record and the "pull" model will become even more important for long-term reliability and production of value.
Photo Credit: John Borthwick