

Guest blog by Jin Kim, VP Product Development for Objectivity, Inc.

Almost any popular, fast-growing market experiences at least a bit of confusion around terminology. Multiple firms are frantically competing to insert their own “marketectures,” branding, and colloquialisms into the conversation with the hope their verbiage will come out on top.

Add in the inherent complexity at the intersection of Business Intelligence and Big Data, and it’s easy to understand how difficult it is to discern one competitive claim from another. Everyone and their strategic partner is focused on “leveraging data to glean actionable insights that will improve your business.” Unfortunately, the process involved in achieving this goal is complex, multi-layered, and very different from application to application depending on the type of data involved.

For our purposes, let’s compare and contrast two terms that are starting to be used interchangeably – Information Fusion and Data Integration. These two terms in fact refer to distinctly separate functions with different attributes. By putting them side-by-side, we can showcase their differences and help practitioners understand when to use each.

Before we delve into their differences, let’s take a look at their most striking similarity. Both technologies and their associated best practices are designed to integrate and organize data coming in from multiple sources and present a unified view of it, making it easier for analytics applications to consume that data and derive the “actionable insights” everyone is looking to generate.

However, Information Fusion diverges from Data Integration in a few key ways that make it much more appropriate for many of today’s environments.

• Data Reduction – Information Fusion is, first and foremost, designed to enable data abstraction. So, while Data Integration focuses on combining data to create consumable data, Information Fusion frequently involves “fusing” data at different abstraction levels and differing levels of uncertainty to support a narrower set of application workloads.

• Handling Streaming/Real-Time Data – Data Integration is best used with data-at-rest or batch-oriented data. The problem is that the most compelling applications associated with Big Data and the Industrial Internet of Things are often based on streaming sensor data. Information Fusion is capable of integrating, transforming, and organizing all manner of data (structured, semi-structured, and unstructured), and time-series data in particular, for use by today’s most demanding analytics applications to bridge the gap between Fast Data and Big Data. Put another way, Data Integration creates an integrated set of data in which the larger set is retained. By comparison, Information Fusion uses multiple techniques to reduce the amount of stateless data and provide only the stateful, valuable, and relevant data to deliver improved confidence.

• Human Interfaces – Information Fusion also gives a human analyst the opportunity to incorporate their own contributions into the data in order to further reduce uncertainty. By adding inferences and detail that can only be derived through human analysis, and saving them into existing and new data, organizations are able to maximize their analytics efforts and deliver a more complete “Big Picture” view of a situation.
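
The contrast drawn in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's or any vendor's method: the `Reading` type, the `integrate` and `fuse` helpers, and the sample values are all hypothetical, and inverse-variance weighting is just one simple fusion technique chosen here to show how uncertain inputs can be reduced to a single, more confident estimate.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    source: str
    value: float      # measured quantity, e.g. a temperature
    variance: float   # uncertainty of the source (assumed known)


def integrate(*batches):
    """Data Integration: combine sources into one set and retain every record."""
    return [r for batch in batches for r in batch]


def fuse(readings):
    """One possible fusion technique: inverse-variance weighting.

    Collapses several uncertain readings into a single estimate whose
    variance is lower than that of any individual input.
    """
    weights = [1.0 / r.variance for r in readings]
    total = sum(weights)
    value = sum(w * r.value for w, r in zip(weights, readings)) / total
    return Reading("fused", value, 1.0 / total)


# Hypothetical sample data: two sensors observing the same quantity.
sensor_a = [Reading("a", 20.0, 4.0)]
sensor_b = [Reading("b", 22.0, 1.0)]

combined = integrate(sensor_a, sensor_b)   # both records are kept
estimate = fuse(combined)                  # one record, reduced uncertainty
```

Here `combined` still holds every input record, while `estimate` is a single value weighted toward the more trustworthy sensor, which mirrors the "larger set is retained" versus "reduce and improve confidence" distinction.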

As you can see, Information Fusion, unlike Data Integration, focuses on deriving insight from real-time streaming data and enriching this stream with semantic context from other Big Data sources. This is a critical distinction, as today’s most advanced, mission-critical analytical applications are starting to look to Information Fusion to add real-time value.

Originally posted on Data Science Central

Follow us @IoTCtrl | Join our Community


Brontobytes, Yottabytes, Geopbytes, and Beyond

Guest blog post by Bill Vorhies

Now that everyone is thinking about IoT and the phenomenal amount of data that will stream past us and presumably need to be stored, we need to break out a vocabulary well beyond our comfort zone of mere terabytes (about the size of a good hard drive on your desk).
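
To give a feel for the scale of that vocabulary, here is a small arithmetic sketch. Terabyte through yottabyte use standard SI decimal prefixes; brontobyte and geopbyte are informal labels, and the exponents 27 and 30 used for them below are an assumption based on common usage rather than any standard. The `UNITS` table and `ratio` helper are hypothetical names for illustration.

```python
# Decimal exponents: 1 unit = 10**exponent bytes.
# "tera" to "yotta" are SI prefixes; "bronto" and "geop" are informal
# labels, assumed here to mean 10**27 and 10**30 bytes.
UNITS = {
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
    "zettabyte": 21,
    "yottabyte": 24,
    "brontobyte": 27,  # informal, assumed exponent
    "geopbyte": 30,    # informal, assumed exponent
}


def ratio(larger, smaller):
    """How many `smaller` units fit into one `larger` unit."""
    return 10 ** (UNITS[larger] - UNITS[smaller])


# Number of one-terabyte desktop drives needed to hold one geopbyte.
drives_per_geopbyte = ratio("geopbyte", "terabyte")
```

Under these assumptions a single geopbyte corresponds to 10**18 one-terabyte drives.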

In the article Beyond Just “Big” Data, author Paul McFedries argues for nomenclature even beyond Geopbytes (and I'd never heard of that one). There is a presumption, though, that all of that IoT data actually needs to be stored, and that presumption is misleading. We may want to store some big chunks of it, but increasingly our tools allow for 'in stream analytics' and for filtering the stream to identify only the packets we're interested in. I don't know that we'll ever need to store Geopbytes, but you'll enjoy his argument. Use the link Beyond Just “Big” Data.
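
As a rough illustration of filtering the stream rather than storing all of it, the sketch below uses a Python generator so that uninteresting packets are dropped as they pass. The packet layout, the `sensor_stream` stand-in, and the temperature threshold are all hypothetical; a real deployment would use a streaming platform rather than an in-process loop.

```python
def packets_of_interest(stream, predicate):
    """Yield only matching packets; everything else is dropped, never stored."""
    for packet in stream:
        if predicate(packet):
            yield packet


def sensor_stream():
    """Stand-in for an unbounded IoT feed (hypothetical data)."""
    for i in range(1_000):
        yield {"device": f"dev-{i % 10}", "temp_c": 20 + (i % 50)}


# Keep only unusually hot readings; the rest of the stream is discarded.
hot = packets_of_interest(sensor_stream(), lambda p: p["temp_c"] >= 65)
stored = list(hot)
```

Only a small fraction of the 1,000 simulated packets ends up in `stored`, which is the point being made about not needing to keep everything.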

Here's the beginning of his thoughts:

Beyond Just “Big” Data

We need new words to describe the coming wave of machine-generated information

When Gartner released its annual Hype Cycle for Emerging Technologies for 2014, it was interesting to note that big data was now located on the downslope from the “Peak of Inflated Expectations,” while the Internet of Things (often shortened to IoT) was right at the peak, and data science was on the upslope. This felt intuitively right. First, although big data—those massive amounts of information that require special techniques to store, search, and analyze—remains a thriving and much-discussed area, it’s no longer the new kid on the data block. Second, everyone expects that the data sets generated by the Internet of Things will be even more impressive than today’s big-data collections. And third, collecting data is one significant challenge, but analyzing and extracting knowledge from it is quite another, and the purview of data science.


Guest blog post by Ajit Jaokar

Often, Data Science for IoT differs from conventional data science due to the presence of hardware.

Hardware could be involved in integration with the Cloud or Processing at the Edge (which Cisco and others have called Fog Computing).

Alternatively, we see entirely new classes of hardware specifically involved in Data Science for IoT (such as the synapse chip for Deep learning).

Hardware will increasingly play an important role in Data Science for IoT.

A good example comes from a company called Cognimem, which natively implements classifiers (unfortunately, the company does not seem to be active any more, judging by their Twitter feed).

In IoT, speed and real time response play a key role. Often it makes sense to process the data closer to the sensor.

This allows a limited or summarized data set to be sent to the server if needed, and also allows for localized decision making. This architecture leads to a flow of information out from the Cloud and the storage of information at nodes which may not reside in the physical premises of the Cloud.
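
A minimal sketch of that idea, processing close to the sensor and sending only a summary upstream, might look like the following. The function names, the threshold, and the sample readings are hypothetical and chosen only for illustration; no particular edge platform or API is implied.

```python
from statistics import mean


def summarize(window):
    """Reduce a window of raw readings to the few numbers the server needs."""
    return {
        "count": len(window),
        "min": min(window),
        "max": max(window),
        "mean": mean(window),
    }


def edge_step(window, alarm_threshold):
    """Decide locally, then return (action, summary) for optional upload."""
    summary = summarize(window)
    action = "shutdown" if summary["max"] > alarm_threshold else "ok"
    return action, summary


# Hypothetical raw samples collected at the edge node.
raw = [71.2, 70.8, 72.5, 90.1, 71.0]
action, summary = edge_step(raw, alarm_threshold=85.0)
```

The local decision (`action`) is available immediately, without a round trip to the Cloud, and only the four-field `summary` needs to leave the device.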

In this post, I try to explore the various hardware touchpoints for Data analytics and IoT to work together.

Cloud integration: Making decisions at the Edge

The Intel Wind River edge management system is certified to work with the Intel stack and includes capabilities such as data capture, rules-based data analysis and response, configuration, file transfer, and remote device management.

Integration of Google Analytics into Lantronix hardware allows sensors to send real-time data to any node on the Internet or to a cloud-based application.

Microchip integration with Amazon Web Services uses an embedded application with the Amazon Elastic Compute Cloud (EC2) service. It is based on the Wi-Fi Client Module Development Kit, and languages like Python or Ruby can be used for development.

Integration of Freescale and Oracle consolidates data collected from multiple appliances from multiple Internet of Things service providers.


Libraries are another avenue for integrating analytics engines into products, often at the point of creation of the device. Xively cloud services are an example of this strategy, through the Xively libraries.


In contrast, […] provides APIs for IoT devices to create their own analytics engines (for example, the smartwatch Pebble’s use of […]) without locking equipment providers into a particular data architecture.

Specialized hardware

We see increasing deployment of specialized hardware for analytics. An example is Egburt from Camgian, which uses sensor fusion technologies for IoT.

In the Deep learning space, GPUs are widely used, and more specialized hardware is emerging, such as IBM’s synapse chip. Even more interesting hardware platforms are appearing as well, such as Nervana Systems, which creates hardware specifically for neural networks.

Ubuntu Core and IFTTT spark

Two more initiatives on my radar deserve a space of their own, even though neither of them currently has an analytics engine: Ubuntu Core (Docker containers plus a lightweight Linux distribution as an IoT OS) and the IFTTT spark initiatives.

Comments welcome

This post is leading toward a vision for a Data Science for IoT course/certification. Please sign up at the link if you wish to know more when it launches in Feb.

Image source: cognimem
