
In today's digital age, every business is using big data and machine learning to target users with messaging in a language they really understand, and to push offers, deals and ads that appeal to them across a range of channels.
With exponential growth in data from people and the Internet of Things, a key to survival is to use machine learning to make that data more meaningful and relevant, enriching the customer experience.
Machine learning can also wreak havoc on a business if improperly implemented. Before embracing this technology, enterprises should be aware of the ways machine learning can fall flat. Data scientists have to take extreme care while developing machine learning models so that they generate the right insights for the business to consume.
Here are five ways to improve the accuracy and predictive ability of a machine learning model and ensure it produces better results.
·       Ensure that you have a variety of data that covers almost all scenarios and is not biased toward any one situation. In the early days of Pokémon Go there were reports that it favored white neighborhoods, because the creators of the algorithm provided a training set that wasn't diverse and didn't spend time in other neighborhoods. Instead of working with limited data, ask for more data; that will improve the accuracy of the model.
·       Often the data received has missing values. Data scientists have to treat outliers and missing values properly to increase accuracy. There are multiple methods to do this – impute the mean, median or mode for continuous variables, and use a dedicated class for categorical variables. For outliers, either delete them or apply a transformation. (A minimal sketch follows this list.)
·       Finding the right variables or features that have maximum impact on the outcome is one of the key aspects. This comes from better domain knowledge and visualizations. It's imperative to consider as many relevant variables and potential outcomes as possible prior to deploying a machine learning algorithm.
·       Ensemble modeling combines multiple models, using techniques such as bagging and boosting, to improve accuracy. An ensemble can deliver better predictive performance than any of its single constituent models. Random forests are a popular choice for ensembling.
·       Re-validate the model at a proper frequency. It is necessary to score the model with new data every day, week or month, depending on how the data changes. If required, periodically rebuild the models with different techniques to challenge the model currently in production.
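To make the missing-value treatment above concrete, here is a minimal sketch using pandas and scikit-learn, with a tiny invented table (the column names are hypothetical):

import pandas as pd
from sklearn.impute import SimpleImputer

# Tiny invented dataset with gaps in a continuous and a categorical column.
df = pd.DataFrame({
    "age": [34.0, None, 52.0, 41.0, None],
    "segment": ["gold", "silver", None, "gold", "silver"],
})

# Continuous variable: impute the median ("mean" or "most_frequent" work the same way).
df["age"] = SimpleImputer(strategy="median").fit_transform(df[["age"]]).ravel()

# Categorical variable: treat missing values as their own class.
df["segment"] = df["segment"].fillna("missing")

# Outliers: one common transformation is capping at a high quantile.
df["age"] = df["age"].clip(upper=df["age"].quantile(0.99))

print(df)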
There are other ways, but the ones mentioned above are the foundational steps for ensuring model accuracy.
Machine learning puts a superpower in the hands of an organization, but as the Spider-Man movie says, "With great power comes great responsibility", so use it properly.
Read more…
The Internet of Things (IoT) began as an emerging trend and has now become one of the key elements of Digital Transformation, driving the world in many respects.
If your thermostat or refrigerator is connected to the Internet, it is part of the consumer IoT. If your factory equipment has sensors connected to the Internet, it is part of the Industrial IoT (IIoT).
IoT has an impact on end consumers, while IIoT has an impact on industries like Manufacturing, Aviation, Utility, Agriculture, Oil & Gas, Transportation, Energy and Healthcare.
IoT refers to the use of "smart" objects, which are everyday things from cars and home appliances to athletic shoes and light switches that can connect to the Internet, transmitting and receiving data and connecting the physical world to the digital world.
IoT is mostly about human interaction with objects. Devices can alert users when certain events or situations occur, or monitor activities (a toy sketch of this alert pattern follows the list):
·       Google Nest sends an alert when the temperature in the house drops below 68 degrees
·       Garage door sensors alert you when the door is left open
·       The heat turns up and the driveway lights turn on a half hour before you arrive home
·       A meeting room turns off its lights when no one is using it
·       The A/C switches off when windows are open
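As referenced above, most of these consumer scenarios boil down to the same pattern: poll a sensor and notify when a threshold is crossed. A toy Python sketch, with a simulated sensor standing in for a real device API:

import random
import time

THRESHOLD_F = 68.0

def read_temperature():
    # Simulated reading; a real thermostat API call would go here.
    return 65 + random.uniform(-3.0, 10.0)

def send_alert(message):
    # A real deployment would push a phone notification instead.
    print(message)

for _ in range(10):  # a real device would loop forever
    temp = read_temperature()
    if temp < THRESHOLD_F:
        send_alert(f"Temperature dropped below {THRESHOLD_F}F: {temp:.1f}F")
    time.sleep(1)  # poll interval; minutes in practice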
IIoT, on the other hand, focuses more on worker safety and productivity, and monitors activities and conditions with remote-control capability:
·       Drones to monitor oil pipelines
·       Sensors to monitor Chemical factories, drilling equipment, excavators, earth movers
·       Tractors and sprayers in agriculture
·       Smart cities, which might be a mix of commercial IoT and IIoT
IoT is important but rarely critical, while an IIoT failure often results in life-threatening or other emergency situations.
IIoT provides an unprecedented level of visibility throughout the supply chain. Individual items, cases, pallets, containers and vehicles can be equipped with auto identification tags and tied to GPS-enabled connections to continuously update location and movement.
IoT generates a medium or high volume of data, while IIoT generates very large amounts of data (a single turbine compressor blade can generate more than 500 GB of data per day), so it requires Big Data, cloud computing and machine learning as supporting technologies.
In the future, IoT will continue to enhance our lives as consumers, while IIoT will enable efficient management of the entire supply chain.
Read more…
Today, with the digitization of everything, 80 percent of the data being created is unstructured.
Audio, video, our social footprints, the data generated from conversations between customer service reps and customers, and the tons of legal document text processed in the financial sector are all examples of unstructured data stored in Big Data systems.
Organizations are turning to natural language processing (NLP) technology to derive understanding from the myriad unstructured data available online and in call logs.
Natural language processing (NLP) is the ability of computers to understand human language as it is spoken and written. NLP is a branch of artificial intelligence that has many important implications for the ways computers and humans interact. Machine learning has helped computers parse the ambiguity of human language.
Apache OpenNLP, the Natural Language Toolkit (NLTK) and Stanford NLP are open source NLP libraries used in the real-world applications below.
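As a quick taste of one of these libraries, here is a minimal NLTK sketch that tokenizes a sentence and tags each word's part of speech (the resource names below are NLTK's standard downloads and may vary slightly by version):

import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("Computers are learning to parse human language.")
print(nltk.pos_tag(tokens))
# e.g. [('Computers', 'NNS'), ('are', 'VBP'), ('learning', 'VBG'), ...]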
Here are multiple ways NLP is used today:
The most basic and well-known application of NLP is the spell checker in Microsoft Word.
Text analysis, also known as sentiment analysis, is a key use of NLP. Businesses are most concerned with comprehending how their customers feel emotionally and using that data to improve their service.
Email filters are another important application of NLP. By analyzing the emails that flow through their servers, email providers can calculate the likelihood that an email is spam based on its content, using Bayesian (e.g., naive Bayes) spam filtering.
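A minimal sketch of the naive Bayes approach with scikit-learn, trained on a tiny invented set of emails purely for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "win a free prize now",
    "cheap meds limited offer",
    "meeting agenda for tomorrow",
    "project status update attached",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words counts feed a multinomial naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)
print(model.predict(["free offer, act now"]))  # -> ['spam']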
Call center representatives engage with customers and hear specific complaints and problems. Mining this data for sentiment can lead to incredibly actionable intelligence that can be applied to product placement, messaging, design, or a range of other use cases.
Google, Bing and other search systems use NLP to extract terms from text to populate their indexes and to parse search queries.
Google Translate applies machine translation technologies in not only translating words, but in understanding the meaning of sentences to provide a true translation.
Many important decisions in financial markets use NLP, taking plain-text announcements and extracting the relevant information in a format that can be factored into algorithmic trading decisions. For example, news of a merger between companies can have a big impact on trading decisions, and the speed at which the particulars of the merger (the players, the prices, who acquires whom) can be incorporated into a trading algorithm can have profit implications in the millions of dollars.
Since the invention of the typewriter, the keyboard has been the king of the human-computer interface. But today, virtual assistants with voice recognition, like Amazon's Alexa, Google Now, Apple's Siri and Microsoft's Cortana, respond to vocal prompts and do everything from finding a coffee shop and getting directions to the office, to tasks like turning on the lights or switching on the heat at home, depending on how digitized and wired-up our lives are.
Question answering - IBM Watson is the most prominent example of question answering via information retrieval; it helps guide decisions in areas like healthcare, weather and insurance.
It is therefore clear that natural language processing plays a very important role in new machine-human interfaces. It's an essential tool for leading-edge analytics and is the near future.
Read more…

Want to know how to choose a Machine Learning algorithm?

Machine learning, which learns from the data provided to its algorithms, is the foundation for today's insights on customers, products, costs and revenues.
Some of the most common examples of machine learning are Netflix's algorithms, which suggest movies based on ones you have watched in the past, or Amazon's algorithms, which recommend products based on what other customers bought before.
Typical algorithm model selection can be decided broadly on following questions:
·        How much data do you have, and is it continuous?
·        Is it a classification or a regression problem?
·        Are the variables predefined (labeled), unlabeled, or a mix?
·        Is the data class skewed?
·        What is the goal – to predict or to rank?
·        Is the result easy or hard to interpret?
Here are the most used algorithms for various business problems:
 
Decision Trees: Decision tree output is very easy to understand, even for people from a non-analytical background. Reading and interpreting them requires no statistical knowledge, and they are among the fastest ways to identify the most significant variables and the relationships between two or more variables. Decision trees are excellent tools for helping you choose between several courses of action. The most popular decision tree algorithms are CART, CHAID and C4.5. (A minimal sketch follows the list below.)
In general, decision trees can be used in real-world applications such as:
·        Investment decisions
·        Customer churn
·        Banks loan defaulters
·        Build vs Buy decisions
·        Company mergers decisions
·        Sales lead qualifications
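As mentioned above, here is a minimal scikit-learn sketch; the bundled iris dataset stands in for a business dataset, and the learned rules print as plain if/else statements that can be read without any statistics background:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3).fit(data.data, data.target)

# The fitted tree rendered as readable decision rules.
print(export_text(tree, feature_names=list(data.feature_names)))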
 
Logistic Regression: Logistic regression is a powerful statistical way of modeling a binomial outcome with one or more explanatory variables. It measures the relationship between the categorical dependent variable and one or more independent variables by estimating probabilities with a logistic function, the cumulative logistic distribution. (A minimal sketch follows the list below.)
In general, regressions can be used in real-world applications such as:
·        Predicting the Customer Churn
·        Credit Scoring & Fraud Detection
·        Measuring the effectiveness of marketing campaigns
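A minimal churn-flavored sketch with scikit-learn, as referenced above, using two invented explanatory variables (tenure in months and support calls):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented training data: [tenure_months, support_calls] -> churned (1) or not (0).
X = np.array([[2, 5], [3, 4], [1, 6], [30, 1], [24, 0], [36, 2]])
y = np.array([1, 1, 1, 0, 0, 0])

clf = LogisticRegression().fit(X, y)

# The logistic function turns the linear score into a probability:
# p = 1 / (1 + exp(-(b0 + b1*x1 + b2*x2)))
print(clf.predict_proba([[4, 3]])[0, 1])  # estimated churn probability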
 
Support Vector Machines: A Support Vector Machine (SVM) is a supervised machine learning technique widely used in pattern recognition and classification problems where the data has exactly two classes. (A minimal sketch follows the list below.)
In general, SVM can be used in real-world applications such as:
·        detecting persons with common diseases such as diabetes
·        hand-written character recognition
·        text categorization – news articles by topics
·        stock market price prediction
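A minimal two-class SVM sketch with scikit-learn; the bundled breast cancer dataset stands in for a disease-detection problem like those above:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An RBF-kernel SVM separating the two classes.
clf = SVC(kernel="rbf").fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data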
 
Naive Bayes: This is a classification technique based on Bayes' theorem that is very easy to build and is particularly useful for very large data sets. Along with its simplicity, Naive Bayes is known to outperform even highly sophisticated classification methods, and it is a good choice when CPU and memory resources are a limiting factor. (A minimal sketch follows the list below.)
In general, Naive Bayes can be used in real-world applications such as:
·        Sentiment analysis and text classification
·        Recommendation systems like Netflix, Amazon
·        To mark an email as spam or not spam
·        Facebook like face recognition
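A minimal sketch with scikit-learn's Gaussian variant of naive Bayes, which suits continuous features (the multinomial variant is the usual choice for text):

from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
model = GaussianNB().fit(X, y)  # fast to fit, light on CPU and memory
print(model.predict(X[:3]))     # class predictions for the first rows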
 
Apriori: This algorithm generates association rules from a given data set. An association rule implies that if an item A occurs, then item B also occurs with a certain probability. (A minimal sketch follows the list below.)
In general, Apriori can be used in real-world applications such as:
·        Market basket analysis like amazon - products purchased together
·        Auto complete functionality like Google to provide words which come together
·        Identify Drugs and their effects on patients
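A minimal market-basket sketch using the Apriori implementation in the mlxtend library (one possible choice; other implementations exist), on a tiny invented set of baskets:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

# One-hot encode the transactions, then mine frequent itemsets and rules.
te = TransactionEncoder()
df = pd.DataFrame(te.fit(baskets).transform(baskets), columns=te.columns_)

frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])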
 
Random Forest: A random forest is an ensemble of decision trees. It can solve both regression and classification problems on large data sets, and it helps identify the most significant variables from among thousands of input variables. (A minimal sketch follows the list below.)
In general, Random Forest can be used in real-world applications such as:
·        Predict patients for high risks
·        Predict parts failures in manufacturing
·        Predict loan defaulters
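A minimal scikit-learn sketch, as referenced above; besides classifying, the fitted forest ranks which input variables matter most:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(data.data, data.target)

# Rank variables by how much each contributes to the trees' decisions.
ranked = sorted(zip(rf.feature_importances_, data.feature_names), reverse=True)
print(ranked[:5])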
The most powerful form of machine learning in use today is called "deep learning".
In today's Digital Transformation age, most businesses will tap into machine learning algorithms for their operational and customer-facing functions.
Read more…

A smart, highly optimized distributed neural network, based on Intel Edison "Receptive" Nodes

Training 'complex multi-layer' neural networks is referred to as deep learning because these multi-layer neural architectures interpose many neural processing layers between the input data and the predicted output results – hence the word "deep" in the deep-learning catchphrase.

While the training procedure for a large-scale network is computationally expensive, evaluating the resulting trained network is not, which explains why trained networks can be extremely valuable: they can very quickly perform complex, real-world pattern recognition tasks on a variety of low-power devices.

These trained networks can perform complex pattern recognition tasks for real-world applications ranging from real-time anomaly detection in Industrial IoT to energy performance optimization in complex industrial systems. These high-value, high-accuracy trained models (sometimes better than human) can be deployed nearly everywhere, which explains the recent resurgence of machine learning, and of deep-learning neural networks in particular.

These architectures can be efficiently implemented on Intel Edison modules to process information quickly and economically, especially in Industrial IoT applications.

Our architectural model is based on a proprietary algorithm, called Hierarchical LSTM, that is able to capture and learn the internal dynamics of physical systems simply by observing the evolution of the related time series.

To train the system efficiently, we implemented a greedy, layer-based parameter optimization approach, so each device can train one layer at a time and send the encoded features to the device at the next level up, which learns higher levels of abstraction of the signal dynamics.
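The Hierarchical LSTM algorithm itself is proprietary, so the following is only a generic Keras illustration of the greedy, layer-wise idea: fit one LSTM autoencoder on the raw windows, then train the next layer on the first layer's encoded features (all data and sizes here are invented):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

series = np.random.rand(100, 30, 1)  # 100 windows of 30 time steps, 1 signal

def train_layer(x, units):
    # Fit a single LSTM autoencoder layer and return its encoded features.
    steps, dims = x.shape[1], x.shape[2]
    inp = keras.Input(shape=(steps, dims))
    code = layers.LSTM(units, return_sequences=True)(inp)
    out = layers.TimeDistributed(layers.Dense(dims))(code)
    autoencoder = keras.Model(inp, out)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(x, x, epochs=5, verbose=0)
    return keras.Model(inp, code).predict(x, verbose=0)

codes1 = train_layer(series, 16)  # one device trains layer 1...
codes2 = train_layer(codes1, 8)   # ...the next device learns a higher abstraction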

Using Intel Edison modules as the layers' "core computing units", we can sustain higher sampling rates and frequent retraining close to the system we are observing, without the need for a complex cloud architecture, sending just a small amount of encoded data to the cloud.

Read more…

A Visual Introduction to Machine Learning

Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. It will play a big part in the IoT. From our friends at R2D3 comes a very interesting visual introduction to machine learning. Check it out here

Read more…

As we move towards widespread deployment of sensor-based technologies, three issues come to the fore: (1) many of these applications will need machine learning to be localized and personalized, (2) machine learning needs to be simplified and automated, and (3) machine learning needs to be hardware-based.

Beginning of the era of personalization of machine learning

Imagine a complex plant or machinery being equipped with all kinds of sensors to monitor and control its performance and to predict potential points of failure. Such plants can range from an oil rig out in the ocean to an automated production line. Or such complex plants can be human beings, perhaps millions of them, who are being monitored with a variety of devices in a hospital or at home.

Although we can use some standard models to monitor and compare the performance of these physical systems, it would make more sense to either rebuild these models from scratch or adjust them to individual situations. This would be similar to what we do in economics: although we might have some standard models to predict GDP and other economic variables, we would need to adjust each one of them to individual countries or regions to take into account their individual differences. The same principle of adjustment to individual situations would apply to sensor-based physical systems, and, similar to adjusting or rebuilding models of various economic phenomena, the millions of sensor-based models of our physical systems would have to be adjusted or rebuilt to account for differences in plant behavior.

We are, therefore, entering an era of personalization of machine learning at a scale that we have never imagined before. The scenario is scary because we wouldn't have the resources to pay attention to these millions of individual models. Cisco projects 50 billion devices to be connected by 2020 and the global IoT market size to be over $14 trillion by 2022 [1, 2].

 

The need for simplification and automation of machine learning technologies 

If this scenario of widespread deployment of personalized machine learning is to play out, we absolutely need automation of machine learning to the point that it requires far less expert assistance. Machine learning cannot continue to depend on high levels of professional expertise. It has to be simplified until it resembles automobiles and spreadsheets, where some basic training in high school can certify one to use the tools. Once we simplify the usage of machine learning tools, widespread deployment and usage of sensor-based technologies that also use machine learning would follow, creating plenty of new jobs worldwide. Thus, simplification and automation of machine learning technologies is critical to the economics of deploying and using sensor-based systems. It should also open the door to many new kinds of devices and technologies.

 

The need for hardware-based localized machine learning for "anytime, anywhere" deployment and usage 

Although we talk about the Internet of Things, it would simply be too expensive to transmit all of the sensor-based data to a cloud-based platform for analysis and interpretation. It would make sense to process most of the data locally. Many experts predict that, in the future, about 60% of data will be processed at the local level, in local networks; most of it may simply be discarded after processing, and only some stored locally. There is a name for this kind of local processing: it's called "edge computing" [3].

The main characteristics of the data generated by these sensor-based systems are high velocity, high volume, high dimensionality and streaming delivery. Not many machine learning technologies can learn in such an environment other than hardware-based neural network learning systems. The advantages of neural network systems are: (1) learning involves simple computations, (2) learning can take advantage of massively parallel brain-like computations, (3) they can learn from all of the data instead of samples of data, (4) scalability issues are non-existent, and (5) implementations on massively parallel hardware can provide real-time predictions in microseconds. Thus, massively parallel neural network hardware can be particularly useful with high-velocity streaming data in these sensor-based systems. Researchers at Arizona State University, in particular, are working on such a technology, and it is available for licensing [4].

 

Conclusions

Hardware-based localized learning and monitoring will not only reduce the volume of Internet traffic and its cost, it will also reduce (or even eliminate) the dependence on a single control center, such as the cloud, for decision-making and control. Localized learning and monitoring will allow for distributed decision-making and control of machinery and equipment in IoT.

We are gradually moving to an era where machine learning can be deployed on an “anytime, anywhere” basis even when there is no access to a network and/or a cloud facility.

 

References

1. Gartner (2013). "Forecast: The Internet of Things, Worldwide, 2013." https://www.gartner.com/doc/2625419/forecast-internet-things-worldwide-
2. "10 Predictions for the Future of the Internet of Things."
3. "Edge Computing."
4. "Neural Networks for Large Scale Machine Learning."

 

Read more…
