
Mapping the Internet of Things

You would think that in this day and age of infographics, a map laying out the ecosystem of the Internet of Things would be easy to find. Surprisingly, a Google search doesn’t appear to return much. Neither does a Twitter search.

Recently, though, I found two worth sharing: one from Goldman Sachs and the other, which I found very interesting, from Chris McCann: A Map of The Internet of Things Market.

Goldman Sachs’ map is pretty generic, but it takes IoT-related items all the way from the consumer to the Industrial Internet. In a September 2014 report, “The Internet of Things: Making sense of the next mega-trend”, Goldman states that IoT is emerging as the third wave in the development of the Internet. Much of what we hear about today is on the consumer end of the spectrum: early, simple products like fitness trackers and thermostats. On the other end of the spectrum, and what I think IoT Central is all about, is the Industrial Internet. The opportunity in the global industrial sector will dwarf consumer spend. Goldman states that the industrial sector is poised to undergo a fundamental structural change akin to the industrial revolution as we usher in the IoT. All equipment will be digitized and more connected and will establish networks between machines, humans, and the Internet, leading to the creation of new ecosystems that enable higher productivity, better energy efficiency, and higher profitability. Goldman predicts that the IoT opportunity for Industrials could amount to $2 trillion by 2020.

IOT-map_goldmansachs2014.png

 

Chris McCann, who works at Greylock Partners, has an awesome map of the Internet of Things Market (below). This is what venture capitalists do of course - analyze markets and find opportunities for value by understanding the competitive landscape. This map is great because I think it can help IoT practitioners gain a better understanding of the Internet of Things market and how all of the different players fit together.

The map is not designed to be comprehensive, but given the dearth of available guidance, this is a great starting point. The map is heavily geared towards the startup space (remember the author is a VC) and I think he leaves out a few machine-to-machine vendors, software platforms and operating systems.

Other maps I found that are interesting are:

Thingful, a search engine for the Internet of Things. It provides a geographical index of connected objects around the world, including energy, radiation, weather, and air quality devices as well as seismographs. Near me in earthquake-prone Northern California, I of course found a seismograph, as well as a weather station and an air quality monitoring station.

Shodan, another search engine of sorts for IoT.

And then there is this story of Rapid7’s HD Moore who pings things just for fun.

If you have any maps that you think are valuable, I would love for you to share them in the comments section.



1-N0LCH8hOGeO0x0BKYviYaA.png


The Internet of Things and the Right to Record

Today there are over 5 billion connected devices in the world that make up the Internet of Things (IoT). Research firms like IDC and Gartner predict that within five years’ time, this number will skyrocket to 25 billion. Although we often think of the ways these IoT devices can make our lives easier, make our homes smarter, improve manufacturing, and even revolutionize healthcare, there are some uses for IoT that aren’t as straightforward.

 

One of these is how IoT has changed our ability to record the world around us and immediately share what we capture. Combined with social media, this ‘right to record’ has raised the question of when it is appropriate to record. More importantly, is it legal?

The Legalities of Recording in Public

Smartphones, tablets, and even connected eyewear are all part of IoT, and they’re all capable of recording pictures and video. The most obvious example to look at is the phenomenon of members of the public recording law enforcement officers performing their duties.

  • A number of states have an ‘all parties consent’ law, requiring that subjects be made aware of any video, image, or audio capture that is taking place.
  • There is a qualification, however: the subjects must also have a reasonable expectation of privacy. Under that interpretation, filming in public places without consent would be acceptable and legal.
  • Illinois and Massachusetts have ‘all parties consent’ laws; however, they don’t allow for the provision regarding the expectation of privacy. In 2010, Tiawanda Moore was arrested for attempting to record law enforcement personnel with a cell phone. She was later acquitted of all charges (http://articles.chicagotribune.com/2011-08-25/news/ct-met-eavesdropping-trial-0825-20110825_1_eavesdropping-law-police-officers-law-enforcement).
  • It is not legal to record on private property, to make commercial gain from recorded material of another person’s likeness, or to use recordings to commit libel.

The Right to Record is a Two Way Street

Tech Republic, a leading trade publication for IT professionals, recently ran an opinion piece on how IoT and smart devices can cause controversy when it comes to the right to record. (http://www.techrepublic.com/article/the-right-to-record-is-not-a-question-of-technology-but-rather-power-and-policy/).

The article not only discussed the recording of law enforcement by private citizens, but also how it can be beneficial for law enforcement officers to constantly record their daily duties. Doing so would add a layer of transparency, and would serve to protect the interests of officers and their relevant governments, as well as the general public. This recording would be in addition to the already present police vehicle dash cams, and the surveillance cameras in most urban centers.

The questions, then, are not so much about recordings being made in the first place, but rather about how they are used. Two key questions are:

  • Should law enforcement agencies have the right to publish footage or images of suspects before they have been convicted of crimes?
  • Should individuals have the right to publish police activity when footage or an image doesn’t portray an event or incident within its full context?

The Internet of Things is hugely dependent on constant information, easy accessibility to information, and the almost instant distribution of that information. IoT has changed the way that people expect services to work. Almost one third of those surveyed by the American Red Cross in 2012 would expect law enforcement or emergency assistance if they posted a request for help on a public social media website. Would those who are embracing social media be happy to post controversial images or videos of law enforcement agents in the line of duty? What if they were the ones being featured on a law enforcement social media account?

As more connected devices are able to easily record and share the world around us, lines will become blurred when it comes to rights. The ‘right to record’ could be considered a civil liberty under the right to free speech, so does the government share that same right? As IoT devices become more commonplace, and the internet of everything becomes a part of daily life, these questions will be answered, laws will be tested, and new precedents will be set.

By 2020, 20 billion more IoT devices will be installed, carried, or worn by people at all levels of society. Users and creators of IoT technologies will need to keep a close eye on ‘the right to record’, and how it impacts the industry and public perception of these devices in the years to come.

Read more…

The Internet of Things might seem like a buzzword right now. Google Trends shows continual interest in the subject each year, and the marketplace is growing. With an estimated 26 billion devices projected to be connected by 2020, it is actually something that you need to take seriously. It is more than some hip term, and may actually be the way of the future. 

Seeing actual examples of Internet of Things devices in action can help shed light on what these devices actually do. 

Take, for example, the use of the Internet of Things in medicine. In a hospital, devices are connected to pagers, computers and other devices so doctors and nurses can easily monitor the stats of a patient, regardless of where they are in the world. If there is an alert, such as a patient coding, healthcare professionals are alerted at once.

Major cities are even incorporating the Internet of Things into how they handle their parking. In city parking garages, sensors let drivers know how many free spaces are available on each level. These sensors help drivers locate spaces more easily and help the garage determine when it is filled to capacity. This is incredibly useful when a major sporting event or concert is going on.
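To make the garage example a little more concrete, here is a minimal sketch of how per-bay occupancy readings could be rolled up into a free-space count per level. The tuple layout and the function names `free_spaces_by_level` and `garage_is_full` are assumptions made up for illustration; they do not describe any particular vendor's system.

```python
from collections import Counter


def free_spaces_by_level(readings):
    """Count unoccupied bays per level.

    `readings` is an iterable of (level, bay_id, occupied) tuples, a
    hypothetical format standing in for whatever the sensors report.
    """
    free = Counter()
    for level, _bay_id, occupied in readings:
        # Adding 0 still registers the level, so full levels show up as 0.
        free[level] += 0 if occupied else 1
    return dict(free)


def garage_is_full(readings):
    """True when no level reports a free bay."""
    return all(count == 0 for count in free_spaces_by_level(readings).values())


garage_readings = [
    (1, "A1", True), (1, "A2", False),
    (2, "B1", True), (2, "B2", True),
]
print(free_spaces_by_level(garage_readings))
print(garage_is_full(garage_readings))
```

The same counts could feed both the signs at each level and the "garage full" notice at the entrance.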

Cities are also using the Internet of Things to help them to better maintain roads. Sensors located on roadways monitor the normal flow of traffic at a given time. In areas where traffic is heavier than others, these devices transmit counts to a central system. The city then can plan maintenance for these areas and increase lanes, based on the statistics these devices transmit.

Another way the Internet of Things is being used today is in your car. Some car insurance companies now have devices you can plug into your vehicle. This device monitors the speed you are going, braking habits and even how loud your radio is. This information is sent over the internet to the provider. Based on the transmitted data, the provider determines if you are eligible for good driving discounts. Of course, this is a double-edged sword. Those who drive erratically may also face higher rates based on the transmissions from this device.

As you can see, the Internet of Things is becoming an important part of our world. Every year, new industries are rolling out new technologies that incorporate or take advantage of IoT. Before long, our world will be universally connected and our devices will be far more powerful than they are today.


IoT Security - Hacking Things with the Ridiculous

In the 1996 sci-fi blockbuster movie “Independence Day”, there is a comical scene near the end where actor Jeff Goldblum, playing computer expert David Levinson, writes a virus on his Macintosh PowerBook that disables an entire fleet of technologically advanced alien spaceships. The PowerBook 5300 used in the movie had 8 MB of RAM. How could this be?

Putting aside Apple paying for product placement, we’re not going to stop advanced alien life who are apparently Mac-compatible.

I cite the ridiculous Independence Day ending because I was recently reading through a number of IoT security stories and began thinking about the implications of connecting all these things to the network. How much computing power does one actually need to hack something of significance? Could a 1997 IBM Thinkpad running Windows 95 take down the power grid in the eastern United States? Far-fetched, yes, but not ridiculous.

Car hacks seem to be in the news recently. Recall last month’s Jeep hack and hijack. Yesterday, stories came out about hackers using small black dongles connected to a Corvette’s diagnostic ports to control many parts of the car through, wait for it, text messages!

Beyond cars and numerous other consumer devices, IoT security has to reach hospitals, intelligent buildings, power grids, airlines, oil and gas exploration as well as every industry listed in the IRS tax code.

IBM’s X-Force Threat Intelligence Quarterly, 4Q 2014 notes that IoT will drag in its wake a host of unknown security threats. Even IBM, a powerful force in driving IoT forward, says that its model for IoT security is still a work in progress since IoT, as a whole, is still evolving. It does, however, suggest five security building blocks: secure operating systems, unique identifiers for each device, data privacy protection, strong application security, and strong authentication and access control.

In the end, it will be up to manufacturers to build security in from the ground up and to work continually with the industry to make everything more secure. As we coalesce around an ever-evolving threat landscape, it will be the responsibility of smaller manufacturers, giants like IBM and industry organizations like the Industrial Internet Consortium and Online Trust Alliance’s IoT Trust Framework to help prevent the ridiculous from happening.

 


Do You Believe the Hype?

I’m guilty of hype.

As a communications consultant toiling away at public relations, media relations and corporate communications, I’ve had my fair share of businesses and products that I’ve helped get more attention than they probably deserved. Indeed, when it comes to over-hyping anything, it’s guys like me and my friends in the media who often take it too far.

Recently though, I came across an unlikely source of hype - the McKinsey Global Institute.

In a June 2015 report that I’m now reading, McKinsey states, “The Internet of Things—digitizing the physical world—has received enormous attention. In this research, the McKinsey Global Institute set to look beyond the hype to understand exactly how IoT technology can create real economic value. Our central finding is that the hype may actually understate the full potential of the Internet of Things…” (emphasis is mine).

If McKinsey is hyping something, should we believe it?

Their report, “The Internet of Things: Mapping the Value Beyond the Hype”, does point out that “capturing the maximum benefits will require an understanding of where real value can be created and successfully addressing a set of systems issues, including interoperability.”

I think this is where the race is today: finding the platforms for interoperability, compiling data sources, building security into the system and developing the apps that deliver true value. We have a long way to go, but investment and innovation are only growing.

If done right, the hype just may be understated. McKinsey finds that IoT has a total potential economic impact of $3.9 trillion to $11.1 trillion a year by 2025. They state that, including consumer surplus, this would be equivalent to about 11 percent of the world economy!

Do you believe the hype?

 


List of IoT Platforms

IoT platforms make the developer’s life easier by offering ready-made functionality that applications can use to achieve their objectives, saving developers from the task of reinventing the wheel. Given here is a list of useful IoT platforms.

 

Kaa

Kaa is a flexible open source platform licenced under Apache 2.0 for building, managing, and integrating connected software in IoT. Kaa’s “data schema” definition language provides a universal level of abstraction to achieve cross-vendor product interoperability. Kaa supports multiple client platforms by offering endpoint SDKs in various programming languages. In addition, Kaa’s powerful back-end functionality greatly speeds up product development, allowing vendors to concentrate on maximizing their product’s unique value to the consumer.

 

Axeda

The Axeda Platform is a complete M2M and IoT data integration and application development platform with infrastructure delivered as a cloud-based service.

 

Arrayent

The Arrayent Connect Platform is an IoT platform that helps to connect products to smartphones and web applications. It comes with an agent that helps embedded devices connect to the cloud, a cloud-based IoT operating system, a mobile framework, and a business intelligence reporting system.

 

Carriots

Carriots is a Platform as a Service (PaaS) designed for Internet of Things (IoT) and Machine to Machine (M2M) projects. It provides tools to collect and store data from devices, an SDK to build powerful applications, and the means to deploy and scale from tiny prototypes to thousands of devices.

 

Xively

Xively offers an enterprise IoT platform that helps connect products and users and manage the information, along with an interface for product deployment and health checks.

 

ThingSpeak

ThingSpeak is an open source Internet of Things application and API to store and retrieve data from things using the HTTP protocol over the Internet or via a Local Area Network. ThingSpeak enables the creation of sensor logging applications, location tracking applications, and a social network of things with status updates.
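Since ThingSpeak works over plain HTTP, a device can log a reading with a single request. The sketch below only builds the request URL and does not send anything. The `update` endpoint and the `api_key` and `field1` parameter names follow the hosted ThingSpeak service as I understand it, so check them against the current documentation; `build_update_url` and the placeholder key are made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint of the hosted service; a self-hosted install would use its own host.
THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, **fields):
    """Return the GET URL that would log one set of field values.

    `fields` are passed as keyword arguments, e.g. field1=21.5.
    """
    params = {"api_key": write_api_key}
    params.update(fields)
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"


# "YOUR_WRITE_API_KEY" is a placeholder, not a real key.
update_url = build_update_url("YOUR_WRITE_API_KEY", field1=21.5)
print(update_url)
```

Sending the request is then a matter of fetching that URL with whatever HTTP client the device has available.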

 

The Intel® IoT Platform

The Intel® IoT Platform is an end-to-end reference model and family of products from Intel, that works with third party solutions to provide a foundation for seamlessly and securely connecting devices, delivering trusted data to the cloud, and delivering value through analytics.

A votable and rankable list of these platforms can be found at Vozag.

 
Read more…

Originally posted on Data Science Central

Summary:  Thanks to the IOT (internet of things) an internet-like experience of recommendations and awareness of your preferences is coming to the brick and mortar store near you.

You’ve probably noticed the huge difference in the tone of the conversation between data scientists and the general public over the issue of privacy and personalization.  The professional community is largely quiet but for the public you’d think we were developing bionic eyeballs tracking their most minute and private habits.

In my house my wife is always complaining that I can’t remember how many sweeteners she takes in her tea; who her favorite actors are, or whether she liked that Indian restaurant we visited last year enough to want to go back.  But if a web site shows her a picture of something she browsed yesterday, or if the recommended books and movies on Amazon are a little too on target she’s the first one to raise the hue and cry that her privacy is being violated.  My failing to remember – bad.  Their being helpful by remembering or recommending – also bad???

This is beginning to look like a real Catch 22.  Behaviors we wish for at home are suddenly evil if a web site does an even better job than your spouse at remembering your likes and dislikes.

Personally I think site personalization is a real blessing.  I don’t really want to see ads for rock climbing walls or baby diapers.  I’m not in that market so not being exposed to a random untargeted bunch of ads (think your Sunday paper – what’s a Sunday paper you say?) is all for the good.

Well web sites are one thing but these days with the emerging IOT our brick and mortar stores are gearing up to behave more like a web site and less like a random walk up one aisle and down another.  Here’s a brief update on who’s doing what in retail IOT.  I’m sure there are many providers I’ve missed and can’t say if these folks are good or bad at what they do but my hat’s off to them for trying something new that might make my life better even if my wife would find it a little spooky.

In retail, Heat Maps (which products get picked up more often than others) and Flow Charts (how customers navigate the store) are all the rage. Sensors also allow retailers to offer coupons over your smart phone that are tailored to your shopping pattern. And by moving desirable merchandise with long linger times to better locations, frequently to deeper in the store, they can achieve that same ‘stickiness’ we associate with web sites to make us stay a little longer. Where exactly are the customers going in the store, where do they pause and ponder, and how can the retailer use this information to revise the store layout, the merchandise displays, pricing, or anything else to squeeze out another dollar?
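As a rough sketch of the 'linger time' idea, the snippet below averages how long shoppers stay in each zone of a store and ranks the zones. The `(zone, seconds)` input format and the name `linger_time_by_zone` are assumptions for illustration only; real vendors will have their own data formats.

```python
from collections import defaultdict


def linger_time_by_zone(visits):
    """Average seconds spent per zone.

    `visits` is an iterable of (zone, seconds) pairs, a hypothetical
    summary of what in-store sensors might report.
    """
    totals = defaultdict(lambda: [0.0, 0])  # zone -> [total seconds, visit count]
    for zone, seconds in visits:
        totals[zone][0] += seconds
        totals[zone][1] += 1
    return {zone: total / count for zone, (total, count) in totals.items()}


store_visits = [("shoes", 120), ("shoes", 60), ("entrance", 10)]

# Zones with the longest average linger time come first.
ranked_zones = sorted(
    linger_time_by_zone(store_visits).items(),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked_zones)
```

A retailer could use a ranking like this to decide which merchandise to move deeper into the store.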

The specifics of sensors and strategies differ from one vendor to another and in this early stage of adoption it’s fair to say that we’re waiting for the market to tell us which are most successful.  Some use your cell phone to triangulate your position, some use cameras, radio beacons, or even more exotic sensor types.  This is a good thing since all this experimentation will tell us what’s worth the investment and what’s not.  Any number of major retailers are running experiments. To name just a few:

Nordstrom – Euclid Analytics

Macy’s – Shopkick

Timberland and Kenneth Cole – Swirl Networks

Goldman’s Dept. Stores – RetailNext

The Future of Privacy Forum, a Washington, D.C., think tank, estimates that about 1,000 retailers are testing some sort of sensor strategy.

Swarm Solutions says 6,000 retailers have installed its door sensors to compare foot traffic with transactions.

Others working with Wi-Fi triangulation include Ekahau, Wifislam, and Prism Skylabs.  Apple’s iBeacon technology probably belongs in this group as well.

Blinksight and Insiteo are working with radio beacons.

Bytelight, Aisle411, Everyfit, and PointInside are all working with other sensor types including embedded floor sensors and even LED lights.

These 15 innovators are probably only the tip of the iceberg.  This is one of those ‘stay tuned for results’ stories.  The results aren’t in but there are lots of horses in the race.  Meanwhile, I’m still looking for the sensors I can install at home that will make my wife think I am a better husband.

Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2014, all rights reserved.

 

About the author:  Bill Vorhies is President & Chief Data Scientist of Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001.

The original blog can be viewed at:

http://data-magnum.com/privacy-personalization-and-the-iot-retail/


Understanding the nature of IoT data

Originally posted on Data Science Central

This post is part of the series “Twelve unique characteristics of IoT based Predictive analytics/machine learning”.

Here, we discuss IoT devices and the nature of IoT data.

Definitions and terminology

    Business Insider makes some bold predictions for IoT devices:

    The Internet of Things will be the largest device market in the world.

    By 2019 it will be more than double the size of the smartphone, PC, tablet, connected car, and the wearable market combined.

    The IoT will result in $1.7 trillion in value added to the global economy in 2019.

    Device shipments will reach 6.7 billion in 2019 for a five-year CAGR of 61%.

    The enterprise sector will lead the IoT, accounting for 46% of device shipments this year, but that share will decline as the government and home sectors gain momentum.

    The main benefit of growth in the IoT will be increased efficiency and lower costs.

    The IoT promises increased efficiency within the home, city, and workplace by giving control to the user.
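To get a feel for what a five-year CAGR of 61% implies, here is a small worked calculation. It assumes the usual compound-growth definition, end = start × (1 + rate)^years, and that "five-year" means five compounding periods; the forecast itself does not spell this out, so treat the implied starting figure as illustrative.

```python
def implied_start(end_value, cagr, years):
    """Starting value implied by an end value and a compound annual growth rate."""
    return end_value / (1 + cagr) ** years


def compound_annual_growth_rate(start_value, end_value, years):
    """Growth rate that takes start_value to end_value over the given years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the forecast above: 6.7 billion shipments in 2019, 61% CAGR.
shipments_2019 = 6.7   # billions of devices
start_shipments = implied_start(shipments_2019, cagr=0.61, years=5)
print(f"Implied shipments five years earlier: {start_shipments:.2f} billion")
```

In other words, under these assumptions the forecast implies roughly a tenfold increase in annual shipments over the five years.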

And others say Internet of Things investment will run to 140bn over the next five years.

 

Also, the term IoT has many definitions, but it's important to remember that IoT is not the same as M2M (machine to machine). M2M is a telecoms term which implies that there is a (cellular) radio at one or both ends of the communication. IoT, on the other hand, simply means connecting to the Internet. When we speak of IoT (billions of devices), we are really referring to smart objects. So, what makes an object smart?

What makes an object smart?

Back in 2010, the then Chinese Premier Wen Jiabao said “Internet + Internet of things = Wisdom of the earth”. Indeed, the Internet of Things revolution promises to transform many domains. As the term implies, IoT is about smart objects.

 

For an object (say a chair) to be ‘smart’ it must have three things:

  • An identity (to be uniquely identifiable, e.g. via IPv6)
  • A communication mechanism (i.e. a radio), and
  • A set of sensors / actuators

 

For example, the chair may have a pressure sensor indicating that it is occupied.

Now, if it is able to know who is sitting, it could correlate more data by connecting to the person’s profile.

If it is in a cafe, whole new data sets can be correlated (about the venue, about who else is there, etc.).

Thus, IoT is all about data.
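The three ingredients of a smart object, and the way the chair's data grows once it is correlated with a person and a venue, can be sketched as a small data structure. Everything here (the `SmartObject` class, `occupancy_record`, the field names, the example address from the IPv6 documentation prefix) is a hypothetical illustration rather than a standard model.

```python
from dataclasses import dataclass, field
from ipaddress import IPv6Address


@dataclass
class SmartObject:
    identity: IPv6Address                          # unique identity
    radio: str                                     # communication mechanism
    sensors: dict = field(default_factory=dict)    # name -> callable returning a reading

    def read(self, sensor_name):
        """Return the current reading of one sensor."""
        return self.sensors[sensor_name]()


def occupancy_record(obj, occupant=None, venue=None):
    """Build a record from the pressure sensor, enriched when more context is known."""
    record = {"object": str(obj.identity), "occupied": obj.read("pressure")}
    if occupant is not None:
        record["occupant"] = occupant   # link to the sitter's profile
    if venue is not None:
        record["venue"] = venue         # link to data about the cafe
    return record


chair = SmartObject(
    identity=IPv6Address("2001:db8::1"),   # documentation-only address
    radio="bluetooth-le",
    sensors={"pressure": lambda: True},    # stub sensor: always occupied
)
print(occupancy_record(chair))
print(occupancy_record(chair, occupant="profile-123", venue="corner-cafe"))
```

Each extra piece of context turns a single bit of sensor data into something that can be joined against other data sets, which is the point the chair example makes.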

How will Smart objects communicate?

How will billions of devices communicate? Primarily through the ISM band and Bluetooth 4.0 / Bluetooth low energy.

Certainly not primarily through the cellular network (hence the above distinction between M2M and IoT is important).

Cellular will still play a role in connectivity, and there will be many successful applications and connectivity models (e.g. Jasper Wireless, which primarily requires a SIM card in the device).

A more likely scenario is IoT-specific networks like Sigfox (which could be deployed by anyone, including telecom operators).  Sigfox currently uses the most popular European ISM band on 868MHz (as defined by ETSI and CEPT), along with 902MHz in the USA (as defined by the FCC), depending on specific regional regulations.

Also, when 5G networks are deployed (beyond 2020), cellular will provide wide area connectivity for IoT devices.

In any case, smart objects will generate a lot of data.

Understanding the nature of IoT data

In the ultimate vision of IoT, Things are identifiable, autonomous, and self-configurable. Objects communicate among themselves and interact with the environment. Objects can sense, actuate and predictively react to events.

 

Billions of devices will create a massive volume of streaming and geographically dispersed data. This data will often need real-time responses.

There are primarily two modes of IoT data: periodic observations/monitoring or abnormal event reporting.

Periodic observations present demands due to their high volumes and storage overheads. Events on the other hand are one-off but need a rapid response.
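The difference between the two modes can be sketched in a few lines: periodic readings are buffered and written in bulk to keep storage overhead down, while abnormal readings trigger an alert straight away. The `ReadingRouter` class, the fixed threshold and the batch size are all assumptions for illustration; a real system would define 'abnormal' in a way that suits the sensor.

```python
class ReadingRouter:
    """Buffer periodic readings; act on abnormal ones immediately."""

    def __init__(self, threshold, batch_size=3):
        self.threshold = threshold
        self.batch_size = batch_size
        self.buffer = []   # periodic readings awaiting a bulk write
        self.stored = []   # stands in for a database or file
        self.alerts = []   # stands in for a pager or message queue

    def handle(self, value):
        if value > self.threshold:
            # Abnormal event: one-off, but needs a rapid response.
            self.alerts.append(value)
            return "event"
        # Periodic observation: high volume, so write in batches.
        self.buffer.append(value)
        if len(self.buffer) >= self.batch_size:
            self.stored.append(list(self.buffer))
            self.buffer.clear()
        return "periodic"


router = ReadingRouter(threshold=80.0)
for reading in [21.0, 22.5, 95.0, 21.8, 22.1]:
    router.handle(reading)

print(router.alerts)
print(router.stored)
print(router.buffer)
```

Here the single abnormal reading is reported as soon as it arrives, while the ordinary readings wait until a full batch is ready.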

In addition, if we consider video data (e.g. from surveillance cameras) as IoT data, we have some additional characteristics.

Thus, our goal is to understand the implications of predictive analytics to IoT data. This ultimately entails using IoT data to make better decisions.

I will be exploring these ideas in the Data Science for IoT course /certification program when it's launched.

Comments welcome. In the next part of this series, I will explore time series data.


Big Data, IOT and Security - OH MY!

While we aren’t exactly “following the yellow brick road” these days, you may be feeling a bit like Dorothy from the “Wizard of Oz” when it comes to these topics. No my friend, you aren’t in Kansas anymore! As seen above from Topsy, these three subjects are extremely popular these days and for the last 30 days seem to follow a similar pattern (coincidence?).

 

The Internet of Things is not just a buzzword and is no longer a dream, with sensors abounding. The world is on its way to becoming totally connected, although it will take time to work out a few kinks here and there (with a great foundation, you create a great product; this foundation is what will take the most time). Your appliances will talk to you in your “smart house” and your “self-driving car” will take you to your super tech office where you will work with ease thanks to all the wonders of technology. But let’s step back to reality and think: how is all this going to come about, what will we do with all the data collected, and how will we protect it?

 

First things first: all the sensors have to be put in place, and many questions have to be addressed. Does a door lock by one vendor communicate with a light switch by another vendor? Do you want the thermostat to be part of the conversation? And will anyone else be able to see my info or get into my home? http://www.computerworld.com/article/2488872/emerging-technology/explained--the-abcs-of-the-internet-of-things.html

How will all the needed sensors be installed, and will there be any “human” interaction? It will take years to put all the needed sensors in place, but some businesses are already engaging in the IOT here in the US. Hotels, for example (though they are not the only ones investing in IOT), are using sensors connected to the products that are available for sale in each room. That is great, but I recently had an experience that showed how “people” are the vital part of “IOT”. When I went to check out of a popular hotel in Vegas, I was asked if I had drunk one of the coffees in the room. I replied, “No, why?” and was told that the sensor showed that I had either drunk or moved the coffee. The hotel clerk verified that I had “moved” and not “drunk” the coffee, but without her I would have been billed and would have had to dispute the charge. Disputing charges is not exactly good for business, and customer service having to handle “I didn’t purchase this” disputes 24/7 wouldn’t exactly make anyone’s day, so thank goodness for human interaction right there on the spot.

 

“The Internet of Things” is not just a US effort. Asia, in my opinion, is far ahead of the US as far as the Internet of Things is concerned. Commuters waiting in a Korean subway station can browse and scan the QR codes of products, which will later be delivered to their homes. (Source: Tesco)

In London, Transport for London’s central control centers use aggregated sensor data to deploy maintenance teams, track equipment problems, and monitor goings-on in the massive, sprawling transportation system. Telent’s Steve Pears said in a promotional video for the project that "We wanted to help rail systems like the London Underground modernize the systems that monitor its critical assets, everything from escalators to lifts to HVAC control systems to CCTV and communication networks." The new smart system creates a computerized and centralized replacement for a public transportation system that used notebooks and pens in many cases. http://www.fastcolabs.com/3030367/the-london-underground-has-its-own-internet-of-things

 

But isn't the Internet of Things too expensive to implement? Many IoT devices rely on multiple sensors to monitor the environment around them. The cost of these sensors declined 50% in the past decade, according to Goldman Sachs. We expect prices to continue dropping at a steady rate, leading to an even more cost-effective sensor. http://www.businessinsider.com/four-elements-driving-iot-2014-10

 

 

The Internet of Things is not just about gathering data but also about analyzing and using it. So all this data generated by the Internet of Things, when used correctly, will help us in our everyday lives as consumers and help companies keep us safer by predicting and thus avoiding issues that could harm or delay us, not to mention the costs that could be reduced by finding patterns in data for transportation, healthcare and banking. The possibilities are endless.

 

Let’s talk about security and data breaches. Now you may be thinking, “I’m in analytics or data science, so why should I be concerned with security?” Let’s take a look at several breaches that have made the headlines lately.

 

Target recently suffered a massive security breach thanks to an attacker infiltrating a third party http://www.businessweek.com/articles/2014-03-13/target-missed-alarms-in-epic-hack-of-credit-card-data and so did Home Depot http://www.usatoday.com/story/money/business/2014/11/06/home-depot-hackers-stolen-data/18613167/ PCWorld said, “Data breach trends for 2015: Credit cards, healthcare records will be vulnerable” http://www.pcworld.com/article/2853450/data-breach-trends-for-2015-credit-cards-healthcare-records-will-be-vulnerable.html

 

 

Sony was hit by hackers on Nov. 24, resulting in a company wide computer shutdown and the leak of corporate information, including the multimillion-dollar pre-bonus salaries of executives and the Social Security numbers of rank-and-file employees. A group calling itself the Guardians of Peace has taken credit for the attacks. http://www.nytimes.com/2014/12/04/business/sony-pictures-and-fbi-investigating-attack-by-hackers.html?_r=0

 

http://www.idtheftcenter.org/images/breach/DataBreachReports_2014.pdf

 

So how do we protect ourselves in a world of BIG DATA and the IOT?
Why should I, as a data scientist or analyst, be worried about security? That’s not really part of my job, is it? Well, if you are a consultant or own your own business, it is! Say you download secure data from your clients and then YOU get hacked: guess who is liable if sensitive information is leaked or gets into the wrong hands? What if you develop a platform where the client’s customers can log in and check their accounts? Credit card information and purchase histories are stored on this system and, if stolen, can set you up for a lawsuit. If you are a corporation, you are protected to some extent, but if you operate as a sole proprietor you could lose your home, company and reputation. Still think security isn’t important when dealing with big data?

Organizations need to get better at protecting themselves and at discovering that they’ve been breached. We, the consultants, also need to do a better job of protecting our own data, and that means you can’t use “password” as a password! Let’s not make it easy for the hackers. When we collect sensitive data (and yes, even the data collected from cool technology toys connected to the internet), let’s be security minded: check your statements, logs and security messages, and verify everything! When building your database, use all the security features available (masking, obfuscation, encryption) so that if someone does gain access, what they steal is NOT usable!

 

Be safe, enjoy what tech has to offer with peace of mind and, at all costs, protect your DATA.

 

I’ll leave you with a few things to think about:


“Asset management critical to IT security”
"A significant number of the breaches are often caused by vendors but it's only been recently that retailers have started to focus on that," said Holcomb. "It's a fairly new concept for retailers to look outside their walls." (Source:  http://www.fierceretail.com/)

 

“Data Scientist: Owning Up to the Title”
Enter the Data Scientist; a new kind of scientist charged with understanding these new complex systems being generated at scale and translating that understanding into usable tools. Virtually every domain, from particle physics to medicine, now looks at modeling complex data to make our discoveries and produce new value in that field. From traditional sciences to business enterprise, we are realizing that moving from the "oil" to the "car", will require real science to understand these phenomena and solve today's biggest challenges. (Source:  http://www.datasciencecentral.com/profiles/blogs/data-scientist-owning-up-to-the-title)

 

 

Forget about data (for a bit) what’s your strategic vision to address your market?

Where are the opportunities given global trends and drivers? Where can you carve out new directions based on data assets? What is your secret sauce? What do you personally do on an everyday basis to support that vision? What are your activities? What decisions do you make as a part of those activities? Finally what data do you use to support these decisions?

http://www.datasciencecentral.com/profiles/blogs/top-down-or-bottom-up-5-tips-to-make-the-most-of-your-data-assets



Originally posted on Data Science Central 

Follow us @IoTCtrl | Join our Community


Guest blog post by Cameron Turner

Executive Summary

Though often the focus of the urban noise debate, Caltrain is one of many contributors to overall sound levels along the Bay Area’s peninsula corridor. In this investigation, Cameron Turner of Palo Alto’s The Data Guild takes a look at this topic using a custom-built Internet of Things (IoT) sensor atop the Helium networking platform.

Introduction

If you live in (or visit) the Bay Area, chances are you have experience with Caltrain. Caltrain is a commuter line which travels 77.4 miles between San Francisco and San Jose, carrying over 50 thousand passengers on over 70 trains daily.[1]

I’m lucky to live two blocks from the Caltrain line, and enjoy the convenience of the train. My office, The Data Guild, is just one block away. The Caltrain and its rhythms, bells and horns are a part of our daily life, and connect us to the City and, through connections to BART, Amtrak, SFO and SJC, to the rest of the world.

Over the holidays, my 4-year-old daughter and I undertook a project to quantify the Caltrain through a custom-built sensor and reporting framework, to get some first-hand experience in the so-called Internet of Things (IoT). This project also aligns with The Data Guild’s broader ambition to build out custom sensor systems atop network technologies to address global issues. (More on this here.)

Let me note here that this project was an exploration, and was not conducted in a manner (in goals or methodology) to provide fodder for either side of the many ongoing Caltrain debates: the electrification project, the quiet zone, or the tragic recent deaths on the tracks.

Background

My interest in such a project began with an article published in the Palo Alto Daily in October 2014. The article addressed the call for a quiet zone in downtown Palo Alto, following complaints from residents of buildings closest to the tracks. Residents voiced many subjective frustrations based on personal experience.

According to the Federal Railroad Administration (FRA), whose rules govern how Caltrain operates, train engineers “must begin to sound train horns at least 15 seconds, and no more than 20 seconds, in advance of all public grade crossings.”

Additionally: “Train horns must be sounded in a standardized pattern of 2 long, 1 short and 1 long blasts.” and “The maximum volume level for the train horn is 110 decibels which is a new requirement. The minimum sound level remains 96 decibels.”
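Because the rule is stated entirely in numbers, it can be written down as a simple check. The sketch below is only an illustration: the function name and the idea of testing a single horn event are our own assumptions, and the thresholds are the ones quoted above.

```python
# Thresholds quoted from the FRA rule text above.
MIN_LEAD_S, MAX_LEAD_S = 15, 20   # seconds before the crossing
MIN_DB, MAX_DB = 96, 110          # horn sound level in decibels


def horn_complies(lead_seconds, level_db):
    """Return True when a horn event meets both quoted limits (hypothetical helper)."""
    timing_ok = MIN_LEAD_S <= lead_seconds <= MAX_LEAD_S
    volume_ok = MIN_DB <= level_db <= MAX_DB
    return timing_ok and volume_ok


print(horn_complies(17, 100))  # inside both ranges
print(horn_complies(3, 85))    # sounded late and below the minimum level
```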

Questions

Given the numeric nature of the rules, and the subjective nature of current analysis/discussion, it seemed an ideal problem to address with data. Some of the questions we hoped to address, both within and beyond this issue:

  • Timing: Are train horns sounded at the appropriate time?
  • Schedule: Are Caltrains coming and going on time?
  • Volume: Are the Caltrain horns sounding at the appropriate level?
  • Relativity: How do Caltrain horns contribute to overall urban noise levels?

Methodology

Our methodology to address these topics included several steps:

  1. Build a custom sensor equipped to capture ambient noise levels
  2. Leverage an uplink capability to receive data from the sensor in near real-time
  3. Deploy sensor then monitor sensor output and test/modify as needed
  4. Develop a crude statistical model to convert sensor levels (voltage) to sound levels (dB)
  5. Analysis and reporting

Apparatus

We developed a simple sensor based on the Arduino platform. A baseline Uno board, equipped with an ATmega328 processor, was wired to an Adafruit Electret Microphone/Amplifier 4466 w/adjustable gain.

We were lucky to be introduced through the O’Reilly Strata NY event to a local company: Helium. Backed by Khosla Ventures et al, Helium is building an internet of things platform for smart machines. They combine a wireless protocol optimized for device and sensor data with cloud-based tooling for working with the data and building applications.

We received a Beta Kit which included an Arduino shield for uplink to their bridge device, which then connects via GSM to the Internet. Here is our sensor (left) with the Helium bridge device (right).

Deployment

With our instrument ready for deployment, we sought to find a safe location to deploy. By good fortune, a family friend (and member of the staff of the Stanford Statistics department, where I am completing my degree) owns a home immediately adjacent to a Caltrain crossing, where Caltrain operators are required to sound their horn.

Conductors might also be particularly sensitive to this crossing, Churchill St., due to its proximity to Palo Alto High School and the recent tragic train-related death of a teen.

From a data standpoint, this location was ideal as it sits approximately half-way between the Palo Alto and California Avenue stations.

We deployed our sensor outdoors facing the track in a waterproof enclosure and watched the first data arrive.

Monitoring

Through a connector to Helium’s fusion platform, we were able to see data in near real-time. (Note the “debug” window on the right, where microphone output level arrives each second.)

We used another great service provided by Librato (now a part of SolarWinds), a San Francisco-based monitoring and metrics company. Using Librato, we enabled data visualization of the sound levels as they were generated and could view them relative to their history. This was a powerful capability as we worked to fine-tune the power and amplifier.

Note the spike in the middle of the image above, which we could map to a train horn heard ourselves during the training period.

Data Preparation

Next, we took a weekday (January 7, 2015), which appeared typical of a non-holiday weekday relative to the entire month of data collected. For this period, we were able to construct a 24-hour data set at 1-second sample intervals for our analysis.

Data was accessed through the Librato API, downloaded as JSON, converted to CSV and cleansed.
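For readers who want to try a similar JSON-to-CSV step, here is a minimal sketch using only the Python standard library. The payload layout and field names ("samples", "time", "value") are hypothetical and are not the Librato schema; "cleansing" is assumed here to mean dropping missing readings and duplicate timestamps.

```python
import csv
import io
import json

# Hypothetical payload: the field names are illustrative, not the Librato schema.
raw = '''{"samples": [
  {"time": 1420617600, "value": 2.11},
  {"time": 1420617601, "value": null},
  {"time": 1420617601, "value": 2.14},
  {"time": 1420617602, "value": 2.09}
]}'''


def json_to_rows(text):
    """Parse JSON, drop missing readings, keep one reading per second."""
    seen = set()
    rows = []
    for sample in json.loads(text)["samples"]:
        t, v = sample["time"], sample["value"]
        if v is None or t in seen:
            continue  # cleansing: skip gaps and duplicate timestamps
        seen.add(t)
        rows.append((t, float(v)))
    return sorted(rows)


def rows_to_csv(rows):
    """Serialize (time, volts) rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["time", "volts"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = json_to_rows(raw)
print(rows_to_csv(rows))
```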

Analysis

First, to gain intuition, we took a sample recording gathered at the sensor site of a typical train horn.

Click HERE to hear the sample sound.

Using matplotlib within an ipython notebook, we are able to “see” this sound, in both its raw audio form and as a spectrogram showing frequency:

Next, we look at our entire 24 hours of data, beginning on the evening of January 6, and concluding 24 hours later on the evening of January 7th. Note the quiet “overnight” period, about a quarter of the way across the x axis.

To put this into context, we overlay the Caltrain schedule. Given the sensor sits between the Palo Alto and California Avenue stations, and given the variance in stop times, we mark northbound trains using the scheduled stop at Palo Alto (red), and southbound trains using the scheduled stop at California Ave (green).

Initially, we can make two converse observations: many peak sound events tend to lie quite close to these stop times, as expected. However: many of the sound events (including the maximum recorded value, the nightly ~11pm freight train service) occur independent of the scheduled Caltrains.

Conversion to Decibels

On the Y axis above, the sound level is reported as the raw voltage output from the microphone. To address the questions above we needed a way to convert these values to decibel units (dB).

To do so, a low-cost sound meter was obtained from Fry’s. An on-site calibration was then performed to map decibel readings from the meter to the voltage output uploaded from our microphone.

Within R Studio, these values were plotted and a crude estimation function was derived to create a linear mapping between voltage and dB:

The goal of doing a straight line estimate vs. log-linear was to compensate for differences in apparatus (dB meter vs. microphone within casing) and overall to maintain conservative approximations. Most of the events in question during the observation period were between 2.0 and 2.5 volts, where we collected several training points (above).
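A straight-line estimator of this kind can be sketched with an ordinary least-squares fit. The calibration pairs below are invented for illustration (they are not our measured points), and the helper names are hypothetical; the original fit was done in R Studio.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (voltage, dB) calibration pairs, not the measured ones.
volts = [1.7, 2.0, 2.2, 2.5]
decibels = [50.0, 65.0, 75.0, 90.0]

slope, intercept = fit_line(volts, decibels)


def volts_to_db(v):
    """Apply the fitted straight line to a raw microphone voltage."""
    return slope * v + intercept


print(round(slope, 2), round(intercept, 2))
print(round(volts_to_db(2.3), 1))
```

With real calibration data the points would not sit exactly on a line, and the residuals would give a rough sense of how crude the estimator is.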

A challenge in this process was the slight lag between readings and data collection with unknown variance. As such, only “peak” and “trough” measurements could be used reliably to build the model.

With this crude conversion estimator in hand, we could now replot the data above with decibels on the y axis.

Clearly the “peaks” above are of interest as outliers from the baseline noise level at this site. In fact, there are 69 peaks (>82 dB) observed (at 1-second sample rate), and 71 scheduled trains for this same period. Measured at this location, about 100 yards removed from the tracks, the horns are quieter than the 96 dB to 110 dB range quoted from the FRA above. (With the caveat above regarding the crude approximator.)

Interesting also that we’re not observing the “two long, one short, one long” pattern. Though some events are lost to the sampling rate, qualitatively this does not seem to be a standard practice followed by the engineers. Those who live in Palo Alto also know this to be true, qualitatively.
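One possible way to count such peaks is to treat each run of consecutive samples above the threshold as a single event. This is a sketch under that assumption, with made-up readings; the count reported above may have been produced differently.

```python
def count_peak_events(levels_db, threshold_db=82.0):
    """Count runs of consecutive samples above the threshold as single events."""
    events = 0
    in_event = False
    for level in levels_db:
        above = level > threshold_db
        if above and not in_event:
            events += 1  # a new run of loud samples starts here
        in_event = above
    return events


# Hypothetical 1-second readings: two loud events, the first lasting two seconds.
sample = [52, 55, 84, 86, 60, 51, 83, 50]
print(count_peak_events(sample))
```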

Also worth noting is the high variance of ambient noise, the central horizontal blue “cloud” above, ranging from ~45 dB to ~75 dB. We sought to understand the nature of this variance and whether it contained structure.

Looking more closely at just a few minutes of data during the Jan 7 morning commute, we can see that indeed there is a periodic structure to the variance.

In comparing to on-site observations, we could determine that this period was defined by the traffic signal which sits between the sensor and the train tracks, on Alma St. Additionally, we often observe an “M” structure (bimodal peak) indicating the southbound traffic accelerating from the stop line when the light turned green, followed by the passing northbound traffic seconds later.
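A periodic structure like this can also be estimated numerically rather than by eye. The sketch below uses a plain autocorrelation on synthetic data; the 90-second cycle is an assumption chosen for the example, not the measured period of the Alma St. signal.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a positive integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


def dominant_period(series, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(series, lag))


# Synthetic ambient noise with an assumed 90-second cycle (seven full cycles).
cycle = 90
noise = [60 + 8 * math.sin(2 * math.pi * t / cycle) for t in range(630)]
period = dominant_period(noise, 30, 150)
print(period)
```

On real sensor data the peak would be broader and noisier, so the search window for the lag matters.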

Looking at a few minutes of the same morning commute, we can clearly see when the train passed and sounded its horn. Here again, green indicates a southbound train, red indicates a northbound train.

In this case, the southbound train passed slightly before its scheduled arrival time at the California Avenue station, and the northbound train passed within its scheduled arrival minute, both on time. Note also the peak unassociated with the train. We’ll discuss this next.

Perhaps a more useful summary of the data collected is shown as a histogram, where the decibels are shown on the X axis and the frequency (count) is shown on the Y axis.

We can clearly see a bimodal distribution, where sound is roughly normally distributed, with a second distribution at the higher end. The question still remained why several of the peak observed values fell nowhere near the scheduled train time?
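A histogram like this needs nothing more than binning and counting. The readings below are invented to show a small second cluster at the loud end; the bin width of 5 dB is an arbitrary choice for the sketch.

```python
from collections import Counter


def histogram(levels_db, bin_width=5):
    """Count samples per bin; each key is the lower edge of its bin in dB."""
    counts = Counter(int(level // bin_width) * bin_width for level in levels_db)
    return dict(sorted(counts.items()))


# Hypothetical readings: mostly ambient noise plus a few horn-level samples.
readings = [48, 51, 52, 53, 54, 56, 58, 84, 86, 88]
bins = histogram(readings)
for lower_edge, count in bins.items():
    print(f"{lower_edge:>3} dB | {'#' * count}")
```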

The answer here requires no sensors: airplanes, sirens and freight trains are frequent noise sources in Palo Alto. These factors, coupled with a nearby residential construction project accounted for the non-regular noise events we observed.

Click HERE to hear a sample sound.

Finally, we subsetted the data into three groups, one to look at non-train minutes, one to look at northbound train minutes and one to look at southbound train minutes. The mean dB levels were 52.13, 52.18 and 52.32 respectively. While the order here makes sense, these samples bury the outcome since a horn blast may only be one second of a train-minute. The difference between northbound and southbound is consistent with on-site observation: given the sensor lies on the northeast corner of the crossing, horn blasts from southbound trains were more pronounced.
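To see why per-minute means bury a horn blast, consider a simple arithmetic illustration with invented numbers: a minute of 52 dB readings in which a single second reaches 90 dB.

```python
from statistics import mean


def summarize(minute_db):
    """Return (mean, maximum) of one minute of 1-second dB readings."""
    return mean(minute_db), max(minute_db)


quiet_minute = [52.0] * 60
train_minute = [52.0] * 59 + [90.0]  # hypothetical: one second of horn

quiet_mean, quiet_max = summarize(quiet_minute)
train_mean, train_max = summarize(train_minute)
print(round(quiet_mean, 2), quiet_max)
print(round(train_mean, 2), train_max)
```

The mean moves by well under one decibel while the maximum jumps by 38 dB, which is why the peak-based views above are more informative than the group means.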

Conclusion

Before making any conclusions it should be noted again that these are not scientific findings, but rather an attempt to add some rigor to the discussion around Caltrain and noise pollution. Further study, with a longer period of analysis and replicated data collection, would be required to support these conclusions statistically.

That said, we can readdress the topics in question:

Timing: Are train horns sounded at the appropriate time?

The FRA recommends engineers sound their horn between 15 and 20 seconds before a crossing. Given the tight urban nature of this crossing this recommendation seems a misfit. Caltrain engineers are sounding within 2-3 seconds of the crossing, which seems more appropriate.

Schedule: Are Caltrains coming and going on time?

Though not explored in depth here, generally we can observe that trains are passing our sensor prior to their scheduled arrival at the upcoming station.

Volume: Are the Caltrain horns sounding at the appropriate level?

As discussed above, the apparent dB level at a location very close to the track was well below the FRA recommended levels.

Relativity: How do Caltrain horns contribute to overall urban noise levels?

The Caltrain horns add roughly 10 dB to peak baseline noise levels, including the periodic traffic events observed at the intersection.

Opinions

Due to their regular frequency and physical presence, trains are an easy target when it comes to urban sound attenuation efforts. However, the regular oscillations of traffic, sirens, airplanes and construction create a very high, if not predictable, baseline above which trains must be heard.

Considering the importance of safety to this system, which operates just inches from bikers, drivers and pedestrians, there is a tradeoff to be made between supporting quiet zone initiatives and the capability of speeding trains to be heard.

In Palo Alto, as we move into an era of electric cars, improved bike systems and increased pedestrian access, the oscillations of noise created by non-train activities may indeed subside over time. And this in turn, might provide an opportunity to lower the “alert sounds” such as sirens and train horns required to deliver these services safely. Someday much of our everyday activity might be accomplished quietly.

Until then, we can only appreciate these sounds which must rise above our noisy baseline, as a reminder of our connectedness to the greater Bay Area through our shared focus on safety and convenient public transportation.

Acknowledgements:

Sincere thanks to Helen T. and Nick Parlante of Stanford University, Mark Phillips of Helium and Nik Wekwerth/Jason Derrett/Peter Haggerty of Librato for their help and technical support.

Thanks also to my peers at The Data Guild, Aman, Chris, Dave and Sandy and the Palo Alto Police IT department for their feedback.

And thanks to my daughter Tallulah for her help soldering and moral support.

[1] http://en.wikipedia.org/wiki/Caltrain

Originally posted on LinkedIn. 


Guest blog post by ajit jaokar

By Ajit Jaokar @ajitjaokar Please connect with me if you want to stay in touch on linkedin and for future updates

Cross posted from my blog - I look forward to discussion/feedback here

Note: The paper below is best read as a pdf which you can download from the blog for free

Background and Abstract

This article is a part of an evolving theme. Here, I explain the basics of Deep Learning and how Deep learning algorithms could apply to IoT and Smart city domains. Specifically, as I discuss below, I am interested in complementing Deep learning algorithms using IoT datasets. I elaborate these ideas in the Data Science for Internet of Things program which enables you to work towards being a Data Scientist for the Internet of Things (modelled on the course I teach at Oxford University and UPM – Madrid). I will also present these ideas at the International conference on City Sciences at Tongji University in Shanghai and the Data Science for IoT workshop at the Iotworld event in San Francisco.

Deep Learning

Deep learning is often thought of as a set of algorithms that ‘mimics the brain’. A more accurate description would be an algorithm that ‘learns in layers’. Deep learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts.

The obscure world of deep learning algorithms came into the public limelight when Google researchers fed 10 million random, unlabeled images from YouTube into their experimental Deep Learning system. They then instructed the system to recognize the basic elements of a picture and how these elements fit together. The system, comprising 16,000 CPUs, was able to identify images that shared similar characteristics (such as images of cats). This canonical experiment showed the potential of Deep learning algorithms. Deep learning algorithms apply to many areas including Computer Vision, Image recognition, pattern recognition, speech recognition, behaviour recognition, etc.

 

How does a Computer Learn?

To understand the significance of Deep Learning algorithms, it’s important to understand how Computers think and learn. Since the early days, researchers have attempted to create computers that think. Until recently, this effort has been rules-based, adopting a ‘top down’ approach. The top-down approach involved writing enough rules for all possible circumstances. But this approach is obviously limited by the number of rules and by its finite rules base.

To overcome these limitations, a bottom-up approach was proposed. The idea here is to learn from experience. The experience was provided by ‘labelled data’. Labelled data is fed to a system and the system is trained based on the responses. This approach works for applications like Spam filtering. However, most data (pictures, video feeds, sounds, etc.) is not labelled and if it is, it’s not labelled well.

The other issue is in handling problem domains which are not finite. For example, the problem domain in chess is complex but finite because there are a finite number of primitives (32 chess pieces) and a finite set of allowable actions (on 64 squares). But in real life, at any instant, we have a potentially large, or even infinite, number of alternatives. The problem domain is thus very large.

A problem like playing chess can be ‘described’ to a computer by a set of formal rules. In contrast, many real world problems are easily understood by people (intuitive) but not easy to describe (represent) to a Computer (unlike Chess). Examples of such intuitive problems include recognizing words or faces in an image. Such problems are hard to describe to a Computer because the problem domain is not finite. Thus, the problem description suffers from the curse of dimensionality, i.e. when the number of dimensions increases, the volume of the space increases so fast that the available data becomes sparse. Computers cannot be trained on sparse data. Such scenarios are not easy to describe because there is not enough data to adequately represent the combinations spanned by the dimensions. Nevertheless, such ‘infinite choice’ problems are common in daily life.
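The sparsity argument can be made concrete with a little arithmetic. As a sketch, assume each dimension is split into 10 bins and we have 1,000 samples (both numbers are arbitrary choices for illustration); the best case is that every sample lands in its own grid cell.

```python
def max_coverage(samples, bins_per_dim, dims):
    """Upper bound on the fraction of grid cells that can hold at least one sample."""
    cells = bins_per_dim ** dims
    return min(1.0, samples / cells)


# With a fixed budget of 1,000 samples, coverage collapses as dimensions grow.
for dims in (1, 2, 3, 6):
    print(dims, max_coverage(1000, 10, dims))
```

With three dimensions the 1,000 samples could in principle touch every cell; with six dimensions they can reach at most one cell in a thousand.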

How do Deep learning algorithms learn?

Deep learning deals with ‘hard/intuitive’ problems which have few or no rules and high dimensionality. Here, the system must learn to cope with unforeseen circumstances without knowing the rules in advance. Many existing systems like Siri’s speech recognition and Facebook’s face recognition work on these principles. Deep learning systems are possible to implement now for three reasons: high CPU power, better algorithms and the availability of more data. Over the next few years, these factors will lead to more applications of Deep learning systems.

Deep Learning algorithms are modelled on the workings of the Brain. The Brain may be thought of as a massively parallel analog computer which contains about 10^10 simple processors (neurons), each of which requires a few milliseconds to respond to input. To model the workings of the brain, in theory, each neuron could be designed as a small electronic device which has a transfer function similar to a biological neuron. We could then connect each neuron to many other neurons to imitate the workings of the Brain. In practice, it turns out that this model is not easy to implement and is difficult to train.

So, we make some simplifications in the model mimicking the brain. The resultant neural network is called a “feed-forward back-propagation network”. The simplifications/constraints are: We change the connectivity between the neurons so that they are in distinct layers. Each neuron in one layer is connected to every neuron in the next layer. Signals flow in only one direction. And finally, we simplify the neuron design to ‘fire’ based on simple, weight-driven inputs from other neurons. Such a simplified network (feed-forward neural network model) is more practical to build and use.

Thus:

a)      Each neuron receives a signal from the neurons in the previous layer

b)      Each of those signals is multiplied by a weight value.

c)      The weighted inputs are summed, and passed through a limiting function which scales the output to a fixed range of values.

d)      The output of the limiter is then broadcast to all of the neurons in the next layer.
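Steps (a) to (d) can be sketched in a few lines of Python. The sigmoid is used here as the limiting function, which is one common choice rather than the only one, and the weights and inputs are arbitrary example values.

```python
import math


def sigmoid(x):
    """Limiting function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights):
    """Compute the outputs of one fully connected layer.

    weights[j][i] is the weight on the signal from input i to neuron j.
    """
    outputs = []
    for neuron_weights in weights:
        # (b) multiply each incoming signal by its weight, (c) sum and limit.
        total = sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(sigmoid(total))
    return outputs  # (d) these values are passed on to every neuron in the next layer


signals = [1.0, 0.0]                  # (a) signals from the previous layer
weights = [[0.0, 0.0], [2.0, -1.0]]   # arbitrary example weights for two neurons
out = layer_forward(signals, weights)
print(out)
```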

Image and parts of description in this section adapted from : Seattle robotics site

The most common learning algorithm for artificial neural networks is called Back Propagation (BP), which stands for “backward propagation of errors”. To use the neural network, we apply the input values to the first layer, allow the signals to propagate through the network and read the output. A BP network learns by example, i.e. we must provide a learning set that consists of some input examples and the known correct output for each case. We use these input-output examples to show the network what type of behaviour is expected. The BP algorithm allows the network to adapt by propagating the error value backwards through the network and adjusting the weights. Each link between neurons has a unique weighting value, and the ‘intelligence’ of the network lies in the values of the weights. With each iteration of the errors flowing backwards, the weights are adjusted. The whole process is repeated for each of the example cases. Thus, to detect an object, programmers would train a neural network by rapidly sending across many digitized versions of data (for example, images) containing that object. If the network did not accurately recognize a particular pattern, the weights would be adjusted. The eventual goal of this training is to get the network to consistently recognize the patterns that we recognize (e.g. cats).

How does Deep Learning help to solve the intuitive problem?
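The following is a minimal sketch of back propagation on a tiny network with two inputs, two hidden neurons and one output. The learning set, starting weights and learning rate are all invented for illustration, and bias terms are left out to match the simplified description above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Return (hidden activations, network output) for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One back-propagation update for a single example (weights change in place)."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output neuron (squared error, sigmoid derivative).
    delta_out = (output - target) * output * (1.0 - output)
    # Propagate the error backwards to each hidden neuron.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    for j in range(len(hidden)):
        w_out[j] -= rate * delta_out * hidden[j]
        for i in range(len(x)):
            w_hidden[j][i] -= rate * delta_hidden[j] * x[i]


def total_error(examples, w_hidden, w_out):
    """Sum of squared errors over the whole learning set."""
    return sum(0.5 * (forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in examples)


# Toy learning set (hypothetical): the target simply copies the first input.
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 1), ([1, 1], 1)]
w_hidden = [[0.1, -0.2], [0.3, 0.4]]  # arbitrary fixed starting weights
w_out = [0.2, -0.1]

error_before = total_error(examples, w_hidden, w_out)
for _ in range(2000):
    for x, t in examples:
        train_step(x, t, w_hidden, w_out)
error_after = total_error(examples, w_hidden, w_out)
print(error_before, error_after)
```

Real deep learning systems use many more layers, bias terms and far larger learning sets, but the loop is the same: forward pass, measure the error, push it backwards, adjust the weights.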

The whole objective of Deep Learning is to solve ‘intuitive’ problems i.e. problems characterized by High dimensionality and no rules.  The above mechanism demonstrates a supervised learning algorithm based on a limited modelling of Neurons – but we need to understand more.

Deep learning allows computers to solve intuitive problems because:

  • With Deep learning, Computers can not only learn from experience but also understand the world in terms of a hierarchy of concepts, where each concept is defined in terms of simpler concepts.
  • The hierarchy of concepts is built ‘bottom up’ without predefined rules by addressing the ‘representation problem’.

This is similar to the way a child learns ‘what a dog is’, i.e. by understanding the sub-components of a concept, e.g. the behaviour (barking), the shape of the head, the tail, the fur etc., and then putting these concepts together into one bigger idea, i.e. the dog itself.

The (knowledge) representation problem is a recurring theme in Computer Science.

Knowledge representation incorporates theories from psychology which look to understand how humans solve problems and represent knowledge. The idea is that if, like humans, computers were to gather knowledge from experience, human operators would not need to formally specify all of the knowledge that the computer needs to solve a problem.

For a computer, the choice of representation has an enormous effect on the performance of machine learning algorithms. For example, based on the sound pitch, it is possible to know if the speaker is a man, woman or child. However, for many applications, it is not easy to know what set of features represent the information accurately. For example, to detect pictures of cars in images, a wheel may be circular in shape – but actual pictures of wheels may have variants (spokes, metal parts etc). So, the idea of representation learning is to find both the mapping and the representation.

If we can find representations and their mappings automatically (i.e. without human intervention), we have a flexible design to solve intuitive problems.   We can adapt to new tasks and we can even infer new insights without observation. For example, based on the pitch of the sound – we can infer an accent and hence a nationality. The mechanism is self learning. Deep learning applications are best suited for situations which involve large amounts of data and complex relationships between different parameters. Training a Neural network involves repeatedly showing it that: “Given an input, this is the correct output”. If this is done enough times, a sufficiently trained network will mimic the function you are simulating. It will also ignore inputs that are irrelevant to the solution. Conversely, it will fail to converge on a solution if you leave out critical inputs. This model can be applied to many scenarios as we see below in a simplified example.

An example of learning through layers

Deep learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts. This approach works for subjective and intuitive problems which are difficult to articulate.

Consider image data. Computers cannot understand the meaning of a collection of pixels. Mappings from a collection of pixels to a complex Object are complicated.

With deep learning, the problem is broken down into a series of hierarchical mappings – with each mapping described by a specific layer.

The input (representing the variables we actually observe) is presented at the visible layer. Then a series of hidden layers extracts increasingly abstract features from the input, with each layer concerned with a specific mapping. However, note that this process is not predefined, i.e. we do not specify what the layers select.

For example: From the pixels, the first hidden layer identifies the edges

From the edges, the second hidden layer identifies the corners and contours

From the corners and contours, the third hidden layer identifies the parts of objects

Finally, from the parts of objects, the fourth hidden layer identifies whole objects
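To give a feel for the first mapping (pixels to edges), here is a hand-written edge detector. Note that this is a swapped-in, fixed rule used purely for illustration: in a deep learning system the layer would learn such a feature from data rather than have it coded by a programmer. The tiny image and threshold are made up.

```python
def horizontal_edges(image, threshold=0.5):
    """Mark pixels whose right-hand neighbour differs strongly in brightness."""
    edges = []
    for row in image:
        edges.append([1 if abs(row[c + 1] - row[c]) > threshold else 0
                      for c in range(len(row) - 1)])
    return edges


# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_map = horizontal_edges(image)
for row in edge_map:
    print(row)
```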

Image and example source: Yoshua Bengio book – Deep Learning

Implications for IoT

To recap:

  • Deep learning algorithms apply to many areas including Computer Vision, Image recognition, pattern recognition, speech recognition, behaviour recognition etc
  • Deep learning systems are possible to implement now because of three reasons: High CPU power, Better Algorithms and the availability of more data. Over the next few years, these factors will lead to more applications of Deep learning systems.
  • Deep learning applications are best suited for situations which involve large amounts of data and complex relationships between different parameters.
  • Solving intuitive problems: Training a Neural network involves repeatedly showing it that: “Given an input, this is the correct output”. If this is done enough times, a sufficiently trained network will mimic the function you are simulating. It will also ignore inputs that are irrelevant to the solution. Conversely, it will fail to converge on a solution if you leave out critical inputs. This model can be applied to many scenarios

In addition, we have limitations in the technology. For instance, we have a long way to go before a Deep learning system can figure out that you are sad because your cat died (although it seems Cognitoys, based on IBM Watson, is heading in that direction). The current focus is more on identifying photos and guessing the age from photos (based on Microsoft’s Project Oxford API).

And we do indeed have a way to go, as Andrew Ng reminds us when he compares Artificial Intelligence to building a rocket ship:

“I think AI is akin to building a rocket ship. You need a huge engine and a lot of fuel. If you have a large engine and a tiny amount of fuel, you won’t make it to orbit. If you have a tiny engine and a ton of fuel, you can’t even lift off. To build a rocket you need a huge engine and a lot of fuel. The analogy to deep learning [one of the key processes in creating artificial intelligence] is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.”

Today, we are still limited by technology from achieving scale. Google’s neural network that identified cats had 16,000 nodes. In contrast, a human brain has an estimated 100 billion neurons!

Back propagation neural networks are well suited to some scenarios:

  • A large amount of input/output data is available, but you’re not sure how to relate the input to the output. Thus, we have a large number of “Given an input, this is the correct output” type scenarios which can be used to train the network, because it is easy to create a number of examples of correct behaviour.
  • The problem appears to have overwhelming complexity. The complexity arises from a low rules base, from high dimensionality and from data which is not easy to represent. However, there is clearly a solution.
  • The solution to the problem may change over time, within the bounds of the given input and output parameters (i.e., today 2+2=4, but in the future we may find that 2+2=3.8). Outputs can also be “fuzzy”, or non-numeric.
  • Domain expertise is not strictly needed because the output can be purely derived from inputs: This is controversial because it is not always possible to model an output based on the input alone. However, consider the example of stock market prediction. In theory, given enough cases of inputs and outputs for a stock value, you could create a model which would predict unknown scenarios if it was trained adequately using deep learning techniques.
  • Inference: We need to infer new insights without observation. For example, based on the pitch of a sound, we can infer an accent and hence a nationality.

Given an IoT domain, we could consider the top-level questions:

  • What existing applications can be complemented by Deep learning techniques by adding an intuitive component (e.g., in smart cities)?
  • What metrics are being measured and predicted? And how could we add an intuitive component to the metric?
  • What applications exist in computer vision, image recognition, pattern recognition, speech recognition, behaviour recognition etc. which also apply to IoT?

Now, extending more deeply into the research domain, here are some areas of interest that I am following.

Complementing Deep Learning algorithms with IoT datasets

In essence, these techniques/strategies complement Deep learning algorithms with IoT datasets.

1) Deep learning algorithms and Time series data: Time series data (coming from sensors) can be thought of as a 1D grid of samples taken at regular time intervals, just as image data can be thought of as a 2D grid of pixels. This allows us to model Time series data with Deep learning algorithms (most sensor / IoT data is time series). It is still relatively uncommon to combine Deep learning and Time series, but there are already some instances of this approach (for example, Deep Learning for Time Series Modelling, which predicts energy loads using only time and temperature data).

2) Multiple modalities: Multimodality in deep learning algorithms is being explored, in particular cross modality feature learning, where better features for one modality (e.g., video) can be learned if multiple modalities (e.g., audio and video) are present at feature learning time.

3) Temporal patterns in Deep learning: In their recent paper, Ph.D. student Huan-Kai Peng and Professor Radu Marculescu, from Carnegie Mellon University’s Department of Electrical and Computer Engineering, propose a new way to identify the intrinsic dynamics of interaction patterns at multiple time scales. Their method involves building a deep-learning model that consists of multiple levels; each level captures the relevant patterns of a specific temporal scale. The newly proposed model can also be used to explain the possible ways in which short-term patterns relate to the long-term patterns. For example, it becomes possible to describe how a long-term pattern in Twitter can be sustained and enhanced by a sequence of short-term patterns, including characteristics like popularity, stickiness, contagiousness, and interactivity. The paper can be downloaded HERE
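
To make the first point above more concrete, here is a minimal sketch of the "1D grid" idea: a small kernel slides along a sensor series in the same way a 2D kernel slides across an image in a convolutional layer. The temperature values and the hand-picked kernel are assumptions for illustration; in a real Deep learning model the kernel weights would be learned from data.

```python
def conv1d(series, kernel):
    """Slide the kernel along the series and return one value per position
    (no padding, so the output is shorter than the input)."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


# Hypothetical temperature samples taken at regular intervals.
temperatures = [20.0, 20.0, 21.0, 25.0, 25.0, 24.0]

# A hand-picked kernel that responds to the change between consecutive samples.
change_kernel = [-1.0, 1.0]

features = conv1d(temperatures, change_kernel)
print(features)  # [0.0, 1.0, 4.0, 0.0, -1.0]
```

The largest feature value marks the jump from 21 to 25 degrees, which is the kind of local pattern a convolutional layer can pick out of sensor data.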

Implications for Smart cities

I see Smart cities as an application domain for the Internet of Things. Many definitions exist for Smart cities/future cities. From our perspective, Smart cities refer to the use of digital technologies to enhance performance and wellbeing, to reduce costs and resource consumption, and to engage more effectively and actively with citizens (adapted from Wikipedia). Key ‘smart’ sectors include transport, energy, health care, water and waste. A more comprehensive list of Smart City/IoT application areas includes: intelligent transport systems (including autonomous vehicles), medical and healthcare, environment, waste management, air quality, water quality, accident and emergency services, and energy (including renewables). In all these areas we could find applications to which we could add an intuitive component based on the ideas above.

Typical domains will include computer vision, image recognition, pattern recognition, speech recognition and behaviour recognition. Of special interest are new areas such as self-driving cars (e.g., the Lutz pod) and even larger vehicles such as self-driving trucks.

Conclusions

Deep learning involves learning through layers, which allows a computer to build a hierarchy of complex concepts out of simpler concepts. Deep learning is used to address intuitive applications with high dimensionality. It is an emerging field and over the next few years, due to advances in technology, we are likely to see many more applications in the Deep learning space. I am specifically interested in how IoT datasets can be used to complement deep learning algorithms. This is an emerging area with some examples shown above. I believe that it will have widespread applications, many of which we have not fully explored (as in the Smart city examples).

I see this article as part of an evolving theme. Future updates will explore how Deep learning algorithms could apply to IoT and Smart city domains. Also, I am interested in complementing Deep learning algorithms using IoT datasets.

I elaborate on these ideas in the Data Science for Internet of Things program (modelled on the course I teach at Oxford University and UPM – Madrid). I will also present these ideas at the International conference on City Sciences at Tongji University in Shanghai and the Data Science for IoT workshop at the Iotworld event in San Francisco.

Please connect with me on LinkedIn if you want to stay in touch and receive future updates.

Follow us @IoTCtrl | Join our Community

Read more…

Guest blog post by ajit jaokar

Often, Data Science for IoT differs from conventional data science due to the presence of hardware.

Hardware could be involved in integration with the Cloud or Processing at the Edge (which Cisco and others have called Fog Computing).

Alternatively, we see entirely new classes of hardware specifically involved in Data Science for IoT (such as the SyNAPSE chip for Deep learning).

Hardware will increasingly play an important role in Data Science for IoT.

A good example is from a company called Cognimem, which natively implements classifiers (unfortunately, the company does not seem to be active any more, as per their Twitter feed).

In IoT, speed and real time response play a key role. Often it makes sense to process the data closer to the sensor.

This allows for a limited / summarized data set to be sent to the server if needed and also allows for localized decision making.  This architecture leads to a flow of information out from the Cloud and the storage of information at nodes which may not reside in the physical premises of the Cloud.
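
As a rough sketch of this idea (process close to the sensor, decide locally, send only a summary), the snippet below reduces a window of raw readings to a few numbers. The function name, the summary fields, the threshold and the readings are all assumptions for illustration and are not taken from any particular product.

```python
from statistics import mean


def summarize_window(readings, alert_threshold):
    """Reduce a window of raw sensor readings to a small summary
    and make a simple local decision at the edge."""
    summary = {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }
    # Localized decision making: raise an alert without a round trip to the server.
    summary["alert"] = summary["max"] > alert_threshold
    return summary


# Hypothetical window of temperature readings collected on the device.
window = [21.0, 21.5, 22.0, 35.0, 21.5]
summary = summarize_window(window, alert_threshold=30.0)
print(summary)
```

Only the summary dictionary would need to travel to the server, while the alert can be acted on locally as soon as the spike appears.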

In this post, I explore the various hardware touchpoints through which Data analytics and IoT can work together.

Cloud integration: Making decisions at the Edge

The Intel Wind River edge management system is certified to work with the Intel stack and includes capabilities such as data capture, rules-based data analysis and response, configuration, file transfer and remote device management.

Integration of Google Analytics into Lantronix hardware allows sensors to send real-time data to any node on the Internet or to a cloud-based application.

Microchip integration with Amazon Web Services uses an embedded application with the Amazon Elastic Compute Cloud (EC2) service. It is based on the Wi-Fi Client Module Development Kit. Languages like Python or Ruby can be used for development.

Integration of Freescale and Oracle consolidates data collected from multiple appliances from multiple Internet of Things service providers.

Libraries

Libraries are another avenue for analytics engines to be integrated into products, often at the point of creation of the device. Xively cloud services is an example of this strategy, through the Xively libraries.

APIs

In contrast, keen.io provides APIs for IoT devices to create their own analytics engines (e.g., the Pebble smartwatch’s use of keen.io) without locking equipment providers into a particular data architecture.

Specialized hardware

We see increasing deployment of specialized hardware for analytics. An example is Egburt from Camgian, which uses sensor fusion technologies for IoT.

In the Deep learning space, GPUs are widely used, and more specialized hardware is emerging, such as IBM’s SyNAPSE chip. Even more interesting hardware platforms are also emerging, such as Nervana Systems, which creates hardware specifically for Neural networks.

Ubuntu Core and IFTTT spark

Two more initiatives on my radar deserve a space of their own, even though neither of them currently has an analytics engine: Ubuntu Core (Docker containers plus a lightweight Linux distribution as an IoT OS) and the IFTTT spark initiatives.

Comments welcome

This post is leading to a vision for a Data Science for IoT course/certification. Please sign up on the link if you wish to know more when it launches in Feb.

Image source: cognimem


Read more…

Security challenges for IoT

Guest blog post by vozag
 

The emergence of IoT presents security challenges greater than any that industrial systems have seen.

The Open Web Application Security Project (OWASP) is a reputable international organization which focuses on improving the security of software. It sponsors the hugely popular Top Ten project, which publishes the top ten security risks for web applications worldwide.

The “OWASP Internet of Things (IoT) Top 10” project defines the top ten security surface areas presented by IoT systems. The project aims to provide practical security recommendations for builders, breakers, and users of IoT systems.

Last year HP, which started this project, used it as a baseline to evaluate ten widely used IoT devices and released a report. The study concluded that, on average, each device studied had 25 vulnerabilities of the kinds listed in the project.

The top 10 vulnerabilities and the impact of each are given below, in the order listed in the project:

 

Insecure Web Interface

Insecure web interfaces can result in data loss or corruption, lack of accountability, or denial of access and can lead to complete device takeover.

 

Insufficient Authentication/Authorization

Insufficient authentication/authorization can result in data loss or corruption, lack of accountability, or denial of access and can lead to complete compromise of the device and/or user accounts.

 

Insecure Network Services

Insecure network services can result in data loss or corruption, denial of service or facilitation of attacks on other devices.

 

Lack of Transport Encryption

Lack of transport encryption can result in data loss and depending on the data exposed, could lead to complete compromise of the device or user accounts.

 

Privacy concerns

Collection of personal data along with a lack of protection of that data can lead to compromise of a user's personal data.

 

Insecure Cloud Interface

An insecure cloud interface could lead to compromise of user data and control over the device.

 

Insecure Mobile Interface

An insecure mobile interface could lead to compromise of user data and control over the device.

 

Insufficient Security Configurability

Insufficient security configurability could lead to compromise of the device whether intentional or accidental and/or data loss.

 

Insecure Software/Firmware

Insecure software/firmware could lead to compromise of user data, control over the device and attacks against other devices.

 

Poor Physical Security

Insufficient physical security could lead to compromise of the device itself and any data stored on that device.
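
To make one item in the list above more concrete, here is a minimal sketch of how a device-side client might address "Lack of Transport Encryption" using Python's standard `ssl` module. The function names `make_client_context` and `send_reading`, and the host, port and payload they would be called with, are hypothetical. This is an illustration under those assumptions and not a recommendation taken from the OWASP project itself.

```python
import socket
import ssl


def make_client_context():
    """Build a TLS client context that verifies the server certificate
    and hostname, and refuses protocol versions older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def send_reading(host, port, payload):
    """Send one payload (bytes) to a server over an encrypted, verified connection.
    host and port are placeholders for whatever service the device talks to."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(payload)


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

The default context already requires a valid certificate and checks the hostname, so a device built this way does not silently fall back to an unverified or plaintext channel.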


Read more…

Given all the buzz in the market around IoT, we looked at related projects on the crowdfunding website Kickstarter.com to see how IoT projects are doing compared with all the others.

We chose projects which have either “IoT” or “Internet of Things” in their title or description, and here are our findings.

The success rate of projects at Kickstarter is around 37.5%; for Technology projects it is 21%, which is a lot lower than the average. In spite of this, our analysis shows that the success rate of IoT projects is 44%, which is pretty good news. People are realizing the importance of IoT and are willing to fund related projects.

The project locations are concentrated almost entirely in the US and Europe, with a few scattered across Asia and Australia.

Because the projects are spread all over the world, their funding goals were set in different currencies. To be able to analyse the monetary part, we normalized all the numbers to US dollars.

The total amount sought by all the IoT-related projects (ongoing, successful and failed) on Kickstarter is around $4.7 million, and the amount actually pledged to them is around $1.5 million.

If you consider only the projects which have made it, the total amount sought and pledged is approximately $1.2 million. So only 2% of the pledged amount went to unsuccessful projects, which is usually the case with most projects on Kickstarter.

The average funding requested across all projects is around $60 thousand, while the average funding requested by the successfully funded projects is around $44 thousand. For the failed projects it is $3,500.

The top 10 successfully funded projects along with their links are given below

 

Read more…

At CES 2015, I was fascinated by all sorts of possible applications of IoT – socks with sensors, mattresses with sensors, smart watches, smart everything – it seems like a scene from a sci-fi movie has just come true. People are eager to learn more about what’s happening around them and now they can.

While I was there I attended a talk given by David Pogue – he is awesome. He pointed out that the prevalence of smartphones is the key to the realization of the phenomenon called “Quantified Self.” I agree with him. Smartphones play a vital role as a hub where all our personal data converges and is presented seamlessly. The fact that you carry your smartphone around all the time, and that its screen is well suited to displaying all that information, makes it a catalyst for wearable devices and IoT, or what we like to call Intelligence of Things.

 

It’s all relevant; Big Data, IoT, Wearable, Cloud Computing… While most data is uploaded to the cloud, the client devices are generally powerful enough that the computing can be decentralized. That said, small data (client side) and big data (server side) form an eco-system where small data triggers the knowledge base cultivated by big data and does the predictive analysis and decision making in a timely manner. Furthermore, your smartphone gathers versatile data and is able to analyze cross-app data to personalize your application settings. For example, what about optimizing navigation based on my physical condition? Or how about suggesting the best route according to my health along with the weather? These individual data records might be small, but collectively they enrich the content of analysis and contribute some amazing value. We at BigObject really appreciate this context of Big Data.

 

Marc Andreessen once said, “I think we are all underestimating the impact of aggregated big data across many domains of human behavior, surfaced by smartphone apps.” For us here at BigObject, the next big thing in big data is to find a methodology that can link multiple data sources together and identify the meaningful connections between that data. Most importantly, it must be responsive enough to deliver actionable insight and simple enough for people to adopt. That is the key to realizing a connected world.


Originally posted on Data Science Central

Read more…

The Internet of Things (IOT) will soon produce a massive volume and variety of data at unprecedented velocity. If "Big Data" is the product of the IOT, "Data Science" is its soul.

Let's define our terms:

Internet of Things (IOT): equipping all physical and organic things in the world with identifying intelligent devices allowing the near real-time collecting and sharing of data between machines and humans. The IOT era has already begun, albeit in its first primitive stage.

Data Science: the analysis of data creation. May involve machine learning, algorithm design, computer science, modeling, statistics, analytics, math, artificial intelligence and business strategy.

Big Data: the collection, storage, analysis and distribution/access of large data sets. Usually includes data sets with sizes beyond the ability of standard software tools to capture, curate, manage, and process the data within a tolerable elapsed time.

We are in the pre-industrial age of data technology and science used to process and understand data. Yet the early evidence provides hope that we can manage and extract knowledge and wisdom from this data to improve life, business and public services at many levels.

To date, the internet has mostly connected people to information, people to people, and people to business. In the near future, the internet will provide organizations with unprecedented data. The IOT will create an open, global network that connects people, data and machines.

Billions of machines, products and things from the physical and organic world will merge with the digital world allowing near real-time connectivity and analysis. Machines and products (and every physical and organic thing) embedded with sensors and software - connected to other machines, networked systems, and to humans - allow us to cheaply and automatically collect and share data, analyze it and find valuable meaning. Machines and products in the future will have the intelligence to deliver the right information to the right people (or other intelligent machines and networks), any time, to any device. When smart machines and products can communicate, they help us and other machines understand so we can make better decisions, act fast, save time and money, and improve products and services.

The IOT, Data Science and Big Data will combine to create a revolution in the way organizations use technology and processes to collect, store, analyze and distribute any and all data required to operate optimally, improve products and services, save money and increase revenues. Simply put, welcome to the new information age, where we have the potential to radically improve human life (or create a dystopia - a subject for another time).

The IOT will produce gigantic amounts of data. Yet data alone is useless - it needs to be interpreted and turned into information. However, most information has limited value - it needs to be analyzed and turned into knowledge. Knowledge may have varying degrees of value - but it needs specialized manipulation to transform into valuable, actionable insights. Valuable, actionable knowledge has great value for specific domains and actions - yet requires sophisticated, specialized expertise to be transformed into multi-domain, cross-functional wisdom for game changing strategies and durable competitive advantage.

Big data may provide the operating system and special tools to get actionable value out of data, but the soul of the data, the knowledge and wisdom, is the bailiwick of the data scientist.

Originally posted on  Data Science Central
Read more…

Internet of Things and Bayesian Networks

Originally posted on AnalyticBridge

As big data becomes more of a cliché with every passing day, do you feel the Internet of Things is the next marketing buzzword to take over our lives?

So what exactly is the Internet of Things (IoT), and why are we going to hear more about it in the coming days?

The Internet of Things (IoT) today denotes advanced connectivity of devices, systems and services that goes beyond machine-to-machine communications and covers a wide variety of domains and applications, specifically in manufacturing and in the power, oil and gas utilities.

An IoT application can be an automobile with built-in sensors that alert the driver when the tyre pressure is low. Built-in sensors on equipment in a power plant transmit real-time data and thereby enable better transmission planning and load balancing. In the oil and gas industry, IoT can help in planning drilling better and in tracking cracks in gas pipelines.

IoT will lead to better predictive maintenance in manufacturing and utilities, and this will in turn lead to better control, tracking, monitoring and back-up of the process. Even a small percentage improvement in machine performance can significantly benefit the company bottom line.

IoT is in some ways going to make our machines more brilliant and reactive.

According to GE, 150 billion dollars in waste across major industries can be eliminated by IoT.

A natural question is how IoT differs from the SCADA (supervisory control and data acquisition) systems which are used extensively in the manufacturing industries.

IoT can be considered to be an evolution on the data acquisition part of the SCADA systems.

SCADA systems have basically been considered to be systems in silos, with the data accessible to few people and not leading to long-term benefit.

IoT starts with embedding advanced sensors in machines and collecting the data for advanced analytics.

As we start receiving data from the sensors, one important aspect that needs our full attention is whether the data transmitted is correct or erroneous.

How do we validate the data quality?

We are dealing with uncertainty here.

One of the most commonly used methods for modelling uncertainty is Bayesian networks.

A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.

Bayesian networks can be used extensively in Internet of Things projects to assess the reliability of data transmitted by the sensors.
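
As a minimal sketch of the idea, consider the smallest possible Bayesian network: one node for "the sensor is faulty" with an edge to one node for "the reading looks anomalous". Bayes' rule then gives the probability that the sensor is faulty once an anomalous reading has been seen. All probabilities below are made-up assumptions for illustration; a real project would estimate them from maintenance records or labelled data, and would usually have many more nodes.

```python
def posterior_faulty(p_faulty, p_anomaly_given_faulty, p_anomaly_given_ok):
    """P(sensor faulty | anomalous reading) for a two-node network
    Faulty -> AnomalousReading, computed with Bayes' rule."""
    joint_faulty = p_faulty * p_anomaly_given_faulty
    joint_ok = (1.0 - p_faulty) * p_anomaly_given_ok
    return joint_faulty / (joint_faulty + joint_ok)


# Hypothetical numbers: 2% of sensors are faulty; a faulty sensor produces an
# anomalous reading 90% of the time, a healthy one 5% of the time.
p = posterior_faulty(p_faulty=0.02, p_anomaly_given_faulty=0.90, p_anomaly_given_ok=0.05)
print(f"{p:.3f}")  # prints 0.269
```

Under these assumed numbers, a single anomalous reading raises the probability of a fault from 2% to roughly 27%, so the reading deserves scrutiny but is still more likely to be genuine than not.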

Read more…

The Internet of Things may be giving way to the Internet of Everything as more and more uses are dreamed up for the new wave of Smart Cities.

In the Internet of Things, objects have their own IP address, meaning that sensors connected to the web can send data to the cloud on just about anything: how much traffic is rolling through a stoplight, how much water you’re using, or how full a trash dumpster is.

Cities are discovering how they can use these new technologies — and the data they generate — to be more efficient and cost effective in many different ways. And it’s a good thing, too; some estimates suggest that 66 percent of the world’s population will live in urban areas by the year 2050.

These are cutting edge ideas, but here are some of the most fascinating ways Smart Cities are using big data and the Internet of Things to improve quality of life for their residents:

  • The city of Long Beach, California is using smart water meters to detect illegal watering in real time, and the meters have helped some homeowners cut their water usage by as much as 80 percent. That’s vital when the state is going through its worst drought in recorded history and the governor has enacted the first-ever state-wide water restrictions.
  • Los Angeles uses data from magnetic road sensors and traffic cameras to control traffic lights and thus the flow (or congestion) of traffic around the city. The computerized system controls 4,500 traffic signals around the city and has reduced traffic congestion by an estimated 16 percent.  
  • Xcel Energy initiated one of the first ever tests of a “smart grid” in Boulder, Colorado, installing smart meters on customers’ homes that would allow them to log into a website and see their energy usage in real time. The smart grid would also theoretically allow power companies to predict usage in order to plan for future infrastructure needs and prevent brown out scenarios.
  • A tech startup called Veniam is testing a new way to create mobile wi-fi hotspots all over the city in Porto, Portugal. More than 600 city buses and taxis have been equipped with wifi transmitters, creating the largest free wi-fi hotspot in the world. Veniam sells the routers and service to the city, which in turn provides the wi-fi free to citizens, like a public utility. In exchange, the city gets an enormous amount of data — with the idea being that the data can be used to offset the cost of the wi-fi in other areas. For example, in Porto, sensors tell the city’s waste management department when dumpsters are full, so they don’t waste time, man hours, or fuel emptying containers that are only partly full.
  • New York City is creating the world’s first “quantified community” where nearly everything about the environment and residents will be tracked. The community will be able to monitor pedestrian traffic flow, how much of the solid waste collected is recyclable or food waste, and air quality. The project will even collect data on residents’ health and activity levels through an opt-in mobile app.
  • Songdo, South Korea has been conceived and built as the ultimate Smart City — a city of the future. Trash collection in the city is completely automated, through pipes connected to every building. The solid waste is sorted then recycled, buried, or burned for fuel. The city is partnering with Cisco to test other technologies, including home appliances and utilities controlled by your smartphone, and even a tracking system for children (using microchips implanted in bracelets).

This is just the beginning of the integration of big data and the Internet of Things into daily life, but it is by no means the end. As our cities get smarter and begin collecting and sending more and more data, new uses will emerge that may revolutionize the way we live in urban areas.

Of course, more technology can also mean more opportunities for hackers and terrorists. (Anyone see Die Hard 4, where terrorists hacked the traffic control systems in Washington, D.C.?) The threat that a hacker could shut down a city’s power grid, traffic system, or water supply is real — mostly because the technology is so new that cities and providers are not taking the necessary steps to protect themselves.

Still, it would seem that the benefits will outweigh the risks with these new data-driven technologies for cities, so long as the municipalities are paying attention to security and protecting their assets and their customers.

What’s your opinion? Are you for or against more integrated technologies in cities? I’d love to hear your thoughts in the comments below.

I hope you found this post interesting. I am always keen to hear your views on the topic and invite you to comment with any thoughts you might have.

About : Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is: Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance. You can read a free sample chapter here.


Originally posted on Data Science Central

Read more…
