Sensors as a Service are a Disservice

Building up an IoT solution, at least in just about any work that I do, tends to require data from sensors.  Preferably cheap sensors.  And in many cases wireless sensors.  This has been problematic for years.  It used to be that the sensors were expensive, or the wireless was terribly unreliable, but as Moore’s Law has marched forward, you can now get reliable, inexpensive sensors with good battery life.  Unfortunately, sensor vendors seem to be opting for a new type of friction – instead of making money by selling us sensors, they’ve decided that making money off of the data from the sensor is the route to take.

It’s common to hear from the vendors that integration into your solution is simple.  After all, they provide SDKs that allow you to access your data.  Except the SDK is a REST API to their cloud.  A fair number of installations I work on don’t have internet connectivity available.  What then?  Or the connectivity is through a cellular connection, so I have to pay to send the data up to the cloud, then pay to bring it back to a gateway sitting two feet from the device?

Even when I have internet access, having the sensor data in some separate cloud just adds unnecessary complexity.  Let’s say I’m working on a solution where I want to store a location and a temperature periodically.  The “sensor as a service” architecture means that I send the location data to my server and the temperature data to the sensor vendor’s server; then, when I want to display that data in a temporally consistent manner, I have to pull from both sources (assuming both are up) and merge the results based on timestamps (hoping, of course, that the timestamps actually line up).
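
Just to make the pain concrete, here’s a minimal sketch of that merge dance in Python.  The two fetch functions are stand-ins for hypothetical pulls from my server and from the vendor’s cloud, and pandas does the nearest-timestamp matching.

```python
# Sketch of the merge the "sensor as a service" model forces on us.
# Assumes pandas; the fetch_* calls stand in for hypothetical REST pulls
# from my own server and from the sensor vendor's cloud API.
import pandas as pd

def fetch_locations():
    # Hypothetical: location records pulled from *my* server
    return pd.DataFrame({
        "ts": pd.to_datetime(["2013-09-12 10:00:01", "2013-09-12 10:00:31"]),
        "lat": [47.61, 47.62],
        "lon": [-122.33, -122.34],
    })

def fetch_temperatures():
    # Hypothetical: temperature records pulled from the vendor's cloud
    return pd.DataFrame({
        "ts": pd.to_datetime(["2013-09-12 10:00:03", "2013-09-12 10:00:29"]),
        "temp_f": [71.2, 71.9],
    })

loc = fetch_locations().sort_values("ts")
temp = fetch_temperatures().sort_values("ts")

# The timestamps are never identical, so match each location to the
# nearest temperature reading within a few seconds and hope for the best.
merged = pd.merge_asof(loc, temp, on="ts",
                       direction="nearest",
                       tolerance=pd.Timedelta("5s"))
print(merged)
```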

It amazes me that vendors think this is an elegant, or even viable, solution.  Sure, I can come up with use cases where having my sensor data in someone else’s cloud makes sense.  But for every one of those, I can think of at least ten scenarios where I just want to be able to connect right to the sensor from a gateway, very often in close proximity to the sensor, and pull the data into my own app.  From there I can do analytics, aggregation or whatever else I need.

Sensors as a service are not what we need.  Proprietary APIs or vendor-only, closed-source consumer mobile apps are near useless.  What we, as innovators and as a development community, need are simple, low-cost sensors with open (preferably standards-based) SDKs that allow our apps to directly connect to, configure and get data from sensors.  Where appropriate, I’d also like clear documentation of any packet data that gets sent to and from the sensor.  If the vendor provides an app for communicating with the sensors, I want the source for that app, because it’s highly likely it does things I’m going to want to do.  Essentially, I want to take the sensor out of the box and then spend a minimal amount of time writing code to get to something that proves it’s going to work for me.  If I can’t at least read sensor data into a console or test app in under two hours, then the sensor vendor has failed.
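
For contrast, here’s roughly what “out of the box and into a console app” should look like.  This is only a sketch – it assumes a Bluetooth LE temperature sensor, the Python bleak library, and a made-up device address – but the point is that the gateway talks straight to the sensor, with no vendor cloud in the loop.

```python
# Minimal "prove it works" console read, directly from the sensor.
# Assumes a BLE temperature sensor; the address is a placeholder and the
# characteristic UUID is the standard GATT Temperature characteristic.
import asyncio
from bleak import BleakClient

SENSOR_ADDRESS = "AA:BB:CC:DD:EE:FF"                      # hypothetical device address
TEMP_CHAR_UUID = "00002a6e-0000-1000-8000-00805f9b34fb"   # GATT Temperature (0x2A6E)

async def main():
    async with BleakClient(SENSOR_ADDRESS) as client:
        raw = await client.read_gatt_char(TEMP_CHAR_UUID)
        # Per the GATT spec, Temperature is a signed 16-bit value in 0.01 °C units
        temp_c = int.from_bytes(raw[:2], byteorder="little", signed=True) / 100.0
        print(f"Temperature: {temp_c:.2f} °C")

asyncio.run(main())
```

That, plus documentation of the characteristics, is all the “SDK” I really need from a vendor.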

Responsible M2M

About a year ago, maybe two years now, we had a large manufacturing customer that we were working with to implement MTConnect on their production floor. Basically, they had 20 five-axis machine tools making aircraft parts, and they wanted to be able to get data off of those machines and “put it in The Cloud.” Well, first off, I’ve talked before about how much I dislike the term “The Cloud,” so we had to clarify that. It turns out they meant “in a SQL Server database on a local server.”

MTConnect is a machine tool (hence the “MT” part) standard that we leveraged and heavily extended for use in our Solution Family products. Painting it with a broad brush, what it means is that all data from each machine tool – axis positions, part program information, door switches, coolant temperature, run hours, basically the kitchen sink – can be made available through a REST service running either on the machine tool or on a device connected to it.
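
To give a feel for it, here’s a rough sketch of pulling the current state of a machine from an MTConnect agent’s REST interface. The agent address is an assumption, and a real implementation would start with the /probe request to discover which data items the machine actually exposes.

```python
# Rough sketch of reading current machine state from an MTConnect agent.
# The host/port are assumptions; a real agent's /probe request tells you
# which data items are actually available on that machine.
import urllib.request
import xml.etree.ElementTree as ET

AGENT_URL = "http://machine-agent.local:5000/current"   # hypothetical agent address

with urllib.request.urlopen(AGENT_URL) as resp:
    root = ET.fromstring(resp.read())

# MTConnect responses are namespaced XML; strip the namespace so we can
# walk the tree without caring which schema version the agent speaks.
for elem in root.iter():
    tag = elem.tag.split("}")[-1]
    if tag in ("Position", "SpindleSpeed", "Temperature") and elem.text:
        print(tag, elem.get("dataItemId"), elem.text.strip())
```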

They wanted to take that data and put it into SQL Server so their engineering group could run analytics on it. Maybe they wanted to look at part times, energy consumption, tool path length, whatever. They actually weren’t fully sure what they wanted to do with the data; they just knew that “in The Cloud” is where everyone said it should be, so the commandment came down that that’s where the data would go.

Ugh. The conversation went something like this.

“So you want all of the data from each machine tool to go to the server?”

“Yes. Absolutely.”

“You know that there are 6 continually moving axes on those machines, right? And a constantly changing part program.”

“Of course. That’s the data we want.”

“You are aware that that’s *a lot* of data, right?”

“Yes. We want it.”

“You’re sure about this?”

“Yes, we’re sure. Send the data to The Cloud.”

So we set up a mesh of Solution Engines to publish *all* of the data from *all* of the machines to their local server. We turned on the shop floor. And roughly 20 seconds later the network crashed. This was a large, well-built, very fast, hard-wired network. There was a lot of available bandwidth. But we were generating more than a lot of data, and the thing puked, and puked fast.

So what’s the lesson here? That you can always generate more data out at the edge of your system than the infrastructure is capable of carrying. If you’re implementing the system for yourself, trying to transfer all of the data is a problem, but if you’re implementing it for a customer, trying to transfer all of it is irresponsible. We did it in a closed system that was just for test, knowing what the result would be and that it would be non-critical (they simply turned off data broadcasting and everything went back to normal), but we had to show the customer the problem. They simply wouldn’t be told.

We need to do this thing, this M2M, IoT, Intelligent Device Systems or whatever you want to call it responsibly. Responsible M2M means understanding the system. It means using Edge Analytics, or rules running out at the data collection nodes, to do data collection, aggregation and filtering. You cannot push all of the data into remote storage, no matter how badly you or your customer might think it’s what needs to happen.

But that’s fine. Most of the time you don’t need all of the data anyway, and if, somehow, you do, there are still ways you can have your cake and eat it too.

Let’s look at a real-world example. Let’s say we have a fleet of municipal busses. These busses drive around all day long on fixed routes, picking up and dropping off people. These busses are nodes that can collect a lot of data. They have engine controller data on CAN or J1708. They have on-board peripherals like fare boxes, head signs and passenger counters. They have constantly changing positional data coming from GPS and/or dead-reckoning systems. They’re also moving, so they can’t be wired into a network.

Well we could send all of that data to “The Cloud”, or at least try it, but not only would it likely cause network problems, think of the cost. Yes, if you’re AT&T, Verizon or one of the mobile carriers, you’ve just hit pay dirt, but if you’re the municipality the cost would be astronomical. Hello $20 bus fares.

What’s the solution here? Well, first of all, there’s a load of data that we have that’s near useless. The engine temperature, RPMs or oil pressure (or any of the other thousands of data points available from the engine controller) might fluctuate, but generally we don’t care about that data. We care about it only when it’s outside of a “normal” range. So we need Edge Analytics to be able to watch the local data, measure it, and react when certain conditions are met. This means we can’t just use a “dumb” device that grabs data from the controller and forwards it on. Instead we need an Intelligent Device – maybe an Intelligent Gateway (a device with a modem) – that is capable of running logic.

Now, when we’re out of the “normal” range, what do we do? Maybe we want to just store that data locally on the vehicle in a database and download it at the end of the shift when the vehicle returns to the barn. Maybe we want to send just a notification back to the maintenance team to let them know there’s a problem. Maybe we want to immediately send a capture of a specific set of data off to some enterprise storage system for further analysis so the maintenance team can order a repair part or send out a replacement vehicle. It depends on the scenario, and that scenario may need to change dynamically based on conditions or the maintenance team’s desires.
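
To make that a little more concrete, here’s a sketch of the kind of edge rule I’m talking about: watch the values locally, stay quiet while they’re in the normal band, and only store or notify when something goes out of range. The thresholds and the store/notify calls are placeholders for whatever your solution actually uses.

```python
# Edge Analytics sketch: watch engine data locally, react only when a
# reading leaves its "normal" band. The ranges, the local store and the
# notification path are all placeholders.
NORMAL_RANGES = {
    "coolant_temp_f": (140, 230),
    "oil_pressure_psi": (20, 80),
}

local_store = []                 # stand-in for an on-vehicle database

def notify_maintenance(name, value):
    print(f"ALERT: {name} out of range: {value}")   # stand-in for the real transport

def on_reading(name, value):
    lo, hi = NORMAL_RANGES.get(name, (float("-inf"), float("inf")))
    if lo <= value <= hi:
        return                                      # in the normal band: send nothing
    local_store.append((name, value))               # keep the evidence on the vehicle
    notify_maintenance(name, value)                 # and tell someone there's a problem

# Simulated stream from the engine controller
for name, value in [("coolant_temp_f", 180), ("coolant_temp_f", 244), ("oil_pressure_psi", 12)]:
    on_reading(name, value)
```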

Positional data is also ever-changing, but do we need *all* of it? Maybe we can send it periodically and that will provide enough information to meet the data consumers’ needs. Maybe once a minute to update a web service allowing passengers to see where the bus is and how long it will be until it arrives at a particular spot. Or the device could match positional data against a known path and only send data when it’s off-route.
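
Something like this, for example. The route points, the 100-meter corridor and the once-a-minute cadence are all made-up numbers; the shape of the logic is the point.

```python
# Positional data sketch: report at most once a minute, or immediately if
# the vehicle wanders off its known route. Route points and corridor width
# are placeholders.
import math, time

ROUTE = [(47.6097, -122.3331), (47.6150, -122.3400), (47.6205, -122.3493)]
CORRIDOR_M = 100
REPORT_INTERVAL_S = 60
_last_report = 0.0

def distance_m(a, b):
    # crude equirectangular approximation; good enough for a corridor check
    dlat = (a[0] - b[0]) * 111_000
    dlon = (a[1] - b[1]) * 111_000 * math.cos(math.radians(a[0]))
    return math.hypot(dlat, dlon)

def off_route(pos):
    # checks distance to the nearest route *point*; a real implementation
    # would check distance to the route segments
    return min(distance_m(pos, p) for p in ROUTE) > CORRIDOR_M

def maybe_report(pos):
    global _last_report
    now = time.time()
    if off_route(pos) or now - _last_report >= REPORT_INTERVAL_S:
        print("reporting position", pos)   # stand-in for the real publish
        _last_report = now

maybe_report((47.6099, -122.3333))   # on-route: first periodic report goes out
maybe_report((47.6300, -122.3600))   # off-route: reported immediately
```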

And remember, you’re in a moving vehicle with a network that may or may not be available at any given time. So the device has to be able to handle transient connectivity.
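
In practice that usually means some flavor of store-and-forward: queue the readings on the device and drain the queue whenever the network happens to be there. Here’s a sketch, with the connectivity check and the publish call as placeholders.

```python
# Store-and-forward sketch for transient connectivity: queue readings on
# the device and drain the queue whenever the network is available.
# is_connected() and publish() are placeholders for the real transport.
from collections import deque

pending = deque(maxlen=10_000)    # bounded, so a long outage can't eat all storage

def is_connected():
    return False                   # placeholder: check the modem/NIC for real

def publish(reading):
    print("published", reading)    # placeholder: HTTP POST, message queue, etc.

def record(reading):
    pending.append(reading)
    flush()

def flush():
    while pending and is_connected():
        publish(pending[0])        # only drop it from the queue once it's sent
        pending.popleft()

record({"ts": 1378990800, "coolant_temp_f": 244})
```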

The device also needs to be able to effect change itself. For a vehicle, maybe it puts the system into “limp mode” to allow the vehicle to get back to the barn rather than be towed. For a building, maybe it needs to be able to turn on a boiler.

The point here is that when you’re developing your Intelligent Systems you have to do it with thought. I’d say that it’s rare that you can get away with a simple data-forwarding device. You need a device that can:

– Run local Edge Analytics
– Store data locally
– Filter and aggregate data
– Run rules based on the data
– Function with a transient or missing network
– Effect change locally

Intelligent Systems are great, but they still need to be cost-effective and stable. They also should be extensible and maintainable. You owe it to yourself and your customer to do M2M responsibly.

Of course if you want help building a robust Intelligent System, we have both products and services to help you get there and would be happy to help. Just contact us.

Intel’s New Quark Core

This week Intel announced their new processor core named Quark.  It is smaller than their current embedded-focused Atom core (hence the name – smaller than an atom is a quark) and more importantly it uses about 10% of the power of an Atom.  We can probably assume it also produces a lot less heat, so your embedded devices will no longer double as a Panini press.

Intel has, unfortunately, been pretty vague about the Quark, so I’ll have to remain equally vague about some things.  We don’t know exactly when we’ll be able to actually buy processors using the Quark core (Q4 for general availability of evaluation systems?).  We don’t know what price point they are targeting (I’ve seen guesses at the $5 range).  We can be pretty sure that with the market we’re in that the definitive answers will be “soon” and “low.”

So what do we know, then?  Well, it’s x86 (so 32-bit) and probably single core. What?!  I can hear the screams now.  In fact, reading the comments on several other tech sites, there seems to be a lot of furor about how it can’t compete with ARM processors shipping in phones and tablets today and how a 32-bit, single-core architecture is so 1990s and useless in today’s landscape.

I think those people are totally missing the point.  This isn’t a processor for a phone or tablet.  Intel has even said it isn’t.  Quit trying to place it into the devices you think of.  This baby is designed for the Internet of Things (IoT) and M2M, and I think it’s going to be a game changer.

M2M – and I’m just going to call it that for now, instead of IoT or the acronym I saw this week, IoE (Internet of Everything, seriously) – is growing.  It looks like it’s the next wave of “things to do” and, happily, we’ve been doing it for a decade.

Quark is going to enable all of those applications using 16- and 32-bit microcontrollers to run full-blown OSes.  That means they’ll have access to connectivity.  That means they’ll be able to do local analytics and run local rules. It means they’ll be able to push data upstream to clouds.  It means they’ll start participating in overall solutions. That also means they’ll need security, but they’ll have the capacity to implement it.

The core itself is also synthesizable, meaning it’s “open.”  No, it’s not so open that anyone can go in and change the actual processor core, but they can change the fabric, meaning they can build their own SoC with the Quark core directly wired to peripheral components like radios and crypto devices to further reduce cost and footprint.

I’m confident that we’ll have Solution Engine running on a Quark system very soon and it will be interesting to see how it performs compared to the Atom and ARM systems we’re already running on.

What I’d really love to see is someone building a Windows CE OS for it to give us low-latency, real-time capabilities coupled with the familiar Win32 API.  Since it’s still x86, that’s not a big stretch.

Intelligent Devices and the Internet of Things

Over the past couple years, I’ve seen the proliferation of some new terms. The ideas don’t seem new to me – hell, we’ve been doing the tasks these definitions describe for a decade or more and have a full product suite based on the ideas – but evidently some corporate management teams seem to have gotten interested, and so they needed catchy names, with alphabet soup acronyms. Here’s my take on what some of them mean.

Machine to Machine (M2M)

M2M seems to be all the new buzz. Oracle thinks it will be huge (a $25B market by 2015). Intel thinks it will be even bigger (15 billion connected devices by 2015). But really, the concept is pretty simple, and it’s really not new at all. M2M is simply two machines talking to each other. Typically it’s low-powered (resource-wise) things. So it’s not your PC talking to a server, but it might be a sensor talking to an embedded device. Or two embedded devices sharing data.  What’s new is that we’re finally at the cusp of technology and cost where we can actually start fielding widespread M2M capabilities.

Intelligent Devices

Moore’s Law has finally gotten us to where pretty small devices, using small amounts of energy and costing very little, can do a lot of things. Not so long ago a municipal bus couldn’t tell you anything about itself. Then it could tell you about fault codes if you brought it back to the bus barn. Then it could tell you where it was. But now you can put a small appliance right on the vehicle that can tell the driver if a problem requires returning to maintenance. It can even call maintenance and give them diagnostic info or even tell them what part to have ready. Basically, any device out at the “edge” of an overall solution that is capable of doing more than just collecting sensor data is an Intelligent Device. Obviously there’s a pretty wide range of what can be done out there, depending on your solution’s ability to absorb size, power and cost.

Maybe the device does nothing but monitor a temperature and then send a notification when it exceeds a specific set point. That’s intelligence when compared to a device that just sends the temperature to a server where the comparison and notification is generated off-device. It means that the device is capable of using Analytics to make a decision like when the temperature exceeds 205F, send an alert.  It’s capable of using Aggregation to summarize, filter and/or generate new data like the alert condition itself which the temperature probe doesn’t have. It may be able to run Rules such as when the temperature is over 200F, start storing the oil pressure and passenger count.  It may be able to Publish any of that data to another device or to a server somewhere.
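
Those rules are literally a few lines of logic once they’re running on the device itself. Here’s a sketch, with the alert and storage calls as stand-ins for whatever the device actually uses.

```python
# Sketch of the rules described above, running on the device itself.
# send_alert() and store() are stand-ins for the real notification path
# and on-device storage.
capturing = False
captured = []

def send_alert(msg):
    print("ALERT:", msg)               # placeholder for the real notification path

def store(record):
    captured.append(record)            # placeholder for on-device storage

def on_sample(temp_f, oil_psi, passengers):
    global capturing
    if temp_f > 205:                   # Analytics: make the decision locally
        send_alert(f"temperature {temp_f}F exceeds 205F")
    if temp_f > 200:                   # Rule: start capturing extra data
        capturing = True
    if capturing:                      # Aggregation: build data the probe itself doesn't have
        store({"temp_f": temp_f, "oil_psi": oil_psi, "passengers": passengers})

on_sample(198, 42, 17)   # nothing happens
on_sample(202, 41, 17)   # starts storing oil pressure and passenger count
on_sample(208, 39, 18)   # alert fires, capture continues
```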

It’s this intelligence, and really the ability to distribute different parts of this intelligence across the many devices and machines in a solution, that has started changing the landscape of what can be done. Before, we had a machine shop where operators would report part counts at the end of a shift, and a manager would create a spreadsheet to send off to the customer at the end of the week or month. Now the machines themselves can report exact part counts in real time, right to the end customer. That is enabling a lot of cool things.

The Internet of Things (IoT)

The Internet of Things, or IoT, in my mind anyway, is the collection of all of these Intelligent Devices and how they use M2M to actually send useful information (as opposed to raw data) between one another.

Let’s look at a concrete example.

Let’s say I have some sensors in an apartment building that give current temperatures. These report back to an Intelligent Device. This communication may be M2M (or it might just be a thermocouple wire – let’s not get too hung up on it).

That Intelligent Device can record that data. It can run rules on that data to determine if it needs to turn on or off the boiler. It can send a signal to another device to actually power up the boiler – again this might be M2M. The Intelligent Device can do aggregation of data, providing hourly rollups of data and even calculating heating and cooling curves for the building it’s in. It can also publish that data to another device or to the cloud (we’ll talk more about this magical “cloud” word in another post). Yet again, here’s more M2M.
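
The hourly rollup piece might look something like this on the Intelligent Device (the field names and the three hours of fake readings are just for illustration):

```python
# Aggregation sketch: roll per-minute temperature readings up into hourly
# min/avg/max summaries, so only the rollup leaves the building instead of
# every raw sample. Field names and sample data are made up.
from collections import defaultdict
from statistics import mean

def hourly_rollup(readings):
    """readings: iterable of (epoch_seconds, temp_f) tuples."""
    buckets = defaultdict(list)
    for ts, temp in readings:
        buckets[ts - ts % 3600].append(temp)      # truncate timestamp to the hour
    return [
        {"hour": hour, "min": min(v), "avg": round(mean(v), 1), "max": max(v)}
        for hour, v in sorted(buckets.items())
    ]

# Three hours of simulated per-minute readings
readings = [(1379000000 + i * 60, 68 + (i % 7) * 0.5) for i in range(180)]
for row in hourly_rollup(readings):
    print(row)                                    # publish the rollup, not the raw samples
```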

Now let’s say the Property Manager has 10 buildings. They can connect to the cloud and look at a history of the temperatures in an apartment. They can send commands from their office to the Intelligent Device telling it to change a set point, or just override the current rules on the device and turn on the boiler. All of this interaction is an “Internet of Things.”