A Tale of Two Protocol Stacks: Our Picks for Optimal IoT Control

This is my third post in an ongoing series aimed at helping you navigate through the deluge of IoT information and identify which technology choices will have the biggest positive impact on your upcoming product plans.

In the first post, we organized the many IoT protocols. We then talked about why having a gateway is critical. Now we can finally answer the question we started with: which protocol stack (a set of interrelated protocols) is the best choice for today’s control systems?

There are two answers to the question because there are two sides of the network. In the picture below, the left side shows the part of the network connected to the Cloud. This is also the “User Side,” meaning this is how the user communicates with the network.

The right side of the picture shows the “Device Side” — the part of the control network that contains the “things”: the sensors, lights, and other devices that can be controlled or can report status.


We also call the “Cloud Side” (AKA “User Side”) the “Protocol Triangle.”

The Protocol Triangle is hard to get right in a system because the technologies used to develop the three pieces are very different in terms of what they do, how they work, and what type of skills are required to create them.

Cloud applications run on the internet and use various application protocols (often HTTP based), on top of the IP network protocols, on top of various physical layers. While the technologies used in the cloud turn over almost completely every five years or so, they all use IP for the network.

User interfaces are most often Apple mobile devices, Android mobile devices, Microsoft mobile devices, or web applications that run on many platforms. These all use IP for their network protocol, although the application protocol is a bit of a Wild West.

Embedded devices use many different types of protocols, but all the relevant embedded gateways support some type of application layer, over an IP routing layer, over Wi-Fi or Ethernet physical layers.

Seeing any patterns? The routing layer is clear: IP.

We don’t need to pick between Ethernet and Wi-Fi for the physical layer because nearly every business and home has a network router that can bridge the two (meaning devices using one physical layer can communicate with devices using the other by sending messages through the home or business router).

The application layer is not quite as clear at first, but let me explain one very important fact: HTTP traffic is allowed through almost all firewalls.

If the application layer is HTTP, then no special configuration needs to be done to allow messages to flow from the gateway, to the internet, to the mobile application, and back.

HTTP isn’t quite a complete application layer, so we use HTTP in a RESTful manner (which simply means we use the standard HTTP methods, such as GET, PUT, POST, and DELETE, in the way the REST architectural style prescribes) and add JSON as the payload of our HTTP messages to carry any extra data needed.
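To make that concrete, here is a minimal sketch in Python of what such a message could look like. The gateway address, the /devices/<id>/state path, and the JSON field names "on" and "level" are all invented for illustration; they are not taken from any particular product's API. The request is only built, not sent.

```python
import json
import urllib.request


def build_light_command(base_url, device_id, on, level):
    """Build (but do not send) a RESTful HTTP request with a JSON payload.

    The URL layout and the field names "on" and "level" are hypothetical.
    """
    payload = json.dumps({"on": on, "level": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/devices/{device_id}/state",
        data=payload,
        method="PUT",  # standard HTTP method: replace the resource's state
        headers={"Content-Type": "application/json"},
    )


request = build_light_command("http://gateway.example", "kitchen-light", True, 80)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The HTTP method says what to do, the URL says which device it applies to, and the JSON body carries the extra data.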

Ok, I know. This may have seemed obvious. In fact this may have seemed obvious from my very first post. But remember, this is only the answer to one side of the network.

The biggest mistake in choosing protocols, one I have seen made time and again, is the push to find just one answer. Using HTTP/IP/Wi-Fi+Ethernet for our protocol stack is so obvious on the User Side of the network that this answer is often forced onto the other side (the Device Side) of the network. This is a bad idea for many reasons that I won’t get into here (but I feel another blog post forming).

So, our answer for the Cloud/User Side of the system (AKA, The Protocol Triangle) is:

  • Application Layer: HTTP in a RESTful manner with JSON payload
  • Routing Layer: IP
  • Physical Layer: Wi-Fi or Ethernet or Cellular


On the device side of the network we have different requirements. We still need interoperability, but now between small, resource-constrained devices, some of which are battery powered and some of which are powered by energy harvesting.

So, we need an efficient and mature application protocol, plus a routing layer that supports mesh routing and battery powered devices. Why are battery powered devices a concern? These devices spend most of their time with the radio off (known as “sleeping”) so that you don’t have to change or charge the batteries every day.

Let’s look at some practical examples of the differences between the left and right side.

A typical IP message on the left side, containing an HTTP REST command with a JSON payload, might be 500–1500 bytes (or more). Messages exchanged on the right side are limited to 127 bytes per frame, headers included (for ZigBee networks, which run over 802.15.4). A home Wi-Fi router (left side) is plugged in and working 24/7, whereas a battery powered occupancy sensor (right side) sends only 2–6 messages a day, depending on activity, and has its radio on for one second or less per message.
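A rough back-of-the-envelope sketch in Python shows the gap. The HTTP request below is deliberately stripped down (invented host, path, and field name, and none of the extra headers real clients add), and the ZigBee side counts only the three-byte ZCL application header of an On/Off "On" command, not the network and 802.15.4 headers that wrap it. Treat the numbers as illustrative rather than measured.

```python
import json
import struct

# Left side: a bare-bones HTTP request with a JSON body.
# The path, host, and field name are made up for this example.
body = json.dumps({"on": True}).encode("utf-8")
request_lines = [
    b"PUT /devices/kitchen-light/state HTTP/1.1",
    b"Host: gateway.example",
    b"Content-Type: application/json",
    b"Content-Length: %d" % len(body),
]
http_message = b"\r\n".join(request_lines) + b"\r\n\r\n" + body

# Right side: a ZCL "On" command for the On/Off cluster.
# Three header bytes: frame control, sequence number, command ID.
# The ZigBee network and 802.15.4 headers that wrap it are not counted here.
zcl_frame = struct.pack("<BBB", 0x01, 0x2A, 0x01)


def radio_duty_cycle(messages_per_day, seconds_on_per_message):
    """Fraction of a day the radio is powered on."""
    return messages_per_day * seconds_on_per_message / 86_400


print(len(http_message), "bytes of HTTP vs", len(zcl_frame), "bytes of ZCL")
print(f"radio on {radio_duty_cycle(6, 1.0):.4%} of the day")
```

Even this minimal HTTP request is already more than a hundred bytes before any TCP/IP overhead, and a sensor sending six one-second messages a day has its radio on for less than a hundredth of a percent of the time.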

The choice on this side seems much more difficult — Bluetooth? ZigBee? Thread? Low Power Wi-Fi? However, when you look closer, this choice for the right side is almost as obvious as the choice for the left side.

The 802.15.4 protocol is very good at managing battery powered devices. I might say that it is “the best,” but I don’t want to start any arguments. While Bluetooth is present in everyone’s phone, the Mesh part of Bluetooth is still very new and the application layer is not ready for all the device types that are needed. Thread doesn’t yet have a mature application layer (but is getting closer as they integrate dotdot) and Low Power Wi-Fi can’t deal with battery powered devices. The ZigBee application layer ZCL is very mature and supports the correct mix of devices we want.

So, our answer for the Device Side of the system is:

  • Application Layer: ZCL (ZigBee Cluster Library)
  • Routing Layer: ZigBee
  • Physical Layer: 802.15.4


Now we have our answer. Er, I mean “answers.” Except, now how do the left side and the right side talk when their application layers are different?

The gateway in the middle translates events and messages from one side of the network into events and messages on the other side, and vice versa. The gateway we are talking about is an “Application Level Gateway (ALG).” Some available gateways translate from one physical layer and routing layer to another but do not interpret the application payload (these are sometimes referred to as bridges). Although a bridge may seem more elegant than an ALG, the two sides are just too different for a bridge to work, and that is why we use an ALG.
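As a sketch of what "interpreting the application payload" means, the Python below turns one REST request into a ZCL command. This is only an illustration under stated assumptions, not how any particular gateway is implemented: the /devices/<name>/state path, the "on" field, and the device table (name to ZigBee short address and endpoint) are hypothetical. The On/Off cluster ID (0x0006) and its Off/On command IDs (0x00/0x01) come from the ZigBee Cluster Library.

```python
import json
import struct

# ZCL On/Off cluster identifiers from the ZigBee Cluster Library.
ON_OFF_CLUSTER = 0x0006
CMD_OFF, CMD_ON = 0x00, 0x01

# Hypothetical lookup table: REST device name -> (ZigBee short address, endpoint).
DEVICE_TABLE = {"kitchen-light": (0x4F2A, 1)}


def translate_rest_to_zcl(method, path, json_body, sequence_number):
    """Translate one REST request into a ZCL command for the device side.

    Returns (short_address, endpoint, cluster_id, zcl_frame_bytes).
    Only PUT /devices/<name>/state with {"on": bool} is handled in this sketch.
    """
    parts = path.strip("/").split("/")
    if (method != "PUT" or len(parts) != 3
            or parts[0] != "devices" or parts[2] != "state"):
        raise ValueError(f"unsupported request: {method} {path}")
    short_address, endpoint = DEVICE_TABLE[parts[1]]
    command = CMD_ON if json.loads(json_body)["on"] else CMD_OFF
    frame_control = 0x01  # cluster-specific command, client to server
    zcl_frame = struct.pack("<BBB", frame_control, sequence_number & 0xFF, command)
    return short_address, endpoint, ON_OFF_CLUSTER, zcl_frame


print(translate_rest_to_zcl("PUT", "/devices/kitchen-light/state", '{"on": true}', 7))
```

A bridge would forward the bytes untouched. The ALG has to understand both the JSON on one side and the cluster and command IDs on the other, which is exactly why it can connect two such different stacks.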

The protocols used in an IoT control system are extremely important. They are like the foundation of a building. Without the correct protocols, the product would not be able to support the necessary features, at the necessary scale, with the necessary response time.

But how do you really make the product catch on? Now that we have the correct protocols, how do we make it a product that customers and users will love? One way is to create a simple and easy-to-use API. A great example of this is Amazon Alexa. If you went to CES 2017 you probably saw Alexa at about every other booth. Alexa was everywhere. People love the product. The product works with lots of other products. Why?

In my next post I will talk about why Amazon Alexa was the belle of the ball at CES 2017.