The Web Real-Time Communication (WebRTC) protocol is a well-known method for streaming video and other data to and from IoT devices like smart video cameras. However, there are certain challenges inherent in using WebRTC. One of the biggest is that occasionally it can be difficult for devices to establish a line of communication, particularly if those devices are behind firewalls.
One of the primary ways we solve that challenge in WebRTC is called network address translation (NAT) traversal.
Understanding NAT traversal
Network address translation (NAT) is a method routers use to remap IP addresses so that multiple devices on a network can share a single public IP address. NAT also hides internal IP addresses, which offers some security benefit, but it complicates direct communication between devices on different networks because those devices don’t have public IP addresses that are directly reachable by each other.
Think of each device as living in an apartment complex with a secure mailbox system. When you start sending messages to others, a unique mailbox number is assigned to you. The system then marks your outbound messages with this number so that the recipient knows where to send a response. If someone sends a message to a mailbox number that hasn’t been assigned, even if it’s addressed to the right complex, it’s discarded. In networking terms, the complex is the public IP address and the mailbox number is the port the NAT assigns.
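To make the mailbox analogy concrete, here is a minimal sketch of a NAT mapping table. The class name SimpleNat, the starting port 40000, and the example addresses are all assumptions made for illustration; real routers track more state and expire mappings over time.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]  # (ip, port)


@dataclass
class SimpleNat:
    """Toy NAT: maps internal (ip, port) pairs to ports on one public IP."""
    public_ip: str
    next_port: int = 40000  # arbitrary starting port, chosen for the example
    outbound: Dict[Address, int] = field(default_factory=dict)  # internal -> public port
    inbound: Dict[int, Address] = field(default_factory=dict)   # public port -> internal

    def send_out(self, internal: Address) -> Address:
        """Return the public address an outbound packet appears to come from."""
        if internal not in self.outbound:
            port = self.next_port
            self.next_port += 1
            self.outbound[internal] = port
            self.inbound[port] = internal
        return (self.public_ip, self.outbound[internal])

    def receive(self, public_port: int) -> Optional[Address]:
        """Return the internal destination, or None when the packet is dropped."""
        return self.inbound.get(public_port)


nat = SimpleNat(public_ip="203.0.113.5")
camera = ("192.168.1.20", 5000)

public = nat.send_out(camera)
print(public)                   # ('203.0.113.5', 40000)
print(nat.receive(public[1]))   # ('192.168.1.20', 5000)
print(nat.receive(41234))       # None: no mailbox was ever assigned
```

The last call is the discarded letter from the analogy: the packet reaches the right public IP, but no mapping exists for that port.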
In stricter setups, such as symmetric NAT, the system assigns a new “mailbox number” for each conversation partner. The number a device learns in one conversation is therefore not the number it will use in the next, so two devices cannot simply exchange addresses and connect. This is why NAT traversal is necessary.
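The difference between a more permissive NAT and a symmetric NAT can be sketched in a few lines. The class names ConeNat and SymmetricNat and the sequential port numbers are assumptions for the example; real symmetric NATs often pick ports less predictably.

```python
from typing import Dict, Tuple

Address = Tuple[str, int]  # (ip, port)


class ConeNat:
    """Reuses one public port per internal socket, whatever the destination."""

    def __init__(self) -> None:
        self.next_port = 40000
        self.table: Dict[Address, int] = {}

    def map(self, internal: Address, dest: Address) -> int:
        if internal not in self.table:
            self.table[internal] = self.next_port
            self.next_port += 1
        return self.table[internal]


class SymmetricNat:
    """Allocates a fresh public port for every (internal socket, destination) pair."""

    def __init__(self) -> None:
        self.next_port = 40000
        self.table: Dict[Tuple[Address, Address], int] = {}

    def map(self, internal: Address, dest: Address) -> int:
        key = (internal, dest)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return self.table[key]


camera = ("192.168.1.20", 5000)
helper = ("198.51.100.1", 3478)   # the server that reports our public port
peer = ("203.0.113.9", 50000)     # the device we actually want to reach

cone, symmetric = ConeNat(), SymmetricNat()
cone_ports = (cone.map(camera, helper), cone.map(camera, peer))
sym_ports = (symmetric.map(camera, helper), symmetric.map(camera, peer))

print(cone_ports)  # (40000, 40000): the port learned from the helper also works for the peer
print(sym_ports)   # (40000, 40001): the port learned from the helper is useless for the peer
```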
NAT traversal encompasses the methods used to enable devices to communicate directly through these barriers. By establishing direct communication between devices in NAT-protected systems, NAT traversal facilitates secure, low-latency exchanges between devices on separate networks.
Before I get to NAT traversal in WebRTC specifically, let’s spend some time talking about one of the main NAT traversal techniques that is used in the wider IoT landscape.
NAT traversal through User Datagram Protocol (UDP) hole punching
User Datagram Protocol (UDP) hole punching is a clever trick that helps two devices on separate networks talk to each other directly, even when both are behind firewalls or NATs. UDP is just a protocol used to exchange the data.
For the sake of simplicity, let’s go back to the apartment complex analogy. You and your friend each live in a different apartment complex and want to send letters to each other directly. However, neither of you can send a note through your secure mailbox until you know each other’s mailbox numbers, and those numbers are only assigned once a conversation starts. The process works in three steps:
- Get your mailbox number (STUN): You both contact a mutual helper (server) that receives your message and returns it to you with your assigned mailbox number written on it. Now, you know your own mailbox number/return address.
- Exchange mailbox numbers (signaling): You each inform the other of your mailbox number through a trusted helper. After this exchange, you’re both ready to send notes directly.
- Send notes directly: Now that each of you knows the other’s mailbox number, you can send notes without going through the helper. These direct notes bypass delays and reduce the need for a middleman.
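The three steps above can be sketched as a small simulation. This is not a real network program: the Nat class models a port-restricted NAT in memory, the helper address stands in for a STUN server, and all names and addresses are assumptions for illustration.

```python
from typing import Dict, Optional, Set, Tuple

Address = Tuple[str, int]  # (ip, port)


class Nat:
    """Toy port-restricted NAT: inbound packets pass only from remotes we sent to."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.next_port = 40000
        self.port_of: Dict[Address, int] = {}        # internal -> public port
        self.owner_of: Dict[int, Address] = {}       # public port -> internal
        self.allowed: Dict[int, Set[Address]] = {}   # public port -> permitted remotes

    def outbound(self, internal: Address, dest: Address) -> Address:
        """Send a packet out; returns the public source address it carries."""
        if internal not in self.port_of:
            self.port_of[internal] = self.next_port
            self.owner_of[self.next_port] = internal
            self.allowed[self.next_port] = set()
            self.next_port += 1
        port = self.port_of[internal]
        self.allowed[port].add(dest)  # sending to dest "punches the hole" for replies
        return (self.public_ip, port)

    def inbound(self, source: Address, public_port: int) -> Optional[Address]:
        """Deliver to the internal host, or return None if the packet is dropped."""
        if source in self.allowed.get(public_port, set()):
            return self.owner_of[public_port]
        return None


helper = ("198.51.100.1", 3478)
nat_a, nat_b = Nat("203.0.113.5"), Nat("203.0.113.9")
host_a, host_b = ("192.168.1.20", 5000), ("10.0.0.7", 6000)

# Step 1: each peer contacts the helper and learns its own public address.
pub_a = nat_a.outbound(host_a, helper)
pub_b = nat_b.outbound(host_b, helper)

# Step 2: signaling (not modeled here) tells A about pub_b and B about pub_a.

# Step 3: both peers send directly to each other's public address.
src = nat_a.outbound(host_a, pub_b)            # A -> B opens A's hole for B
first_try = nat_b.inbound(src, pub_b[1])       # dropped: B has not sent to A yet
src = nat_b.outbound(host_b, pub_a)            # B -> A opens B's hole for A
b_to_a = nat_a.inbound(src, pub_a[1])          # delivered to host_a
a_to_b = nat_b.inbound(pub_a, pub_b[1])        # A's next packet now gets through

print(first_try, b_to_a, a_to_b)
```

The first packet from A is lost, which is expected: its job is to open the hole in A's own NAT so that B's packet can come back through it.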
UDP hole punching is common in gaming, video calls, and IoT, areas in which fast, real-time data exchange is important. It allows two devices to connect without needing a middleman all the time, reducing delays and keeping things smooth.
However, UDP hole punching also comes with some downsides. NATs come in different types, and some are so restrictive that hole punching won’t work through them. Additionally, NATs may time out UDP mappings after a period of inactivity. This means that if data isn’t being sent continuously, the “hole” may close, requiring the connection to be re-established. Lastly, unlike protocols that have mechanisms for ensuring packets arrive in the correct order and retrying if they’re lost, UDP doesn’t guarantee reliable delivery.
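A common way to deal with the timeout problem is to send small keepalive packets during quiet periods. The sketch below only models timestamps, not real sockets, and the 30-second timeout and 15-second keepalive interval are assumed values; actual NAT timeouts vary from device to device.

```python
from typing import List


def mapping_survives(send_times: List[float], nat_timeout: float) -> bool:
    """True if no gap between consecutive packets reaches the NAT's idle timeout."""
    return all(later - earlier < nat_timeout
               for earlier, later in zip(send_times, send_times[1:]))


def with_keepalives(send_times: List[float], interval: float) -> List[float]:
    """Insert keepalive timestamps so that no gap exceeds `interval` seconds."""
    result: List[float] = []
    for earlier, later in zip(send_times, send_times[1:]):
        result.append(earlier)
        t = earlier + interval
        while t < later:          # fill the quiet period with keepalives
            result.append(t)
            t += interval
    if send_times:
        result.append(send_times[-1])
    return result


NAT_TIMEOUT = 30.0         # assumed idle timeout, in seconds
KEEPALIVE_INTERVAL = 15.0  # assumed; must be shorter than the timeout

packets = [0.0, 10.0, 100.0]   # the camera goes quiet for 90 seconds
padded = with_keepalives(packets, KEEPALIVE_INTERVAL)

print(mapping_survives(packets, NAT_TIMEOUT))  # False: the hole closes
print(mapping_survives(padded, NAT_TIMEOUT))   # True: keepalives hold it open
```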
How does WebRTC traverse NAT?
So, now let’s suppose you’re using WebRTC to stream video and you need NAT traversal to allow your video cameras to establish connections. WebRTC is a peer-to-peer protocol, meaning that in an ideal world devices connect with each other directly rather than through any kind of relay server. However, there are times when the NAT is so restrictive that a server needs to forward communication instead of allowing true P2P connections.
WebRTC uses a protocol known as Interactive Connectivity Establishment (ICE) to find a workable path between peers. ICE determines whether the peers can connect directly once each has discovered its public address, or whether a relay server needs to forward all traffic. In the first case, WebRTC relies on a Session Traversal Utilities for NAT (STUN) server for address discovery. In the second, it relies on a Traversal Using Relays around NAT (TURN) server.
Let’s break these down further:
1. STUN server
A STUN server doesn’t transmit data between peers directly; instead, it helps each device (or peer) find out its public IP address and port, which are necessary for establishing a connection behind a NAT or firewall.
In our original analogy, this is how the friends figure out the return address they’ll want to put on their letters in order to receive a response. In the case of STUN, each device asks the STUN server, “What’s my address as seen from outside?” This way, the device learns its public-facing IP and port (the address that the other device can use to reach it).
Each device then shares this public address with the other through a signaling server or process. This signaling server handles the initial exchange of connection information but doesn’t carry ongoing communication between the peers.
Once each peer/device knows the other’s public-facing address, they use ICE to try to establish direct P2P communication for the remainder of the interaction, using UDP hole punching or other NAT traversal techniques.
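For readers who want to see what the “What’s my address?” exchange looks like on the wire, here is a minimal sketch based on the STUN message format (RFC 5389). It builds a Binding Request and decodes the XOR-MAPPED-ADDRESS attribute from a response. To stay runnable without a network, it parses a locally constructed response; fake_response is a hypothetical helper for the demo, not part of any library, and only IPv4 is handled.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request() -> bytes:
    """20-byte STUN header: type, length 0, magic cookie, 96-bit transaction ID."""
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + os.urandom(12)


def parse_xor_mapped_address(message: bytes) -> Optional[Tuple[str, int]]:
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, if present."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and attr_len == 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)   # port is XORed with the cookie's top 16 bits
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_response(ip: str, port: int, txn_id: bytes) -> bytes:
    """Hypothetical demo helper: what a STUN server's success response could look like."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, xport, xaddr)
    return struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE) + txn_id + attr


request = build_binding_request()
response = fake_response("203.0.113.5", 40000, request[8:20])
print(parse_xor_mapped_address(response))  # ('203.0.113.5', 40000)
```

In a real deployment the request would be sent over UDP to a STUN server and the response read from the same socket; in a browser, the WebRTC stack handles this exchange for you.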
2. TURN server
If ICE can’t figure out a way to establish a P2P connection after signaling, TURN servers will act as a fallback mechanism to help devices connect. The devices will route all traffic through a TURN server to ensure reliable, if not optimal, connectivity.
Using TURN does add some latency and bandwidth costs because it routes traffic through the relay server rather than a direct peer-to-peer link, but it keeps a connection possible when NAT or firewall restrictions are too strict for a direct path.
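One way to see why the relay is a last resort is the candidate priority formula from the ICE specification (RFC 8445). The sketch below uses the type preference values recommended there; the default local preference of 65535 and the function name candidate_priority are assumptions for the example, and real ICE agents go on to rank candidate pairs rather than single candidates.

```python
from typing import Dict

# Recommended type preferences from RFC 8445: higher means "try this first".
TYPE_PREFERENCE: Dict[str, int] = {
    "host": 126,   # the device's own local address
    "prflx": 110,  # peer reflexive, discovered during connectivity checks
    "srflx": 100,  # server reflexive, the public address learned via STUN
    "relay": 0,    # an address allocated on a TURN server
}


def candidate_priority(cand_type: str,
                       local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


candidates = ["relay", "srflx", "host"]
ordered = sorted(candidates, key=candidate_priority, reverse=True)

for cand in ordered:
    print(cand, candidate_priority(cand))
# host 2130706431
# srflx 1694498815
# relay 16777215
```

Direct paths sort ahead of the TURN relay, which matches the fallback behavior described above.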
Through these methods, WebRTC can transmit real-time or near real-time video data even across restrictive NATs and firewalls.
Final thoughts
By combining STUN, TURN, and ICE, WebRTC efficiently enables peer-to-peer communication, even in networks with complex NAT configurations. These methods contribute to the real-time, low-latency, and secure communication that WebRTC is known for, making it ideal for video, voice, and data-sharing applications in both consumer and enterprise IoT contexts.
