What to Know About WebRTC vs. RTSP

The number of smart video cameras collecting and streaming video around the world keeps growing. Many of those cameras are used for security; in fact, the global video surveillance market is expected to reach $83 billion in the next five years. But there are plenty of other use cases besides security, including remote work, online education, and digital entertainment.

Among the various technologies powering those use cases, Web Real-Time Communication (WebRTC) and Real-Time Streaming Protocol (RTSP) stand out as two top options. Here’s what you need to know about WebRTC vs. RTSP and their suitability for various streaming needs.

The basics of WebRTC

Let’s start with WebRTC, a set of open standards (browser APIs plus the underlying network protocols) that allows real-time streaming of audio and video directly in web browsers. Google originally developed WebRTC, but it’s now an open source project with wide support and thorough documentation.

When you make a video call through a browser, WebRTC handles the transmission of your video and audio data to the person you’re calling, and vice versa. So you don’t need to download specialized communication software such as Skype; you can just chat through the browser with something like Google Meet.

WebRTC features

WebRTC has a few features that set it apart. For one, it adjusts the quality of the call based on your available bandwidth. Your video might get fuzzy if your connection is slow, but you typically won’t have to worry about losing the call. WebRTC also encrypts both incoming and outgoing data streams, which keeps video streams private and secure. Lastly, it provides a data channel for sending files or text chat, so it’s not limited to audio and video.

Perhaps the most important aspect of WebRTC is that it’s peer-to-peer (P2P), which means the media doesn’t have to travel through a server. Traveling directly from A to B enables higher performance and lower latency, or as close to “real time” as is possible on the Internet. P2P communication is like delivering a letter to a friend who lives only two hours away. You could send the letter via a post office, but then there’s always the risk of it getting delayed at various points.

If instead you hand the letter directly to your friend, you can ensure it will arrive in just two hours, instead of in a couple of weeks. So P2P communication is more direct and generally faster than server-based communication.

How WebRTC works

P2P communication via WebRTC involves a few technical steps. The first one is signaling. Think of WebRTC signaling as the process of arranging a meeting. Before two people meet, they need to exchange information like the meeting time, location, and agenda. Similarly, in WebRTC, signaling is the initial arrangement phase in which two devices exchange necessary information to establish a real-time communication session.
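
The exchange can be sketched in a few lines of Python. This is only an illustration: WebRTC does not mandate a particular signaling transport, so the in-memory `SignalingChannel` class, the peer names, and the placeholder session descriptions below are all hypothetical stand-ins for whatever signaling service an application actually uses.

```python
import json
from collections import defaultdict, deque


class SignalingChannel:
    """Hypothetical in-memory stand-in for a signaling server."""

    def __init__(self):
        self._inboxes = defaultdict(deque)

    def send(self, to, message):
        # Messages are serialized, as they would be on a real network link.
        self._inboxes[to].append(json.dumps(message))

    def receive(self, peer):
        return json.loads(self._inboxes[peer].popleft())


channel = SignalingChannel()

# 1. The caller describes the session it wants (placeholder description text).
channel.send("bob", {"type": "offer", "from": "alice", "sdp": "<alice's session description>"})

# 2. The callee reads the offer and replies with its own description.
offer = channel.receive("bob")
channel.send(offer["from"], {"type": "answer", "from": "bob", "sdp": "<bob's session description>"})

# 3. The caller reads the answer; both sides now know each other's parameters.
answer = channel.receive("alice")
print(offer["type"], "->", answer["type"])
```

In a real application the messages would carry actual session descriptions and network candidates, and the channel would be a network service rather than a Python object.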

The next step is media capture, which allows your browser to access your device’s camera and microphone to collect streaming data like video and audio.

Next is Network Address Translation (NAT) traversal. NAT is a method that routers use to translate private IP addresses within a local network to a single public IP address for internet access. It’s like a single mailing address used for all devices in a house. NAT traversal, on the other hand, is a technique that allows devices behind different NATs to establish direct peer-to-peer connections. WebRTC handles this with the ICE framework, which uses STUN servers to discover each device’s public address and falls back to a TURN relay when no direct path can be found. This is akin to arranging a direct line of communication between two houses, each with its own unique mailing system, enabling them to bypass the standard mail route and connect directly.

Once the call goes through and the browsers establish a connection via NAT traversal, the next step is streaming the data. Throughout the call, WebRTC keeps the connection stable, and at the end of the session it lets the peers securely close the connection, the equivalent of hanging up your phone at the end of a conversation.
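
To make the mailing-address analogy concrete, here is a toy model of the translation table a NAT router keeps. It is a simplified sketch, not how any particular router is implemented: the `Nat` class is hypothetical, and the addresses and port numbers are made-up examples.

```python
class Nat:
    """Toy model of a NAT router: many private endpoints share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Return the public (ip, port) used for a private endpoint."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port):
        """Return the private endpoint for a public port, or None if unmapped."""
        return self._inbound.get(public_port)


nat = Nat("203.0.113.5")
camera = nat.translate_outbound("192.168.1.20", 5000)
laptop = nat.translate_outbound("192.168.1.21", 5000)
print(camera)                        # ('203.0.113.5', 40000)
print(laptop)                        # ('203.0.113.5', 40001)
print(nat.translate_inbound(40001))  # ('192.168.1.21', 5000)
print(nat.translate_inbound(40002))  # None
```

The last line shows the problem NAT traversal has to solve: traffic arriving on a port with no existing mapping has nowhere to go, so each peer first needs to learn its own public mapping and share it with the other side during signaling.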

When to use WebRTC

One of the top benefits of WebRTC is that it works across multiple platforms and browsers. So if you use Chrome, for example, but your friend is using Edge, you can still stream video. It’s also easy to access. Unlike with Skype, you don’t have to download a separate app; you can just open a link right in your browser.

Situations in which you might use WebRTC include streaming events such as concerts, sports, and interactive webinars; sharing sensitive files or data between browsers; streaming video footage from a smart camera to a browser; and real-time multiplayer gaming, among many others.

Understanding RTSP

The Real-Time Streaming Protocol (RTSP) is not exactly a video streaming protocol like WebRTC. Instead, it’s a network control protocol. In other words, you use it to send video playback commands such as play and pause, just as you would use a handheld remote for a streaming device.

So, unlike WebRTC, RTSP is just for establishing and controlling the media stream rather than being the actual vehicle that delivers it. It starts a streaming session and then allows clients to remotely control the feed. For example, in a smart surveillance system, RTSP lets you start and stop the video feed from a security camera in real time, so your commands almost instantly reach the device you’re trying to control.
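
RTSP commands are plain-text requests that look a lot like HTTP. The sketch below formats a few of them; the `build_rtsp_request` helper, the camera URL, and the session ID are hypothetical, and a real client would first negotiate the session with the server and then send these strings over a network connection.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format an RTSP/1.0 request as the text a client sends to the server."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


url = "rtsp://camera.example/stream1"  # hypothetical camera URL
session = {"Session": "12345678"}      # placeholder; the server issues the real ID

play = build_rtsp_request("PLAY", url, cseq=4, headers=session)
pause = build_rtsp_request("PAUSE", url, cseq=5, headers=session)
teardown = build_rtsp_request("TEARDOWN", url, cseq=6, headers=session)
print(play)
```

Note that none of these messages contains any video: they only tell the server what to do with the stream.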

RTSP features

RTSP is not a P2P protocol by nature, though it can be used in that context in certain cases. Generally, RTSP sends commands to a server that hosts and streams the media content, so the server does most of the work while RTSP merely carries the commands. The server is not necessarily a cloud server; it can be a “logical” server (as in the client-server paradigm), so the RTSP server can run on an IP camera on a private network. In every case, RTSP only controls playback and the start or stop of a stream, not the actual delivery of the media.

So for actual media streaming, you need to pair RTSP with other protocols. The most common is the Real-Time Transport Protocol (RTP). RTP is what actually streams and delivers audio and video data.
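
Every RTP packet begins with a 12-byte fixed header that tells the receiver how to order and time the media that follows. As a rough illustration, the sketch below decodes that header with Python’s `struct` module; the `parse_rtp_header` function name and the sample values are invented for this example.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550) from raw bytes."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the fixed header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence": sequence,    # lets the receiver detect loss and reordering
        "timestamp": timestamp,  # lets the receiver schedule playback
        "ssrc": ssrc,            # identifies the stream's source
    }


# Build a sample packet: version 2, payload type 96 (a dynamic type), made-up IDs.
sample = struct.pack("!BBHII", 0x80, 96, 1000, 90000, 0x1234ABCD) + b"video-bytes"
header = parse_rtp_header(sample)
print(header["version"], header["payload_type"], header["sequence"])
```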

In addition to RTP, RTSP often pairs with the Transmission Control Protocol (TCP), which allows RTSP to transfer commands over the Internet. TCP focuses on reliability and retransmits any lost or corrupted packets, so it’s used in conjunction with RTSP in situations in which reliability matters most. If the video streaming connection needs to be established through a firewall, a developer can also perform TCP tunneling through a service like Nabto to establish a P2P-based tunnel without firewall hassles.

How RTSP with TCP works

Basically, RTSP requires some extra setup compared to WebRTC so the stream can get through firewalls. Developers can either configure the firewalls themselves to allow the RTSP stream or use TCP tunneling to solve the problem.

TCP tunneling is a technique that allows a video system to get through firewalls. With tunneling, the firewall does not see the TCP traffic itself. Instead, a TCP tunneling service like Nabto carries the data through the firewall as UDP packets and “translates” those packets to and from TCP at each end of the tunnel, so the applications on each side (the client and the device or server) still see an ordinary TCP connection.

So in many video scenarios, RTSP/TCP is the mode that makes the most sense, despite the performance hit of using a reliable transport.

When to use RTSP

A lot of older surveillance camera designs have built-in RTSP servers in their software stack for natively handling the camera video feed. If you are integrating such a camera into your system, you would normally use RTSP + TCP tunneling. On the other hand, if you have a newer camera software stack with support for WebRTC, you would probably use that. But it also depends on what your backend and middleware support. RTSP is useful for systems in which users want to control video playback from a remote location, for example, with home security or streaming from drones.

Comparing WebRTC vs. RTSP for IoT

I’ve already talked a bit about some of the differences between these two protocols, but let’s sum them up more simply:

Feature	WebRTC	RTSP
Protocol Type	Media streaming (delivers audio, video, and data)	Stream control (media is delivered by RTP)
Communication	Browser-to-browser or browser-to-camera	Client-server model
Application Requirement	No specialized app or plugin required, but supported by many apps	Media player app required
Latency	Low latency	May have higher latency
Reliability	Favors real-time delivery over guaranteed delivery	Greater reliability when run over TCP
Firewall Traversal	NAT traversal and other built-in mechanisms	TCP tunnel or additional configuration required

There are several other differences as well.

Scalability and flexibility

Scaling a WebRTC session to more users can be complicated for developers because, in a pure P2P setup, each participant needs to maintain a separate connection with every other participant rather than a single shared connection. The resulting number of connections consumes a lot of network bandwidth and can affect the quality of the stream. But the protocol is highly flexible in terms of which browsers, network configurations, and devices it can work with. RTSP, on the other hand, is not as flexible as WebRTC, since it requires a server and an application of some kind to function properly.
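
A quick calculation shows how fast the WebRTC connection count grows when every participant connects to every other one. The function names below are just for illustration.

```python
def mesh_connections(participants):
    """Total peer connections when every participant links to every other."""
    return participants * (participants - 1) // 2


def streams_per_participant(participants):
    """Streams each participant must send (and also receive) in a full mesh."""
    return max(participants - 1, 0)


for n in (2, 4, 10):
    print(n, mesh_connections(n), streams_per_participant(n))
# 2 1 1
# 4 6 3
# 10 45 9
```

Going from 4 to 10 participants multiplies the total number of connections by more than seven, and each device has to upload its video nine times instead of three.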

Data security

WebRTC solutions can be made highly secure, as all communication is encrypted by default.

RTSP and RTP don’t have security built in, though developers can add encryption and other security measures to them, such as Secure RTP (SRTP) for the media or running RTSP over TLS.

Use cases

WebRTC started out exclusively for browser-to-browser communication, so it initially wasn’t ideal for situations in which you want to, say, control a video camera from your smartphone or view the feed through an app. But now WebRTC also works with IoT devices and native mobile apps such as Android apps, as well as with IoT connectivity software like Nabto. Meanwhile, RTSP and RTP don’t have the security features or low latency of WebRTC, but their security can be enhanced by using them with TCP tunneling such as that provided by Nabto.

As a result of all of its specialized features, WebRTC is mostly used in IoT situations for two-way communication, like telemedicine meetings, remote work, and other video conferencing scenarios, and now also for mobile-based video surveillance controls. By contrast, RTSP/RTP is primarily used in security cameras and broadcasting from one source to multiple devices.

But neither protocol is ideally suited on its own for many IoT use cases in which users want to set up direct P2P communication between a client app, like a smartphone application, and an IoT device. In this case, you would be better served by using Nabto Edge in conjunction with either RTSP or WebRTC. Nabto Edge provides low-latency communication for IoT devices even through firewalls and has been commonly used in video surveillance applications in which latency is a major concern.

Final thoughts

The choice between WebRTC vs. RTSP is a complicated subject, and there are many different factors that may affect which protocol you choose to use. But ultimately both are important parts of the IoT ecosystem, particularly in video streaming.

Contact Nabto to learn more about secure and scalable video streaming options.

Read our other resources:

We’ve also published a range of IoT resources for our community, including:

Our blog post that covers IoT and the future of video surveillance
Our guide to IoT protocols for developers in 2023
Our P2P explainer which covers the benefits of P2P software for IoT devices
