What to Know About ONVIF vs. RTSP for Smart Cameras

If you’re shopping for surveillance cameras for your IoT security system, you may have noticed some indecipherable acronyms in the descriptions. The most common is likely ONVIF, short for the Open Network Video Interface Forum standard. Another is the Real-Time Streaming Protocol (RTSP). Both RTSP and ONVIF are extremely important to the world of smart surveillance cameras. Here’s all you need to know about ONVIF vs. RTSP and why the distinction matters.


ONVIF vs. RTSP

The truth is, comparing ONVIF and RTSP is a bit incongruous, since they serve different purposes. ONVIF is a standard, while RTSP is a protocol. Unfortunately, there’s still a lot of confusion about the difference between a standard and a protocol, and I think it’s high time to address that confusion so you know what you’re looking at when you see those descriptions on security cameras.

Sometimes, the words “standard” and “protocol” are used interchangeably within IoT, but technically, there’s a difference.

Standard: A standard in IoT refers to a set of guidelines or specifications that have been formally established and approved by recognized standardization bodies or industry groups. Standards ensure compatibility and interoperability between different devices and systems. Examples of IoT standards include the Zigbee and Z-Wave standards for home automation, as well as ONVIF.
Protocol: A protocol in IoT is a specific set of rules for data exchange over a network. Protocols govern the communication between IoT devices and help in defining how messages are formatted, transmitted, and processed to achieve precise and expected outcomes. Common IoT protocols include Message Queuing Telemetry Transport (MQTT), the Constrained Application Protocol (CoAP), and RTSP.

ONVIF provides a set of guidelines so manufacturers can make cameras that will be compatible with devices and software from other manufacturers and brands. That way, when you’re building your IoT surveillance system, you can add new cameras without worrying about whether they’ll be able to connect to the other parts of your system. RTSP, on the other hand, carries the commands that let a user play, pause, or otherwise control the media stream coming from a smart camera.

A camera may be compatible with both ONVIF and RTSP, or you may need some extra configuration to add RTSP functionality. There are situations in which ONVIF might be more useful in a smart camera than RTSP, and vice versa. More on that later.

What is ONVIF?

OK, now let’s dive deeper into exactly what ONVIF is. Originally, the Open Network Video Interface Forum wasn’t a standard but rather a consortium of surveillance and camera industry stakeholders, including manufacturers, developers, and consultants. Founded in 2008 by Axis Communications, Bosch Security Systems, and Sony Corporation, the forum wanted to make sure that devices from different manufacturers are compatible. They came up with the resulting ONVIF standard, which lays out the rules for how different features in IP-based (Internet Protocol-based) physical security products such as surveillance cameras function together.

The ONVIF standard has defined several “profiles,” meaning standardized collections of features in video cameras. Each profile is specific to a certain functionality or characteristic in those cameras. For instance, some profiles describe how video streaming must work in order to meet the standard, while others might address access control or alarm management. Each profile also provides a standard for how the physical camera and the software you use to control or interact with the camera communicate with each other.

So, when a camera on Amazon or any other retailer mentions ONVIF, that description typically means that the camera complies with at least one of the ONVIF profiles. However, just saying a camera comes with ONVIF compatibility does not automatically mean that the camera conforms to all profiles, because each profile covers different aspects of device performance and features. You’ll have to do more research to make sure the camera is compatible with all the other systems you need it to work with. Here are the main ONVIF profiles for surveillance cameras:

Profile S (Streaming): This profile provides specifications for video streaming, configuration, and control of PTZ (pan-tilt-zoom) commands, as well as audio streaming. Profile S is supported by most network video devices.
Profile G (Recording): Profile G addresses storage, search, retrieval, and playback of recordings. The profile also complements Profile S by ensuring that devices and clients that support recording and storage are interoperable.
Profile C (Access Control): Profile C is primarily focused on IoT security systems outside of cameras. In this case, access control refers to IoT systems that handle the control of physical access to a particular area; in other words, smart locks. However, Profile C can integrate with IP-based video devices. Certain access events, like a door being forced open or access being denied, can trigger cameras to start recording, capturing potentially crucial security footage related to the incident.
Profile A (Advanced Access Control): This profile extends Profile C and introduces extra capabilities such as scheduling for smart locks so you can plan to enable access at particular times of day or based on certain events.
Profile T (Advanced Video Streaming): This more recent profile accommodates newer video streaming formats, including support for H.265, which allows for high-quality and efficient video compression. Profile T also enhances motion detection, metadata, and other analytics functions.
Profile M (Metadata and Analytics): Profile M focuses on metadata and analytics for smart applications. This profile facilitates interoperability with cloud-based services and software that seek to use data from video and audio streams for analytics purposes.

So, for example, if you’re looking for a camera to work with smart locks and motion sensors, you will want to look for a camera that supports Profile C compatibility with the ONVIF standard. If you’re looking for a camera that can take very high quality video, look for Profile T compatibility. If you need facial recognition and analysis capabilities, look for Profile M, and so on.
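As a rough illustration of that selection step, here is a minimal Python sketch that checks a camera’s advertised profiles against the ones you need. The profile summaries are condensed from the list above; the function name `missing_profiles` and the sample camera are hypothetical and only there for illustration.

```python
# Profile letters and their focus, condensed from the profile list above.
ONVIF_PROFILES = {
    "S": "video streaming, PTZ control, audio streaming",
    "G": "recording storage, search, retrieval, playback",
    "C": "access control events (doors, smart locks)",
    "A": "advanced access control, such as schedules",
    "T": "advanced video streaming, including H.265",
    "M": "metadata and analytics",
}


def missing_profiles(required, advertised):
    """Return the required profile letters the camera does not advertise, sorted."""
    required_set = {p.upper() for p in required}
    advertised_set = {p.upper() for p in advertised}
    unknown = required_set - set(ONVIF_PROFILES)
    if unknown:
        raise ValueError(f"unknown profile(s): {sorted(unknown)}")
    return sorted(required_set - advertised_set)


# Hypothetical camera whose spec sheet lists Profiles S and T.
camera_profiles = ["S", "T"]

for letter in missing_profiles(["T", "M"], camera_profiles):
    print(f"Missing Profile {letter}: {ONVIF_PROFILES[letter]}")
```

In this made-up case the camera covers Profile T but not Profile M, so it would not be a fit for an analytics-heavy setup.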

What is RTSP?

On to RTSP, the Real Time Streaming Protocol, which defines how a client controls a media stream. RTSP doesn’t handle the actual data delivery. Instead, it manages media sessions by communicating control commands (like play, pause, and stop) between the client, which is typically the viewing app, recorder, or video management software, and the server, which in a surveillance setup is usually the camera itself.

Here’s a more thorough list of the requests and commands used in RTSP:

OPTIONS: This request queries the server about which methods the server supports. The server responds with the list of methods it understands. This is often the first request sent to discover the capabilities of the server.
DESCRIBE: The client sends a DESCRIBE request to obtain a description of the media being presented on the server. This description is typically in the Session Description Protocol (SDP) format, which includes details like the media format, protocol, transport address, and other media metadata. This information helps configure the client to receive the streamed media.
SETUP: Once the client has the necessary parameters from the DESCRIBE response, it sends a SETUP request to establish a session and prepare for actual data transmission. This request specifies the transport protocol and initializes the necessary communication channels. The server responds with a session identifier that the server and client use in subsequent messages to reference the session.
PLAY: After the session is set up, the client sends a PLAY request to start the streaming of media. The server then begins to send data streams (video and/or audio) to the client. The PLAY request can also resume playback after a PAUSE or start playback from a specific point in the media stream, specified by a timestamp.
PAUSE: This request temporarily halts the media streaming without actually ending the session, allowing the client to resume playback from the same point later.
TEARDOWN: The client sends this request when it’s time to end the session. TEARDOWN closes the session, and the server stops sending the media streams.
RECORD: The RECORD request is applicable in scenarios in which the client instructs the server to record the media being streamed.
ANNOUNCE: The server uses ANNOUNCE to send information about the media to the client, such as event start and end notifications or metadata updates, without a prior request from the client.
GET_PARAMETER and SET_PARAMETER: These requests query and set parameters on the server or the media session. The requests can get specific information from the server or adjust server settings dynamically during a session.
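
To make the request flow above more concrete, here is a minimal Python sketch that formats RTSP/1.0 requests as text and reads the session identifier out of a SETUP reply. It does not open a network connection. The server address, the `trackID=0` control path, the port numbers, and the sample reply are all made-up values for illustration; a real client would take the track URL from the SDP returned by DESCRIBE.

```python
def build_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request as text, terminated by a blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response(text):
    """Split an RTSP response into (status_code, headers, body)."""
    head, _, body = text.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# Hypothetical RTSP server address (192.0.2.0/24 is reserved for documentation).
URL = "rtsp://192.0.2.10:554/stream1"

# Made-up SETUP reply, shaped like a typical server answer.
setup_reply = (
    "RTSP/1.0 200 OK\r\n"
    "CSeq: 3\r\n"
    "Session: 12345678;timeout=60\r\n"
    "Transport: RTP/AVP;unicast;client_port=5000-5001;server_port=6000-6001\r\n"
    "\r\n"
)
status, reply_headers, _ = parse_response(setup_reply)
# The Session header may carry parameters after a semicolon; keep only the id.
session_id = reply_headers["session"].split(";")[0]

# The usual order of requests for one viewing session.
requests = [
    build_request("OPTIONS", URL, 1),
    build_request("DESCRIBE", URL, 2, {"Accept": "application/sdp"}),
    build_request("SETUP", URL + "/trackID=0", 3,
                  {"Transport": "RTP/AVP;unicast;client_port=5000-5001"}),
    build_request("PLAY", URL, 4, {"Session": session_id, "Range": "npt=0.000-"}),
    build_request("TEARDOWN", URL, 5, {"Session": session_id}),
]
print(requests[3])
```

Each request carries an increasing CSeq number, and everything after SETUP repeats the session identifier so the server knows which session the command applies to.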

RTSP works hand in hand with the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP) to transfer these requests over the internet. It also works with the Real-time Transport Protocol (RTP). RTP is what actually delivers the video and audio streams, whereas RTSP just acts as the remote control for the stream.
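
To show what the RTP side of that split looks like, here is a small Python sketch that decodes the 12-byte fixed header found at the start of every RTP packet, following the layout in RFC 3550. The sample packet is built by hand for illustration rather than captured from a real camera, and the function name `parse_rtp_header` is just a label chosen here.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550) at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,            # top 2 bits, 2 for current RTP
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),    # often flags the last packet of a video frame
        "payload_type": second & 0x7F,    # 96-127 are dynamic types described in the SDP
        "sequence": sequence,             # increases by one per packet
        "timestamp": timestamp,           # media clock, not wall-clock time
        "ssrc": ssrc,                     # identifies the stream source
    }


# Hand-built sample: version 2, marker set, dynamic payload type 96.
sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1000, 90000, 0x1234ABCD) + b"payload"
header = parse_rtp_header(sample)
print(header["version"], header["payload_type"], header["sequence"])  # 2 96 1000
```

The sequence number and timestamp are what let a player reorder packets and pace playback, which is work RTSP itself never does.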

One other aspect to note is that ONVIF has had several updates and newer profiles added over time as surveillance technologies evolved. RTSP, by contrast, is an older protocol: version 1.0 was published as RFC 2326 in 1998. Older surveillance camera designs likely have built-in RTSP features, while newer cameras increasingly also offer WebRTC or other peer-to-peer (P2P) connectivity.

Peer-to-peer just means that the stream doesn’t have to go through a central server when traveling between the camera and the viewing device, like a smartphone or computer. RTSP is a client-server protocol rather than a P2P one: the two ends must be able to reach each other directly over the network, which firewalls and NAT often prevent, so remote access usually depends on port forwarding, a VPN, or a relay. WebRTC, however, is designed to set up a P2P connection across NAT where possible, which means the media stream can take a shorter route and arrive with lower latency.

ONVIF vs. RTSP with Nabto

Now that you understand the differences between RTSP and ONVIF, here’s how they work in conjunction with Nabto’s platform to enhance video surveillance systems.

For businesses that plan to scale their security systems over time, ONVIF provides a framework that supports the addition of new devices without worrying about compatibility issues, fostering system growth and upgrades. With Nabto’s platform, you can leverage Transmission Control Protocol (TCP) tunneling to enhance the functionality of your ONVIF-compliant cameras. TCP tunneling is a method used to encapsulate network protocol data within TCP packets to enable the transmission of data through network firewalls and Network Address Translation (NAT) devices.

TCP tunneling allows you to transmit RTSP streams securely over TCP, bypassing network firewalls and ensuring reliable video streaming. Additionally, when HTTP or SOAP is tunneled through TCP, the tunnel ensures secure and reliable delivery of data even through firewalls that might block other types of traffic. The HTTP or SOAP data is encapsulated within TCP packets, allowing it to pass through network security mechanisms without being blocked. Tunneled ONVIF commands can, for example, control pan, tilt, and zoom (PTZ) functions, allowing you to command your cameras remotely.

When using TCP tunneling through Nabto’s platform with ONVIF, the various commands and data (such as media streams controlled via RTSP and management commands sent via HTTP/SOAP) are encapsulated within TCP packets. This ensures compatibility and secure communication between ONVIF-compliant devices across different network environments.
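
As an example of what one of those HTTP/SOAP management commands looks like, here is a minimal Python sketch that builds the SOAP envelope for ONVIF’s GetDeviceInformation call. It only constructs the message and does not send it. The local address and port stand in for wherever a tunnel might expose the camera and are purely hypothetical, as is the assumption that the camera uses the `/onvif/device_service` path. Authentication, which real cameras usually require, is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"   # SOAP 1.2 envelope namespace
DEVICE_NS = "http://www.onvif.org/ver10/device/wsdl"  # ONVIF device service namespace


def build_get_device_information():
    """Build a SOAP 1.2 envelope for the ONVIF GetDeviceInformation request."""
    ET.register_namespace("s", SOAP_NS)
    ET.register_namespace("tds", DEVICE_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{DEVICE_NS}}}GetDeviceInformation")
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# Hypothetical local end of a tunnel; the port and path are assumptions.
TUNNEL_URL = "http://127.0.0.1:8080/onvif/device_service"

payload = build_get_device_information()
# A client would POST this payload with Content-Type: application/soap+xml.
print(f"POST {TUNNEL_URL}")
print(payload.decode("utf-8"))
```

Because the request is plain HTTP carrying XML, it can travel through a TCP tunnel unchanged; the client simply talks to the local end of the tunnel instead of the camera’s own address.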

Final thoughts

Looking for the right smart cameras for your surveillance system can quickly get confusing. Hopefully you’re now better equipped to understand the differences that might make a particular camera a great choice or a less than ideal option. If you’re interested in learning more about high-quality smart video surveillance options, contact us and request a consultation.

Read our other resources:

We’ve also published a range of IoT resources for our community, including:

Our complete guide to RTSP
Our guide to IoT protocols for developers in 2023
Our RTC explainer, which discusses the importance of real-time communication in IoT