This post describes the challenges we encountered and the considerations we made when implementing new RTSP live video streaming apps for iOS and Android. It also describes why we ended up using GStreamer and our experiences with that library.
New Video Apps for Nabto Edge Needed!
At Nabto we continually seek to improve our solutions and technology – to have easy to use products and examples that are written in a sane and flexible manner while being secure and performant to boot. Towards this goal we regularly roll out new releases and seek compliance with latest standards to keep our IoT platform well-maintained and secure.
In our latest endeavour we’ve superseded our Nabto Micro platform with the new Nabto Edge platform. As Nabto Edge is a completely rewritten platform, we’ve had to update our example programs and apps to work with the new platform.
Remaking old example apps and making new ones gave us an opportunity to face both old and new challenges again, this time with hindsight as a tool in our belt. One of these challenges is displaying a live video feed from IoT devices such as security cameras. Many such security cameras use the RTSP network protocol to set up and control audio and video streams. RTSP typically uses TCP as its transport protocol, and the media itself can be interleaved on that same TCP connection.
Nabto Edge makes it possible for a client to communicate with an IoT device directly without firewall hassle, and allows for higher performance as data flows directly between clients and devices. One of the features we support is TCP tunnelling, which allows TCP clients to securely connect to TCP server applications that are run on an IoT device, even if that device is sitting behind a firewall.
Considering ExoPlayer and FFmpeg
With TCP tunnelling it becomes easy to connect a client video player to an RTSP server and display a video feed. However, this relies on the presence of a good video player that supports the RTSP protocol, which is usually outside of Nabto’s hands. Since we wanted to have example client applications on Android and iOS that could display an RTSP stream, we also had to find workable solutions for playing RTSP streams on these platforms. It turns out that many media players have limited or even no support for RTSP. On top of this, our customers want sub-second latency on the video feed, so our requirements were high from the start. Since we’re building example applications, it is also in our best interest to have a media player solution that is easy to integrate into an application.
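To make the tunnelling idea concrete, here is a minimal sketch in C of how a client might form the URL it hands to a video player. It assumes the tunnel exposes its local end as a TCP port on 127.0.0.1, and that the camera serves its stream at some path such as "/video". The function name build_tunnel_rtsp_url, the port and the path are hypothetical and are not taken from the Nabto Edge SDK.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any SDK): builds the RTSP URL a player
 * would open, assuming the tunnel's local endpoint listens on
 * 127.0.0.1:<local_port>. `path` is the stream path on the device's RTSP
 * server, with or without a leading slash.
 * Returns 0 on success, -1 on invalid arguments or if `out` is too small. */
int build_tunnel_rtsp_url(char *out, size_t out_len,
                          uint16_t local_port, const char *path)
{
    if (out == NULL || out_len == 0 || path == NULL || local_port == 0) {
        return -1;
    }

    /* Insert a slash only when the caller did not supply one. */
    const char *sep = (path[0] == '/') ? "" : "/";

    int n = snprintf(out, out_len, "rtsp://127.0.0.1:%u%s%s",
                     (unsigned)local_port, sep, path);

    /* snprintf reports the length it needed; treat truncation as an error. */
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return 0;
}
```

The player never sees the device's real address. It only talks to the local end of the tunnel, which is why any player with solid RTSP-over-TCP support would work here.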
On Android we started with ExoPlayer, Google’s application-level media player. ExoPlayer is the generally recommended media player solution for Android, even above Android’s own MediaPlayer API. ExoPlayer thankfully comes with low-latency RTSP support, but it is very limited in terms of supported video and audio formats. At the time of writing, ExoPlayer’s RTSP support only covers H.264 video and AAC or AC3 audio. While ExoPlayer was exceedingly simple to integrate into our example application and allowed us to quickly bootstrap a usable app, it also meant we would have to live with this limited format support.
On iOS the situation is significantly worse, as Apple’s AVFoundation framework has no support for RTSP. There are many third-party video players available, but many of them are old and unmaintained, or require a lot of work to integrate into an Xcode application (as is the case with many FFmpeg-based media players).
GStreamer!
After searching for a solution that would address our needs and preferably work on both platforms, we decided to use GStreamer, a well-known multimedia library for constructing both simple and complicated graphs of multimedia components. Very luckily, GStreamer supports both iOS and Android, can display RTSP streams with low latency, and is (relatively) simple to integrate.
The main hurdle with GStreamer is that, while it is higher-level than manually interfacing with hardware video decoders and the like, it is still a fairly low-level library written in C, and the API can seem very odd to someone who is not used to GLib-style programming.
On Android we wrote an integration layer in C that exposes only the functionality we needed to display RTSP streams using GStreamer’s high-level playbin component. We made a similar layer in Objective-C for iOS.
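As a rough illustration of how little playbin needs to be told, the sketch below formats a launch-style description that asks playbin to play a given URI. A string of this form is what the gst-launch-1.0 tool and the gst_parse_launch() function accept. This is only one possible approach and not necessarily what our integration layer does. An application can equally create the playbin element directly and set its uri property. The helper name build_playbin_description is hypothetical, and the sketch uses only the C standard library so it can be read without a GStreamer installation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `playbin uri="<uri>"` into `out`.
 * The URI is wrapped in double quotes, so URIs that themselves contain
 * a double quote or a backslash are rejected rather than escaped.
 * Returns 0 on success, -1 on invalid input or if `out` is too small. */
int build_playbin_description(char *out, size_t out_len, const char *uri)
{
    if (out == NULL || out_len == 0 || uri == NULL || uri[0] == '\0') {
        return -1;
    }

    /* Refuse characters that would break out of the quoted value. */
    if (strpbrk(uri, "\"\\") != NULL) {
        return -1;
    }

    int n = snprintf(out, out_len, "playbin uri=\"%s\"", uri);

    /* snprintf reports the length it needed; treat truncation as an error. */
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return 0;
}
```

Keeping the layer this narrow, essentially "give me a URI, show me video", is what allows the Kotlin and Swift code above it to stay unaware of GStreamer.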
The rest of our example applications are in Kotlin and Swift respectively, and we have high-level classes that encapsulate communication with the GStreamer integration layer. While this results in two layers of abstraction on top of GStreamer, it gives iOS and Android app developers an easy-to-understand API to work with, so that there’s no need to think about the GStreamer internals. That said, the GStreamer integration code itself is not terribly complicated.
We now have client applications on iOS and Android that internally use GStreamer and Nabto Edge to connect to an Edge-enabled device and tunnel through to an RTSP server on said device, so that GStreamer can display a video feed. With this we have achieved low-latency RTSP streaming (around 500 ms), support for both iOS and Android, and a simple API for app developers to use.
As of writing (March 2023), we are quite happy with GStreamer’s performance and reliability and are using it in our official iOS and Android examples. The integration code and full examples are available on GitHub.