Low latency H.264 streaming from Jetson TX2 to PC¶
- Glass to Glass delay
- Video Latency Test for H.264 RTSP Streaming
- Potential Components for low latency system
- Full image processing pipeline on Jetson TX2
- Benchmarks for one and two-camera systems
Real-time remote control of vision systems has become an important part of many applications, especially UAVs, ROVs and other types of autonomous vehicles.
Such drones and mobile robots, which have to be controlled remotely over a wireless network, are often equipped with multi-camera systems.
To enable these applications, their vision setups require minimum latency to ensure smooth video transfer and instant feedback.
This complicates the task, and this is where an NVIDIA Jetson multiplexing several cameras can help.
Glass to Glass delay¶
Low-delay video streaming is a common requirement of teleoperation scenarios such as telesurgery and virtual or augmented reality.
Some of these applications apply computer vision algorithms and neural networks to the video in real time.
To evaluate the latency and performance of such systems, it is essential to measure the glass-to-glass (G2G) delay of video acquisition and transmission.
Glass-to-glass video latency, also called end-to-end (E2E) video delay, is the period of time from the moment photons of a visible event pass through the lens glass of the camera until the final image is delivered to the glass of the monitor.
Video Latency Test for H.264 RTSP Streaming¶
The G2G measurements are non-intrusive and can be applied to a wide range of imaging systems.
Each camera has a fixed frame rate and produces new images at constant time intervals, but real-world events are never synchronized to that frame rate.
That is why the obtained G2G delay values are non-deterministic.
The idea of a simple glass-to-glass test is to show two time values on the monitor: the current time and the time of capture. The process and the components used are described below.
Progression steps during the test¶
- Get the current time on the PC and output it to the monitor
- Capture the image of displayed current time with the particular camera
- Send captured frame from the camera to Jetson TX2
- Image and video processing on Jetson TX2
- Video encoding to H.264 via NVENC
- Video streaming over a wireless network
- Video acquisition on external PC
- Video decoding on NVIDIA GPU via NVDEC
- Video output to the monitor via OpenGL
- Compare the current time and the time of capture
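The comparison in the last step can be sketched in a few lines of Python. This is a minimal sketch with hypothetical helper names; it assumes both displayed values come from the same PC clock, as in the test above:

```python
import time

def timestamp_ms() -> int:
    """Current time in integer milliseconds (the value rendered on screen)."""
    return time.monotonic_ns() // 1_000_000

def g2g_delay_ms(time_of_capture_ms: int, current_time_ms: int) -> int:
    """Glass-to-glass delay: difference between the clock value captured by
    the camera (after the whole pipeline) and the clock value shown next to it."""
    return current_time_ms - time_of_capture_ms

# Example: a frame showing 120034 ms is displayed while the live clock
# reads 120084 ms -> 50 ms glass-to-glass delay.
```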
NVIDIA Jetson TX2 SoM (System-on-Module):
- 256 CUDA cores
- Dual-core NVIDIA Denver 2 + quad-core ARM® Cortex®-A57 CPU complex
RTSP streams from the cameras can be received with the VLC application on a device such as a MacBook with macOS, or a desktop or laptop running Windows or Linux.
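For a latency test, VLC's default network cache (roughly a second of buffering) should be reduced. A hedged example invocation, where the RTSP URL and port are illustrative, not taken from the article:

```shell
# Assumed stream URL; replace <jetson-ip> with the board's actual address.
# --network-caching is given in milliseconds; 0 disables VLC's input buffering.
vlc --network-caching=0 rtsp://<jetson-ip>:8554/stream
```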
The final result is to see two numbers on the same monitor: one with the current time and the other with the time of capture.
The difference between them is the latency of the imaging system.
Factors that can influence the delay value¶
- OS and software timer latency on PC
- Refresh rate of the monitor
- Camera frame rate and exposure time
- OS latency of Jetson L4T
- Frame content and compression ratio
- Whether B-frames are enabled in the H.264 stream
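The last factor matters because a B-frame references frames that come after it, forcing the decoder to buffer at least one extra frame interval. As an illustration only, here is a generic ffmpeg/libx264 command, not the article's NVENC pipeline:

```shell
# -bf 0 disables B-frames; -tune zerolatency removes look-ahead and
# encoder-side frame buffering. Input and output names are placeholders.
ffmpeg -i input.mp4 -c:v libx264 -bf 0 -tune zerolatency out.h264
```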
Potential Components for low latency system¶
To connect cameras and other peripherals to Jetson TX2 a special carrier board is needed.
Described below is a proprietary carrier board (xEC2) designed by XIMEA to support multiple cameras.
xEC2 carrier board interfaces¶
Several slots make it possible to attach up to eight different XIMEA cameras to just one Jetson TX2:
- 4x PCIe X2G2 connectors for cameras with flat ribbon cable interface
- 1x PCIe X4G2 connector for cameras with flat ribbon cable interface
- 2x USB3 connectors for cameras with flat ribbon cable interface
- GPIO connector and GPIO DIP switches
- M.2 interface
- USB3 Type-A connector
- USB2 Micro-B connector
- Jetson pinout
- Power connector
- Fan power connector
- Wi-Fi antenna connectors located on Jetson TX2 module
The software stack of the system includes:
- Linux for Tegra (L4T) operating system
- Fastvideo SDK for Jetson
- XIMEA API with CamTool application
- MRTech runtime application with implemented pipelines for all connected cameras
- Application to control aperture and exposure for each camera
Full image processing pipeline on Jetson TX2¶
- Image acquisition from both cameras and zero-copy to Jetson TX2
- Black level
- White Balance
- HQLI Demosaicing
- Export to YUV
- H.264 encoding via NVENC
- RTSP streaming to the external PC via wireless network
- Video acquisition at external PC
- Video decoding via NVDEC on desktop GPU for both video streams
- Show video streams at the same monitor via OpenGL
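The per-frame processing stages above (excluding acquisition, streaming and display) can be sketched with NumPy. This is an illustrative stand-in only: a simple 2x2 Bayer-block demosaic replaces Fastvideo's HQLI algorithm (so the output is half resolution), and the black level and white-balance gains are made-up values, not camera calibration data:

```python
import numpy as np

def process_raw(raw: np.ndarray, black_level: int = 64,
                wb_gains=(1.9, 1.0, 1.6)) -> np.ndarray:
    """Sketch of the Jetson-side stages on a Bayer RGGB frame."""
    # 1. Black-level subtraction
    img = np.clip(raw.astype(np.float32) - black_level, 0, None)

    # 2. White balance: per-channel gains applied on the Bayer mosaic (RGGB)
    img[0::2, 0::2] *= wb_gains[0]   # R sites
    img[1::2, 1::2] *= wb_gains[2]   # B sites (green keeps gain 1.0)

    # 3. Demosaic: 2x2-block averaging as a stand-in for HQLI
    h, w = img.shape
    rgb = np.empty((h // 2, w // 2, 3), np.float32)
    rgb[..., 0] = img[0::2, 0::2]                          # R
    rgb[..., 1] = (img[0::2, 1::2] + img[1::2, 0::2]) / 2  # G
    rgb[..., 2] = img[1::2, 1::2]                          # B

    # 4. Export to YUV (BT.601, full range), ready for the NVENC encoder
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128.0
    return np.clip(np.stack([y, u, v], axis=-1), 0, 255).astype(np.uint8)
```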
Benchmarks for one and two-camera systems¶
These are the benchmarks for a system with two XIMEA MX031CG-SY-X2G2 cameras and a receiving desktop station with an NVIDIA Quadro P2000 GPU, streaming over Wi-Fi.
- Image sensor: Sony IMX252, 1/1.8", 2064×1544 resolution (3.2 MPix), Global shutter, up to 218 fps
- Camera frame rate 70 fps; 14 ms between frames for each camera
- Exposure time: 1 ms
- PCIe data transfer: 4 ms
- Processing pipeline: 10 ms for one camera, 15 ms for two cameras
- Network: 1 ms average
- Display refresh rate: 144 Hz; 7 ms between frames
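Summing the worst case of each stage gives a rough latency budget. Note that the encode/decode figure below is an assumption for illustration, since the article does not list it separately:

```python
# Worst-case per-stage latency budget, one camera, in milliseconds.
# All values except "encode + decode" come from the list above.
budget_ms = {
    "sensor exposure": 1,
    "wait for next frame (70 fps)": 14,
    "PCIe transfer": 4,
    "processing pipeline": 10,
    "NVENC encode + NVDEC decode": 10,  # assumption, not given in the article
    "network": 1,
    "wait for display refresh (144 Hz)": 7,
}
worst_case = sum(budget_ms.values())
print(worst_case)  # → 47, which falls inside the measured 35–66 ms range
```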
Measured glass-to-glass delay values¶
The G2G benchmarks below include both video processing and video transmission.
- Full resolution, 1 camera: 35–66 ms, average/median 50 ms
- Full resolution, 2 cameras: 50–68 ms, average/median 60 ms
- 1080p/720p ROI: ~5 ms lower, which is within the measurement accuracy of this test
The presented method yields reasonably accurate values, which depend on the complexity of the video processing pipeline.
More precise latency measurement methods use a light-emitting diode (LED) as the light source and a phototransistor as the light detector.
In such a setup, a blinking LED placed in the camera's field of view acts as the signal generator, and the phototransistor is attached to the spot on the display where the LED image appears.
The LED triggers an oscilloscope, which also records the phototransistor signal, so the G2G delay can be read directly from the two traces.
The analysis of the data could also be automated with a microcontroller board.
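If the oscilloscope or microcontroller exports the edge timestamps, the per-blink analysis could be automated with a helper like this sketch (the function name and data format are hypothetical):

```python
def g2g_from_edges(led_on_ms, sensor_on_ms):
    """Pair each LED turn-on timestamp with the next phototransistor
    detection and return the per-blink glass-to-glass delays in ms."""
    delays, it = [], iter(sensor_on_ms)
    pending = next(it, None)
    for t_led in led_on_ms:
        # Skip stale detections that happened before this LED edge.
        while pending is not None and pending < t_led:
            pending = next(it, None)
        if pending is None:
            break
        delays.append(pending - t_led)
        pending = next(it, None)
    return delays
```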