
Benchmark comparison for Jetson Nano, TX2 and AGX Xavier

NVIDIA® Jetson is the world's leading embedded platform for image processing and DL/AI tasks.
Its high performance, compact size, range of configurations and low power consumption make it well suited to mobile, compute-intensive projects, including deep learning.
NVIDIA has released a series of Jetson SBC (single-board computer) hardware modules aimed at embedded vision systems and applications.
XIMEA has developed a carrier board for Jetson TX2 and offers a wide portfolio of cameras that are able to run on Jetson Nano and AGX Xavier.

Hardware features for Jetson Nano, TX2, AGX Xavier

The following is a brief comparison of the Jetson modules' hardware features, showing the range of setup options for different markets.

| Feature | Nano | TX2 / TX2i | Xavier |
|---|---|---|---|
| CPU (ARM) | 4-core ARM A57 @ 1.43 GHz | 4-core ARM Cortex-A57, 2-core Denver2 @ 2 GHz | 8-core ARM Carmel v8.2 @ 2.26 GHz |
| GPU | 128-core Maxwell @ 921 MHz | 256-core Pascal @ 1.3 GHz | 512-core Volta @ 1.37 GHz |
| Memory | 4 GB LPDDR4, 25.6 GB/s | 8 GB 128-bit LPDDR4, 58.3 GB/s | 16 GB 256-bit LPDDR4, 137 GB/s |
| Storage | MicroSD | 32 GB eMMC 5.1 | 32 GB eMMC 5.1 |
| Tensor cores | N/A | N/A | 64 |
| Video encoding | (1x) 4Kp30, (2x) 1080p60, (4x) 1080p30 | (1x) 4Kp60, (3x) 4Kp30, (4x) 1080p60, (8x) 1080p30 | (4x) 4Kp60, (8x) 4Kp30, (32x) 1080p30 |
| Video decoding | (1x) 4Kp60, (2x) 4Kp30, (4x) 1080p60, (8x) 1080p30 | (2x) 4Kp60, (4x) 4Kp30, (7x) 1080p60 | (2x) 8Kp30, (6x) 4Kp60, (12x) 4Kp30 |
| USB | (4x) USB 3.0 + Micro-USB 2.0 | USB 3.0 + USB 2.0 | (3x) USB 3.1 + (4x) USB 2.0 |
| PCIe | 4 lanes PCIe Gen 2 | 5 lanes PCIe Gen 2 | 16 lanes PCIe Gen 4 |
| Power | 5 W / 10 W | 7.5 W / 15 W | 10 W / 15 W / 30 W |
| Size | 70 x 45 mm | 90 x 50 mm | 100 x 87 mm |

In camera applications, Host-to-Device transfers can usually be hidden by using GPU zero-copy memory or by overlapping GPU copies with computation.
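To illustrate why hiding the transfers matters, here is a minimal timing model in Python. It is a sketch, not a measurement. It assumes that overlap is perfect, so that uploads, kernels and downloads of consecutive frames run fully in parallel. The function name `frame_period_ms` is hypothetical. The example numbers are taken from the Nano column of the kernel-time table later in this document.

```python
def frame_period_ms(h2d_ms, compute_ms, d2h_ms, overlapped):
    """Steady-state time between two finished frames, in milliseconds.

    Assumption (not from the benchmark): when overlapped is True, copies and
    kernels of consecutive frames run fully in parallel, so the slowest of
    the three stages sets the pace. Otherwise the stages run back to back.
    """
    if overlapped:
        return max(h2d_ms, compute_ms, d2h_ms)
    return h2d_ms + compute_ms + d2h_ms


# Nano: white balance 0.6 + HQLI debayer 1.8 + JPEG 4:2:0 encoding 4.3 = 6.7 ms
compute = 6.7
serial = frame_period_ms(0.2, compute, 0.1, overlapped=False)
hidden = frame_period_ms(0.2, compute, 0.1, overlapped=True)
print(f"serialized: {serial:.2f} ms, overlapped: {hidden:.2f} ms")
```

Under these assumptions the copy time disappears from the frame period whenever the compute stage is the longest one, which is the case for every pipeline that can be built from the kernel times reported here.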

Performance Comparison: Jetson Nano vs TX1 vs TX2 vs AGX Xavier

To compare the performance of the modules fairly, the following basic image processing tasks were chosen.
They are typical of camera applications: white balance, demosaic (debayer), color correction, optional resize, JPEG encoding, etc.
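For readers unfamiliar with these stages, the sketch below shows two of them, white balance and color correction, applied to a single RGB pixel in plain Python. It is illustrative only: the benchmarked implementations are GPU kernels from the Fastvideo SDK and are not shown here. The interpretation of the 3×4 color correction matrix as a 3×3 mixing matrix plus an offset column is an assumption, and the gain and matrix values are hypothetical.

```python
def white_balance(rgb, gains):
    """Scale each channel by its own gain (per-channel multiplication)."""
    return tuple(c * g for c, g in zip(rgb, gains))


def color_correct(rgb, matrix_3x4):
    """Apply a 3x4 matrix to one pixel.

    Assumption: columns 0..2 mix the R, G, B inputs and column 3 is a
    constant offset added to each output channel.
    """
    r, g, b = rgb
    return tuple(row[0] * r + row[1] * g + row[2] * b + row[3]
                 for row in matrix_3x4)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
GAINS = (2.0, 1.0, 1.5)
IDENTITY_3X4 = (
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
)

pixel = (10.0, 20.0, 30.0)
balanced = white_balance(pixel, GAINS)
corrected = color_correct(balanced, IDENTITY_3X4)
print(balanced, corrected)
```

Both operations are independent per pixel, which is why they map well onto a GPU and why their kernel times are small compared with debayering or encoding.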

Hardware and software for benchmarking

  • CPU/GPU NVIDIA Jetson Nano, TX1, TX2/TX2i, AGX Xavier
  • OS L4T (Ubuntu 18.04)
  • CUDA Toolkit 10.0 for Jetson Nano, TX2/TX2i, AGX Xavier
  • Fastvideo SDK 0.14.2

GPU kernel times for 2K image processing (1920×1080, 8/16 bits per channel, milliseconds)

| Algorithm and parameters | Nano | TX2 / TX2i | Xavier |
|---|---|---|---|
| Host to Device | 0.2 | 0.2 | 0.05 |
| White Balance | 0.6 | 0.24 | 0.08 |
| HQLI Debayer | 1.8 | 0.47 | 0.36 |
| DFPD Debayer | 4.7 | 2.06 | 0.95 |
| MG Debayer | 12.7 | 5.9 | 2.2 |
| Color Correction with 3×4 matrix | 1.7 | 0.81 | 0.25 |
| Resize from 2K to 960×540 | 10 | 4.3 | 1.5 |
| Resize from 2K to 1919×1079 | 19.8 | 8.2 | 2.4 |
| Gamma (1920×1080) | 1.4 | 0.84 | 0.2 |
| JPEG Encoding (1920×1080, 90%, 4:2:0) | 4.3 | 1.7 | 0.62 |
| JPEG Encoding (1920×1080, 90%, 4:4:4) | 6.8 | 2.6 | 0.75 |
| JPEG2000 Encoding (lossy, 32×32, single mode) | 81 | 63 | 11.1 |
| JPEG2000 Encoding (lossless, 32×32, single mode) | 190 | 163 | 23.3 |
| Device to Host | 0.1 | 0.1 | 0.02 |

It is possible to choose a particular debayer algorithm and output compression (JPEG or JPEG2000) to define the image processing pipeline.
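As a rough guide to what a chosen pipeline costs, the kernel times can simply be added up. The Python sketch below does this for one example pipeline, using values copied from the table above. The names `KERNEL_MS`, `pipeline_ms` and `max_fps` are hypothetical. The result is an optimistic estimate: it assumes the stages run back to back with no kernel launch overhead, synchronization or camera acquisition time, none of which the table covers.

```python
PLATFORMS = ("Nano", "TX2 / TX2i", "Xavier")

# Kernel times in milliseconds for 1920x1080 frames, copied from the table.
KERNEL_MS = {
    "Host to Device": (0.2, 0.2, 0.05),
    "White Balance": (0.6, 0.24, 0.08),
    "HQLI Debayer": (1.8, 0.47, 0.36),
    "DFPD Debayer": (4.7, 2.06, 0.95),
    "MG Debayer": (12.7, 5.9, 2.2),
    "Color Correction": (1.7, 0.81, 0.25),
    "Gamma": (1.4, 0.84, 0.2),
    "JPEG 4:2:0": (4.3, 1.7, 0.62),
    "JPEG 4:4:4": (6.8, 2.6, 0.75),
    "Device to Host": (0.1, 0.1, 0.02),
}


def pipeline_ms(stages, platform):
    """Sum of the kernel times of the given stages on one platform."""
    col = PLATFORMS.index(platform)
    return sum(KERNEL_MS[stage][col] for stage in stages)


def max_fps(stages, platform):
    """Frame-rate ceiling if the stages ran back to back with no overhead."""
    return 1000.0 / pipeline_ms(stages, platform)


# Example pipeline: one debayer choice and one compression choice.
PIPELINE = ["Host to Device", "White Balance", "DFPD Debayer",
            "Color Correction", "Gamma", "JPEG 4:2:0", "Device to Host"]

for name in PLATFORMS:
    print(f"{name}: {pipeline_ms(PIPELINE, name):.2f} ms per frame, "
          f"at most {max_fps(PIPELINE, name):.0f} fps")
```

For this example the sums come to about 13.0 ms on Nano, 5.95 ms on TX2 / TX2i and 2.17 ms on Xavier. Swapping "DFPD Debayer" for "MG Debayer", or JPEG for JPEG2000, shows how strongly the debayer and compression choices dominate the total.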

Fastvideo has also made the same kernel time measurements for NVIDIA GeForce and Quadro GPUs and published them in a separate document.