GPU Software for XIMEA high speed Cameras¶

Cameras for machine vision, industrial and scientific applications typically transfer RAW data, which needs a complex image processing pipeline to be converted to RGB in real time.
This is a computationally heavy task, especially for high-bandwidth cameras.
An example of such a camera is the 65 Mpix model, which can stream over 70 frames per second through a PCI Express Gen3 interface.
Data from this model could be processed on Intel/AMD CPUs, but this solution is difficult to accelerate further, especially in multi-camera systems.
One workaround is to implement simplified algorithms on the CPU, because most high-quality algorithms are slow even on multicore CPUs. The slowest are demosaicing, denoising, color grading, undistortion, resize, rotation to an arbitrary angle and compression.
An alternative is to use the Fastvideo SDK, which runs on NVIDIA GPUs, so the full image processing pipeline is executed on the graphics processing unit and the CPU is left free for other tasks.
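To give a feeling for the data rates involved, the short calculation below estimates the pixel and byte throughput of such a stream. It is only a sketch: the function name is hypothetical, and it assumes a nominal 65e6 pixels per frame, exactly 70 frames per second and tightly packed samples with no padding or transfer overhead.

```python
# Rough data-rate estimate for an uncompressed RAW camera stream.
# Assumptions (not taken from a datasheet): nominal 65e6 pixels per frame,
# exactly 70 fps, tightly packed samples, no padding or protocol overhead.

def stream_rate(pixels_per_frame, fps, bits_per_pixel):
    """Return (Gpix/s, GB/s) for the given frame size, frame rate and bit depth."""
    pix_per_s = pixels_per_frame * fps
    bytes_per_s = pix_per_s * bits_per_pixel / 8
    return pix_per_s / 1e9, bytes_per_s / 1e9


if __name__ == "__main__":
    for bits in (8, 12, 16):
        gpix, gbytes = stream_rate(65e6, 70, bits)
        print(f"{bits:2d}-bit: {gpix:.2f} Gpix/s, {gbytes:.2f} GB/s")
```

Under these assumptions the pipeline has to sustain roughly 4.5 Gpix/s, regardless of how many bits each pixel occupies on the wire.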

Fast CinemaDNG Processor software¶

The Fast CinemaDNG Processor software for Windows is based on the Fastvideo SDK engine and provides high-performance, high-quality RAW image processing.
The image quality it offers is comparable to the results of RAW processing in RawTherapee, Adobe Camera Raw and Lightroom, while processing is significantly faster.
To check GPU-based performance for a specific camera in Fast CinemaDNG Processor, RAW images can be converted to DNG format with the open source PGM2DNG converter, which can be downloaded from GitHub.
Below is a description of a sample project with details about the whole process.

GPU Camera Sample Project¶

To show how to implement software with a GPU image processing pipeline, the following is required:

Source code and links to supplementary libraries for the gpu-camera-sample project can be found on GitHub.

Simple image processing pipeline on GPU for machine vision applications¶

A sample application for capturing RAW images from XIMEA cameras can be found in the XIMEA SDK, and its code can be incorporated into the final solution.
It is possible to use the default camera parameters and focus on GPU-based image processing, or to add a GUI for controlling the camera parameters.

Raw image capture (8-bit, 12-bit packed/unpacked, 16-bit)
Import to GPU
Optional raw data conversion and unpacking
Linearization curve
Bad pixel removal
Dark frame subtraction
Flat-field correction
White Balance
Exposure correction (brightness control)
Debayer with HQLI (5×5 window), DFPD (11×11), MG (23×23) algorithms
Wavelet-based denoising
Gamma
JPEG compression
Output to monitor with minimum latency
Export from GPU to CPU memory
Storage of compressed data to SSD
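A few of the per-pixel stages listed above can be illustrated with a small CPU-only sketch. It is not the Fastvideo SDK implementation: all function names are hypothetical, the Bayer pattern is assumed to be RGGB, pixel values are assumed to be normalized to [0, 1], demosaicing and denoising are left out, and a plain power-law gamma stands in for the sRGB curve.

```python
# CPU-only sketch of a few RAW correction stages on a Bayer mosaic.
# Hypothetical helpers; assumes an RGGB pattern and values in [0, 1].

def dark_subtract(raw, dark):
    """Subtract the dark frame pixel by pixel, clamping at zero."""
    return [[max(p - d, 0.0) for p, d in zip(r_row, d_row)]
            for r_row, d_row in zip(raw, dark)]


def flat_field(img, flat):
    """Divide by the flat-field frame, scaled by its mean (flat must be nonzero)."""
    mean = sum(map(sum, flat)) / (len(flat) * len(flat[0]))
    return [[p * mean / f for p, f in zip(i_row, f_row)]
            for i_row, f_row in zip(img, flat)]


def white_balance(img, gains):
    """Scale each Bayer site by its channel gain (RGGB layout assumed)."""
    r, g, b = gains
    pattern = ((r, g), (g, b))
    return [[min(p * pattern[y % 2][x % 2], 1.0) for x, p in enumerate(row)]
            for y, row in enumerate(img)]


def gamma_encode(img, gamma=2.2):
    """Apply a power-law gamma, a simplification of the sRGB transfer curve."""
    return [[p ** (1.0 / gamma) for p in row] for row in img]


def run_corrections(raw, dark, flat, gains):
    """Chain the stages in the same order as they appear in the pipeline list."""
    img = dark_subtract(raw, dark)
    img = flat_field(img, flat)
    img = white_balance(img, gains)
    return gamma_encode(img)
```

Every stage here touches each pixel independently, which is the property that makes this part of the pipeline map well onto a GPU.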

The Fastvideo SDK contains many more image processing functions that could be added to this pipeline.
The pipeline can be extended as further source code becomes available (see the roadmap at the end of this page).

Software architecture¶

It is also feasible to build a multi-camera solution based on this software.
The simplest way to test this is to run several processes at the same time, one per camera.
A more sophisticated approach is to create a single image loader that collects frames from different cameras for further processing on the GPU.
Benchmarks on the NVIDIA GeForce RTX 2080 Ti show that GPU-based RAW image processing is very fast, with total performance reaching up to 4 Gpix/s, or more with multiple GPUs.
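The single image loader idea can be sketched as several capture threads feeding one shared queue. This is only an illustration under stated assumptions: the function names are hypothetical, the capture loop is a stand-in that produces placeholder bytes instead of calling the camera SDK, and the point where frames would be uploaded to the GPU is marked with a comment.

```python
import queue
import threading


def camera_worker(cam_id, n_frames, out_queue):
    """Stand-in for a capture loop; a real one would call the camera SDK."""
    for idx in range(n_frames):
        frame = bytes([cam_id]) * 16           # placeholder payload, not RAW data
        out_queue.put((cam_id, idx, frame))
    out_queue.put((cam_id, None, None))        # end-of-stream marker for this camera


def collect_frames(n_cameras, n_frames):
    """Single loader: merge frames from all cameras in arrival order."""
    shared = queue.Queue(maxsize=8)            # bounded, so slow processing throttles capture
    workers = [threading.Thread(target=camera_worker, args=(c, n_frames, shared))
               for c in range(n_cameras)]
    for w in workers:
        w.start()
    collected, finished = [], 0
    while finished < n_cameras:
        cam_id, idx, frame = shared.get()
        if idx is None:
            finished += 1
        else:
            collected.append((cam_id, idx, frame))  # frames would be sent to the GPU here
    for w in workers:
        w.join()
    return collected
```

Tagging every frame with its camera index keeps the streams separable after they have been merged, so per-camera calibration data can still be applied downstream.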

Thread for GUI and visualization (app main thread)
Thread for image acquisition from a camera
Thread to control CUDA-based image processing
Thread for OpenGL rendering
Thread for async data writing to SSD or streaming
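The hand-off between the acquisition, processing and writing threads listed above could look like the following sketch. It assumes bounded queues between stages and a sentinel object for shutdown. All names are hypothetical, the GUI and OpenGL threads are omitted, and the "processing" step is a trivial placeholder rather than CUDA code.

```python
import queue
import threading

STOP = object()  # sentinel passed down the pipeline to shut each stage down


def acquire_stage(n_frames, to_process):
    """Stand-in for image acquisition from a camera."""
    for i in range(n_frames):
        to_process.put(i)
    to_process.put(STOP)


def process_stage(to_process, to_write):
    """Stand-in for the thread that controls CUDA-based processing."""
    while (item := to_process.get()) is not STOP:
        to_write.put(item * 2)
    to_write.put(STOP)


def write_stage(to_write, sink):
    """Stand-in for asynchronous writing to SSD or streaming."""
    while (item := to_write.get()) is not STOP:
        sink.append(item)


def run_pipeline(n_frames):
    to_process, to_write, sink = queue.Queue(4), queue.Queue(4), []
    threads = [
        threading.Thread(target=acquire_stage, args=(n_frames, to_process)),
        threading.Thread(target=process_stage, args=(to_process, to_write)),
        threading.Thread(target=write_stage, args=(to_write, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink
```

Because the queues are bounded, a slow writer eventually blocks the processing stage instead of letting frames pile up in memory without limit.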

Different compression options are also available on the GPU at the end of the pipeline:
JPEG (MJPEG), JPEG2000 (MJ2K), H.264 and H.265 encoders.
Please note that H.264/H.265 encoding is implemented via the hardware-based NVIDIA NVENC encoder, so this compression can run in parallel with CUDA code.

65 Mpix camera tests¶

The software can also work with RAW images in PGM format that are stored on an external SSD.
This is a good way to evaluate and test the software without a camera.
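For readers who want to prepare or inspect such test frames, a minimal parser for binary PGM (P5) data might look like this. It is a sketch, not the loader used by the sample project: the function name is hypothetical, and it relies only on the Netpbm convention that samples with a maximum value above 255 are stored as 16-bit big-endian words.

```python
import struct


def parse_pgm(data):
    """Parse a binary PGM (P5) buffer into (width, height, maxval, pixels)."""
    pos, tokens = 0, []
    while len(tokens) < 4:                        # magic, width, height, maxval
        while data[pos:pos + 1].isspace():        # skip whitespace between tokens
            pos += 1
        if data[pos:pos + 1] == b"#":             # skip a header comment line
            pos = data.index(b"\n", pos) + 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    if tokens[0] != b"P5":
        raise ValueError("not a binary PGM file")
    width, height, maxval = (int(t) for t in tokens[1:])
    pos += 1                                      # one whitespace byte precedes the raster
    count = width * height
    if maxval < 256:
        pixels = list(data[pos:pos + count])      # one byte per sample
    else:                                         # two bytes per sample, big-endian
        pixels = list(struct.unpack(f">{count}H", data[pos:pos + 2 * count]))
    return width, height, maxval, pixels
```

A file on disk can be passed in with `parse_pgm(open(path, "rb").read())`, where `path` is whatever PGM frame is being tested.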

Initial tests for the mentioned 65 Mpix camera (9344 x 7000 resolution) with the Bayer Gpixel GMAX3265 image sensor, using RAW frames in PGM format, gave the following results:

8-bit mode - the total processing time was around 15 ms, which corresponds to more than 4 Gpix/s.
12-bit mode - the total processing time was 21 ms.
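The relation between these timings and the quoted throughput can be checked with a short calculation. The helper names are hypothetical, the frame size is taken as a nominal 65e6 pixels, and the frame rate estimate assumes frames are processed strictly one after another with no overlap between stages.

```python
# Convert per-frame processing time into throughput figures.
# Assumptions: nominal 65e6 pixels per frame, sequential frame processing.

def throughput_gpix_s(pixels_per_frame, time_ms):
    """Pixels processed per second, in Gpix/s, for a given per-frame time."""
    return pixels_per_frame / (time_ms * 1e-3) / 1e9


def max_fps(time_ms):
    """Upper bound on frame rate if each frame takes time_ms to process."""
    return 1000.0 / time_ms


if __name__ == "__main__":
    for mode, ms in (("8-bit", 15.0), ("12-bit", 21.0)):
        print(f"{mode}: {throughput_gpix_s(65e6, ms):.2f} Gpix/s, "
              f"up to {max_fps(ms):.1f} fps")
```

With these assumptions the 8-bit figure comes out at about 4.3 Gpix/s, consistent with the "more than 4 Gpix/s" stated above, and the 12-bit figure at about 3.1 Gpix/s.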

The pipeline included:
Data acquisition, dark frame, FFC, linearization, BPC, white balance, debayer HQLI, gamma sRGB, 16/8-bit transform, JPEG compression (subsampling 4:2:0, quality 90), viewport texture copy, monitor output at 30 fps.

The high performance of JPEG encoding on the GPU is confirmed by the fact that a 65 Mpix color image is compressed within 3.3 ms on the NVIDIA GeForce RTX 2080 Ti.
The user can test different NVIDIA GPUs to choose the best hardware in terms of price and performance for a particular task.

Roadmap for gpu-camera-sample project¶

GPU pipeline for monochrome cameras - in progress
H.264/H.265 encoders - in progress
Linux version - in progress
Resize
Unsharp mask
Rotation to an arbitrary angle
Image acquisition and processing for RAW-SDI and HD-SDI cameras
JPEG2000 encoder
Realtime raw compression (lossless and/or lossy) on GPU
Curves and Levels via 1D LUT
Color correction with 3x3 matrix
Other color spaces
3D LUT for HSV and RGB with cube size 17, 33, 65, 96, 256, etc.
Defringe
Special version for NVIDIA Jetson hardware (Nano, TX2, Xavier)
Interoperability with external FFmpeg and GStreamer