Developing 4K Video Projects With FPGAs

Ultra high definition content is everywhere, but technical challenges in developing systems to process 4K Ultra HD resolution data persist.


Achieving higher resolution is a never-ending race for camera, TV and display manufacturers. Since 4K ultra high definition (Ultra HD) imaging emerged in the market, it has become the main standard for today’s multimedia products. 4K Ultra HD makes bigger screens practical, giving an immersive feeling without the pixelation that was a problem on big screens. 4K content is everywhere, from live sports broadcasting to video conferencing on our mobile devices. There are, however, many technical challenges in developing systems to process 4K Ultra HD resolution data. As an example, a 4K frame is 3840 x 2160 pixels (about 8.3 Mpixel) and is refreshed at 60Hz, equating to about 500 Mpixel/sec. Processing 4K frames in real time therefore requires a high-performance system. Another bottleneck is power consumption, particularly for embedded devices where power is critical. Being low power yet high performance, FPGAs have shown strong potential to tackle these challenges. In this blog, you’ll learn what you need to know to start developing a 4K video conferencing project using FPGAs.
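To make the throughput figure concrete, here is a minimal Python sketch of the arithmetic, using only the frame size and refresh rate quoted above. The variable names are mine and purely illustrative.

```python
# Pixel throughput of a 4K Ultra HD stream, from the figures quoted in the text.
WIDTH = 3840       # pixels per line
HEIGHT = 2160      # lines per frame
REFRESH_HZ = 60    # frames per second

pixels_per_frame = WIDTH * HEIGHT
pixels_per_second = pixels_per_frame * REFRESH_HZ

print(f"{pixels_per_frame / 1e6:.1f} Mpixel per frame")   # about 8.3
print(f"{pixels_per_second / 1e6:.0f} Mpixel/s")          # just under 500
```

Every one of those pixels has to be captured, moved and displayed each second, which is why the rest of this blog focuses on the data path as much as on the processing itself.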

Are FPGAs the right choice for 4K Ultra HD image/video processing?

4K Ultra HD image/video applications combine a high volume of data with computationally intensive algorithms. There are four main candidates for the processing technology: GPU, CPU, ASIC or FPGA. Of these, the last is proving increasingly popular.

FPGAs deliver high performance, low latency and low power consumption, and they are reconfigurable. These are the most important requirements for an embedded system responsible for image/video processing.

Let’s now briefly discuss why FPGAs are the best choice for 4K Ultra HD image or video processing; if you want a more detailed explanation, I strongly recommend reading an earlier blog I wrote, “FPGA vs GPU for Machine Learning Applications: Which one is better?”. FPGAs can easily support massive parallelism, thanks to the high number of I/O pins and the flexibility of designing any module in pure hardware. This is an advantage over a CPU. Low power and low latency, which come from not being tied to a pre-built architecture, give FPGAs the edge over GPUs. Lastly, FPGA reconfigurability means faster time to market and lower overall costs compared to an ASIC. FPGA I/Os are highly configurable: you can set the direction, voltage, slew rate, speed and terminations at any time. This reconfigurability also lets us optimize the image processing algorithm whenever we need to.

Now that we know why FPGAs are superior, let’s check out an example design for 4K Ultra HD video conferencing using FPGAs.

A solution for 4K video conferencing using Zynq FPGA

Aldec’s TySOM product line includes Xilinx Zynq-7000 and US+ MPSoC FPGA-based embedded development boards and FMC daughter cards.

For this project, we have used a TySOM-3-ZU7EV board, which carries an XCZU7EV device. The board has a wide range of peripherals (including HDMI 2.0 IN/OUT, QSFP+, DisplayPort, Ethernet and USB 3.0), which makes it ideal for a 4K Ultra HD video conferencing project. In addition, the ZU7EV device includes an H.264/H.265 Video Codec Unit (VCU), which can compress and decompress 4K @ 60Hz video in real time.

The main advantage of using a Zynq device is the flexibility of having a hardened multi-core ARM processor alongside programmable logic. For computationally intensive applications such as video processing, the heavy algorithms can be offloaded to the FPGA for acceleration.

In the following sections, we go over two different solutions, as shown in figure 1:

  • 4K Ultra HD video conferencing using QSFP+ (lossless raw image data)
  • 4K Ultra HD video conferencing by encoding image data (TCP/IP over Ethernet transferring, Video Codec Unit – VCU encoding/decoding)

Figure 1: Aldec 4K UltraHD Imaging Solution

4K Ultra HD video conferencing using QSFP+ (lossless raw image data)

In this project, the QSFP+ connector on the TySOM-3-ZU7EV board is used. The main goal is to demonstrate the transfer of uncompressed 4K video over the high-speed, high-bandwidth and low-latency QSFP+ interface. The data transfer is done using Aurora, a Xilinx point-to-point high-speed serial link protocol. Without any encoding/decoding, the resultant bandwidth for 2160p at a 60Hz refresh rate is about 1 GB/s (roughly 8 Gbit/s), which is well beyond the reach of the widely used 1Gbit Ethernet.

As shown in figure 2, the 4K video is captured using a MIPI 4K camera and is processed by the MIPI CSI-2 IP inside the programmable logic of the Zynq device. The data is buffered in the DDR4 memory by the PS and then passed to the Aurora TX SS IP inside the FPGA. It is transferred through a QSFP+ cable to the second board, where it is received by an Aurora RX SS IP inside the FPGA. There it is stored in DDR4 memory by the PS and then shown on the 4K HDMI screen using the HDMI 2.0 TX SS IP placed inside the FPGA. The same process runs from the second board to the first, which completes the 4K video conference.
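The bandwidth figure can be checked with a short Python sketch. The pixel format is not stated here, so the 16 bits per pixel below (as in 8-bit YUV 4:2:2) is my assumption, chosen because it reproduces the quoted figure; the function and constant names are illustrative only.

```python
def raw_video_rate_bits(width, height, fps, bits_per_pixel):
    """Return the uncompressed video data rate in bits per second."""
    return width * height * fps * bits_per_pixel

# Assumption: 16 bits per pixel (e.g. 8-bit YUV 4:2:2); not stated in the text.
BITS_PER_PIXEL = 16
GIGABIT_ETHERNET_BITS = 1e9  # nominal line rate of 1Gbit Ethernet

rate_bits = raw_video_rate_bits(3840, 2160, 60, BITS_PER_PIXEL)
rate_gbytes = rate_bits / 8 / 1e9
fits_gigabit_ethernet = rate_bits <= GIGABIT_ETHERNET_BITS

print(f"Raw 2160p60 stream: {rate_gbytes:.2f} GB/s ({rate_bits / 1e9:.2f} Gbit/s)")
print(f"Fits in 1Gbit Ethernet: {fits_gigabit_ethernet}")
```

Under that assumption the raw stream needs roughly eight times the nominal capacity of a 1Gbit link, before any protocol overhead is counted. A deeper pixel format would only widen the gap.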

Figure 2: Aldec 4K UltraHD Real Time Video Processing Solution Using QSFP+ High Speed Data Transferring on TySOM Board

Aldec provides this project on Linux as either a GUI-less command line application or a more advanced Qt-based GUI application with additional user controls over the Sony IMX274 image sensor. Both applications support image display on HDMI (up to 2160p@60Hz) or DP (up to 2160p@30Hz) monitors.

4K Ultra HD video conferencing by encoding image data

Once encoded, the image data is compact enough to be transferred over Gigabit Ethernet networks. So, for this project, and as you can see in figure 3, the VCU is used to encode and decode the image data.

The stream of 4K video is received and processed using the MIPI CSI-2 IP inside the FPGA. The video is then encoded by the VCU inside the EV device using the H.265 standard. The encoded data is then transferred to the other board over TCP/IP. On the second board, the data is received over Ethernet, decoded by the VCU and displayed on the 4K HDMI screen. The same process is performed in reverse to complete the 4K video conference. Figure 3 shows the structure of the design in detail.
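TCP delivers a byte stream rather than discrete messages, so an application sending encoded video over it needs some way to mark where each chunk ends. The Python sketch below shows one common approach, a length prefix. It is not the framing used in the Aldec design, which is not described here: the 4-byte header, the function names and the use of socket.socketpair() as a stand-in for the link between the two boards are all my assumptions.

```python
import socket
import struct

# Hypothetical framing: each chunk is preceded by a 4-byte big-endian length.
HEADER = struct.Struct("!I")

def send_chunk(sock, payload):
    """Send one length-prefixed chunk of encoded data."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes; a stream socket may return fewer per recv() call."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(part)
    return bytes(buf)

def recv_chunk(sock):
    """Receive one length-prefixed chunk and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

# Loopback demo: a connected socket pair stands in for the two boards.
sender, receiver = socket.socketpair()
encoded = bytes(range(256)) * 4   # placeholder bytes, not real H.265 data
send_chunk(sender, encoded)
received = recv_chunk(receiver)
sender.close()
receiver.close()

print(f"sent {len(encoded)} bytes, received {len(received)} bytes")
```

On real hardware the sender would be fed by the encoder output and the receiver would hand each chunk to the decoder, with one connection per direction of the conference.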

Figure 3: Aldec 4K UltraHD Real Time Video Processing Solution Using Ethernet Gbit on TySOM Board

There are six predefined presets used to configure the VCU encoder and decoder hardware: AVC (Low, Medium, High) and HEVC (Low, Medium, High). The Low, Medium and High target bitrates are 10, 30 and 60 Mbit/s for both AVC and HEVC. The rest of the compression settings (which include Profile, Rate control, GoP) are the same for all the presets used in the design.

Both projects are also available in a single-board version, which can be used in 4K video processing projects.

Here are the main hardware components and features used in this design:
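To put the target bitrates in perspective, this Python sketch tabulates the six presets and compares each with the raw stream and with a 1Gbit link. The raw rate assumes a 2160p source at 60Hz with 16 bits per pixel, which is my assumption rather than a documented setting, and the dictionary layout is illustrative, not the actual VCU configuration format.

```python
# Target bitrates from the text, in Mbit/s, identical for both codecs.
PRESET_BITRATES_MBIT = {"Low": 10, "Medium": 30, "High": 60}
CODECS = ("AVC", "HEVC")

# Assumption: 2160p at 60Hz with 16 bits per pixel for the uncompressed source.
RAW_MBIT = 3840 * 2160 * 60 * 16 / 1e6
LINK_MBIT = 1000  # nominal 1Gbit Ethernet capacity

# Build the six (codec, level) presets.
presets = {
    (codec, level): rate
    for codec in CODECS
    for level, rate in PRESET_BITRATES_MBIT.items()
}

for (codec, level), rate in presets.items():
    ratio = RAW_MBIT / rate          # compression ratio versus the raw stream
    link_share = 100 * rate / LINK_MBIT
    print(f"{codec:4} {level:6} {rate:3} Mbit/s  "
          f"about {ratio:.0f}:1, {link_share:.0f}% of the link")
```

Even the High preset uses only a small fraction of the link under these assumptions, leaving headroom for the return stream and for protocol overhead.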

  1. Leopard Imaging LI-IMX274MIPI-FMC (v1.1) based on Sony IMX274 Imager as video source device.
  2. 4K-capable monitor with HDMI/DP interfaces as video display device.
  3. Zynq UltraScale+ built-in hardware blocks: PS 1Gbit Ethernet controller, QSFP+ connector, *Mali GPU.
  4. Zynq UltraScale+ PL-side soft IP blocks: MIPI CSI2 RX SS, HDMI 2.0 TX SS (up to 2160p@60Hz), Aurora TX/RX SS.
  5. QSFP+ compliant copper or optical cable.

*Mali GPU is used in the single-board GUI version only.

At Aldec, we support engineers and help them fast-track their projects by providing feature-rich development boards and ready-to-use reference designs for ADAS, IoT, Deep Learning, Networking and many more.

These reference designs are all developed for TySOM boards and FMC daughter cards. If you are interested in developing embedded vision applications using FPGAs, I highly recommend you read the following articles:

How to develop high-performance deep neural network object detection/recognition applications for FPGA-based edge devices

What is Bird’s Eye View ADAS Application and How to Develop This Using Zynq UltraScale+ MPSoC FPGA?
