



## IST R&D. FP6-Priority 2. SPECIFIC TARGETED RESEARCH PROJECT

**Project Deliverable** 

| SUIT Doc Number                 | SUIT_434                                                             |
|---------------------------------|----------------------------------------------------------------------|
| Project Number                  | IST-4-028042                                                         |
| Project Acronym+Title           | SUIT - Scalable, Ultra-fast and Interoperable Interactive Television |
| Deliverable Nature              | Prototype                                                            |
| Deliverable Number              | D5.10                                                                |
| Contractual Delivery Date       | 30 November 2007                                                     |
| Actual Delivery Date            | 15 December 2007                                                     |
| Title of Deliverable            | SVC_real_time_encoder_prototype                                      |
| Contributing Workpackage        | WP5                                                                  |
| Project Starting Date; Duration | 01/02/2006; 27 months                                                |
| Dissemination Level             | PU                                                                   |
| Author(s)                       | Pierre Andrivon (Vitec), Denis Fortin (Vitec), Olivier Guye (Vitec)  |

### Abstract

This document describes the final version of a real-time implementation of a Scalable Video Coding algorithm. The encoder can be used to provide Multiple Descriptions, which allow information lost to packet erasures to be recovered in an error-prone communication environment. It relies on the scalable extension of the H.264/AVC algorithm and represents the last step towards a real-time implementation that can process formats up to HDTV.

**Keyword list:** Advanced Video Coding, Scalable Video Coding, Multiple Description Coding, Real-Time Processing, High Definition Television

# PUBLIC

# SVC Real-Time Encoder Prototype

SUIT\_434 15-12-2007

# **Table of Contents**

| Section | Title                                              |
|---------|----------------------------------------------------|
| 1       | Introduction                                       |
| 1.1     | Scope                                              |
| 1.2     | Objective                                          |
| 2       | Presentation of the Real-Time Scalable Video Coder |
| 3       | Real-Time Scalable Video Coder User Interface      |
| 4       | Conclusion                                         |
| 5       | Acronyms                                           |
| 6       | References                                         |
| 7       | Annex: VP3 Data Sheet                              |

### 1 Introduction

### 1.1 Scope

This document was produced within the framework of the SUIT FP6 project. Its scope is to present and briefly describe the Real-Time Scalable Video Coder that has been provided to the project. This Real-Time Scalable Video Coder can deal with video formats up to High Definition.

### 1.2 Objective

The main objective of this document is to give a short description of the Real-Time Scalable Video Coder that has been provided to the project.

This encoder is based on a proprietary MPEG-4 AVC encoder that has been extended to become compliant with the different upcoming SVC profiles.

This report is complementary to D5.8 [1], which details how this real-time encoder has been built.

| IST-4-028042 | SUIT   | Deliverable D5.10 |
|--------------|--------|-------------------|
|              | Page 5 |                   |

### 2 Presentation of the Real-Time Scalable Video Coder

The platform shown in this document is a first real-time release built on a multi-core GPP architecture equipped with a single multi-DSP board used for video capture.

This real-time video encoder is compliant with the current SVC A (Baseline Profile) and B (High Profile) profiles, which address CGS SNR scalability.

It includes the Fidelity Range Extensions (FRExt) tools used in the AVC HP profiles in order to obtain the same visual rendering for HDTV broadcast as for SDTV using MPEG-2 at high bit rates.

The encoder architecture is multi-sliced and multi-threaded in order to make full use of the computing power of a multiprocessor architecture. It achieves real-time performance on an eight-core Intel server (two quad-core processors).
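The multi-sliced, multi-threaded organisation can be illustrated with a minimal Python sketch. It is not the VITEC encoder: `split_into_slices`, `encode_slice` and `encode_frame` are hypothetical names, the "encoding" is a placeholder checksum, and slices are assumed to be contiguous bands of 16-line macroblock rows.

```python
from concurrent.futures import ThreadPoolExecutor

MB_SIZE = 16  # macroblock height in luma lines (H.264)


def split_into_slices(height, num_slices):
    """Split a frame of `height` lines into bands of whole macroblock rows.

    Returns a list of (first_line, last_line_exclusive) tuples.
    """
    mb_rows = (height + MB_SIZE - 1) // MB_SIZE  # round up to whole MB rows
    base, extra = divmod(mb_rows, num_slices)
    bounds, row = [], 0
    for i in range(num_slices):
        rows = base + (1 if i < extra else 0)  # spread the remainder
        start = row * MB_SIZE
        end = min((row + rows) * MB_SIZE, height)
        bounds.append((start, end))
        row += rows
    return bounds


def encode_slice(frame, bounds):
    """Placeholder for a real slice encoder: returns a checksum of the band."""
    start, end = bounds
    return sum(sum(line) for line in frame[start:end]) & 0xFFFF


def encode_frame(frame, num_slices, num_workers):
    """Process all slices of one frame concurrently, keeping slice order."""
    bounds = split_into_slices(len(frame), num_slices)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(lambda b: encode_slice(frame, b), bounds))


# A synthetic 720-line frame, 6 slices, 8 worker threads.
frame = [[(x + y) % 256 for x in range(64)] for y in range(720)]
slice_bounds = split_into_slices(720, 6)
payloads = encode_frame(frame, num_slices=6, num_workers=8)
```

In CPython the threads above only illustrate the structure, since pure Python code does not run in parallel across cores; a native encoder would map each slice to a real hardware thread.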

Inputs and outputs are handled by the VP3 multi-DSP board described in the annex of this document. The encoder can be used either to capture live feeds and stream them over Ethernet or to process video files.



Figure 1: Multiprocessor RT-SVC Platform

### 3 Real-Time Scalable Video Coder User Interface

The RT-SVC encoder must be used through a proprietary application interface developed by VITEC Multimedia and named LiveWire.

Using visual programming, a user can set up an application based on components commercialised by VITEC Multimedia. LiveWire provides a visual editor in which the user chooses the components required for the application from a library of available ones, then links them by connecting component outputs to component inputs, as in GraphEdit (the graphical editor for interconnecting DirectShow filters). In this way the application is built progressively and takes the shape of a directed graph, as shown below.



Figure 2: LiveWire Visual Programming

The LiveWire graphical editor outputs an XML description file, from which the application is built dynamically when it is run under the LiveWire runtime environment, as shown in the figure below.

The commands that control the running encoder are shown under the video preview screen. On the right-hand side, a small set of commands allows text and logos to be inserted into the video stream before encoding (OSD).

In the bottom left corner, a button gives access to the application settings.

[Screenshot: LiveWire Sample Application window, showing the video preview, the encoder controls (Start, Split, Pause, Resume, Stop), the still-image grab controls, the OSD panel (Load Image, Load Text, X/Y position, font name and size, bold and italic styles, text and background colours), the output file names for the still image (grab.bmp) and the MPEG encoder (test.264), and the Settings button.]

Figure 3: RT-SVC Encoder User Interface

The next figure shows the parameter settings window. At first use, the default parameters can be modified to define how encoding must be performed. After modification, the LiveWire XML file is updated with the newly selected values.

Concerning the RT-SVC encoder, it can be seen that encoding is controlled by defining:

- the resolution of the input stream;
- its frame rate;
- whether Picture Adaptive Field Frame (PAFF) mode is required;
- whether the input stream is a YV12 or an IYUV stream (chroma swapping);
- whether CAVLC or CABAC is selected for entropy coding;
- whether the in-loop deblocking filter is used;
- the GoP size;
- the number of quality layers to be generated after the base layer;
- the number of spatial resolutions to be produced from the input resolution;
- the number of processors to be used for encoding;

- the number of slices to use for each spatial resolution;
- the array of bit rates to reach for each pair of spatial resolution and quality layer.

| Parameter (Properties window)  | Value        |
|--------------------------------|--------------|
| Assembly                       | HD H264      |
| Profile                        | HD 1280x720p |
| Input                          | PAL          |
| **H264 Encoder**               |              |
| Width                          | 1280         |
| Height                         | 720          |
| Frame rate                     | 25           |
| PAFF                           | 0            |
| Chromas swapping               | 0            |
| Entropy coding mode            | 1            |
| Deblocking filter              | 1            |
| GoP                            | 50           |
| Inter P distance (B frames)    | 0            |
| CGS layers number              | 1            |
| Added spatial layers number    | 2            |
| Cores number                   | 8            |
| AVC layer slice number         | 1            |
| Resolution 1 slice number      | 1            |
| Resolution 2 slice number      | 6            |
| AVC layer bitrate              | 250000       |
| Layer 1 bitrate                | 500000       |
| Layer 2 bitrate                | 750000       |
| Layer 3 bitrate                | 1500000      |
| Layer 4 bitrate                | 4000000      |
| Layer 5 bitrate                | 6000000      |
| Layer 6 bitrate                | (illegible)  |

Window buttons: Show Hidden Parameters, OK, Cancel, Apply, Help.

Figure 4: RT-SVC Encoder Settings Window
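The values in Figure 4 can be cross-checked with a small Python sketch. The `EncoderSettings` class and its field names are hypothetical (they are not LiveWire identifiers), and the counting rule is an assumption inferred from the parameter list above: one bit rate per pair of spatial resolution and quality level, with the base resolution and the base quality level each counted once in addition to the added layers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EncoderSettings:
    """Hypothetical container mirroring the values shown in Figure 4."""
    width: int
    height: int
    frame_rate: int
    gop_size: int
    cgs_layers: int             # quality layers generated after the base layer
    added_spatial_layers: int   # spatial resolutions added to the base one
    cores: int
    slices_per_resolution: List[int]  # AVC layer first, then each added resolution
    bitrates: List[int]               # AVC layer first, then Layer 1, 2, ...

    def resolutions(self) -> int:
        return self.added_spatial_layers + 1

    def expected_bitrates(self) -> int:
        # Assumption: one target per (spatial resolution, quality level) pair.
        return self.resolutions() * (self.cgs_layers + 1)

    def problems(self) -> List[str]:
        """Return human-readable inconsistencies (empty list if none)."""
        found = []
        if len(self.slices_per_resolution) != self.resolutions():
            found.append("one slice count is needed per spatial resolution")
        if len(self.bitrates) != self.expected_bitrates():
            found.append("unexpected number of bit rates")
        if any(b <= 0 for b in self.bitrates):
            found.append("bit rates must be positive")
        return found


figure4 = EncoderSettings(
    width=1280, height=720, frame_rate=25, gop_size=50,
    cgs_layers=1, added_spatial_layers=2, cores=8,
    slices_per_resolution=[1, 1, 6],
    bitrates=[250000, 500000, 750000, 1500000, 4000000, 6000000],
)
issues = figure4.problems()
total_slices = sum(figure4.slices_per_resolution)
```

Under these assumptions the Figure 4 values are self-consistent: three resolutions with two quality levels each give six bit-rate targets (AVC layer plus Layers 1 to 5), and the slice counts (1 + 1 + 6) add up to the eight cores selected.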

### 4 Conclusion

This document has presented the Real-Time Scalable Video Coder, which can be adapted to provide Multiple Description coded video streams and can deal with a wide range of video formats, including High Definition. This software platform allows decoding as well as encoding of video feeds.

It was built from a proprietary AVC implementation by adding the SVC extensions. The platform offers three types of scalability: spatial, temporal and SNR. It takes into account the main evolutions of the MPEG-4 SVC encoding format.

The platform shown in this document is a first real-time release built on a multi-core GPP architecture equipped with a single multi-DSP board used for video capture.

It has been provided to the project in order to set up a real-time demonstration of live encoding and decoding.

### 5 Acronyms

| Acronym | Meaning                                              |
|-------|--------------------------------------------------------|
| AVC   | Advanced Video Coding                                  |
| BP    | Baseline Profile                                       |
| CABAC | Context Adaptive Binary Arithmetic Coding              |
| CAVLC | Context Adaptive Variable Length Coding                |
| CGS   | Coarse Grain SNR scalability                           |
| DSP   | Digital Signal Processor                               |
| FRExt | Fidelity Range Extensions                              |
| GoP   | Group of Pictures                                      |
| GPP   | General Purpose Processor                              |
| HD    | High Definition                                        |
| HDTV  | High Definition Television                             |
| HP    | High Profile                                           |
| MPEG  | Moving Picture Experts Group                           |
| OSD   | On-Screen Display                                      |
| PAFF  | Picture Adaptive Field-Frame                           |
| RT    | Real Time                                              |
| SDTV  | Standard Definition Television                         |
| SNR   | Signal-to-Noise Ratio                                  |
| SVC   | Scalable Video Coding                                  |
| VP3   | Video Parallel Programming Platform (VITEC Multimedia) |
| XML   | Extensible Markup Language                             |

### 6 References

[1] SUIT-309, D5.8: Real-Time SVC Encoder – FP6-IST SUIT Project, November 2007

### 7 Annex: VP3 Data Sheet



### ARCHITECTURE OF THE BOARD

VP<sup>3</sup> implements 8 TMS320DM642<sup>™</sup> DSPs from Texas Instruments running at 600 MHz (and soon 720 MHz, 1 GHz, ...), thus providing up to 38.4 GIPS. With a maximum of 4 operations per instruction (4 operations of 8 bits, 2 operations of 16 bits or 1 operation of 32 bits), this gives a maximum of 153.6 GOPS. Each DSP has a private local memory of 128 MB (SDRAM running at 100 MHz with a 64-bit bus, which provides a throughput of 800 MB/s).
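The peak figures quoted above follow from simple arithmetic, reproduced here in Python as a check. The only value not stated in this paragraph is the eight instructions per cycle, which is taken from the DSP specifications later in this annex; the variable names are illustrative.

```python
NUM_DSPS = 8
CLOCK_HZ = 600e6               # 600 MHz per DSP
INSTRUCTIONS_PER_CYCLE = 8     # eight 32-bit instructions per cycle (DSP specifications)
MAX_OPS_PER_INSTRUCTION = 4    # four 8-bit operations packed in one instruction

mips_per_dsp = CLOCK_HZ * INSTRUCTIONS_PER_CYCLE / 1e6
board_gips = NUM_DSPS * CLOCK_HZ * INSTRUCTIONS_PER_CYCLE / 1e9
board_gops = board_gips * MAX_OPS_PER_INSTRUCTION

SDRAM_CLOCK_HZ = 100e6         # 100 MHz local SDRAM
SDRAM_BUS_BITS = 64
sdram_mb_per_s = SDRAM_CLOCK_HZ * SDRAM_BUS_BITS / 8 / 1e6

print(f"{mips_per_dsp:.0f} MIPS per DSP")
print(f"{board_gips:.1f} GIPS, {board_gops:.1f} GOPS for the board")
print(f"{sdram_mb_per_s:.0f} MB/s per local memory")
```

These are theoretical peaks: they assume that every functional unit issues a packed 8-bit operation on every cycle.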



### CONFIGURABLE TOPOLOGY

The VP<sup>3</sup> architecture is highly flexible and can be configured to fit the specific needs of the developer's applications. Each DSP has 3 powerful, configurable video ports, which are one of the ways the DSPs communicate with each other. The topology of the array of 8 processors can be defined as a simple pipeline of eight processors, a fully parallel scheme or a mix of both. A cross-bar implemented in an FPGA interconnects the video ports of the DSPs.
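As an illustration of these topologies, the following Python sketch builds the list of point-to-point links that the cross-bar would have to provide in each case. It is a conceptual model only: the function names, the `"in"`/`"out"` endpoint labels and the representation of links as pairs are assumptions, not the VP<sup>3</sup> configuration interface.

```python
def pipeline_links(num_dsps=8):
    """Input -> dsp0 -> dsp1 -> ... -> dsp(n-1) -> output."""
    nodes = ["in"] + [f"dsp{i}" for i in range(num_dsps)] + ["out"]
    return list(zip(nodes[:-1], nodes[1:]))


def parallel_links(num_dsps=8):
    """Every DSP receives the input and drives the output directly."""
    links = []
    for i in range(num_dsps):
        links.append(("in", f"dsp{i}"))
        links.append((f"dsp{i}", "out"))
    return links


def mixed_links(num_pipelines=2, stages=4):
    """Mix of both: several pipelines side by side, e.g. 2 pipelines of 4 DSPs."""
    links = []
    for p in range(num_pipelines):
        chain = ["in"] + [f"dsp{p * stages + s}" for s in range(stages)] + ["out"]
        links += list(zip(chain[:-1], chain[1:]))
    return links


pipe = pipeline_links()
par = parallel_links()
mixed = mixed_links()
```

In this model every DSP uses one video port as input and one as output, which stays within the three ports available per DSP.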



#### INTER-DSPs COMMUNICATION

DSPs can exchange information with each other in several ways:

- Direct memory-to-memory block exchanges through the DMA controller services. High-performance inter-processor DMA communication channels have been optimized to allow 1-to-1, 1-to-n or 1-to-all data exchanges.
- Video data or raw data through their video ports and the cross-bar.
- The host buses of the DSPs are linked to the PCI interface, through which the DSPs can send messages to and receive messages from the host.

#### DMA CONTROLLER PRINCIPLE

One DSP initiates a memory block transfer by sending a request to the DMA controller, specifying the list of destination DSPs that are to receive the message. The hardware controller puts the source and destination DSPs in hold state and takes control of their local memories, reading the source memory and writing into all the destination memories simultaneously. Once the transfer is complete, the DMA controller sends an interrupt to the source DSP to signal that its message has been sent, and to the destination DSPs to signal that they have received a message.
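The sequence described above can be modelled with a small Python simulation. This is a behavioural sketch under stated assumptions, not the VP<sup>3</sup> driver or firmware interface: the `Dsp` and `DmaController` classes, their methods and the use of a `bytearray` as local memory are all hypothetical.

```python
class Dsp:
    """Toy model of one DSP with a private local memory."""

    def __init__(self, ident, mem_size=256):
        self.ident = ident
        self.memory = bytearray(mem_size)
        self.on_hold = False
        self.interrupts = []  # log of interrupts received

    def interrupt(self, reason):
        self.interrupts.append(reason)


class DmaController:
    """Toy model of the hardware controller performing 1-to-n block transfers."""

    def __init__(self, dsps):
        self.dsps = {d.ident: d for d in dsps}

    def transfer(self, source_id, dest_ids, src_addr, dst_addr, length):
        source = self.dsps[source_id]
        dests = [self.dsps[i] for i in dest_ids]
        involved = [source] + dests
        # 1. Put the source and destination DSPs in hold state.
        for dsp in involved:
            dsp.on_hold = True
        # 2. Read the source block once and write it to every destination.
        block = bytes(source.memory[src_addr:src_addr + length])
        for dsp in dests:
            dsp.memory[dst_addr:dst_addr + length] = block
        # 3. Release the DSPs and notify both ends by interrupt.
        for dsp in involved:
            dsp.on_hold = False
        source.interrupt("message sent")
        for dsp in dests:
            dsp.interrupt(f"message received from {source_id}")


dsps = [Dsp(i) for i in range(8)]
dma = DmaController(dsps)
dsps[0].memory[0:4] = b"\x01\x02\x03\x04"
# DSP 0 requests a 1-to-3 transfer towards DSPs 1, 2 and 5.
dma.transfer(source_id=0, dest_ids=[1, 2, 5], src_addr=0, dst_addr=16, length=4)
```

The simulation writes to the destinations one after another, whereas the hardware controller is described as writing to all destination memories simultaneously; only the ordering of hold, copy and interrupt is meant to be representative.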



#### HARDWARE CO-PROCESSORS

The FPGA implementing the cross-bar service is also the one interconnecting the digital video inputs and outputs to the processor array. It is large enough (up to 600,000 gates: Xilinx XC2S600E) to host a hardware coprocessor implementing your own preprocessing algorithms acting on the video data itself (for instance: filters, scaler, ...). The content of the FPGA can be downloaded at the initialization of the board from the flash memory or by software.

The FPGA implementing the DMA controller is also large enough (up to 600,000 gates: Xilinx XC2S600E) to host another hardware coprocessor implementing your own computation algorithms that are too heavy or complex to be implemented in software (for instance: CABAC, ...). The content of the FPGA can be downloaded at the initialization of the board from the flash memory or by software.

#### DAUGHTER BOARD

VP<sup>3</sup> includes a main board and a daughter board. The main board includes the PCI and Ethernet interfaces, and the daughter board includes the video and audio inputs and outputs. To develop a specific daughter board with other kinds of video and audio formats (DVI, analog, ...), please contact a Vitec representative.



### SOFTWARE TOOLS

VP<sup>3</sup> comes with a complete software environment:

- Windows WDM driver (.SYS) able to support multi-board applications in the same PC,
- The source code of a sample application running under Windows XP/2000 and addressing the WDM driver directly,
- Source code of DSP sample applications such as:
  - video pass-through, audio pass-through,
  - Memory BIST,
  - DMA transfers.

Vitec recommends using the Texas Instruments development tools to develop software for the DSPs themselves:

- C/C++ compiler,
- simulator,
- emulator via the RTDX protocol and standard JTAG connector.

Multi-DSP applications can be emulated by running several instances of the TI emulator software under Windows and using the JTAG connector on the **VP**<sup>3</sup> board.



### SOFTWARE TOOLS

LiveWire™ is a complete framework for developers who want to use VP3 hardware to develop a product running under Windows. LiveWire™ provides a set of ready-to-use connectable components, leading to a drastic cut in development time. LiveWire™ has many advantages. It:

- ensures a highly flexible, scalable, truly customizable solution,
- is designed to allow well-structured parallel development,
- lets developers concentrate on solution-specific tasks,
- overcomes the limitations of existing technologies such as DirectShow and COM in general,
- allows live reconnection of functional components without interruption of active processes,
- is compatible with Win32, COM and scripting languages (Visual Basic, JavaScript, ...),
- takes advantage of XML-based technologies and uses the Apache Xerces XML parser,
- provides different levels of SDK abstraction, from a high-level API for scripting languages, through a Win32 API for limited backward compatibility, to a low-level set of COM interfaces for advanced development in C++.

LiveWire™ parts:

- LiveWire Core
- LiveWire Components
- LiveWire XML-based Profiles
- LiveWire Custom Components Wizard for MS Visual Studio C++
- LiveWire Multiplatform Shell
- LiveWire SDK
- LiveWire Tutorial and Samples

To start using LiveWire™-based products, all you have to do is create an instance of the Assembly Container, initialize it with an XML-based Configuration Profile and run it. Sophisticated profiles can be created without extensive programming, using the Integrated Property Page or by editing the XML file directly in the text editor of your choice. Persistence of component parameters then comes automatically.
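To make the idea of an XML-based profile concrete, the sketch below reads and updates a small profile with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`assembly`, `component`, `param`) and the helper functions are invented for illustration; they do not reproduce the actual LiveWire schema or API, which this document does not describe. The parameter values are those shown in Figure 4.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile: the tag and attribute names are NOT the real LiveWire schema.
PROFILE = """
<assembly name="HD H264">
  <component type="H264 Encoder">
    <param name="Width" value="1280"/>
    <param name="Height" value="720"/>
    <param name="Frame rate" value="25"/>
    <param name="Cores number" value="8"/>
  </component>
</assembly>
"""


def load_params(xml_text, component_type):
    """Return {parameter name: integer value} for one component of the profile."""
    root = ET.fromstring(xml_text)
    for component in root.iter("component"):
        if component.get("type") == component_type:
            return {p.get("name"): int(p.get("value")) for p in component.iter("param")}
    raise KeyError(component_type)


def update_param(xml_text, component_type, name, value):
    """Return the profile text with one parameter changed (persistence step)."""
    root = ET.fromstring(xml_text)
    for component in root.iter("component"):
        if component.get("type") == component_type:
            for p in component.iter("param"):
                if p.get("name") == name:
                    p.set("value", str(value))
    return ET.tostring(root, encoding="unicode")


params = load_params(PROFILE, "H264 Encoder")
updated = update_param(PROFILE, "H264 Encoder", "Cores number", 4)
```

Keeping the parameters in a text profile rather than in code is what allows them to be edited either from a property page or in a plain text editor, as described above.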

Very little programming is needed to use advanced features such as Command Scheduling and Atomic Command Blocks. With a few extra lines of code you can complete an application capable of running execution scripts with frame-accurate precision.

The creation of custom LiveWire™ components is simplified by the Wizard, and they can easily be integrated into existing Assemblies.

The most important advantage of the SDK is the layered structure of the LiveWire™ framework, which allows a quick development cycle.

#### TECHNICAL SPECIFICATIONS

| Category                | Item               | Details                                                                                                                                                     |
|-------------------------|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Inputs / outputs        | Video inputs       | SDI, HD-SDI                                                                                                                                                 |
|                         | Audio inputs       | AES/EBU, audio de-embedding from SDI                                                                                                                        |
|                         | Video output       | SDI, HD-SDI                                                                                                                                                 |
|                         | Audio output       | AES/EBU, audio embedded in SDI                                                                                                                              |
|                         | DVB-ASI            | Input and output                                                                                                                                            |
| Connectors / interfaces | PCI interface      |                                                                                                                                                             |
|                         | Ethernet           | 10 Base-T or 100 Base-TX Ethernet using a single RJ-45 connector                                                                                            |
|                         | Emulation          | [...] emulation hardware support. This is used to connect VP<sup>3</sup> to the TI emulator software                                                        |
| Other specifications    | Usage of the board | PCI plug-in card, or stand-alone equipment with an external power supply (+5 V and +3 V) and a large Flash memory to store the program and the FPGA contents |
|                         | Size               | 324 mm x 107 mm (12.76 inch x 4.21 inch)                                                                                                                    |

#### DSPs SPECIFICATIONS

VP<sup>3</sup> uses 8 TMS320DM642<sup>™</sup>:

- High-Performance Digital Media Processor:
  - 600-MHz Clock Rate (and soon 720 MHz, 1 GHz, ...), Eight 32-Bit Instructions/Cycle, 4800 MIPS,
  - Fully Software-Compatible With C64x.
- VelociTI.2 Extensions to VelociTI Advanced Very-Long-Instruction-Word (VLIW) TMS320C64x DSP Core:
  - Eight Highly Independent Functional Units With VelociTI.2 Extensions:
    - Six ALUs (32-/40-Bit), Each Supports Single 32-Bit, Dual 16-Bit, or Quad 8-Bit Arithmetic per Clock Cycle,
    - Two Multipliers Support Four 16 x 16-Bit Multiplies (32-Bit Results) per Clock Cycle or Eight 8 x 8-Bit Multiplies (16-Bit Results) per Clock Cycle,
  - Load-Store Architecture With Non-Aligned Support,
  - 64 32-Bit General-Purpose Registers,
  - Instruction Packing Reduces Code Size,
  - All Instructions Conditional.
- Instruction Set Features:
  - Byte-Addressable (8-/16-/32-/64-Bit Data),
  - 8-Bit Overflow Protection,
  - Bit-Field Extract, Set, Clear,
  - Normalization, Saturation, Bit-Counting,
  - VelociTI.2 Increased Orthogonality.
- L1/L2 Memory Architecture:
  - 128K-Bit (16K-Byte) L1P Program Cache (Direct Mapped),
  - 128K-Bit (16K-Byte) L1D Data Cache (2-Way Set-Associative),
  - 2M-Bit (256K-Byte) L2 Unified Mapped RAM/Cache (Flexible RAM/Cache Allocation).
- Endianness: Little Endian
- Enhanced Direct-Memory-Access (EDMA) Controller (64 Independent Channels)
- 10/100 Mb/s Ethernet MAC (EMAC)
- Three Configurable Video Ports: support Multiple Resolutions and Video Standards
- Three 32-Bit General-Purpose Timers
- IEEE-1149.1 (JTAG†)

C64x, VelociTI.2, VelociTI and TMS320C64x are trademarks of Texas Instruments. All trademarks are the property of their respective owners.

† IEEE Standard 1149.1-1990 Standard Test Access Port and Boundary Scan Architecture.