Achieving ultra-compact, multi-display digital signage systems with high-performance Accelerated Processing Units

Designers of digital signage systems have long been challenged to achieve high-end graphics and video performance with conventional CPU- and discrete GPU-based embedded boards that present board area and cooling challenges. With the advent of Accelerated Processing Units (APUs), however, digital signage system designers are equipped to achieve new levels of multimedia performance and visual immersion while striking a balance of form and function in small form factor, power-efficient solutions.

With the continued evolution of embedded system technology, digital signage designers are better equipped than ever before to achieve ambitious design goals that meet customers’ exacting multimedia performance and functionality requirements. Designing on x86 embedded boards and modules enables these designers to achieve PC-caliber performance and agility complemented by a rich ecosystem of industry-standard, x86-optimized software, applications, and development environments. Collectively, these efficiencies yield significantly leaner cost structures for embedded board providers and designers alike, and help provide smooth scalability from low-end digital signage offerings to high-end offerings via a single underlying embedded processing platform.

Yet digital signage designers continue to bump up against frustrating performance and flexibility limitations when using embedded boards built around conventional processing platforms and/or ad-hoc chipsets, particularly when developing HD-enabled, multi-display signage systems for space-constrained environments. Where previously power-hungry multicore CPUs and add-on graphics cards dominated the digital signage technology landscape, the emergence of embedded boards and modules equipped with new-generation APUs is facilitating advanced graphics capabilities within an extremely small footprint. The merging of advanced x86 computing capabilities with the parallel processing power of a General-Purpose Graphics Processing Unit (GPGPU) in a single device allows OEMs to design graphics-intensive digital signage systems that deliver advanced multimedia experiences in single-screen and/or multi-screen configurations.

APUs for advanced processing performance and space savings

Conventional chipset architectures that integrate graphics processing rely on the CPU to interface with the GPU via a North Bridge connection: the CPU sends calls to the GPU to invoke code running on the co-processor, which then sends results back to the CPU. This serial data processing approach adds considerable memory latency, consumes system power, and uses more board space.

High-performance APUs combine a CPU and advanced HD-caliber GPU in a tightly integrated, power-efficient, and extremely compact form factor (Figure 1). The CPU takes care of the scalar processing, including memory, networking, and storage processing, and runs the Operating System (OS), applications, and User Interface (UI). The on-die GPU offloads graphics and multimedia processing using Single Instruction, Multiple Data (SIMD) parallel processing, while driving high-definition video displays through DisplayPort or HDMI.

Figure 1: A Heterogeneous System Architecture (HSA) combines a CPU and GPU into one Accelerated Processing Unit (APU), which is then coupled with a companion controller hub to produce an efficient two-chip processing architecture.

Combining the CPU and GPU on a single APU die provides a robust architecture. APUs can offload data-parallel processing, including multimedia streaming, from the CPU to the GPU. Freed from this task, the CPU can focus on compute, memory, and I/O requests with much lower latency, thereby improving real-time graphics processing performance via a fully optimized data path and shared access to the memory controller. Additionally, with both CPU and GPU architectures collocated on the same die, an application can be written as one program using OpenCL, thus reducing latency and overhead (Sidebar 1). This CPU plus GPU combination is called the Heterogeneous System Architecture (HSA).
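
To make the idea of data-parallel offload concrete, the following is a conceptual sketch only. It uses plain Python to imitate the shape of an OpenCL-style kernel, in which the same small function runs once per data element. The names brighten_kernel and run_data_parallel are hypothetical and are not part of OpenCL or any AMD software; a real implementation would write the kernel in OpenCL C and launch it through an OpenCL runtime.

```python
# Conceptual sketch only: plain Python standing in for an OpenCL-style kernel.
# brighten_kernel and run_data_parallel are hypothetical names for illustration.


def brighten_kernel(global_id, src, dst, gain):
    """Work done by one work-item: scale a single pixel and clamp to 8 bits."""
    dst[global_id] = min(255, int(src[global_id] * gain))


def run_data_parallel(kernel, global_size, *args):
    """Stand-in for a kernel launch.

    A GPU would execute these work-items concurrently across its SIMD lanes.
    This loop runs them one after another, which gives the same result
    because no work-item depends on the output of another.
    """
    for global_id in range(global_size):
        kernel(global_id, *args)


# One short row of 8-bit luma samples (illustrative values).
frame = [0, 64, 128, 200, 255]
out = [0] * len(frame)
run_data_parallel(brighten_kernel, len(frame), frame, out, 1.5)
print(out)  # [0, 96, 192, 255, 255]
```

Because each pixel is processed independently, this kind of workload maps naturally onto the GPU, leaving the CPU free for scalar work such as the OS, networking, and storage.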

Sidebar 1: Consolidating heterogeneity with OpenCL

An APU eliminates the need for add-on graphics cards for digital signage systems, and reduces the footprint of a traditional three-chip platform to just two chips – the APU and the companion controller hub. This two-chip architecture simplifies design complexity through a reduction in embedded board layers, enabling digital signage designers to achieve aggressive form factor goals while driving down overall system cost (Figure 2).

Figure 2: The reduced two-chip architecture of APUs plays well into the tight space requirements of small form factor COTS boards.

By providing native, high-performance graphics processing at the silicon level, APUs also preclude the need for right-edge connectors that are usually required by add-on graphics cards. In space-constrained designs, an edge connector takes up more space (card-edge boards are typically 3" to 5" taller) and exposes the board to additional shock and vibration that can lead to signal integrity issues. Designing APU-caliber graphics capabilities directly onto a carrier board is a more rugged, long-term option, ultimately yielding book-sized digital signage systems that can fit into tight spaces behind wall-mounted video displays.

Power and cooling considerations

For retailers deploying high-end digital signage systems using high-performance APU-based embedded boards, electricity costs deriving from the power consumption of the media player itself may not be the most critical consideration. Indeed, for these types of installations, the combined power draw of the displays may dwarf the power consumption of the media player itself. However, embedded board-level power consumption remains an especially important consideration with regard to system cooling, and here APUs afford several advantages.

Fan-cooled digital signage can, of course, be vulnerable to airborne particulates and debris, as well as shock and vibration – all of which are common environmental factors in high-traffic (pedestrian and/or vehicular) environments. Digital signage designers are therefore understandably wary of fan cooling mechanisms due to the inherent risk of failure. Passively cooled, ventless signage systems are the ideal end goal for these designers.

The net Performance Per Watt (PPW) gains enabled by APUs bring greater power efficiency and lower heat dissipation, which in turn can preclude the need for fan cooling within digital signage systems. This helps preserve board space while limiting system noise and lowering Bill of Materials (BOM) costs. Because the integration of the CPU and GPU on the same die eliminates the need for a PCIe or MXM graphics card, APUs equip designers to save considerable power – at least 25-35 W.

With between 128 and 384 shader units delivering a calculated 172-563 SP GFLOPs[1] of performance and a Thermal Design Power (TDP) ranging from 17-35 W (average power below 13 W), the PPW advantages yielded by AMD Embedded R-Series APUs help provide greater power efficiency and lower heat dissipation than comparably performing conventional processing platforms. Though R-Series APUs are deployed with fan cooling at the board level in most cases, continued advancements in heat pipe cooling technology are beginning to yield improved reliability in a new generation of fanless, R-Series APU-based digital signage systems.

Multi-screen, multimedia immersion

The ability to support multiple displays simultaneously from a single digital signage system is emerging as a key requirement for realizing immersive, eye-catching displays and independent multimedia content feeds. But many designers remain challenged to unlock the full promise of multi-display digital signage, and are therefore limited in their ability to transcend conventional “single screen” visual experiences in favor of panoramic, “surround sight” display configurations. Today, many digital signage systems require one controller to power each individual screen – far from ideal in terms of space conservation, as well as power and cost efficiencies.

As mentioned previously, the on-die GPU in APUs utilizes the Video Electronics Standards Association’s (VESA’s) royalty-free DisplayPort connectivity standard to interface with high-definition displays. DisplayPort is built upon a micro-packet architecture that enables the addressing and control of several displays through one DisplayPort connector, otherwise known as daisy-chaining. Where DVI and HDMI both require a dedicated clock source for each display, DisplayPort only requires a single reference clock source to drive as many DisplayPort streams as there are display pipelines in the processing platform.

Signage systems with advanced APU-enabled multi-display capabilities can therefore be optimized to present multiple feeds of dynamic video content cycling across multiple independent displays, or present a single multimedia feed distributed across a multi-panel display, both in full HD resolution. In the case of AMD R-Series-based embedded boards, up to four displays are supported from a single APU. AMD R-Series APUs also feature an option to include an additional onboard AMD Radeon GPU, which can be utilized in combination with the APU to power up to ten screens using AMD Eyefinity technology.
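
As a rough illustration of the second case, a single feed spread across a multi-panel display, the sketch below computes which region of one large canvas each panel would show. It assumes identical full HD (1920 x 1080) panels arranged in a simple grid with no bezel compensation. The names Viewport, canvas_size, and split_canvas are hypothetical; this is not the AMD Eyefinity API or any driver interface, only the underlying arithmetic.

```python
from typing import List, NamedTuple, Tuple


class Viewport(NamedTuple):
    """Region of the shared canvas shown on one panel (hypothetical type)."""
    panel: int
    x: int
    y: int
    width: int
    height: int


def canvas_size(columns: int, rows: int,
                panel_width: int = 1920, panel_height: int = 1080) -> Tuple[int, int]:
    """Total pixel size of the canvas spanning the whole panel grid."""
    return columns * panel_width, rows * panel_height


def split_canvas(columns: int, rows: int,
                 panel_width: int = 1920, panel_height: int = 1080) -> List[Viewport]:
    """Return one viewport per panel, numbered left to right, top to bottom."""
    if columns < 1 or rows < 1:
        raise ValueError("the grid needs at least one column and one row")
    return [
        Viewport(row * columns + col, col * panel_width, row * panel_height,
                 panel_width, panel_height)
        for row in range(rows)
        for col in range(columns)
    ]


# Four full HD panels side by side, matching the four-display case above.
print(canvas_size(4, 1))  # (7680, 1080)
for viewport in split_canvas(4, 1):
    print(viewport)
```

The same function covers a 2 x 2 video wall by calling split_canvas(2, 2). For independent content feeds, each panel would instead receive its own full-frame source rather than a slice of a shared canvas.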

Accelerating digital signage

With APUs at the heart of a new generation of x86 embedded boards and modules, digital signage OEMs and customers alike are afforded dramatic gains in both media performance and space savings, with additional power and cooling efficiencies and multi-display flexibility rounding out the value proposition. In this way, APUs are enabling digital signage developers to provide visually arresting multimedia capabilities that exceed those of traditional static and/or single-screen signage systems built on conventional processing platforms.

Dave Jessel is a Sr. Product Marketing Manager of the Broadbase Market Development team at AMD Embedded Solutions.


[1] Calculated SP GFLOPs = (Number of x86 cores * (128-bit (FPUs)/32-bit (SP Operation)) * CPU Base Frequency) + (Number of shader units * (64-bit (shader)/32-bit (SP Operation)) * GPU Max Frequency)
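
The footnote formula can be expressed as a short function, shown here as a sketch. The function name and the example inputs are assumptions chosen for easy arithmetic; they are not datasheet values for any specific R-Series part. Frequencies are given in GHz so that the result comes out in GFLOPs.

```python
def calculated_sp_gflops(x86_cores: int, cpu_base_ghz: float,
                         shader_units: int, gpu_max_ghz: float) -> float:
    """Apply the footnote formula for calculated single-precision GFLOPs."""
    # A 128-bit FPU handles 128 / 32 = 4 single-precision operations per clock.
    cpu_ops_per_clock = 128 // 32
    # A 64-bit shader handles 64 / 32 = 2 single-precision operations per clock.
    gpu_ops_per_clock = 64 // 32

    cpu_gflops = x86_cores * cpu_ops_per_clock * cpu_base_ghz
    gpu_gflops = shader_units * gpu_ops_per_clock * gpu_max_ghz
    return cpu_gflops + gpu_gflops


# Illustrative inputs only: 4 cores at 2.0 GHz, 384 shader units at 0.5 GHz.
# CPU term: 4 * 4 * 2.0 = 32; GPU term: 384 * 2 * 0.5 = 384.
print(calculated_sp_gflops(4, 2.0, 384, 0.5))  # 416.0
```

With these inputs the GPU term accounts for most of the total, which is consistent with the article's point that the on-die GPU carries the bulk of the parallel processing capability.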