New cooperative matrix, pipeline compilation, and memory management improvements.
Here are some of the highlights of the latest Imagination GPU Driver Development Kits (DDKs) for Linux and Android:
To help accelerate graphics post-processing, neural shaders, physics simulations, and machine learning inference on the GPU, DDK 25.2 implements support for VK_KHR_cooperative_matrix. This extension provides Vulkan developers with a standardized and widely supported way to handle cooperative matrix operations, which are particularly useful for high-performance matmul in compute shaders.
Cooperative matrix types are medium-sized matrices that are primarily supported in compute shaders, where the storage for the matrix is spread across all invocations in a given scope. These invocations cooperate to run matrix multiplies in a parallelized and optimized manner. Cooperative matrices are an ideal fit for the deeply integrated AI acceleration inside the new Imagination E-Series GPUs.
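To make the idea of storage "spread across all invocations" more concrete, here is a minimal conceptual sketch in Python. It is only a model: the round-robin ownership rule and the helper names (`owner_of`, `distribute`, `load`, `cooperative_matmul`, `gather`) are assumptions made for illustration, and real implementations choose their own opaque element layout. No single invocation holds a whole matrix, yet together the invocations produce the full product.

```python
# Conceptual model only: real GPUs pick their own (opaque) element layout.

def owner_of(row, col, cols, num_invocations):
    """Hypothetical round-robin rule: which invocation stores element (row, col)."""
    return (row * cols + col) % num_invocations


def distribute(matrix, num_invocations):
    """Spread a matrix across invocations; returns one {(row, col): value} dict each."""
    cols = len(matrix[0])
    fragments = [{} for _ in range(num_invocations)]
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            fragments[owner_of(r, c, cols, num_invocations)][(r, c)] = value
    return fragments


def load(fragments, row, col, cols):
    """Read one element from whichever invocation owns it."""
    return fragments[owner_of(row, col, cols, len(fragments))][(row, col)]


def cooperative_matmul(a_frags, b_frags, m, k, n):
    """Multiply A (m x k) by B (k x n); each invocation computes the results it owns."""
    num_invocations = len(a_frags)
    c_frags = [{} for _ in range(num_invocations)]
    for inv in range(num_invocations):  # on a GPU these iterations run in parallel
        for idx in range(inv, m * n, num_invocations):
            r, c = divmod(idx, n)
            c_frags[inv][(r, c)] = sum(
                load(a_frags, r, i, k) * load(b_frags, i, c, n) for i in range(k)
            )
    return c_frags


def gather(fragments, rows, cols):
    """Reassemble a full matrix from the per-invocation fragments."""
    out = [[0] * cols for _ in range(rows)]
    for fragment in fragments:
        for (r, c), value in fragment.items():
            out[r][c] = value
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c_frags = cooperative_matmul(distribute(a, 3), distribute(b, 3), 2, 2, 2)
    print(gather(c_frags, 2, 2))  # [[19, 22], [43, 50]]
```

In a real shader the application never sees this layout; it only declares the matrix type and scope and lets the implementation decide how elements map to invocations.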
As Vulkan has evolved, developer needs have shifted, making the original model of compiling complete pipeline-shader objects upfront increasingly impractical. Many new extensions reflect this shift, introducing more flexible, dynamic pipeline behavior. However, these changes challenge traditional architectures that rely on static pipeline-state assumptions.
To address this, we’re extensively re-architecting our Vulkan driver’s shader and pipeline compilation process. This overhaul introduces a modular compilation framework, streamlined development workflows, and improved management of pipeline variants. Enhancements in caching and hashing reduce runtime hitching, while new capabilities for late-stage shader linking and re-compilation provide a foundation for future innovations to support modern Vulkan development and evolving application demands.
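As a rough illustration of why caching and hashing help with hitching, the sketch below keys compiled pipeline variants by a hash of the shader code and the state that affects code generation. It is a generic technique and not a description of the driver's internals: `pipeline_key`, `PipelineCache`, and the state fields used in the usage note are hypothetical names chosen for the example.

```python
import hashlib
import json


def pipeline_key(shader_bytes, state):
    """Stable hash over shader code plus the state that affects code generation."""
    h = hashlib.sha256()
    # Length prefix keeps the shader bytes and the state from running together.
    h.update(len(shader_bytes).to_bytes(8, "little"))
    h.update(shader_bytes)
    # sort_keys makes the hash independent of dict insertion order.
    h.update(json.dumps(state, sort_keys=True).encode("utf-8"))
    return h.hexdigest()


class PipelineCache:
    """Compile each distinct (shader, state) variant once and reuse it afterwards."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # the expensive step we want to avoid repeating
        self._entries = {}
        self.misses = 0

    def get(self, shader_bytes, state):
        key = pipeline_key(shader_bytes, state)
        if key not in self._entries:
            self.misses += 1  # only a miss pays the compile cost
            self._entries[key] = self._compile(shader_bytes, state)
        return self._entries[key]
```

With this shape, a call such as `cache.get(spirv, {"blend": True, "samples": 4})` compiles on first use and returns the stored variant on every later request with the same inputs, so the compile cost is not paid again in the middle of a frame.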
In OpenCL, developers can now use pinned memory via the CL_MEM_USE_HOST_PTR and CL_MEM_ALLOC_HOST_PTR buffer flags. This is an effective tool for bandwidth-intensive applications, repetitive transfers, and segments of an application that are limited by memory transfers.
We’ve also introduced the ability to import externally allocated memory, such as buffers shared via DMA-BUF file descriptors, into OpenCL as Unified Shared Virtual Memory (SVM). This enables seamless, zero-copy data sharing between OpenCL and other components (e.g., video decoders, camera drivers, or other compute APIs) that support DMA-BUF.
By mapping external memory into the OpenCL SVM space, applications can:
Virtualization is a popular feature, particularly in the automotive market, where the GPU in a multi-domain controller supports the operating systems for in-vehicle infotainment, the cockpit, and ADAS all at once. Customers that employ GPU virtualization can now take advantage of GPU dynamic voltage and frequency scaling (DVFS) on their platforms.