GPU Microarchitecture Integrating Dedicated Matrix Units At The Cluster Level (UC Berkeley)


A new technical paper titled "Virgo: Cluster-level Matrix Unit Integration in GPUs for Scalability and Energy Efficiency" was published by UC Berkeley. Abstract "Modern GPUs incorporate specialized matrix units such as Tensor Cores to accelerate GEMM operations central to deep learning workloads. However, existing matrix unit designs are tightly coupled to the SIMT core, limiting the size a... » read more