A new research paper titled “Exocompilation for productive programming of hardware accelerators” was published by researchers at MIT and UC Berkeley.
From their abstract:
“To better support development of high-performance libraries for specialized hardware, we propose a new programming language, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and optimization policies to user-level code. Exo allows custom hardware instructions, specialized memories, and accelerator configuration state to be defined in user libraries. It builds on the idea of user scheduling to externalize hardware mapping and optimization decisions. Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations. We show that Exo enables rapid development of state-of-the-art matrix-matrix multiply and convolutional neural network kernels, for both an embedded neural accelerator and x86 with AVX-512 extensions, in a few dozen lines of code each.”
The open access technical paper was published in June 2022. MIT has also published a news article about the work.
PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022, Pages 703–718. https://doi.org/10.1145/3519939.3523446