The Journey To Exascale Computing And Beyond

High performance computing witnessed one of its most ambitious leaps forward with the development of the US supercomputer “Frontier.” As Scott Atchley from Oak Ridge National Laboratory discussed at Supercomputing 23 (SC23) in Denver last month, Frontier had the ambitious goal of achieving performance levels 1,000 times higher than the petascale systems that preceded it, while also stayi... » read more

Designing for Data Flow

Movement and management of data inside and outside of chips is becoming a central theme for a growing number of electronic systems, and a huge challenge for all of them. Entirely new architectures and techniques are being developed to reduce the movement of data and to accomplish more per compute cycle, and to speed the transfer of data between various components on a chip and between chips ... » read more

Dealing With Heat In Near-Memory Compute Architectures

The explosion in data is forcing chipmakers to get much more granular about where logic and memory are placed on a die, how data is partitioned and prioritized to utilize those resources, and what the thermal impact will be if they are moved closer together on a die or in a package. For more than a decade, the industry has faced a basic problem — moving data can be more resource-intensive tha... » read more
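The cost gap between computing on data and moving it can be made concrete with a rough energy model. The per-operation figures below are commonly cited order-of-magnitude estimates (roughly following Mark Horowitz's ISSCC 2014 keynote numbers), not measurements of any particular chip, and the function name is illustrative:

```python
# Back-of-envelope energy model: compute vs. off-chip data movement.
# Figures are rough, commonly cited order-of-magnitude estimates
# (assumptions for illustration, not measurements of a specific chip).
FLOP_PJ = 1.0           # ~1 pJ for a 32-bit floating-point operation
DRAM_ACCESS_PJ = 640.0  # ~640 pJ to fetch one 32-bit word from off-chip DRAM

def energy_pj(n_flops, n_dram_words):
    """Total energy (picojoules) for a kernel performing n_flops
    operations while fetching n_dram_words 32-bit words from DRAM."""
    return n_flops * FLOP_PJ + n_dram_words * DRAM_ACCESS_PJ

# A single DRAM word fetch costs as much as hundreds of arithmetic ops,
# which is why near-memory compute tries to eliminate those fetches.
print(energy_pj(100, 1))  # 100 ops + 1 DRAM fetch
```

Under these assumptions, a kernel that performs 100 operations but needs one DRAM fetch spends the large majority of its energy on the fetch, which is the basic motivation for moving compute closer to memory.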

Algorithm HW Framework That Minimizes Accuracy Degradation, Data Movement, And Energy Consumption Of DNN Accelerators (Georgia Tech)

This new research paper titled "An Algorithm-Hardware Co-design Framework to Overcome Imperfections of Mixed-signal DNN Accelerators" was published by researchers at Georgia Tech. According to the paper's abstract, "In recent years, processing in memory (PIM) based mixed-signal designs have been proposed as energy- and area-efficient solutions with ultra high throughput to accelerate DNN com... » read more

Moving Intelligence To The Edge

The buildout of the edge is driving a slew of new challenges and opportunities across the chip industry. Sailesh Chittipeddi, executive vice president at Renesas Electronics America, talks about the shift toward more AI-centric rather than CPU-centric workloads, why embedded computing is becoming the foundation of all intelligence, and the importance of software, security, and user experience ... » read more

Seven Hardware Advances We Need to Enable The AI Revolution

The potential positive impact AI will have on society at large is impossible to overestimate. Pervasive AI, however, remains a challenge. Training algorithms can take inordinate amounts of power, time, and computing capacity. Inference will also become more taxing with applications such as medical imaging and robotics. Applied Materials estimates that AI could consume up to 25% of global elect... » read more

1.6 Tb/s Ethernet Challenges

Moving data at blazing fast speeds sounds good in theory, but it raises a number of design challenges. John Swanson, senior product marketing manager for high-performance computing digital IP at Synopsys, talks about the impact of next-generation Ethernet on switches, the types of data that need to be considered, the causes of data growth, and the size and structure of data centers, both in the... » read more

Optimization Driving Changes In Microarchitectures

The semiconductor ecosystem is at a turning point for how to best architect the CPU based on the explosion of data, the increased usage of AI, and the need for differentiation and customization in leading-edge applications. In the past, much of this would have been accomplished by moving to the next process node. But with the benefits from scaling diminishing at each new node, the focus is s... » read more

SpZip: Architectural Support for Effective Data Compression In Irregular Applications

Technical paper link is here. Published in: 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA) Yifan Yang (MIT); Joel Emer (MIT / NVIDIA); Daniel Sanchez (MIT) Abstract: "Irregular applications, such as graph analytics and sparse linear algebra, exhibit frequent indirect, data-dependent accesses to single or short sequences of elements that cause high ma... » read more
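As a sketch of the access pattern the abstract describes: in sparse linear algebra kernels such as CSR sparse matrix-vector multiply, loads of the dense vector are driven by a column-index array, so each access is indirect and data-dependent, which is what makes these workloads memory-bound and hard to prefetch. The code below is an illustrative rendering (names are mine, not from the paper):

```python
# Illustrative example of indirect, data-dependent accesses in an
# irregular kernel: sparse matrix-vector multiply in CSR format.
# Names are hypothetical, not taken from the SpZip paper.
def csr_spmv(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a CSR matrix A.

    row_ptr[i]..row_ptr[i+1] bounds row i's nonzeros; col_idx drives
    indirect loads of x, so the memory addresses touched depend on the
    data itself — the pattern SpZip-style hardware aims to compress.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]  # indirect access via col_idx
    return y

# A = [[1, 0, 2],
#      [0, 3, 0]] in CSR form:
print(csr_spmv([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
```

Because `col_idx` can point anywhere in `x`, these loads defeat conventional caching and prefetching, which is why the paper pursues architectural support for compressing the traversed data structures.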

Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers

Harini Muthukrishnan (U of Michigan); David Nellans, Daniel Lustig (NVIDIA); Jeffrey A. Fessler, Thomas Wenisch (U of Michigan). Abstract—"Despite continuing research into inter-GPU communication mechanisms, extracting performance from multi-GPU systems remains a significant challenge. Inter-GPU communication via bulk DMA-based transfers exposes data transfer latency on the GPU’s critical... » read more
