Novel NorthPole Architecture Enables Low-Latency, High-Energy-Efficiency LLM Inference (IBM Research)


A new technical paper titled "Breakthrough low-latency, high-energy-efficiency LLM inference performance using NorthPole" was published by researchers at IBM Research. At the IEEE High Performance Extreme Computing (HPEC) Virtual Conference in September 2024, new performance results for their AIU NorthPole AI inference accelerator chip were presented on a 3-billion-parameter Granite LLM. ...

Low-Latency, Energy-Efficient HW Accelerator Architecture For MI Computation (MIT)


A new technical paper titled "Efficient Computation of Map-scale Continuous Mutual Information on Chip in Real Time" was published by researchers at MIT. "In this paper, we introduce a new hardware accelerator architecture for MI computation that features a low-latency, energy-efficient MI compute core and an optimized memory subsystem that provides sufficie...

On The Cusp Of 5G


Carriers and chipmakers are celebrating the rollout of the first standards-compliant commercial 5G services. "We are officially in the era of 5G," said John Smee, vice president of engineering at Qualcomm, at the recent 5G Summit at IEEE's International Microwave Symposium (IMS) in Boston. Movement is happening on the commercial end. Major U.S. carriers Verizon, AT&T and Sprint have set ...

Low-Latency Image Acquisition And Processing With A Programmable Vision-System-On-Chip


This work aims to demonstrate the benefits of using a Vision-System-on-Chip for image processing tasks with very stringent latency demands between image acquisition and processing. By leveraging a column-parallel, mixed-signal data path, which is entirely software-defined by three application-specific instruction-set processors (ASIPs), image data within multiple regions of interest can be analyzed...

Implementation Of An Asynchronous Bundled-Data Router For A GALS NoC In The Context Of A VSoC


Designs of asynchronous networks-on-chip are of growing interest because a complete asynchronous implementation can solve the synchronization problems of large networks. However, asynchronous circuits suffer from the lack of proper design flows because their functionality often relies on timing constraints, which are not extensively supported by common CAD synthesis tools. This paper proposes...

The Rambus GDDR6 PHY IP Core


The JEDEC-compliant Rambus GDDR6 PHY IP Core is optimized for systems that require low-latency and high-bandwidth GDDR6 memory solutions. Available on leading FinFET process nodes, the PHY interface supports two independent channels, with each supporting 16 bits for a total data width of 32 bits. In addition, the PHY supports speeds up to 16Gbps per pin, providing a maximum bandwidth of up to 6...