Novel NorthPole Architecture Enables Low-Latency, High-Energy-Efficiency LLM Inference (IBM Research)


A new technical paper titled "Breakthrough low-latency, high-energy-efficiency LLM inference performance using NorthPole" was published by researchers at IBM Research. At the IEEE High Performance Extreme Computing (HPEC) Virtual Conference in September 2024, new performance results for their AIU NorthPole AI inference accelerator chip were presented on a 3-billion-parameter Granite LLM. ... » read more

Using AI To Glue Disparate IC Ecosystem Data


AI holds the potential to change how companies interact throughout the global semiconductor ecosystem, gluing together different data types and processes so they can be shared between companies that in the past had little or no direct connection. Chipmakers have always used abstraction layers to see the bigger picture of how the various components of a chip go together, allowing them to pinpoi... » read more

Scalable Chiplet System for LLM Training, Finetuning and Reduced DRAM Accesses (Tsinghua University)


A new technical paper titled "Hecaton: Training and Finetuning Large Language Models with Scalable Chiplet Systems" was published by researchers at Tsinghua University. Abstract "Large Language Models (LLMs) have achieved remarkable success in various fields, but their training and finetuning require massive computation and memory, necessitating parallelism which introduces heavy communicat... » read more

LLMs In The High-Level Synthesis Design Flow


A new technical paper titled "Are LLMs Any Good for High-Level Synthesis?" was published by researchers at University of Arizona. Abstract "The increasing complexity and demand for faster, energy-efficient hardware designs necessitate innovative High-Level Synthesis (HLS) methodologies. This paper explores the potential of Large Language Models (LLMs) to streamline or replace the HLS proces... » read more

Prevent AI Hardware Obsolescence And Optimize Efficiency With eFPGA Adaptability


Large Language Models (LLMs) and Generative AI are driving up memory requirements, presenting a significant challenge. Modern LLMs can have billions of parameters, demanding many gigabytes of memory. To address this issue, AI architects have devised clever solutions that dramatically reduce memory needs. Evolving techniques like lossless weight compression, structured sparsity, and new numer... » read more
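As one concrete illustration of the kind of technique being described, here is a minimal sketch (illustrative only, not tied to any particular eFPGA or vendor flow) of 2:4 structured sparsity, which keeps the two largest-magnitude weights in every group of four and halves the number of stored values:

```python
# Minimal sketch of 2:4 structured sparsity (illustrative only): keep the two
# largest-magnitude weights in every group of four, zeroing the rest, which
# halves the number of values that must be stored or fetched.
import numpy as np

def prune_2_to_4(weights: np.ndarray) -> np.ndarray:
    """Zero the two smallest-magnitude entries in each consecutive group of four."""
    grouped = weights.copy().reshape(-1, 4)
    drop = np.argsort(np.abs(grouped), axis=1)[:, :2]   # two smallest per group
    np.put_along_axis(grouped, drop, 0.0, axis=1)
    return grouped.reshape(weights.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)      # toy weight matrix
w_sparse = prune_2_to_4(w)
print("nonzeros:", np.count_nonzero(w), "->", np.count_nonzero(w_sparse))
```

Because techniques like this keep evolving, the hardware support they need keeps shifting, which is the obsolescence argument for eFPGA adaptability.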

Capturing Knowledge Within LLMs


At DAC this year, there was a lot of talk about AI and the impact it is likely to have. While EDA companies have been using it for optimization and improving iteration loops within the flow, the end users have been concentrating on how to use it to improve the user interface between engineers and tools. The feedback is very positive. Large language models (LLMs) have been trained on a huge a... » read more

224Gbps PHY For The Next Generation Of High Performance Computing


Large language models (LLMs) are experiencing explosive growth in parameter count. Training these ever-larger models requires multiple accelerators to work together, and the bandwidth between these accelerators directly limits the size of trainable LLMs in High Performance Computing (HPC) environments. The correlation between LLM size and the data rates of interconnect technology heralds a... » read more
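To make that correlation concrete, a back-of-the-envelope sketch (the model size, lane count, and protocol efficiency below are illustrative assumptions, not vendor figures) of how long a full weight exchange takes over a multi-lane 224 Gbps link:

```python
# Back-of-the-envelope sketch (illustrative assumptions, not vendor data): time
# to move one full copy of an LLM's weights over a multi-lane link, the kind of
# arithmetic that ties trainable model size to interconnect data rate.

def transfer_seconds(params: float, bytes_per_param: float,
                     lane_gbps: float, lanes: int, efficiency: float = 0.85) -> float:
    """Seconds to move params * bytes_per_param bytes over `lanes` lanes at lane_gbps each."""
    total_bits = params * bytes_per_param * 8
    usable_bps = lane_gbps * 1e9 * lanes * efficiency   # assumed protocol/coding efficiency
    return total_bits / usable_bps

# Assumed example: a 70B-parameter model in BF16 over 8 lanes of 224 Gbps SerDes.
print(f"{transfer_seconds(70e9, 2, 224, 8):.2f} s per full weight exchange")
```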

PCIe 7.0: Speed, Flexibility & Efficiency For The AI Era


As the industry came together for PCI-SIG DevCon last month, one thing took center stage: PCI Express 7.0. While the specification is still in the final stages of development, the world is certainly ready for this significant new milestone. Let’s look at how PCIe 7.0 is poised to address the escalating demands of AI, high-performance computing, and emerging data-intensive ap... » read more

Lower Energy, High Performance LLM on FPGA Without Matrix Multiplication


A new technical paper titled "Scalable MatMul-free Language Modeling" was published by UC Santa Cruz, Soochow University, UC Davis, and LuxiTech. Abstract "Matrix multiplication (MatMul) typically dominates the overall computational cost of large language models (LLMs). This cost only grows as LLMs scale to larger embedding dimensions and context lengths. In this work, we show that MatMul... » read more

An LLM Approach For Large-Scale SoC Security Verification And Policy Generation (U. of Florida)


A technical paper titled “SoCureLLM: An LLM-driven Approach for Large-Scale System-on-Chip Security Verification and Policy Generation” was published by researchers at the University of Florida. Abstract: "Contemporary methods for hardware security verification struggle with adaptability, scalability, and availability due to the increasing complexity of the modern system-on-chips (SoCs). ... » read more
