Small Language Models: A Solution To Language Model Deployment At The Edge?


While Large Language Models (LLMs) like GPT-3 and GPT-4 have quickly become synonymous with AI, LLM mass deployments in both training and inference applications have, to date, been predominantly cloud-based. This is primarily due to the sheer size of the models; the resulting processing and memory requirements often overwhelm the capabilities of edge-based systems. While the efficiency of Exped...

New AI Data Types Emerge


AI is all about data, and how that data is represented matters a great deal. But after focusing primarily on 8-bit integers and 32-bit floating-point numbers, the industry is now looking at new formats. There is no single best type for every situation, because the choice depends on the type of AI model, whether accuracy, performance, or power is prioritized, and where the computing happens, ...

Yield Management Embraces Expanding Role


Competitive pressures, shrinking time-to-market windows, and increased customization are collectively changing the dynamics and demands for yield management systems, shifting left from the fab to the design flow and right to assembly, packaging, and in-field analysis. The basic role of yield management systems is still expediting new product introductions, reducing scrap, and delivering grea...

Benchmark and Evaluation Framework For Characterizing LLM Performance In Formal Verification (UC Berkeley, Nvidia)


A new technical paper titled "FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware" was published by researchers at UC Berkeley and NVIDIA. Abstract: "The remarkable reasoning and code generation capabilities of large language models (LLMs) have spurred significant interest in applying LLMs to enable task automation in digital chip design. In particula...

LLMs Show Promise In Secure IC Design


The introduction of large language models into the EDA flow could significantly reduce the time, effort, and cost of designing secure chips and systems, but it also could open the door to more sophisticated attacks. It's still early days for the use of LLMs in chip and system design. The technology is just beginning to be implemented, and there are numerous technical challenges that must b...

Mass Customization For AI Inference


Rising complexity in AI models and an explosion in the number and variety of networks are leaving chipmakers torn between fixed-function acceleration and more programmable accelerators, and creating some novel approaches that include some of both. By all accounts, a general-purpose approach to AI processing is not making the grade. General-purpose processors are exactly that. They're not des...

Survey: HW SW Co-Design Approaches Tailored to LLMs


A new technical paper titled "A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models" was published by researchers at Duke University and Johns Hopkins University. Abstract: "The rapid development of large language models (LLMs) has significantly transformed the field of artificial intelligence, demonstrating remarkable capabilities in natural language proce...

Novel NorthPole Architecture Enables Low-Latency, High-Energy-Efficiency LLM Inference (IBM Research)


A new technical paper titled "Breakthrough low-latency, high-energy-efficiency LLM inference performance using NorthPole" was published by researchers at IBM Research. At the IEEE High Performance Extreme Computing (HPEC) Virtual Conference in September 2024, the researchers presented new performance results for their AIU NorthPole AI inference accelerator chip on a 3-billion-parameter Granite LLM. ...

Using AI To Glue Disparate IC Ecosystem Data


AI holds the potential to change how companies interact throughout the global semiconductor ecosystem, gluing together different data types and processes that can be shared between companies that in the past had little or no direct connection. Chipmakers have always used abstraction layers to see the bigger picture of how the various components of a chip fit together, allowing them to pinpoi...

Scalable Chiplet System for LLM Training, Finetuning and Reduced DRAM Accesses (Tsinghua University)


A new technical paper titled "Hecaton: Training and Finetuning Large Language Models with Scalable Chiplet Systems" was published by researchers at Tsinghua University. Abstract: "Large Language Models (LLMs) have achieved remarkable success in various fields, but their training and finetuning require massive computation and memory, necessitating parallelism which introduces heavy communicat...
