NPU Acceleration For Multimodal LLMs


Transformer-based models have rapidly spread from text to speech, vision, and other modalities. This has created challenges for the development of Neural Processing Units (NPUs). NPUs must now efficiently support the computation of weights and propagation of activations through a series of attention blocks. Increasingly, NPUs must be able to process models with multiple input modalities with ac... » read more

Is In-Memory Compute Still Alive?


In-memory computing (IMC) has had a rough go, with the most visible attempt at commercialization falling short. And while some companies have pivoted to digital and others have outright abandoned the technology, developers are still trying to make analog IMC a success. There is disagreement regarding the benefits of IMC (also called compute-in-memory, or CIM). Some say it’s all about reduc... » read more

Chip Companies Play Bigger Role In Shaping University Curricula


A shortage of senior engineers with the necessary skills and experience is forcing companies to hire and train fresh graduates, a more time-consuming process but one that allows them to rise through the ranks using the companies' preferred technology and systems. Universities and companies share the goal of helping a graduate become productive in the workplace as quickly as possible, and the... » read more

Small Language Models: A Solution To Language Model Deployment At The Edge?


While Large Language Models (LLMs) like GPT-3 and GPT-4 have quickly become synonymous with AI, LLM mass deployments in both training and inference applications have, to date, been predominately cloud-based. This is primarily due to the sheer size of the models; the resulting processing and memory requirements often overwhelm the capabilities of edge-based systems. While the efficiency of Exped... » read more

Asia Government Funding Surges


Billions of dollars have been pouring into Asian countries for the past few years in an effort to boost their production capacity, explore leading-edge technology, compete on the global stage, and shore up supply chains in the face of geopolitical turmoil. Each country has its own plan to maintain a foothold in the global market, from China’s Big Fund to Korea’s Yongin Cluster and Japan�... » read more

Managing The Huge Power Demands Of AI Everywhere


Before generative AI burst onto the scene, no one predicted how much energy would be needed to power AI systems. Those numbers are just starting to come into focus, and so is the urgency about how to sustain it all. AI power demand is expected to surge 550% by 2026, from 8 TWh in 2024 to 52 TWh, before rising another 1,150% to 652 TWh by 2030. Commensurately, U.S. power grid planners have do... » read more

New AI Data Types Emerge


AI is all about data, and the representation of the data matters strongly. But after focusing primarily on 8-bit integers and 32‑bit floating-point numbers, the industry is now looking at new formats. There is no single best type for every situation, because the choice depends on the type of AI model, whether accuracy, performance, or power is prioritized, and where the computing happens, ... » read more

HW and SW Architecture Approaches For Running AI Models


How best to run AI inference models is a current topic of much debate as a wide breadth of systems companies look to add AI to a variety of systems, spurring both hardware innovation and the need to revamp models. Hardware developers are making progress with AI accelerators and SoCs. But on the model side, questions abound about whether the answer might come from revisiting older, less compl... » read more

Mass Customization For AI Inference


Rising complexity in AI models and an explosion in the number and variety of networks is leaving chipmakers torn between fixed-function acceleration and more programmable accelerators, and creating some novel approaches that include some of both. By all accounts, a general-purpose approach to AI processing is not meeting the grade. General-purpose processors are exactly that. They're not des... » read more

Using AI To Glue Disparate IC Ecosystem Data


AI holds the potential to change how companies interact throughout the global semiconductor ecosystem, gluing together different data types and processes that can be shared between companies that in the past had little or no direct connections. Chipmakers always have used abstraction layers to see the bigger picture of how the various components of a chip go together, allowing them to pinpoi... » read more

← Older posts