Outlier-aware Quantization Framework Co-designed With Heterogeneous NVM For SLM Deployment on Edge Platforms (UCSD et al.)


A new technical paper titled "QMC: Efficient SLM Edge Inference via Outlier-Aware Quantization and Emergent Memories Co-Design" was published by researchers at the University of California San Diego and San Diego State University. Abstract: "Deploying Small Language Models (SLMs) on edge platforms is critical for real-time, privacy-sensitive generative AI, yet constrained by memory, ...

AI Plays Multiple Roles Within EDA


AI's infusion into our world may seem sudden and unexpected, but EDA has been quietly adopting it for more than a decade. What's changed is that it's now becoming more visible, thanks to increasingly powerful large language models (LLMs) and the need to apply them to increasingly challenging multi-physics problems. Two fundamental shifts underlie AI's increasing prominence. First, heat is be...

Small Vs. Large Language Models


The proliferation of edge AI will require fundamental changes in language models and chip architectures to make inferencing and learning outside of AI data centers a viable option. The initial goal for small language models (SLMs), which have roughly 10 billion parameters or fewer compared with more than a trillion parameters in the biggest LLMs, was to leverage them exclusively for inferencing. In...

The Impact of Generative AI on the Edge for the Semiconductor Industry


In the first of a three-part series, Expedera, in conjunction with the Global Semiconductor Alliance’s Emerging Technologies (EmTech) group, explores “The Impact of Generative AI on the Edge for the Semiconductor Industry.” In this white paper, the working group explores the evolution of Generative AI (GenAI), and how the rapidly evolving semiconductor industry can enable GenAI innovation...

Small Language Models: A Solution To Language Model Deployment At The Edge?


While Large Language Models (LLMs) like GPT-3 and GPT-4 have quickly become synonymous with AI, mass deployments of LLMs in both training and inference applications have, to date, been predominantly cloud-based. This is primarily due to the sheer size of the models; the resulting processing and memory requirements often overwhelm the capabilities of edge-based systems. While the efficiency of Exped...