Outlier-aware Quantization Framework Co-designed With Heterogeneous NVM For SLM Deployment on Edge Platforms (UCSD et al.)


  A new technical paper titled "QMC: Efficient SLM Edge Inference via Outlier-Aware Quantization and Emergent Memories Co-Design" was published by researchers at University of California San Diego and San Diego State University. Abstract "Deploying Small Language Models (SLMs) on edge platforms is critical for real-time, privacy-sensitive generative AI, yet constrained by memory, ... » read more

Small Language Models Create New Security Risks


The rollout of edge AI is creating new security risks due to a mix of small language models (SLMs), their integration into increasingly complex hardware, and the behavior and interactions of both over time. AI data centers still garner the most attention due to massive investments and an ongoing flood of deals and acquisitions, but the edge is quietly starting to take shape for several reaso... » read more

Small Vs. Large Language Models


The proliferation of edge AI will require fundamental changes in language models and chip architectures to make inferencing and learning outside of AI data centers a viable option. The initial goal for small language models (SLMs) — roughly 10 billion parameters or less, compared to more than a trillion parameters in the biggest LLMs — was to leverage them exclusively for inferencing. In... » read more

Physical Access Control Raises New Security Concerns


Experts At The Table: Semiconductor Engineering sat down to discuss hardware security challenges, including fundamental security of GenAI, with Nicole Fern, principal security analyst at Keysight; Serge Leef, AI-For-Silicon strategist at Microsoft; Scott Best, senior director for silicon security products at Rambus; Lee Harrison, director of Tessent Automotive IC Solutions at Siemens EDA; Mohit... » read more

Small Language Models: A Solution To Language Model Deployment At The Edge?


While Large Language Models (LLMs) like GPT-3 and GPT-4 have quickly become synonymous with AI, LLM mass deployments in both training and inference applications have, to date, been predominately cloud-based. This is primarily due to the sheer size of the models; the resulting processing and memory requirements often overwhelm the capabilities of edge-based systems. While the efficiency of Exped... » read more