Intelligence Per Watt: Measuring Local Inference Viability, Studying 20+ Models, 8 HW Accelerators (Stanford Univ.)


A new technical paper titled "Intelligence per Watt: Measuring Intelligence Efficiency of Local AI" was published by researchers at Stanford University and Together AI. Abstract: "Large language model (LLM) queries are predominantly processed by frontier models in centralized cloud infrastructure. Rapidly growing demand strains this paradigm, and cloud providers struggle to scale infrastruc...

Edge AI Is Starting To Transform Industrial IoT


A slew of wireless and increasingly multi-modal sensors is being targeted at the Industrial Internet of Things (IIoT), setting the stage for significant improvements in efficiency, higher yield, and reduced downtime. Wired IIoT devices, such as smart energy meters and breakers, industrial network gateways, and environmental sensors, already are well established in factory settings. They have ...

LLMs Add Safety Risks To Physical AI


Humanoid robots with artificial general intelligence are some years from entering our daily life, but application-specific robots are already here. From Amazon’s fleet of fulfillment center robots to robotic surgical systems in operating rooms, search and rescue robo-dogs, autonomous drones, and last-mile delivery robots, all the way down to the humble Roomba vacuum cleaner, physical AI sys...

Moving AI Workloads To The Edge


Experts At The Table: Semiconductor Engineering gathered a group of experts to discuss how some AI workloads are better suited for on-device processing to achieve consistent performance, avoid network connectivity issues, reduce cloud computing costs, and ensure privacy. The panel included Frank Ferro, group director in the Silicon Solutions Group at Cadence; Eduardo Montanez, vice president an...

Co-Optimizing GPU Architecture And SW To Enhance Edge Inference Performance (NVIDIA)


A new technical paper titled "EdgeReasoning: Characterizing Reasoning LLM Deployment on Edge GPUs" was published by researchers at NVIDIA. Abstract: "Edge intelligence paradigm is increasingly demanded by the emerging autonomous systems, such as robotics. Beyond ensuring privacy-preserving operation and resilience in connectivity-limited environments, edge deployment offers significant energ...

Formal Verification’s Value Grows


Experts At The Table: Semiconductor Engineering sat down to discuss why formal verification is becoming more important, with Ashish Darbari, CEO for Axiomise; Jin Zhang, product management group director for the Verification Group at Cadence; Sean Safarpour, executive director for R&D at Synopsys; and Jeremy Levitt, principal engineer for Digital Verification Technology at Siemens EDA. Wha...

Small Vs. Large Language Models


The proliferation of edge AI will require fundamental changes in language models and chip architectures to make inferencing and learning outside of AI data centers a viable option. The initial goal for small language models (SLMs), roughly 10 billion parameters or fewer compared to more than a trillion parameters in the biggest LLMs, was to leverage them exclusively for inferencing. In...

Multimodal LLM Assistant for Chip Physical Design (National Taiwan Univ., UCLA, NVIDIA)


A new technical paper titled "Multimodal Chip Physical Design Engineer Assistant" was published by researchers at National Taiwan University, University of California, Los Angeles, and NVIDIA Research. Abstract: "Modern chip physical design relies heavily on Electronic Design Automation (EDA) tools, which often struggle to provide interpretable feedback or actionable guidance for improving ro...

Unlocking Clarity: Keyphrase Trees Bring Structure To AI Text Analysis


By Amr Hegazy, Mohamed Abdelkarim, and Reem El Adawi

In the vast digital landscape of information, from intricate design specifications to extensive patent literature and complex verification reports, extracting meaningful insights often feels like searching for a needle in a haystack. This challenge is particularly acute in the semiconductor industry, where critical details are buried with...

GDDR7 Tackles Massive-Context AI Inference


The AI hardware landscape is evolving at breakneck speed, and memory technology is at the heart of this transformation. NVIDIA’s recent announcement of Rubin CPX, a new class of GPU purpose-built for massive-context inference, underscores this trend. Rubin CPX is designed to tackle workloads that require reasoning across millions of tokens. Use cases include long-form generative video, comple...
