Next Generation AI: Transitioning Inference from the Cloud to the Edge


Deploying AI inference at the edge (on smartphones, appliances, industrial devices, and vehicles) promises faster, more private, and more energy-efficient intelligence. Expedera’s packet-based NPU architecture delivers up to 90% utilization and dramatic reductions in memory movement compared to conventional approaches, enabling next-generation real-time AI capabilities. This white paper examines tech...

Edge AI Safety: Agentic AI Architecture That Leverages 3D To Integrate A Dedicated Safety Layer (Princeton, HKUST, NC State Univ.)


A new technical paper titled "3D Guard-Layer: An Integrated Agentic AI Safety System for Edge Artificial Intelligence" was published by researchers at Princeton University, Hong Kong University of Science and Technology, and North Carolina State University. Abstract "AI systems have found a wide range of real-world applications in recent years. The adoption of edge artificial intelligence, ...

Machine Intelligence on Wireless Edge Networks with RF Analog Architecture (MIT, Duke)


A new technical paper titled "Machine Intelligence on Wireless Edge Networks" was published by researchers at MIT and Duke University. Abstract "Deep neural network (DNN) inference on power-constrained edge devices is bottlenecked by costly weight storage and data movement. We introduce MIWEN, a radio-frequency (RF) analog architecture that 'disaggregates' memory by streaming weights wirele...

Inference Framework For Deployment Challenges of Large Generative Models On GPUs (Google)


A new technical paper titled "Scaling On-Device GPU Inference for Large Generative Models" was published by researchers at Google and Meta Platforms. Abstract "Driven by the advancements in generative AI, large machine learning models have revolutionized domains such as image processing, audio synthesis, and speech recognition. While server-based deployments remain the locus of peak perform...

Chip Industry Week In Review


Europe's top court ruled in Intel's favor, voiding a $1.1 billion fine imposed by the European Union and dismissing charges of anti-competitive behavior. IBM released yield benchmarks for high-NA EUV, which serve as proof points that the newest advanced litho equipment will enable scaling beyond the 2nm process node. Also on the lithography front, Nikon is developing a maskless digital litho...

Edge Devices Require New Security Approaches


The diversity of connected devices and chips at the edge — the vaguely defined middle ground between the end point and the cloud — is significantly widening the potential attack surface and creating more opportunities for cyberattacks. The edge build-out has been underway for at least the past half-decade, largely driven by an explosion in data and increasing demands to process that data...

Specialization Vs. Generalization In Processors


Academia has been looking at specialization for many years, but solutions were rejected because general-purpose solutions were advancing fast enough to keep up with most application requirements. That is no longer the case. The introduction and support of the RISC-V processor architecture has attracted a lot of attention, but whether that is the right direction for the majority of modern comput...

Leveraging Large Language Models (LLMs) To Perform SW-HW Co-Design


A technical paper titled “On the Viability of using LLMs for SW/HW Co-Design: An Example in Designing CiM DNN Accelerators” was published by researchers at the University of Notre Dame. Abstract: "Deep Neural Networks (DNNs) have demonstrated impressive performance across a wide range of tasks. However, deploying DNNs on edge devices poses significant challenges due to stringent power and com...

Review of Tools & Techniques for DL Edge Inference


A new technical paper titled "Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review" was published in "Proceedings of the IEEE" by researchers at the University of Missouri and Texas Tech University. Abstract: "Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying thes...

Using Silicon Photonics To Reduce Latency On Edge Devices


A new technical paper titled "Delocalized photonic deep learning on the internet’s edge" was published by researchers at MIT and Nokia Corporation. “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-leng...
