Building Fixed HW Implementations of Neural Networks (Yale, Cornell et al.)

By Technical Paper Link - 29 May, 2026 - Comments: 0

Researchers from Yale University, Cornell University, Boston University, and NTT Research have published “Physical Foundation Models: Fixed hardware implementations of large-scale neural networks.” Abstract: "Foundation models are deep neural networks (such as GPT-5, Gemini 3, and Opus 4) trained on large datasets that can perform diverse downstream tasks -- text and code generation, q...

GDDR7 Momentum Accelerates As A Key Solution For AI Inference

By Nidish Kamath - 15 Jan, 2026 - Comments: 0

The AI hardware landscape continues to evolve at breakneck speed, and memory technology is rapidly becoming a defining differentiator for the next generation of GPUs and AI inference accelerators. When NVIDIA introduced Rubin CPX, its new class of GPU tailored for massive-context inference, it underscored a new industry reality: memory throughput and efficiency are now just as critical as ra...

AI Workloads at the Edge: Ensuring Performance, Privacy, and Security

By Ann Mutschler - 17 Dec, 2025 - Comments: 0

Experts At The Table: Semiconductor Engineering gathered a group of experts to discuss why some AI workloads are better suited for on-device processing to achieve consistent performance, avoid network connectivity issues, reduce cloud computing costs, and ensure privacy. The panel included Frank Ferro, group director in the Silicon Solutions Group at Cadence; Eduardo Montanez, vice president a...

Next Generation AI: Transitioning Inference from the Cloud to the Edge

By Expedera - 10 Dec, 2025 - Comments: 0

Deploying AI inference at the edge (on smartphones, appliances, industrial devices, and vehicles) promises faster, more private, and more energy-efficient intelligence. Expedera’s packet-based NPU architecture delivers up to 90% utilization and dramatic reductions in memory movement compared to conventional approaches, enabling next-generation real-time AI capabilities. This white paper examines tech...

Optimizing AI Workloads For Edge Computing

By Ann Mutschler - 03 Dec, 2025 - Comments: 0

Experts At The Table: Semiconductor Engineering gathered a group of experts to discuss how some AI workloads are better suited for on-device processing to achieve consistent performance, avoid network connectivity issues, reduce cloud computing costs, and ensure privacy. The panel included Frank Ferro, group director in the Silicon Solutions Group at Cadence; Eduardo Montanez, vice president an...

Moving AI Workloads To The Edge

By Ann Mutschler - 06 Nov, 2025 - Comments: 0

Analog Plus 3D Optics to Accelerate AI inference and Combinatorial Optimization (Microsoft, Cambridge)

By Technical Paper Link - 12 Sep, 2025 - Comments: 0

A new technical paper titled "Analog optical computer for AI inference and combinatorial optimization" was published by researchers at Microsoft Research, Barclays, and the University of Cambridge. Abstract: "Artificial intelligence (AI) and combinatorial optimization drive applications across science and industry, but their increasing energy demands challenge the sustainability of digital comput...

Complex Mix Of Processors At The Edge

By Liz Allan - 18 Aug, 2025 - Comments: 1

With AI changing so fast, companies are juggling to ensure they can deliver the best performance now while also future-proofing for unknown AI models or a completely different approach to training and inference that may emerge. There is a slew of options for high-end and budget phones, hyperscalers, and low-cost, low-power edge devices, and while GPUs keep making headlines, many designe...

Implementing AI Activation Functions

By Bryon Moyer - 10 Apr, 2025 - Comments: 1

Activation functions play a critical role in AI inference, helping to ferret out nonlinear behaviors in AI models. This makes them an integral part of any neural network, but nonlinear functions can be fussy to build in silicon. Is it better to have a CPU calculate them? Should hardware function units be laid down to execute them? Or would a lookup table (LUT) suffice? Most architectures inc...

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)

By Technical Paper Link - 18 Feb, 2025 - Comments: 0

A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University, and the University of Washington. Abstract: "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention...

tag: AI inference

Trending Articles

Chip Industry Week In Review

Executive Outlook: Agentic AI’s Impact On Chip Design

Agentic AI Is Changing Data Center Architectures

Chip Industry Week In Review

I/O Design Challenges Grow In AI Data Centers And HPC Clusters

Related Articles

Advanced Packaging Limits Come Into Focus

Startup Funding: Q1 2026

All AI Data Center Interconnects Will Be Optical Within 5 Years

The Sub-2nm Paradox

When Semiconductor Materials Misbehave

TSMC Tech Symposium 2026, By The Numbers

CPO Is Extending The Limits Of What’s Possible In AI Data Centers

Silicon Photonics Lights The Way To More Efficient Data Centers
