Review Paper: Wafer-Scale Accelerators Versus GPUs (UC Riverside)

By Technical Paper Link - 18 Jun, 2025

A new technical paper titled "Performance, efficiency, and cost analysis of wafer-scale AI accelerators vs. single-chip GPUs" was published by researchers at UC Riverside. "This review compares wafer-scale AI accelerators and single-chip GPUs, examining performance, energy efficiency, and cost in high-performance AI applications. It highlights enabling technologies like TSMC’s chip-on-wafe...

Fully Automated Hardware And Software Design Of Processor Chips (Chinese Academy Of Sciences)

By Technical Paper Link - 12 Jun, 2025

A new technical paper titled "QiMeng: Fully Automated Hardware and Software Design for Processor Chip" was published by researchers at Chinese Academy of Sciences. Abstract: "Processor chip design technology serves as a key frontier driving breakthroughs in computer science and related fields. With the rapid advancement of information technology, conventional design paradigms face three majo...

Inference Framework For Deployment Challenges of Large Generative Models On GPUs (Google)

By Technical Paper Link - 03 May, 2025

A new technical paper titled "Scaling On-Device GPU Inference for Large Generative Models" was published by researchers at Google and Meta Platforms. Abstract: "Driven by the advancements in generative AI, large machine learning models have revolutionized domains such as image processing, audio synthesis, and speech recognition. While server-based deployments remain the locus of peak perform...

Wafer-Scale Computing for LLMs (U. of Edinburgh, Microsoft)

By Technical Paper Link - 09 Feb, 2025

A new technical paper titled "WaferLLM: A Wafer-Scale LLM Inference System" was published by researchers at University of Edinburgh and Microsoft Research. Abstract: "Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh-based architecture with large distributed on-chip memory (tens of GB in total) and ultr...

Mixed-Precision DL Inference, Co-Designed With HW Accelerator DPU (Intel)

By Technical Paper Link - 03 Feb, 2025

A new technical paper titled "StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign" was published by Intel. Abstract: "In this paper, we propose StruM, a novel structured mixed-precision-based deep learning inference method, co-designed with its associated hardware accelerator (DPU), to address the escalating computational and memory demands of deep learning worklo...

Pooling CPU Memory for LLM Inference With Lower Latency and Higher Throughput (UC Berkeley)

By Technical Paper Link - 26 Nov, 2024

A new technical paper titled "Pie: Pooling CPU Memory for LLM Inference" was published by researchers at UC Berkeley. Abstract: "The rapid growth of LLMs has revolutionized natural language processing and AI analysis, but their increasing size and memory demands present significant challenges. A common solution is to spill over to CPU memory; however, traditional GPU-CPU memory swapping ofte...

Survey: HW SW Co-Design Approaches Tailored to LLMs

By Technical Paper Link - 11 Oct, 2024

A new technical paper titled "A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models" was published by researchers at Duke University and Johns Hopkins University. Abstract: "The rapid development of large language models (LLMs) has significantly transformed the field of artificial intelligence, demonstrating remarkable capabilities in natural language proce...

MTJ-Based CRAM Array

By Technical Paper Link - 01 Aug, 2024

A new technical paper titled "Experimental demonstration of magnetic tunnel junction-based computational random-access memory" was published by researchers at University of Minnesota and University of Arizona, Tucson. Abstract: "The conventional computing paradigm struggles to fulfill the rapidly growing demands from emerging applications, especially those for machine intelligence because ...

Data Formats For Inference On The Edge

By Brian Bailey - 14 Dec, 2023

AI/ML training traditionally has been performed using floating point data formats, primarily because that is what was available. But this usually isn't a viable option for inference on the edge, where more compact data formats are needed to reduce area and power. Compact data formats use less space, which is important in edge devices, but the bigger concern is the power needed to move around...

AI Accelerator Architectures Poised For Big Changes

By Ann Mutschler - 04 Dec, 2023

AI is driving a frenzy of activity in the chip world as companies across the semiconductor ecosystem race to include AI in their product lineup. The challenge now is how to make AI run faster, use less energy, and leverage it from the edge to the data center, particularly with the rollout of large language models. On the hardware side, there are two main approaches for accel...

Startup Funding: Q1 2025

AI chips and data center communications see big funding; 75 startups raise $2 billion.

by Jesse Allen

Advanced Packaging Fundamentals for Semiconductor Engineers

New SE eBook examines the next phase of semiconductor design, testing, and manufacturing.

by Bryon Moyer

Chip Industry Week in Review

AI export rule to be scrapped; SEMI, EU request; Cadence, Nvidia supercomputer; AI co-processor; Imagination's new GPU; semi sales up; imec, TNO photonics lab; NSF key to national security; flexible packaging control system; SiConic test engineering; USB 4 support; SiC JFETS; magnetic behavior in hematite.

by The SE Staff

tag: inferencing
