Home

TECHNICAL PAPERS

Enabling Scalable Accelerator Design On Distributed HBM-FPGAs (UCLA)

December 6th, 2023 - By: Technical Paper Link

A technical paper titled “TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs” was published by researchers at University of California Los Angeles.

Abstract:

“Despite the increasing adoption of Field-Programmable Gate Arrays (FPGAs) in compute clouds, there remains a significant gap in programming tools and abstractions which can leverage network-connected, cloud-scale, multi-die FPGAs to generate accelerators with high frequency and throughput. To this end, we propose TAPA-CS, a task-parallel dataflow programming framework which automatically partitions and compiles a large design across a cluster of FPGAs with no additional user effort while achieving high frequency and throughput. TAPA-CS has three main contributions. First, it is an open-source framework which allows users to leverage virtually “unlimited” accelerator fabric, high-bandwidth memory (HBM), and on-chip memory, by abstracting away the underlying hardware. This reduces the user’s programming burden to a logical one, enabling software developers and researchers with limited FPGA domain knowledge to deploy larger designs than possible earlier. Second, given as input a large design, TAPA-CS automatically partitions the design to map to multiple FPGAs, while ensuring congestion control, resource balancing, and overlapping of communication and computation. Third, TAPA-CS couples coarse-grained floorplanning with automated interconnect pipelining at the inter- and intra-FPGA levels to ensure high frequency. We have tested TAPA-CS on our multi-FPGA testbed where the FPGAs communicate through a high-speed 100GBps Ethernet infrastructure. We have evaluated the performance and scalability of our tool on designs, including systolic-array based convolutional neural networks (CNNs), graph processing workloads such as page rank, stencil applications like the Dilate kernel, and K-nearest neighbors (KNN). TAPA-CS has the potential to accelerate development of increasingly complex and large designs on the low power and reconfigurable FPGAs.”

Find the technical paper here. Published November 2023 (preprint).

Prakriya, Neha, Yuze Chi, Suhail Basalama, Linghao Song, and Jason Cong. “TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs.” arXiv preprint arXiv:2311.10189 (2023).

Related Reading
Processor Tradeoffs For AI Workloads
Gaps are widening between technology advances and demands, and closing them is becoming more difficult.
HBM’s Future: Necessary But Expensive
Upcoming versions of high-bandwidth memory are thermally challenging, but help may be on the way.

Knowledge Centers
Entities, people and technologies explored

Startup Funding: Q1 2025

AI chips and data center communications see big funding; 75 startups raise $2 billion.

by Jesse Allen

Advanced Packaging Fundamentals for Semiconductor Engineers

New SE eBook examines the next phase of semiconductor design, testing, and manufacturing.

by Bryon Moyer

Chip Industry Week in Review

AI export rule to be scrapped; SEMI, EU request; Cadence, Nvidia supercomputer; AI co-processor; Imagination's new GPU; semi sales up; imec, TNO photonics lab; NSF key to national security; flexible packaging control system; SiConic test engineering; USB 4 support; SiC JFETS; magnetic behavior in hematite.

by The SE Staff

Chip Industry Week in Review

EDA export controls; Synopsys-Ansys divest requirements; SIA Factbook; McKinsey effects of tariffs; ASE's fan-out bridge; earnings; TSMC's design center; China's legacy chips play; AMD's optical acquisition.

by The SE Staff

RISC-V’s Increasing Influence

Does the world need another CPU architecture when that no longer reflects the typical workload? Perhaps not, but it may need a bridge to get to where it needs to be.

by Brian Bailey

Chip Industry Week in Review

IC, AI global ranking; China's fully automated IC design system; Micron goes bigger; PCIe 7.0 spec; TSMC-Tokyo joint lab; panel-level packaging win; first neuromorphic compute system; GAA forksheets; AMD's new GPUs.

by The SE Staff

Big Changes Ahead For Interposers And Substrates

New materials and processes will help with power distribution and thermal dissipation in advanced packages.

by Gregory Haley

What Exactly Are Chiplets And Heterogeneous Integration?

New technologies drive new terminology, but the early days for those new approaches can be very confusing.

by Bryon Moyer

Enabling Scalable Accelerator Design On Distributed HBM-FPGAs (UCLA)

Abstract:

Leave a Reply Cancel reply

Technical Papers

Knowledge Centers
Entities, people and technologies explored

Related Articles

Startup Funding: Q1 2025

Advanced Packaging Fundamentals for Semiconductor Engineers

Chip Industry Week in Review

Chip Industry Week in Review

RISC-V’s Increasing Influence

Chip Industry Week in Review

Big Changes Ahead For Interposers And Substrates

What Exactly Are Chiplets And Heterogeneous Integration?

Sponsors

Recent Comments

About

Navigation

Connect With Us

Enabling Scalable Accelerator Design On Distributed HBM-FPGAs (UCLA)

Abstract:

Leave a Reply Cancel reply

Technical Papers

Knowledge Centers Entities, people and technologies explored

Related Articles

Startup Funding: Q1 2025

Advanced Packaging Fundamentals for Semiconductor Engineers

Chip Industry Week in Review

Chip Industry Week in Review

RISC-V’s Increasing Influence

Chip Industry Week in Review

Big Changes Ahead For Interposers And Substrates

What Exactly Are Chiplets And Heterogeneous Integration?

Sponsors

Newsletter Signup

Popular Tags

Recent Comments

About

Navigation

Connect With Us

Knowledge Centers
Entities, people and technologies explored