Pooling CPU Memory for LLM Inference With Lower Latency and Higher Throughput (UC Berkeley)


A new technical paper titled "Pie: Pooling CPU Memory for LLM Inference" was published by researchers at UC Berkeley. Abstract "The rapid growth of LLMs has revolutionized natural language processing and AI analysis, but their increasing size and memory demands present significant challenges. A common solution is to spill over to CPU memory; however, traditional GPU-CPU memory swapping ofte... » read more

Memory Fundamentals For Engineers


Memory is one of a select few electronic components essential to any electronic system. Modern electronics perform extraordinarily complex tasks that would be impossible without memory. Your computer obviously contains memory, but so do your car, your smartphone, your doorbell camera, your entertainment system, and any other gadget benefiting from digital electronics. This eBook prov... » read more

CXL: The Future Of Memory Interconnect?


Momentum for sharing memory resources between processor cores is growing inside data centers, where the explosion in data is driving the need to scale memory up and down in a way that roughly mirrors how processors are used today. A year after the CXL Consortium and JEDEC signed a memorandum of understanding (MOU) to formalize collaboration between the two organizations, suppor... » read more

Memory Disaggregation Research And Making It Practical With Hardware Trends (U. of Michigan)


A new technical paper titled "Memory Disaggregation: Advances and Open Challenges" was published by researchers at University of Michigan. Abstract "Compute and memory are tightly coupled within each server in traditional datacenters. Large-scale datacenter operators have identified this coupling as a root cause behind fleet-wide resource underutilization and increasing Total Cost of Owners... » read more

CXL-Based Memory Pooling System Meets Cloud Performance Goals And Significantly Reduces DRAM Cost


A technical paper titled "Pond: CXL-Based Memory Pooling Systems for Cloud Platforms" was published by researchers at Virginia Tech, Intel, Microsoft Azure, Google, and Stone Co. Abstract "Public cloud providers seek to meet stringent performance requirements and low hardware cost. A key driver of performance and cost is main memory. Memory pooling promises to improve DRAM utilization and t... » read more

CXL 3.0: From Expansion To Scaling


At the Flash Memory Summit in August, the CXL Consortium released the latest, and highly anticipated, version 3.0 of the Compute Express Link (CXL) specification. This new version of the specification builds on previous generations and introduces several compelling new features that promise to increase data center performance and scalability, while reducing the total cost of ownership (TCO). ... » read more

Rambus To Buy Hardent


Rambus inked a deal to buy Hardent, an engineering services company, to accelerate its push into the CXL arena. Compute Express Link (CXL), developed primarily by Intel before being turned into an open industry standard, allows memory to be disaggregated within a data center and shared across multiple servers. This, in turn, lets data centers control how critical resources are a... » read more

Improving Memory Efficiency And Performance


This is the second of two parts on CXL vs. OMI. Part one can be found here. Memory pooling and sharing are gaining traction as ways of optimizing existing resources to handle increasing data volumes. Using these approaches, memory can be accessed by a number of different machines or processing elements on an as-needed basis. Two protocols, CXL and OMI, are being leveraged to simplify thes... » read more
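
Whichever protocol carries the traffic, pooled or device-attached memory ultimately has to be surfaced to software as ordinary load/store-addressable space. The sketch below shows one common Linux path, mapping a device-DAX node directly into a process; the device path and mapping size are assumptions, and many deployments instead expose the capacity as regular system RAM behind a NUMA node.

/* One way pooled or device-attached memory can be surfaced to software on
 * Linux: mapping a device-DAX node directly into the address space. The path
 * and mapping size are assumptions; actual exposure depends on how the
 * platform and kernel are configured.
 * Build with: cc dax_map.c                                                   */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/dev/dax0.0";      /* hypothetical device-DAX node      */
    size_t      len  = 2UL << 20;          /* 2 MiB, aligned to the device's    */
                                           /* typical base page size            */
    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* The region now behaves like ordinary load/store-addressable memory. */
    strcpy((char *)mem, "hello from pooled memory");
    printf("%s\n", (char *)mem);

    munmap(mem, len);
    close(fd);
    return 0;
}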

Changing Server Architectures In The Data Center


Data centers are undergoing a fundamental shift to boost server utilization and improve efficiency, optimizing architectures so available compute resources can be leveraged wherever they are needed. Traditionally, data centers were built with racks of servers, each server providing computing, memory, interconnect, and possibly acceleration resources. But when a server is selected, some of th... » read more