Pooling CPU Memory for LLM Inference With Lower Latency and Higher Throughput (UC Berkeley)


A new technical paper titled "Pie: Pooling CPU Memory for LLM Inference" was published by researchers at UC Berkeley.

Abstract: "The rapid growth of LLMs has revolutionized natural language processing and AI analysis, but their increasing size and memory demands present significant challenges. A common solution is to spill over to CPU memory; however, traditional GPU-CPU memory swapping often…"
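The truncated abstract doesn't show Pie's actual mechanism, but the problem it targets is concrete: spilling state from GPU to CPU memory and swapping it back without stalling compute. As a hedged illustration of that general pattern (not Pie's implementation), the PyTorch sketch below offloads a tensor to pinned host memory and prefetches it back on a dedicated CUDA stream, so transfers can overlap with kernel execution; the `offload`/`prefetch` helpers and the KV-cache-block shape are hypothetical.

```python
# Illustrative sketch only -- NOT Pie's implementation. Shows the generic
# GPU<->CPU swapping pattern the abstract refers to: spill a tensor to
# pinned host memory and bring it back with async copies on a side stream.
import torch

def offload(t: torch.Tensor, stream: torch.cuda.Stream) -> torch.Tensor:
    """Copy a GPU tensor into pinned CPU memory without blocking compute."""
    cpu_buf = torch.empty(t.shape, dtype=t.dtype, device="cpu", pin_memory=True)
    stream.wait_stream(torch.cuda.current_stream())  # t must be ready first
    with torch.cuda.stream(stream):
        cpu_buf.copy_(t, non_blocking=True)          # async D2H (pinned dst)
    return cpu_buf

def prefetch(cpu_buf: torch.Tensor, stream: torch.cuda.Stream) -> torch.Tensor:
    """Bring an offloaded tensor back to the GPU ahead of when it's needed."""
    with torch.cuda.stream(stream):
        return cpu_buf.to("cuda", non_blocking=True)  # async H2D (pinned src)

if __name__ == "__main__" and torch.cuda.is_available():
    copy_stream = torch.cuda.Stream()
    kv_block = torch.randn(16, 1024, 128, device="cuda")  # e.g. a KV-cache block
    spilled = offload(kv_block, copy_stream)
    restored = prefetch(spilled, copy_stream)
    torch.cuda.current_stream().wait_stream(copy_stream)  # sync before use
    assert torch.equal(kv_block, restored)
```

Pinned (page-locked) host buffers are what make the `non_blocking=True` copies truly asynchronous; with ordinary pageable memory the same calls silently fall back to synchronous transfers.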

How Attackers Can Read Data From A CPU’s Memory By Analyzing Energy Consumption


A technical paper titled “Collide+Power: Leaking Inaccessible Data with Software-based Power Side Channels” was published by researchers at Graz University of Technology and CISPA Helmholtz Center for Information Security.

Abstract: "Differential Power Analysis (DPA) measures single-bit differences between data values used in computer systems by statistical analysis of power traces. In th…"
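The abstract's one complete sentence describes the core of classical DPA: partition many power traces by a hypothesized single bit of the processed data and compare the partition means, so the data-dependent component survives while measurement noise averages out. Below is a minimal difference-of-means sketch on synthetic traces, assuming a Hamming-weight leakage model; the trace data, target bit position, and leakage model are illustrative stand-ins, not the Collide+Power attack itself.

```python
# Illustrative sketch only: single-bit DPA "difference of means" on
# synthetic power traces. All data here is simulated, not measured.
import numpy as np

rng = np.random.default_rng(0)
N_TRACES, N_SAMPLES = 5000, 200
TARGET_BIT = 3  # hypothetical bit of the processed value we test for

# Synthetic campaign: random 8-bit data values; power at sample 100
# leaks the Hamming weight of the value, buried in Gaussian noise.
data = rng.integers(0, 256, size=N_TRACES)
traces = rng.normal(0.0, 1.0, size=(N_TRACES, N_SAMPLES))
hw = np.array([bin(int(v)).count("1") for v in data])
traces[:, 100] += 0.5 * hw  # data-dependent power consumption

# DPA: split traces by the hypothesized bit value and subtract the mean
# traces; noise cancels, the data-dependent difference remains.
bit = (data >> TARGET_BIT) & 1
diff = traces[bit == 1].mean(axis=0) - traces[bit == 0].mean(axis=0)

leak_sample = int(np.abs(diff).argmax())
print(f"largest mean difference at sample {leak_sample}: {diff[leak_sample]:.3f}")
```

Running this prints a clear spike at sample 100, the point where the simulated device's power draw depends on the secret data; real attacks repeat the test across key hypotheses and keep the one with the strongest spike.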