The March Toward Chiplets


The days of monolithic chips developed at the most advanced process nodes are rapidly dwindling. Nearly everyone working at the leading edge of design is looking toward some type of advanced packaging using discrete heterogeneous components. The challenge now is how to shift the whole chip industry into this disaggregated model. It's going to take time, effort, as well as a substantial reali... » read more

Using Silicon Photonics To Reduce Latency On Edge Devices


A new technical paper titled "Delocalized photonic deep learning on the internet’s edge" was published by researchers at MIT and Nokia Corporation. “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-leng... » read more

Redesigning Core and Cache Hierarchy For A General-Purpose Monolithic 3D System


A technical paper titled "RevaMp3D: Architecting the Processor Core and Cache Hierarchy for Systems with Monolithically-Integrated Logic and Memory" was published by researchers at ETH Zürich, KMUTNB, NTUA, and University of Toronto. Abstract: "Recent nano-technological advances enable the Monolithic 3D (M3D) integration of multiple memory and logic layers in a single chip with fine-graine... » read more

Accelerating Off-Chip Load Requests By Removing The On-Chip Cache Access Latency From Their Critical Path


A new technical paper titled "Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction" was published by researchers at ETH Zurich, Intel Processor Architecture Research Lab, and LIRMM, Univ. Montpellier, CNRS. The work received a best paper award at MICRO 2022. Abstract: "Long-latency load requests continue to limit the performance of high-performance ... » read more

Decreasing Refresh Latency of Off-the-Shelf DRAM Chips


A new technical paper titled "HiRA: Hidden Row Activation for Reducing Refresh Latency of Off-the-Shelf DRAM Chips" was published by researchers at ETH Zürich, TOBB University of Economics and Technology, and Galicia Supercomputing Center (CESGA). Abstract: "DRAM is the building block of modern main memory systems. DRAM cells must be periodically refreshed to prevent data loss. Refresh oper... » read more

Co-Packaged Optics In The Data Center


Just because faster Ethernet is added to the data center doesn’t mean existing hardware can utilize it efficiently. Scott Durrant, strategic marketing manager at Synopsys, talks with Semiconductor Engineering about the rapid rollout of faster Ethernet rates, problems in moving data to the front module of the switch and how much energy is required, and what optical technology can bring to the ... » read more

Novel H2H Mapping Algorithm With Both Computation And Communication Awareness


A new research paper titled "H2H: Heterogeneous Model to Heterogeneous System Mapping with Computation and Communication Awareness" was published by researchers at the University of Pittsburgh and Georgia Tech. Abstract: "The complex nature of real-world problems calls for heterogeneity in both machine learning (ML) models and hardware systems. The heterogeneity in ML models comes from multi-sensor perceiving and multi-task lear... » read more

A Case for Transparent Reliability in DRAM Systems


A new technical paper was published by researchers at ETH Zurich and TU Delft. Abstract: "Today's systems have diverse needs that are difficult to address using one-size-fits-all commodity DRAM. Unfortunately, although system designers can theoretically adapt commodity DRAM chips to meet their particular design goals (e.g., by reducing access timings to improve performance, implementing system-level RowHammer mitigati... » read more

Improving Memory Efficiency And Performance


This is the second of two parts on CXL vs. OMI. Part one can be found here. Memory pooling and sharing are gaining traction as ways of optimizing existing resources to handle increasing data volumes. Using these approaches, memory can be accessed by a number of different machines or processing elements on an as-needed basis. Two protocols, CXL and OMI, are being leveraged to simplify thes... » read more

Mapping Transformation Enabled High-Performance and Low-Energy Memristor-Based DNNs


Abstract: "When deep neural network (DNN) is extensively utilized for edge AI (Artificial Intelligence), for example, the Internet of things (IoT) and autonomous vehicles, it makes CMOS (Complementary Metal Oxide Semiconductor)-based conventional computers suffer from overly large computing loads. Memristor-based devices are emerging as an option to conduct computing in memory for DNNs to make... » read more
