Energy Analysis: 2D and 3D Architectures with Systolic Arrays and CIM (Cornell)


A new technical paper titled "Energy-/Carbon-Aware Evaluation and Optimization of 3D IC Architecture with Digital Compute-in-Memory Designs" was published by researchers at Cornell University. "In this paper, we investigate digital CIM (DCIM) macros and various 3D architectures to find the opportunity of increased energy efficiency compared to 2D structures. Moreover, we also investigated th...

Characteristics and Potential HW Architectures for Neuro-Symbolic AI


A new technical paper titled "Towards Efficient Neuro-Symbolic AI: From Workload Characterization to Hardware Architecture" was published by researchers at Georgia Tech, UC Berkeley, and IBM Research. Abstract: "The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, are facing challenges surrounding unsustainable computational trajectories, li...

SoC Telemetry & Performance Analysis Using Statistical Profiling Extension


The Arm Statistical Profiling Extension (SPE) is an architectural feature designed for enhanced instruction execution profiling within Arm CPUs. This feature has been available since the introduction of the Neoverse N1 CPU platform in 2019, along with performance monitor units (PMUs) generally available in Arm CPUs. An important step in extracting value from capabilities like SPE and PMUs is th...

Arm Statistical Profiling Extension: Performance Analysis Methodology


This paper presents a methodology for workload characterization and root cause analysis using the Arm Statistical Profiling Extension (SPE), demonstrated on a Neoverse N1 core. The target audience is software developers and performance analysts working in software development, analysis, optimization, and tuning. This paper may also help silicon engineers conduct performance analysis and debugging. T...

The High But Often Unnecessary Cost Of Coherence


Cache coherency, a common technique for improving performance in chips, is becoming less useful as general-purpose processors are supplemented with, and sometimes supplanted by, highly specialized accelerators and other processing elements. While cache coherency won't disappear anytime soon, it is increasingly being viewed as a luxury necessary to preserve a long-standing programming paradig...

Arm Neoverse N1 Core: Performance Analysis Methodology


The Arm Neoverse ecosystem is growing substantially, with many Arm hardware and software partners developing applications and porting their workloads onto Arm-based cloud instances. With Neoverse N1-based systems becoming widely available, many real-world workloads are showing very competitive performance and significant cost savings when compared to legacy systems. Some recent examples include:...

What’s Next For Emulation


Emulation is now the cornerstone of verification for advanced chip designs, but how emulation will evolve to meet future demands involving increasingly dense, complex, and heterogeneous architectures isn't entirely clear. EDA companies have been investing heavily in emulation, increasing capacity, boosting performance, and adding new capabilities. Now the big question is how else they can le...

Performance Analysis Of Electric Motors For EV Powertrains


Developing a battery EV powertrain is a complex systems problem. This technical paper examines the design and development of electric motors in an EV powertrain, showing how the different design choices, such as motor topology, winding type, and cooling system, can be compared and evaluated considering their overall system impact. ANSYS Motor-CAD simulations can help engineers determine wh...

Verdi Transaction Debug Solution: Unified Performance Analysis And Debug For Interconnect


In modern systems on chip (SoCs), where Arm AMBA protocols are intensively used as standard intellectual property (IP) interfaces, the interconnect is usually required to bridge and facilitate the communication between many different IP interfaces. The interconnect presents one of the biggest challenges of SoC verification, considering the different kinds of protocol interfaces, conversion of d...

Analyzing Testbench Design Performance Using Verdi Performance Analyzer


Performance continues to be a key factor in the design of any complex system-on-chip (SoC). Moreover, complexity is increasing every day, which makes it hard for engineers to track the performance of the design even as they are tasked with continuously increasing chip performance. This paper describes the challenge of measuring design performance and explains how Verdi Performance Analyzer enables run ti...
