To (B)atch Or Not To (B)atch?


When evaluating benchmark results for AI/ML processing solutions, it is very helpful to remember Shakespeare’s Hamlet, and the famous line: “To be, or not to be.” Except in this case the “B” stands for Batched. Batch size matters. There are two different ways in which a machine learning inference workload can be used in a system. A particular ML graph can be used one time, preced... » read more

Extending The DDR5 Roadmap With MRDIMM


Given the voracious memory bandwidth and capacity demands of Gen AI and other advanced workloads, we’ve seen a rapid progression through the generations of DDR5 memory. Multiplexed Registered DIMMs (MRDIMMs) offer a new memory module architecture capable of extending the DDR5 roadmap and expanding the capabilities of server main memory. MRDIMM reuses the lion’s share of existing DDR5 infras... » read more

Harnessing Computational Storage For Faster Data Processing


By Ujjwal Negi and Prashant Dixit

In the evolving landscape of data storage, computational storage devices (CSDs) are revolutionizing how we process and store data. By embedding processing capabilities within storage units, these devices enable in-situ data manipulation, minimizing data movement between storage and CPUs and dramatically improving performance and efficiency. This paradigm shi... » read more

Managing The Huge Power Demands Of AI Everywhere


Before generative AI burst onto the scene, no one predicted how much energy would be needed to power AI systems. Those numbers are just starting to come into focus, and so is the urgency about how to sustain it all. AI power demand is expected to surge 550% by 2026, from 8 TWh in 2024 to 52 TWh, before rising another 1,150% to 652 TWh by 2030. Commensurately, U.S. power grid planners have do... » read more

Simulating Multiple DSPs As Multiple x86 Processes


An increasing number of embedded designs are multi-core systems. At the pre-silicon stage, customers use a simulation platform for architectural exploration and software development. Architects want to quantify the impact of the number of cores, local memory size, system memory latency, and interconnect bandwidth. Software teams wish to have a practical development platform that is not excrucia... » read more

Building Safe And Secure Software With Rust On Arm


The Rust programming language has gained the attention of government security agencies, and even the White House, due to its unique blend of safety, performance, and productivity. Rust is designed to remove common programming burdens and handle issues like use-after-free errors at compile time. Remarkably, it achieves this without using a garbage collector, generating machine code that rivals th... » read more

Shift Left Is The Tip Of The Iceberg


Shift left is evolving from a buzzword into a much broader shift in design methodology and EDA tooling, and while it's still early innings, there is widespread agreement that it will be transformative. The semiconductor industry has gone through many changes over the past few decades. Some are obvious, but others happen because of a convergence of multiple factors that require systemic change... » read more

Accelerating Verification Of Computational Storage Designs Using Avery NVMe Verification IP


Computational storage is an emerging paradigm that integrates processing capabilities directly within storage devices. This paper outlines how this approach addresses the limitations of traditional NVM Express (NVMe) SSDs and the performance characteristics of the newly introduced compute and subsystem local memory (SLM) namespaces. The paper also focuses on the verification framework provided ... » read more

New AI Data Types Emerge


AI is all about data, and the representation of the data matters a great deal. But after focusing primarily on 8-bit integers and 32-bit floating-point numbers, the industry is now looking at new formats. There is no single best type for every situation, because the choice depends on the type of AI model, whether accuracy, performance, or power is prioritized, and where the computing happens, ... » read more

Powering Mechanical Simulations: AMD Vs. Intel


When selecting the right central processing unit (CPU) for optimizing Ansys Mechanical structural finite element analysis (FEA) software performance, there are two major players to consider: Intel and AMD. Both have made significant advancements in recent years, but choosing between them depends on several factors that directly affect simulation speeds, scalability, and overall performance fo... » read more
