How Ultra Ethernet And UALink Enable High-Performance, Scalable AI Networks


By Ron Lowman and Jon Ames

AI workloads are a major driver of innovation in the interface IP market. The number of AI model parameters is growing exponentially, doubling approximately every 4-6 months, in stark contrast to the slower pace of hardware advancement under Moore's Law, which follows an 18-month cycle. This discrepancy demands hardware innovations to support AI workloads, c...

Choosing The Right Memory Solution For AI Accelerators


To meet the increasing demands of AI workloads, memory solutions must deliver ever-increasing performance in bandwidth, capacity, and efficiency. From the training of massive large language models (LLMs) to efficient inference on endpoint devices, choosing the right memory technology is critical for chip designers. This blog explores three leading memory solutions (HBM, LPDDR, and GDDR) and t...

MACs Are Not Enough: Why “Offload” Fails


For the past half-decade, countless chip designers have approached the challenges of on-device machine learning inference with the simple idea of building a “MAC accelerator” – an array of high-performance multiply-accumulate circuits – paired with a legacy programmable core to tackle the ML inference compute problem. There are literally dozens of lookalike architectures in the market t...

The When, Why, And How Of Waiting And Backoff In Multi-Threaded Applications On Arm


With multithreaded applications, there are situations where it is unavoidable or desirable to wait for other threads. Implementing such wait instruction sequences correctly is important for both multithreaded scalability and power efficiency. Scalability is measured in terms of both aggregated throughput and fairness. Fairness means that all contending threads get an equal share of the contended ...

Is Liquid Cooling The Future Of Your Data Center?


The data center industry is facing unprecedented challenges. With chip densities skyrocketing, high-performance computing is being pushed to its limits, all while energy costs are soaring and environmental concerns are escalating. Securing approvals for new data center facilities has become more complex, often plagued by community objections and grid supply issues. However, amidst these hurd...

Tame IR Drop Like Google


In the relentless pursuit of semiconductor performance and efficiency, tech giants like Google are constantly pushing the boundaries of what's possible. As they scale their designs to the cutting-edge 3nm node, power integrity has emerged as a critical challenge that must be overcome. Enter Calibre DesignEnhancer (DE), Siemens' analysis-based solution for enhancing design reliability and man...

Integrating Ethernet, PCIe, And UCIe For Enhanced Bandwidth And Scalability For AI/HPC Chips


By Madhumita Sanyal and Aparna Tarde

Multi-die architectures are becoming a pivotal solution for boosting performance, scalability, and adaptability in contemporary data centers. By breaking down traditional monolithic designs into smaller heterogeneous or homogeneous dies (also known as chiplets), engineers can fine-tune each component for specific functions, resulting in notable im...

Why PCIe And CXL Are Essential Interconnects For The AI Era


As the demand for AI and machine learning accelerates, the need for faster and more flexible data interconnects has never been more critical. Traditional data center architectures face several challenges in enabling efficient and scalable infrastructure to meet the needs of emerging AI use cases. The wide variety of AI use cases translates into different types of workloads. Some require high ...

Addressing Reset Tree Design Challenges For Complex SoCs With Advanced Structural Checks


As SoC designs continue to evolve, the complexity of reset architectures has grown significantly. Traditionally, clock tree synthesis has been a major focus due to timing challenges, but now reset trees demand equal attention. With multiple reset sources, designers must deal with reset trees that can be more intricate than clock trees. Errors within a reset tree can lead to serious issues, incl...

Automakers And Industry Need Specific, Extremely Robust, Heterogeneously Integrated Chiplet Solutions


Chiplets offer great potential for the automotive and industrial sectors, especially as these applications often have high performance requirements but call for chips only in small quantities. The modular principle behind chiplets enables efficient design and production: individual components have to be produced only once and can then be flexibly combined to create tailored solutions. This offers m...
