Implementing Fast Barriers For A Shared-Memory Cluster Of 1024 RISC-V Cores


A technical paper titled “Fast Shared-Memory Barrier Synchronization for a 1024-Cores RISC-V Many-Core Cluster” was published by researchers at ETH Zürich and Università di Bologna. “Synchronization is likely the most critical performance killer in shared-memory parallel programs. With the rise of multi-core and many-core processors, the relative impact on performance and energy overhe...

Synchronization Overview And Case Study on Arm Architecture


This white paper aims to share knowledge of the Arm architecture. It is intended for readers who work on synchronization on Arm. [Warning] When dealing with locking optimizations, we must be extremely careful about correctness. Bugs caused by synchronization are usually hard to root-cause, and the optimized code may crash on other CPUs wit...

The Best Platform When It Comes To PTP Accuracy


Synchronization is pervasive in most industry sectors, including finance, telecom, industrial, consumer, and aerospace and defense. All of these markets have several applications that rely on synchronization. The first part of this white paper describes a few typical examples where synchronization enables applications that would not be possible otherwise. Synchronization can be achieved with...

Assessing Synchronization And Graphics-Compute-Graphics Hazards


In modern rendering environments, compute workloads are often used during a frame. Compute is generic (non-fixed-function) parallel programming on the GPU, commonly used for techniques that are challenging, outright impossible, or simply inefficient to implement with the standard graphics pipeline (vertex/geometry/tessellation/raster/fragment). In gener...