The When, Why, And How Of Waiting And Backoff In Multi-Threaded Applications On Arm


With multithreaded applications, there are situations where it is unavoidable or desirable to wait for other threads. Implementing such wait instruction sequences correctly is important for both multithreaded scalability and power efficiency. Scalability is measured both in terms of aggregated throughput and fairness. Fairness is when all contending threads get an equal share of the contended ... » read more

The Future Of AI For Games


Earlier this month, I had the pleasure of attending the inaugural AI and Games Conference at Goldsmiths in London, for which Arm was an associate sponsor. Hosted by Dr. Tommy Thompson, and borrowing its name from his AI and Games YouTube channel, the day really delivered on the promise of bringing experts and enthusiasts (and subscribers) together for interesting talks on the intersecti... » read more

Building Safe And Secure Software With Rust On Arm


The Rust Programming Language has gained the attention of government security agencies, and even the White House, due to its unique blend of safety, performance and productivity. Rust is designed to remove common programming burdens and handle issues like use-after-free errors at compile time. Remarkably, it achieves this without using a garbage collector, generating machine code that rivals th... » read more

Real-Time Low Light Video Enhancement Using Neural Networks On Mobile


Video conferencing is a ubiquitous tool for communication, especially for remote work and social interactions. However, it is not always a straightforward plug-and-play experience, as adjustments may be needed to ensure a good audio and video setup. Lighting is one such factor that can be tricky to get right. A nicely illuminated video feed looks presentable in a meeting, but on the other hand,... » read more

Evolving Edge Computing And Harnessing Heterogeneity


In the Evolving Edge Computing white paper, we highlighted three challenges to enabling the Intelligent Edge: enabling hardware heterogeneity, removing development friction, and ensuring security at scale. This blog post examines the first in that list, heterogeneity. It will cover the ways in which heterogeneity appears, its effect on systems and some ideas for resolving its inher... » read more

Challenges And Outlook Of ATE Testing For 2nm SoCs


The transition to the 2nm technology node introduces unprecedented challenges in Automated Test Equipment (ATE) bring-up and manufacturability. As semiconductor devices scale down, the complexity of testing and ensuring manufacturability increases exponentially. 3nm silicon is now a mastered art, with high yields even for complex packaged silicon, while the transition from 3nm to... » read more

On-Device Speaker Identification For Digital Television (DTV)


In recent years, the way we interact with our TVs has changed. Multiple button presses to navigate an on-screen keyboard have been replaced with direct interaction through our voices. While this has resulted in significant improvements to the Digital Television (DTV) user experience, more can be done to provide immersive and engaging experiences. Imagine you say, “recommend me a film” or... » read more

Understanding Scandump: A Key Silicon Debugging Technique


Scandump is an advanced silicon debugging technique that ingeniously repurposes DFT (Design For Testability) scan chains for functional debugging. This method allows for the extraction of states from registers or latches that are stitched into the scan chains, providing critical diagnostic insights. Scandump is particularly invaluable when the CPU is deadlocked or when the system hardware bec... » read more

MPAM-Style Cache Partitioning With ATP-Engine And gem5


The Memory Partitioning and Monitoring (MPAM) Arm architecture supplement allows for memory resources (MPAM MSCs) to be partitioned using PARTID identifiers. This lets privileged software, such as OSes and hypervisors, partition caches, memory controllers and interconnects at the hardware level, so that bandwidth and latency controls can be defined and enforced for memory requestors. ... » read more

BOLT Optimization Technology Could Bring Obvious Performance Uplift On Arm Server


BOLT is a post-link optimization technology built on the LLVM framework. It uses the perf tool to collect sampling data and converts the executable into an optimized version. After evaluating BOLT on several workloads, such as MySQL, Redis, memcached and nginx, on Arm servers, we saw a clear performance uplift. This blog post illustrates the methods used to enable BOLT and per... » read more
