On-Device Speaker Identification For Digital Television (DTV)


In recent years, the way we interact with our TVs has changed. Multiple button presses to navigate an on-screen keyboard have been replaced with direct interaction through our voices. While this has resulted in significant improvements to the Digital Television (DTV) user experience, more can be done to provide immersive and engaging experiences. Imagine you say, “recommend me a film” or...

Understanding Scandump: A Key Silicon Debugging Technique


Scandump is an advanced silicon debugging technique that ingeniously repurposes DFT (Design For Testability) scan chains for functional debugging. This method allows for the extraction of states from registers or latches that are stitched into the scan chains, providing critical diagnostic insights. Scandump is particularly invaluable when the CPU is deadlocked or when the system hardware bec...

MPAM-Style Cache Partitioning With ATP-Engine And gem5


The Memory Partitioning and Monitoring (MPAM) Arm architecture supplement allows memory resources (MPAM MSCs) to be partitioned using partition identifiers (PARTIDs). Privileged software, such as OSes and hypervisors, can use it to partition caches, memory controllers and interconnects at the hardware level, so that bandwidth and latency controls can be defined and enforced for memory requestors. ...

BOLT Optimization Technology Could Bring Obvious Performance Uplift On Arm Server


BOLT is a post-link optimization technology built on the LLVM framework. It uses sampling data collected with the perf tool to convert an executable into an optimized version. After evaluating BOLT on several workloads, such as MySQL, Redis, memcached and nginx, on an Arm server, we saw a clear performance uplift. This blog post illustrates the methods used to enable BOLT and per...

Easing Automotive Software Migration


The automotive industry is on the cusp of seismic change. Multiple simultaneous trends are affecting the industry's entire supply chain. Software-defined vehicles (SDVs), autonomy and electrification are motivating automotive OEMs to holistically rethink the vehicle’s software and hardware development cycles. To better manage multiple compute elements and increasi...

Generative AI On Mobile Is Running On The Arm CPU


By Adnan Al-Sinan and Gian Marco Iodice

2023 was the year that showcased an impressive number of use cases powered by generative AI. This disruptive form of artificial intelligence (AI) technology is at the heart of OpenAI’s ChatGPT and Google’s Gemini AI model, and it demonstrates the opportunity to simplify work and advance education through generating text, images, or even audio content ...

SoC Telemetry & Performance Analysis Using Statistical Profiling Extension


The Arm Statistical Profiling Extension (SPE) is an architectural feature designed for enhanced instruction execution profiling within Arm CPUs. This feature has been available since the introduction of the Neoverse N1 CPU platform in 2019, along with performance monitor units (PMUs) generally available in Arm CPUs. An important step in extracting value from capabilities like SPE and PMUs is th...

Neural Network Model Quantization On Mobile


The general definition of quantization states that it is the process of mapping continuous infinite values to a smaller set of discrete finite values. In this blog, we will talk about quantization in the context of neural network (NN) models, as the process of reducing the precision of the weights, biases, and activations. Moving from floating-point representations to low-precision fixed intege...

Arm A-Profile Architecture Developments 2023


As computing demands continue to evolve with the rise of artificial intelligence (AI) and advancing security threats, it is imperative that the foundational computing architecture at the heart of the world’s devices continues to evolve. This is why our engineering teams add new features and technologies to the pervasive Arm architecture, with the software teams then ensuring that software lan...

Nginx Performance On AWS Graviton3


In this blog, we explore the performance of an Nginx Reverse Proxy (RP) and API Gateway (APIGW) on AWS Graviton3-based instances. We will also refer to these collectively as RP/APIGW. We compared AWS Graviton3-based instances to Intel Xeon 'Ice Lake'-based instances and AWS Graviton2-based instances to demonstrate the leadership performance available with AWS Graviton3. Summary: Compared to AWS ...
