Eliminate system bottlenecks and improve efficiency by using a cache between functional blocks and external memory.
System-on-chip (SoC) architects have a new memory technology, last level cache (LLC), to help overcome the design obstacles of bandwidth, latency, and power consumption in megachips for advanced driver assistance systems (ADAS), machine learning, and data-center applications. LLC is a standalone memory that inserts a cache between functional blocks and external memory to ease conflicting requirements. By handling memory accesses locally, LLC eliminates system bottlenecks and improves overall efficiency. LLCs are highly configurable and can efficiently handle a diverse range of data traffic flows inside a SoC device. Even though they are called “last level” caches, they can be placed at various locations throughout the chip architecture, not just in front of the memory controller.
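To make the idea concrete, the short C model below sketches the behavior described above: requests from functional blocks are first looked up in a local cache, and only misses travel to external memory. The line size, cache size, and latencies are illustrative assumptions, not CodaCache parameters.

```c
/* Minimal behavioral sketch of a last level cache sitting between
 * on-chip masters and external DRAM. Illustrative only; line size,
 * latencies, and organization are assumptions, not CodaCache internals. */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define LINE_BYTES   64u
#define NUM_LINES    1024u            /* 64 KB direct-mapped model */
#define DRAM_CYCLES  100u             /* assumed miss penalty      */
#define HIT_CYCLES   3u               /* assumed hit latency       */

typedef struct {
    bool     valid;
    uint64_t tag;
} line_t;

static line_t cache[NUM_LINES];

/* Returns the access latency in cycles for a read at 'addr'. */
static unsigned llc_read(uint64_t addr)
{
    uint64_t block = addr / LINE_BYTES;
    unsigned index = (unsigned)(block % NUM_LINES);
    uint64_t tag   = block / NUM_LINES;

    if (cache[index].valid && cache[index].tag == tag)
        return HIT_CYCLES;            /* served locally: no DRAM trip */

    cache[index].valid = true;        /* allocate on miss */
    cache[index].tag   = tag;
    return HIT_CYCLES + DRAM_CYCLES;  /* miss: off-chip access */
}

int main(void)
{
    printf("first access:  %u cycles\n", llc_read(0x1000));  /* miss */
    printf("second access: %u cycles\n", llc_read(0x1000));  /* hit  */
    return 0;
}
```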
Figure 1: A view of three LLC use cases of SoC design. Source: Arteris IP
Figure 1 illustrates the memory design features of LLC that support highly complex SoC devices. It demonstrates how LLC memory technology can directly add caches to the SoC bus, allowing chip designers to improve performance and power usage while reusing processor architectures.
Key design merits
First and foremost, dedicated LLCs significantly shrink the time on-chip processors spend waiting for memory accesses to complete. This lowers system latency and boosts SoC performance. Second, LLC memory technology increases SoC design efficiency by optimizing main memory traffic. In other words, the configurable size and organization of LLCs allow SoCs to meet the requirements of demanding applications such as ADAS and autonomous driving. Also, LLCs can be added to any Advanced eXtensible Interface (AXI) bus, reducing congestion in the physical layout of a SoC device. Each LLC instance brings its own set of master/slave AXI ports, increasing the number of cache-access interfaces and bolstering SoC design performance.
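A quick back-of-the-envelope calculation shows why a local cache shrinks waiting time and main memory traffic. The hit rate and latencies in this sketch are assumed figures for illustration only, not measured CodaCache numbers.

```c
/* Back-of-the-envelope sketch of the latency and DRAM-traffic benefit
 * of an LLC. Hit rate and latencies are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    const double hit_rate    = 0.80;   /* assumed LLC hit rate      */
    const double hit_cycles  = 3.0;    /* assumed LLC hit latency   */
    const double dram_cycles = 100.0;  /* assumed off-chip latency  */

    /* Average memory access time with and without the LLC. */
    double amat_with_llc = hit_cycles + (1.0 - hit_rate) * dram_cycles;
    double amat_no_llc   = dram_cycles;

    printf("AMAT without LLC: %.1f cycles\n", amat_no_llc);
    printf("AMAT with LLC:    %.1f cycles\n", amat_with_llc);
    printf("DRAM traffic reduced to %.0f%% of requests\n",
           (1.0 - hit_rate) * 100.0);
    return 0;
}
```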
Arteris IP, a supplier of interconnect technology solutions, unveiled CodaCache LLC for high-performance SoC designs. CodaCache IP allows SoC designers to directly attach a memory cache to an on-chip interconnect via the built-in AXI4 master/slave interface or to the IP supplier’s non-coherent FlexNoC interconnect.
Power efficiency is another advantage LLCs bring to SoC designs: by serving requests on chip, they reduce the number of accesses to power-hungry off-chip DRAM. One notable power-management feature offered by the CodaCache IP is support for Arm’s AMBA Q-Channel protocol, which powers down the cache when it is not needed.
CodaCache LLC can be implemented with 64-, 128-, 256-, and 512-bit data widths, and each cache instance can hold up to 8 megabytes, though smaller cache sizes ease timing closure.
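As a rough illustration of how those parameters interact, the sketch below derives one plausible geometry for an 8 MB instance on a 512-bit port; the line size and associativity are assumptions for illustration, not published CodaCache values.

```c
/* Sketch of the geometry trade-off when sizing an LLC instance.
 * The 8 MB ceiling and data widths come from the article; line size
 * and associativity below are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    const unsigned cache_bytes = 8u * 1024u * 1024u;  /* max 8 MB      */
    const unsigned line_bytes  = 64u;                  /* assumed       */
    const unsigned ways        = 16u;                  /* assumed       */
    const unsigned bus_bits    = 512u;                 /* widest option */

    unsigned lines          = cache_bytes / line_bytes;
    unsigned sets           = lines / ways;
    unsigned beats_per_line = (line_bytes * 8u) / bus_bits;

    printf("lines: %u, sets: %u, ways: %u\n", lines, sets, ways);
    printf("beats to fill one line over a %u-bit AXI port: %u\n",
           bus_bits, beats_per_line);
    return 0;
}
```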
Scratchpad and way partitioning
This article has outlined the advantages LLC offers design teams; however, if there is one stand-out feature, it is the scratchpad memory.
This standalone memory technology enables architects to accelerate SoC designs that integrate multiple non-coherent blocks.
CodaCache LLC can be partitioned so that some or all of its RAM can be used as a scratchpad, a memory element that allows designers to assign a temporary workspace for the local storage required for real-time code, hash tables, statics, and counters.
LLC technology can be configured as a scratchpad RAM based on a given base address and the desired number of ways. The scratchpad features a three-cycle read latency. During way partitioning, each way can be reserved so that only a specific ID, called ScrID or ScrIDs in a group, can be allocated to it. All ScrIDs can hit on lines in all ways for both read and write.
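That allocation rule can be sketched in a few lines of C: on a miss, a request may only allocate into ways reserved for its ScrID, while hits are honored in every way. The data structures and names here are illustrative assumptions, not the actual CodaCache configuration interface.

```c
/* Behavioral sketch of way partitioning: each way carries a reservation
 * mask of ScrIDs allowed to allocate into it, but any ScrID may hit on
 * lines already resident in any way. Names are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_WAYS 8u

typedef struct {
    bool     valid;
    uint64_t tag;
} way_t;

/* One cache set; bit s of reserve[w] means ScrID s may allocate way w. */
static way_t   set_ways[NUM_WAYS];
static uint32_t reserve[NUM_WAYS];

static bool access_set(uint64_t tag, unsigned scr_id)
{
    /* Hits are permitted in every way, regardless of reservation. */
    for (unsigned w = 0; w < NUM_WAYS; w++)
        if (set_ways[w].valid && set_ways[w].tag == tag)
            return true;                       /* hit */

    /* Miss: allocate only into a way reserved for this ScrID. */
    for (unsigned w = 0; w < NUM_WAYS; w++) {
        if (reserve[w] & (1u << scr_id)) {
            set_ways[w].valid = true;
            set_ways[w].tag   = tag;
            return false;                      /* miss, line allocated */
        }
    }
    return false;                              /* miss, no way available */
}

int main(void)
{
    reserve[0] = 1u << 1;   /* way 0 reserved for ScrID 1 */
    reserve[1] = 1u << 2;   /* way 1 reserved for ScrID 2 */

    printf("ScrID 1, first access: %s\n", access_set(0xAB, 1) ? "hit" : "miss");
    printf("ScrID 2, same line:    %s\n", access_set(0xAB, 2) ? "hit" : "miss");
    return 0;
}
```

In this arrangement, one IP’s working set stays resident in its reserved ways while any other master can still read or update lines that are already present.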
Figure 2: How scratchpad memory works in the same chip for different applications. Source: Arteris IP
Regarding system architecture considerations, the memory technology lets designers partition LLCs according to size, performance, layout optimization, and application requirements, dedicating a cache to one IP or a group of IPs.
For instance, high-bandwidth requirements could mandate more than one AXI port, or specific frequency targets could lead SoC designers to downsize the cache memory. It’s also worth mentioning that smaller or dedicated LLCs ease timing closure challenges in SoC designs.
The humble memory technology takes another innovative turn with the advent of LLCs. Support for this new memory technology is a testament to its potential for minimizing latency, power consumption, and bandwidth bottlenecks.