The Fast, ‘Attractive’ Path From Great PPA To The Best PPA For High-Performance Arm Cores

Capturing design intent and efficiently mapping that into the physical domain.


By Mark Richards and Neel Desai

When you want to create a website for your new side-hustle, or maybe for your local soccer team, it's rare that you would order a book on Cascading Style Sheets, break out the HTML editor and start from a blank sheet of "paper." You'd do the smart thing and use a website builder, link it to a content management tool (that gets you to 90% of a usable website), and then focus your time and effort on adding value (the important 10% that makes the site unique and inimitably yours). The same is true when you're implementing high-performance Arm cores. After deciding on your processor core, silicon process and associated libraries, you should be able to get to a good result quickly, and then have the tools and methodologies available to aggressively push towards your target power, performance and area (PPA).


You can’t fight the laws of attraction, so let them work for you.

In the case of high-performance Arm cores, achieving the final 10% lies in understanding each core's unique, underlying logical data-flow, and then mapping that as optimally as possible into your manifestation of the physical domain. I say "your manifestation" because it's unlikely that you'll end up using the standard, off-the-shelf floorplan. Instead, you'll likely take the highly customized route, shaping and cajoling the floorplan based on, for example, memory choices to hit testing or performance goals, die utilization or area targets, and so on. As luck (and a considerable amount of engineering effort) would have it, Fusion Compiler comes with a host of baked-in technologies that make divining the optimal PPA, irrespective of the floorplan, more efficient and productive. Placement attractions, one of the most recent of these technologies, make you a "front-seat driver" interacting with Fusion Compiler's core placer engine, allowing you to guide and shape its result to achieve your overarching design goals. After all, a good placement makes or breaks a design.

These placement attractions (and their older, but sometimes less wise, cousins: bounds) allow the user to capture design intent and efficiently map it into the physical domain. The key differentiator between attractions and bounds is the ability of attractions to capture the "affinity" between the underlying logical units in a design: essentially, what should go next to what, and how closely you'd like the tool to keep them. Importantly, and this can't be stressed enough, attractions merely guide the placer; they don't dictate to it. It still has the freedom to balance other, sometimes competing, goals when considering the overall context of the design.
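To make the contrast concrete, here is a minimal Tcl sketch. The u_core/u_dp module name and the coordinates are made up for illustration, and the exact create_bound options are worth confirming against your Fusion Compiler documentation. The bound pins cells to a rectangle whose coordinates you had to work out up front, while the attraction expresses the same intent as guidance the placer can trade off against its other goals, so in practice you would pick one or the other:

#-- Hypothetical module; coordinates are for illustration only
set dp_cells [get_flat_cells u_core/u_dp/*]

#-- Option 1, the "bounds" way: pin the cells to a fixed rectangle found by
#-- inspecting the floorplan in the GUI
create_bound -name dp_bound -type soft -boundary {{100 100} {300 250}} $dp_cells

#-- Option 2, the "attractions" way: guide the same cells toward the region,
#-- leaving the placer free to deviate where timing or congestion demands it
create_placement_attraction -name dp_attract -region {{100 100} {300 250}} -effort high $dp_cells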


Figure 1: Getting from great results to the best results

Figure 1 captures attractions in action. If you compare the left-hand side (without attractions) to the right-hand side (with them), you can see how attractions have been deployed both to keep certain logical modules together (look at the dark purple blob, for example) and to create a "relative placement" of modules (compare the areas underneath the yellow boxes). In this example, the cells beneath the yellow boxes together form an arithmetic data-path, so we get far higher performance when they are placed as the underlying data-flow intended: in a neat, vertically-and-horizontally-stacked manner. Another benefit of attractions is reducing placement variation while the underlying RTL is still evolving.
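A sketch of how the data-path attraction in figure 1 might be expressed is below; the u_core/u_vec_alu module name and the region coordinates are hypothetical, and the command and options are the same ones used in the figure 3 snippet later in this article:

#-- Hypothetical arithmetic data-path module; region coordinates are illustrative
set alu_dp_cells [get_flat_cells u_core/u_vec_alu/*]

#-- Attract the data-path cells into one tall, narrow region so the placer keeps
#-- them stacked as the data-flow intends, even as the surrounding RTL evolves
create_placement_attraction \
    -name   vec_alu_stack \
    -region {{50 40} {120 400}} \
    -effort high \
    $alu_dp_cells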


Figure 2: Ensuring logical/physical affinity for optimal performance

Let me share another example. The left image in figure 2 shows critical RAM-interface logic, labelled 'abc', being pulled away from the 'abc' RAM array by substantial connectivity to logic in a separate block of the hierarchy on that side of the top-level floorplan. Knowledge of the RAM-interface logic tells us that optimizing it is key to achieving the last 10%: in essence, the sub-module 'abc' needs to be as close as possible to the 'abc' RAM array. Now, we could take the "old school" approach, hunting in the GUI for coordinates and creating a bound to hold the logic, or, because we're smart, we can use attractions to bind the interface logic to the RAMs, no matter where they end up in the floorplan. The code snippet in figure 3 shows how we do this: we find the RAM bounding box (wherever it is), create an attraction at that location, and then link the abc RAM-interface logic to that attraction area.

#-- Collect the "abc" RAM-interface cells that should sit close to their RAMs
set abc_cells [get_flat_cells u_abc/*]

#-- Get the abc RAMs (get_flat_cells) and extract the covering polygon (create_geo_mask)
set ram_mask [create_geo_mask [get_flat_cells u_abc/* -filter "is_hard_macro==true"]]

#-- Bind the "abc" cells to the RAM location, wherever the RAMs land in the floorplan
create_placement_attraction \
    -name   abc_near_ram \
    -region [get_attribute $ram_mask bbox] \
    -effort high \
    $abc_cells
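
If several sub-modules follow the same pattern, each with its own local RAM array, the recipe generalizes naturally. The sketch below loops over a hypothetical list of such modules (the u_def and u_xyz names are purely illustrative) and reuses only the commands from the snippet above:

#-- Hypothetical list of sub-modules that each own a local RAM array
foreach blk {u_abc u_def u_xyz} {
    set blk_cells [get_flat_cells ${blk}/*]
    set blk_rams  [get_flat_cells ${blk}/* -filter "is_hard_macro==true"]
    #-- Skip any block that turns out to contain no hard macros
    if {[sizeof_collection $blk_rams] == 0} { continue }
    set blk_mask [create_geo_mask $blk_rams]
    create_placement_attraction \
        -name   ${blk}_near_ram \
        -region [get_attribute $blk_mask bbox] \
        -effort high \
        $blk_cells
}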

The resulting placement, shown in the image on the right of figure 2, highlights how the sub-module 'abc' is now pulled closer to the 'abc' RAM array, as desired. The placement attraction has adjusted the default placement and steered it in the direction you want. Simple!

In this article, I have only scratched the surface of the power of this new and exciting technology in Fusion Compiler. For a more in-depth presentation, I recommend that you watch the webinar jointly delivered by Peter Lewin, Director of Marketing for Partner enablement at Arm, and Anu Uppaluri, R&D engineer in the Arm Solutions Group at Synopsys. You can watch the webinar on demand at https://readytalk.webcasts.com/starthere.jsp?ei=1310690&tp_key=caa8c2ad4f&sti=social

The latest placement-guiding innovations in Fusion Compiler exemplify the kind of advancements we continue to roll out to deliver extensible design flows, capable of achieving the best PPA and the fastest time to market for the highest-performance Arm CPUs and beyond.


