The Ultimate Shift Left

Many implementation steps have been moving earlier in the design flow. Floorplanning is next, and it encompasses everything from architecture to physical design.

March 9th, 2017 - By: Brian Bailey

Floorplanning is becoming much more difficult due to a combination of factors—increased complexity of the power delivery network, lengthening of clock trees, rising levels of communication, and greater connectedness of SoCs coupled with highly constrained routing resources.

The goal of floorplanning is to determine optimal placement of blocks on a die. But connecting those blocks is becoming harder, which in turn can make design closure more difficult.

To avoid surprises at the back end, floorplanning has to be done early and often throughout the development flow in an iterative manner. Problems that are uncovered late in the game are much more difficult and costly to correct.

Design techniques such as power gating, which reduces static power, and clock gating, which reduces dynamic power, are proving especially troublesome. They have changed the problem from minimizing timing-critical paths through a design to one where it is less obvious which blocks need to be placed close to others.

“Floorplanning presents the SoC designer with challenges and opportunities that affect the rest of the design flow,” says Arvind Narayanan, product marketing architect for Mentor Graphics. “This ranges from block implementation to chip assembly and top-level closure. It is particularly important in hierarchical floorplanning to quickly solve macro and I/O pad placement, accurately estimate timing, power and area, create top-level power networks, and to efficiently partition the design.”

At the start of the design, information may be sketchy. “The designer is trying to make calculated guesses and decide on things such as the location and orientation of large macros,” says Vinay Patwardhan, product management director, Digital & Signoff Group at Cadence. “With timing closure being the highest priority for any design, most of the decisions made up front are related to the location of the macros that revolve around the targeted frequency.”

Floorplanning took on a very different complexion when wire delays dominated design timing. “From the chip architect’s perspective, closeness on a floorplan implies lower latency,” says Drew Wingard, chief technology officer at Sonics. “Distance implies higher latency. The architect carefully identifies the most latency-sensitive paths at the integration level of the chip. Cooperation with the back-end team allows those latency sensitive paths to be closer together. The old style of design really did not have that information captured in an easy way and tended to create designs that made it look as if all of the paths were latency-critical. That meant there were a bunch of designs that did not close, and back-end teams had to try and fix it by inserting retiming. But they didn’t really know where they are allowed to do this.”
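Wingard's point that closeness implies lower latency can be illustrated with a small cost function. The following is a minimal Python sketch, not a description of any tool mentioned here. The block names, coordinates, and latency weights are hypothetical, and Manhattan distance between block centers stands in for real routed wire length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    x: float  # block center in mm (illustrative units)
    y: float


def manhattan(a: Block, b: Block) -> float:
    """Manhattan distance between two block centers."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def weighted_wirelength(blocks, links):
    """Sum of distance times latency weight over all links.

    links is a list of (source_name, dest_name, weight) tuples, where a
    larger weight marks a more latency-sensitive connection.
    """
    by_name = {b.name: b for b in blocks}
    return sum(w * manhattan(by_name[s], by_name[d]) for s, d, w in links)


# Hypothetical connectivity: the cpu-to-l2 path is latency-critical,
# the cpu-to-dma path is not.
links = [("cpu", "l2", 10.0), ("cpu", "dma", 1.0)]

# Two candidate floorplans that swap the positions of l2 and dma.
plan_a = [Block("cpu", 0, 0), Block("l2", 1, 0), Block("dma", 4, 3)]
plan_b = [Block("cpu", 0, 0), Block("l2", 4, 3), Block("dma", 1, 0)]

cost_a = weighted_wirelength(plan_a, links)
cost_b = weighted_wirelength(plan_b, links)
```

Under these assumed weights, the plan that keeps the latency-critical pair adjacent scores far lower. If every link carried the same weight, the two plans would score identically, which is the situation Wingard describes where all paths look latency-critical.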

Those problems can get worse at the more advanced nodes. “Routing at advanced nodes has become more complicated,” says Patwardhan. “This includes double patterning and cut metal rules. Because of this, precise track assignment and metal layer selection during floorplanning is becoming more and more important.”

Early and fast
Floorplanning starts early with estimated numbers and some degree of margin. “Getting accurate estimations of timing, power, and area as quickly as possible is the goal of the floorplanning/prototyping stage,” says Narayanan. “The timing analysis does not need to be sign-off accurate, but it should include as many corner cases as is feasible, and account for manufacturing variability.”

How big margins should be isn’t clear, though. “The margin definition is dependent on a lot of factors that are tied to the performance and schedule,” explains Patwardhan. “A high-end product pushing the frequency envelope may have to be very tight on margins, whereas another design might have a tight schedule requiring the minimum possible engineering change orders (ECOs) and a good enough die size. They may want to add more margins into the design.”

Time budgeting comes into play while defining margins. “At the floorplanning stage, the real internals of the design are missing, so designers and tools make very pessimistic assumptions,” continues Patwardhan. “Some keep margins as high as 15% to 20% and have budgeted enough time for ECOs.”
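To see what a margin of that size costs, consider a rough calculation. This is only a sketch: the 1GHz target frequency is an assumed example, and treating margin as a flat fraction of the clock period is a simplification of how real tools apply it.

```python
def budgeted_period_ns(target_freq_mhz: float, margin: float) -> float:
    """Clock period left for logic and wires after reserving a margin.

    margin is a fraction of the period, e.g. 0.15 for 15%.
    """
    if target_freq_mhz <= 0:
        raise ValueError("target frequency must be positive")
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in the range [0, 1)")
    period_ns = 1000.0 / target_freq_mhz
    return period_ns * (1.0 - margin)


# Assumed example: a 1GHz target has a 1ns period.
tight = budgeted_period_ns(1000.0, 0.15)  # roughly 0.85ns usable
loose = budgeted_period_ns(1000.0, 0.20)  # roughly 0.80ns usable
```

At this assumed frequency, a 20% margin leaves the early floorplan about 0.8ns per cycle to work with, so blocks have to close timing as if the clock were running at about 1.25GHz.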

But those margins have to be shrunk as quickly as possible. “Even though there is some margin when the design goes from RTL synthesis to place and route, designers want to avoid costly ECO iterations,” says Narayanan. “Since the QoR improvement opportunities diminish as they go through the design flow, most of the key architectural decisions are made early in the design cycle. The ability to analyze timing and congestion and perform design space exploration has become a necessity during RTL synthesis stage.”

There is pressure for those margins to decrease. “Over the last few years, EDA tools have consistently been used to work on reducing the pessimism that comes with over-designing,” says Patwardhan. “When the technology moved to 20nm and below, the physical design constraints became more demanding. Tools are trying to model and mimic a lot of ‘downstream’ activities in the early parts of the physical design flow. Unified global routers, unified delay calculators, semi-partial abstract models and ILMs (interface logic models) are becoming the norm for the effective modeling of a full netlist during the early stages of the flow.”

Power concerns
But not all floorplanning decisions are about meeting timing. Power is adding a different kind of constraint. “We have encountered cases where an IP block was perfectly fine on its own, but fails at the chip level because of supply cleanliness problems,” says Andrew Cole, vice president of business development at Silicon Creations. “This was endemic a few years ago but has become rare with increasing customer education.”

The CEO of Teklatech explains some of the tradeoffs that have to be made: “In terms of power, finer granularity is required to achieve power targets. With finer granularity comes a need for faster power cycle time (on/off cycles). This impacts power integrity, as recharging a power region must be done with in-rush current in mind. The fewer decaps, the less in-rush current. Minimizing decaps means that dynamic voltage drop issues must be fixed not in the physical domain, but using tools that optimize the dynamic power behavior of the underlying circuit.”
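The link between decaps, wake-up time, and in-rush current follows from the basic capacitor relation I = C × ΔV/Δt. The sketch below assumes a linear voltage ramp and lumps all of a region's capacitance into one value. The capacitance, supply, and ramp-time numbers are hypothetical and chosen only to show the trend.

```python
def average_inrush_current_a(capacitance_f: float, vdd_v: float,
                             ramp_time_s: float) -> float:
    """Average current needed to charge a power region from 0V to vdd_v.

    First-order estimate from I = C * dV/dt, assuming a linear ramp.
    """
    if ramp_time_s <= 0:
        raise ValueError("ramp time must be positive")
    return capacitance_f * vdd_v / ramp_time_s


# Hypothetical power region: 10nF total (decaps plus intrinsic), 0.8V supply.
baseline = average_inrush_current_a(10e-9, 0.8, 100e-9)

# Halving the capacitance (fewer decaps) halves the in-rush current.
fewer_decaps = average_inrush_current_a(5e-9, 0.8, 100e-9)

# Waking the region twice as fast doubles it.
faster_wake = average_inrush_current_a(10e-9, 0.8, 50e-9)
```

The same arithmetic shows why the two goals pull against each other. A faster power cycle raises in-rush current unless the capacitance being charged comes down with it.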

This means typical power profiles have to be known early, as well. “If there are modules that switch more often than others, they can be identified early, and halos can be placed around them to avoid standard cells from being placed in the same region (which may use more power),” says Patwardhan. “Also, these high-switching macros are then generally placed closer to power sources to reduce power and IR effects.”

Fixing some of these issues later can be expensive. “If there are ECOs to be performed for IR drop or electromigration (EM) violation fixes after timing is closed, it becomes very tricky to re-open the design,” adds Patwardhan. “So some effort is spent at the floorplanning stage doing an early IR/EM estimation in order to get the optimal power grid design and balance out power and routing tracks.”
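An early IR estimate does not need a full power grid model to show why placement matters. The following sketch models a single power rail fed from one end, with equal resistance between taps. The currents and the segment resistance are hypothetical, and a real grid is a two-dimensional mesh with many feed points.

```python
def rail_ir_drop_v(tap_currents_a, segment_res_ohm):
    """Static IR drop at each tap of a rail fed from one end.

    tap_currents_a lists the current drawn at each tap, ordered from the
    tap nearest the supply to the farthest. Each rail segment carries the
    current of every tap at or beyond its far end.
    """
    drops = []
    drop = 0.0
    downstream = sum(tap_currents_a)
    for current in tap_currents_a:
        drop += segment_res_ohm * downstream
        drops.append(drop)
        downstream -= current
    return drops


# Hypothetical loads: one high-switching macro (50mA) and two quiet blocks.
macro_far = rail_ir_drop_v([0.01, 0.01, 0.05], segment_res_ohm=0.1)
macro_near = rail_ir_drop_v([0.05, 0.01, 0.01], segment_res_ohm=0.1)

worst_far = max(macro_far)
worst_near = max(macro_near)
```

With these assumed values, moving the high-current macro next to the supply cuts the worst-case drop on the rail from about 18mV to about 10mV without changing the grid itself.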

Clocking issues
Timing and power concerns also are interlinked. “The best way to reduce the clock path delay and prevent jitter is with floor planning,” says Cole. “If at all possible, avoid making critical clocks go through places where there is a lot of supply noise. You can do that by putting the clock source (an LVDS input or a PLL) right next to the circuits the clock is driving. Just don’t let the clock path be long enough to pick up much noise.”

Clocks consume a lot of power and take up a lot of routing space. “One technique people use to make layout easier is to adopt the notion of globally asynchronous, locally synchronous designs,” says Wingard. “That sets regions of the chip, even if operating at the same frequency, to be treated as if they are asynchronous or mesochronous (same frequency but not controlled phase). Then you don’t have to route a low skew clock across the entire surface of the die. Otherwise you have to go to a grid clock or an H-tree clock that makes the power and energy associated with the clock network and wiring resource shoot through the roof.”

Designs that use an on-chip communications network often favor this kind of design. “If you can have multiple islands of the clock and you constrain the boundaries between those, then you can end up with more flexibility in the floor plan and easier timing closure,” adds Wingard. “The clock skew budget gets reduced.”

Reducing floorplanning problems
Cadence’s Patwardhan offers a list of steps designers can take to reduce the problems:

Run a few trial-and-error floorplan iterations before settling on a final floorplan. That includes modeling late-flow behavior early during the floorplanning stages. To avoid a surprise during timing closure, designers should use ILMs or flex models, which can significantly reduce the size of the design and allow quick floorplan iterations with relatively accurate timing models.
Model buffering delays at the floorplanning stage for long routes to estimate the time budget for blocks and interconnects.
Mimic clock tree synthesis using trial techniques and use net-delay models to see the effects of clock routing.
For an optimal power estimate and to avoid power grid over-design, design teams can also do power and IR modeling using full dynamic IR analysis on the power grid with abstracted power grid views for macros and memories.
Once the macros are placed, for optimal placement of standard cells, designers can use fences, halos, regions, and partial or full routing blockages to guide the downstream place and route flows. Macros occupying a large area of the design define partial blockages or soft blockages for buffers to be added in order to help timing closure.
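The second item on that list, modeling buffering delays on long routes, can be approximated with a textbook formula. The sketch below uses the common 0.38·R·C estimate for the delay of a distributed RC wire and splits the route into equal buffered segments. The wire resistance, capacitance, and buffer delay are hypothetical, and the model ignores buffer drive strength and input capacitance.

```python
def route_delay_ps(length_mm, r_ohm_per_mm, c_pf_per_mm,
                   n_segments, buffer_delay_ps):
    """Delay of a route split into n equal segments, each driven by a buffer.

    Wire delay per segment uses the 0.38 * R * C distributed-RC estimate.
    Ohms times picofarads gives picoseconds.
    """
    if n_segments < 1:
        raise ValueError("need at least one segment")
    seg_mm = length_mm / n_segments
    wire_ps = 0.38 * r_ohm_per_mm * c_pf_per_mm * seg_mm ** 2
    return n_segments * (wire_ps + buffer_delay_ps)


def best_segment_count(length_mm, r_ohm_per_mm, c_pf_per_mm,
                       buffer_delay_ps, max_segments=32):
    """Segment count with the lowest estimated delay, by direct search."""
    return min(
        range(1, max_segments + 1),
        key=lambda n: route_delay_ps(length_mm, r_ohm_per_mm, c_pf_per_mm,
                                     n, buffer_delay_ps),
    )


# Hypothetical 5mm top-level route: 200 ohm/mm, 0.2 pF/mm, 20ps per buffer.
single_driver_ps = route_delay_ps(5.0, 200.0, 0.2, 1, 20.0)
best_n = best_segment_count(5.0, 200.0, 0.2, 20.0)
buffered_ps = route_delay_ps(5.0, 200.0, 0.2, best_n, 20.0)
```

Because unbuffered wire delay grows with the square of length while buffered delay grows roughly linearly, an estimate like this gives a far more realistic time budget for long block-to-block connections than the raw wire delay does.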

Special care also has to be taken when integrating analog blocks. These can be highly sensitive to noise from digital circuitry and often require clean power supplies.

Tools are improving, and more early estimation tools are coming online, making it possible to reduce margins. But there will always need to be some room for adjustment when more accurate information becomes available. That, unfortunately, is part of the table stakes for the semiconductor industry.

Related Stories
Timing Closure Issues Resurface
Adding more features and more power states is making it harder to design chips at 10nm and 7nm.
Optimization Challenges For 10nm And 7nm
Experts at the Table, part 1: What will it take to optimize a design at 10nm and 7nm? The problem gets harder with each new node.

Brian Bailey
Brian Bailey is Technology Editor/EDA for Semiconductor Engineering.
