Challenges Of Using The Cloud For IC Design

Why it’s best to keep cloud deployment as easy and foolproof as possible.

The ‘cloud’ is so ubiquitous that perhaps even your grandmother has heard about it. There are advertisements on TV with various vendors touting their cloud offerings. The cloud is ideal for eCommerce and SaaS (Software as a Service) offerings, as the elasticity on demand provides a convenient way to scale up when demand is high and scale down when it is low.

Yet the design community has been slow to embrace the cloud for several reasons.

  • The EDA software environment is complex with many moving and custom parts. You have to assemble software, IP, and PDKs from different vendors with all the customizations and scripts set up to streamline the design flows that have been developed and tweaked over decades. The entire environment has to be replicated and tested in the cloud.
  • EDA software and IP licensing agreements typically limit you to LAN use. Since the cloud is outside the corporate LAN, this requires additional legal and licensing steps that may be both time-consuming and expensive. Certain PDKs and IPs may have very strict security restrictions that may be difficult to meet.
  • The cloud provides a vast array of complex choices. Different design activities such as simulation, timing analysis, DRC, and LVS have different compute and memory requirements, and each may call for a different cloud configuration. Picking the right configurations and doing a cost-benefit analysis requires both time and expertise. Without close monitoring, you can run up large bills.
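
To make the cost-benefit point concrete, here is a minimal sketch of how one might compare machine types for different kinds of jobs. Everything in it is an assumption for illustration: the machine names, sizes, and hourly prices are made up, and the model ignores storage, licensing, and data transfer costs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineType:
    name: str
    cores: int
    memory_gb: int
    hourly_usd: float  # hypothetical price, not a real vendor quote


@dataclass(frozen=True)
class JobClass:
    name: str
    cores_per_job: int
    memory_gb_per_job: int
    hours_per_job: float
    job_count: int


def jobs_per_machine(machine, job):
    """How many jobs of this class fit on one machine at the same time."""
    by_cores = machine.cores // job.cores_per_job
    by_memory = machine.memory_gb // job.memory_gb_per_job
    return min(by_cores, by_memory)


def estimate_cost(machine, job):
    """Estimated USD to run every job, or None if a single job does not fit."""
    slots = jobs_per_machine(machine, job)
    if slots == 0:
        return None
    # Each batch of `slots` jobs occupies one machine for `hours_per_job`.
    machine_hours = math.ceil(job.job_count / slots) * job.hours_per_job
    return machine_hours * machine.hourly_usd


def cheapest(machines, job):
    """Return the lowest-cost machine type that can run the job, or None."""
    feasible = [(estimate_cost(m, job), m) for m in machines]
    feasible = [(cost, m) for cost, m in feasible if cost is not None]
    if not feasible:
        return None
    return min(feasible, key=lambda pair: pair[0])[1]


# Hypothetical catalogue and workloads.
GENERAL = MachineType("general-16", cores=16, memory_gb=64, hourly_usd=0.5)
HIGHMEM = MachineType("highmem-16", cores=16, memory_gb=512, hourly_usd=4.0)
SIM = JobClass("sim-regression", 1, 4, 0.5, 1000)
DRC = JobClass("full-chip-drc", 16, 400, 10.0, 1)

for job in (SIM, DRC):
    best = cheapest([GENERAL, HIGHMEM], job)
    print(job.name, "->", best.name, estimate_cost(best, job))
```

Even in this toy model the two workloads land on different machine types: the many small simulation jobs are cheapest on the general-purpose machine, while the single large DRC job only fits on the high-memory one.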

CAD teams are already stretched thin managing software, scripts and licensing for the complex tool chains, leaving little time or inclination to take on one more task.

The sweet spots

There are two scenarios where the cloud seems to have gained the most traction – startups who want to focus on their core mission and more established design groups that need additional temporary compute resources for peak simulation/verification runs.

Scenario 1: Startups
The main goal of a startup is to get its new idea to market as quickly as possible. Setting up a data center is an unnecessary expense and a distraction that requires constant maintenance. Using the cloud seems like an obvious choice, but now you need cloud expertise. You also still have to install and set up the EDA software and design flows. These are not tasks the design team is skilled at, and they may not warrant hiring full-time IT and CAD staff. And there is still the issue of getting approval from the foundry to deploy PDKs in the cloud.

The major EDA vendors have recognized this need and stepped in to address it by providing fully managed cloud solutions. The cloud infrastructure, EDA software and design flows, and security including approval of PDK from the foundries are all managed and maintained by the EDA vendor. The design team at a startup can be up and running very quickly and scale up as the team grows and the computational needs increase, especially close to a tapeout.

One drawback of this approach is that you have to live with the tool chain from one vendor and its approved partners, and within the constraints of the design flows put in place. I also presume all the convenience comes at some operational expense. Even so, it appears that many startups are opting for a cloud managed by their EDA vendor. We have seen a growing list of our startup customers using the Cadence cloud solution, typically hosted on Amazon Web Services (AWS).

A good video I stumbled across explains Cadence’s cloud offerings in simple terms. I presume other major EDA vendors provide similar offerings.

Scenario 2: Cloud for peak usage
Larger companies have already invested heavily in building their own data centers, setting up optimized design flows using best-in-class tools, and training engineers on this EDA environment. This works well during a normal design cycle but may be insufficient at peak tapeout times, when regressions need to be massively parallelized to reduce run time and full-chip verification may need more memory and compute than the corporate data center can provide.

A hybrid model of using the on-premise data center for normal usage and using the cloud to meet peak demand seems to be gaining in popularity. The typical design flow of an established design team is too complex and unlikely to fit nicely into hybrid cloud offerings from EDA vendors. This means that companies need to build in-house expertise to deploy and manage the hybrid cloud.
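
As a rough illustration of the hybrid model, the sketch below estimates how many cloud machines a team would need to add so that a regression finishes before a deadline. It is a back-of-the-envelope model, not a scheduler: it assumes every job takes the same time, that jobs run in fixed rounds, and that the function name and all numbers are hypothetical.

```python
import math


def cloud_machines_needed(total_jobs, hours_per_job, deadline_hours,
                          onprem_slots, slots_per_cloud_machine):
    """Cloud machines required to finish all jobs before the deadline.

    On-premise capacity is used first; only the overflow goes to the cloud.
    """
    if deadline_hours < hours_per_job:
        raise ValueError("deadline is shorter than a single job")

    # Number of back-to-back rounds of jobs that fit before the deadline.
    rounds = math.floor(deadline_hours / hours_per_job)

    # Jobs the on-premise farm can complete by itself in that time.
    onprem_capacity = onprem_slots * rounds
    overflow = max(0, total_jobs - onprem_capacity)
    if overflow == 0:
        return 0

    # Concurrent cloud slots needed to clear the overflow in the same rounds.
    cloud_slots = math.ceil(overflow / rounds)
    return math.ceil(cloud_slots / slots_per_cloud_machine)


# Normal load fits on-premise; a tapeout regression does not.
print(cloud_machines_needed(1000, 2, 12, 200, 16))
print(cloud_machines_needed(5000, 2, 12, 200, 16))
```

With these made-up numbers the normal load needs no cloud machines at all, while the peak regression needs a temporary farm that can be shut down once the run completes.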

Keep it simple

The IT staff will need to learn an entirely new set of cloud-specific technical jargon to be able to select from an extraordinarily large menu of configuration choices. Incorrect choices may lead to delays or cost overruns.

Add to that all the additional data management needs. The right revisions of files need to be uploaded to the cloud, and any discrepancy can lead to delays or expensive mistakes. Furthermore, results from verification jobs have to be downloaded for further analysis. Data transfer to and from the cloud needs to be optimized to save both time and expense.

Given all the complexities that the design and EDA teams already have to deal with, it is best to keep the cloud deployment as easy and foolproof as possible. Most companies have multiple design centers, often globally distributed. IT and EDA staff are already familiar with how to clone design environments at remote sites. Why not just use the same expertise?

Treat the cloud as just another design center. Replicate the EDA environment in the cloud using shared storage for tool installations and workspaces as well as a farm of compute-servers to run the jobs. Use your revision control and design management to create workspaces reliably in the cloud. Avoid any unnecessary tools or complications.

Cliosoft’s customers typically set up a Cliosoft SOS cache server on shared storage. The cache server can be configured to auto-synchronize. When the cache host in the cloud is started up, it will synchronize with the primary server on-premise and incrementally bring over all the changes since the last synchronization. This optimizes bandwidth usage and reduces the time to synchronize. The required workspaces can be created or updated leveraging the cache in the cloud, thereby not requiring any further uploads. Cloud virtual machines may be reconfigured and new machines deployed to meet the peak verification needs. Once design verification has been completed, the results may be transferred back on-premise for analysis while the cloud machines are shut down to control costs.
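
The idea of incremental synchronization can be sketched in a few lines. This is a generic illustration of the principle only. It is not the Cliosoft SOS API or its actual mechanism, and the `Revision`, `Cache`, and `synchronize` names are hypothetical. The assumption is that the primary server keeps a numbered log of revisions and the cache remembers the last revision it received.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    number: int       # monotonically increasing on the primary server
    path: str
    size_bytes: int


@dataclass
class Cache:
    last_synced: int = 0                        # highest revision received
    files: dict = field(default_factory=dict)   # path -> revision number


def pending_revisions(primary_log, cache):
    """Revisions the cache has not seen yet."""
    return [r for r in primary_log if r.number > cache.last_synced]


def synchronize(primary_log, cache):
    """Bring the cache up to date; return the number of bytes transferred."""
    transferred = 0
    for rev in sorted(pending_revisions(primary_log, cache),
                      key=lambda r: r.number):
        cache.files[rev.path] = rev.number
        cache.last_synced = rev.number
        transferred += rev.size_bytes
    return transferred


log = [
    Revision(1, "top/rtl/core.v", 40_000),
    Revision(2, "top/layout/core.gds", 900_000),
    Revision(3, "top/rtl/core.v", 41_000),
]
cloud_cache = Cache()

print(synchronize(log, cloud_cache))   # first start-up: everything is new
print(synchronize(log, cloud_cache))   # nothing changed since last sync

log.append(Revision(4, "top/rtl/alu.v", 12_000))
print(synchronize(log, cloud_cache))   # only the new revision moves
```

Because only revisions newer than the last synchronization cross the network, restarting the cloud cache after a period of inactivity costs bandwidth in proportion to what changed, not to the size of the whole project.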

The benefits are that engineers already familiar with the environment will be more productive and less error-prone, and data management will be seamless, efficient, and accurate. If you ever decide to change cloud vendors, it will be as simple and familiar as setting up a new site.


