From shared storage to scaling up for peak usage, cloud vendors now offer more of what design teams need.
Amazon AWS was launched back in 2006. Web-based services such as Netflix and Expedia were early adopters, and AWS has grown rapidly, drawing competition from Google (GCP), Microsoft (Azure) and others. It has taken a while for the design community to embrace the ‘cloud’ because some of the needs and concerns of design teams are different. Cloud vendors have recognized this untapped market, are now addressing the needs of design teams, such as shared storage, and are actively courting semiconductor vendors. As semiconductor companies have become more comfortable with the security provided by the cloud vendors, interest in adding the cloud to the design IT infrastructure has increased significantly.
Whenever there is a new technology, a whole new set of jargon, such as bursting and hybrid, sprouts up. As is often the case, the new jargon is just new names for older concepts used in a slightly different setting. The purpose of this blog is to demystify the cloud and explore different ways in which design teams can leverage the cloud to their advantage.
EDA infrastructure today
Most design groups have long since moved their design and verification work from personal workstations to server farms, where a host is assigned to each job by queueing software such as LSF. The workstation, often a PC, serves merely as a display terminal. All the processing is done on the host in the server farm, and the design data is stored on a network filer (NAS/SAN) that can be accessed from any host in the server farm.
Server farms, which used to be maintained at each design center, are now consolidated into a limited number of data centers. For example, a company with 20 design centers across America and Europe may consolidate all their servers into, say, four data centers, geographically distributed across the two continents. Consolidating the server farms helps reduce IT maintenance costs and improves utilization. Multiple server farms are used to distribute the risk and to keep latency down to under 100 ms to minimize impact to display-intensive activities such as layout editing.
Why move to the cloud
Moving to the cloud is essentially outsourcing all or part of your server farm to a cloud vendor. To the individual designer there is little to no difference whether a job runs on a host or virtual host in a data center managed by the company’s IT staff or on a virtual host managed by a cloud vendor.
There are several reasons the cloud is attractive. Building an IT infrastructure to handle a large design group is expensive, both in capital equipment costs and in the IT staff needed to manage it. The infrastructure needs to be built to scale as the design teams grow over the years. Additionally, compute load and network traffic vary significantly over the design cycle. While the load may be relatively low during the early design phase, the needs are much higher as tapeout approaches and both design and verification activity peak. Building the IT infrastructure for optimum performance at peak usage would be expensive, and compute utilization would be low during off-peak times. On the other hand, an infrastructure optimized for normal usage would severely degrade performance during peak usage, perhaps impacting schedules. So IT departments typically pick a compromise middle ground that leaves neither the engineers nor the bean counters very happy or very upset.
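The trade-off can be made concrete with a little arithmetic. The sketch below is only an illustration: the monthly demand figures and the two capacity choices are hypothetical numbers, not measurements from any real design group.

```python
def utilization(monthly_demand, capacity):
    """Fraction of the installed cores that are actually used over the period."""
    used = sum(min(d, capacity) for d in monthly_demand)
    return used / (capacity * len(monthly_demand))


def unmet_demand(monthly_demand, capacity):
    """Core-months of work that the installed capacity cannot serve."""
    return sum(max(d - capacity, 0) for d in monthly_demand)


# Hypothetical demand (cores needed per month) over a six-month design cycle,
# rising toward tapeout.
demand = [200, 200, 300, 400, 800, 1000]

peak_capacity = 1000    # farm sized for the tapeout peak (assumed)
normal_capacity = 300   # farm sized for normal usage (assumed)

for label, capacity in [("peak-sized", peak_capacity), ("normal-sized", normal_capacity)]:
    print(
        f"{label}: utilization {utilization(demand, capacity):.0%}, "
        f"unmet demand {unmet_demand(demand, capacity)} core-months"
    )
```

With these made-up numbers, the peak-sized farm sits idle more than half the time, while the normal-sized farm is well utilized but leaves a large backlog in the final months.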
The cloud holds the promise of solving this conundrum. New compute resources can be deployed or shut down on demand. IT departments no longer have to plan for the future or for peak usage. There is no scramble to purchase and deploy new equipment. During normal usage, the internal server farms are adequate. During peak usage, as many compute resources as needed (and affordable) can be started up in the cloud. Compute resources can be reconfigured with additional cores or memory as needed, all with just a few clicks on a webpage. When the compute-intensive verifications are complete, the idle cloud machines can be shut down, reducing the cost to just pennies. This elastic nature of the cloud allows IT departments to deploy a no-compromise infrastructure for design teams.
The cloud is particularly attractive for startups, which often do not have a dedicated IT staff. The entire design environment can be built in the cloud and reconfigured and expanded as the needs of the team grow. The headaches of network management, uptime, backups, etc. are all handled by the cloud vendor.
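The "as needed (and affordable)" decision is, at its core, a small sizing calculation. The following sketch assumes a simple model in which every pending job needs the same number of cores; the function name, its parameters and the example figures are hypothetical and do not correspond to any vendor's or scheduler's API.

```python
import math


def cloud_hosts_needed(pending_jobs, cores_per_job, onprem_free_cores,
                       cores_per_cloud_host, max_cloud_hosts):
    """Return how many cloud hosts to start for the current backlog.

    The on-premises farm absorbs what it can; only the overflow goes to the
    cloud, capped by a budget limit (max_cloud_hosts).
    """
    required_cores = pending_jobs * cores_per_job
    overflow_cores = max(required_cores - onprem_free_cores, 0)
    hosts = math.ceil(overflow_cores / cores_per_cloud_host)
    return min(hosts, max_cloud_hosts)


# Normal usage: the internal farm is adequate, so nothing is started.
print(cloud_hosts_needed(10, 4, 250, 32, 10))
# Peak usage: the overflow is sent to the cloud.
print(cloud_hosts_needed(100, 4, 250, 32, 10))
```

A real deployment would feed this from the queueing software's pending-job count and call the cloud vendor's own tooling to start and stop machines.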
Shared storage
The EDA design environment is a very complex system. There is a vast array of software tools that have to be installed and patched regularly. There are several configurations, customizations, scripts, etc. that have to be shared. There are PDKs and resource libraries that are common. Users may need to run multiple verifications concurrently in their workspace. The EDA world has coalesced around the IT architecture of using shared storage (NAS/SAN) where software and common resources are installed and set up. This makes updating and configuring software a one-time task that is then shared consistently by all team members. Typically, user workspaces are also on shared storage allowing user data to be accessed from any compute server. Multiple jobs can be run on different machines using the same user data.
All cloud vendors now offer a variety of shared storage solutions developed internally and through partners. Customers have a wide variety of choices to meet their performance and budgetary needs.
Moving to the cloud
The simplest way to look at the cloud is to think of it as just another design or data center. Select the cloud vendor, configure the servers and storage, install and configure the EDA tools and set up your design management environment.
ClioSoft’s SOS design management customers choose one of two approaches – either all in the cloud or a hybrid approach where the cloud is deployed as needed for peak usage.
All in the cloud
This approach is typically used by startups and smaller customers who do not want to deal with IT headaches. The SOS primary server is hosted on a cloud virtual machine with the repository DB on local storage for optimal performance. A cache server is set up in each geographic domain with the cache on shared network storage. User workareas are also created on the shared storage with links to the cache, to optimize performance and storage requirements. For all practical purposes, it is a typical multi-site setup, the only difference being that the infrastructure is managed by the cloud vendor. The infrastructure is extremely reliable, may be scaled up or down at will, and can be managed with just a skeletal IT staff. Discipline must be exercised to make sure that machines are shut down when not in use; otherwise, monthly costs can skyrocket. However, if carefully managed, a flexible and reliable infrastructure is well worth the investment for small design groups focused on completing a tapeout on a tight schedule.
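The idea of a workarea that links to a shared cache, rather than copying every file, can be illustrated generically. The sketch below is not how SOS implements its workareas; the function name and directory layout are assumptions made purely for illustration, using ordinary symbolic links.

```python
from pathlib import Path


def populate_workarea(cache_dir, workarea_dir):
    """Mirror the cache's directory tree into the workarea using symlinks.

    Files are not copied, so each workarea costs almost no extra storage.
    Returns the number of links created.
    """
    cache = Path(cache_dir).resolve()
    workarea = Path(workarea_dir)
    linked = 0
    for src in cache.rglob("*"):
        if not src.is_file():
            continue
        dest = workarea / src.relative_to(cache)
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Leave existing entries alone (for example, files the user is editing).
        if not dest.exists() and not dest.is_symlink():
            dest.symlink_to(src)
            linked += 1
    return linked
```

Because the links point at files on shared storage, any compute host that mounts that storage sees the same workarea contents.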
Cloud for peak usage
Larger companies that have their own compute servers for normal usage may want to use the cloud for peak usage. ‘Bursting to the cloud’ is an often-used term. It basically means deploying the cloud to accommodate peak usage, typically for large batch verification runs. It is simplest to think of this as a remote site. An SOS cache server can be set up with the cache on shared storage and configured to auto-synchronize. When the cache host in the cloud is started up, it synchronizes with the primary server on-premises and incrementally brings over all the changes since the last synchronization. In other words, you have ‘burst’ into the cloud. The required workareas can be created or updated leveraging the cache in the cloud. Cloud virtual machines may be reconfigured and new machines deployed to meet the peak verification needs. Once the verifications have been completed, the results may be transferred back on-premises for analysis while the cloud machines are shut down to control costs.
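Incremental synchronization generally means remembering the last revision a cache has seen and fetching only what is newer. The sketch below shows that general idea with an in-memory change log; the data layout and function names are hypothetical and are not the SOS synchronization protocol.

```python
def changes_since(primary_log, last_synced_rev):
    """Return the change-log entries newer than the cache's last synced revision."""
    return [entry for entry in primary_log if entry["rev"] > last_synced_rev]


def synchronize(cache, primary_log):
    """Apply only the missing changes to the cache; return how many were applied."""
    pending = changes_since(primary_log, cache["last_synced_rev"])
    for entry in pending:
        cache["files"][entry["path"]] = entry["rev"]
        cache["last_synced_rev"] = max(cache["last_synced_rev"], entry["rev"])
    return len(pending)


# Hypothetical change log on the primary server and a cache last synced at rev 1.
primary_log = [
    {"rev": 1, "path": "rtl/top.v"},
    {"rev": 2, "path": "rtl/alu.v"},
    {"rev": 3, "path": "rtl/top.v"},
]
cache = {"last_synced_rev": 1, "files": {"rtl/top.v": 1}}

print(synchronize(cache, primary_log))  # first start-up brings over the two newer changes
print(synchronize(cache, primary_log))  # nothing new, so nothing is transferred
```

The point of the pattern is that a cloud cache host that has been shut down for a while only pays for the changes it missed, not for a full copy of the repository.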
Conclusion
Cloud vendors have recognized the untapped market of design teams and have made great strides in meeting their requirements. Design teams have recognized the value of using the cloud exclusively or to augment their existing infrastructure during peak times. The reliability and elasticity of the cloud allow IT departments to meet the demands of design teams with little compromise. With careful management, IT budgets may even be reduced.